PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design Paper • 2512.04082 • Published 4 days ago • 11
Video Generation Models Are Good Latent Reward Models Paper • 2511.21541 • Published 11 days ago • 45
AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement Paper • 2511.23475 • Published 9 days ago • 41
Visual Sync: Multi-Camera Synchronization via Cross-View Object Motion Paper • 2512.02017 • Published 6 days ago • 3
DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation Paper • 2511.23127 • Published 9 days ago • 41
MG-Nav: Dual-Scale Visual Navigation via Sparse Spatial Memory Paper • 2511.22609 • Published 10 days ago • 47
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 6 days ago • 172
Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model Paper • 2512.01030 • Published 7 days ago • 16
GR-RL: Going Dexterous and Precise for Long-Horizon Robotic Manipulation Paper • 2512.01801 • Published 6 days ago • 22
VLASH: Real-Time VLAs via Future-State-Aware Asynchronous Inference Paper • 2512.01031 • Published 7 days ago • 22
DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action Paper • 2511.22134 • Published 11 days ago • 21
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published 10 days ago • 65
Canvas-to-Image: Compositional Image Generation with Multimodal Controls Paper • 2511.21691 • Published 11 days ago • 32
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction Paper • 2511.20937 • Published 12 days ago • 15
G²VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning Paper • 2511.21688 • Published 11 days ago • 8
iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation Paper • 2511.20635 • Published 12 days ago • 31