Faves - a fysp Collection

fysp 's Collections

Faves

Faves

updated Sep 22

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

Paper • 2509.16197 • Published Sep 19 • 56
InternRobotics/VLAC

Robotics • 2B • Updated Sep 16 • 43 • 37
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence

Paper • 2509.12203 • Published Sep 15 • 19
A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning

Paper • 2509.15937 • Published Sep 19 • 20
PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits

Paper • 2509.11362 • Published Sep 14 • 4