PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published Oct 20, 2025 • 62
LongCodeZip: Compress Long Context for Code Language Models Paper • 2510.00446 • Published Oct 1, 2025 • 108
SWE-QA: Can Language Models Answer Repository-level Code Questions? Paper • 2509.14635 • Published Sep 18, 2025 • 35
RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation Paper • 2509.16198 • Published Sep 19, 2025 • 127
ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks Paper • 2508.08240 • Published Aug 11, 2025 • 45
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19, 2025 • 118
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published Mar 10, 2025 • 68
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 191
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages Paper • 2410.23825 • Published Oct 31, 2024 • 4
DELTA: Dense Efficient Long-range 3D Tracking for any video Paper • 2410.24211 • Published Oct 31, 2024 • 9
Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use Paper • 2410.24218 • Published Oct 31, 2024 • 6
Learning Video Representations without Natural Videos Paper • 2410.24213 • Published Oct 31, 2024 • 16
Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks Paper • 2410.24032 • Published Oct 31, 2024 • 10
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays Paper • 2410.21969 • Published Oct 29, 2024 • 10
Constraint Back-translation Improves Complex Instruction Following of Large Language Models Paper • 2410.24175 • Published Oct 31, 2024 • 18