Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation Paper • 2510.06961 • Published Oct 8 • 10
ShieldGemma Release Collection A series of safety classifiers, trained on top of Gemma 2, for developers to filter inputs and outputs of their applications. • 3 items • Updated Jul 10 • 14
Wan-Animate: Unified Character Animation and Replacement with Holistic Replication Paper • 2509.14055 • Published Sep 17 • 16
Mem-Agent Collection Small-sized agents from Dria trained to interact with an Obsidian-like memory system using Python tools. Trained on Qwen3-4B-Thinking-2507. • 4 items • Updated Sep 5 • 3
Article mem-agent: Persistent, Human Readable Memory Agent Trained with Online RL Sep 11 • 25
Article Introducing HELMET: Holistically Evaluating Long-context Language Models +5 Apr 16 • 40
Article ScreenSuite - The most comprehensive evaluation suite for GUI Agents! Jun 6 • 55
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction Paper • 2412.04454 • Published Dec 5, 2024 • 72
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published Nov 26, 2024 • 90