Collections
Collections including paper arxiv:2508.01242
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 518 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
ReLearn: Unlearning via Learning for Large Language Models
Paper • 2502.11190 • Published • 30 -
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Paper • 2502.11089 • Published • 166 -
Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
Paper • 2502.11357 • Published • 11 -
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
Paper • 2503.12797 • Published • 32
-
Tesslate/UIGEN-X-8B
Text Generation • 8B • Updated • 33 • 59 -
Intelligent-Internet/II-Search-4B
Text Generation • 4B • Updated • 78 • 99 -
MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh
Paper • 2508.01242 • Published • 11 -
SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens
Paper • 2508.05305 • Published • 46
-
RAISECity: A Multimodal Agent Framework for Reality-Aligned 3D World Generation at City-Scale
Paper • 2511.18005 • Published • 1 -
SynCity: Training-Free Generation of 3D Worlds
Paper • 2503.16420 • Published • 27 -
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities
Paper • 2501.08983 • Published • 22 -
WorldGrow: Generating Infinite 3D World
Paper • 2510.21682 • Published • 42
-
Qwen2.5 Omni 7B Demo
🏆 363 • Generate text and speech from text, audio, images, and videos
-
F5-TTS
🗣 2.71k • F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
-
Kokoro TTS Zero
🎴 308 • ✨ [With v1.0.0] Accelerated TTS on Kokoro-82M
-
fixie-ai/ultravox-v0_5-llama-3_2-1b
Audio-Text-to-Text • 0.7B • Updated • 405k • 62
-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 28 -
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 43 -
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 54 -
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12