DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 5 days ago • 171
Transformers v5: Simple model definitions powering the AI ecosystem Article • 7 days ago • 224
MDGA Collection Make Diffusion Great Again. The resource list for Super Data Learners, Quokka, and OpenMoE 2. • 16 items • Updated Nov 4 • 8
Large Language Models Do NOT Really Know What They Don't Know Paper • 2510.09033 • Published Oct 10 • 16
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA Paper • 2510.04849 • Published Oct 6 • 113
UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG Paper • 2510.03663 • Published Oct 4 • 15
Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs Paper • 2509.22582 • Published Sep 26 • 10
F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data Paper • 2510.02294 • Published Oct 2 • 45
Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks Paper • 2510.02286 • Published Oct 2 • 28
The Rogue Scalpel: Activation Steering Compromises LLM Safety Paper • 2509.22067 • Published Sep 26 • 27
CLUE: Non-parametric Verification from Experience via Hidden-State Clustering Paper • 2510.01591 • Published Oct 2 • 26
LongCodeZip: Compress Long Context for Code Language Models Paper • 2510.00446 • Published Oct 1 • 108
jina-reranker-v3: Last but Not Late Interaction for Document Reranking Paper • 2509.25085 • Published Sep 29 • 6
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain Paper • 2509.26507 • Published Sep 30 • 535