Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2510.12764

UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG

Paper • 2510.03663 • Published Oct 4 • 15
LLM-guided Hierarchical Retrieval

Paper • 2510.13217 • Published Oct 15 • 19
AnyUp: Universal Feature Upsampling

Paper • 2510.12764 • Published Oct 14 • 11
katanemo/Arch-Router-1.5B

Text Generation • 2B • Updated 22 days ago • 4.06k • • 230

interesting architecture

FAN: Fourier Analysis Networks

Paper • 2410.02675 • Published Oct 3, 2024 • 28
Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11 • 90
Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published Jan 31 • 22
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

Paper • 2502.09509 • Published Feb 13 • 8

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 57
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 52
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 44
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 63

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published Aug 13 • 53
RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis

Paper • 2404.16754 • Published Apr 25, 2024
LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery

Paper • 2505.02829 • Published May 5
MedQ-Bench: Evaluating and Exploring Medical Image Quality Assessment Abilities in MLLMs

Paper • 2510.01691 • Published Oct 2 • 3

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22

UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG

Paper • 2510.03663 • Published Oct 4 • 15
LLM-guided Hierarchical Retrieval

Paper • 2510.13217 • Published Oct 15 • 19
AnyUp: Universal Feature Upsampling

Paper • 2510.12764 • Published Oct 14 • 11
katanemo/Arch-Router-1.5B

Text Generation • 2B • Updated 22 days ago • 4.06k • • 230

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published Aug 13 • 53
RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis

Paper • 2404.16754 • Published Apr 25, 2024
LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery

Paper • 2505.02829 • Published May 5
MedQ-Bench: Evaluating and Exploring Medical Image Quality Assessment Abilities in MLLMs

Paper • 2510.01691 • Published Oct 2 • 3

interesting architecture

FAN: Fourier Analysis Networks

Paper • 2410.02675 • Published Oct 3, 2024 • 28
Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11 • 90
Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published Jan 31 • 22
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

Paper • 2502.09509 • Published Feb 13 • 8

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 57
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 52
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 44
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 63

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs