Susant Achary

Susant-Achary

https://huggingface.co/Susant-Achary

SSusantAchary

AI & ML interests

Tiny to Small Language Models, Building from India. Quantization and MLX

Recent Activity

liked a model 21 days ago

mlx-community/medgemma-27b-it-8bit

upvoted an article 22 days ago

We’re open-sourcing our text-to-image model and the process behind it

liked a model 29 days ago

vandijklab/C2S-Scale-Gemma-2-27B

Susant-Achary's collections (28)

Vision-LM

meta-llama/Llama-3.2-11B-Vision-Instruct

Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 183k • 1.54k
mlx-community/dolphin-vision-72b-4bit

Image-Text-to-Text • 11B • Updated Jul 4, 2024 • 293 • 7
mlx-community/Phi-3.5-vision-instruct-4bit

Text Generation • 0.6B • Updated Apr 19 • 105 • 5

<7B Best of MoE 🧠

Collection of small-size, big-impact MoE models.

LiquidAI/LFM2-8B-A1B

Text Generation • 8B • Updated about 21 hours ago • 15.8k • 261
ibm-granite/granite-4.0-h-tiny

Text Generation • 7B • Updated Nov 3 • 28.9k • 169
microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • 6B • Updated May 1 • 403k • 1.54k
google/gemma-3n-E4B-it

Image-Text-to-Text • 8B • Updated Jul 14 • 67.6k • 824

Audio Features

laion/clap-htsat-fused

Feature Extraction • 0.2B • Updated Mar 28 • 16.4M • 43

Feature Extraction with 🧠 Text Embeddings

Models for turning text, images, audio (and combinations) into useful vectors or feature maps. Ideal for search/RAG, clustering, recommendation, and retrieval.

BAAI/bge-base-en-v1.5

Feature Extraction • 0.1B • Updated Feb 21, 2024 • 3.44M • 383
BAAI/bge-m3

Sentence Similarity • Updated Jul 3, 2024 • 7.61M • 2.56k
facebook/bart-base

Feature Extraction • 0.1B • Updated Nov 16, 2022 • 1.71M • 201
sentence-transformers/all-MiniLM-L12-v2

Sentence Similarity • 33.4M • Updated Mar 6 • 3.44M • 280
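
As a rough illustration of the search/retrieval use case this collection targets, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and document names are hypothetical placeholders; in practice the vectors would come from one of the embedding models listed above.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, corpus, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy embeddings; real models emit hundreds of dimensions.
corpus = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]
results = top_k(query, corpus)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The same ranking step applies to clustering or recommendation: only the source of the vectors changes.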

🪶 Sept’25 <Text Generation Language Models> (Top Releases)

Coding models and pipelines released this month that boost repo-level reasoning, GUI automation, and tool use. Focused on practical editing.

deepseek-ai/DeepSeek-V3.1

Text Generation • 685B • Updated Sep 5 • 82.2k • 807
deepseek-ai/DeepSeek-V3.2-Exp

Text Generation • 685B • Updated 19 days ago • 66.2k • 899
mistralai/Magistral-Small-2509

24B • Updated 4 days ago • 12.1k • 274
openbmb/MiniCPM4.1-8B

Text Generation • 8B • Updated Oct 24 • 18.9k • 380

🖼️ Text2Image, i2i September ’25 (Top Releases)

Cutting-edge image generation & VLM updates from September ’25. This collection spotlights models that improved text rendering, layout control & more.

tencent/HunyuanImage-3.0

Text-to-Image • 83B • Updated Oct 14 • 55.3k • 991
Qwen/Qwen-Image-Edit-2509

Image-to-Image • Updated Sep 22 • 536k • 926
Qwen/Qwen3-VL-235B-A22B-Thinking

Image-Text-to-Text • 236B • Updated 10 days ago • 6.65k • 339

📄➡️🔊 Text-to-Speech (TTS)

Speech synthesis models that turn text into natural audio. Includes multilingual TTS, low-latency real-time models, and voice-cloning variants.

coqui/XTTS-v2

Text-to-Speech • Updated Dec 11, 2023 • 5.8M • 3.21k
hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10 • 3.86M • 5.36k
ResembleAI/chatterbox

Text-to-Speech • Updated Sep 23 • 704k • 1.31k
SWivid/F5-TTS

Text-to-Speech • Updated Mar 21 • 764k • 1.13k

📚➡️🎨Text-to-Image

State-of-the-art diffusion and generative models that turn text prompts into detailed images. Includes lightweight CPU-friendly and photorealistic models.

stable-diffusion-v1-5/stable-diffusion-v1-5

Text-to-Image • Updated Sep 7, 2024 • 2.11M • 933
stabilityai/stable-diffusion-xl-base-1.0

Text-to-Image • Updated Oct 30, 2023 • 2.49M • 7.19k
stabilityai/sd-turbo

Text-to-Image • Updated Jul 10, 2024 • 1.3M • 428
black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27 • 1.22M • 12k

🎨➡️✍️ Image-to-Text

OCR, captioning, and visual QA models that turn pure images into descriptive or structured text.

Salesforce/blip-image-captioning-base

Image-to-Text • Updated Feb 3 • 2.36M • 820
Salesforce/blip-image-captioning-large

Image-to-Text • 0.5B • Updated Feb 3 • 1.05M • 1.44k
nlpconnect/vit-gpt2-image-captioning

Image-to-Text • Updated Feb 27, 2023 • 1.32M • 920
microsoft/trocr-base-handwritten

Image-to-Text • 0.3B • Updated Feb 11 • 107k • 466

🌀 Any-to-Any Multimodal Models

Models that can flexibly convert across modalities (text, image, audio, video). Ideal for researchers exploring unified multimodal AI.

Qwen/Qwen2.5-Omni-3B

Any-to-Any • 6B • Updated Apr 30 • 295k • 311
Qwen/Qwen2.5-Omni-7B

Any-to-Any • 11B • Updated Apr 30 • 137k • 1.83k
deepseek-ai/Janus-Pro-1B

Any-to-Any • Updated Feb 1 • 8.58k • 463
openbmb/MiniCPM-o-2_6

Any-to-Any • 9B • Updated Oct 5 • 97.1k • 1.27k

👨‍💻Mathematical Reasoning 🧮

Datasets tackling AI's toughest challenges

nvidia/OpenMathInstruct-2

Viewer • Updated Nov 25, 2024 • 22M • 15.6k • 214
AI-MO/NuminaMath-CoT

Viewer • Updated Nov 25, 2024 • 860k • 9.47k • 508
meta-math/MetaMathQA

Viewer • Updated Dec 21, 2023 • 395k • 9.07k • 420

🧩 Long-Context Models (≥128k) CODING

10 CODING models that support ≥128k context (native or via officially documented scaling)

meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 5.08M • 5.08k
google/gemma-3-4b-it

Image-Text-to-Text • 4B • Updated Mar 21 • 1.04M • 1.01k
Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated 3 days ago • 1.14M • 796
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct

Text Generation • 16B • Updated Jul 3, 2024 • 229k • 505
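
To give a feel for what a ≥128k context costs at inference time, here is a back-of-the-envelope estimate of key/value cache size. This is only a sketch: the layer count, KV-head count, and head dimension used for Llama-3.1-8B are assumptions taken from my reading of its published configuration and should be checked against the model's `config.json`; the function name is hypothetical.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_value=2):
    """Approximate KV-cache size for one sequence.

    The leading factor of 2 accounts for storing both keys and values;
    bytes_per_value=2 assumes 16-bit (fp16/bf16) cache entries.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_value

# Assumed Llama-3.1-8B shape: 32 layers, 8 KV heads (GQA), head_dim 128.
context_tokens = 128 * 1024
cache_bytes = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128,
                             n_tokens=context_tokens)
cache_gib = cache_bytes / 2**30
print(f"KV cache at {context_tokens} tokens: {cache_gib:.1f} GiB")
```

The estimate covers the cache only; model weights and activations come on top of it.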

🧩 Long-Context Models (≥128k) under 8B

microsoft/Phi-3-mini-128k-instruct

Text Generation • 4B • Updated Mar 2 • 271k • 1.68k
microsoft/Phi-3-vision-128k-instruct

Text Generation • 4B • Updated Aug 20, 2024 • 18k • 969
Menlo/Jan-nano-128k-gguf

Text Generation • 4B • Updated Jul 1 • 8.85k • 70
unsloth/SmolLM3-3B-128K-GGUF

3B • Updated Jul 8 • 4.4k • 34

Qwen3

Best of Qwen3 Series of Models

Qwen/Qwen3-30B-A3B-Instruct-2507

Text Generation • 31B • Updated Sep 17 • 598k • 682
Qwen/Qwen3-Next-80B-A3B-Thinking

Text Generation • 81B • Updated Sep 15 • 180k • 452
Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated 3 days ago • 1.14M • 796
Qwen/Qwen3-Omni-30B-A3B-Instruct

Any-to-Any • 35B • Updated Sep 22 • 278k • 743

🛩️Qwen3-VL

The most powerful vision-language model in the Qwen series to date. Available in Dense and MoE architectures.

Qwen/Qwen3-VL-30B-A3B-Thinking

Image-Text-to-Text • 31B • Updated 10 days ago • 54.3k • 164
mlx-community/Qwen3-VL-30B-A3B-Instruct-4bit

Image-Text-to-Text • Updated Oct 11 • 205 • 5
mlx-community/Qwen3-VL-30B-A3B-Instruct-8bit

Image-Text-to-Text • Updated Oct 11 • 99 • 2
mlx-community/Qwen3-VL-8B-Instruct-4bit

Image-Text-to-Text • Updated Oct 14 • 296 • 3

🍎 MLX-Quantized Models (3/4/5/6-bit) Mac & iOS

Curated MLX-ready quantized LLMs that run fast on Apple Silicon (and some on iOS). Every card lists Bits · Group size · Peak UM (GB) · Stable context.

mlx-community/Apriel-1.5-15b-Thinker-3bit-MLX

Image-Text-to-Text • Updated Oct 3 • 7
mlx-community/Apriel-1.5-15b-Thinker-6bit-MLX

Image-Text-to-Text • Updated Oct 3 • 71 • 1
mlx-community/granite-4.0-h-tiny-3bit-MLX

Text Generation • 0.9B • Updated Oct 3 • 42 • 2
mlx-community/granite-4.0-tiny-preview-4bit

Text Generation • 1B • Updated Sep 10 • 8
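
The bits and group-size figures mentioned in this collection's description can be turned into a rough weight-memory estimate. The sketch below assumes a group-wise affine scheme in which every group of weights also stores a 16-bit scale and a 16-bit bias (4 bytes of overhead per group), and a default group size of 64; both are assumptions for illustration, and the function names are hypothetical. Peak unified memory in practice also includes the KV cache and activations, which this does not model.

```python
def effective_bits_per_weight(bits, group_size=64, overhead_bytes_per_group=4):
    """Nominal bits plus the per-group scale/bias overhead, amortized per weight."""
    return bits + overhead_bytes_per_group * 8 / group_size

def quantized_weight_bytes(n_params, bits, group_size=64, overhead_bytes_per_group=4):
    """Approximate storage for n_params quantized weights."""
    packed = n_params * bits / 8
    overhead = (n_params / group_size) * overhead_bytes_per_group
    return packed + overhead

# Illustrative only: a 15B-parameter model at 3-bit and 6-bit.
for bits in (3, 6):
    gb = quantized_weight_bytes(15e9, bits) / 1e9
    print(f"{bits}-bit, group 64: ~{gb:.1f} GB "
          f"({effective_bits_per_weight(bits)} effective bits/weight)")
```

Smaller groups improve fidelity but raise the overhead term, which is why group size is worth listing alongside the bit width.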

🖼️ Vision Backbones & Image Embeddings

facebook/dinov2-base

Image Feature Extraction • 86.6M • Updated Jan 17, 2024 • 1.39M • 161
openai/clip-vit-large-patch14-336

Zero-Shot Image Classification • Updated Oct 4, 2022 • 5.66M • 281
google/siglip-so400m-patch14-384

Zero-Shot Image Classification • 0.9B • Updated Sep 26, 2024 • 2M • 624
BAAI/EVA-CLIP-8B

Feature Extraction • Updated Feb 7, 2024 • 2.76k • 50

🧊Sept 25 <Image-to-3D> [Top Releases]

Models that turn a single image (or image+prompt) into 3D assets (meshes, Gaussians, or point clouds) suited for AR/VR, product turntables, and game props.

tencent/Hunyuan3D-Omni

Image-to-3D • Updated Oct 17 • 956 • 140
tencent/Hunyuan3D-Part

Updated Oct 17 • 1.36k • 509
facebook/VGGT-1B-Commercial

Image-to-3D • Updated Sep 17 • 64 • 50
Stable-X/vggt-object-v0-1

Image-to-3D • Updated Sep 9 • 2.96k • 7

🎬 ✍️ Sept 25 <Video & Text2Video> (Top Releases)

Open T2V & animation models emphasizing temporal coherence, controllability, and real-time playback. Great starting point for creative tools and ads.

kandinskylab/Kandinsky-5.0-T2V-Lite-pretrain-5s

Updated 16 days ago • 72 • 8
Efficient-Large-Model/LongLive-1.3B

Updated Sep 29 • 39
Wan-AI/Wan2.2-Animate-14B

Video-to-Video • Updated Nov 5 • 34.6k • 863
bytedance-research/HuMo

Image-to-Video • Updated Sep 18 • 280 • 252

Top Apache 2.0 License

Free and open source, provided you don't source the model and claim rights.

openai/whisper-large-v3

Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 4.95M • 5.17k
facebook/wav2vec2-base-960h

Automatic Speech Recognition • 94.4M • Updated Nov 14, 2022 • 1.96M • 383
openai/whisper-small

Automatic Speech Recognition • 0.2B • Updated Feb 29, 2024 • 4.4M • 491
openai/whisper-tiny

Automatic Speech Recognition • 37.8M • Updated Feb 29, 2024 • 1.09M • 386

✍️➡️🎬 Text-to-Video

Models that create short videos from written prompts. Perfect for experimentation in generative video and creative storytelling.

Wan-AI/Wan2.2-I2V-A14B-Diffusers

Image-to-Video • Updated Aug 9 • 252k • 168
Wan-AI/Wan2.1-T2V-1.3B-Diffusers

Text-to-Video • Updated Apr 4 • 89.8k • 97
ByteDance/AnimateDiff-Lightning

Text-to-Video • Updated Jan 6 • 49.7k • 973
genmo/mochi-1-preview

Text-to-Video • Updated Sep 4 • 3.7k • 1.29k

🖌️ Image-to-Image

Image editing and transformation models: from style transfer to super-resolution, inpainting, and diffusion-based edits.

stabilityai/stable-diffusion-xl-refiner-1.0

Image-to-Image • Updated Sep 25, 2023 • 466k • 2k
black-forest-labs/FLUX.1-Kontext-dev

Image-to-Image • Updated Jun 27 • 315k • 2.45k
Qwen/Qwen-Image-Edit

Image-to-Image • Updated Aug 25 • 94.1k • 2.17k
lllyasviel/sd-controlnet-canny

Image-to-Image • Updated May 1, 2023 • 119k • 236

🖼️➡️📚 Image-Text-to-Text

Multimodal models that take image + text as input and produce natural language output. Use cases: chart QA, visual document reasoning, VQA.

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6 • 3.44M • 1.38k
Qwen/Qwen2.5-VL-3B-Instruct

Image-Text-to-Text • 4B • Updated Apr 6 • 7.96M • 566
google/gemma-3-4b-it

Image-Text-to-Text • 4B • Updated Mar 21 • 1.04M • 1.01k
nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1

Image-Text-to-Text • Updated 2 days ago • 1.01M • 168

✍️ Text Generation

Collection of top open LLMs for writing, summarization, chat, reasoning, and document drafting. Includes small SLMs for devices and large models.

openai-community/gpt2

Text Generation • 0.1B • Updated Feb 19, 2024 • 9.46M • 3.04k
facebook/opt-125m

Text Generation • Updated Sep 15, 2023 • 4.53M • 226
Qwen/Qwen2.5-3B-Instruct

Text Generation • 3B • Updated Sep 25, 2024 • 8.61M • 339
meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 5.08M • 5.08k

🧠General Purpose Dataset < 10M samples

Datasets for 🌐 chat, ⚡ code, and 🧮 reasoning

BAAI/Infinity-Instruct

Viewer • Updated 2 days ago • 21.9M • 3.87k • 686
chargoddard/WebInstructSub-prometheus

Viewer • Updated May 15, 2024 • 2.39M • 475 • 25
arcee-ai/The-Tome

Viewer • Updated Aug 15, 2024 • 1.75M • 253 • 103

🍎 MLX-Ready LLMs

MLX weights, proven for MLX inference

mlx-community/gpt-oss-20b-MXFP4-Q8

Text Generation • 21B • Updated Aug 29 • 766k • 20
lmstudio-community/Seed-OSS-36B-Instruct-MLX-4bit

Text Generation • 36B • Updated Aug 26 • 56.1k
lmstudio-community/Qwen3-4B-Thinking-2507-MLX-4bit

Text Generation • 0.6B • Updated Aug 6 • 98.8k • 9
mlx-community/parakeet-tdt-0.6b-v2

Automatic Speech Recognition • 0.6B • Updated May 10 • 300k • 32

📱 On-Device-Ready SLMs (≤4B)

Tiny, fast models that run on iPhone/iPad or Mac with very low memory. Great for quick replies, offline note-assist, and routing.

lmstudio-community/Qwen3-4B-Thinking-2507-MLX-8bit

Text Generation • 1B • Updated Aug 6 • 96.6k • 7
lmstudio-community/DeepSeek-R1-0528-Qwen3-8B-MLX-4bit

Text Generation • 1B • Updated May 29 • 234k • 6
lmstudio-community/gemma-3n-E4B-it-MLX-4bit

Image-Text-to-Text • Updated Jul 21 • 138k • 1
mlx-community/gemma-3-4b-it-qat-4bit

Image-Text-to-Text • 0.9B • Updated Apr 21 • 41.1k • 5

GPT2-JungleBook-from-Scratch-Models

The primary objective of this project is to explore and analyze the impact of model size on text generation quality with the GPT-2 architecture trained from scratch.

Susant-Achary/gpt2-jungle-book-100M

Text Generation • 0.3B • Updated Jan 25 • 4
Susant-Achary/gpt2-jungle-book-59M

Text Generation • 0.2B • Updated Jan 25 • 8
Susant-Achary/gpt2-jungle-book-37M

Text Generation • 0.1B • Updated Jan 25 • 5
Susant-Achary/gpt2-jungle-book-22M

Text Generation • 81.5M • Updated Jan 25 • 4
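
Since this collection is about how model size affects quality, it may help to see how a GPT-2 style configuration maps to a parameter count. The sketch below counts parameters for the standard GPT-2 layout with a tied output head; the function name is hypothetical, and the configurations actually used for the Jungle Book models are not stated here, so only the public GPT-2 small shape is used as a check.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Parameter count for a GPT-2 style decoder with tied input/output embeddings."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Per block: attention (4*d^2 + 4*d), MLP (8*d^2 + 5*d), two LayerNorms (4*d).
    per_layer = 12 * n_embd**2 + 13 * n_embd
    # Final LayerNorm (weight and bias).
    final_ln = 2 * n_embd
    return embeddings + n_layer * per_layer + final_ln

# GPT-2 small: 50257-token vocabulary, 1024 positions, width 768, 12 layers.
gpt2_small = gpt2_param_count(50257, 1024, 768, 12)
print(f"GPT-2 small: {gpt2_small:,} parameters")
```

For narrow models the embedding term dominates the total, so shrinking the vocabulary can matter as much as removing layers.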
