# DeepSeek-3B-MoE-Decoder
This is the decoder component of DeepSeek-OCR: a 3B-parameter Mixture-of-Experts (MoE) language model.
## Architecture
- Model: DeepSeek 3B MoE
- Active Parameters: ~570M per token
- Total Parameters: ~3B
- Architecture: Mixture-of-Experts with learned token-to-expert routing (see the sketch below)
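
The gap between active and total parameters comes from sparse expert routing: each token is processed by only a few experts. Below is a minimal, illustrative top-k MoE layer in PyTorch. It is a generic sketch of the pattern, not DeepSeek-OCR's actual implementation, and the expert count and `top_k` defaults are placeholder assumptions rather than the model's real configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Generic top-k routed MoE feed-forward block (illustrative only)."""

    def __init__(self, hidden_size, ffn_size, num_experts=64, top_k=6):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden_size, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size),
                nn.GELU(),
                nn.Linear(ffn_size, hidden_size),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x):                          # x: (batch, seq_len, hidden_size)
        scores = self.router(x)                    # (batch, seq_len, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # renormalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., slot] == e     # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out
```

Only the routed experts run for a given token, which is how a ~3B-parameter model can activate only ~570M parameters per token.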
## Usage
This decoder should be used with vision embeddings from the encoder component.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the decoder and its tokenizer
model = AutoModelForCausalLM.from_pretrained("junkim100/DeepSeek-3B-MoE-decoder")
tokenizer = AutoTokenizer.from_pretrained("junkim100/DeepSeek-3B-MoE-decoder")

# Use with vision embeddings from the encoder
# vision_embeddings = ... (from DeepEncoder)
# outputs = model(inputs_embeds=vision_embeddings, ...)
```
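
A minimal generation sketch is shown below. It assumes `vision_embeddings` is a float tensor of shape `(batch, seq_len, hidden_size)` produced by the encoder and matching the decoder's hidden width; the random placeholder tensor and the commented-out `deep_encoder` call are illustrative assumptions, not the actual DeepSeek-OCR pipeline.

```python
import torch

# vision_embeddings = deep_encoder(pixel_values)  # hypothetical encoder call
vision_embeddings = torch.randn(1, 256, model.config.hidden_size)  # placeholder standing in for real embeddings

# All embedding positions are treated as real (non-padded) tokens here
attention_mask = torch.ones(vision_embeddings.shape[:2], dtype=torch.long)

with torch.no_grad():
    generated_ids = model.generate(
        inputs_embeds=vision_embeddings,
        attention_mask=attention_mask,
        max_new_tokens=128,
    )

print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))
```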
## Source
Extracted from [deepseek-ai/DeepSeek-OCR](https://huggingface.co/deepseek-ai/DeepSeek-OCR).