Upload folder using huggingface_hub
- README.md +30 -0
- config.json +5 -0
- model.safetensors +3 -0
README.md
ADDED
@@ -0,0 +1,30 @@
+# DeepSeek-3B-MoE-Decoder
+
+This is the decoder component of DeepSeek-OCR, a 3B-parameter Mixture-of-Experts (MoE) language model.
+
+## Architecture
+
+- **Model**: DeepSeek 3B MoE
+- **Active Parameters**: ~570M per token
+- **Total Parameters**: ~3B
+- **Architecture**: Mixture-of-Experts with routing
+
+## Usage
+
+This decoder should be used with vision embeddings from the encoder component.
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Load decoder
+model = AutoModelForCausalLM.from_pretrained("junkim100/DeepSeek-3B-MoE-decoder")
+tokenizer = AutoTokenizer.from_pretrained("junkim100/DeepSeek-3B-MoE-decoder")
+
+# Use with vision embeddings from encoder
+# vision_embeddings = ... (from DeepEncoder)
+# outputs = model(inputs_embeds=vision_embeddings, ...)
+```
+
+## Source
+
+Extracted from [deepseek-ai/DeepSeek-OCR](https://huggingface.co/deepseek-ai/DeepSeek-OCR).
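The Architecture bullets above state ~3B total parameters but only ~570M active per token: in an MoE layer, a router sends each token to a small top-k subset of experts, so most parameters sit idle on any given forward pass. A minimal illustrative sketch of top-k routing in NumPy — the expert count and k below are hypothetical placeholders, not the real DeepSeek-OCR configuration:

```python
import numpy as np

def topk_route(logits: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k experts with the highest router logits."""
    return np.argsort(logits)[-k:]

# Hypothetical MoE shape: 64 experts, 6 routed per token.
# (Illustrative only -- not taken from the actual model config.)
num_experts, k = 64, 6
rng = np.random.default_rng(0)
router_logits = rng.normal(size=num_experts)

active = topk_route(router_logits, k)
print(f"token routed to experts {sorted(active.tolist())}")
print(f"fraction of experts active: {k / num_experts:.3f}")
```

With real expert sizes, the active-parameter count is roughly `k / num_experts` of the expert parameters plus the always-on shared layers, which is how ~3B total can shrink to ~570M active.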
config.json
ADDED
@@ -0,0 +1,5 @@
+{
+  "architectures": [
+    "DeepseekOCRForCausalLM"
+  ]
+}
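This config.json records only the model class name; Hugging Face tooling reads the `architectures` field to identify which implementation class the checkpoint expects. Since `DeepseekOCRForCausalLM` is not a built-in `transformers` class, loading it typically requires passing `trust_remote_code=True` to `from_pretrained` (an assumption about this particular repo, but standard for custom architectures). A small sketch of reading the field with the standard library:

```python
import json

# The exact config.json contents added in this commit.
config = json.loads("""
{
  "architectures": [
    "DeepseekOCRForCausalLM"
  ]
}
""")

arch = config["architectures"][0]
print(arch)  # DeepseekOCRForCausalLM
```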
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f85521f2e6c344c36ffa997e4d55b889dcb59284a22ecf0748cc5e32ac283e8e
+size 5869729208
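The model.safetensors entry committed here is a Git LFS pointer file, not the weights themselves: the repository stores just the three lines above (spec version, SHA-256 object id, and byte size, ~5.9 GB), while the actual blob lives in LFS storage and is fetched on checkout. A minimal sketch of building and checking such a pointer for an in-memory blob, using only the standard library:

```python
import hashlib

def lfs_pointer(blob: bytes) -> str:
    """Build a Git LFS v1 pointer (version / oid / size) for a blob."""
    oid = hashlib.sha256(blob).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(blob)}\n"
    )

def verify(pointer: str, blob: bytes) -> bool:
    """Check a blob against the oid and size recorded in a pointer."""
    fields = dict(line.split(" ", 1) for line in pointer.strip().splitlines())
    return (
        fields["oid"] == "sha256:" + hashlib.sha256(blob).hexdigest()
        and int(fields["size"]) == len(blob)
    )

data = b"example weight bytes"
ptr = lfs_pointer(data)
print(ptr)
print(verify(ptr, data))         # True
print(verify(ptr, data + b"x"))  # False
```

The `oid` line is the SHA-256 of the full blob, so the same check can confirm that a downloaded model.safetensors matches the pointer committed here.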