Changes from the Original DeepSeek-V3.2-Exp
- Dequantized the Indexer to bfloat16 (a conversion sketch follows this list)
- Compatible with the `transformers` library (load with `trust_remote_code=True`)
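The change above leaves the rest of the checkpoint in FP8 and only casts the Indexer. For reference, here is a minimal sketch of dequantizing a block-quantized FP8 weight to bfloat16, assuming the 128×128 block scheme used in DeepSeek's FP8 checkpoints (each weight stored with a companion `weight_scale_inv` tensor). The function is illustrative, not the exact script used for this conversion.

```python
import torch

def dequant_fp8_block(weight: torch.Tensor, scale_inv: torch.Tensor,
                      block_size: int = 128) -> torch.Tensor:
    """Dequantize a block-quantized FP8 weight to bfloat16.

    `scale_inv` holds one scaling factor per (block_size x block_size)
    tile of `weight`, as in DeepSeek's FP8 checkpoints.
    """
    w = weight.to(torch.float32)
    s = scale_inv.to(torch.float32)
    # Expand each per-tile scale over its tile, then trim to the weight's
    # shape (edge tiles may be partial).
    s = s.repeat_interleave(block_size, dim=0)[: w.shape[0], :]
    s = s.repeat_interleave(block_size, dim=1)[:, : w.shape[1]]
    return (w * s).to(torch.bfloat16)
```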
Test code

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import time

model = AutoModelForCausalLM.from_pretrained(
    "kishizaki-sci/DeepSeek-V3.2-Exp-FP8",
    trust_remote_code=True,
    dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("kishizaki-sci/DeepSeek-V3.2-Exp-FP8")

# Chat example copied from https://huggingface.co/docs/transformers/model_doc/deepseek_v3
chat = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]
inputs = tokenizer.apply_chat_template(
    chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Time a short generation run.
start = time.time()
outputs = model.generate(inputs, max_new_tokens=50)
elapsed = time.time() - start

print(tokenizer.batch_decode(outputs))
print(f"Generation time: {elapsed:.2f} s")
```
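As a rough follow-up, the elapsed time can be turned into a throughput figure; note this is wall-clock time, so it includes the prefill pass as well as decoding.

```python
# Rough throughput estimate; `inputs`, `outputs`, and `elapsed` come from the snippet above.
new_tokens = outputs.shape[-1] - inputs.shape[-1]
print(f"{new_tokens / elapsed:.2f} tokens/sec")
```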
Model tree for kishizaki-sci/DeepSeek-V3.2-Exp-FP8
- Base model: deepseek-ai/DeepSeek-V3.2-Exp-Base
- Finetuned from: deepseek-ai/DeepSeek-V3.2-Exp