hybridaione/perls-ministral-3-3b-sft

Full merged checkpoint of mistralai/Ministral-3-3B-Instruct-2512, fine-tuned with LoRA (adapters merged back into the base) on Gestalt-therapy dialogue (Bry = user, Fritz = assistant), using 1–10 turn sliding windows over perls/perls_sft.jsonl (the combination of perls.txt and perls_synth.txt).

Model Details

  • Base: mistralai/Ministral-3-3B-Instruct-2512
  • Method: LoRA SFT, merged into the base model via merge_and_unload (see the sketch below)
  • Epochs: 2
  • Precision during training: bf16 when available, otherwise fp16 or fp32
  • Repo type: full merged model (no external adapter needed)
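
For reference, the merge step looks roughly like this (a sketch; it reuses the model class from the inference example below, and the local adapter path is hypothetical since only the merged weights are published):

from peft import PeftModel
from transformers import Mistral3ForConditionalGeneration

base = Mistral3ForConditionalGeneration.from_pretrained(
    "mistralai/Ministral-3-3B-Instruct-2512"
)
# "lora-adapter" is a hypothetical local path to the trained adapter.
peft_model = PeftModel.from_pretrained(base, "lora-adapter")
merged = peft_model.merge_and_unload()  # fold the LoRA deltas into the base weights
merged.save_pretrained("perls-ministral-3-3b-sft")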

Data

  • Source: perls/perls_sft.jsonl
  • Format: chat messages with user/assistant roles
  • Preprocessing: 1–10 turn sliding windows over each conversation (see the sketch below)
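
A minimal sketch of the windowing step, assuming each JSONL record carries a "messages" list of alternating user/assistant turns and that a "turn" means one user/assistant pair; the actual script may count turns or bound windows differently:

import json

def sliding_windows(messages, min_turns=1, max_turns=10):
    # Group messages into user/assistant pairs, then emit every window
    # of 1..max_turns consecutive pairs starting at each offset.
    pairs = [messages[i:i + 2] for i in range(0, len(messages) - 1, 2)]
    for start in range(len(pairs)):
        for size in range(min_turns, max_turns + 1):
            window = pairs[start:start + size]
            if len(window) < size:
                break
            yield [m for pair in window for m in pair]

examples = []
with open("perls/perls_sft.jsonl") as f:
    for line in f:
        conversation = json.loads(line)
        for window in sliding_windows(conversation["messages"]):
            examples.append({"messages": window})  # one training sample per window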

Training (script)

  • Script: perls/train_perls_sft_lora.py
  • LR: 2e-4 (default in script)
  • Batch: per_device_train_batch_size=4, gradient_accumulation_steps=2 (script defaults; see the sketch after this list)
  • Warmup ratio: 0.03
  • Weight decay: 0.01
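
A hedged sketch of how these settings could plug into a TRL SFTTrainer run; the hyperparameters match the list above, while the LoRA rank, alpha, and target modules are illustrative assumptions (the published script may differ):

import torch
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# LoRA rank, alpha, and target modules are assumptions, not the script's values.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

cfg = SFTConfig(
    output_dir="perls-sft",
    num_train_epochs=2,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    warmup_ratio=0.03,
    weight_decay=0.01,
    bf16=torch.cuda.is_bf16_supported(),  # bf16 when available
)

dataset = load_dataset("json", data_files="perls/perls_sft.jsonl", split="train")
trainer = SFTTrainer(
    model="mistralai/Ministral-3-3B-Instruct-2512",
    args=cfg,
    train_dataset=dataset,
    peft_config=lora,
)
trainer.train()
# The adapter is then merged into the base weights (see Model Details above).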

Inference (Transformers)

Recommended sampling (matches training demo):

  • temperature=0.3
  • min_p=0.15
  • repetition_penalty=1.05

Example outputs after fine-tuning (using the same sampling parameters):

  • User: Ich bin gestresst. ("I'm stressed.") Assistant: Richtig! Was machen Sie mit dem Stress? ("Right! What are you doing with the stress?")
  • User: Ich kann mich nicht motivieren. ("I can't motivate myself.") Assistant: Das ist eine Stimmung, nicht ein Problem. ("That is a mood, not a problem.")
  • User: Ich habe Angst. ("I'm afraid.") Assistant: Natürlich! Angst ist eine gute Begleiterin. Was ist Ihre Hauptangst? ("Of course! Fear is a good companion. What is your main fear?")
  • User: Wer sind Sie? ("Who are you?") Assistant: Sie sind alles, was ich bin. ("You are everything that I am.")
  • User: Dr. Perls, was machen Sie? ("Dr. Perls, what are you doing?") Assistant: Ich beobachte Sie! ("I am observing you!")
  • User: Was ist Gestalttherapie? ("What is Gestalt therapy?") Assistant: Ich mag sie nicht! ("I don't like it!")

Tokenized prompts (input_ids)

  • Ich bin gestresst. → [1, 15968, 12439, 8765, 1770, 1314, 1046]
  • Ich kann mich nicht motivieren. → [1, 15968, 10248, 14557, 4041, 20746, 10412, 1046]
  • Ich habe Angst. → [1, 15968, 16881, 62691, 1046]
  • Wer sind Sie? → [1, 93896, 5989, 6441, 1063]
  • Dr. Perls, was machen Sie? → [1, 16236, 1046, 4473, 11919, 1044, 1486, 31870, 6441, 1063]
  • Was ist Gestalttherapie? → [1, 29269, 3444, 90905, 1781, 80159, 1063]
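
The end-to-end example below loads the merged checkpoint with Transformers and applies the recommended sampling parameters: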
import torch
from transformers import MistralCommonBackend, Mistral3ForConditionalGeneration

repo = "hybridaione/perls-ministral-3-3b-sft"
tok = MistralCommonBackend.from_pretrained(repo)
model = Mistral3ForConditionalGeneration.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,
    device_map="auto",
).eval()

prompt = "Ich bin gestresst."

# Use tokenize=True for MistralCommonBackend; depending on the backend,
# apply_chat_template returns either a plain id list or a BatchEncoding,
# so both cases are handled below.
input_ids = tok.apply_chat_template(
    [{"role": "user", "content": prompt}],
    tokenize=True,
    add_generation_prompt=True,
)
if hasattr(input_ids, "input_ids"):
    input_ids = input_ids.input_ids
inputs = {"input_ids": torch.tensor([input_ids]).to(model.device)}

out = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    pad_token_id=tok.pad_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Inference (vLLM)

vLLM can load this merged checkpoint directly:

vllm serve hybridaione/perls-ministral-3-3b-sft \
  --tensor-parallel-size 1 \
  --max-model-len 2048
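
Once the server is up, vLLM exposes an OpenAI-compatible API (on http://localhost:8000 by default); a minimal client sketch using the openai package, with min_p and repetition_penalty passed through extra_body since they are vLLM extensions to the OpenAI schema:

from openai import OpenAI

# vLLM's OpenAI-compatible server defaults to localhost:8000;
# the api_key is required by the client but ignored by vLLM.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="hybridaione/perls-ministral-3-3b-sft",
    messages=[{"role": "user", "content": "Ich bin gestresst."}],
    max_tokens=256,
    temperature=0.3,
    extra_body={"min_p": 0.15, "repetition_penalty": 1.05},
)
print(response.choices[0].message.content)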

Notes

  • This is a small, domain-specific SFT; outputs can be terse and stylistically close to the source dialogues.
  • Safety/quality: no additional safety tuning was applied; review outputs before production use.