# 🗣️ Whisper Large v3 Turbo – Moroccan Darija (LoRA Fine-tuned)
## Model Summary
`anaszil/whisper-large-v3-turbo-darija` is a LoRA adapter for OpenAI's Whisper Large v3 Turbo, fine-tuned for Moroccan Darija (`ary`) speech recognition.
The model was trained on a cleaned and segmented Darija audio corpus (chunks of ≤30 s), enabling accurate ASR for Moroccan Arabic dialects.
## 💬 Why It Matters
This work matters for all Moroccans who wish to see future AI systems and large language models better understand and represent our culture. Moroccan culture is primarily expressed through Darija, a dialect meant to be spoken rather than written. Because of this, most internet-trained LLMs struggle to comprehend or generate Darija properly, a challenge shared by many spoken dialects around the world.
Speech is the natural gateway to bridge this gap. Building a strong speech-to-text model for Darija is therefore essential: it allows AI systems to understand the language of everyday life, unlocking the full potential of AI regardless of written standards. By open-sourcing this Darija ASR model, I hope to encourage the creation of more high-quality datasets and research around dialectal Arabic.
Beyond research, this model also has practical business applications, such as transcribing customer support calls, interviews, and voice messages in Darija, enabling organizations to create real value from spoken Moroccan Arabic data.
## 🧩 Model Details
| Field | Description |
|---|---|
| Model Type | Encoder-decoder ASR (Whisper Large v3 Turbo, LoRA fine-tune) |
| Language(s) | Moroccan Arabic (Darija) |
| License | MIT |
| Finetuned From | openai/whisper-large-v3-turbo |
| Hardware Used | 1 × H100 80 GB GPU |
| Frameworks | transformers 4.48.3, peft 0.14.0 |
| Evaluation Metrics | WER = 24.9 %, CER = 8.3 % |
## Demo and Resources
- 🎧 Space: anaszil/whisper-darija
- 📦 Collection: Whisper Darija Collection
## 🚀 Quick Start
Load the base model, apply the LoRA adapter, and run inference through the `transformers` ASR pipeline:
```python
import torch
from peft import PeftModel
from transformers import WhisperForConditionalGeneration, WhisperProcessor, pipeline

# Base and LoRA model names
base_model = "openai/whisper-large-v3-turbo"
lora_model = "anaszil/whisper-large-v3-turbo-darija"

# Use FP16 on GPU, FP32 on CPU
dtype = torch.float16 if torch.cuda.is_available() else torch.float32
device = 0 if torch.cuda.is_available() else "cpu"

# Load the base model and apply the LoRA adapters
base = WhisperForConditionalGeneration.from_pretrained(base_model, torch_dtype=dtype)
model = PeftModel.from_pretrained(base, lora_model)
processor = WhisperProcessor.from_pretrained(base_model, language="Arabic", task="transcribe")

# Build the ASR pipeline
asr = pipeline(
    task="automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    chunk_length_s=30,  # matches the ≤30 s training segments
    device=device,
)

# Run inference; force Arabic transcription instead of language auto-detection
output = asr("path/to/audio.wav", generate_kwargs={"language": "arabic", "task": "transcribe"})
print(output["text"])
```
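If you prefer a standalone checkpoint (for example, to deploy without `peft` installed at inference time), the adapter can be merged into the base weights. This is standard PEFT functionality, continuing from the `model` and `processor` objects above; the output path is just an example:

```python
# Merge the LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("whisper-darija-merged")     # example output path
processor.save_pretrained("whisper-darija-merged")  # also save tokenizer + feature extractor
```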
## 🧮 Training Configuration
- Base Model: `openai/whisper-large-v3-turbo`
- Fine-tuning Strategy: LoRA adapters applied to encoder and decoder layers
- Precision: FP16
- Learning Rate: 5e-5 (linearly decayed to ~1.4e-7)
- Epochs: 5
- Batch Size: 16 (train) / 32 (eval)
- Optimizer: AdamW
- Seed: 42
- Training Time: ~4.1 hours on 1 × H100 80 GB
- Target Modules: `q_proj`, `k_proj`, `v_proj`, `out_proj`, `fc1`, `fc2`
- Rank (`r`): 16
- Alpha (`lora_alpha`): 32 (a configuration sketch follows this list)
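The exact training script is not published; as a rough guide, the hyperparameters above map onto a `peft`/`transformers` setup roughly as follows. The `lora_dropout` value and `output_dir` are assumptions, not reported settings:

```python
from peft import LoraConfig, get_peft_model
from transformers import Seq2SeqTrainingArguments, WhisperForConditionalGeneration

# Load the base checkpoint that the adapter was trained from
base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3-turbo")

# LoRA settings reported above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"],
    lora_dropout=0.05,  # assumption: the card does not state a dropout value
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

# Trainer settings matching the reported hyperparameters
args = Seq2SeqTrainingArguments(
    output_dir="whisper-darija-lora",  # hypothetical output directory
    learning_rate=5e-5,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    fp16=True,
    seed=42,
)
```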
## 🎧 Dataset
- Data Source: Private Moroccan Darija speech corpus (to be released soon).
## 📊 Evaluation Metrics
| Metric | Result |
|---|---|
| Word Error Rate (WER) | 24.88 % |
| Character Error Rate (CER) | 8.28 % |
| Evaluation Loss | 0.322 |
Evaluation was performed on a held-out subset from the same data distribution. The model achieves a low CER but a relatively high WER compared to languages with fixed orthographies. This gap is mainly due to the absence of a standardized writing system for Darija in Morocco: many words can be spelled in several valid ways, and a variant spelling counts as a full word error even when it differs from the reference by only a character or two, so WER is inflated far more than CER. This variability also reflects a limitation of the dataset used for fine-tuning and highlights the need to establish a consistent orthographic standard for Darija before large-scale data collection efforts.
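For reference, WER and CER of this kind are commonly computed with the Hugging Face `evaluate` library; the card does not state which toolkit was used, and the transcripts below are made-up examples illustrating how a spelling variant counts as a whole word error but only a single character error:

```python
import evaluate

# Load the standard WER and CER metrics
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Made-up example: "kifash" vs. "kifach" differs by one character,
# yet counts as one full word error out of two words
predictions = ["kifash dayr"]
references = ["kifach dayr"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))  # 0.5
print("CER:", cer_metric.compute(predictions=predictions, references=references))  # ~0.09
```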