# 🗣️ Whisper Large v3 Turbo – Moroccan Darija (LoRA Fine-tuned)
## Model Summary
`anaszil/whisper-large-v3-turbo-darija` is a LoRA adapter for OpenAI's Whisper Large v3 Turbo, fine-tuned for Moroccan Darija (`ary`) speech recognition.
The model was trained on a cleaned and segmented Darija audio corpus (chunks of ≤30 s), enabling accurate ASR for Moroccan Arabic dialects.
## 💬 Why It Matters
This work matters for all Moroccans who wish to see future AI systems and large language models better understand and represent our culture. Moroccan culture is primarily expressed through Darija, a dialect meant to be spoken rather than written. Because of this, most internet-trained LLMs struggle to comprehend or generate Darija properly, a challenge shared by many spoken dialects around the world.
Speech is the natural gateway to bridge this gap. Building a strong speech-to-text model for Darija is therefore essential: it allows AI systems to understand the language of everyday life, unlocking the full potential of AI regardless of written standards. By open-sourcing this Darija ASR model, I hope to encourage the creation of more high-quality datasets and research around dialectal Arabic.
Beyond research, this model also has practical business applications, such as transcribing customer support calls, interviews, and voice messages in Darija, enabling organizations to create real value from spoken Moroccan Arabic data.
## 🧩 Model Details
| Field | Description |
|---|---|
| Model Type | Encoder-decoder ASR (Whisper Large v3 Turbo, LoRA fine-tune) |
| Language(s) | Moroccan Arabic (Darija) |
| License | MIT |
| Finetuned From | openai/whisper-large-v3-turbo |
| Hardware Used | 1 × H100 80 GB GPU |
| Frameworks | transformers 4.48.3, peft 0.14.0 |
| Evaluation Metrics | WER = 24.9 %, CER = 8.3 % |
## Demo and Resources
- 🎧 Space: anaszil/whisper-darija
- 📦 Collection: Whisper Darija Collection
## 🚀 Quick Start
Load the base model, apply the LoRA adapter, and run inference through the `transformers` ASR pipeline:
```python
import torch
from peft import PeftModel
from transformers import WhisperForConditionalGeneration, WhisperProcessor, pipeline

# Base and LoRA model names
base_model = "openai/whisper-large-v3-turbo"
lora_model = "anaszil/whisper-large-v3-turbo-darija"

# Use FP16 on GPU, FP32 on CPU
dtype = torch.float16 if torch.cuda.is_available() else torch.float32
device = 0 if torch.cuda.is_available() else "cpu"

# Load the base model and apply the LoRA adapters
base = WhisperForConditionalGeneration.from_pretrained(base_model, torch_dtype=dtype)
model = PeftModel.from_pretrained(base, lora_model)
processor = WhisperProcessor.from_pretrained(base_model, language="Arabic", task="transcribe")

# Build the ASR pipeline
asr = pipeline(
    task="automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    chunk_length_s=30,  # matches the ≤30 s training segments
    device=device,
)

# Run inference; force Arabic transcription instead of language auto-detection
output = asr("path/to/audio.wav", generate_kwargs={"language": "arabic", "task": "transcribe"})
print(output["text"])
```
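If you prefer a standalone checkpoint (for example, to deploy without `peft` installed at inference time), the adapter can be merged into the base weights. This is standard PEFT functionality, continuing from the `model` and `processor` objects above; the output path is just an example:

```python
# Merge the LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("whisper-darija-merged")     # example output path
processor.save_pretrained("whisper-darija-merged")  # also save tokenizer + feature extractor
```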
## 🧮 Training Configuration
- Base Model: `openai/whisper-large-v3-turbo`
- Fine-tuning Strategy: LoRA adapters applied to encoder and decoder layers
- Precision: FP16
- Learning Rate: 5e-5 (linearly decayed to ~1.4e-7)
- Epochs: 5
- Batch Size: 16 (train) / 32 (eval)
- Optimizer: AdamW
- Seed: 42
- Training Time: ~4.1 hours on 1 × H100 80 GB
- Target Modules: `q_proj`, `k_proj`, `v_proj`, `out_proj`, `fc1`, `fc2`
- Rank (`r`): 16
- Alpha (`lora_alpha`): 32 (a configuration sketch follows this list)
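The exact training script is not published; as a rough guide, the hyperparameters above map onto a `peft`/`transformers` setup roughly as follows. The `lora_dropout` value and `output_dir` are assumptions, not reported settings:

```python
from peft import LoraConfig, get_peft_model
from transformers import Seq2SeqTrainingArguments, WhisperForConditionalGeneration

# Load the base checkpoint that the adapter was trained from
base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3-turbo")

# LoRA settings reported above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"],
    lora_dropout=0.05,  # assumption: the card does not state a dropout value
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

# Trainer settings matching the reported hyperparameters
args = Seq2SeqTrainingArguments(
    output_dir="whisper-darija-lora",  # hypothetical output directory
    learning_rate=5e-5,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    fp16=True,
    seed=42,
)
```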
## 🎧 Dataset
- Data Source: Private Moroccan Darija speech corpus (to be released soon).
## 📊 Evaluation Metrics
| Metric | Result |
|---|---|
| Word Error Rate (WER) | 24.88 % |
| Character Error Rate (CER) | 8.28 % |
| Evaluation Loss | 0.322 |
Evaluation was performed on a held-out subset from the same data distribution. The model achieves a low CER but a relatively high WER compared to languages with fixed orthographies. This gap is mainly due to the absence of a standardized writing system for Darija in Morocco: many words can be spelled in several valid ways, and a variant spelling counts as a full word error even when it differs from the reference by only a character or two, so WER is inflated far more than CER. This variability also reflects a limitation of the dataset used for fine-tuning and highlights the need to establish a consistent orthographic standard for Darija before large-scale data collection efforts.
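For reference, WER and CER of this kind are commonly computed with the Hugging Face `evaluate` library; the card does not state which toolkit was used, and the transcripts below are made-up examples illustrating how a spelling variant counts as a whole word error but only a single character error:

```python
import evaluate

# Load the standard WER and CER metrics
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Made-up example: "kifash" vs. "kifach" differs by one character,
# yet counts as one full word error out of two words
predictions = ["kifash dayr"]
references = ["kifach dayr"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))  # 0.5
print("CER:", cer_metric.compute(predictions=predictions, references=references))  # ~0.09
```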