MEGConformer for Phoneme Classification
Conformer-based MEG decoder for 39-class phoneme classification over the ARPAbet phoneme set, trained with 5 different random seeds.
Model Performance
| Seed | Val F1-Macro | Checkpoint |
|---|---|---|
| 7 (best) | 63.92% | seed-7/pytorch_model.ckpt |
| 18 | 63.86% | seed-18/pytorch_model.ckpt |
| 17 | 58.74% | seed-17/pytorch_model.ckpt |
| 1 | 58.64% | seed-1/pytorch_model.ckpt |
| 2 | 58.10% | seed-2/pytorch_model.ckpt |
Note: Individual seeds were not evaluated on the holdout set. The ensemble of all 5 seeds achieved 65.8% F1-macro on the competition holdout.
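For reference, macro-averaged F1 is the unweighted mean of the per-class F1 scores, so every phoneme class counts equally regardless of how often it occurs. The following is a minimal, dependency-free sketch of that metric. The function name `macro_f1` and the convention of scoring an empty class as 0 are assumptions for illustration, not the competition's scorer.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 over classes 0..num_classes-1."""
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Assumption: a class that never appears in y_true or y_pred scores 0.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy example with 3 classes (the model itself has 39).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = macro_f1(y_true, y_pred, num_classes=3)
print(f"Macro F1: {score:.4f}")
```

In the toy example the per-class F1 scores are 0.5, 0.8 and 2/3, and the macro score is their plain average.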
Quick Start
Single Model Inference
import torch
from huggingface_hub import hf_hub_download
from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)
# Download best checkpoint (seed-7)
checkpoint_path = hf_hub_download(
    repo_id="zuazo/megconformer-phoneme-classification",
    filename="seed-7/pytorch_model.ckpt",
)
# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Load model
model = ClassificationModule.load_from_checkpoint(checkpoint_path, map_location=device)
model.eval()
# Inference
meg_signal = torch.randn(1, 306, 125, device=device)  # (batch, channels, time)
with torch.no_grad():
    logits = model(meg_signal)
    probabilities = torch.softmax(logits, dim=1)
    prediction = torch.argmax(logits, dim=1)
print(f"Predicted phoneme class: {prediction.item()}")
print(f"Confidence: {probabilities[0, prediction].item():.2%}")
Ensemble Inference (Recommended)
The ensemble combines the predictions of all 5 seeds and achieves the best performance. The example below uses majority voting over each model's top-1 prediction:
import torch
from huggingface_hub import hf_hub_download
from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)
# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Load all available seeds (as in the paper)
seeds = [7, 18, 17, 1, 2]
models = []
for seed in seeds:
    checkpoint_path = hf_hub_download(
        repo_id="zuazo/megconformer-phoneme-classification",
        filename=f"seed-{seed}/pytorch_model.ckpt",
    )
    model = ClassificationModule.load_from_checkpoint(
        checkpoint_path, map_location=device
    )
    model.eval().to(device)
    models.append(model)
# Example MEG input: (batch=1, channels=306, time=125)
meg_signal = torch.randn(1, 306, 125, device=device)
with torch.no_grad():
    probs_list = []
    preds_list = []
    for model in models:
        logits = model(meg_signal)  # (1, C)
        probs = torch.softmax(logits, dim=1)  # (1, C)
        probs_list.append(probs)
        preds_list.append(probs.argmax(dim=1))  # (1,)
# Stack predictions from all models: shape (num_models, batch_size)
preds = torch.stack(preds_list, dim=0)  # (M, 1)
# We have a single example in the batch, so index 0
per_model_preds = preds[:, 0]  # (M,)
num_classes = probs_list[0].size(1)
# Count votes per class
votes = torch.bincount(per_model_preds, minlength=num_classes).float()
# Majority-vote class (ties resolved by smallest index)
majority_class = int(votes.argmax().item())
# "Confidence" = fraction of models voting for the chosen class
confidence = (votes[majority_class] / votes.sum()).item()
print(f"Ensemble (majority vote) predicted phoneme class: {majority_class}")
print(f"Vote share for that class: {confidence:.2%}")
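An alternative to majority voting is soft voting, where the per-model softmax probabilities are averaged and the class with the highest mean probability wins. The sketch below illustrates the idea in plain Python on made-up logits. The helper names `softmax` and `soft_vote` are hypothetical, and this card does not state which combination rule produced the reported holdout score.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(per_model_logits):
    """Average per-model probabilities for one example; return (class, mean_probs)."""
    probs = [softmax(logits) for logits in per_model_logits]
    num_models = len(probs)
    # zip(*probs) groups the probabilities of each class across models.
    mean_probs = [sum(col) / num_models for col in zip(*probs)]
    best = max(range(len(mean_probs)), key=mean_probs.__getitem__)
    return best, mean_probs


# Made-up logits: two models weakly prefer class 0, one strongly prefers class 1.
example_logits = [
    [0.2, 0.1, 0.0],
    [0.2, 0.1, 0.0],
    [0.0, 3.0, 0.0],
]
soft_class, mean_probs = soft_vote(example_logits)
print(f"Soft-vote class: {soft_class}")
```

In this toy case a majority vote would pick class 0 (two of three models), while soft voting picks class 1 because the third model is far more confident. With the `probs_list` collected in the example above, the equivalent PyTorch expression would be `torch.stack(probs_list).mean(dim=0).argmax(dim=1)`.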
Model Details
- Architecture: Conformer (custom size)
- Hidden size: 256
- FFN dim: 2048
- Layers: 7
- Attention heads: 12
- Depthwise conv kernel: 31
- Input: 306-channel MEG signals
- Window size: 0.5 seconds (125 samples at 250 Hz)
- Output: 39-class phoneme classification (ARPAbet phoneme set)
- Training: LibriBrain 2025 Standard track
- Grouping: 100 single-trial examples averaged per training sample
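The grouping step can be pictured as averaging batches of single-trial windows into one training sample. The sketch below is a rough illustration in plain Python. It assumes that trials are grouped by phoneme label, taken in order, and that leftover trials which do not fill a whole group are dropped. None of these details are confirmed by this card, and `average_groups` is a hypothetical helper.

```python
from collections import defaultdict


def average_groups(trials, labels, group_size=100):
    """Average same-label trials in consecutive groups of `group_size`.

    trials: list of 2-D lists shaped (channels, time)
    labels: list of int class labels, one per trial
    Returns (averaged_trials, averaged_labels).
    """
    by_label = defaultdict(list)
    for trial, label in zip(trials, labels):
        by_label[label].append(trial)

    averaged, averaged_labels = [], []
    for label, group in by_label.items():
        # Assumption: incomplete trailing groups are discarded.
        for start in range(0, len(group) - group_size + 1, group_size):
            chunk = group[start:start + group_size]
            # Element-wise mean across the trials in this chunk.
            mean_trial = [
                [sum(values) / group_size for values in zip(*channel_rows)]
                for channel_rows in zip(*chunk)
            ]
            averaged.append(mean_trial)
            averaged_labels.append(label)
    return averaged, averaged_labels


def constant_trial(value, channels=2, time=3):
    """Tiny stand-in for a (306, 125) MEG window."""
    return [[float(value)] * time for _ in range(channels)]


# Toy data: 4 trials of class 0 and 3 trials of class 1, averaged in pairs.
trials = [constant_trial(v) for v in (1, 2, 3, 4, 10, 20, 30)]
labels = [0, 0, 0, 0, 1, 1, 1]
avg, avg_labels = average_groups(trials, labels, group_size=2)
print(len(avg), avg_labels)
```

With real data each trial would be a 306 x 125 window (0.5 s at 250 Hz) and `group_size` would be 100; an array library would be the practical choice at that scale.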
Reproducibility
Checkpoints for all 5 random seeds are provided. For best results on new data, we recommend the ensemble, which achieved 65.8% F1-macro on the competition holdout set.
Citation
@misc{dezuazo2025megconformerconformerbasedmegdecoder,
title={MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification},
author={Xabier de Zuazo and Ibon Saratxaga and Eva Navas},
year={2025},
eprint={2512.01443},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2512.01443},
}
License
The 3-Clause BSD License
Links
- Paper: arXiv:2512.01443
- Code: GitHub
- Competition: LibriBrain 2025
Evaluation results
- F1-macro on LibriBrain 2025 PNPL (Standard track, phoneme task), self-reported: 0.658