MEGConformer for Phoneme Classification

Conformer-based MEG decoder for 39-class phoneme classification over the ARPAbet phoneme set, trained with 5 different random seeds.

Model Performance

Seed	Val F1-Macro	Checkpoint
7 (best)	63.92%	`seed-7/pytorch_model.ckpt`
18	63.86%	`seed-18/pytorch_model.ckpt`
17	58.74%	`seed-17/pytorch_model.ckpt`
1	58.64%	`seed-1/pytorch_model.ckpt`
2	58.10%	`seed-2/pytorch_model.ckpt`

Note: Individual seeds were not evaluated on the holdout set. The ensemble of all 5 seeds achieved 65.8% F1-macro on the competition holdout.
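For readers unfamiliar with the metric, F1-macro is the unweighted mean of the per-class F1 scores, so rare phonemes count as much as frequent ones. The following is a minimal, dependency-free sketch of that arithmetic; `f1_macro` is a hypothetical helper, and the handling of classes with no samples (scored as 0 here) is an assumption that may differ from the competition's scoring code.

```python
from collections import Counter


def f1_macro(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores over `num_classes` classes."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Assumed convention: a class with no true or predicted samples scores 0.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy example with 3 classes (the model card's setting would use num_classes=39)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(f"F1-macro: {f1_macro(y_true, y_pred, num_classes=3):.4f}")
```

In the toy example the per-class F1 scores are 0.5, 0.8, and 2/3, so the macro average is about 0.6556.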

Quick Start

Single Model Inference

import torch
from huggingface_hub import hf_hub_download

from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)

# Download best checkpoint (seed-7)
checkpoint_path = hf_hub_download(
    repo_id="zuazo/megconformer-phoneme-classification",
    filename="seed-7/pytorch_model.ckpt",
)

# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model
model = ClassificationModule.load_from_checkpoint(checkpoint_path, map_location=device)
model.eval()

# Inference
meg_signal = torch.randn(1, 306, 125, device=device)  # (batch, channels, time)

with torch.no_grad():
    logits = model(meg_signal)
    probabilities = torch.softmax(logits, dim=1)
    prediction = torch.argmax(logits, dim=1)

print(f"Predicted phoneme class: {prediction.item()}")
print(f"Confidence: {probabilities[0, prediction].item():.2%}")

Ensemble Inference (Recommended)

The ensemble approach combines predictions from all 5 seeds and achieves the best performance. The example below combines the per-model predictions by majority vote:

import torch
from huggingface_hub import hf_hub_download

from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)

# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load all available seeds (as in the paper)
seeds = [7, 18, 17, 1, 2]
models = []

for seed in seeds:
    checkpoint_path = hf_hub_download(
        repo_id="zuazo/megconformer-phoneme-classification",
        filename=f"seed-{seed}/pytorch_model.ckpt",
    )
    model = ClassificationModule.load_from_checkpoint(
        checkpoint_path, map_location=device
    )
    model.eval().to(device)
    models.append(model)

# Example MEG input: (batch=1, channels=306, time=125)
meg_signal = torch.randn(1, 306, 125, device=device)

with torch.no_grad():
    probs_list = []
    preds_list = []

    for model in models:
        logits = model(meg_signal)  # (1, C)
        probs = torch.softmax(logits, dim=1)  # (1, C)
        probs_list.append(probs)
        preds_list.append(probs.argmax(dim=1))  # (1,)

    # Stack predictions from all models: shape (num_models, batch_size)
    preds = torch.stack(preds_list, dim=0)  # (M, 1)

    # We have a single example in the batch, so index 0
    per_model_preds = preds[:, 0]  # (M,)

    num_classes = probs_list[0].size(1)
    # Count votes per class
    votes = torch.bincount(per_model_preds, minlength=num_classes).float()

    # Majority-vote class (ties resolved by smallest index)
    majority_class = int(votes.argmax().item())

    # "Confidence" = fraction of models voting for the chosen class
    confidence = (votes[majority_class] / votes.sum()).item()

print(f"Ensemble (majority vote) predicted phoneme class: {majority_class}")
print(f"Vote share for that class: {confidence:.2%}")
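An alternative way to combine the models is to average their class probabilities (soft voting) instead of counting votes. The sketch below assumes `probs_list` holds one `(batch, num_classes)` probability tensor per model, as collected in the loop above. `soft_vote` is a hypothetical helper, the random inputs are stand-ins for real model outputs, and this card does not state which combination rule produced the reported holdout score.

```python
import torch


def soft_vote(probs_list):
    """Average per-model class probabilities and pick the top class.

    probs_list: list of M tensors, each of shape (batch, num_classes).
    Returns (predicted class per example, mean probability of that class).
    """
    mean_probs = torch.stack(probs_list, dim=0).mean(dim=0)  # (batch, C)
    confidence, prediction = mean_probs.max(dim=1)  # each of shape (batch,)
    return prediction, confidence


# Stand-in probabilities for 5 models, 1 example, 39 classes (illustration only)
torch.manual_seed(0)
probs_list = [torch.softmax(torch.randn(1, 39), dim=1) for _ in range(5)]

prediction, confidence = soft_vote(probs_list)
print(f"Ensemble (soft vote) predicted phoneme class: {prediction.item()}")
print(f"Mean probability for that class: {confidence.item():.2%}")
```

Unlike vote counting, averaging keeps each model's confidence, so one very confident model can outweigh several lukewarm ones.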

Model Details

Architecture: Conformer (custom size)
- Hidden size: 256
- FFN dim: 2048
- Layers: 7
- Attention heads: 12
- Depthwise conv kernel: 31
Input: 306-channel MEG signals
Window size: 0.5 seconds (125 samples at 250 Hz)
Output: 39-class phoneme classification (ARPAbet phoneme set)
Training: LibriBrain 2025 Standard track
Grouping: 100 single-trial examples averaged per training sample
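The window size and grouping entries above can be illustrated with a short sketch. It derives the 125-sample window from the 0.5 s duration and 250 Hz rate, then averages consecutive blocks of 100 single-trial windows into one sample each. `average_groups` is a hypothetical helper. How trials are actually selected for a group (for example, random sampling within a phoneme class) is not specified here, so the consecutive-block grouping and the dropping of leftover trials are assumptions.

```python
import torch


def average_groups(trials, group_size=100):
    """Average consecutive groups of single-trial windows.

    trials: tensor of shape (num_trials, channels, time), assumed to come
            from a single phoneme class.
    Returns a tensor of shape (num_trials // group_size, channels, time).
    Leftover trials that do not fill a whole group are dropped.
    """
    num_groups = trials.size(0) // group_size
    usable = trials[: num_groups * group_size]
    grouped = usable.reshape(num_groups, group_size, *trials.shape[1:])
    return grouped.mean(dim=1)


sampling_rate_hz = 250
window_seconds = 0.5
window_samples = int(sampling_rate_hz * window_seconds)  # 125 samples

# Random stand-in for 250 single-trial windows of one phoneme class
trials = torch.randn(250, 306, window_samples)
averaged = average_groups(trials)
print(averaged.shape)  # torch.Size([2, 306, 125])
```

Each averaged sample keeps the `(306, 125)` shape the model expects, so it can be batched and passed to the model in the same way as the random input in the Quick Start examples.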

Reproducibility

All 5 random seeds are provided. For best results on new data, we recommend using the ensemble approach, which achieved 65.8% F1-macro on the competition holdout set.

Citation

@misc{dezuazo2025megconformerconformerbasedmegdecoder,
      title={MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification}, 
      author={Xabier de Zuazo and Ibon Saratxaga and Eva Navas},
      year={2025},
      eprint={2512.01443},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.01443}, 
}

License

The 3-Clause BSD License

Evaluation results

F1-macro on LibriBrain 2025 PNPL (Standard track, phoneme task): 0.658 (self-reported)
