# M2V-BGE-M3-1024d

A high-performance Model2Vec-distilled embedding model based on BAAI/bge-m3.
## Key Features

- Ultra-fast inference (<1ms for 4 sentences on CPU)
- 1024-dimensional embeddings
- Multilingual support (100+ languages from BGE-M3)
- 8192 token context window (inherited from BGE-M3)
- ~500MB model size
## MTEB Benchmark Results

| Task Category | Score |
|---|---|
| STS (Semantic Similarity) | 0.5831 |
| - STSBenchmark | 0.5714 |
| - SICK-R | 0.5947 |
| Classification (kNN) | 0.6564 |
| - Banking77 | 0.8027 |
| - Emotion | 0.5101 |
| Clustering | 0.1771 |
| - TwentyNewsgroups | 0.1771 |
| Overall MTEB | 0.4722 |
## Comparison with Other Models

| Model | STS | Classification | Size | Latency (CPU) |
|---|---|---|---|---|
| M2V-BGE-M3-1024d | 0.5831 | 0.6564 | 499 MB | <1ms |
| M2V-Qwen3-0.6B | 0.4845 | 0.5949 | 302 MB | ~1ms |
| POTION-base-8M | ~0.52 | ~0.55 | 30 MB | <1ms |
## Installation

```bash
pip install model2vec
# or
pip install sentence-transformers
```
## Usage

### Using Model2Vec (Fastest)

```python
from model2vec import StaticModel

# Load the model
model = StaticModel.from_pretrained("tss-deposium/m2v-bge-m3-1024d")

# Compute embeddings
embeddings = model.encode(["Hello world", "Bonjour le monde"])
print(embeddings.shape)  # (2, 1024)
```
### Using Sentence Transformers

```python
from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer("tss-deposium/m2v-bge-m3-1024d")

# Compute embeddings
embeddings = model.encode(["Hello world", "Bonjour le monde"])
```
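With sentence-transformers v3.0 or later, the embeddings can also be compared using the built-in `similarity` helper, which defaults to cosine similarity. A minimal sketch:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("tss-deposium/m2v-bge-m3-1024d")
embeddings = model.encode(["Hello world", "Bonjour le monde"])

# Pairwise cosine similarity (the default metric of model.similarity)
similarities = model.similarity(embeddings, embeddings)
print(similarities)  # 2x2 matrix; the diagonal entries are 1.0
```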
### Semantic Similarity Example

```python
import numpy as np
from model2vec import StaticModel

model = StaticModel.from_pretrained("tss-deposium/m2v-bge-m3-1024d")

# Similar sentences
sent1 = "I want to find financial documents"
sent2 = "Looking for finance-related files"
# Different sentence
sent3 = "The weather is nice today"

emb1, emb2, emb3 = model.encode([sent1, sent2, sent3])

def cosine_sim(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(f"Similar:   {cosine_sim(emb1, emb2):.3f}")  # ~0.85
print(f"Different: {cosine_sim(emb1, emb3):.3f}")  # ~0.55
```
## Model Details

- Base Model: BAAI/bge-m3
- Distillation Method: Model2Vec with PCA (1024 dimensions)
- Embedding Dimension: 1024
- Max Sequence Length: 8192 tokens
- Languages: 100+ (multilingual)
- Model Size: ~499 MB
## Use Cases

- Semantic search and retrieval
- Document similarity
- Text classification (via kNN; see the sketch after this list)
- Clustering
- RAG (Retrieval Augmented Generation) pipelines
- Real-time applications requiring ultra-low latency
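For example, the kNN classification setup reflected in the MTEB scores above can be approximated with scikit-learn. This is a minimal sketch with made-up texts and labels, not the exact MTEB evaluation protocol:

```python
from model2vec import StaticModel
from sklearn.neighbors import KNeighborsClassifier

model = StaticModel.from_pretrained("tss-deposium/m2v-bge-m3-1024d")

# Tiny illustrative dataset (hypothetical texts and labels)
texts = [
    "How do I reset my card PIN?",
    "My transfer has not arrived yet",
    "What is the weather forecast for tomorrow?",
    "Will it rain this weekend?",
]
labels = ["banking", "banking", "weather", "weather"]

# Fit a cosine-distance kNN classifier on the static embeddings
clf = KNeighborsClassifier(n_neighbors=1, metric="cosine")
clf.fit(model.encode(texts), labels)

print(clf.predict(model.encode(["My card PIN is blocked"])))  # ['banking']
```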
## Limitations

- Static embeddings don't capture context as well as transformer models
- Lower quality than full BGE-M3 (~58% vs ~80% on STS benchmarks)
- Best suited for applications where speed is critical
## How It Works

Model2Vec distills a Sentence Transformer by:
- Passing vocabulary through the base model (BGE-M3)
- Reducing dimensionality with PCA (to 1024)
- Applying SIF weighting
- During inference: mean pooling of token embeddings
This yields embeddings that are up to 500x faster to compute than the base model, with only moderate quality loss.
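The same pipeline can be reproduced with model2vec's `distill` function. A minimal sketch under assumed settings (the exact flags used for this checkpoint are not documented here, and recent model2vec releases may require the distillation extras, e.g. `pip install model2vec[distill]`):

```python
from model2vec.distill import distill

# Distill BGE-M3 into a static model, reducing token embeddings
# to 1024 dimensions with PCA; remaining settings are library defaults
m2v_model = distill(model_name="BAAI/bge-m3", pca_dims=1024)

# Save locally in Model2Vec format
m2v_model.save_pretrained("m2v-bge-m3-1024d")
```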
## Citation

```bibtex
@article{minishlab2024model2vec,
  author = {Tulkens, Stephan and {van Dongen}, Thomas},
  title  = {Model2Vec: Fast State-of-the-Art Static Embeddings},
  year   = {2024},
  url    = {https://github.com/MinishLab/model2vec}
}
```

```bibtex
@misc{bge-m3,
  title     = {BGE M3-Embedding},
  author    = {Chen, Jianlv and Xiao, Shitao and Zhang, Peitian and Luo, Kun and Lian, Defu and Liu, Zheng},
  year      = {2024},
  publisher = {Hugging Face}
}
```
## License

MIT License, the same as the base BGE-M3 model.

Created by: The Seed Ship / Deposium Project