	---
	library_name: transformers
	license: mit
	base_model: microsoft/mdeberta-v3-base
	tags:
	- generated_from_trainer
	- name
	- person
	- company
	metrics:
	- accuracy
	- precision
	- recall
	- f1
	model-index:
	- name: mdeberta-v3-base-name-classifier
	  results: []
	datasets:
	- ele-sage/person-company-names-classification
	language:
	- fr
	- en
	new_version: ele-sage/mdeberta-v3-base-name-classifier-v2
	---



	# ⚠️ DEPRECATED MODEL ⚠️

	Please do not use this model for new projects.

	This model has been superseded by a newer, more accurate version trained on a larger, cleaner dataset.
	It is maintained here for archival purposes only.

	### ✅ Recommended Replacement:
	Please switch to [ele-sage/mdeberta-v3-base-name-classifier-v2](https://huggingface.co/ele-sage/mdeberta-v3-base-name-classifier-v2) (Higher Accuracy).

	---

	# mdeberta-v3-base-name-classifier

	This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base), trained on the [ele-sage/person-company-names-classification](https://huggingface.co/datasets/ele-sage/person-company-names-classification) dataset.


	It achieves the following results on the evaluation set:
	- Loss: 0.0305
	- Accuracy: 0.9922
	- Precision: 0.9957
	- Recall: 0.9906
	- F1: 0.9931
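
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported value can be recomputed from the two figures above. A minimal sketch (the helper name `f1_from` is illustrative and not part of any library):

```python
def f1_from(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Evaluation metrics as reported in this model card.
precision, recall = 0.9957, 0.9906

f1 = f1_from(precision, recall)
print(round(f1, 4))  # 0.9931, which matches the reported F1
```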

	## Model description

	This model is a binary text classifier fine-tuned from `mdeberta-v3-base`.
	It distinguishes a person's name from a company or organization name, reaching 99.22% accuracy on the evaluation set.

	### Direct Use

	This model is intended to be used for text classification. Given a string, it will return a label indicating whether the string is a `Person` or a `Company`.

	```python
	from transformers import pipeline

	classifier = pipeline("text-classification", model="ele-sage/mdeberta-v3-base-name-classifier")

	names = [
	    "Satya Nadella",
	    "Global Innovations Inc.",
	    "Martinez, Alonso",
	]

	# The pipeline returns one dict per input containing only "label" and "score",
	# so pair each result with its input string to print the text.
	results = classifier(names)

	for name, result in zip(names, results):
	    print(f"Text: '{name}', Prediction: {result['label']}, Score: {result['score']:.4f}")
	```

	### Downstream Use

	This model is a key component of a two-stage name processing pipeline. It is designed to be used as a fast, efficient "gatekeeper" to first identify person names before passing them to a more complex parsing model, such as `ele-sage/distilbert-base-uncased-name-splitter`.
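
The gatekeeper pattern described above could be sketched roughly as follows. The classifier is passed in as a callable so the routing logic can be exercised without downloading any weights. `route_names` and `fake_classifier` are hypothetical names introduced for illustration, and the label strings `Person` and `Company` are taken from the Direct Use description.

```python
from typing import Any, Callable, Dict, List, Tuple

# Any callable that maps a list of strings to one {"label", "score"} dict per
# input, which is the shape a transformers text-classification pipeline returns.
Classifier = Callable[[List[str]], List[Dict[str, Any]]]


def route_names(names: List[str], classify: Classifier) -> Tuple[List[str], List[str]]:
    """Split names into (persons, companies) using the classifier's labels."""
    persons: List[str] = []
    companies: List[str] = []
    for name, pred in zip(names, classify(names)):
        if pred["label"] == "Person":
            persons.append(name)    # candidates for the second-stage parser
        else:
            companies.append(name)  # never reaches the parser
    return persons, companies


def fake_classifier(names: List[str]) -> List[Dict[str, Any]]:
    """Stand-in for the real model: treats anything ending in 'Inc.' as a company."""
    return [
        {"label": "Company" if n.endswith("Inc.") else "Person", "score": 1.0}
        for n in names
    ]


persons, companies = route_names(
    ["Satya Nadella", "Global Innovations Inc."], fake_classifier
)
print(persons)    # ['Satya Nadella']
print(companies)  # ['Global Innovations Inc.']
```

In real use, the `fake_classifier` argument would be replaced by the `pipeline(...)` object from the Direct Use example, and only the `persons` list would be forwarded to the parsing model.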

	### Out-of-Scope Use

	- This model is not a general-purpose classifier. It is highly specialized for distinguishing persons from companies and will not perform well on other classification tasks (e.g., sentiment analysis).

	## Bias, Risks, and Limitations

	- Geographic & Cultural Bias: The training data is heavily biased towards North American (Canadian) person names and Quebec-based company names. The model will be less accurate when classifying names from other cultural or geographic origins.
	- Ambiguity: Certain names can legitimately be both a person's name and a company's name (e.g., "Ford"). In these cases, the model makes a statistical guess based on its training data, which may not always align with the specific context.
	- Data Source: The person name data is derived from a Facebook data leak and contains noise. While a rigorous cleaning process was applied, the model may have learned from some spurious data.
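
One way to handle the ambiguity noted above is to treat low-confidence predictions as undecided instead of forcing a label. The sketch below assumes predictions in the `{"label", "score"}` shape returned by the text-classification pipeline. The `0.90` threshold and the `Ambiguous` label are hypothetical choices for illustration, not values recommended by the model author.

```python
from typing import Any, Dict, List


def flag_ambiguous(predictions: List[Dict[str, Any]], threshold: float = 0.90) -> List[str]:
    """Keep the predicted label when the score clears the threshold, else 'Ambiguous'."""
    return [
        p["label"] if p["score"] >= threshold else "Ambiguous"
        for p in predictions
    ]


# Hand-written example predictions (not real model output).
example = [
    {"label": "Person", "score": 0.998},
    {"label": "Company", "score": 0.62},
]
print(flag_ambiguous(example))  # ['Person', 'Ambiguous']
```

Names flagged this way can then be resolved with surrounding context or manual review; a suitable threshold would need to be tuned on data representative of the target use case.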


	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 8e-06
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: AdamW, fused PyTorch implementation (`OptimizerNames.ADAMW_TORCH_FUSED`), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 2000
	- num_epochs: 1
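
For readers who want to reproduce a comparable setup, the hyperparameters above map onto `transformers.TrainingArguments` keyword arguments roughly as shown below. This is a sketch under stated assumptions, not the author's training script: it assumes a single device (so the reported batch size of 64 is the per-device size), and the total step count is an estimate derived from the last row of the training results table (step 98000 at epoch 0.9956). The `linear_lr` helper is a hypothetical illustration of a linear schedule with warmup.

```python
# Keyword arguments intended for transformers.TrainingArguments(**training_kwargs),
# typically together with an output_dir of your choice.
training_kwargs = {
    "learning_rate": 8e-6,
    "per_device_train_batch_size": 64,  # assumption: one device
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 2000,
    "num_train_epochs": 1,
}

# Estimated, not reported: step 98000 corresponds to epoch 0.9956 in the results table.
TOTAL_STEPS = round(98000 / 0.9956)


def linear_lr(step: int, base_lr: float = 8e-6, warmup: int = 2000,
              total: int = TOTAL_STEPS) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


# Print the rate at the start, mid-warmup, peak, and end of training.
for step in (0, 1000, 2000, TOTAL_STEPS):
    print(step, linear_lr(step))
```

With these settings the learning rate climbs from zero to its 8e-06 peak over the first 2000 steps, then decays linearly to zero by the end of the single epoch.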

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
	|:-------------:|:------:|:-----:|:---------------:|:--------:|:---------:|:------:|:------:|
	| 0.0592 | 0.0203 | 2000 | 0.0526 | 0.9877 | 0.9912 | 0.9872 | 0.9892 |
	| 0.0473 | 0.0406 | 4000 | 0.0429 | 0.9891 | 0.9940 | 0.9868 | 0.9904 |
	| 0.0491 | 0.0610 | 6000 | 0.0407 | 0.9893 | 0.9949 | 0.9863 | 0.9906 |
	| 0.0383 | 0.0813 | 8000 | 0.0386 | 0.9898 | 0.9954 | 0.9868 | 0.9911 |
	| 0.0415 | 0.1016 | 10000 | 0.0378 | 0.9904 | 0.9950 | 0.9881 | 0.9915 |
	| 0.0315 | 0.1219 | 12000 | 0.0410 | 0.9905 | 0.9955 | 0.9877 | 0.9916 |
	| 0.0416 | 0.1422 | 14000 | 0.0387 | 0.9908 | 0.9950 | 0.9888 | 0.9919 |
	| 0.0292 | 0.1625 | 16000 | 0.0383 | 0.9908 | 0.9964 | 0.9874 | 0.9919 |
	| 0.0381 | 0.1829 | 18000 | 0.0357 | 0.9907 | 0.9959 | 0.9878 | 0.9918 |
	| 0.0266 | 0.2032 | 20000 | 0.0395 | 0.9909 | 0.9938 | 0.9902 | 0.9920 |
	| 0.035 | 0.2235 | 22000 | 0.0392 | 0.9909 | 0.9956 | 0.9885 | 0.9920 |
	| 0.0333 | 0.2438 | 24000 | 0.0356 | 0.9910 | 0.9935 | 0.9907 | 0.9921 |
	| 0.0321 | 0.2641 | 26000 | 0.0343 | 0.9909 | 0.9947 | 0.9894 | 0.9920 |
	| 0.0308 | 0.2845 | 28000 | 0.0360 | 0.9912 | 0.9954 | 0.9892 | 0.9923 |
	| 0.0317 | 0.3048 | 30000 | 0.0348 | 0.9912 | 0.9941 | 0.9905 | 0.9923 |
	| 0.0359 | 0.3251 | 32000 | 0.0346 | 0.9913 | 0.9959 | 0.9889 | 0.9924 |
	| 0.0437 | 0.3454 | 34000 | 0.0333 | 0.9912 | 0.9957 | 0.9889 | 0.9923 |
	| 0.0401 | 0.3657 | 36000 | 0.0334 | 0.9914 | 0.9954 | 0.9895 | 0.9924 |
	| 0.0419 | 0.3861 | 38000 | 0.0321 | 0.9915 | 0.9957 | 0.9895 | 0.9926 |
	| 0.032 | 0.4064 | 40000 | 0.0339 | 0.9914 | 0.9947 | 0.9902 | 0.9925 |
	| 0.0367 | 0.4267 | 42000 | 0.0314 | 0.9916 | 0.9948 | 0.9904 | 0.9926 |
	| 0.0276 | 0.4470 | 44000 | 0.0355 | 0.9915 | 0.9954 | 0.9897 | 0.9925 |
	| 0.0373 | 0.4673 | 46000 | 0.0321 | 0.9916 | 0.9954 | 0.9899 | 0.9926 |
	| 0.0364 | 0.4876 | 48000 | 0.0327 | 0.9915 | 0.9966 | 0.9885 | 0.9925 |
	| 0.0317 | 0.5080 | 50000 | 0.0311 | 0.9914 | 0.9934 | 0.9915 | 0.9924 |
	| 0.0355 | 0.5283 | 52000 | 0.0307 | 0.9917 | 0.9957 | 0.9898 | 0.9927 |
	| 0.0276 | 0.5486 | 54000 | 0.0321 | 0.9918 | 0.9952 | 0.9904 | 0.9928 |
	| 0.0342 | 0.5689 | 56000 | 0.0319 | 0.9918 | 0.9956 | 0.9900 | 0.9928 |
	| 0.0316 | 0.5892 | 58000 | 0.0314 | 0.9918 | 0.9949 | 0.9906 | 0.9928 |
	| 0.0322 | 0.6096 | 60000 | 0.0315 | 0.9916 | 0.9942 | 0.9912 | 0.9927 |
	| 0.0357 | 0.6299 | 62000 | 0.0309 | 0.9921 | 0.9955 | 0.9905 | 0.9930 |
	| 0.0296 | 0.6502 | 64000 | 0.0326 | 0.9919 | 0.9955 | 0.9903 | 0.9929 |
	| 0.0324 | 0.6705 | 66000 | 0.0312 | 0.9919 | 0.9958 | 0.9900 | 0.9929 |
	| 0.0266 | 0.6908 | 68000 | 0.0319 | 0.9920 | 0.9958 | 0.9902 | 0.9930 |
	| 0.028 | 0.7112 | 70000 | 0.0321 | 0.9920 | 0.9961 | 0.9899 | 0.9930 |
	| 0.0276 | 0.7315 | 72000 | 0.0319 | 0.9919 | 0.9963 | 0.9895 | 0.9929 |
	| 0.0288 | 0.7518 | 74000 | 0.0316 | 0.9920 | 0.9952 | 0.9908 | 0.9930 |
	| 0.0295 | 0.7721 | 76000 | 0.0304 | 0.9920 | 0.9955 | 0.9904 | 0.9930 |
	| 0.0305 | 0.7924 | 78000 | 0.0309 | 0.9920 | 0.9963 | 0.9896 | 0.9929 |
	| 0.0298 | 0.8127 | 80000 | 0.0312 | 0.9921 | 0.9962 | 0.9899 | 0.9930 |
	| 0.0241 | 0.8331 | 82000 | 0.0312 | 0.9921 | 0.9954 | 0.9907 | 0.9930 |
	| 0.0332 | 0.8534 | 84000 | 0.0308 | 0.9920 | 0.9955 | 0.9906 | 0.9930 |
	| 0.0281 | 0.8737 | 86000 | 0.0301 | 0.9922 | 0.9957 | 0.9905 | 0.9931 |
	| 0.0274 | 0.8940 | 88000 | 0.0305 | 0.9921 | 0.9952 | 0.9908 | 0.9930 |
	| 0.0263 | 0.9143 | 90000 | 0.0300 | 0.9922 | 0.9958 | 0.9905 | 0.9931 |
	| 0.0215 | 0.9347 | 92000 | 0.0304 | 0.9921 | 0.9952 | 0.9909 | 0.9931 |
	| 0.0367 | 0.9550 | 94000 | 0.0297 | 0.9922 | 0.9956 | 0.9907 | 0.9931 |
	| 0.0298 | 0.9753 | 96000 | 0.0302 | 0.9922 | 0.9955 | 0.9908 | 0.9931 |
	| 0.0202 | 0.9956 | 98000 | 0.0305 | 0.9922 | 0.9957 | 0.9906 | 0.9931 |


	### Framework versions

	- Transformers 4.57.1
	- PyTorch 2.9.0+cu128
	- Datasets 4.4.1
	- Tokenizers 0.22.1