# a0fb5bd9589673e8b2b9f4cc46cb7532
This model is a fine-tuned version of albert/albert-base-v1 on the nyu-mll/glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.5031
- Data Size: 1.0
- Epoch Runtime: 6.8837
- MSE: 0.5033
- MAE: 0.5415
- R2: 0.7749
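The MSE, MAE, and R2 values above are standard regression metrics. As a minimal sketch (assumed for illustration, not taken from the actual evaluation script), they can be computed from scalar predictions like this:

```python
import numpy as np

def regression_metrics(preds, labels):
    """Compute MSE, MAE, and R2 for scalar regression predictions."""
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    mse = float(np.mean((preds - labels) ** 2))
    mae = float(np.mean(np.abs(preds - labels)))
    # R2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = float(np.sum((labels - preds) ** 2))
    ss_tot = float(np.sum((labels - np.mean(labels)) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "mae": mae, "r2": r2}
```

Note that R2 can be negative (as in the first epochs of the table below) when the model predicts worse than the label mean.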
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
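As a hedged illustration (the training script itself is not included in this card), the list above corresponds to the usual `transformers.TrainingArguments` keyword names; the sketch below mirrors the values as a plain dict and derives the effective batch sizes:

```python
# Hypothetical mapping of the hyperparameters above onto
# transformers.TrainingArguments keyword names (assumed, since the
# actual training script is not part of this card).
training_kwargs = dict(
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",  # AdamW with betas=(0.9, 0.999), eps=1e-8
)

# With 4 devices (distributed_type: multi-GPU), the total batch sizes are
# the per-device values multiplied by the device count:
num_devices = 4
total_train_batch_size = training_kwargs["per_device_train_batch_size"] * num_devices
total_eval_batch_size = training_kwargs["per_device_eval_batch_size"] * num_devices
print(total_train_batch_size, total_eval_batch_size)  # 32 32
```

This matches the total_train_batch_size and total_eval_batch_size of 32 reported above.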
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | MSE | MAE | R2 |
|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 6.4215 | 0 | 1.0263 | 6.4227 | 2.1112 | -1.8731 |
| No log | 1 | 179 | 2.9569 | 0.0078 | 1.3420 | 2.9579 | 1.4588 | -0.3232 |
| No log | 2 | 358 | 2.6181 | 0.0156 | 1.1635 | 2.6188 | 1.3277 | -0.1715 |
| No log | 3 | 537 | 2.1869 | 0.0312 | 1.3046 | 2.1877 | 1.2718 | 0.0214 |
| No log | 4 | 716 | 1.9238 | 0.0625 | 1.5139 | 1.9242 | 1.1393 | 0.1392 |
| No log | 5 | 895 | 1.1499 | 0.125 | 1.9057 | 1.1502 | 0.8530 | 0.4855 |
| 0.1031 | 6 | 1074 | 1.0166 | 0.25 | 2.6710 | 1.0172 | 0.7894 | 0.5450 |
| 0.703 | 7 | 1253 | 0.6153 | 0.5 | 4.0954 | 0.6156 | 0.6150 | 0.7246 |
| 0.5178 | 8 | 1432 | 0.5484 | 1.0 | 7.2831 | 0.5485 | 0.5788 | 0.7546 |
| 0.4075 | 9 | 1611 | 0.5447 | 1.0 | 7.1745 | 0.5450 | 0.5561 | 0.7562 |
| 0.2893 | 10 | 1790 | 0.5207 | 1.0 | 7.1285 | 0.5208 | 0.5477 | 0.7670 |
| 0.2434 | 11 | 1969 | 0.5114 | 1.0 | 7.0334 | 0.5116 | 0.5470 | 0.7711 |
| 0.203 | 12 | 2148 | 0.4930 | 1.0 | 7.0510 | 0.4932 | 0.5373 | 0.7794 |
| 0.1697 | 13 | 2327 | 0.5077 | 1.0 | 6.9619 | 0.5079 | 0.5490 | 0.7728 |
| 0.142 | 14 | 2506 | 0.5123 | 1.0 | 6.9411 | 0.5124 | 0.5403 | 0.7708 |
| 0.1304 | 15 | 2685 | 0.5690 | 1.0 | 6.8833 | 0.5693 | 0.5855 | 0.7453 |
| 0.1063 | 16 | 2864 | 0.5031 | 1.0 | 6.8837 | 0.5033 | 0.5415 | 0.7749 |
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.1