# flan-t5-rouge-durga-q5-clean-4f
This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.0021
- Rouge1: 0.7371
- Rouge2: 0.7114
- RougeL: 0.7373
- RougeLsum: 0.7377
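The card does not include usage code; below is a minimal inference sketch using the standard Transformers seq2seq API. The prompt string is purely illustrative, since the task and prompt format are not documented here.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "devagonal/flan-t5-rouge-durga-q5-clean-4f"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative prompt only; the actual task/prompt format is undocumented.
inputs = tokenizer("Who is Durga?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```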
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 60
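The exact training script is not provided; the sketch below shows how these values map onto Seq2SeqTrainingArguments. The output_dir, the per-epoch evaluation cadence, and predict_with_generate are assumptions, not taken from the card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameters listed above. output_dir,
# eval_strategy, and predict_with_generate are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-durga-q5-clean-4f",
    learning_rate=3e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60,
    eval_strategy="epoch",        # metrics in the table are reported per epoch
    predict_with_generate=True,   # required to compute ROUGE on generated text
)
```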
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
---|---|---|---|---|---|---|---|
2.0584 | 1.0 | 9 | 1.6093 | 0.2821 | 0.0871 | 0.2751 | 0.2760 |
1.9958 | 2.0 | 18 | 1.1569 | 0.3267 | 0.1036 | 0.3184 | 0.3195 |
1.174 | 3.0 | 27 | 0.8836 | 0.3765 | 0.1667 | 0.3660 | 0.3668 |
1.1673 | 4.0 | 36 | 0.6420 | 0.3653 | 0.1586 | 0.3574 | 0.3582 |
1.0302 | 5.0 | 45 | 0.4727 | 0.3987 | 0.2228 | 0.3942 | 0.3944 |
0.6135 | 6.0 | 54 | 0.3187 | 0.4170 | 0.2446 | 0.4106 | 0.4112 |
0.5838 | 7.0 | 63 | 0.2294 | 0.4530 | 0.2996 | 0.4465 | 0.4472 |
0.4479 | 8.0 | 72 | 0.1891 | 0.4614 | 0.3185 | 0.4572 | 0.4574 |
0.3936 | 9.0 | 81 | 0.1373 | 0.4651 | 0.3179 | 0.4619 | 0.4622 |
0.3307 | 10.0 | 90 | 0.1073 | 0.5070 | 0.3895 | 0.5066 | 0.5076 |
0.3624 | 11.0 | 99 | 0.0845 | 0.5060 | 0.3903 | 0.5062 | 0.5063 |
0.1817 | 12.0 | 108 | 0.0702 | 0.5443 | 0.4428 | 0.5447 | 0.5450 |
0.2335 | 13.0 | 117 | 0.0705 | 0.5125 | 0.4081 | 0.5116 | 0.5119 |
0.1604 | 14.0 | 126 | 0.0650 | 0.5452 | 0.4443 | 0.5461 | 0.5451 |
0.1306 | 15.0 | 135 | 0.0540 | 0.5463 | 0.4521 | 0.5474 | 0.5474 |
0.1194 | 16.0 | 144 | 0.0489 | 0.5922 | 0.5120 | 0.5932 | 0.5917 |
0.2133 | 17.0 | 153 | 0.0441 | 0.5739 | 0.4873 | 0.5728 | 0.5737 |
0.1035 | 18.0 | 162 | 0.0425 | 0.5791 | 0.4981 | 0.5784 | 0.5789 |
0.1049 | 19.0 | 171 | 0.0333 | 0.6326 | 0.5635 | 0.6334 | 0.6332 |
0.1165 | 20.0 | 180 | 0.0287 | 0.6387 | 0.5769 | 0.6380 | 0.6388 |
0.1197 | 21.0 | 189 | 0.0300 | 0.5980 | 0.5240 | 0.5990 | 0.5998 |
0.0607 | 22.0 | 198 | 0.0245 | 0.6445 | 0.5833 | 0.6455 | 0.6451 |
0.1443 | 23.0 | 207 | 0.0238 | 0.6438 | 0.5828 | 0.6456 | 0.6462 |
0.0727 | 24.0 | 216 | 0.0188 | 0.6747 | 0.6253 | 0.6774 | 0.6764 |
0.0462 | 25.0 | 225 | 0.0177 | 0.6914 | 0.6391 | 0.6921 | 0.6912 |
0.0804 | 26.0 | 234 | 0.0132 | 0.6967 | 0.6520 | 0.6967 | 0.6985 |
0.0337 | 27.0 | 243 | 0.0135 | 0.6955 | 0.6475 | 0.6961 | 0.6961 |
0.0459 | 28.0 | 252 | 0.0131 | 0.7002 | 0.6584 | 0.7019 | 0.7020 |
0.0233 | 29.0 | 261 | 0.0102 | 0.7074 | 0.6665 | 0.7080 | 0.7095 |
0.0228 | 30.0 | 270 | 0.0112 | 0.7040 | 0.6644 | 0.7044 | 0.7052 |
0.0435 | 31.0 | 279 | 0.0080 | 0.7115 | 0.6724 | 0.7119 | 0.7123 |
0.0364 | 32.0 | 288 | 0.0114 | 0.7082 | 0.6666 | 0.7100 | 0.7095 |
0.0112 | 33.0 | 297 | 0.0086 | 0.7165 | 0.6787 | 0.7177 | 0.7174 |
0.0325 | 34.0 | 306 | 0.0068 | 0.7251 | 0.6931 | 0.7262 | 0.7262 |
0.0173 | 35.0 | 315 | 0.0052 | 0.7310 | 0.7015 | 0.7315 | 0.7319 |
0.0599 | 36.0 | 324 | 0.0058 | 0.7276 | 0.6972 | 0.7289 | 0.7291 |
0.0125 | 37.0 | 333 | 0.0044 | 0.7328 | 0.7057 | 0.7331 | 0.7332 |
0.0155 | 38.0 | 342 | 0.0054 | 0.7218 | 0.6882 | 0.7227 | 0.7234 |
0.0199 | 39.0 | 351 | 0.0050 | 0.7275 | 0.6965 | 0.7287 | 0.7292 |
0.0109 | 40.0 | 360 | 0.0035 | 0.7334 | 0.7064 | 0.7339 | 0.7347 |
0.0229 | 41.0 | 369 | 0.0034 | 0.7334 | 0.7064 | 0.7339 | 0.7347 |
0.0353 | 42.0 | 378 | 0.0033 | 0.7334 | 0.7064 | 0.7339 | 0.7347 |
0.0124 | 43.0 | 387 | 0.0035 | 0.7352 | 0.7084 | 0.7357 | 0.7354 |
0.0147 | 44.0 | 396 | 0.0033 | 0.7319 | 0.7036 | 0.7322 | 0.7327 |
0.0055 | 45.0 | 405 | 0.0032 | 0.7310 | 0.7026 | 0.7312 | 0.7320 |
0.0183 | 46.0 | 414 | 0.0031 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.004 | 47.0 | 423 | 0.0033 | 0.7342 | 0.7067 | 0.7344 | 0.7349 |
0.0195 | 48.0 | 432 | 0.0032 | 0.7311 | 0.7018 | 0.7318 | 0.7323 |
0.0112 | 49.0 | 441 | 0.0031 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.0186 | 50.0 | 450 | 0.0029 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.0043 | 51.0 | 459 | 0.0028 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.011 | 52.0 | 468 | 0.0023 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.0203 | 53.0 | 477 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.0099 | 54.0 | 486 | 0.0021 | 0.7367 | 0.7113 | 0.7367 | 0.7376 |
0.0095 | 55.0 | 495 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.021 | 56.0 | 504 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.0191 | 57.0 | 513 | 0.0022 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.0033 | 58.0 | 522 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.0264 | 59.0 | 531 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
0.0034 | 60.0 | 540 | 0.0021 | 0.7371 | 0.7114 | 0.7373 | 0.7377 |
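The card does not show how the ROUGE columns were computed. A typical compute_metrics hook built on the evaluate library produces exactly the rouge1/rouge2/rougeL/rougeLsum keys reported above; the version below is a sketch under that assumption, reusing the tokenizer from the inference snippet.

```python
import evaluate
import numpy as np

rouge = evaluate.load("rouge")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Labels are padded with -100 for loss masking; restore pad ids before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    return rouge.compute(predictions=decoded_preds, references=decoded_labels)
```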
### Framework versions
- Transformers 4.46.2
- Pytorch 2.5.0+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3