---
library_name: transformers
license: apache-2.0
base_model: google/flan-t5-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: flan-t5-rouge-squad-qg-90
  results: []
---

# flan-t5-rouge-squad-qg-90

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2170
- Rouge1: 0.2213
- Rouge2: 0.0920
- Rougel: 0.2137
- Rougelsum: 0.2180

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 90

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 21.488 | 1.0 | 3 | 20.7368 | 0.2042 | 0.1174 | 0.2047 | 0.2051 |
| 10.791 | 2.0 | 6 | 5.6741 | 0.2934 | 0.2197 | 0.2950 | 0.2944 |
| 4.4366 | 3.0 | 9 | 4.3138 | 0.2532 | 0.1159 | 0.2031 | 0.2057 |
| 4.0326 | 4.0 | 12 | 3.8146 | 0.2934 | 0.2197 | 0.2950 | 0.2944 |
| 3.325 | 5.0 | 15 | 2.8803 | 0.2934 | 0.2197 | 0.2950 | 0.2944 |
| 2.3167 | 6.0 | 18 | 1.2409 | 0.2934 | 0.2197 | 0.2950 | 0.2944 |
| 1.5534 | 7.0 | 21 | 0.7715 | 0.2934 | 0.2197 | 0.2950 | 0.2944 |
| 1.0706 | 8.0 | 24 | 0.4026 | 0.2934 | 0.2197 | 0.2950 | 0.2944 |
| 0.6125 | 9.0 | 27 | 0.2829 | 0.2934 | 0.2197 | 0.2950 | 0.2944 |
| 0.4704 | 10.0 | 30 | 0.2124 | 0.2934 | 0.2197 | 0.2950 | 0.2944 |
| 0.3628 | 11.0 | 33 | 0.1742 | 0.2558 | 0.1184 | 0.2018 | 0.2043 |
| 0.1192 | 12.0 | 36 | 0.1349 | 0.4942 | 0.1956 | 0.4725 | 0.4879 |
| 0.1087 | 13.0 | 39 | 0.1188 | 0.4942 | 0.1956 | 0.4725 | 0.4879 |
| 0.2558 | 14.0 | 42 | 0.1191 | 0.4942 | 0.1956 | 0.4725 | 0.4879 |
| 0.0827 | 15.0 | 45 | 0.1235 | 0.4942 | 0.1956 | 0.4725 | 0.4879 |
| 0.2771 | 16.0 | 48 | 0.1274 | 0.4321 | 0.0736 | 0.3182 | 0.3811 |
| 0.1262 | 17.0 | 51 | 0.1296 | 0.4321 | 0.0736 | 0.3182 | 0.3811 |
| 0.1077 | 18.0 | 54 | 0.1318 | 0.4531 | 0.0977 | 0.3336 | 0.4069 |
| 0.1027 | 19.0 | 57 | 0.1348 | 0.4531 | 0.0977 | 0.3336 | 0.4069 |
| 0.0681 | 20.0 | 60 | 0.1410 | 0.4531 | 0.0977 | 0.3336 | 0.4069 |
| 0.0911 | 21.0 | 63 | 0.1481 | 0.4717 | 0.1557 | 0.4089 | 0.4485 |
| 0.1843 | 22.0 | 66 | 0.1520 | 0.4717 | 0.1557 | 0.4089 | 0.4485 |
| 0.1799 | 23.0 | 69 | 0.1516 | 0.4717 | 0.1557 | 0.4089 | 0.4485 |
| 0.0582 | 24.0 | 72 | 0.1513 | 0.3301 | 0.0907 | 0.2903 | 0.3101 |
| 0.0346 | 25.0 | 75 | 0.1528 | 0.3301 | 0.0907 | 0.2903 | 0.3101 |
| 0.0802 | 26.0 | 78 | 0.1574 | 0.4531 | 0.0977 | 0.3336 | 0.4069 |
| 0.0354 | 27.0 | 81 | 0.1645 | 0.3962 | 0.0936 | 0.3291 | 0.3680 |
| 0.0278 | 28.0 | 84 | 0.1729 | 0.4321 | 0.0736 | 0.3182 | 0.3811 |
| 0.0167 | 29.0 | 87 | 0.1828 | 0.4321 | 0.0736 | 0.3182 | 0.3811 |
| 0.0119 | 30.0 | 90 | 0.1898 | 0.4321 | 0.0736 | 0.3182 | 0.3811 |
| 0.0255 | 31.0 | 93 | 0.1940 | 0.4321 | 0.0736 | 0.3182 | 0.3811 |
| 0.0554 | 32.0 | 96 | 0.1910 | 0.2851 | 0.0990 | 0.2713 | 0.2719 |
| 0.0448 | 33.0 | 99 | 0.1836 | 0.2756 | 0.0907 | 0.2671 | 0.2706 |
| 0.0305 | 34.0 | 102 | 0.1771 | 0.3301 | 0.0907 | 0.2903 | 0.3101 |
| 0.0353 | 35.0 | 105 | 0.1756 | 0.3301 | 0.0907 | 0.2903 | 0.3101 |
| 0.0348 | 36.0 | 108 | 0.1781 | 0.3301 | 0.0907 | 0.2903 | 0.3101 |
| 0.0394 | 37.0 | 111 | 0.1823 | 0.2756 | 0.0907 | 0.2671 | 0.2706 |
| 0.0259 | 38.0 | 114 | 0.1861 | 0.2756 | 0.0907 | 0.2671 | 0.2706 |
| 0.024 | 39.0 | 117 | 0.1905 | 0.2756 | 0.0907 | 0.2671 | 0.2706 |
| 0.0148 | 40.0 | 120 | 0.1938 | 0.2756 | 0.0907 | 0.2671 | 0.2706 |
| 0.0385 | 41.0 | 123 | 0.1953 | 0.2756 | 0.0907 | 0.2671 | 0.2706 |
| 0.0238 | 42.0 | 126 | 0.1976 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0176 | 43.0 | 129 | 0.2005 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0235 | 44.0 | 132 | 0.2023 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0229 | 45.0 | 135 | 0.2050 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0254 | 46.0 | 138 | 0.2066 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0269 | 47.0 | 141 | 0.2081 | 0.3582 | 0.0653 | 0.2956 | 0.3260 |
| 0.0193 | 48.0 | 144 | 0.2093 | 0.3582 | 0.0653 | 0.2956 | 0.3260 |
| 0.0326 | 49.0 | 147 | 0.2107 | 0.3612 | 0.0784 | 0.3019 | 0.3324 |
| 0.0214 | 50.0 | 150 | 0.2109 | 0.3612 | 0.0784 | 0.3019 | 0.3324 |
| 0.0223 | 51.0 | 153 | 0.2110 | 0.3612 | 0.0784 | 0.3019 | 0.3324 |
| 0.0293 | 52.0 | 156 | 0.2113 | 0.3612 | 0.0784 | 0.3019 | 0.3324 |
| 0.0338 | 53.0 | 159 | 0.2117 | 0.4027 | 0.0936 | 0.3320 | 0.3588 |
| 0.0487 | 54.0 | 162 | 0.2092 | 0.4027 | 0.0936 | 0.3320 | 0.3588 |
| 0.0166 | 55.0 | 165 | 0.2048 | 0.4027 | 0.0936 | 0.3320 | 0.3588 |
| 0.0362 | 56.0 | 168 | 0.2031 | 0.4027 | 0.0936 | 0.3320 | 0.3588 |
| 0.0285 | 57.0 | 171 | 0.2036 | 0.4027 | 0.0936 | 0.3320 | 0.3588 |
| 0.0169 | 58.0 | 174 | 0.2050 | 0.4027 | 0.0936 | 0.3320 | 0.3588 |
| 0.0269 | 59.0 | 177 | 0.2056 | 0.4027 | 0.0936 | 0.3320 | 0.3588 |
| 0.0301 | 60.0 | 180 | 0.2068 | 0.4027 | 0.0936 | 0.3320 | 0.3588 |
| 0.0203 | 61.0 | 183 | 0.2084 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0243 | 62.0 | 186 | 0.2105 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0247 | 63.0 | 189 | 0.2132 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0175 | 64.0 | 192 | 0.2168 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0166 | 65.0 | 195 | 0.2203 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.033 | 66.0 | 198 | 0.2213 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0221 | 67.0 | 201 | 0.2211 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0377 | 68.0 | 204 | 0.2211 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0224 | 69.0 | 207 | 0.2195 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0173 | 70.0 | 210 | 0.2190 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0153 | 71.0 | 213 | 0.2191 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0142 | 72.0 | 216 | 0.2196 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0136 | 73.0 | 219 | 0.2199 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0294 | 74.0 | 222 | 0.2194 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0143 | 75.0 | 225 | 0.2180 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0112 | 76.0 | 228 | 0.2168 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0136 | 77.0 | 231 | 0.2165 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0219 | 78.0 | 234 | 0.2159 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0135 | 79.0 | 237 | 0.2157 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0183 | 80.0 | 240 | 0.2158 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0245 | 81.0 | 243 | 0.2157 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0192 | 82.0 | 246 | 0.2155 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0208 | 83.0 | 249 | 0.2157 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0266 | 84.0 | 252 | 0.2159 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0168 | 85.0 | 255 | 0.2162 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0154 | 86.0 | 258 | 0.2164 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0307 | 87.0 | 261 | 0.2166 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0136 | 88.0 | 264 | 0.2168 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0148 | 89.0 | 267 | 0.2169 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |
| 0.0182 | 90.0 | 270 | 0.2170 | 0.2213 | 0.0920 | 0.2137 | 0.2180 |

### Framework versions

- Transformers 4.47.1
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0
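
As a rough aid to reading the hyperparameters and training results reported in this card, the sketch below recomputes the learning rate at a given step under a linear decay schedule and picks the row with the lowest validation loss from a handful of rows copied from the results table. It assumes zero warmup steps (the card lists none) and a decay to zero over the 270 logged steps. `linear_lr` and `best_by_val_loss` are illustrative helper names, not part of Transformers.

```python
# Values taken from the card: learning_rate and the final step in the results table.
BASE_LR = 3e-4
TOTAL_STEPS = 270  # 90 epochs x 3 optimizer steps per epoch, per the table


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# A few (epoch, step, validation_loss, rouge1) rows copied from the results table.
ROWS = [
    (1, 3, 20.7368, 0.2042),
    (10, 30, 0.2124, 0.2934),
    (13, 39, 0.1188, 0.4942),
    (30, 90, 0.1898, 0.4321),
    (60, 180, 0.2068, 0.4027),
    (90, 270, 0.2170, 0.2213),
]


def best_by_val_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    for step in (0, 135, 270):
        print(f"step {step:3d}: lr = {linear_lr(step):.6f}")
    epoch, step, val_loss, rouge1 = best_by_val_loss(ROWS)
    print(f"lowest validation loss in sample: epoch {epoch}, step {step}, "
          f"loss {val_loss}, Rouge1 {rouge1}")
```

Among the sampled rows, epoch 13 has the lowest validation loss (0.1188), while the final checkpoint at epoch 90 reports 0.2170. Whether an earlier checkpoint would be preferable depends on how the evaluation set was built, which the card does not state.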