# flan-context

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.3807
- Rouge1: 0.2336
- Rouge2: 0.0804
- RougeL: 0.1999
- RougeLsum: 0.2006
- Bleu: 0.0285 (n-gram precisions: 0.3576 / 0.1120 / 0.0532 / 0.0292; brevity penalty: 0.3205; length ratio: 0.4678; translation length: 3420; reference length: 7311)
- Bertscore Precision: 0.8820
- Bertscore Recall: 0.8608
- Bertscore F1: 0.8712
- Meteor: 0.1601
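
The card does not include a usage snippet, so here is a minimal inference sketch. The repository id `zera09/flan-context` is taken from this card's model tree; the prompt and generation settings are illustrative assumptions.

```python
# Minimal inference sketch; the prompt and max_new_tokens are illustrative.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "zera09/flan-context"  # repository id from the model tree
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# FLAN-T5 checkpoints are text-to-text, so any instruction-style prompt works.
inputs = tokenizer(
    "Summarize: The quick brown fox jumps over the lazy dog.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```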
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 6
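
For reference, a sketch of how these settings map onto `Seq2SeqTrainingArguments` in Transformers 4.46. Only the values listed above come from the card; the output directory, evaluation strategy, and `predict_with_generate` are assumptions.

```python
# Sketch of the training configuration implied by the hyperparameters above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-context",      # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,  # effective train batch size: 8
    num_train_epochs=6,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    seed=42,
    eval_strategy="epoch",          # assumed; the results table logs metrics once per epoch
    predict_with_generate=True,     # assumed; needed to score generated text during eval
)
```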
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Bertscore Precision | Bertscore Recall | Bertscore F1 | Meteor |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 2.9042 | 0.9973 | 188 | 2.4777 | 0.2103 | 0.0751 | 0.1811 | 0.1813 | 0.0205 | 0.8919 | 0.8583 | 0.8746 | 0.1485 |
| 2.5901 | 2.0 | 377 | 2.4190 | 0.2300 | 0.0792 | 0.1958 | 0.1961 | 0.0260 | 0.8904 | 0.8618 | 0.8758 | 0.1596 |
| 2.4323 | 2.9973 | 565 | 2.3877 | 0.2249 | 0.0743 | 0.1937 | 0.1940 | 0.0251 | 0.8848 | 0.8595 | 0.8719 | 0.1603 |
| 2.2934 | 4.0 | 754 | 2.3785 | 0.2314 | 0.0763 | 0.1978 | 0.1979 | 0.0246 | 0.8832 | 0.8599 | 0.8713 | 0.1591 |
| 2.2147 | 4.9973 | 942 | 2.3790 | 0.2376 | 0.0814 | 0.2002 | 0.2010 | 0.0288 | 0.8848 | 0.8623 | 0.8733 | 0.1639 |
| 2.156 | 5.9841 | 1128 | 2.3807 | 0.2336 | 0.0804 | 0.1999 | 0.2006 | 0.0285 | 0.8820 | 0.8608 | 0.8712 | 0.1601 |
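
The metrics in the table can be reproduced with the `evaluate` library; a minimal sketch, where `predictions` and `references` are placeholders for decoded model outputs and gold targets:

```python
# Computing the evaluation metrics above with the `evaluate` library.
import evaluate

predictions = ["a generated answer"]  # placeholder for decoded model outputs
references = ["the gold answer"]      # placeholder for reference texts

rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
bertscore = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="en"
)
meteor = evaluate.load("meteor").compute(predictions=predictions, references=references)

print(rouge)                                        # rouge1 / rouge2 / rougeL / rougeLsum
print(bleu["bleu"], bleu["brevity_penalty"])        # corpus BLEU and brevity penalty
print(sum(bertscore["f1"]) / len(bertscore["f1"]))  # BERTScore F1, averaged over examples
print(meteor["meteor"])
```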
### Framework versions
- Transformers 4.46.3
- Pytorch 2.4.1+cu121
- Datasets 2.20.0
- Tokenizers 0.20.3