poison-distill

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: -113.4181
  • Accuracy: 0.6917

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
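With a linear scheduler and no warmup (the Trainer default when no warmup is set), the learning rate decays from 5e-05 to 0 over the full run. Assuming 130 optimizer steps per epoch, as the results table below suggests (step 6500 at epoch 50), the schedule can be sketched as:

```python
def linear_lr(step, total_steps=6500, base_lr=5e-05):
    """Linear decay from base_lr to 0 over total_steps (no warmup assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))     # base learning rate at the first step
print(linear_lr(3250))  # half the base rate at the midpoint
print(linear_lr(6500))  # decayed to zero at the final step
```

The `total_steps` value is inferred from the training log, not stated explicitly in the card.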

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| -0.7533       | 1.0   | 130  | -7.9549         | 0.5263   |
| -8.2193       | 2.0   | 260  | -15.5418        | 0.4662   |
| -14.3197      | 3.0   | 390  | -32.2167        | 0.4737   |
| -18.5547      | 4.0   | 520  | -18.9202        | 0.5489   |
| -22.6905      | 5.0   | 650  | -55.1682        | 0.4361   |
| -27.5336      | 6.0   | 780  | -32.4679        | 0.3459   |
| -29.5975      | 7.0   | 910  | -48.1715        | 0.3985   |
| -34.1837      | 8.0   | 1040 | -67.7293        | 0.6165   |
| -37.6123      | 9.0   | 1170 | -52.1341        | 0.4662   |
| -40.7694      | 10.0  | 1300 | -49.0945        | 0.6767   |
| -43.3691      | 11.0  | 1430 | -37.0478        | 0.5489   |
| -47.6433      | 12.0  | 1560 | -73.0523        | 0.4511   |
| -51.0141      | 13.0  | 1690 | -110.8840       | 0.4812   |
| -54.6         | 14.0  | 1820 | -81.2219        | 0.3308   |
| -57.2133      | 15.0  | 1950 | -80.8684        | 0.5113   |
| -58.3442      | 16.0  | 2080 | -66.5341        | 0.4060   |
| -64.7089      | 17.0  | 2210 | -75.7059        | 0.5564   |
| -64.26        | 18.0  | 2340 | -77.7801        | 0.5263   |
| -67.8509      | 19.0  | 2470 | -61.1841        | 0.6316   |
| -71.9371      | 20.0  | 2600 | -118.1544       | 0.5038   |
| -75.9672      | 21.0  | 2730 | -179.2044       | 0.4812   |
| -78.0096      | 22.0  | 2860 | -129.4854       | 0.4436   |
| -80.3581      | 23.0  | 2990 | -100.0687       | 0.4286   |
| -84.623       | 24.0  | 3120 | -82.5292        | 0.3835   |
| -86.5363      | 25.0  | 3250 | -84.6636        | 0.4211   |
| -90.8566      | 26.0  | 3380 | -96.3337        | 0.5489   |
| -92.2054      | 27.0  | 3510 | -110.3293       | 0.4737   |
| -97.6982      | 28.0  | 3640 | -195.6973       | 0.4135   |
| -95.8944      | 29.0  | 3770 | -101.9933       | 0.3609   |
| -99.491       | 30.0  | 3900 | -99.8199        | 0.6541   |
| -103.0877     | 31.0  | 4030 | -94.2175        | 0.6767   |
| -102.7123     | 32.0  | 4160 | -98.6300        | 0.4887   |
| -105.2087     | 33.0  | 4290 | -152.7768       | 0.4962   |
| -105.3795     | 34.0  | 4420 | -198.8245       | 0.5263   |
| -108.9734     | 35.0  | 4550 | -105.7644       | 0.4286   |
| -111.1308     | 36.0  | 4680 | -121.4677       | 0.4962   |
| -115.0085     | 37.0  | 4810 | -75.3733        | 0.3083   |
| -114.714      | 38.0  | 4940 | -115.4598       | 0.6617   |
| -117.5734     | 39.0  | 5070 | -108.3964       | 0.4135   |
| -115.1971     | 40.0  | 5200 | -123.7679       | 0.3835   |
| -117.5617     | 41.0  | 5330 | -69.2224        | 0.2932   |
| -118.2803     | 42.0  | 5460 | -104.5906       | 0.6541   |
| -119.6297     | 43.0  | 5590 | -187.3416       | 0.5188   |
| -121.6325     | 44.0  | 5720 | -221.8878       | 0.5113   |
| -120.9663     | 45.0  | 5850 | -176.6644       | 0.3759   |
| -122.3583     | 46.0  | 5980 | -142.5218       | 0.4361   |
| -126.6614     | 47.0  | 6110 | -271.1018       | 0.4962   |
| -122.1615     | 48.0  | 6240 | -240.8323       | 0.3985   |
| -125.4207     | 49.0  | 6370 | -103.5760       | 0.6466   |
| -127.0661     | 50.0  | 6500 | -113.2718       | 0.6842   |
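Validation accuracy fluctuates heavily across epochs, so the best checkpoint is not obvious at a glance. A quick way to locate it is to scan the (epoch, accuracy) history; the pairs below are a subset copied from the table above:

```python
# (epoch, validation accuracy) pairs, a subset of the results table
history = [
    (1, 0.5263),
    (10, 0.6767),
    (31, 0.6767),
    (41, 0.2932),
    (50, 0.6842),
]

# Pick the epoch with the highest validation accuracy
best_epoch, best_acc = max(history, key=lambda row: row[1])
print(best_epoch, best_acc)  # -> 50 0.6842
```

Over the full table, the final epoch (50, accuracy 0.6842) is the best-scoring checkpoint, which is consistent with the headline evaluation accuracy of 0.6917 reported above.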

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.5.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3
Model size

  • 2.26M params
  • Tensor type: F32 (Safetensors)