---
license: other
base_model: meta-llama/Meta-Llama-3-8B
tags:
  - llama-factory
  - full
  - generated_from_trainer
model-index:
  - name: C018_random_sample_llama3-8b-base_pretrain_20240504_182259
    results: []
---

# C018_random_sample_llama3-8b-base_pretrain_20240504_182259

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) (loaded from the local path `/data/pro-align/progressalign/shared_storage/downloaded_models/llama3-8b-base`) on the C018_random_sample_data dataset. It achieves the following results on the evaluation set:

- Loss: 2.2706
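
To try the checkpoint, here is a minimal loading sketch with `transformers`; the model path is a placeholder (substitute the actual output directory or Hub repo id), and the bf16 dtype is an assumption rather than something stated in this card.

```python
# Minimal usage sketch (illustrative, not part of the original card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "C018_random_sample_llama3-8b-base_pretrain_20240504_182259"  # placeholder path / repo id

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # assumption: half precision to keep an 8B model on one GPU
    device_map="auto",
)

prompt = "The history of machine learning begins with"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```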

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an approximate `TrainingArguments` mapping is sketched after the list):

- learning_rate: 1.5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 64
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 20
- num_epochs: 4.0
- mixed_precision_training: Native AMP
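
A hedged sketch of how these settings map onto Hugging Face `TrainingArguments`; the run itself was launched through LLaMA-Factory, so this is only an approximate reconstruction, and the `output_dir` value and the fp16 choice for "Native AMP" are assumptions.

```python
# Illustrative reconstruction of the listed hyperparameters (not the actual launch config).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="C018_random_sample_llama3-8b-base_pretrain_20240504_182259",  # placeholder
    learning_rate=1.5e-5,
    per_device_train_batch_size=8,   # x 8 GPUs = total train batch size 64
    per_device_eval_batch_size=16,   # x 8 GPUs = total eval batch size 128
    seed=42,
    num_train_epochs=4.0,
    lr_scheduler_type="polynomial",
    warmup_steps=20,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                       # "Native AMP" mixed precision; fp16 vs. bf16 is an assumption
)
```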

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.3701        | 0.2186 | 200  | 2.3702          |
| 2.3183        | 0.4372 | 400  | 2.3160          |
| 2.2634        | 0.6557 | 600  | 2.2863          |
| 2.2522        | 0.8743 | 800  | 2.2706          |
| 2.0306        | 1.0929 | 1000 | 2.2777          |
| 2.0095        | 1.3115 | 1200 | 2.2760          |
| 2.0539        | 1.5301 | 1400 | 2.2746          |
| 2.0338        | 1.7486 | 1600 | 2.2743          |
| 2.0648        | 1.9672 | 1800 | 2.2737          |
| 2.0297        | 2.1858 | 2000 | 2.2766          |
| 2.0487        | 2.4044 | 2200 | 2.2767          |
| 2.0329        | 2.6230 | 2400 | 2.2770          |
| 2.0213        | 2.8415 | 2600 | 2.2766          |
| 2.0559        | 3.0601 | 2800 | 2.2771          |
| 2.0543        | 3.2787 | 3000 | 2.2773          |
| 2.0317        | 3.4973 | 3200 | 2.2772          |
| 1.988         | 3.7158 | 3400 | 2.2770          |
| 2.0355        | 3.9344 | 3600 | 2.2772          |
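
Since these are the usual natural-log cross-entropy losses reported by the Trainer, perplexity is simply `exp(loss)`; below is a small sketch converting the best and final validation losses from the table (the "best" framing is an observation from the table, not a claim made elsewhere in this card).

```python
import math

# Validation losses copied from the table above (nats per token).
best_val_loss = 2.2706   # step 800 (epoch ~0.87), matching the reported evaluation loss
final_val_loss = 2.2772  # step 3600 (epoch ~3.93)

print(f"best perplexity:  {math.exp(best_val_loss):.2f}")   # ~9.68
print(f"final perplexity: {math.exp(final_val_loss):.2f}")  # ~9.75
```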

### Framework versions

- Transformers 4.40.1
- PyTorch 2.3.0
- Datasets 2.19.0
- Tokenizers 0.19.1