Summarization with FLAN-T5
This project demonstrates a workflow to fine-tune the FLAN-T5 model for dialogue summarization using the DialogSum
dataset. It includes data preprocessing, tokenization, training, and evaluation of the model.
Installation
- Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
- Install the dependencies:
pip install -r requirements.txt
Model Features
- Dataset: Uses the DialogSum dataset for summarization tasks.
- Model: Fine-tunes the FLAN-T5 model (google/flan-t5-base) for generating conversation summaries.
- GPU Support: Automatically detects and utilizes CUDA or MPS, if available, for faster computation.
- Tokenization: Prepares data for training using the transformers library.
- Training: Implements a training pipeline with Trainer for fine-tuning the model.
- Evaluation: Compares summaries from the baseline model, the fine-tuned model, and the human-written references (minimal sketches of these steps follow this list).
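The sketches below illustrate these steps; they are assumptions about a typical transformers workflow, not this project's exact code. First, device detection and model loading (variable names are illustrative):

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Pick the fastest available backend: CUDA, then Apple MPS, then CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)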
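Next, a sketch of loading and tokenizing DialogSum, assuming the knkarthick/dialogsum copy on the Hugging Face Hub; the dataset ID, prompt wording, and sequence lengths used by this project may differ:

from datasets import load_dataset

# Assumed Hub ID for DialogSum; the project may load it differently.
dataset = load_dataset("knkarthick/dialogsum")

def tokenize_function(batch):
    # Wrap each dialogue in an instruction prompt, as is typical for FLAN-T5.
    prompts = ["Summarize the following conversation.\n\n" + d + "\n\nSummary: "
               for d in batch["dialogue"]]
    inputs = tokenizer(prompts, max_length=512, truncation=True, padding="max_length")
    labels = tokenizer(batch["summary"], max_length=128, truncation=True, padding="max_length")
    # Mask padding in the labels so it does not contribute to the loss.
    inputs["labels"] = [
        [tok if tok != tokenizer.pad_token_id else -100 for tok in seq]
        for seq in labels["input_ids"]
    ]
    return inputs

tokenized = dataset.map(tokenize_function, batched=True,
                        remove_columns=["id", "dialogue", "summary", "topic"])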
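A sketch of the fine-tuning step with Trainer; the hyperparameters here are placeholders, not the project's actual settings:

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",             # placeholder output path
    learning_rate=1e-5,
    num_train_epochs=3,
    per_device_train_batch_size=8,
    weight_decay=0.01,
    logging_steps=100,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
)
trainer.train()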
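Finally, a sketch of the evaluation comparison; the prompt format and generation settings are assumptions. Running the same snippet with a freshly loaded google/flan-t5-base gives the baseline summary for comparison:

sample = dataset["test"][0]
prompt = "Summarize the following conversation.\n\n" + sample["dialogue"] + "\n\nSummary: "
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print("Fine-tuned summary:", tokenizer.decode(output_ids[0], skip_special_tokens=True))
print("Human reference:   ", sample["summary"])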
Customization
- Training Configuration: Modify TrainingArguments to customize the learning rate, number of epochs, and other parameters.
- Dataset: Replace the DialogSum dataset with your own dataset for different summarization tasks (see the sketch after this list).
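For example, swapping in a different dataset only requires changing the load_dataset call and the column names used during tokenization; the dataset ID below is a placeholder:

from datasets import load_dataset

# Placeholder ID: point this at any dataset with text/summary pairs.
dataset = load_dataset("your-username/your-dataset")
# Then update tokenize_function above to read your dataset's
# dialogue and summary column names.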