Samzy17 committed
Commit d64b82e · verified · 1 Parent(s): 6fcbb7d

Update README.md

Files changed (1)
  1. README.md +23 -0
README.md CHANGED
@@ -6,6 +6,29 @@ base_model:
 pipeline_tag: text-generation
 license: mit
 ---
+ # Purpose of this finetuning
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ Finetune the base model [GPT2-IMDB](https://huggingface.co/lvwerra/gpt2-imdb) using [this BERT sentiment classifier](https://huggingface.co/lvwerra/distilbert-imdb) as a reward function.
+
+ - The goal is to train the GPT-2 model to continue a movie review prompt and generate text with negative sentiment.
+ - A separate training run is done to generate positive movie reviews. The eventual goal is to interpolate the weight spaces of the 'positively finetuned' and 'negatively finetuned' models, as in the [rewarded-soups paper](https://arxiv.org/abs/2306.04488), and test whether this yields (qualitatively) neutral reviews (a weight-interpolation sketch is shown at the end of this page).
+
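The card does not include the training code, but the reward described above can be sketched directly from the linked classifier. A minimal, illustrative example (not the author's script), assuming the `lvwerra/distilbert-imdb` labels are `NEGATIVE`/`POSITIVE` and using the raw NEGATIVE logit as the scalar reward:

```python
# Illustrative reward-function sketch (assumed setup, not code from this repo).
from transformers import pipeline

# The sentiment classifier that acts as the reward model.
sentiment_pipe = pipeline("sentiment-analysis", model="lvwerra/distilbert-imdb")

def negative_reward(texts):
    """Return the raw NEGATIVE-class logit for each text as a scalar reward."""
    outputs = sentiment_pipe(texts, top_k=None, function_to_apply="none")
    rewards = []
    for scores in outputs:
        neg = next(s["score"] for s in scores if s["label"] == "NEGATIVE")
        rewards.append(neg)
    return rewards

# Higher reward for more negative continuations.
print(negative_reward(["This movie was a complete waste of time."]))
```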
+ ## Model Params
+
+ Here are the training parameters:
+ - base_model = 'lvwerra/gpt2-imdb'
+ - dataset = stanfordnlp/imdb
+ - batch_size = 16
+ - learning_rate = 1.41e-5
+ - output_max_length = 16
+ - output_min_length = 4
+
+ The exact duration wasn't recorded, but training took less than a couple of hours on a single A6000 GPU.
+
+
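These hyperparameters line up with the standard `trl` PPO recipe for `gpt2-imdb`. A condensed sketch of how they could be wired together, assuming an older `trl` release (pre-0.12) with the classic `PPOTrainer`/`PPOConfig` API and reusing the `negative_reward` helper sketched above; the dataset preparation follows the usual IMDB-query pattern and is not taken from this repo:

```python
# Condensed PPO finetuning sketch (assumptions: trl < 0.12 API, dataset prep
# as in the classic trl "gpt2-sentiment" example; not the author's script).
import torch
from datasets import load_dataset
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer
from trl.core import LengthSampler

config = PPOConfig(
    model_name="lvwerra/gpt2-imdb",  # base_model from the list above
    learning_rate=1.41e-5,
    batch_size=16,
)

tokenizer = AutoTokenizer.from_pretrained(config.model_name)
tokenizer.pad_token = tokenizer.eos_token

# Build short review prompts ("queries") from stanfordnlp/imdb.
dataset = load_dataset("stanfordnlp/imdb", split="train")
dataset = dataset.filter(lambda x: len(x["text"]) > 200)

def tokenize(sample, n_tokens=8):
    sample["input_ids"] = tokenizer.encode(sample["text"])[:n_tokens]
    sample["query"] = tokenizer.decode(sample["input_ids"])
    return sample

dataset = dataset.map(tokenize)
dataset.set_format(type="torch")

def collator(data):
    return {key: [d[key] for d in data] for key in data[0]}

model = AutoModelForCausalLMWithValueHead.from_pretrained(config.model_name)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(config.model_name)
ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer,
                         dataset=dataset, data_collator=collator)

# Continuation lengths sampled between output_min_length and output_max_length.
length_sampler = LengthSampler(4, 16)
generation_kwargs = {"do_sample": True, "top_k": 0, "top_p": 1.0,
                     "pad_token_id": tokenizer.eos_token_id}

for batch in ppo_trainer.dataloader:
    queries = batch["input_ids"]
    responses = []
    for query in queries:
        gen_len = length_sampler()
        generation_kwargs["max_new_tokens"] = gen_len
        output = ppo_trainer.generate(query, **generation_kwargs)
        responses.append(output.squeeze()[-gen_len:])
    texts = [q + tokenizer.decode(r) for q, r in zip(batch["query"], responses)]
    rewards = [torch.tensor(r) for r in negative_reward(texts)]  # reward sketch above
    ppo_trainer.step(queries, responses, rewards)
```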
+ ### Results

 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/671ad995ca9561981190dbb4/ndneRnA3jP563cKMEtMth.png)
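As a follow-up to the rewarded-soups plan mentioned above, the weight-space interpolation could be as simple as a per-parameter linear blend of the two finetuned checkpoints. A rough sketch, where the two repo ids are hypothetical placeholders for the positively and negatively finetuned models:

```python
# Weight-interpolation sketch (rewarded-soups style); the two repo ids below
# are hypothetical placeholders, not published checkpoints from this card.
from transformers import AutoModelForCausalLM

neg = AutoModelForCausalLM.from_pretrained("your-user/gpt2-imdb-negative")  # placeholder
pos = AutoModelForCausalLM.from_pretrained("your-user/gpt2-imdb-positive")  # placeholder

lam = 0.5  # interpolation coefficient: 1.0 = fully negative model, 0.0 = fully positive
pos_sd = pos.state_dict()

merged_state = {}
for name, p_neg in neg.state_dict().items():
    if p_neg.is_floating_point():
        merged_state[name] = lam * p_neg + (1.0 - lam) * pos_sd[name]
    else:
        merged_state[name] = p_neg  # copy non-float buffers (e.g. attention masks) as-is

# Load the blended weights into a fresh copy of the shared base architecture.
merged = AutoModelForCausalLM.from_pretrained("lvwerra/gpt2-imdb")
merged.load_state_dict(merged_state)
merged.save_pretrained("gpt2-imdb-neutral-soup")  # hypothetical output path
```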