PrasannSinghal/length-biases-wgpt-policy


PrasannSinghal committed on Apr 3

Commit 6293ed3 • 1 Parent(s): deed324

Upload folder using huggingface_hub


This view is limited to 50 files because the commit contains too many changes.

Files changed (50)

step_100/README.md +42 -0
step_100/adapter_config.json +20 -0
step_100/adapter_model.bin +3 -0
step_100/added_tokens.json +3 -0
step_100/pytorch_model.bin +3 -0
step_100/special_tokens_map.json +24 -0
step_100/tokenizer.json +0 -0
step_100/tokenizer.model +3 -0
step_100/tokenizer_config.json +32 -0
step_125/README.md +42 -0
step_125/adapter_config.json +20 -0
step_125/adapter_model.bin +3 -0
step_125/added_tokens.json +3 -0
step_125/pytorch_model.bin +3 -0
step_125/special_tokens_map.json +24 -0
step_125/tokenizer.json +0 -0
step_125/tokenizer.model +3 -0
step_125/tokenizer_config.json +32 -0
step_150/README.md +42 -0
step_150/adapter_config.json +20 -0
step_150/adapter_model.bin +3 -0
step_150/added_tokens.json +3 -0
step_150/pytorch_model.bin +3 -0
step_150/special_tokens_map.json +24 -0
step_150/tokenizer.json +0 -0
step_150/tokenizer.model +3 -0
step_150/tokenizer_config.json +32 -0
step_25/README.md +42 -0
step_25/adapter_config.json +20 -0
step_25/adapter_model.bin +3 -0
step_25/added_tokens.json +3 -0
step_25/pytorch_model.bin +3 -0
step_25/special_tokens_map.json +24 -0
step_25/tokenizer.json +0 -0
step_25/tokenizer.model +3 -0
step_25/tokenizer_config.json +32 -0
step_50/README.md +42 -0
step_50/adapter_config.json +20 -0
step_50/adapter_model.bin +3 -0
step_50/added_tokens.json +3 -0
step_50/pytorch_model.bin +3 -0
step_50/special_tokens_map.json +24 -0
step_50/tokenizer.json +0 -0
step_50/tokenizer.model +3 -0
step_50/tokenizer_config.json +32 -0
step_75/README.md +42 -0
step_75/adapter_config.json +20 -0
step_75/adapter_model.bin +3 -0
step_75/added_tokens.json +3 -0
step_75/pytorch_model.bin +3 -0
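
The listing above is in lexicographic order, which is why `step_25` appears after `step_150`. A minimal sketch of regrouping such entries by checkpoint and ordering them numerically; `parse_entry` and `group_by_step` are hypothetical helper names, and the sample lines are a subset copied from the listing.

```python
import re
from collections import defaultdict

# A few entries copied from the "Files changed" listing.
LISTING = """\
step_100/README.md +42 -0
step_150/adapter_config.json +20 -0
step_25/adapter_model.bin +3 -0
step_25/tokenizer.json +0 -0
step_75/pytorch_model.bin +3 -0
"""

ENTRY = re.compile(r"^step_(\d+)/(\S+) \+(\d+) -(\d+)$")

def parse_entry(line):
    """Return (step, filename, added, removed) for one listing line."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized listing line: {line!r}")
    step, name, added, removed = match.groups()
    return int(step), name, int(added), int(removed)

def group_by_step(text):
    """Map each checkpoint step to its files, with steps in numeric order."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if line.strip():
            step, name, added, removed = parse_entry(line)
            groups[step].append((name, added, removed))
    return dict(sorted(groups.items()))

groups = group_by_step(LISTING)
for step, files in groups.items():
    print(step, [name for name, _, _ in files])
```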

step_100/README.md ADDED

	@@ -0,0 +1,42 @@

+---
+license: apache-2.0
+tags:
+- trl
+- transformers
+- reinforcement-learning
+---
+# TRL Model
+This is a [TRL language model](https://github.com/lvwerra/trl) that has been fine-tuned with reinforcement learning to
+ guide the model outputs according to a value, function, or human feedback. The model can be used for text generation.
+## Usage
+To use this model for inference, first install the TRL library:
+```bash
+python -m pip install trl
+```
+You can then generate text as follows:
+```python
+from transformers import pipeline
+generator = pipeline("text-generation", model="PrasannSinghal/checkpoints/wgptapsft/step_100")
+outputs = generator("Hello, my llama is cute")
+```
+If you want to use the model for training or to obtain the outputs from the value head, load the model as follows:
+```python
+from transformers import AutoTokenizer
+from trl import AutoModelForCausalLMWithValueHead
+tokenizer = AutoTokenizer.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_100")
+model = AutoModelForCausalLMWithValueHead.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_100")
+inputs = tokenizer("Hello, my llama is cute", return_tensors="pt")
+outputs = model(**inputs, labels=inputs["input_ids"])
+```
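
The identifier passed to `pipeline` and `from_pretrained` in this README has four path segments, whereas a Hugging Face Hub repository id has at most two (`namespace/name`), so it should only resolve if a local directory with that path exists. A minimal sketch of separating a repository id from a path inside it; `split_repo_path` is a hypothetical helper, and pairing this page's repository name with the `step_100` folder is an assumption drawn from the page header and file listing, not something the README states.

```python
def split_repo_path(identifier):
    """Split 'namespace/name/sub/dir' into ('namespace/name', 'sub/dir').

    Hypothetical helper: a Hub repo id is 'name' or 'namespace/name',
    so anything after the second segment is treated as a subfolder.
    """
    parts = [p for p in identifier.split("/") if p]
    repo_id = "/".join(parts[:2])
    subfolder = "/".join(parts[2:])
    return repo_id, subfolder

# The identifier exactly as written in the README.
print(split_repo_path("PrasannSinghal/checkpoints/wgptapsft/step_100"))

# ASSUMED layout: this page's repository plus a checkpoint folder.
print(split_repo_path("PrasannSinghal/length-biases-wgpt-policy/step_100"))
```

The `transformers` `from_pretrained` methods accept a `subfolder` argument for the second value; whether the TRL value-head wrapper forwards it is not confirmed here.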

step_100/adapter_config.json ADDED

	@@ -0,0 +1,20 @@

+{
+  "base_model_name_or_path": "/home/prasann/Projects/tfr-decoding/apfarm_models/sft10k/",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 32,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 16,
+  "revision": null,
+  "target_modules": [
+    "q_proj",
+    "v_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}
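
To make the adapter settings above concrete, here is a rough sketch of what they imply. The LoRA scaling factor `lora_alpha / r` follows from the config alone; the parameter count additionally assumes a 32-layer base model with 4096-wide square `q_proj` and `v_proj` matrices (a LLaMA-7B-like shape), which the config does not state.

```python
import json

# Fields copied from adapter_config.json.
config = json.loads("""
{
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "r": 16,
  "target_modules": ["q_proj", "v_proj"]
}
""")

# ASSUMPTION (not in the config): shape of the base model.
HIDDEN_SIZE = 4096
NUM_LAYERS = 32

# LoRA adds B @ A, scaled by alpha / r, to each targeted weight matrix.
scaling = config["lora_alpha"] / config["r"]

# Per targeted module: A is (r x in_features), B is (out_features x r).
params_per_module = config["r"] * (HIDDEN_SIZE + HIDDEN_SIZE)
total_params = params_per_module * len(config["target_modules"]) * NUM_LAYERS

print(f"scaling = {scaling}")
print(f"trainable LoRA parameters = {total_params:,}")
print(f"size at 4 bytes each = {total_params * 4:,} bytes")
```

Under these assumptions the float32 payload comes to 33,554,432 bytes, close to the 33,600,461-byte `adapter_model.bin` recorded in this commit. That is consistent with the assumed shape but does not prove it.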

step_100/adapter_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:488f3a044499e9f6617123fbd1190650067d3da1b0a93037c58ec3fcd4939f27
+size 33600461
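
The three added lines above are a Git LFS pointer rather than the weights themselves: the pointer records the spec version, the SHA-256 digest of the real file, and its size in bytes. A minimal sketch of parsing a pointer and checking file contents against it; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the demonstration verifies a small in-memory payload, not the actual checkpoint.

```python
import hashlib

POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:488f3a044499e9f6617123fbd1190650067d3da1b0a93037c58ec3fcd4939f27
size 33600461
"""

def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(data, pointer):
    """Check that raw bytes have the size and SHA-256 the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])

# Demonstration with a stand-in payload (NOT the real adapter weights).
payload = b"stand-in bytes"
demo = {
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(matches_pointer(payload, demo))     # payload agrees with its own pointer
print(matches_pointer(payload, pointer))  # payload is not the adapter file
```

For a multi-megabyte file, hashing in chunks from disk would be preferable to loading everything into memory as this sketch does.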

step_100/added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "[PAD]": 32000
+}

step_100/pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fa3ed707b9c86f2a6b946d007a615d62ea606948f658ebcd97ae0a46c96f46f5
+size 17471

step_100/special_tokens_map.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "[PAD]",
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

step_100/tokenizer.json ADDED

The diff for this file is too large to render.

step_100/tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+size 499723

step_100/tokenizer_config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "bos_token": {
+    "__type": "AddedToken",
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "clean_up_tokenization_spaces": false,
+  "eos_token": {
+    "__type": "AddedToken",
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "model_max_length": 512,
+  "pad_token": null,
+  "padding_side": "right",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": {
+    "__type": "AddedToken",
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}
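
The tokenizer files in this checkpoint disagree on padding: `special_tokens_map.json` sets `pad_token` to `[PAD]` and `added_tokens.json` assigns it id 32000, while `tokenizer_config.json` has `"pad_token": null`. A small sketch of a consistency check over these files; `token_text` and `find_special_token_conflicts` are hypothetical helpers, the dictionaries are trimmed copies of the JSON in this diff, and how a given `transformers` version resolves the conflict is not verified here.

```python
def token_text(value):
    """Special tokens appear as plain strings or as {'content': ...} dicts."""
    if isinstance(value, dict):
        return value.get("content")
    return value

def find_special_token_conflicts(tokenizer_config, special_tokens_map):
    """Return {name: (config_value, map_value)} for tokens that differ."""
    conflicts = {}
    for name, map_value in special_tokens_map.items():
        config_text = token_text(tokenizer_config.get(name))
        map_text = token_text(map_value)
        if config_text != map_text:
            conflicts[name] = (config_text, map_text)
    return conflicts

# Trimmed copies of the files shown in this diff.
tokenizer_config = {
    "bos_token": {"__type": "AddedToken", "content": "<s>"},
    "eos_token": {"__type": "AddedToken", "content": "</s>"},
    "pad_token": None,
    "unk_token": {"__type": "AddedToken", "content": "<unk>"},
}
special_tokens_map = {
    "bos_token": {"content": "<s>"},
    "eos_token": {"content": "</s>"},
    "pad_token": "[PAD]",
    "unk_token": {"content": "<unk>"},
}
added_tokens = {"[PAD]": 32000}

conflicts = find_special_token_conflicts(tokenizer_config, special_tokens_map)
print(conflicts)

# Look up the id that added_tokens.json gives the pad token named in the map.
pad_id = added_tokens.get(token_text(special_tokens_map["pad_token"]))
print(pad_id)
```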

step_125/README.md ADDED

	@@ -0,0 +1,42 @@

+---
+license: apache-2.0
+tags:
+- trl
+- transformers
+- reinforcement-learning
+---
+# TRL Model
+This is a [TRL language model](https://github.com/lvwerra/trl) that has been fine-tuned with reinforcement learning to
+ guide the model outputs according to a value, function, or human feedback. The model can be used for text generation.
+## Usage
+To use this model for inference, first install the TRL library:
+```bash
+python -m pip install trl
+```
+You can then generate text as follows:
+```python
+from transformers import pipeline
+generator = pipeline("text-generation", model="PrasannSinghal/checkpoints/wgptapsft/step_125")
+outputs = generator("Hello, my llama is cute")
+```
+If you want to use the model for training or to obtain the outputs from the value head, load the model as follows:
+```python
+from transformers import AutoTokenizer
+from trl import AutoModelForCausalLMWithValueHead
+tokenizer = AutoTokenizer.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_125")
+model = AutoModelForCausalLMWithValueHead.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_125")
+inputs = tokenizer("Hello, my llama is cute", return_tensors="pt")
+outputs = model(**inputs, labels=inputs["input_ids"])
+```

step_125/adapter_config.json ADDED

	@@ -0,0 +1,20 @@

+{
+  "base_model_name_or_path": "/home/prasann/Projects/tfr-decoding/apfarm_models/sft10k/",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 32,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 16,
+  "revision": null,
+  "target_modules": [
+    "q_proj",
+    "v_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

step_125/adapter_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3cf2d8f6e72ce5362a0708190ce7ec94b48323837a4cdf08ccd4fab1ca1e70ed
+size 33600461

step_125/added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "[PAD]": 32000
+}

step_125/pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:46dbe338fd863c09e2b1884fad40e1e22fea72606c7effa8256ef463b57a6dae
+size 17471

step_125/special_tokens_map.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "[PAD]",
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

step_125/tokenizer.json ADDED

The diff for this file is too large to render.

step_125/tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+size 499723
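
The `tokenizer.model` pointer here carries the same `oid` as the one under `step_100`, while the `adapter_model.bin` digests differ between the two checkpoints. A minimal sketch of spotting such byte-identical files by grouping paths on their digest; `group_by_digest` is a hypothetical helper, and the mapping holds only digests copied from this diff.

```python
from collections import defaultdict

# path -> sha256 oid, copied from the LFS pointers in this diff.
OIDS = {
    "step_100/adapter_model.bin": "488f3a044499e9f6617123fbd1190650067d3da1b0a93037c58ec3fcd4939f27",
    "step_125/adapter_model.bin": "3cf2d8f6e72ce5362a0708190ce7ec94b48323837a4cdf08ccd4fab1ca1e70ed",
    "step_100/tokenizer.model": "9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347",
    "step_125/tokenizer.model": "9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347",
}

def group_by_digest(oids):
    """Return only the digests that are shared by more than one path."""
    groups = defaultdict(list)
    for path, digest in oids.items():
        groups[digest].append(path)
    return {d: sorted(paths) for d, paths in groups.items() if len(paths) > 1}

duplicates = group_by_digest(OIDS)
for digest, paths in duplicates.items():
    print(digest[:12], paths)
```

Because Git LFS addresses objects by this digest, files that share an `oid` have identical contents even though each checkpoint folder lists its own copy.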

step_125/tokenizer_config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "bos_token": {
+    "__type": "AddedToken",
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "clean_up_tokenization_spaces": false,
+  "eos_token": {
+    "__type": "AddedToken",
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "model_max_length": 512,
+  "pad_token": null,
+  "padding_side": "right",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": {
+    "__type": "AddedToken",
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

step_150/README.md ADDED

	@@ -0,0 +1,42 @@

+---
+license: apache-2.0
+tags:
+- trl
+- transformers
+- reinforcement-learning
+---
+# TRL Model
+This is a [TRL language model](https://github.com/lvwerra/trl) that has been fine-tuned with reinforcement learning to
+ guide the model outputs according to a value, function, or human feedback. The model can be used for text generation.
+## Usage
+To use this model for inference, first install the TRL library:
+```bash
+python -m pip install trl
+```
+You can then generate text as follows:
+```python
+from transformers import pipeline
+generator = pipeline("text-generation", model="PrasannSinghal/checkpoints/wgptapsft/step_150")
+outputs = generator("Hello, my llama is cute")
+```
+If you want to use the model for training or to obtain the outputs from the value head, load the model as follows:
+```python
+from transformers import AutoTokenizer
+from trl import AutoModelForCausalLMWithValueHead
+tokenizer = AutoTokenizer.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_150")
+model = AutoModelForCausalLMWithValueHead.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_150")
+inputs = tokenizer("Hello, my llama is cute", return_tensors="pt")
+outputs = model(**inputs, labels=inputs["input_ids"])
+```

step_150/adapter_config.json ADDED

	@@ -0,0 +1,20 @@

+{
+  "base_model_name_or_path": "/home/prasann/Projects/tfr-decoding/apfarm_models/sft10k/",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 32,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 16,
+  "revision": null,
+  "target_modules": [
+    "q_proj",
+    "v_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

step_150/adapter_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:697f7f23cb74b2cae637ab7f9b5f3f8ddf40bdb36689969ba876492aed45e607
+size 33600461

step_150/added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "[PAD]": 32000
+}

step_150/pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8124c555fdd63bd4126f2f6a18d7197d0d46ad878d5d5e55e426921a49a85b0e
+size 17471

step_150/special_tokens_map.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "[PAD]",
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

step_150/tokenizer.json ADDED

The diff for this file is too large to render.

step_150/tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+size 499723

step_150/tokenizer_config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "bos_token": {
+    "__type": "AddedToken",
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "clean_up_tokenization_spaces": false,
+  "eos_token": {
+    "__type": "AddedToken",
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "model_max_length": 512,
+  "pad_token": null,
+  "padding_side": "right",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": {
+    "__type": "AddedToken",
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

step_25/README.md ADDED

	@@ -0,0 +1,42 @@

+---
+license: apache-2.0
+tags:
+- trl
+- transformers
+- reinforcement-learning
+---
+# TRL Model
+This is a [TRL language model](https://github.com/lvwerra/trl) that has been fine-tuned with reinforcement learning to
+ guide the model outputs according to a value, function, or human feedback. The model can be used for text generation.
+## Usage
+To use this model for inference, first install the TRL library:
+```bash
+python -m pip install trl
+```
+You can then generate text as follows:
+```python
+from transformers import pipeline
+generator = pipeline("text-generation", model="PrasannSinghal/checkpoints/wgptapsft/step_25")
+outputs = generator("Hello, my llama is cute")
+```
+If you want to use the model for training or to obtain the outputs from the value head, load the model as follows:
+```python
+from transformers import AutoTokenizer
+from trl import AutoModelForCausalLMWithValueHead
+tokenizer = AutoTokenizer.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_25")
+model = AutoModelForCausalLMWithValueHead.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_25")
+inputs = tokenizer("Hello, my llama is cute", return_tensors="pt")
+outputs = model(**inputs, labels=inputs["input_ids"])
+```

step_25/adapter_config.json ADDED

	@@ -0,0 +1,20 @@

+{
+  "base_model_name_or_path": "/home/prasann/Projects/tfr-decoding/apfarm_models/sft10k/",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 32,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 16,
+  "revision": null,
+  "target_modules": [
+    "q_proj",
+    "v_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

step_25/adapter_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:385d3fd31bee7c4883f33ffd445e029eb238d425a45f0692110167ba5333302e
+size 33600461

step_25/added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "[PAD]": 32000
+}

step_25/pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ca73c48660d7b5c160bb55ef364e2444a1b4484494fc2354d9aeb966667beca6
+size 17471

step_25/special_tokens_map.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "[PAD]",
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

step_25/tokenizer.json ADDED

The diff for this file is too large to render.

step_25/tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+size 499723

step_25/tokenizer_config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "bos_token": {
+    "__type": "AddedToken",
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "clean_up_tokenization_spaces": false,
+  "eos_token": {
+    "__type": "AddedToken",
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "model_max_length": 512,
+  "pad_token": null,
+  "padding_side": "right",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": {
+    "__type": "AddedToken",
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

step_50/README.md ADDED

	@@ -0,0 +1,42 @@

+---
+license: apache-2.0
+tags:
+- trl
+- transformers
+- reinforcement-learning
+---
+# TRL Model
+This is a [TRL language model](https://github.com/lvwerra/trl) that has been fine-tuned with reinforcement learning to
+ guide the model outputs according to a value, function, or human feedback. The model can be used for text generation.
+## Usage
+To use this model for inference, first install the TRL library:
+```bash
+python -m pip install trl
+```
+You can then generate text as follows:
+```python
+from transformers import pipeline
+generator = pipeline("text-generation", model="PrasannSinghal/checkpoints/wgptapsft/step_50")
+outputs = generator("Hello, my llama is cute")
+```
+If you want to use the model for training or to obtain the outputs from the value head, load the model as follows:
+```python
+from transformers import AutoTokenizer
+from trl import AutoModelForCausalLMWithValueHead
+tokenizer = AutoTokenizer.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_50")
+model = AutoModelForCausalLMWithValueHead.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_50")
+inputs = tokenizer("Hello, my llama is cute", return_tensors="pt")
+outputs = model(**inputs, labels=inputs["input_ids"])
+```

step_50/adapter_config.json ADDED

	@@ -0,0 +1,20 @@

+{
+  "base_model_name_or_path": "/home/prasann/Projects/tfr-decoding/apfarm_models/sft10k/",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 32,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 16,
+  "revision": null,
+  "target_modules": [
+    "q_proj",
+    "v_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

step_50/adapter_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cfb926aa939d2c1f52fb443f645dfba085dbec97fd55207722821182f77e7c69
+size 33600461

step_50/added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "[PAD]": 32000
+}

step_50/pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:063e98fda7df2e2a2363ce90ca5b450e3de78a51bf7f554ec0b7a3d54bf02139
+size 17471

step_50/special_tokens_map.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "[PAD]",
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

step_50/tokenizer.json ADDED

The diff for this file is too large to render.

step_50/tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+size 499723

step_50/tokenizer_config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "bos_token": {
+    "__type": "AddedToken",
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "clean_up_tokenization_spaces": false,
+  "eos_token": {
+    "__type": "AddedToken",
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "model_max_length": 512,
+  "pad_token": null,
+  "padding_side": "right",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": {
+    "__type": "AddedToken",
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

step_75/README.md ADDED

	@@ -0,0 +1,42 @@

+---
+license: apache-2.0
+tags:
+- trl
+- transformers
+- reinforcement-learning
+---
+# TRL Model
+This is a [TRL language model](https://github.com/lvwerra/trl) that has been fine-tuned with reinforcement learning to
+ guide the model outputs according to a value, function, or human feedback. The model can be used for text generation.
+## Usage
+To use this model for inference, first install the TRL library:
+```bash
+python -m pip install trl
+```
+You can then generate text as follows:
+```python
+from transformers import pipeline
+generator = pipeline("text-generation", model="PrasannSinghal/checkpoints/wgptapsft/step_75")
+outputs = generator("Hello, my llama is cute")
+```
+If you want to use the model for training or to obtain the outputs from the value head, load the model as follows:
+```python
+from transformers import AutoTokenizer
+from trl import AutoModelForCausalLMWithValueHead
+tokenizer = AutoTokenizer.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_75")
+model = AutoModelForCausalLMWithValueHead.from_pretrained("PrasannSinghal/checkpoints/wgptapsft/step_75")
+inputs = tokenizer("Hello, my llama is cute", return_tensors="pt")
+outputs = model(**inputs, labels=inputs["input_ids"])
+```

step_75/adapter_config.json ADDED

	@@ -0,0 +1,20 @@

+{
+  "base_model_name_or_path": "/home/prasann/Projects/tfr-decoding/apfarm_models/sft10k/",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "lora_alpha": 32,
+  "lora_dropout": 0.05,
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 16,
+  "revision": null,
+  "target_modules": [
+    "q_proj",
+    "v_proj"
+  ],
+  "task_type": "CAUSAL_LM"
+}

step_75/adapter_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6079337c680545591dd960522e4d2daa52a702eaabc4111ea54c03318cc489c9
+size 33600461

step_75/added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "[PAD]": 32000
+}

step_75/pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:874ad170e94bbb931333b2bdeda4956aabfcb5f9bedb2f88aaf71c8a228835af
+size 17471