first
- __pycache__/partitions.cpython-38.pyc +0 -0
- added_tokens.json +1 -0
- gpt-neo-1.3B/config.json → config.json +0 -0
- gpt-neo-1.3B/flax_model.msgpack → flax_model.msgpack +0 -0
- run.sh +19 -0
- setup_devices.py +0 -18
- special_tokens_map.json +1 -0
- tokenizer.json +0 -0
- tokenizer_config.json +1 -0
- vocab.json +0 -0
__pycache__/partitions.cpython-38.pyc
CHANGED
Binary files a/__pycache__/partitions.cpython-38.pyc and b/__pycache__/partitions.cpython-38.pyc differ
added_tokens.json
ADDED
@@ -0,0 +1 @@
+{"<|endoftext|>": 50265}
gpt-neo-1.3B/config.json → config.json
RENAMED
File without changes
gpt-neo-1.3B/flax_model.msgpack → flax_model.msgpack
RENAMED
File without changes
run.sh
ADDED
@@ -0,0 +1,19 @@
+python run_clm_mp.py \
+    --model_name_or_path /mnt/disks/flaxdisk/norwegian-gptneo-red/ \
+    --tokenizer_name /mnt/disks/flaxdisk/norwegian-gptneo-red/ \
+    --train_file /mnt/disks/flaxdisk/corpus/social_train.json \
+    --validation_file /mnt/disks/flaxdisk/corpus/social_validation.json \
+    --do_train \
+    --do_eval \
+    --block_size 1024 \
+    --num_train_epochs 10 \
+    --learning_rate 4e-6 \
+    --per_device_train_batch_size 3 \
+    --per_device_eval_batch_size 3 \
+    --overwrite_output_dir \
+    --output_dir /mnt/disks/flaxdisk/norwegian-gptneo-red \
+    --cache_dir /mnt/disks/flaxdisk/cache/ \
+    --dtype bfloat16 \
+    --logging_steps 97 \
+    --eval_steps 96 \
+    --push_to_hub
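Note: run_clm_mp.py is the model-parallel causal language modeling script from the transformers Flax/JAX examples. The invocation above appears to continue training the resized GPT-Neo checkpoint on the Norwegian corpus in bfloat16, evaluating every 96 steps and pushing the resulting model to the Hub.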
setup_devices.py
DELETED
@@ -1,18 +0,0 @@
-import jax
-import jax.numpy as jnp
-from transformers import FlaxGPTNeoForCausalLM, GPTNeoConfig
-model = FlaxGPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
-
-emb = jnp.zeros((50264, model.config.hidden_size))
-# update the first 50257 weights using pre-trained weights
-emb = jax.ops.index_update(emb, jax.ops.index[:50257, :], model.params["transformer"]["wte"]["embedding"])
-params = model.params
-params["transformer"]["wte"]["embedding"] = emb
-
-# initialize a random model with the right vocab_size
-config = GPTNeoConfig.from_pretrained("EleutherAI/gpt-neo-1.3B", vocab_size=50264)
-model = FlaxGPTNeoForCausalLM(config)
-
-# assign the pre-trained weights and save the model.
-model.params = params
-model.save_pretrained("gpt-neo-1.3B")
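Note: jax.ops.index_update was deprecated and later removed from JAX, so the deleted script only runs on older releases. A minimal sketch of the same embedding-copy step using the current indexed-update API (everything else as in the script above):

    import jax.numpy as jnp
    from transformers import FlaxGPTNeoForCausalLM

    model = FlaxGPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
    emb = jnp.zeros((50264, model.config.hidden_size))
    # Copy the 50257 pre-trained rows into the enlarged matrix;
    # .at[...].set(...) returns a new array since JAX arrays are immutable.
    emb = emb.at[:50257, :].set(model.params["transformer"]["wte"]["embedding"])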
special_tokens_map.json
ADDED
@@ -0,0 +1 @@
+{"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>"}
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1 @@
+{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "add_prefix_space": false, "special_tokens_map_file": null, "name_or_path": "norwegian-gpt2", "tokenizer_class": "GPT2Tokenizer"}
vocab.json
ADDED
The diff for this file is too large to render.
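Note: with config.json, flax_model.msgpack, and the tokenizer files in place, the checkpoint can be loaded through the standard transformers API. A minimal sketch (the repository id below is a placeholder, not taken from this commit):

    from transformers import AutoTokenizer, FlaxGPTNeoForCausalLM

    # Hypothetical repo id; substitute the actual Hub repository.
    repo_id = "user/norwegian-gptneo-red"
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = FlaxGPTNeoForCausalLM.from_pretrained(repo_id)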