Pavithra committed
Commit 64138d9 · 1 Parent(s): 1d86c26

My first ever screeching parrot!

.gitattributes CHANGED
@@ -25,3 +25,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zstandard filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+log/debug_0.log filter=lfs diff=lfs merge=lfs -text
+wandb/debug-internal.log filter=lfs diff=lfs merge=lfs -text
+wandb/run-20211106_211610-dtkf2u0m/logs/debug-internal.log filter=lfs diff=lfs merge=lfs -text
+wandb/run-20211106_211610-dtkf2u0m/run-dtkf2u0m.wandb filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,62 @@
+# CodeParrot 🦜 (small)
+
+CodeParrot 🦜 is a GPT-2 model (110M parameters) trained to generate Python code.
+
+## Usage
+
+You can load the CodeParrot model and tokenizer directly in `transformers`:
+
+```python
+from transformers import AutoTokenizer, AutoModelWithLMHead
+
+tokenizer = AutoTokenizer.from_pretrained("lvwerra/codeparrot-small")
+model = AutoModelWithLMHead.from_pretrained("lvwerra/codeparrot-small")
+
+inputs = tokenizer("def hello_world():", return_tensors="pt")
+outputs = model(**inputs)
+```
+
+or with a `pipeline`:
+
+```python
+from transformers import pipeline
+
+pipe = pipeline("text-generation", model="lvwerra/codeparrot-small")
+outputs = pipe("def hello_world():")
+```
+
+## Training
+
+The model was trained on the cleaned [CodeParrot 🦜 dataset](https://huggingface.co/datasets/lvwerra/codeparrot-clean) with the following settings:
+
+| Config | Value |
+|-------|-----|
+| Batch size | 192 |
+| Context size | 1024 |
+| Training steps | 150'000 |
+| Gradient accumulation | 1 |
+| Gradient checkpointing | False |
+| Learning rate | 5e-4 |
+| Weight decay | 0.1 |
+| Warmup steps | 2000 |
+| Schedule | Cosine |
+
+The training was executed on 16 x A100 (40GB) GPUs and amounts to roughly 29 billion training tokens (192 x 1024 x 150'000 ≈ 29.5B).
+
+## Performance
+
+We evaluated the model on OpenAI's [HumanEval](https://huggingface.co/datasets/openai_humaneval) benchmark, which consists of programming challenges:
+
+| Metric | Value |
+|-------|-----|
+| pass@1 | 3.80% |
+| pass@10 | 6.57% |
+| pass@100 | 12.78% |
+
+The [pass@k metric](https://huggingface.co/metrics/code_eval) gives the probability that at least one out of k generations passes the tests.
+
+## Resources
+
+- Dataset: [full](https://huggingface.co/datasets/lvwerra/codeparrot-clean), [train](https://huggingface.co/datasets/lvwerra/codeparrot-clean-train), [valid](https://huggingface.co/datasets/lvwerra/codeparrot-clean-valid)
+- Code: [repository](https://github.com/huggingface/transformers/tree/master/examples/research_projects/codeparrot)
+- Spaces: [generation](), [highlighting]()
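
The pass@k scores quoted in the README above are conventionally computed with the unbiased estimator from the Codex/HumanEval paper, which is also what the linked `code_eval` metric implements. A minimal sketch of that estimator for illustration (the sample counts in the example are made up, not taken from this evaluation):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples drawn
    from n generations per problem (c of which pass the tests) is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical numbers: 200 generations for one problem, 8 of them correct.
print(f"{pass_at_k(n=200, c=8, k=10):.4f}")
```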
codeparrot_training.py ADDED
@@ -0,0 +1,208 @@
+from transformers import GPT2LMHeadModel, AutoTokenizer
+from transformers import AdamW, get_scheduler, set_seed
+from datasets import load_dataset
+from accelerate import Accelerator
+import datasets, transformers
+from huggingface_hub import Repository
+
+from torch.utils.data import IterableDataset
+from torch.utils.data.dataloader import DataLoader
+from torch.utils.tensorboard import SummaryWriter
+from argparse import Namespace
+import torch
+import logging
+import wandb
+
+class ConstantLengthDataset(IterableDataset):
+
+    def __init__(self, tokenizer, dataset, infinite=False, seq_length=1024,
+                 num_of_sequences=1024, chars_per_token=3.6):
+        self.tokenizer = tokenizer
+        self.concat_token_id = tokenizer.bos_token_id
+        self.dataset = dataset
+        self.seq_length = seq_length
+        self.input_characters = seq_length * chars_per_token * num_of_sequences
+        self.epoch = 0
+        self.infinite = infinite
+
+    def __iter__(self):
+        iterator = iter(self.dataset)
+        more_examples = True
+        while more_examples:
+            buffer, buffer_len = [], 0
+            while True:
+                if buffer_len >= self.input_characters:
+                    break
+                try:
+                    buffer.append(next(iterator)['content'])
+                    buffer_len += len(buffer[-1])
+                except StopIteration:
+                    if self.infinite:
+                        iterator = iter(self.dataset)
+                        self.epoch += 1
+                        logger.info(f"Dataset epoch: {self.epoch}")
+                    else:
+                        more_examples = False
+                        break
+            tokenized_inputs = tokenizer(buffer, truncation=False)['input_ids']
+            all_token_ids = []
+            for tokenized_input in tokenized_inputs:
+                all_token_ids.extend(tokenized_input + [self.concat_token_id])
+            for i in range(0, len(all_token_ids), self.seq_length):
+                input_ids = all_token_ids[i : i + self.seq_length]
+                if len(input_ids) == self.seq_length:
+                    yield torch.tensor(input_ids)
+
+def setup_logging(project_name):
+    logger = logging.getLogger(__name__)
+    logging.basicConfig(
+        format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
+        datefmt="%m/%d/%Y %H:%M:%S", level=logging.INFO, handlers=[
+            logging.FileHandler(f"log/debug_{accelerator.process_index}.log"),
+            logging.StreamHandler()])
+    if accelerator.is_main_process: # we only want to setup logging once
+        wandb.init(project=project_name, config=args)
+        run_name = wandb.run.name
+        tb_writer = SummaryWriter()
+        tb_writer.add_hparams(vars(args), {'0': 0})
+        logger.setLevel(logging.INFO)
+        datasets.utils.logging.set_verbosity_info()
+        transformers.utils.logging.set_verbosity_info()
+    else:
+        tb_writer = None
+        run_name = ''
+        logger.setLevel(logging.ERROR)
+        datasets.utils.logging.set_verbosity_error()
+        transformers.utils.logging.set_verbosity_error()
+    return logger, tb_writer, run_name
+
+def create_dataloaders(dataset_name, args):
+    ds_kwargs = {"streaming": True}
+    train_data = load_dataset(dataset_name+'-train', split='train', **ds_kwargs)
+    train_data = train_data.shuffle(buffer_size=args.shuffle_buffer,
+                                    seed=args.seed)
+    valid_data = load_dataset(dataset_name+'-valid', split="train", **ds_kwargs)
+    train_dataset = ConstantLengthDataset(tokenizer, train_data, infinite=True,
+                                          seq_length=args.seq_length)
+    valid_dataset = ConstantLengthDataset(tokenizer, valid_data, infinite=False,
+                                          seq_length=args.seq_length)
+    train_dataloader = DataLoader(train_dataset, batch_size=args.train_batch_size)
+    eval_dataloader = DataLoader(valid_dataset, batch_size=args.valid_batch_size)
+    return train_dataloader, eval_dataloader
+
+def get_grouped_params(model, args, no_decay=["bias", "LayerNorm.weight"]):
+    params_with_wd, params_without_wd = [], []
+    for n, p in model.named_parameters():
+        if any(nd in n for nd in no_decay): params_without_wd.append(p)
+        else: params_with_wd.append(p)
+    return [{'params': params_with_wd, 'weight_decay': args.weight_decay},
+            {'params': params_without_wd, 'weight_decay': 0.0}]
+
+def log_metrics(step, metrics):
+    logger.info(f"Step {step}: {metrics}")
+    if accelerator.is_main_process:
+        wandb.log(metrics)
+        [tb_writer.add_scalar(k, v, step) for k, v in metrics.items()]
+
+def evaluate(args):
+    model.eval()
+    losses = []
+    for step, batch in enumerate(eval_dataloader):
+        with torch.no_grad():
+            outputs = model(batch, labels=batch)
+        loss = outputs.loss.repeat(args.valid_batch_size)
+        losses.append(accelerator.gather(loss))
+        if args.max_eval_steps > 0 and step >= args.max_eval_steps: break
+    loss = torch.mean(torch.cat(losses))
+    try: perplexity = torch.exp(loss)
+    except OverflowError: perplexity = float("inf")
+    return loss.item(), perplexity.item()
+
+# Accelerator
+accelerator = Accelerator()
+acc_state = {str(k): str(v) for k, v in accelerator.state.__dict__.items()}
+
+# Hyperparameters
+project_name = 'lvwerra/codeparrot-small'
+dataset_name = '../codeparrot-clean'
+config = {"train_batch_size": 12,
+          "valid_batch_size": 12,
+          "weight_decay": 0.1,
+          "shuffle_buffer": 1_000,
+          "learning_rate": 5e-4,
+          "lr_scheduler_type": "cosine",
+          "num_warmup_steps": 2_000,
+          "gradient_accumulation_steps": 1,
+          "gradient_checkpointing": False,
+          "max_train_steps": 150_000,
+          "max_eval_steps": -1,
+          "seq_length": 1024,
+          "seed": 1,
+          "save_checkpoint_steps": 15_000}
+args = Namespace(**config, **acc_state)
+samples_per_step = accelerator.state.num_processes * args.train_batch_size
+set_seed(args.seed)
+
+# Logging
+logger, tb_writer, run_name = setup_logging(project_name.split("/")[1])
+logger.info(accelerator.state)
+
+# Load model and tokenizer
+if accelerator.is_main_process:
+    hf_repo = Repository("./", clone_from=project_name, revision=run_name)
+model = GPT2LMHeadModel.from_pretrained("./")
+if args.gradient_checkpointing:
+    model.gradient_checkpointing_enable()
+tokenizer = AutoTokenizer.from_pretrained("./")
+
+# Load dataset and dataloader
+train_dataloader, eval_dataloader = create_dataloaders(dataset_name, args)
+
+# Prepare the optimizer and learning rate scheduler
+optimizer = AdamW(get_grouped_params(model, args), lr=args.learning_rate)
+lr_scheduler = get_scheduler(name=args.lr_scheduler_type, optimizer=optimizer,
+                             num_warmup_steps=args.num_warmup_steps,
+                             num_training_steps=args.max_train_steps,)
+def get_lr(): return optimizer.param_groups[0]['lr']
+
+# Prepare everything with our `accelerator`.
+model, optimizer, train_dataloader, eval_dataloader = accelerator.prepare(
+    model, optimizer, train_dataloader, eval_dataloader)
+
+# Train model
+model.train()
+completed_steps = 0
+for step, batch in enumerate(train_dataloader, start=1):
+    loss = model(batch, labels=batch, use_cache=False).loss
+    log_metrics(step, {'lr': get_lr(), 'samples': step*samples_per_step,
+                       'steps': completed_steps, 'loss/train': loss.item()})
+    loss = loss / args.gradient_accumulation_steps
+    accelerator.backward(loss)
+    if step % args.gradient_accumulation_steps == 0:
+        accelerator.clip_grad_norm_(model.parameters(), 1.0)
+        optimizer.step()
+        lr_scheduler.step()
+        optimizer.zero_grad()
+        completed_steps += 1
+    if step % args.save_checkpoint_steps == 0:
+        logger.info('Evaluating and saving model checkpoint')
+        eval_loss, perplexity = evaluate(args)
+        log_metrics(step, {'loss/eval': eval_loss, 'perplexity': perplexity})
+        accelerator.wait_for_everyone()
+        unwrapped_model = accelerator.unwrap_model(model)
+        unwrapped_model.save_pretrained("./", save_function=accelerator.save)
+        if accelerator.is_main_process:
+            hf_repo.push_to_hub(commit_message=f'step {step}')
+        model.train()
+    if completed_steps >= args.max_train_steps:
+        break
+
+# Evaluate and save the last checkpoint
+logger.info('Evaluating and saving model after training')
+eval_loss, perplexity = evaluate(args)
+log_metrics(step, {'loss/eval': eval_loss, 'perplexity': perplexity})
+accelerator.wait_for_everyone()
+unwrapped_model = accelerator.unwrap_model(model)
+unwrapped_model.save_pretrained("./", save_function=accelerator.save)
+if accelerator.is_main_process:
+    hf_repo.push_to_hub(commit_message=f'final model')
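
In the script above, `ConstantLengthDataset` streams raw file contents, tokenizes them in large buffers, joins everything with the BOS token, and slices the stream into fixed 1024-token sequences so that no padding is needed during training. A toy, self-contained sketch of that concat-and-chunk idea (the helper name and toy token ids are ours, purely illustrative):

```python
def pack_examples(tokenized_examples, concat_token_id, seq_length):
    # Concatenate all examples, separated by the concat token, then cut the
    # stream into equal-length blocks; the incomplete tail block is dropped.
    all_token_ids = []
    for ids in tokenized_examples:
        all_token_ids.extend(ids + [concat_token_id])
    return [all_token_ids[i : i + seq_length]
            for i in range(0, len(all_token_ids), seq_length)
            if len(all_token_ids[i : i + seq_length]) == seq_length]

examples = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
print(pack_examples(examples, concat_token_id=0, seq_length=4))
# -> [[1, 2, 3, 0], [4, 5, 0, 6], [7, 8, 9, 0]]
```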
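The per-GPU settings in the `config` dict are consistent with the aggregate numbers in the README: with 16 processes (see the wandb metadata further down) the effective batch size is 192 sequences per step, and 150'000 steps of 1024-token sequences amount to roughly 29 billion tokens. A quick sanity check, using only values that appear in this commit:

```python
train_batch_size = 12            # per process, from the config dict above
num_processes = 16               # from wandb-metadata.json / accelerator state
gradient_accumulation_steps = 1
seq_length = 1024
max_train_steps = 150_000

effective_batch = train_batch_size * num_processes * gradient_accumulation_steps
total_tokens = effective_batch * seq_length * max_train_steps
print(effective_batch)                       # 192, the batch size listed in the README
print(f"{total_tokens / 1e9:.1f}B tokens")   # ~29.5B, "roughly 29 billion"
```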
config.json ADDED
@@ -0,0 +1,39 @@
+{
+  "_name_or_path": "/content/transformers/examples/research_projects/autopilot/",
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": 50256,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50256,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_ctx": 1024,
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": null,
+  "n_layer": 12,
+  "n_positions": 1024,
+  "reorder_and_upcast_attn": true,
+  "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": true,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "task_specific_params": {
+    "text-generation": {
+      "do_sample": true,
+      "max_length": 50
+    }
+  },
+  "torch_dtype": "float32",
+  "transformers_version": "4.15.0",
+  "use_cache": true,
+  "vocab_size": 32768
+}
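
The configuration above is a standard 12-layer, 12-head, 768-dimensional GPT-2 with the 1024-token context window used in training; the reduced 32,768-token code vocabulary is what brings the parameter count down to the roughly 110M quoted in the README. A small sketch to check the size (assuming Hub access; building from the config uses random weights, so the 456 MB checkpoint is not downloaded):

```python
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config.from_pretrained("lvwerra/codeparrot-small")
model = GPT2LMHeadModel(config)  # architecture only, randomly initialised

n_params = sum(p.numel() for p in model.parameters())
print(config.n_layer, config.n_head, config.n_embd, config.vocab_size)
print(f"{n_params / 1e6:.0f}M parameters")  # roughly 110M
```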
log/debug_0.log ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:520cc0fff75aac5dff6577b9f784f98afab8680bbe6519c28dae22170a041e7a
+size 25193867
log/debug_1.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_10.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_11.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_12.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_13.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_14.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_15.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_2.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_3.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_4.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_5.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_6.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_7.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_8.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
log/debug_9.log ADDED
@@ -0,0 +1 @@
+11/06/2021 21:16:43 - INFO - root - Reducer buckets have been rebuilt in this iteration.
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b80f4ce2e9776f1fb7caa630928736a5d37f0cc21d34a213e65a8176faccafd3
+size 456677609
requirements.txt ADDED
@@ -0,0 +1,6 @@
+torch==1.9.0
+wandb
+tensorboard
+transformers==4.12.2
+datasets==1.13.0
+accelerate==0.5.1
runs/Nov06_21-16-12_leandro-16x-v100/1636233372.3289735/events.out.tfevents.1636233372.leandro-16x-v100.4368.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a0e4c545c8c00c8dd32e5dadf4db351c0ee2811281dd482a24268755c1c39c00
+size 1438
runs/Nov06_21-16-12_leandro-16x-v100/events.out.tfevents.1636233372.leandro-16x-v100.4368.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2cd9716067e9f59fda4a7b67e25db6034c5b4465db63524decb1c80001219215
+size 27535087
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+{"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>"}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "add_prefix_space": false, "model_max_length": 1024, "special_tokens_map_file": null, "name_or_path": "transformersbook/codeparrot", "tokenizer_class": "GPT2Tokenizer"}
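
The two tokenizer files above indicate a GPT-2 style BPE tokenizer (originally trained as `transformersbook/codeparrot`) in which `<|endoftext|>` doubles as BOS, EOS, and unknown token; the training script relies on this when it uses `tokenizer.bos_token_id` to separate concatenated source files. A quick inspection, assuming the tokenizer files can be fetched from the Hub (the exact printed ids depend on the vocabulary):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("lvwerra/codeparrot-small")
print(tokenizer.__class__.__name__)     # a GPT-2 style tokenizer class
print(len(tokenizer))                   # should match vocab_size in config.json
print(tokenizer.bos_token, tokenizer.eos_token, tokenizer.unk_token)  # all <|endoftext|>
print(tokenizer("def hello_world():")["input_ids"])
```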
vocab.json ADDED
The diff for this file is too large to render. See raw diff
 
wandb/debug-internal.log ADDED
@@ -0,0 +1 @@
+run-20211106_211610-dtkf2u0m/logs/debug-internal.log
wandb/debug.log ADDED
@@ -0,0 +1 @@
+run-20211106_211610-dtkf2u0m/logs/debug.log
wandb/latest-run ADDED
@@ -0,0 +1 @@
+run-20211106_211610-dtkf2u0m
wandb/run-20211106_211610-dtkf2u0m/files/conda-environment.yaml ADDED
@@ -0,0 +1,131 @@
+name: codeparrot
+channels:
+  - pytorch
+  - nvidia
+  - defaults
+dependencies:
+  - _libgcc_mutex=0.1=main
+  - _openmp_mutex=4.5=1_gnu
+  - blas=1.0=mkl
+  - bzip2=1.0.8=h7b6447c_0
+  - ca-certificates=2021.7.5=h06a4308_1
+  - certifi=2021.5.30=py38h06a4308_0
+  - cudatoolkit=11.1.74=h6bb024c_0
+  - ffmpeg=4.3=hf484d3e_0
+  - freetype=2.10.4=h5ab3b9f_0
+  - gmp=6.2.1=h2531618_2
+  - gnutls=3.6.15=he1e5248_0
+  - intel-openmp=2021.3.0=h06a4308_3350
+  - jpeg=9b=h024ee3a_2
+  - lame=3.100=h7b6447c_0
+  - lcms2=2.12=h3be6417_0
+  - ld_impl_linux-64=2.35.1=h7274673_9
+  - libffi=3.3=he6710b0_2
+  - libgcc-ng=9.3.0=h5101ec6_17
+  - libgomp=9.3.0=h5101ec6_17
+  - libiconv=1.15=h63c8f33_5
+  - libidn2=2.3.2=h7f8727e_0
+  - libpng=1.6.37=hbc83047_0
+  - libstdcxx-ng=9.3.0=hd4cf53a_17
+  - libtasn1=4.16.0=h27cfd23_0
+  - libtiff=4.2.0=h85742a9_0
+  - libunistring=0.9.10=h27cfd23_0
+  - libuv=1.40.0=h7b6447c_0
+  - libwebp-base=1.2.0=h27cfd23_0
+  - lz4-c=1.9.3=h295c915_1
+  - mkl=2021.3.0=h06a4308_520
+  - mkl-service=2.4.0=py38h7f8727e_0
+  - mkl_fft=1.3.0=py38h42c9631_2
+  - mkl_random=1.2.2=py38h51133e4_0
+  - ncurses=6.2=he6710b0_1
+  - nettle=3.7.3=hbbd107a_1
+  - numpy=1.20.3=py38hf144106_0
+  - numpy-base=1.20.3=py38h74d4b33_0
+  - olefile=0.46=pyhd3eb1b0_0
+  - openh264=2.1.0=hd408876_0
+  - openjpeg=2.4.0=h3ad879b_0
+  - openssl=1.1.1l=h7f8727e_0
+  - pillow=8.3.1=py38h2c7a002_0
+  - pip=21.0.1=py38h06a4308_0
+  - python=3.8.11=h12debd9_0_cpython
+  - pytorch=1.9.0=py3.8_cuda11.1_cudnn8.0.5_0
+  - readline=8.1=h27cfd23_0
+  - setuptools=52.0.0=py38h06a4308_0
+  - six=1.16.0=pyhd3eb1b0_0
+  - sqlite=3.36.0=hc218d9a_0
+  - tk=8.6.10=hbc83047_0
+  - torchaudio=0.9.0=py38
+  - torchvision=0.10.0=py38_cu111
+  - typing_extensions=3.10.0.0=pyhca03da5_0
+  - wheel=0.37.0=pyhd3eb1b0_1
+  - xz=5.2.5=h7b6447c_0
+  - zlib=1.2.11=h7b6447c_3
+  - zstd=1.4.9=haebb681_0
+  - pip:
+    - absl-py==0.13.0
+    - accelerate==0.5.1
+    - aiohttp==3.7.4.post0
+    - async-timeout==3.0.1
+    - attrs==21.2.0
+    - cachetools==4.2.2
+    - chardet==4.0.0
+    - charset-normalizer==2.0.5
+    - click==8.0.1
+    - configparser==5.0.2
+    - datasets==1.13.0
+    - deepspeed==0.5.2
+    - dill==0.3.4
+    - docker-pycreds==0.4.0
+    - filelock==3.0.12
+    - fsspec==2021.8.1
+    - gitdb==4.0.7
+    - gitpython==3.1.18
+    - google-auth==1.35.0
+    - google-auth-oauthlib==0.4.6
+    - grpcio==1.40.0
+    - huggingface-hub==0.0.19
+    - idna==3.2
+    - joblib==1.0.1
+    - markdown==3.3.4
+    - multidict==5.1.0
+    - multiprocess==0.70.12.2
+    - ninja==1.10.2
+    - oauthlib==3.1.1
+    - packaging==21.0
+    - pandas==1.3.3
+    - pathtools==0.1.2
+    - promise==2.3
+    - protobuf==3.18.0
+    - psutil==5.8.0
+    - pyarrow==5.0.0
+    - pyasn1==0.4.8
+    - pyasn1-modules==0.2.8
+    - pyparsing==2.4.7
+    - python-dateutil==2.8.2
+    - pytz==2021.1
+    - pyyaml==5.4.1
+    - regex==2021.8.28
+    - requests==2.26.0
+    - requests-oauthlib==1.3.0
+    - rsa==4.7.2
+    - sacremoses==0.0.45
+    - sentry-sdk==1.3.1
+    - shortuuid==1.0.1
+    - smmap==4.0.0
+    - subprocess32==3.5.4
+    - tensorboard==2.6.0
+    - tensorboard-data-server==0.6.1
+    - tensorboard-plugin-wit==1.8.0
+    - tensorboardx==1.8
+    - termcolor==1.1.0
+    - tokenizers==0.10.3
+    - tqdm==4.62.2
+    - transformers==4.12.2
+    - triton==1.0.0
+    - urllib3==1.26.6
+    - wandb==0.12.2
+    - werkzeug==2.0.1
+    - xxhash==2.0.2
+    - yarl==1.6.3
+    - yaspin==2.1.0
+prefix: /home/leandro/miniconda3/envs/codeparrot
wandb/run-20211106_211610-dtkf2u0m/files/config.yaml ADDED
@@ -0,0 +1,92 @@
+wandb_version: 1
+
+_wandb:
+  desc: null
+  value:
+    cli_version: 0.12.2
+    framework: huggingface
+    huggingface_version: 4.12.2
+    is_jupyter_run: false
+    is_kaggle_kernel: false
+    python_version: 3.8.11
+    start_time: 1636233370
+    t:
+      1:
+      - 1
+      - 11
+      3:
+      - 16
+      4: 3.8.11
+      5: 0.12.2
+      6: 4.12.2
+      8:
+      - 5
+backend:
+  desc: null
+  value: nccl
+deepspeed_plugin:
+  desc: null
+  value: None
+device:
+  desc: null
+  value: cuda:0
+distributed_type:
+  desc: null
+  value: DistributedType.MULTI_GPU
+gradient_accumulation_steps:
+  desc: null
+  value: 1
+gradient_checkpointing:
+  desc: null
+  value: false
+initialized:
+  desc: null
+  value: 'True'
+learning_rate:
+  desc: null
+  value: 0.0005
+local_process_index:
+  desc: null
+  value: '0'
+lr_scheduler_type:
+  desc: null
+  value: cosine
+max_eval_steps:
+  desc: null
+  value: -1
+max_train_steps:
+  desc: null
+  value: 150000
+num_processes:
+  desc: null
+  value: '16'
+num_warmup_steps:
+  desc: null
+  value: 2000
+process_index:
+  desc: null
+  value: '0'
+save_checkpoint_steps:
+  desc: null
+  value: 15000
+seed:
+  desc: null
+  value: 1
+seq_length:
+  desc: null
+  value: 1024
+shuffle_buffer:
+  desc: null
+  value: 1000
+train_batch_size:
+  desc: null
+  value: 12
+use_fp16:
+  desc: null
+  value: 'True'
+valid_batch_size:
+  desc: null
+  value: 12
+weight_decay:
+  desc: null
+  value: 0.1
wandb/run-20211106_211610-dtkf2u0m/files/output.log ADDED
The diff for this file is too large to render. See raw diff
 
wandb/run-20211106_211610-dtkf2u0m/files/requirements.txt ADDED
@@ -0,0 +1,81 @@
+absl-py==0.13.0
+accelerate==0.5.1
+aiohttp==3.7.4.post0
+async-timeout==3.0.1
+attrs==21.2.0
+cachetools==4.2.2
+certifi==2021.5.30
+chardet==4.0.0
+charset-normalizer==2.0.5
+click==8.0.1
+configparser==5.0.2
+datasets==1.13.0
+deepspeed==0.5.2
+dill==0.3.4
+docker-pycreds==0.4.0
+filelock==3.0.12
+fsspec==2021.8.1
+gitdb==4.0.7
+gitpython==3.1.18
+google-auth-oauthlib==0.4.6
+google-auth==1.35.0
+grpcio==1.40.0
+huggingface-hub==0.0.19
+idna==3.2
+joblib==1.0.1
+markdown==3.3.4
+mkl-fft==1.3.0
+mkl-random==1.2.2
+mkl-service==2.4.0
+multidict==5.1.0
+multiprocess==0.70.12.2
+ninja==1.10.2
+numpy==1.20.3
+oauthlib==3.1.1
+olefile==0.46
+packaging==21.0
+pandas==1.3.3
+pathtools==0.1.2
+pillow==8.3.1
+pip==21.0.1
+promise==2.3
+protobuf==3.18.0
+psutil==5.8.0
+pyarrow==5.0.0
+pyasn1-modules==0.2.8
+pyasn1==0.4.8
+pyparsing==2.4.7
+python-dateutil==2.8.2
+pytz==2021.1
+pyyaml==5.4.1
+regex==2021.8.28
+requests-oauthlib==1.3.0
+requests==2.26.0
+rsa==4.7.2
+sacremoses==0.0.45
+sentry-sdk==1.3.1
+setuptools==52.0.0.post20210125
+shortuuid==1.0.1
+six==1.16.0
+smmap==4.0.0
+subprocess32==3.5.4
+tensorboard-data-server==0.6.1
+tensorboard-plugin-wit==1.8.0
+tensorboard==2.6.0
+tensorboardx==1.8
+termcolor==1.1.0
+tokenizers==0.10.3
+torch==1.9.0
+torchaudio==0.9.0a0+33b2469
+torchvision==0.10.0
+tqdm==4.62.2
+transformers==4.12.2
+triton==1.0.0
+typing-extensions==3.10.0.0
+urllib3==1.26.6
+wandb==0.12.2
+werkzeug==2.0.1
+wheel==0.37.0
+xxhash==2.0.2
+yarl==1.6.3
+yaspin==2.1.0
wandb/run-20211106_211610-dtkf2u0m/files/wandb-metadata.json ADDED
@@ -0,0 +1,24 @@
+{
+  "os": "Linux-5.4.0-1056-gcp-x86_64-with-glibc2.17",
+  "python": "3.8.11",
+  "heartbeatAt": "2021-11-06T21:16:11.096309",
+  "startedAt": "2021-11-06T21:16:10.355683",
+  "docker": null,
+  "gpu": "NVIDIA A100-SXM4-40GB",
+  "gpu_count": 16,
+  "cpu_count": 96,
+  "cuda": "10.1.243",
+  "args": [],
+  "state": "running",
+  "program": "codeparrot_training.py",
+  "codePath": "codeparrot_training.py",
+  "git": {
+    "remote": "https://huggingface.co/lvwerra/codeparrot-small",
+    "commit": "61c58c14c5a962d6f8a01bb8ce31737bb4092922"
+  },
+  "email": "leandro.vonwerra@gmail.com",
+  "root": "/home/leandro/codeparrot-small",
+  "host": "leandro-16x-v100",
+  "username": "leandro",
+  "executable": "/home/leandro/miniconda3/envs/codeparrot/bin/python"
+}
wandb/run-20211106_211610-dtkf2u0m/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+{"lr": 5.63230573291662e-14, "samples": 28800000, "steps": 149999, "loss/train": 1.469771146774292, "_runtime": 76270, "_timestamp": 1636309640, "_step": 150010, "loss/eval": 1.2280396223068237, "perplexity": 3.414529323577881}
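
The final run summary above ties back to the training script: 150'000 steps at 192 sequences per step give the reported 28,800,000 samples, and the reported perplexity is simply `exp(loss/eval)`, exactly as computed in `evaluate()`. A one-line check:

```python
import math

eval_loss = 1.2280396223068237   # "loss/eval" from wandb-summary.json above
print(math.exp(eval_loss))       # ≈ 3.4145, the reported "perplexity"
print(150_000 * 192)             # 28_800_000, the reported "samples"
```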
wandb/run-20211106_211610-dtkf2u0m/logs/debug-internal.log ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:53d06c5a631b6282f504e2581be60edaf25b6922e2ee3189c9140c2e3460d2c4
+size 67421355
wandb/run-20211106_211610-dtkf2u0m/logs/debug.log ADDED
@@ -0,0 +1,23 @@
+2021-11-06 21:16:10,357 INFO MainThread:4368 [wandb_setup.py:_flush():69] setting env: {}
+2021-11-06 21:16:10,357 INFO MainThread:4368 [wandb_setup.py:_flush():69] setting login settings: {}
+2021-11-06 21:16:10,357 INFO MainThread:4368 [wandb_init.py:_log_setup():348] Logging user logs to /home/leandro/codeparrot-small/wandb/run-20211106_211610-dtkf2u0m/logs/debug.log
+2021-11-06 21:16:10,358 INFO MainThread:4368 [wandb_init.py:_log_setup():349] Logging internal logs to /home/leandro/codeparrot-small/wandb/run-20211106_211610-dtkf2u0m/logs/debug-internal.log
+2021-11-06 21:16:10,358 INFO MainThread:4368 [wandb_init.py:init():381] calling init triggers
+2021-11-06 21:16:10,358 INFO MainThread:4368 [wandb_init.py:init():386] wandb.init called with sweep_config: {}
+config: {'train_batch_size': 12, 'valid_batch_size': 12, 'weight_decay': 0.1, 'shuffle_buffer': 1000, 'learning_rate': 0.0005, 'lr_scheduler_type': 'cosine', 'num_warmup_steps': 2000, 'gradient_accumulation_steps': 1, 'gradient_checkpointing': False, 'max_train_steps': 150000, 'max_eval_steps': -1, 'seq_length': 1024, 'seed': 1, 'save_checkpoint_steps': 15000, 'backend': 'nccl', 'deepspeed_plugin': 'None', 'distributed_type': 'DistributedType.MULTI_GPU', 'num_processes': '16', 'process_index': '0', 'local_process_index': '0', 'device': 'cuda:0', 'use_fp16': 'True', 'initialized': 'True'}
+2021-11-06 21:16:10,358 INFO MainThread:4368 [wandb_init.py:init():430] starting backend
+2021-11-06 21:16:10,358 INFO MainThread:4368 [backend.py:_multiprocessing_setup():70] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+2021-11-06 21:16:10,378 INFO MainThread:4368 [backend.py:ensure_launched():135] starting backend process...
+2021-11-06 21:16:10,389 INFO MainThread:4368 [backend.py:ensure_launched():139] started backend process with pid: 4634
+2021-11-06 21:16:10,391 INFO MainThread:4368 [wandb_init.py:init():435] backend started and connected
+2021-11-06 21:16:10,396 INFO MainThread:4368 [wandb_init.py:init():494] updated telemetry
+2021-11-06 21:16:10,397 INFO MainThread:4368 [wandb_init.py:init():517] communicating current version
+2021-11-06 21:16:10,957 INFO MainThread:4368 [wandb_init.py:init():522] got version response upgrade_message: "wandb version 0.12.6 is available! To upgrade, please run:\n $ pip install wandb --upgrade"
+
+2021-11-06 21:16:10,957 INFO MainThread:4368 [wandb_init.py:init():530] communicating run to backend with 30 second timeout
+2021-11-06 21:16:11,044 INFO MainThread:4368 [wandb_init.py:init():557] starting run threads in backend
+2021-11-06 21:16:12,320 INFO MainThread:4368 [wandb_run.py:_console_start():1605] atexit reg
+2021-11-06 21:16:12,320 INFO MainThread:4368 [wandb_run.py:_redirect():1479] redirect: SettingsConsole.REDIRECT
+2021-11-06 21:16:12,320 INFO MainThread:4368 [wandb_run.py:_redirect():1484] Redirecting console.
+2021-11-06 21:16:12,322 INFO MainThread:4368 [wandb_run.py:_redirect():1540] Redirects installed.
+2021-11-06 21:16:12,323 INFO MainThread:4368 [wandb_init.py:init():582] run started, returning control to user process
wandb/run-20211106_211610-dtkf2u0m/run-dtkf2u0m.wandb ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:65fd3ed9e21bd18b6b28e3984baf05f344d8d2728f5429650ae99a1c9aae34c8
+size 51195135