---
library_name: peft
license: apache-2.0
tags:
- meta-llama
- code
- instruct
- databricks-dolly-15k
- Llama-2-70b-hf
datasets:
- databricks/databricks-dolly-15k
base_model: meta-llama/Llama-2-70b-hf
---

For our finetuning process, we utilized the meta-llama/Llama-2-70b-hf model and the databricks-dolly-15k dataset. This dataset, a compilation of over 15,000 records produced by thousands of Databricks professionals, was specifically designed to improve the interactive capabilities of ChatGPT-like systems.

The dataset contributors crafted prompt/response pairs across eight distinct instruction categories: the seven categories described in the InstructGPT paper plus an open-ended, free-form category. To keep the content genuine and original, contributors were instructed not to source information online, except for certain instruction categories where Wikipedia served as the reference, and there was a strict directive against using generative AI to craft instructions or responses.

Contributors could also answer questions posed by their peers; they were encouraged to rephrase the original question and to answer only those queries they were confident about. In some categories, the data includes reference texts sourced from Wikipedia, so the `context` field may contain bracketed Wikipedia citation numbers (like [42]). For smoother downstream applications, it's advisable to remove these (see the sketch at the end of this card).

Our finetuning leveraged [MonsterAPI](https://monsterapi.ai)'s intuitive, no-code [LLM finetuner](https://docs.monsterapi.ai/fine-tune-a-large-language-model-llm). The run was surprisingly cost-effective: 3 epochs completed in 17.5 hours on an A100 80GB GPU, i.e. roughly 5.8 hours and `$19.25` per epoch, for a total cost of `$57.75`.

#### Hyperparameters & Run details:

- Epochs: 3
- Cost per epoch: $19.25
- Total finetuning cost: $57.75
- Model path: meta-llama/Llama-2-70b-hf
- Dataset: databricks/databricks-dolly-15k
- Learning rate: (not provided in the original data)
- Data split: (not provided in the original data; assumed Training: 90% / Validation: 10%)
- Gradient accumulation steps: (not provided in the original data)

###### Prompt Used:

```
### INSTRUCTION:
[instruction]

[context]

### RESPONSE:
[response]
```
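
As noted above, the `context` field for the Wikipedia-backed categories can contain bracketed citation numbers such as [42]. The snippet below is a minimal, illustrative sketch (not part of the original MonsterAPI pipeline) of stripping those markers with the Hugging Face `datasets` library; the regex and field name follow the databricks-dolly-15k schema.

```python
import re

from datasets import load_dataset

def strip_citation_markers(text: str) -> str:
    """Remove bracketed Wikipedia citation numbers such as [42]."""
    return re.sub(r"\[\d+\]", "", text).strip()

# Load databricks-dolly-15k and clean the `context` column before building prompts.
dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
dolly = dolly.map(lambda row: {"context": strip_citation_markers(row["context"])})
```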
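
A record can be rendered into the prompt template above with a small helper like the one below. The exact formatting used during finetuning (spacing and how an empty `context` is handled) is an assumption, so treat this as a sketch rather than the canonical template code.

```python
def build_prompt(instruction: str, context: str = "", response: str = "") -> str:
    """Render a dolly-style record into the ### INSTRUCTION / ### RESPONSE template."""
    context_block = f"\n\n{context}" if context else ""
    return f"### INSTRUCTION:\n{instruction}{context_block}\n\n### RESPONSE:\n{response}"

# Example: leave `response` empty when constructing an inference prompt.
print(build_prompt("List three uses of LoRA adapters."))
```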
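
Since this repository holds a PEFT (LoRA) adapter on top of meta-llama/Llama-2-70b-hf, inference typically means loading the base model and attaching the adapter. The sketch below assumes a hypothetical adapter repo id and fp16 weights; a 70B base model generally needs multiple GPUs or quantized loading in practice.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-2-70b-hf"
adapter_id = "<this-repo-id>"  # hypothetical: replace with the actual adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",  # spreads the 70B weights across available GPUs
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "### INSTRUCTION:\nExplain what the databricks-dolly-15k dataset contains.\n\n### RESPONSE:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```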