---
library_name: peft
tags:
- tiiuae
- code
- instruct
- databricks-dolly-15k
- falcon-40b
datasets:
- databricks/databricks-dolly-15k
base_model: tiiuae/falcon-40b
license: apache-2.0
---

### Finetuning Overview:

**Model Used:** tiiuae/falcon-40b

**Dataset:** databricks/databricks-dolly-15k

#### Dataset Insights:

The databricks-dolly-15k dataset comprises over 15,000 records authored by Databricks employees. Built to help models exhibit the interactive qualities of systems like ChatGPT, the dataset offers:

- Prompt/response pairs across eight distinct instruction categories.
- A blend of the seven categories from the InstructGPT paper and an open-ended category.
- Original content, free of generative-AI output and sourced primarily offline, with the exception of Wikipedia references.
- Interactive sessions in which contributors could answer and rephrase questions posed by their peers.

Note: Some categories include Wikipedia references, visible as bracketed citation numbers, e.g., [42]. Removing these citations is recommended for downstream applications (a filtering sketch is provided at the end of this card).

#### Finetuning Details:

Using [MonsterAPI](https://monsterapi.ai)'s no-code [LLM finetuner](https://docs.monsterapi.ai/fine-tune-a-large-language-model-llm), the finetuning run emphasized:

- **Cost-Effectiveness:** A complete run for just `$11.8`.
- **Efficiency:** On a single A6000 48 GB GPU, the session finished in 5 hours and 40 minutes.

#### Hyperparameters & Additional Details:

- **Epochs:** 1
- **Learning Rate:** 0.0002
- **Data Split:** Training 90% / Validation 10%
- **Gradient Accumulation Steps:** 4

A rough, hypothetical training-configuration sketch based on these values appears at the end of this card.

---

### Prompt Structure:

```
### INSTRUCTION:
[instruction]

[context]

### RESPONSE:
[response]
```

### Loss Metrics:

Training loss:

![training loss](train-loss.png "Training loss")
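---

### Dataset Cleanup (sketch):

As noted above, some databricks-dolly-15k records contain bracketed Wikipedia citation markers. Below is a minimal, hypothetical sketch of stripping them with the `datasets` library; the field names follow the published databricks-dolly-15k schema (`instruction`, `context`, `response`, `category`), and the regex is an assumption about what a citation marker looks like.

```python
# Sketch: strip bracketed citation markers such as "[42]" from
# databricks-dolly-15k before downstream use.
import re

from datasets import load_dataset

CITATION_RE = re.compile(r"\[\d+\]")  # assumption: citations look like "[42]"

def strip_citations(example):
    # "context" and "response" are the fields where citation markers appear.
    example["context"] = CITATION_RE.sub("", example["context"]).strip()
    example["response"] = CITATION_RE.sub("", example["response"]).strip()
    return example

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
dolly_clean = dolly.map(strip_citations)
```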
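### Training Configuration (sketch):

The MonsterAPI finetuner is no-code, so its exact internals are not published. The snippet below is only a rough, hypothetical PEFT/LoRA configuration that mirrors the hyperparameters listed above; the epochs, learning rate, split, and gradient-accumulation values come from this card, while the LoRA rank/alpha, target modules, batch size, and precision are assumptions for illustration.

```python
# Hypothetical PEFT/LoRA setup mirroring the documented hyperparameters.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments

# 90% train / 10% validation split, as documented above.
dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
splits = dolly.train_test_split(test_size=0.1, seed=42)

lora_config = LoraConfig(
    r=16,                                 # assumption
    lora_alpha=32,                        # assumption
    lora_dropout=0.05,                    # assumption
    target_modules=["query_key_value"],   # Falcon attention projection
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="falcon-40b-dolly-lora",
    num_train_epochs=1,                   # from this card
    learning_rate=2e-4,                   # from this card
    gradient_accumulation_steps=4,        # from this card
    per_device_train_batch_size=1,        # assumption
    bf16=True,                            # assumption
    logging_steps=10,
)
```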
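### Usage Example:

A minimal inference sketch, assuming the adapter in this repository is loaded on top of `tiiuae/falcon-40b` with `peft`. The adapter repo id below is a placeholder, and loading the 40B base model in bfloat16 assumes sufficient GPU memory (quantized loading may be needed on smaller hardware). The prompt follows the structure documented above.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "tiiuae/falcon-40b"
adapter_id = "<this-adapter-repo>"  # placeholder: the repo id of this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption: enough GPU memory for bf16
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Build a prompt following the structure documented above.
instruction = "Write a short note explaining what instruction tuning is."
context = ""
prompt = f"### INSTRUCTION:\n{instruction}\n\n{context}\n\n### RESPONSE:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```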