--- library_name: transformers tags: - unsloth - llama3 - indonesia license: llama3 datasets: - catinthebag/TumpengQA language: - id inference: false --- Document Title

Introducing the Kancil family of open models

Kancil

Kancil is a fine-tuned version of Llama 3 8B using synthetic QA dataset generated with Llama 3 70B. Version zero of Kancil is the first generative Indonesian LLM gain functional instruction performance using solely synthetic data.

Go straight to the colab demo

Selamat datang! I am ultra-overjoyed to introduce you... the 🦌 Kancil! It's a fine-tuned version of Llama 3 8B with the TumpengQA, an instruction dataset of 6.7 million words. Both the model and dataset is openly available in Huggingface. 📚 The dataset was synthetically generated from Llama 3 70B. A big problem with existing Indonesian instruction dataset is they're in reality not-very-good-translations of English datasets. Llama 3 70B can generate fluent Indonesian! (with minor caveats 😔) 🦚 This follows previous efforts for collection of open, fine-tuned Indonesian models, like Merak and Cendol. However, Kancil solely leverages synthetic data in a very creative way, which makes it a very unique contribution! ### Version 0.0 This is the very first working prototype, Kancil V0. It supports basic QA functionalities only. Currently, you cannot chat with it. This model was fine-tuned with QLoRA using the amazing Unsloth framework! It was built on top of [unsloth/llama-3-8b-bnb-4bit](https://huggingface.co/unsloth/llama-3-8b-bnb-4bit) and subsequently merged with the adapter back to 4 bit (no visible difference with merging back to fp 16). ### Uses ## Direct Use This model is developed with research purposes for researchers or general AI hobbyists. However, it has one big application: You can have lots of fun with it! ## Out-of-Scope Use This is a research preview model with minimal safety curation. Do not use this model for commercial or practical applications. You are also not allowed to use this model without having fun. ## Getting started As mentioned, this model was trained with Unsloth. Please use its code for better experience. ``` # Install dependencies. You need GPU to run this (at least T4) %%capture !pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git" !pip install --no-deps "xformers<0.0.26" trl peft accelerate bitsandbytes ``` ``` # Load the model from unsloth import FastLanguageModel import torch model, tokenizer = FastLanguageModel.from_pretrained( model_name = "catinthebag/Kancil-V0-llama3", max_seq_length = max_seq_length, dtype = torch.bfloat16, # Will default to float 16 if not available load_in_4bit = True, ) ``` ``` # This model was trained on this specific prompt template. Changing it might lead to performance degradations. prompt_template = """User: {prompt} Asisten: {response}""" EOS_TOKEN = tokenizer.eos_token def formatting_prompts_func(examples): inputs = examples["prompt"] outputs = examples["response"] texts = [] for input, output in zip(inputs, outputs): text = prompt_template.format(prompt=input, response=output) + EOS_TOKEN texts.append(text) return { "text" : texts, } pass ``` ``` # Start generating! FastLanguageModel.for_inference(model) inputs = tokenizer( [ prompt_template.format( prompt="Bagaimana canting dan lilin digunakan untuk menggambar pola batik?", response="",) ], return_tensors = "pt").to("cuda") outputs = model.generate(**inputs, max_new_tokens = 600, temperature=.8, use_cache = True) print(tokenizer.batch_decode(outputs)[0].replace('\\n', '\n')) ``` **Note:** For Version 0 there is an issue with the dataset where the newline characters are interpreted as literal strings. Very sorry about this! 😔 Please keep the .replace() method to fix newline errors. ## Acknowledgments - **Developed by:** Afrizal Hasbi Azizy - **Funded by:** [DF Labs](dflabs.id) - **License:** Llama 3 Community License Agreement