---
license: other
datasets:
- blip-solutions/SlovAlpaca
language:
- sk
---

# SlovAlpaca

This repository contains LoRA weights fine-tuned on a Slovak translation of the original Alpaca dataset (see the dataset card for more details).

## Training procedure

Training was done on the 7B LLaMA model (decapoda-research/llama-7b-hf) quantized to 8 bits, with the following hyperparameters (see the sketch at the end of this card for how these could map onto a PEFT configuration):

```
MICRO_BATCH_SIZE = 3
BATCH_SIZE = 128
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
EPOCHS = 2  # paper uses 3
LEARNING_RATE = 2e-5  # from the original paper
CUTOFF_LEN = 256
LORA_R = 4
LORA_ALPHA = 16
LORA_DROPOUT = 0.05
```

The sole goal of this project is to explore the effects of single-language fine-tuning using the same dataset and methods as the original paper, and to compare the results.

## Citation

```
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```

## How to use

### Prerequisites

```
!pip install datasets loralib sentencepiece
!pip uninstall -y transformers
!pip install git+https://github.com/zphang/transformers@c3dc391#egg=transformers
!pip install git+https://github.com/huggingface/peft.git
!pip install bitsandbytes
```

### Load model

```
from peft import PeftModel
from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig

tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")

model = LLaMAForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "blip-solutions/SlovAlpaca")
```

### Generation

Here is a Colab notebook for inference: https://colab.research.google.com/drive/1z4aMG7tGjchLBlg_iXDuqt3sH6bQRuQk?usp=sharing

```
PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Kde žijú lamy?

### Response:"""

inputs = tokenizer(
    PROMPT,
    return_tensors="pt",
)
input_ids = inputs["input_ids"].cuda()

generation_config = GenerationConfig(
    temperature=0.6,
    top_p=0.95,
    repetition_penalty=1.15,
)
print("Generating...")
generation_output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=128,
)
for s in generation_output.sequences:
    print(tokenizer.decode(s))
```

### Response

```
Generating...
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Kde žijú lamy?

### Response:
Lamy žiju v horách, na poli, alebo v lesoch.
```

In English, the instruction asks "Where do llamas live?" and the model answers "Llamas live in the mountains, in fields, or in forests."
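
For convenience, here is a minimal wrapper that ties the steps above together: it formats a Slovak instruction into the same Alpaca prompt template used in the Generation example and returns only the generated continuation. This is a sketch, not part of the repository; it assumes `model`, `tokenizer`, and the `GenerationConfig` import from the "Load model" section are already in scope, and the `generate_response` helper name is our own.

```python
def generate_response(instruction: str, max_new_tokens: int = 128) -> str:
    # Same no-input Alpaca template as in the Generation example above.
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:"
    )
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].cuda()
    output = model.generate(
        input_ids=input_ids,
        generation_config=GenerationConfig(
            temperature=0.6,
            top_p=0.95,
            repetition_penalty=1.15,
        ),
        max_new_tokens=max_new_tokens,
    )
    # Without return_dict_in_generate, generate() returns the token ids directly;
    # everything after "### Response:" is the model's answer.
    decoded = tokenizer.decode(output[0])
    return decoded.split("### Response:")[-1].strip()

print(generate_response("Kde žijú lamy?"))
```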
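
The training script itself is not included in this card. As a rough, non-authoritative sketch of how the hyperparameters listed under "Training procedure" would typically map onto a PEFT LoRA setup: the `target_modules`, `fp16`, and `output_dir` values below are assumptions following the common alpaca-lora recipe and are not stated in this card, and the base `model` is assumed to be the 8-bit LLaMA loaded as in "Load model" (before applying the SlovAlpaca adapter).

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training
from transformers import TrainingArguments

# Prepare the 8-bit base model for LoRA training.
model = prepare_model_for_int8_training(model)

lora_config = LoraConfig(
    r=4,                                   # LORA_R
    lora_alpha=16,                         # LORA_ALPHA
    lora_dropout=0.05,                     # LORA_DROPOUT
    target_modules=["q_proj", "v_proj"],   # assumption, not stated in this card
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# CUTOFF_LEN = 256 would be applied when tokenizing the dataset,
# e.g. tokenizer(..., truncation=True, max_length=256).
training_args = TrainingArguments(
    per_device_train_batch_size=3,         # MICRO_BATCH_SIZE
    gradient_accumulation_steps=128 // 3,  # BATCH_SIZE // MICRO_BATCH_SIZE
    num_train_epochs=2,                    # EPOCHS
    learning_rate=2e-5,                    # LEARNING_RATE
    fp16=True,                             # assumption
    output_dir="./slovalpaca-lora",        # hypothetical path
)
```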