---
license: apache-2.0
datasets:
- mlsum
- squad_es
language:
- es
library_name: adapter-transformers
pipeline_tag: text2text-generation
---

# Model Card for opt-6.7b-lora-sag-t12000-v1200-v1

## Adapter Description

This adapter was created as part of the SomosNLP Hackathon 2023 with the [PEFT](https://github.com/huggingface/peft) library. It fine-tunes the base model [facebook/opt-6.7b](https://huggingface.co/facebook/opt-6.7b) on the [SQUAD_ES](https://huggingface.co/datasets/squad_es) (v1.1.0) and [MLSUM](https://huggingface.co/datasets/mlsum) datasets using the *LoRA* method.

- **Developed by:**
  - 🇵🇪 Enrique Ubaldo
  - 🇵🇪 Fernando Alva-Manchego
  - 🇵🇪 @Levi111
  - 🇲🇽 @IvanHU
  - 🇨🇺 [Alberto Carmona Barthelemy](https://huggingface.co/milyiyo)
- **Model type:** Text2Text Generation
- **Language(s) (NLP):** Spanish
- **License:** apache-2.0

## Uses

This model is designed for instruction following in Spanish, specifically for generating summaries, generating questions about a given text, and answering questions based on a given context.

## Bias, Risks, and Limitations

Please note that this model inherits the biases of its base model. You can review them at the following [link](https://huggingface.co/facebook/opt-6.7b#limitations-and-bias).

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

## How to Get Started with the Model

Use the code below to get started with the model.

```py
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

peft_model_id = "hackathon-somos-nlp-2023/opt-6.7b-lora-sag-t12000-v1200-v1"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model in 8-bit and its tokenizer.
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path,
                                             return_dict=True,
                                             load_in_8bit=True,
                                             device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(model, peft_model_id)
model.config.use_cache = True

generation_config = GenerationConfig(temperature=0.8, top_p=0.75, top_k=40)


def gen_summary(text):
    """Generate a summary of the given text."""
    input_text = f'Instruction: Elabora un resume del siguiente texto.\nInput: {text}\nOutput: '
    batch = tokenizer(input_text, return_tensors='pt')
    with torch.cuda.amp.autocast():
        output_tokens = model.generate(**batch, max_new_tokens=256, generation_config=generation_config)
    output = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
    return output


def gen_question(text):
    """Generate a question whose answer can be found in the given text."""
    input_text = f'Instruction: Dado el siguiente texto quiero que generes una pregunta cuya respuesta se encuentre en él.\nInput: {text}\nOutput: '
    batch = tokenizer(input_text, return_tensors='pt')
    with torch.cuda.amp.autocast():
        output_tokens = model.generate(**batch, max_new_tokens=256, generation_config=generation_config)
    output = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
    return output


def gen_qna(context, question):
    """Answer a question about the given context."""
    input_text = f"""Instruction: Te voy a proporcionar un texto del cual deseo que me respondas una pregunta. 
El texto es el siguiente: `{context}`\nInput: {question}\nOutput: """
    batch = tokenizer(input_text, return_tensors='pt')
    with torch.cuda.amp.autocast():
        output_tokens = model.generate(**batch, max_new_tokens=256, generation_config=generation_config)
    output = tokenizer.decode(output_tokens[0], skip_special_tokens=True)
    return output
```
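As a quick check, the helper functions defined above can be called as follows. The context and question below are purely illustrative and are not taken from the training data.

```py
# Illustrative usage of the helper functions defined above; any Spanish text works as context.
context = ("La Torre Eiffel es una estructura de hierro de 330 metros de altura "
           "situada en París, construida por Gustave Eiffel para la Exposición "
           "Universal de 1889.")

print(gen_summary(context))                                   # short summary of the context
print(gen_question(context))                                  # a question answerable from the context
print(gen_qna(context, "¿Quién construyó la Torre Eiffel?"))  # answer to a specific question
```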
## Training Details

### Training Data

- [SQUAD_ES](https://huggingface.co/datasets/squad_es) (Subset: v1.1.0)
- [MLSUM](https://huggingface.co/datasets/mlsum) (Subset: es)

### Training Procedure

We selected 4000 examples per task for training and 400 examples per task for validation. Across the three tasks, this gives a total of 12000 training examples and 1200 validation examples. The Colab notebook used for training is [here](https://colab.research.google.com/drive/1mwLUNgsSLrRbHUiXJuaU_6EQaYrYg10A?usp=sharing).

#### Training Hyperparameters

- **Training regime:** fp16
- **Steps:** 1240
- **Learning rate:** 2e-4
- **Training loss:** 1.013
- **Validation loss:** 1.53
- **Training duration:** 7.5 hours

WandB report: [here](https://wandb.ai/milyiyo/huggingface/reports/eval-loss-train-loss-23-04-08-19-14-44---Vmlldzo0MDEwMzQ5?accessToken=o5f3pldc7np25ch8123rcjz4195vrcc9nl31r26i130jhhpvuabie3ezrw6dcs6r)
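For orientation, a PEFT/LoRA training setup consistent with the hyperparameters above could look like the sketch below. The LoRA rank, alpha, dropout, target modules, batch sizes, and the toy dataset are assumptions for illustration only; the exact configuration lives in the linked Colab and WandB report.

```py
# Minimal sketch of a LoRA fine-tuning setup consistent with the reported
# hyperparameters (fp16, lr 2e-4, 1240 steps). LoRA settings, batch sizes,
# and the toy dataset are assumptions, not values taken from this card.
import transformers
from datasets import Dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training  # prepare_model_for_kbit_training in newer PEFT versions
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "facebook/opt-6.7b"
model = AutoModelForCausalLM.from_pretrained(base_model, load_in_8bit=True, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Prepare the 8-bit model for training and attach LoRA adapters.
model = prepare_model_for_int8_training(model)
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,    # assumed LoRA settings
    target_modules=["q_proj", "v_proj"],       # assumed OPT attention projections
    bias="none", task_type="CAUSAL_LM",
))

# Toy stand-in for the instruction-formatted SQuAD-es / MLSUM examples.
texts = ["Instruction: Elabora un resume del siguiente texto.\nInput: ...\nOutput: ..."]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=512), batched=True
)

trainer = transformers.Trainer(
    model=model,
    train_dataset=dataset,
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,    # assumed
        gradient_accumulation_steps=8,    # assumed
        learning_rate=2e-4,               # from this card
        fp16=True,                        # from this card
        max_steps=1240,                   # from this card
        logging_steps=50,
        output_dir="opt-6.7b-lora-sag",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # silence the cache warning during training
trainer.train()
```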