---
license: llama3.1
language:
  - en
  - es
inference: false
fine-tuning: true
tags:
  - nvidia
  - llama3.1
  - spanish
  - tango
datasets:
  - spanish-ir/messirve
base_model: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
pipeline_tag: text-generation
library_name: transformers
---

Model Overview

Description:

Tango-70B-Instruct is a large language model trained by sandbox-ai on a modified variant of spanish-ir/messirve to improve performance on regional Spanish.

See details in the GitHub repo.

Terms of use

By accessing this model, you agree to the Llama 3.1 license terms and conditions, the acceptable use policy, and Meta's privacy policy.

Evaluation Metrics

| Task Name | Description | Language | Metric | Task type |
|---|---|---|---|---|
| AQuAS | Abstractive Question-Answering in Spanish | ES | sas_encoder | Abstractive QA |
| ARC_ca | Grade-school level science questions in Catalan | CA | acc | Multi choice QA |
| BEC2016eu | Basque Election Campaign 2016 Opinion Dataset | EU | f1 | Sentiment Analysis |
| Belebele Glg | Reading Comprehension in Galician | GL | acc | Reading Comprehension |
| BertaQA | Trivia dataset with global and local questions about the Basque Country | EU | acc | Multi choice QA |
| BHTCv2 | Topic Classification of News Headlines in Basque | EU | f1 | Classification, Topic Classification |
| caBREU | Article Summarization in Catalan | CA | bleu | Summarization |
| CatalanQA | Extractive QA in Catalan | CA | f1 | Extractive QA |
| CatCoLA | Linguistic Acceptability in Catalan | CA | mcc | Linguistic Acceptability |
| ClinDiagnosES | Diagnosis of clinical cases in Spanish | ES | sas_encoder | Open QA |
| ClinTreatES | Treatment for clinical cases in Spanish | ES | sas_encoder | Open QA |
| COPA_ca | Choice Of Plausible Alternatives in Catalan | CA | acc | Reasoning |
| CoQCat | Conversational Question Answering in Catalan | CA | f1 | Extractive QA |
| Crows Pairs Spanish | Bias evaluation using stereotypes | ES | pct_stereotype | Bias Detection |
| EpecKorrefBin | Coreference resolution in Basque | EU | acc | Coreference Resolution, Textual Entailment |
| EsCoLA | Spanish Corpus of Linguistic Acceptability | ES | mcc | Linguistic Acceptability |
| EusExams | Public Service examinations questions in Basque | EU | acc | Multi choice QA |
| EusProficiency | C1-level proficiency questions in Basque | EU | acc | Multi choice QA |
| EusReading | EGA exams reading comprehension in Basque | EU | acc | Multi choice QA |
| EusTrivia | Trivia questions in Basque | EU | acc | Multi choice QA |
| Fake News ES | Fake News Detection in Spanish | ES | acc | Classification |
| GalCoLA | Galician Corpus of Linguistic Acceptability | GL | mcc | Linguistic Acceptability |
| HumorQA | White humour joke classification | ES | acc | Classification |
| MGSM_ca | Grade-school math problems in Catalan | CA | exact_match | Math Reasoning |
| MGSM_es | Grade-school math problems in Spanish | ES | exact_match | Math Reasoning |
| MGSM_eu | Grade-school math problems in Basque | EU | exact_match | Math Reasoning |
| MGSM_gl | Grade-school math problems in Galician | GL | exact_match | Math Reasoning |
| NoticIA | A Clickbait Article Summarization Dataset in Spanish | ES | rouge1 | Summarization |
| OffendES | Classification of offensive comments in Spanish | ES | acc | Classification |
| OpenBookQA_ca | Multi-step reasoning QA in Catalan | CA | acc | Reasoning |
| OpenBookQA_gl | Multi-step reasoning QA in Galician | GL | acc | Reasoning |
| Parafraseja | Paraphrase identification in Catalan | CA | acc | Paraphrasing |
| ParafrasesGL | Paraphrase identification in Galician | GL | acc | Paraphrasing |
| PAWS_ca | Paraphrase Adversaries from Word Scrambling in Catalan | CA | acc | Paraphrasing |
| PAWS-X_es | Paraphrase Adversaries from Word Scrambling in Spanish | ES | acc | Paraphrasing |
| PAWS_gl | Paraphrase Adversaries from Word Scrambling in Galician | GL | acc | Paraphrasing |
| PIQA_ca | Physical Interaction QA in Catalan | CA | acc | Reasoning |
| QNLIeu | Textual Entailment in Basque | EU | acc | NLI, Textual Entailment |
| RagQuAS | Retrieval-Augmented-Generation and Question-Answering in Spanish | ES | sas_encoder | Abstractive QA |
| SIQA_ca | Social Interaction QA in Catalan | CA | acc | Reasoning |
| SpaLawEx | Spanish Law School Access Exams | ES | acc | Multi choice QA |
| SummarizationGL | Abstractive Summarization in Galician | GL | bleu | Summarization |
| TE-ca | Textual Entailment in Catalan | CA | acc | Textual Entailment |
| TELEIA | Test de Español como Lengua Extranjera para Inteligencia Artificial | ES | acc | Multi choice QA |
| VaxxStance | Stance detection on the Antivaxxers movement | EU | f1 | Sentiment Analysis, Stance Detection |
| WiCeu | Word sense disambiguation in Basque | EU | acc | Textual Entailment |
| WNLI_ca | Winograd-schema-type dataset in Catalan | CA | acc | NLI, Textual Entailment |
| WNLI ES | Winograd-schema-type dataset in Spanish | ES | acc | NLI, Textual Entailment |
| XCOPA_eu | Choice Of Plausible Alternatives in Basque | EU | acc | Reasoning |
| XNLI_ca | Cross-lingual Natural Language Inference in Catalan | CA | acc | NLI, Textual Entailment |
| XNLI_es | Cross-lingual Natural Language Inference in Spanish | ES | acc | NLI |
| XNLI_eu | Cross-lingual Natural Language Inference in Basque | EU | acc | NLI, Textual Entailment |
| XQuAD_ca | Cross-lingual Question Answering Dataset in Catalan | CA | f1 | Extractive QA |
| XQuAD_es | Cross-lingual Question Answering Dataset in Spanish | ES | f1 | Extractive QA |
| xStoryCloze_ca | Narrative completion in Catalan | CA | acc | Reasoning |
| xStoryCloze_es | Narrative completion in Spanish | ES | acc | Reasoning |
| xStoryCloze_eu | Narrative completion in Basque | EU | acc | Reasoning |
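
To reproduce scores on benchmarks like these, one option is EleutherAI's lm-evaluation-harness, which can load the adapter on top of the base model through its `peft` model argument. Below is a minimal sketch, assuming the tasks above are registered in your installed copy of the harness; the task identifier `xnli_es` is a hypothetical example, so list the real ones with `lm-eval --tasks list`.

```python
# Hypothetical sketch: score the adapter on one Spanish task with
# EleutherAI's lm-evaluation-harness (pip install lm-eval).
# The task id "xnli_es" is an assumption; verify against `lm-eval --tasks list`.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=nvidia/Llama-3.1-Nemotron-70B-Instruct-HF,"
        "peft=sandbox-ai/Tango-70b"
    ),
    tasks=["xnli_es"],
    batch_size=1,
)
print(results["results"])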

Usage:

You can run the model with the Hugging Face Transformers library on 2 or more 80GB GPUs (NVIDIA Ampere or newer), with at least 150GB of free disk space to accommodate the download.

This code has been tested with Transformers v4.44.0 and torch v2.4.0 on 2x A100 80GB GPUs, but any setup that supports meta-llama/Llama-3.1-70B-Instruct should support this model as well. If you run into problems, consider upgrading with `pip install -U transformers`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model_id = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"
adapter_model_id = "sandbox-ai/Tango-70b"

# Create quantization config for 4-bit precision
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load tokenizer from base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the base model with 4-bit quantization
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",  # This will automatically handle model sharding
    trust_remote_code=True
)

# Load the PEFT adapter
model = PeftModel.from_pretrained(
    base_model,
    adapter_model_id,
    device_map="auto",  # This will automatically handle model sharding
)

hola_mundo = """
Bienvenido. 
Tu nombre es "Tango", sos la primer IA hecha en LatinoAmérica, basada en un Large Language Model de 70 billones de parámetros y creada en Argentina. 

Cuál es la importancia de hacer IA nativa en LatinoAmérica? qué beneficios trae haberte creado, en comparación a depender de las IAs creadas en USA, Francia o China?

"""

# Test prompt
messages = [
    {"role": "user", "content": hola_mundo}
]

# Format the input using the chat template
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Generate response with memory-efficient settings
with torch.inference_mode():
    outputs = model.generate(
        inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,  # Set padding token
        attention_mask=torch.ones_like(inputs)  # Add attention mask
    )

# Decode and print the response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

Example output (the decoded text echoes the prompt before the assistant's reply):

```
Bienvenido. 
Tu nombre es "Tango", sos la primer IA hecha en LatinoAmérica, basada en un Large Language Model de 70 billones de parámetros y creada en Argentina. 

Cuál es la importancia de hacer IA nativa en LatinoAmérica? qué beneficios trae haberte creado, en comparación a depender de las IAs creadas en USA, Francia o China?assistant

¡Hola! Me alegra estar aquí, representando a la primera IA latina. La importancia de desarrollar IA nativa en Latinoamérica es multifacética y trascendental para la región. Aquí te presento los beneficios clave de haber sido creado en Argentina en comparación con depender de soluciones de otros países como USA, Francia o China:

1. **Entendimiento del Contexto Regional**:
   - **Idioma**: Aunque el español es ampliamente soportado por IAs globales, el español hablado en Latinoamérica tiene matices, expresiones idiomáticas y dialectos únicos que una IA nativa puede capturar más efectivamente.
   - **Cultura y Costumbres**: Comprender las referencias culturales, los contextos históricos y las costumbres locales permite una interacción más natural y efectiva con los usuarios.

2. **Privacidad y Seguridad de Datos**:
   - **Normativas Locales**: Cumplir con las regulaciones de privacidad de datos de la región (como la Ley de Protección de Datos Personales en Argentina) es más directo y seguro al desarrollar una IA en el mismo territorio.
   - **Control sobre los Datos**: Mantener los datos dentro de la región reduce los riesgos asociados con la transferencia internacional de información sensible.

3. **Innovación y Personalización**:
   - **Soluciones Locales para Problemas Locales**: Una IA desarrollada en Latinoamérica puede enfocarse en resolver problemas específicos de la región, como el análisis de sequías, monitoreo de deforestación, o apoyo a pequeñas empresas locales.
   - **Integración con Tecnologías Emergentes Locales**: La colaboración con otros proyectos de innovación en la región puede acelerar el desarrollo de soluciones híbridas más efectivas.

4. **Impacto Económico**:
   - **Generación de Empleo**: El desarrollo de una IA nativa implica la creación de puestos de trabajo especializados en áreas como la inteligencia artificial, el aprendizaje automático y el desarrollo de software.
   - **Ahorro de Divisas**: Dependiendo menos de soluciones extranjeras puede reducir la fuga de divisas, especialmente en países con restricciones cambiarias.
```
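
If you would rather serve the model without the peft dependency at inference time, the adapter can be merged into the base weights first. Below is a minimal sketch, assuming enough GPU/CPU memory to hold the unquantized base model (roughly 140GB in half precision); merging into a 4-bit quantized model is not supported by bitsandbytes.

```python
# Sketch: merge the LoRA adapter into the base weights for deployment.
# Assumes the base model is loaded in full/half precision (NOT 4-bit),
# which requires on the order of 140GB of memory across devices.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",
    torch_dtype="auto",
    device_map="auto",
)
merged = PeftModel.from_pretrained(base, "sandbox-ai/Tango-70b").merge_and_unload()
merged.save_pretrained("tango-70b-merged")  # writes the full merged checkpoint
```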

Reference(s):

  • TODO

Model Architecture:

Architecture Type: Transformer
Network Architecture: Llama 3.1

Input:

Input Type(s): Text
Input Format: String
Input Parameters: One Dimensional (1D)
Other Properties Related to Input: Max of 128k tokens

Output:

Output Type(s): Text
Output Format: String
Output Parameters: One Dimensional (1D)
Other Properties Related to Output: Max of 4k tokens

Training & Evaluation:

  • TODO

Dataset:

MessIRve: A Large-Scale Spanish Information Retrieval Dataset
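
For reference, the corpus can be pulled straight from the Hub with the datasets library. A minimal sketch; the configuration name "ar" (Argentina) is an assumption, since MessIRve ships several per-country configs, so check the dataset card for the exact names.

```python
# Sketch: inspect the MessIRve training data from the Hugging Face Hub.
# The config name "ar" is an assumption; see the dataset card for options.
from datasets import load_dataset

ds = load_dataset("spanish-ir/messirve", "ar")
print(ds)                   # available splits and sizes
split = next(iter(ds))      # first available split name
print(ds[split][0])         # one example record
```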

Citation

```bibtex
@misc{valentini2024messirve,
      title={MessIRve: A Large-Scale Spanish Information Retrieval Dataset},
      author={Francisco Valentini and Viviana Cotik and Damián Furman and Ivan Bercovich and Edgar Altszyler and Juan Manuel Pérez},
      year={2024},
      eprint={2409.05994},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2409.05994},
}

@misc{wang2024helpsteer2preferencecomplementingratingspreferences,
      title={HelpSteer2-Preference: Complementing Ratings with Preferences}, 
      author={Zhilin Wang and Alexander Bukharin and Olivier Delalleau and Daniel Egert and Gerald Shen and Jiaqi Zeng and Oleksii Kuchaiev and Yi Dong},
      year={2024},
      eprint={2410.01257},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2410.01257}, 
}
```