
Model Card for HausaLlama

Model Details

HausaLlama is an 8B-parameter language model that builds upon the foundation of meta-llama/Meta-Llama-3-8B. It has been specifically enhanced to excel at processing and generating text in the Hausa language. This model aims to improve natural language understanding and generation capabilities for Hausa-speaking users and researchers.

Model Description

Key features:

  • Improved performance on Hausa language tasks
  • Maintains general language capabilities of the original Llama 3 model
  • Optimized for both understanding and generating Hausa text

Training

The training process for HausaLlama involved two main stages:

1. LoRA-based Continual Pre-training: We conducted continual pre-training using publicly available Hausa corpora, which we pre-processed using the Meta/Llama3 tokenizer. The primary focus was on causal language modeling, specifically training the model to predict the next Hausa tokens based on preceding Hausa tokens. Our continual pre-training implemented the LoRA technique, wherein we froze the base parameters of the foundation Meta/Llama3 model and introduced additional lightweight components (adapters). These adapters were specifically trained to capture the intricacies, terminologies, and nuances of the Hausa language. This approach facilitated a balance between leveraging the knowledge embedded in the pre-trained Meta/Llama3 model and optimizing it for the Hausa language, all without incurring the computational costs associated with retraining the entire Llama3 model.
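The core idea above, freezing the base weights and learning only a low-rank update, can be sketched numerically. This is a minimal illustration with NumPy and tiny made-up dimensions, not the actual 8B Llama3 weights or the peft training stack:

```python
import numpy as np

# Minimal sketch of the LoRA idea: the frozen base weight W stays untouched,
# and only a low-rank update (alpha / r) * B @ A is learned.
rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16                   # illustrative dims; a real adapter rank is model-specific
W = rng.standard_normal((d, d))          # frozen base projection (stands in for a Llama weight)
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection, small random init
B = np.zeros((d, r))                     # trainable up-projection, zero init

x = rng.standard_normal((1, d))
y = x @ (W + (alpha / r) * (B @ A)).T    # adapted forward pass

# Because B is zero-initialized, B @ A = 0 at the start of training,
# so the adapted model initially matches the base model exactly.
assert np.allclose(y, x @ W.T)
```

Only A and B (2 × r × d parameters here) would receive gradients, which is why LoRA avoids the cost of retraining the full model.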

2. LoRA-based Instruction Tuning:

  • Fine-tuned on a curated dataset of Hausa instructions and responses
  • Included task-specific data to improve performance on common language tasks
  • Emphasized maintaining coherence and contextual understanding in Hausa
  • Incorporated safety datasets to improve the model's ability to generate safe and ethical responses
  • Included examples of harmful content and appropriate non-harmful alternatives
  • Focused on reducing biases and improving the model's understanding of cultural sensitivities in the Hausa context

Approximate dataset sizes:

  • Continual pre-training: 8.4 GB of text
  • Instruction tuning: 66,280 instruction-response pairs
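The instruction-response pairs described above have to be serialized into a single training string. The sketch below shows one common way to do that; the "### Instruction / ### Response" template and the example pair are illustrative assumptions, not the exact format used for HausaLlama:

```python
# Hypothetical instruction-tuning formatter; the prompt template below is an
# assumption for illustration, not HausaLlama's actual template.
def format_example(instruction: str, response: str) -> str:
    return (
        "### Instruction:\n" + instruction.strip() + "\n\n"
        "### Response:\n" + response.strip()
    )

pair = {
    # "Translate this sentence to English: 'Ina kwana?'"
    "instruction": "Fassara wannan jumla zuwa Turanci: 'Ina kwana?'",
    "response": "Good morning. (Literally: 'How did you sleep?')",
}
text = format_example(pair["instruction"], pair["response"])
print(text)
```

Each formatted string is then tokenized and used as a causal-language-modeling target, just as in the pre-training stage.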

  • Developed by: Jacaranda Health
  • Model type: Llama
  • Language(s) (NLP): Hausa and English
  • License: CC BY-NC-SA 4.0 DEED
  • Model Developers: Stanslaus Mwongela, Jay Patel, Sathy Rajasekharan, Lyvia Lusiji, Francesco Piccino, Mfoniso Ukwak, Ellen Sebastian

Uses

HausaLlama is optimized for downstream tasks, notably those demanding instructional datasets in Hausa, English, or both. Organizations can further fine-tune it for their specific domains. Potential areas include:

  • Question-answering within specific domains.
  • Assistant-driven chat capabilities: healthcare, agriculture, legal, education, tourism and hospitality, public services, financial sectors, communication, customer assistance, commerce, etc.
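Organizations exploring these use cases can load the model with the Hugging Face transformers library. The sketch below is illustrative: the repository id Jacaranda/HausaLlama is an assumption based on this card, access must first be granted on the Hub, and the prompt is only an example:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; gated access must be accepted on the Hub first.
model_id = "Jacaranda/HausaLlama"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Example Hausa prompt: "What are the benefits of water to the body?"
prompt = "Menene amfanin ruwa ga jiki?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For domain-specific deployments, the same loading code can be used as the starting point for further LoRA fine-tuning.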


Out-of-Scope Use

The use of the developed Large Language Model (LLM) capabilities is for research, social good, and internal use purposes only. For commercial use and distribution, organisations/individuals are encouraged to contact Jacaranda Health. To ensure the ethical and responsible use of HausaLlama, we have outlined a set of guidelines. These guidelines categorize activities and practices into three main areas: prohibited actions, high-risk activities, and deceptive practices. By understanding and adhering to these directives, users can contribute to a safer and more trustworthy environment.

  1. Prohibited Actions:
  • Illegal Activities: Avoid promoting violence, child exploitation, human trafficking, and other crimes.
  • Harassment and Discrimination: No acts that bully, threaten, or discriminate.
  • Unauthorized Professions: No unlicensed professional activities.
  • Data Misuse: Handle personal data with proper consents.
  • Rights Violations: Respect third-party rights.
  • Malware Creation: Avoid creating harmful software.
  2. High-Risk Activities:
  • Dangerous Industries: No usage in military, nuclear, or espionage domains.
  • Weapons and Drugs: Avoid illegal arms or drug activities.
  • Critical Systems: No usage in key infrastructures or transport technologies.
  • Promotion of Harm: Avoid content advocating self-harm or violence.
  3. Deceptive Practices:
  • Misinformation: Refrain from creating/promoting fraudulent or misleading info.
  • Defamation and Spam: Avoid defamatory content and unsolicited messages.
  • Impersonation: No pretending to be someone without authorization.
  • Misrepresentation: No false claims about HausaLlama outputs.
  • Fake Online Engagement: No promotion of false online interactions.

Bias, Risks, and Limitations

HausaLlama is a cutting-edge technology brimming with possibilities, yet it is not without inherent risks. The extensive testing conducted thus far has been predominantly in Hausa and English, leaving an expansive terrain of uncharted scenarios. Consequently, like its LLM counterparts, HausaLlama's outcomes remain hard to predict, and it may occasionally generate responses that are inaccurate, biased, or otherwise objectionable when prompted by users. With this in mind, before deploying HausaLlama in any application, developers must carry out diligent safety testing and meticulous fine-tuning customized to the unique demands of their specific use cases.

Contact-Us

For any questions, feedback, or commercial inquiries, please reach out at ai@jacarandahealth.org
