Tags: PEFT, PyTorch, Hindi, mixtral


Model Description

rohansolo/BB-Mixtral-HindiHinglish-8x7B-v0.1 is a Mixtral model fine-tuned for Hindi and Hinglish, part of ongoing experiments by bb deep learning systems.

Model Sources

  • Paper: [More Information Coming Soon]

Training Details

Training Data

A mix of [Ultrachat200k] and [rohansolo/BB_HindiHinglishV2] was used, for a total of 573,014,566 tokens in Hindi, Romanised Hindi, and English.

Training Procedure

Final training loss: 0.8978

The model was trained with the following hyperparameters:

  • warmup_steps: 100
  • weight_decay: 0.05
  • num_epochs: 1
  • optimizer: paged_adamw_8bit
  • lr_scheduler: cosine
  • learning_rate: 0.0002
  • lora_r: 32
  • lora_alpha: 16
  • lora_dropout: 0.05
  • lora_target_modules: q_proj, k_proj, v_proj, o_proj, w1, w2, w3
  • lora_target_linear:
  • lora_fan_in_fan_out:
  • lora_modules_to_save: embed_tokens, lm_head

The following bitsandbytes quantization config was used during training:

  • quant_method: bitsandbytes
  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: True
  • bnb_4bit_compute_dtype: bfloat16

Environmental Impact

Experiments were conducted on private infrastructure with a carbon efficiency of 0.432 kgCO₂eq/kWh. A cumulative 94 hours of computation was performed on A100 SXM4 80 GB hardware (TDP of 400 W).

Total emissions are estimated at 16.24 kgCO₂eq, of which 0% was directly offset.
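The reported figure follows directly from the numbers above, treating the 94 hours as cumulative GPU-hours at the 400 W per-GPU TDP:

```python
# Back-of-the-envelope check of the reported emissions:
# cumulative GPU-hours * per-GPU TDP (kW) * grid carbon efficiency (kgCO2eq/kWh)
gpu_hours = 94          # cumulative A100 hours reported above
tdp_kw = 400 / 1000     # 400 W TDP per A100 SXM4 80 GB
carbon_eff = 0.432      # kgCO2eq per kWh

emissions = gpu_hours * tdp_kw * carbon_eff
print(f"{emissions:.2f} kgCO2eq")  # → 16.24 kgCO2eq
```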

  • Hardware Type: 8 x A100 SXM4 80 GB
