Aria Code Light is based on Llama 2 Chat.

Training procedure

Aria Code Light is a fine-tune of Llama 2 Chat (HF format) on a Python dataset with over 18,000 tokens of coding prompts and answers. Our goal was to create a model that can run on a single GPU, with more language skills and better coding performance than the original Llama 2, especially in Python.

GPU used for training: NVIDIA A100.

Timing: less than 24 hours.

Method: LoRA + PEFT
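
To illustrate why LoRA keeps training within reach of a single GPU, here is a minimal arithmetic sketch. LoRA freezes a weight matrix W of shape (d_out, d_in) and trains only two low-rank factors, B (d_out x r) and A (r x d_in). The hidden size of 4096 (the Llama 2 7B value) and the rank of 8 are assumptions chosen for illustration; the rank actually used for Aria Code Light is not stated in this card.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Number of weights in a dense (d_out x d_in) projection matrix."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Number of weights in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Assumption: hidden size of the 7B Llama 2 variant (not stated in this card).
HIDDEN_SIZE = 4096
# Assumption: an illustrative LoRA rank, not the value used for this model.
RANK = 8

full = full_params(HIDDEN_SIZE, HIDDEN_SIZE)
lora = lora_trainable_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
ratio = lora / full

print(f"dense projection weights : {full:,}")
print(f"LoRA trainable weights   : {lora:,}")
print(f"trainable fraction       : {ratio:.4%}")
```

Under these assumptions, each adapted projection trains well under 1% of the weights it would need for full fine-tuning, which is what makes a run of less than a day on one A100 plausible.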

Update following the Code Llama 7B release.

Meta has since released Code Llama 7B, along with a Code Llama - Python variant trained on a larger Python dataset than Aria Code Light. We still believe Aria Code Light offers a more user-friendly approach, because it adds coding skills to a chat model. Many community users have noticed that models specialized for coding often lose "non-coding" and natural-language performance. That said, we encourage you to try both and use the model that fits your needs better; everything done for the open-source community is useful. Congratulations to the Meta team for reaching this new milestone in the "coding LLMs" area.

Contact

Support: contact@faradaylab.fr

The following bitsandbytes quantization config was used during training:

  • quant_method: bitsandbytes
  • load_in_8bit: True
  • load_in_4bit: False
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: fp4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float32
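
The settings above could be reproduced roughly as follows. This is a sketch, not the authors' training script: it assumes the `transformers` `BitsAndBytesConfig` class, whose constructor takes these field names. `quant_method` is left out because the config object sets it itself, and the helper names `training_quant_kwargs` and `build_quant_config` are hypothetical.

```python
from typing import Any, Dict


def training_quant_kwargs() -> Dict[str, Any]:
    """Return the quantization settings listed in this card as keyword arguments."""
    return {
        "load_in_8bit": True,    # 8-bit (LLM.int8) weights were used
        "load_in_4bit": False,   # so the bnb_4bit_* fields below are inactive
        "llm_int8_threshold": 6.0,
        "llm_int8_skip_modules": None,
        "llm_int8_enable_fp32_cpu_offload": False,
        "llm_int8_has_fp16_weight": False,
        "bnb_4bit_quant_type": "fp4",
        "bnb_4bit_use_double_quant": False,
        "bnb_4bit_compute_dtype": "float32",
    }


def build_quant_config():
    """Build a BitsAndBytesConfig from the settings above.

    Imported lazily so the kwargs can be inspected without transformers installed.
    Assumption: transformers and torch are available when this is called.
    """
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**training_quant_kwargs())


if __name__ == "__main__":
    for name, value in training_quant_kwargs().items():
        print(f"{name}: {value}")
```

The resulting object would typically be passed as `quantization_config` when loading the base model with `from_pretrained`. Since `load_in_4bit` is False, only the `llm_int8_*` options affect how the weights are loaded.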

Framework versions

  • PEFT 0.6.0.dev0