AdrianBZG committed on
Commit e90b537
1 Parent(s): 95837d3

Upload model

Files changed (2)
  1. README.md +15 -22
  2. adapter_config.json +8 -2
README.md CHANGED
@@ -1,28 +1,21 @@
  ---
- license: apache-2.0
- language:
- - es
- library_name: transformers
- tags:
- - falcon
- - alpaca
- - Transformers
- - gpt
- - PyTorch
- - llm
- - llm spanish
- pipeline_tag: text-generation
- datasets:
- - bertin-project/alpaca-spanish
  ---

- <strong><span style="font-size: larger;">FALCON 7B Spanish Fine-tuned 8bit 🤗</span></strong>

- **Dataset**

- The dataset is a translation to Spanish of alpaca_data_cleaned.json (a clean version of the Alpaca dataset made at Stanford) using OpenAI's gpt-3.5-turbo model. This translation was made by bertin-project. It was translated using a full-sample prompt instead of per strings, which resulted in more coherent tuples of (instruction, input, output).
- Dataset link: [here](https://huggingface.co/datasets/bertin-project/alpaca-spanish)

- **Finetuning details**
-
- To fine-tune the FALCON-7B model we used the [following code](https://github.com/AdrianBZG/LLM-distributed-finetune) to run it on a distributed cluster on AWS. You are free to use such code as a fingerprint to finetune any model as you please, as it is easily customizable.
 
  ---
+ library_name: peft
  ---
+ ## Training procedure

+ The following `bitsandbytes` quantization config was used during training:
+ - quant_method: bitsandbytes
+ - load_in_8bit: True
+ - load_in_4bit: False
+ - llm_int8_threshold: 6.0
+ - llm_int8_skip_modules: None
+ - llm_int8_enable_fp32_cpu_offload: False
+ - llm_int8_has_fp16_weight: False
+ - bnb_4bit_quant_type: fp4
+ - bnb_4bit_use_double_quant: False
+ - bnb_4bit_compute_dtype: float32
+ ### Framework versions

+ - PEFT 0.6.0.dev0
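
As a side note (not part of this commit), the quantization settings listed in the new README could be reproduced at load time roughly as in the sketch below; the adapter repo id is a placeholder, not a name taken from this repository:

```python
# Minimal sketch, assuming the 8-bit bitsandbytes settings listed in the README above.
# "ADAPTER_REPO_OR_PATH" is a placeholder for this adapter's repo id or a local path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    load_in_4bit=False,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
)

# Load the quantized base model and attach the LoRA adapter for inference.
base = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
model = PeftModel.from_pretrained(base, "ADAPTER_REPO_OR_PATH")
```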
 
 
adapter_config.json CHANGED
@@ -1,14 +1,20 @@
  {
  "base_model_name_or_path": "tiiuae/falcon-7b",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
- "lora_alpha": 8,
  "lora_dropout": 0.05,
  "modules_to_save": null,
  "peft_type": "LORA",
- "r": 4,
  "target_modules": [
  "query_key_value"
  ],
 
  {
+ "alpha_pattern": {},
+ "auto_mapping": null,
  "base_model_name_or_path": "tiiuae/falcon-7b",
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
+ "layers_pattern": null,
+ "layers_to_transform": null,
+ "lora_alpha": 32,
  "lora_dropout": 0.05,
  "modules_to_save": null,
  "peft_type": "LORA",
+ "r": 16,
+ "rank_pattern": {},
+ "revision": null,
  "target_modules": [
  "query_key_value"
  ],
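
For context (also not part of the commit), the updated adapter_config.json maps roughly onto a `peft` `LoraConfig` like the sketch below; `task_type` is an assumption, since it is not visible in this hunk:

```python
# Minimal sketch of a LoraConfig matching the updated adapter_config.json values.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                                # raised from 4 in this commit
    lora_alpha=32,                       # raised from 8 in this commit
    lora_dropout=0.05,
    bias="none",
    fan_in_fan_out=False,
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",               # assumption: not shown in the diff hunk
)

base = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b", trust_remote_code=True)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```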