Initial Upload

Files changed (10)

.gitignore +1 -0
LICENSE.md +12 -0
README.md +78 -3
config.json +52 -0
model.safetensors +3 -0
quantize_config.json +20 -0
special_tokens_map.json +23 -0
tokenizer.json +0 -0
tokenizer.model +3 -0
tokenizer_config.json +43 -0

.gitignore ADDED

	@@ -0,0 +1 @@


1	+ .DS_Store

LICENSE.md ADDED

	@@ -0,0 +1,12 @@

+iGenius
+Copyright (c) 2024, iGenius S.p.A.
+MIT License
+Permission is hereby granted, free of charge, to any person obtaining a copy of Modello Italia and associated documentation files, to use Modello Italia without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of Modello Italia, and to permit persons to whom Modello Italia is furnished to do so, subject to the following conditions:
+This copyright notice and this permission notice shall be included in all copies or substantial portions of Modello Italia.
+THE MODEL IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

README.md CHANGED

@@ -1,3 +1,78 @@
----
-license: mit
----

+---
+license: mit
+language:
+- it
+---
+# Model Card for Modello Italia 9B INT4 group-size 128 GPU-optimized
+This is an UNOFFICIAL conversion/quantization of the OFFICIAL model checkpoint of *"Modello Italia 9B"*, a Large Language Model (LLM) developed by [iGenius](https://it.igenius.ai/) in collaboration with [CINECA](https://www.cineca.it/).
+* More information about Modello Italia: [click here](https://it.igenius.ai/language-models).
+This model has been quantized to INT4 with group size 128 and optimized for GPU inference.
+## 🚨 Disclaimers
+* This is an UNOFFICIAL quantization of the OFFICIAL model checkpoint released by iGenius.
+* This model is also based on the conversion made for HF Transformers by [Sapienza NLP, Sapienza University of Rome](https://huggingface.co/sapienzanlp).
+* The original model was developed using LitGPT, so the weights need to be converted before they can be used with Hugging Face transformers.
+## 🚨 Terms and Conditions
+* **Note:** By using this model, you accept iGenius' [**terms and conditions**](https://secure.igenius.ai/legal/italia_terms_and_conditions.pdf).
+## 🚨 Reproducibility
+This model has been quantized using Intel [auto-round](https://github.com/intel/auto-round), which is based on the [SignRound technique](https://arxiv.org/pdf/2309.05516v4).
+```
+git clone https://github.com/fbaldassarri/model-conversion.git
+cd model-conversion
+mkdir models
+cd models
+huggingface-cli download --resume-download --local-dir sapienzanlp_modello-italia-9b --local-dir-use-symlinks False  sapienzanlp/modello-italia-9b
+```
+Then, from the directory that contains both `examples/` and `models/`:
+```
+python3 ./examples/language-modeling/main.py \
+  --model_name  ./models/sapienzanlp_modello-italia-9b \
+  --device 0 \
+  --group_size 128 \
+  --bits 4 \
+  --iters 1000 \
+  --deployment_device 'gpu' \
+  --output_dir "./models/sapienzanlp_modello-italia-9b-int4" \
+  --train_bs 2 \
+  --gradient_accumulate_steps 8
+```
+## 🚨 Biases and Risks
+From iGenius' terms and conditions for Modello Italia:
+> Modello Italia is designed to be used by everyone and to adapt to a wide range of use
+  cases. It was designed with the goal of being accessible to people from different
+  backgrounds, experiences, and perspectives. Modello Italia addresses users and their
+  needs without inserting superfluous judgments or norms, while recognizing that even
+  content that is potentially problematic in certain contexts can serve valid purposes in others.
+  Respect for the dignity and autonomy of all users, especially in terms of freedom of
+  thought and expression, is a fundamental pillar of its design. However, being a new
+  technology, Modello Italia carries risks associated with its use. The tests conducted so far were
+  carried out in Italian and could not cover all possible situations. Therefore, as with
+  all LLMs, it is not possible to predict Modello Italia's outputs in advance, and the model
+  may in some cases generate inaccurate, biased, or otherwise objectionable responses. Before
+  using Modello Italia in any context, developers are strongly encouraged to
+  perform safety testing and tuning specific to their applications.
+We are aware of the biases and potentially problematic/toxic content that current pretrained large language models exhibit: more specifically, as probabilistic models of (Italian and English) languages, they reflect and amplify the biases of their training data.
+For more information about this issue, please refer to our survey paper:
+* [Biases in Large Language Models: Origins, Inventory, and Discussion](https://dl.acm.org/doi/full/10.1145/3597307)
+## Model architecture
+* The model architecture is **based on GPT-NeoX**.
+## Results
+**Modello Italia 9B INT4 group-size 128 GPU-optimized** has not been evaluated on standard benchmarks yet.
+If you would like to contribute an evaluation, please feel free to submit a pull request.

config.json ADDED

	@@ -0,0 +1,52 @@

+{
+  "_name_or_path": "./models/sapienzanlp_modello-italia-9b",
+  "architectures": [
+    "GPTNeoXForCausalLM"
+  ],
+  "attention_bias": true,
+  "attention_dropout": 0.0,
+  "attention_probs_dropout_prob": 0,
+  "bos_token_id": 0,
+  "classifier_dropout": 0.1,
+  "eos_token_id": 0,
+  "hidden_act": "gelu_fast",
+  "hidden_dropout": 0.0,
+  "hidden_dropout_prob": 0,
+  "hidden_size": 5120,
+  "initializer_range": 0.02,
+  "intermediate_size": 12800,
+  "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 4096,
+  "model_type": "gpt_neox",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 34,
+  "quantization_config": {
+    "autoround_version": "0.2",
+    "bits": 4,
+    "damp_percent": 0.01,
+    "desc_act": false,
+    "enable_minmax_tuning": true,
+    "enable_quanted_input": true,
+    "group_size": 128,
+    "is_marlin_format": false,
+    "iters": 1000,
+    "lr": 0.001,
+    "minmax_lr": 0.001,
+    "model_file_base_name": "model",
+    "model_name_or_path": null,
+    "quant_method": "gptq",
+    "scale_dtype": "float16",
+    "static_groups": false,
+    "sym": false,
+    "true_sequential": false
+  },
+  "rope_scaling": null,
+  "rotary_emb_base": 10000,
+  "rotary_pct": 0.4,
+  "tie_word_embeddings": false,
+  "torch_dtype": "float32",
+  "transformers_version": "4.41.2",
+  "use_cache": true,
+  "use_parallel_residual": true,
+  "vocab_size": 50176
+}
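
A few derived quantities follow directly from the values in this config.json. The sketch below computes the per-head dimension, the number of rotary dimensions per head, and a rough weight count. The parameter formula is an assumption: it takes the usual GPT-NeoX layout (fused QKV projection, attention output projection, two MLP matrices, untied input and output embeddings) and ignores biases and layer norms.

```python
# Values copied from config.json in this commit.
hidden_size = 5120
intermediate_size = 12800
num_attention_heads = 32
num_hidden_layers = 34
vocab_size = 50176
rotary_pct = 0.4

# Per-head width and the slice of each head that receives rotary embeddings.
head_dim = hidden_size // num_attention_heads
rotary_ndims = int(head_dim * rotary_pct)

# Assumed layout: QKV (3*h*h) + attention output (h*h) + two MLP matrices.
attention_weights = 4 * hidden_size * hidden_size
mlp_weights = 2 * hidden_size * intermediate_size
linear_weights = num_hidden_layers * (attention_weights + mlp_weights)

# tie_word_embeddings is false, so input and output embeddings are counted separately.
embedding_weights = 2 * vocab_size * hidden_size

total_weights = linear_weights + embedding_weights

print(f"head_dim={head_dim}, rotary_ndims={rotary_ndims}")
print(f"approx. weights: {total_weights / 1e9:.2f}B")
```

Under these assumptions the count lands at roughly 8.5 billion weights, which is in line with the "9B" name once rounding and the ignored terms are taken into account.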

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ebddc75fec7dae157e6c7ec94d59f5765a2375e19dc738d176981d20fdc600d8
+size 5203031160
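
The reported size can be sanity-checked against the configuration in this commit. The sketch below is a back-of-the-envelope estimate under several assumptions that the commit does not confirm: only the linear layers inside the transformer blocks are packed to 4 bits, each group of 128 weights carries one float16 scale and one 4-bit zero point, the two embedding matrices are stored in float16, and biases, layer norms, and any index tensors are ignored.

```python
# Values copied from config.json and the model.safetensors pointer in this commit.
hidden_size = 5120
intermediate_size = 12800
num_hidden_layers = 34
vocab_size = 50176
bits = 4
group_size = 128
actual_size = 5203031160  # bytes, from the LFS pointer

# Assumed GPT-NeoX layout: QKV + attention output + two MLP matrices per layer.
linear_weights = num_hidden_layers * (
    4 * hidden_size * hidden_size + 2 * hidden_size * intermediate_size
)

# 4-bit packed weights.
packed_bytes = linear_weights * bits // 8

# One float16 scale (2 bytes) and one 4-bit zero point per group (assumption).
num_groups = linear_weights // group_size
scale_bytes = num_groups * 2
zero_bytes = num_groups * bits // 8

# Input and output embeddings kept in float16 (assumption).
embedding_bytes = 2 * vocab_size * hidden_size * 2

estimated_size = packed_bytes + scale_bytes + zero_bytes + embedding_bytes
relative_gap = abs(actual_size - estimated_size) / actual_size

print(f"estimated: {estimated_size} bytes")
print(f"actual:    {actual_size} bytes")
print(f"relative gap: {relative_gap:.2%}")
```

The estimate comes out within 1% of the pointer's size. That is consistent with the assumed storage layout but does not prove it; inspecting the tensor names and dtypes in the safetensors header would be the reliable check.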

quantize_config.json ADDED

	@@ -0,0 +1,20 @@

+{
+  "bits": 4,
+  "group_size": 128,
+  "damp_percent": 0.01,
+  "desc_act": false,
+  "static_groups": false,
+  "sym": false,
+  "true_sequential": false,
+  "model_name_or_path": null,
+  "model_file_base_name": "model",
+  "is_marlin_format": false,
+  "quant_method": "intel/auto-round",
+  "autoround_version": "0.2",
+  "iters": 1000,
+  "lr": 0.001,
+  "minmax_lr": 0.001,
+  "enable_minmax_tuning": true,
+  "enable_quanted_input": true,
+  "scale_dtype": "float16"
+}
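
To make `bits`, `group_size`, and `sym` concrete, here is a minimal sketch of group-wise asymmetric quantization in plain Python. It uses simple round-to-nearest with a min/max range per group; it is not the auto-round algorithm, which additionally tunes the rounding and the min/max range over the configured `iters`. The function names are hypothetical and exist only for this illustration.

```python
import random

def quantize_group(weights, bits=4):
    """Asymmetric (sym=false) round-to-nearest quantization of one group."""
    qmax = (1 << bits) - 1                      # 15 for 4 bits
    lo = min(min(weights), 0.0)                 # keep 0.0 inside the range
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / qmax or 1.0             # fall back to 1.0 for an all-zero group
    zero = max(0, min(qmax, round(-lo / scale)))
    q = [max(0, min(qmax, round(w / scale) + zero)) for w in weights]
    return q, scale, zero

def dequantize_group(q, scale, zero):
    """Map integer codes back to approximate float weights."""
    return [(v - zero) * scale for v in q]

def quantize(weights, group_size=128, bits=4):
    """Split a flat weight list into groups and quantize each independently."""
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]

# Two groups of 128 synthetic weights, each with its own scale and zero point.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(256)]
groups = quantize(weights)

restored = []
for q, scale, zero in groups:
    restored.extend(dequantize_group(q, scale, zero))

max_error = max(abs(a - b) for a, b in zip(weights, restored))
max_scale = max(scale for _, scale, _ in groups)
print(f"groups={len(groups)}, max abs error={max_error:.6f}, largest step={max_scale:.6f}")
```

Smaller groups give each scale and zero point fewer weights to cover, which usually lowers the error at the cost of more metadata per weight.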

special_tokens_map.json ADDED

	@@ -0,0 +1,23 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
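
The special-token files in this commit can be cross-checked against config.json. The sketch below hard-codes the values shown in this diff (`bos_token_id` and `eos_token_id` from config.json, the `added_tokens_decoder` ids from tokenizer_config.json) and reports where the ids disagree. Whether a mismatch affects generation depends on how the model is loaded and which ids are passed at generation time, so treat the result as something to verify rather than a confirmed defect.

```python
# From config.json in this commit.
config_ids = {"bos_token_id": 0, "eos_token_id": 0}

# From tokenizer_config.json ("added_tokens_decoder") in this commit.
added_tokens = {0: "<unk>", 1: "<s>", 2: "</s>"}

# From special_tokens_map.json in this commit.
special_tokens = {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>"}

token_to_id = {token: idx for idx, token in added_tokens.items()}

# Collect (config id, tokenizer id) pairs that do not agree.
mismatches = {}
for name in ("bos_token", "eos_token"):
    tokenizer_id = token_to_id[special_tokens[name]]
    config_id = config_ids[name + "_id"]
    if tokenizer_id != config_id:
        mismatches[name] = (config_id, tokenizer_id)

for name, (config_id, tokenizer_id) in mismatches.items():
    print(f"{name}: config.json says {config_id}, tokenizer says {tokenizer_id}")
```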

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bd74bea2ba620d87e0a2127d9a21196b862a5cc7942ba4638eb2159bbab3340c
+size 1090536
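
The three lines above are a Git LFS pointer rather than the file itself: they record the spec version, a SHA-256 digest, and the size in bytes. A downloaded file can be checked against such a pointer with the standard library alone. The helper names below are hypothetical, and the demo verifies a small temporary file instead of the real tokenizer.model.

```python
import hashlib
import os
import tempfile

def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and digest match the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]

# Pointer text as it appears in this commit (without the diff's leading '+').
tokenizer_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:bd74bea2ba620d87e0a2127d9a21196b862a5cc7942ba4638eb2159bbab3340c\n"
    "size 1090536\n"
)

# Demo on a temporary file with a pointer built for its own content.
payload = b"example payload"
demo_pointer = {
    "version": "https://git-lfs.github.com/spec/v1",
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name
demo_ok = verify_file(demo_path, demo_pointer)
os.remove(demo_path)
print("demo file matches pointer:", demo_ok)
```

For the real file, calling `verify_file` with the local path to tokenizer.model and `tokenizer_pointer` would perform the same check.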

tokenizer_config.json ADDED

	@@ -0,0 +1,43 @@

+{
+  "add_bos_token": true,
+  "add_eos_token": false,
+  "add_prefix_space": true,
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<s>",
+  "chat_template": "{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n'  + message['content'] + eos_token }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "</s>",
+  "legacy": true,
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": null,
+  "sp_model_kwargs": {},
+  "spaces_between_special_tokens": false,
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": "<unk>",
+  "use_default_system_prompt": false
+}
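
The `chat_template` above is easier to read once rendered. The sketch below re-implements it in plain Python to show the prompt text it is expected to produce. It assumes the Jinja whitespace settings that Hugging Face transformers applies to chat templates (newlines after block tags are trimmed), and `render_chat` is a hypothetical helper written for this illustration. For real use, `tokenizer.apply_chat_template` is the reliable path.

```python
def render_chat(messages, eos_token="</s>", add_generation_prompt=False):
    """Approximate the chat_template from tokenizer_config.json in plain Python."""
    parts = []
    last_index = len(messages) - 1
    for index, message in enumerate(messages):
        role = message["role"]
        # The template handles exactly these three roles; others emit nothing.
        if role in ("user", "system", "assistant"):
            parts.append(f"<|{role}|>\n{message['content']}{eos_token}\n")
        # After the last message, optionally open the assistant turn.
        if index == last_index and add_generation_prompt:
            parts.append("<|assistant|>\n")
    return "".join(parts)

conversation = [
    {"role": "system", "content": "Sei un assistente utile."},
    {"role": "user", "content": "Ciao"},
]
prompt = render_chat(conversation, add_generation_prompt=True)
print(prompt)
```

Each turn is a role tag on its own line, the content, and the end-of-sequence token, so the model sees a clear boundary between turns.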