|
--- |
|
language: |
|
- it |
|
pipeline_tag: text-generation |
|
max_length: 100 |
|
widget: |
|
- text: Alessandro è un ragazzo che progetta Infissi |
|
- text: Melissa è una ragazza che adora |
|
tags: |
|
- italian |
|
- italiano |
|
- llama |
|
--- |
|
This model is trained from scratch (an empty model) exclusively on Italian-language datasets (currently RedPajama 2023-14 it).<br/>
|
<br/> |
|
Training is ongoing and will extend to new datasets.<br/>
|
<br/> |
|
More accurate versions will be published shortly.<br/>
|
<br/> |
|
It was trained on my own server; I studied and adapted the model starting from the repository https://github.com/karpathy/llama2.c.<br/>
|
<br/> |
|
- Llama model parameters (7B reference values in parentheses):
  - max_seq_len: (7B = 2048) the maximum sequence length for input data
  - dim: (7B = 4096) the dimensionality of the token embeddings and hidden states
  - n_layers: (7B = 32) the number of transformer layers
  - n_heads: (7B = 32) the number of attention heads
  - n_kv_heads: (7B = 32) the number of key and value heads
  - multiple_of: (7B = 256) rounds the SwiGLU hidden layer size up to a multiple of a large power of 2
|
<br/> |
|
- This model's parameters (see the configuration sketch below):
  - max_seq_len = 1024
  - dim = 768
  - n_layers = 32
  - n_heads = 32
  - n_kv_heads = 32
  - multiple_of = 32
|
<br/> |
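As a rough illustration, these hyperparameters can be collected in a config object in the style of llama2.c's `ModelArgs`. This is a minimal sketch, assuming the llama2.c default `vocab_size` of 32000 and its SwiGLU hidden-size rounding rule; neither value is stated above:

```py
from dataclasses import dataclass

@dataclass
class ModelArgs:
    # hyperparameters of this model (llama2.c-style config)
    dim: int = 768
    n_layers: int = 32
    n_heads: int = 32
    n_kv_heads: int = 32
    vocab_size: int = 32000   # assumption: llama2.c's default SentencePiece vocab
    multiple_of: int = 32
    max_seq_len: int = 1024

def swiglu_hidden_dim(dim: int, multiple_of: int) -> int:
    # llama2.c rule: start from 4*dim, scale by 2/3, round up to a multiple of multiple_of
    hidden = int(2 * (4 * dim) / 3)
    return multiple_of * ((hidden + multiple_of - 1) // multiple_of)

args = ModelArgs()
print(swiglu_hidden_dim(args.dim, args.multiple_of))  # 2048
```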
|
num decayed parameter tensors: 225, with 251,068,416 parameters<br/> |
|
num non-decayed parameter tensors: 65, with 49,920 parameters<br/> |
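These totals are consistent with the hyperparameters above, assuming a 32,000-token vocabulary and an output head tied to the input embedding (both llama2.c defaults). A quick back-of-the-envelope check:

```py
dim, n_layers, vocab_size, hidden = 768, 32, 32000, 2048

embed = vocab_size * dim                   # 24,576,000 (output head tied to embedding)
attn = 4 * dim * dim                       # wq, wk, wv, wo in each layer
ffn = 3 * dim * hidden                     # w1, w2, w3 in each layer
decayed = embed + n_layers * (attn + ffn)  # 1 + 32*7 = 225 weight-decayed tensors
norms = (2 * n_layers + 1) * dim           # 65 RMSNorm weight vectors, not decayed

print(decayed)  # 251068416
print(norms)    # 49920
```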
|
|
|
To use the model, you can run:
|
|
|
```py |
|
|
|
# Load model directly |
|
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
|
|
|
# Load the model and tokenizer |
|
tokenizer_model = AutoTokenizer.from_pretrained("peruginia/Llama-2-Small") |
|
model = AutoModelForCausalLM.from_pretrained("peruginia/Llama-2-Small") |
|
# Move the model to the GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
|
|
|
# Define the prompt |
|
prompt = "Alessandro è un ragazzo che progetta Infissi" |
|
|
|
# Tokenize the prompt |
|
inputs = tokenizer_model(prompt, return_tensors="pt").to(device)
|
|
|
# Generate text |
|
output = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=100,
    top_k=300,
    top_p=0.85,
    temperature=1.0,
    num_return_sequences=1,
)
|
|
|
# Decode and print the generated text |
|
generated_text = tokenizer_model.decode(output[0], skip_special_tokens=True) |
|
|
|
print(generated_text) |
|
``` |
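
Alternatively, the same generation can be run through the transformers `pipeline` API. This is a minimal sketch; the sampling parameters simply mirror the call above and are not prescribed by this model:

```py
from transformers import pipeline

generator = pipeline("text-generation", model="peruginia/Llama-2-Small")

result = generator(
    "Melissa è una ragazza che adora",
    do_sample=True,
    max_new_tokens=100,
    top_k=300,
    top_p=0.85,
    temperature=1.0,
)
print(result[0]["generated_text"])
```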
|
|