
Small dummy Llama 2-type model usable for unit and integration tests. It is suitable for CPU-only machines; see H2O LLM Studio for an example integration test.
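As a rough illustration of the kind of CPU-only check this model is meant for, the sketch below builds a model with the same dimensions as the creation script on this card and runs one forward pass. It is not the test used in H2O LLM Studio: the helper names `build_tiny_llama` and `forward_logits` are hypothetical, the vocabulary size of 32000 is assumed from the Llama 2 tokenizer, and the weights are randomly initialized locally so that no download is needed.

```python
import torch
from transformers import LlamaConfig, LlamaForCausalLM

VOCAB_SIZE = 32000  # assumption: Llama 2 tokenizer vocabulary size


def build_tiny_llama() -> LlamaForCausalLM:
    """Build a randomly initialized Llama model with the tiny dimensions from this card."""
    config = LlamaConfig(
        vocab_size=VOCAB_SIZE,
        hidden_size=12,
        intermediate_size=24,
        num_hidden_layers=2,
        num_attention_heads=2,
        num_key_value_heads=2,
        max_position_embeddings=1024,
    )
    return LlamaForCausalLM(config)


def forward_logits(model: LlamaForCausalLM, seq_len: int = 8) -> torch.Tensor:
    """Run one forward pass on random token ids and return the logits."""
    model.eval()
    input_ids = torch.randint(0, VOCAB_SIZE, (1, seq_len))
    with torch.no_grad():
        return model(input_ids=input_ids).logits


if __name__ == "__main__":
    logits = forward_logits(build_tiny_llama())
    # One logit vector per input position, each of vocabulary size.
    assert tuple(logits.shape) == (1, 8, VOCAB_SIZE)
```

To exercise the published checkpoint instead, `AutoModelForCausalLM.from_pretrained("MaxJeblick/llama2-0b-unit-test")` could replace `build_tiny_llama()`, at the cost of needing network access on the first run.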

The model was created as follows:

from transformers import AutoConfig, AutoTokenizer, AutoModelForCausalLM

repo_name = "MaxJeblick/llama2-0b-unit-test"
model_name = "h2oai/h2ogpt-4096-llama2-7b-chat"

# Start from the 7B chat config and shrink the size-related fields.
config = AutoConfig.from_pretrained(model_name)
config.hidden_size = 12
config.max_position_embeddings = 1024
config.intermediate_size = 24
config.num_attention_heads = 2
config.num_hidden_layers = 2
config.num_key_value_heads = 2

# Reuse the original tokenizer unchanged.
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build the model from the config alone, so the weights are randomly initialized.
model = AutoModelForCausalLM.from_config(config)
print(model.num_parameters())  # 770_940

# Publish the model, tokenizer, and config to the Hub.
model.push_to_hub(repo_name, private=False)
tokenizer.push_to_hub(repo_name, private=False)
config.push_to_hub(repo_name, private=False)
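The reported parameter count can be reproduced by hand, which also shows that almost all of the parameters sit in the embedding and output layers. The following is a minimal sketch under stated assumptions: a vocabulary of 32000 tokens (the Llama 2 tokenizer size, not stated on this card), untied input and output embeddings, no bias terms, and RMSNorm layers with a single weight vector each. The function name `llama_param_count` is hypothetical.

```python
def llama_param_count(
    vocab_size: int,
    hidden_size: int,
    intermediate_size: int,
    num_layers: int,
    num_heads: int,
    num_kv_heads: int,
) -> int:
    """Count trainable parameters of a Llama-style decoder (no biases, untied embeddings)."""
    head_dim = hidden_size // num_heads
    kv_dim = num_kv_heads * head_dim

    embed = vocab_size * hidden_size      # input token embeddings
    lm_head = vocab_size * hidden_size    # output projection (assumed untied)

    attn = (
        hidden_size * hidden_size         # q_proj
        + 2 * hidden_size * kv_dim        # k_proj and v_proj
        + hidden_size * hidden_size       # o_proj
    )
    mlp = 3 * hidden_size * intermediate_size  # gate_proj, up_proj, down_proj
    norms = 2 * hidden_size               # two RMSNorm weight vectors per layer

    final_norm = hidden_size
    return embed + lm_head + num_layers * (attn + mlp + norms) + final_norm


total = llama_param_count(
    vocab_size=32000,  # assumption: Llama 2 tokenizer vocabulary size
    hidden_size=12,
    intermediate_size=24,
    num_layers=2,
    num_heads=2,
    num_kv_heads=2,
)
print(total)  # 770940
```

Under these assumptions the two vocabulary-sized matrices account for 768,000 of the parameters, and the two transformer layers plus the final norm contribute only 2,940.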

To run a complete experiment in about 5 seconds in H2O LLM Studio, use the following configuration with the default dataset and leave all other settings at their defaults:

Validation Size: 0.1
Data Sample: 0.1
Max Length Prompt: 32
Max Length Answer: 32
Max Length: 64
Backbone Dtype: float16
Gradient Checkpointing: False
Batch Size: 8
Max Length Inference: 16