Fine-tuned Llama 3 8B Model
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B, optimized for deployment on Hugging Face Inference Endpoints.
Model Details
- Base model: meta-llama/Meta-Llama-3-8B
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Training dataset format: instruction+input+output
- Number of examples: 100
Usage
Python Code
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("deployable_model")
tokenizer = AutoTokenizer.from_pretrained("deployable_model")
# Format your prompt correctly
prompt = "<|you|>\nWhat is Italy's capital and why is it historically important?\n<|my response|>\n"
# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,  # passes input_ids together with the attention mask
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
# Decode and process the response
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=False)
response = generated_text.split("<|my response|>")[1].split("<|you|>")[0].strip()
print(response)
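The one-line split above raises an IndexError if the generated text does not contain the <|my response|> tag. A more defensive variant might look like the sketch below. The helper name extract_response is hypothetical, and stripping <|end_of_text|> assumes the tokenizer emits the Llama 3 base end-of-text token when skip_special_tokens=False.

```python
def extract_response(
    generated_text: str,
    assistant_tag: str = "<|my response|>",
    user_tag: str = "<|you|>",
    eos_token: str = "<|end_of_text|>",  # assumption: Llama 3 base EOS token
) -> str:
    """Return the first model reply found in the decoded text."""
    _, found, tail = generated_text.partition(assistant_tag)
    if not found:
        # No assistant tag at all: fall back to the whole text.
        return generated_text.strip()
    # Stop at the next user turn, if the model kept generating one.
    tail = tail.split(user_tag, 1)[0]
    # Drop the end-of-text token and anything after it.
    tail = tail.split(eos_token, 1)[0]
    return tail.strip()
```

With this helper, the last two lines of the example could become `print(extract_response(generated_text))`.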
API Example (Text Generation Inference)
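When the model is deployed on Hugging Face Inference Endpoints, you can query it with a request like the following. Replace API_URL with the URL of your own endpoint; the URL shown here follows the serverless Inference API pattern.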
import requests
API_URL = "https://api-inference.huggingface.co/models/YOUR_USERNAME/deployable_model"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}
def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "<|you|>\nWhat is Italy's capital and why is it historically important?\n<|my response|>\n",
    "parameters": {"max_new_tokens": 200, "temperature": 0.7, "top_p": 0.9}
})
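The JSON returned by query still needs to be unpacked. The sketch below assumes the common text-generation response shape: a list containing one object with a "generated_text" field, or an object with an "error" field on failure. Check this against what your endpoint actually returns. The helper name parse_generation is hypothetical.

```python
from typing import Any


def parse_generation(output: Any, assistant_tag: str = "<|my response|>") -> str:
    """Pull the model reply out of a text-generation JSON response."""
    # Assumed error shape: {"error": "..."}
    if isinstance(output, dict) and "error" in output:
        raise RuntimeError(f"Endpoint returned an error: {output['error']}")
    # Assumed success shape: [{"generated_text": "..."}] or a single object.
    item = output[0] if isinstance(output, list) else output
    text = item["generated_text"]
    # Some deployments echo the prompt; keep only what follows the tag.
    _, found, tail = text.partition(assistant_tag)
    return (tail if found else text).strip()
```

For example, `print(parse_generation(output))` would print only the reply, whether or not the prompt is echoed back.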
Input Format
The model expects inputs in this format:
<|you|>
User message here
<|my response|>
For system prompts, use:
<|my identity|>
System prompt here
<|you|>
User message here
<|my response|>
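To avoid assembling these strings by hand, a small helper can build the prompt from its parts. This is a minimal sketch: the name build_prompt is hypothetical, and the trailing newline after <|my response|> is inferred from the prompt strings in the usage examples.

```python
from typing import Optional

SYSTEM_TAG = "<|my identity|>"
USER_TAG = "<|you|>"
ASSISTANT_TAG = "<|my response|>"


def build_prompt(user_message: str, system_prompt: Optional[str] = None) -> str:
    """Assemble a prompt in the tag format described above."""
    parts = []
    if system_prompt:
        # The system block is optional and always comes first.
        parts += [SYSTEM_TAG, system_prompt.strip()]
    parts += [USER_TAG, user_message.strip(), ASSISTANT_TAG]
    # End with a newline so the model starts its reply on a fresh line.
    return "\n".join(parts) + "\n"


prompt = build_prompt(
    "What is Italy's capital and why is it historically important?",
    system_prompt="You are a concise history tutor.",
)
```

The resulting prompt can be passed to the tokenizer or sent as the "inputs" field of an API request.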