---
tags:
- autotrain
- text-generation
widget:
- text: 'I love AutoTrain because '
license: apache-2.0
datasets:
- manjunathshiva/autotrain-data-GRADE3B-7B-02
language:
- en
---

# Model Trained Using AutoTrain

This model was trained using AutoTrain. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

# Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

access_token = ""  # Hugging Face access token with access to the gated Llama 2 weights

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    token=access_token,
    trust_remote_code=True,
    # device_map="auto",  # Uncomment if you have enough GPU memory
    torch_dtype=torch.float16,
    offload_folder="offload/",
)

# Load the AutoTrain-produced PEFT adapter on top of the base model.
model = PeftModel.from_pretrained(
    base_model,
    "manjunathshiva/GRADE3B-7B-02-0",
    token=access_token,
    offload_folder="offload/",
).eval()

# Prompt content: "When is Maths Unit Test 2?"
messages = [
    {"role": "user", "content": "When is Maths Unit Test 2?"}
]
input_ids = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
)

# output_ids = model.generate(input_ids.to('cuda'))  # Uncomment if you have CUDA and comment out the line below
output_ids = model.generate(
    input_ids=input_ids,
    do_sample=True,     # Required for temperature to take effect
    temperature=0.01,   # Near-greedy decoding
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)

# Model response: ""
print(response)
```
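If you want to deploy without a `peft` dependency at inference time, you can fold the adapter weights into the base model with `merge_and_unload()`. A minimal sketch, assuming the same checkpoints as above; the output path `GRADE3B-7B-02-merged` is a hypothetical local directory chosen for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

access_token = ""  # Hugging Face access token with access to the gated Llama 2 weights

# Load the base model and the adapter exactly as in the usage example.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    token=access_token,
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(
    base_model,
    "manjunathshiva/GRADE3B-7B-02-0",
    token=access_token,
)

# Merge the adapter weights into the base weights; the result is a plain
# transformers model that can be loaded later with AutoModelForCausalLM
# alone, without peft.
merged = model.merge_and_unload()

# "GRADE3B-7B-02-merged" is a hypothetical output path for illustration.
merged.save_pretrained("GRADE3B-7B-02-merged")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
tokenizer.save_pretrained("GRADE3B-7B-02-merged")
```

Merging trades flexibility (the adapter can no longer be swapped out) for a simpler deployment artifact and slightly faster inference, since the LoRA matrices are no longer applied at runtime.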