---
base_model: meta-llama/Meta-Llama-3.1-8B
language:
- ko
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
---

# How to use

The snippet below loads the model with 🤗 Transformers and generates a streamed response to a Korean prompt using an Alpaca-style template.

```
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

base_model = 'bigdefence/Llama-3.1-8B-Ko-bigdefence'
device = 'cuda' if torch.cuda.is_available() else 'cpu'

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.eval()

def generate_response(prompt, model, tokenizer, text_streamer, max_new_tokens=256):
    # Tokenize the prompt and move it to the same device as the model.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=True)
    inputs = inputs.to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            streamer=text_streamer,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Decode the full sequence, then strip the prompt so only the completion remains.
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.replace(prompt, '').strip()

key = "안녕?"
prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{key}

### Response:
"""

text_streamer = TextStreamer(tokenizer)
response = generate_response(prompt, model, tokenizer, text_streamer)
print(response)
```

# Uploaded model

- **Developed by:** Bigdefence
- **License:** apache-2.0
- **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B
- **Dataset:** MarkrAI/KoCommercial-Dataset

# Thanks

We extend our sincere thanks to Beomi, maywell, and MarkrAI for their many contributions to the Korean open LLM ecosystem.

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
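Since the model was trained with Unsloth, it can also be loaded through Unsloth's inference path, which supports 4-bit quantized loading when GPU memory is tight. This is a minimal sketch assuming the `unsloth` package is installed; the `max_seq_length` and `load_in_4bit` values are illustrative assumptions, not settings published with this model.

```
from unsloth import FastLanguageModel

# Assumed settings: max_seq_length and 4-bit loading are illustrative choices,
# not values published with this model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="bigdefence/Llama-3.1-8B-Ko-bigdefence",
    max_seq_length=2048,
    dtype=None,          # auto-detect: float16 on older GPUs, bfloat16 on newer
    load_in_4bit=True,   # quantized loading to reduce VRAM usage
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode
```

After loading this way, the same `generate_response` helper from the snippet above can be reused unchanged.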