	---
	base_model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
	language:
	- en
	license: apache-2.0
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- llama
	- gguf
	---

	# Llama-3.1 8B Chat Nuclear Model

	- Developed by: inetnuc
	- License: apache-2.0
	- Finetuned from model: unsloth/Meta-Llama-3.1-8B-bnb-4bit

	This Llama-3.1 model was finetuned to improve text generation on nuclear-related topics. Training was accelerated with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library, which made it roughly 2x faster.

	## Finetuning Process
	The model was finetuned with the Unsloth library. The process consisted of the following steps:

	1. Data Preparation: Loaded and preprocessed nuclear-related data.
	2. Model Loading: Loaded `unsloth/Meta-Llama-3.1-8B-bnb-4bit` as the base model.
	3. LoRA Patching: Applied LoRA (Low-Rank Adaptation) for efficient training.
	4. Training: Finetuned the model using Hugging Face's TRL library with optimized hyperparameters.
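
To make step 3 more concrete, here is a minimal sketch of why LoRA is cheap to train: instead of updating a full `d_out x d_in` weight matrix, it learns two low-rank factors with `r * (d_in + d_out)` parameters in total. The rank `r = 16` and the 4096 x 4096 projection shape are illustrative assumptions; this README does not state the LoRA configuration actually used for the model.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA freezes the original weight W and learns the update B @ A,
    where A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * d_in + d_out * r


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters updated when the whole matrix is finetuned."""
    return d_in * d_out


# Assumed values for illustration only: a square 4096 x 4096 projection, rank 16.
d_in, d_out, rank = 4096, 4096, 16
lora = lora_param_count(d_in, d_out, rank)
full = full_param_count(d_in, d_out)
print(f"LoRA: {lora:,} trainable vs. full: {full:,} ({lora / full:.2%})")
```

Only the small factors receive gradients, which is what makes finetuning on top of a frozen 4-bit base model practical.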

	## Model Details

	- Base Model: `unsloth/Meta-Llama-3.1-8B-bnb-4bit`
	- Language: English (`en`)
	- License: Apache-2.0
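
As a rough illustration of what a 4-bit base model saves, the sketch below estimates weight storage at two precisions. It assumes a round figure of 8 billion parameters and ignores quantization metadata, activations, and the KV cache, so real memory use will be higher.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


n_params = 8e9  # assumed round figure for an 8B-parameter model
for label, bits in [("4-bit", 4), ("16-bit", 16)]:
    print(f"{label}: ~{weight_memory_gib(n_params, bits):.1f} GiB")
```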

	## Author

	MUSTAFA UMUT OZBEK

	- https://www.linkedin.com/in/mustafaumutozbek/
	- https://x.com/m_umut_ozbek


	## Usage

	### Loading the Model

	You can load the model and tokenizer using the following code snippet:

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "inetnuc/Llama-3.1-8B-bnb-4bit-chat-nuclear-lora-f16"

	# Load the tokenizer and model
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(model_id)

	# Tokenize a prompt and move the tensors to the same device as the model
	prompt = "What is the IAEA approach for cyber security?"
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	# Generate up to 128 new tokens and decode the result
	outputs = model.generate(**inputs, max_new_tokens=128)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))