---
license: apache-2.0
language: en
tags:
- text-generation
- causal-lm
- fine-tuning
- unsupervised
---

# Model Name: olabs-ai/reflection_model

## Model Description

`olabs-ai/reflection_model` is a language model fine-tuned from [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) using LoRA (Low-Rank Adaptation). It is intended for text generation and can be used for applications such as conversational agents and content creation.

## Model Details

- **Base Model**: Meta-Llama-3.1-8B-Instruct
- **Fine-Tuning Method**: LoRA (Low-Rank Adaptation)
- **Architecture**: LlamaForCausalLM
- **Number of Parameters**: 8 billion (base model)
- **Training Data**: [Details about the training data used for fine-tuning, if available]

## Usage

To use this model, install the `transformers` and `unsloth` libraries. You can then load the model and tokenizer and run generation as follows:

```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the fine-tuned model and its tokenizer; unsloth resolves the base
# weights and the LoRA adapter automatically when the repo contains one.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="olabs-ai/reflection_model",
    max_seq_length=2048,   # adjust to your context-length needs
    load_in_4bit=True,     # 4-bit quantization to reduce memory use
)

# Switch to unsloth's optimized inference mode
FastLanguageModel.for_inference(model)

# Prepare inputs
custom_prompt = "What is a famous tall tower in Paris?"
inputs = tokenizer([custom_prompt], return_tensors="pt").to("cuda")

# Stream tokens to stdout as they are generated
text_streamer = TextStreamer(tokenizer)
outputs = model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)
```
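Because the base model is an Instruct variant, prompts formatted with the tokenizer's chat template generally work better than raw strings. A minimal sketch, reusing the `model` and `tokenizer` loaded above (the message content is only an example):

```python
# Format the prompt with the Llama-3.1 Instruct chat template
messages = [
    {"role": "user", "content": "What is a famous tall tower in Paris?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so the model replies
    return_tensors="pt",
).to("cuda")

outputs = model.generate(input_ids=input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```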
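If you prefer not to depend on `unsloth`, the adapter can also be loaded with plain `transformers` plus `peft`. This is a minimal sketch, assuming the repository hosts a LoRA adapter on top of `meta-llama/Meta-Llama-3.1-8B-Instruct`; the precision and device settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumed names: the base model and the adapter repo for this card
base_model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter_name = "olabs-ai/reflection_model"

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_name)

inputs = tokenizer("What is a famous tall tower in Paris?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

To produce a standalone checkpoint, `model.merge_and_unload()` folds the adapter weights into the base model, after which `save_pretrained` writes an ordinary `LlamaForCausalLM`.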