vishanoberoi committed on
Commit be39dc4
1 Parent(s): a31cd5c

Create README.md

Files changed (1): README.md +53 -0
# Model Card for vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF

This model is a fine-tuned version of Llama-2-7b-chat on company-specific question-answer data. It is designed for efficient inference while maintaining high-quality output, making it suitable for conversational AI applications.

## Model Details

It was fine-tuned using QLoRA and PEFT. After fine-tuning, the adapters were merged with the base model, and the merged model was quantized to GGUF (a sketch of this flow follows the list below).

- **Developed by:** Vishan Oberoi and Dev Chandan
- **Model type:** Transformer-based large language model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)

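For illustration, a minimal QLoRA fine-tune-and-merge flow with `transformers` and `peft` might look like the sketch below. The hyperparameters, target modules, and adapter path are placeholders, not the actual training configuration used for this model.

```python
# Illustrative sketch only: hyperparameters, paths, and target modules are
# assumptions, not the actual configuration used to train this model.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, PeftModel, get_peft_model

base_id = "meta-llama/Llama-2-7b-chat-hf"

# Load the base model in 4-bit NF4 (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb_config)

# Attach low-rank adapters (PEFT) to the attention projections.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
# ... train `model` on the question-answer dataset ...

# After training, merge the adapters back into a full-precision base model.
full = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(full, "path/to/adapter")  # placeholder path
merged = merged.merge_and_unload()
merged.save_pretrained("merged-model")
```

The merged checkpoint can then be converted to GGUF and quantized with llama.cpp's conversion and quantization tools (the exact script names vary between llama.cpp versions).
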
### Model Sources

- **Repository:** [vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF](https://huggingface.co/vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF)
- **Links:**
  - LLaMA: [LLaMA Paper](https://arxiv.org/abs/2302.13971)
  - QLoRA: [QLoRA Paper](https://arxiv.org/abs/2305.14314)
  - GGUF: [GGUF Specification](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md)
  - llama.cpp: [llama.cpp Repository](https://github.com/ggerganov/llama.cpp)

## Uses

This model is optimized for direct use in conversational AI, particularly for generating responses grounded in company-specific data. It can be used effectively in customer-service bots, FAQ bots, and other applications where accurate, contextually relevant answers are required.

## Usage notebook

[Colab notebook](https://colab.research.google.com/drive/1885wYoXeRjVjJzHqL9YXJr5ZjUQOSI-w?authuser=4#scrollTo=TZIoajzYYkrg)
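
Regardless of the runtime used below, multi-turn conversations should follow the standard Llama-2 chat template. The helper below is a minimal sketch of that template; the function name and example history are illustrative, not part of this repository.

```python
# Minimal sketch of the standard Llama-2 chat template for multi-turn use.
# The helper name and example history are illustrative, not from this repo.
from typing import List, Optional, Tuple

def build_llama2_prompt(system: str, turns: List[Tuple[str, Optional[str]]]) -> str:
    """turns is a list of (user, assistant) pairs; the assistant reply is
    None for the final turn that the model should answer next."""
    prompt = ""
    for i, (user, assistant) in enumerate(turns):
        if i == 0:
            # The system message is wrapped in <<SYS>> inside the first [INST].
            user = f"<<SYS>>\n{system}\n<</SYS>>\n\n{user}"
        prompt += f"<s>[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant} </s>"
    return prompt

history = [
    ("Tell me about your company", "We build conversational AI tools."),
    ("What services do you offer?", None),  # awaiting the model's reply
]
print(build_llama2_prompt("You are a useful bot", history))
```

Depending on the runtime, the leading `<s>` may be added automatically by the tokenizer.
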

#### Example with `ctransformers`:

```python
from ctransformers import AutoModelForCausalLM

# Load the quantized GGUF model and offload 50 layers to the GPU.
llm = AutoModelForCausalLM.from_pretrained(
    "vishanoberoi/Llama-2-7b-chat-hf-finedtuned-to-GGUF",
    model_file="finetuned.gguf",
    model_type="llama",
    gpu_layers=50,
    max_new_tokens=2000,
    temperature=0.2,
    top_k=40,
    top_p=0.6,
    context_length=6000,
)

system_prompt = "You are a useful bot"
user_prompt = "Tell me about your company"

# Llama-2 chat format: the <<SYS>> block sits inside the [INST] tags.
full_prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_prompt} [/INST]"

# Generate and print the response.
response = llm(full_prompt)
print(response)
```
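
Since the weights ship as a GGUF file, they can also be loaded with other llama.cpp-based runtimes. Below is a minimal sketch using `llama-cpp-python`; this library is not covered by the original card, and the local file path is a placeholder for the downloaded GGUF.

```python
# Sketch with llama-cpp-python (an alternative runtime; not covered above).
# Assumes finetuned.gguf has already been downloaded locally.
from llama_cpp import Llama

llm = Llama(
    model_path="finetuned.gguf",  # placeholder: local path to the GGUF file
    n_ctx=6000,                   # match the context length used above
    n_gpu_layers=50,              # offload layers to the GPU if available
)

prompt = "[INST] <<SYS>>\nYou are a useful bot\n<</SYS>>\n\nTell me about your company [/INST]"
out = llm(prompt, max_tokens=2000, temperature=0.2, top_k=40, top_p=0.6)
print(out["choices"][0]["text"])
```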