**STEP 1:**
- Installs Unsloth, Xformers (Flash Attention) and all other packages according to your environment and GPU.
- To install Unsloth on your own computer, follow the installation instructions on our GitHub page: [LINK IS HERE](https://github.com/unslothai/unsloth#installation-instructions---conda)

**Now Follow the CODE**

```python
from unsloth import FastLanguageModel
from transformers import AutoTokenizer  # optional; from_pretrained below already returns the tokenizer
import torch

max_seq_length = 2048  # Choose any! We auto support RoPE Scaling internally!
dtype = None  # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True  # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "DipeshChaudhary/ShareGPTChatBot-Counselchat1",  # Your fine-tuned model
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
```
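
After loading, it can be useful to confirm which GPU you are on and how much memory the 4-bit model occupies. This is an optional sanity check using standard PyTorch CUDA utilities, not a required step; the numbers you see will depend on your hardware:

```python
# Optional sanity check: report the GPU and how much memory is reserved
# after loading the 4-bit model.
gpu_stats = torch.cuda.get_device_properties(0)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
reserved_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{reserved_memory} GB of memory reserved.")
```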

We now use the Llama-3 format for conversation-style finetunes, with Open Assistant conversations in ShareGPT style.

**We use the get_chat_template function to get the correct chat template. It supports zephyr, chatml, mistral, llama, alpaca, vicuna, vicuna_old and Unsloth's own optimized unsloth template.**

```python
from unsloth.chat_templates import get_chat_template

tokenizer = get_chat_template(
    tokenizer,
    chat_template = "llama-3",  # Supports zephyr, chatml, mistral, llama, alpaca, vicuna, vicuna_old, unsloth
    mapping = {"role" : "from", "content" : "value", "user" : "human", "assistant" : "gpt"},  # ShareGPT style
)
```
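
With the chat template mapped to ShareGPT-style keys, you can run a quick generation to check the model. A minimal sketch, assuming a CUDA GPU; the example question and `max_new_tokens` value are placeholders, and `FastLanguageModel.for_inference` enables Unsloth's faster inference path:

```python
FastLanguageModel.for_inference(model)  # Enable Unsloth's native faster inference

# ShareGPT-style message, matching the mapping passed to get_chat_template above.
messages = [
    {"from": "human", "value": "I feel anxious all the time. What can I do?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,  # Required so the model continues as the assistant
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(input_ids = inputs, max_new_tokens = 256, use_cache = True)
print(tokenizer.batch_decode(outputs, skip_special_tokens = True)[0])
```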

# Uploaded model