macadeliccc committed
Commit 1ff8d2c
Parent: b171572

Update README.md

Files changed (1)
  1. README.md +63 -1
README.md CHANGED
@@ -13,4 +13,66 @@ The model is a merge of models that are capable of Chinese and Japanese output.
  + oshizo/japanese-e5-mistral-7b_slerp
  + cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
  + s3nh/Mistral-7B-Evol-Instruct-Chinese
-
+
+
+ # Code Example
+
+ ```python
+ # Import necessary libraries
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Load tokenizer and model
+ tokenizer = AutoTokenizer.from_pretrained("macadeliccc/laser-dolphin-mixtral-2x7b-dpo")
+ model = AutoModelForCausalLM.from_pretrained("macadeliccc/laser-dolphin-mixtral-2x7b-dpo")
+
+ def generate_response(prompt, max_length=50, num_return_sequences=1, temperature=1.0, top_k=50, top_p=1.0):
+     """
+     Generate a response from the model based on the input prompt and sampling hyperparameters.
+
+     Args:
+         prompt (str): Prompt for the model.
+         max_length (int): Maximum length of the generated sequence, including the prompt tokens.
+         num_return_sequences (int): Number of response sequences to generate.
+         temperature (float): Sampling temperature for model generation.
+         top_k (int): Number of highest-probability vocabulary tokens kept for top-k filtering.
+         top_p (float): If set to a float < 1, only the smallest set of most probable tokens whose cumulative probability reaches top_p is kept for generation.
+
+     Returns:
+         str: The generated response from the model.
+     """
+     messages = [
+         {"role": "system", "content": "You are Dolphin, an AI assistant."},
+         {"role": "user", "content": prompt}
+     ]
+
+     # Apply the chat template to the input messages and append the assistant generation prompt
+     gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
+
+     # Generate a response; do_sample=True so temperature, top_k, and top_p take effect
+     output = model.generate(gen_input,
+                             max_length=max_length,
+                             num_return_sequences=num_return_sequences,
+                             do_sample=True,
+                             temperature=temperature,
+                             top_k=top_k,
+                             top_p=top_p)
+
+     # Decode only the newly generated tokens (everything after the prompt) to a string
+     response = tokenizer.decode(output[0][gen_input.shape[-1]:], skip_special_tokens=True)
+
+     return response
+
+ # Example prompts in different languages
+ english_prompt = "Write a quicksort algorithm in python"
+ chinese_prompt = "用Python写一个快速排序算法"
+ japanese_prompt = "Pythonでクイックソートアルゴリズムを書いてください"
+
+ # Generate and print responses for each language
+ print("English Response:")
+ print(generate_response(english_prompt, max_length=100, temperature=0.8), "\n")
+
+ print("Chinese Response:")
+ print(generate_response(chinese_prompt, max_length=100, temperature=0.8), "\n")
+
+ print("Japanese Response:")
+ print(generate_response(japanese_prompt, max_length=100, temperature=0.8), "\n")
+ ```
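
As a usage note on the example above: the checkpoint it loads is a 7B-class merge, so loading it in full fp32 precision can exhaust GPU memory. Below is a minimal sketch of loading the same checkpoint in half precision with automatic device placement. The `torch.float16` dtype and `device_map="auto"` choice (which requires the `accelerate` package) are illustrative assumptions, not settings documented in this model card.

```python
# Hedged sketch: half-precision loading with automatic device placement.
# Assumes torch and accelerate are installed; the dtype and device_map values
# are illustrative choices, not settings documented in the model card.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "macadeliccc/laser-dolphin-mixtral-2x7b-dpo"  # same checkpoint as the example above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # roughly halves memory use relative to fp32
    device_map="auto",           # spreads layers across available GPUs/CPU (needs accelerate)
)
```

If the model ends up on a GPU, the tokenized inputs may also need to be moved with `.to(model.device)` before calling `generate`; otherwise the helper function from the example works unchanged.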