akameswa/mixtral-4x7b-instruct-code-old

Text Generation

akameswa/mistral-7b-instruct-javascript-16bit

akameswa/mistral-7b-instruct-java-16bit

akameswa/mistral-7b-instruct-cpp-16bit

akameswa/mistral-7b-instruct-python-16bit

text-generation-inference

Inference Endpoints


akameswa committed on Mar 15

Commit b4862e9 • 1 Parent(s): 386c7b0

Update README.md

Files changed (1)

README.md +29 -1


@@ -32,4 +32,32 @@ experts:
     positive_prompts: ["You are helpful a coding assistant good at cpp"]
   - source_model: akameswa/mistral-7b-instruct-python-16bit
     positive_prompts: ["You are helpful a coding assistant good at python"]
-```
+```
+## Inference
+```python
+from transformers import AutoTokenizer
+import transformers
+import torch
+model = "akameswa/mixtral-4x7b-instruct-code-trial"
+messages = [{"role": "user", "content": "What is a large language model?"}]
+tokenizer = AutoTokenizer.from_pretrained(model)
+prompt = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True
+)
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model,
+    torch_dtype=torch.float16,
+    device_map="auto",
+    model_kwargs={"load_in_4bit": True},
+)
+outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+```
+* [Link to inference notebook](https://github.com/akameswa/CodeGenerationMoE/blob/main/code/inference_moe.ipynb)
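
The inference snippet added in this commit stores the pipeline result in `outputs` but never displays it. The sketch below shows one way to pull the completion out of that result. It relies on the `transformers` text-generation pipeline returning a list of dicts with a `generated_text` key, and on the prompt being echoed at the start of that text by default. The helper name `extract_completion` and the sample data are hypothetical and not part of the commit.

```python
from typing import Dict, List


def extract_completion(outputs: List[Dict[str, str]], prompt: str) -> str:
    """Return only the newly generated text from a text-generation result.

    Hypothetical helper, not part of the README in this commit.
    """
    # The pipeline returns one dict per generated sequence; take the first.
    full_text = outputs[0]["generated_text"]
    # By default the prompt is echoed at the start of the generated text,
    # so drop it when present.
    if full_text.startswith(prompt):
        full_text = full_text[len(prompt):]
    return full_text.strip()


# Stand-in data shaped like the pipeline's return value (not real model output).
example_prompt = "What is a large language model?"
example_outputs = [{"generated_text": example_prompt + " A placeholder answer."}]
print(extract_completion(example_outputs, example_prompt))
```

With the README snippet, the equivalent call would be `print(extract_completion(outputs, prompt))` placed after the `pipeline(...)` call. Passing `return_full_text=False` to that call is another option that makes the pipeline omit the prompt itself.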