|
---
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
- 78health/TinyLlama_1.1B-function-calling
- phanerozoic/Tiny-Pirate-1.1b-v0.1
- Tensoic/TinyLlama-1.1B-3T-openhermes
tags:
- mergekit
- merge
license: mit
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
|
|
|
Example usage: |
|
|
|
|
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("s3nh/TinyLLama-1.1B-MoE")
model = AutoModelForCausalLM.from_pretrained("s3nh/TinyLLama-1.1B-MoE").to(device)

input_text = """
###Input: You are a pirate. Tell me a story about a wrecked ship.
###Response:
"""

# Tokenize the prompt and sample a response.
input_ids = tokenizer.encode(input_text, return_tensors="pt").to(device)
output = model.generate(inputs=input_ids,
                        max_length=512,
                        do_sample=True,
                        top_k=10,
                        temperature=0.7,
                        pad_token_id=tokenizer.eos_token_id,
                        attention_mask=input_ids.new_ones(input_ids.shape))
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
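The sampling settings (`do_sample=True`, `top_k=10`, `temperature=0.7`) trade determinism for mildly creative output; set `do_sample=False` for greedy, reproducible generations. The `max_length=512` cap is an illustrative choice here, not a model limit.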
|
|
|
|
|
This model was made possible by the tremendous work of the mergekit developers. I decided to merge several TinyLlama models to create a mixture of experts. The config used is shown below:
|
|
|
```yaml
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
experts:
  - source_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
    positive_prompts:
      - "chat"
      - "assistant"
      - "tell me"
      - "explain"
  - source_model: 78health/TinyLlama_1.1B-function-calling
    positive_prompts:
      - "code"
      - "python"
      - "javascript"
      - "programming"
      - "algorithm"
  - source_model: phanerozoic/Tiny-Pirate-1.1b-v0.1
    positive_prompts:
      - "storywriting"
      - "write"
      - "scene"
      - "story"
      - "character"
  - source_model: Tensoic/TinyLlama-1.1B-3T-openhermes
    positive_prompts:
      - "reason"
      - "provide"
      - "instruct"
      - "summarize"
      - "count"
```
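
For reference, a config like this is typically turned into a model with mergekit's `mergekit-moe` entry point. The paths below are placeholders, and exact flags may vary across mergekit versions:

```
mergekit-moe config.yaml ./TinyLLama-1.1B-MoE
```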