metadata
license: apache-2.0
tags:
  - moe
  - merge
  - mergekit
  - lazymergekit
  - BEE-spoke-data/smol_llama-220M-openhermes
  - BEE-spoke-data/beecoder-220M-python
  - BEE-spoke-data/zephyr-220m-sft-full
  - BEE-spoke-data/zephyr-220m-dpo-full
datasets:
  - JeanKaddour/minipile
  - pszemraj/simple_wikipedia_LM
  - mattymchen/refinedweb-3m
  - HuggingFaceH4/ultrachat_200k
  - teknium/openhermes
  - HuggingFaceH4/ultrafeedback_binarized
  - EleutherAI/proof-pile-2
  - bigcode/the-stack-smol-xl
pipeline_tag: text-generation
widget:
  - text: >-
      <|im_start|>system

      You are a helpful assistant who gives creative responses.<|im_end|>

      <|im_start|>user

      Write the background story of a game about wizards and llamas in a sci-fi
      world.<|im_end|>

      <|im_start|>assistant
  - text: >-
      <|im_start|>system

      A friendly chat between a user and an assistant.<|im_end|>

      <|im_start|>user

      Got a question for you!<|im_end|>

      <|im_start|>assistant

      Sure! What is it?<|im_end|>

      <|im_start|>user

      I need to build a simple website. Where should I start learning about web
      development?<|im_end|>

      <|im_start|>assistant
  - text: >-
      <|im_start|>system

      You are a helpful assistant who provides concise answers to the user's
      questions.<|im_end|>

      <|im_start|>user

      How to become more healthy?<|im_end|>

      <|im_start|>assistant
  - text: |-
      <|im_start|>system
      You are a helpful assistant, who always answers with empathy.<|im_end|>
      <|im_start|>user
      List the pros and cons of social media.<|im_end|>
      <|im_start|>assistant
  - text: >-
      <|im_start|>system

      You are a helpful assistant, who always answers with empathy.<|im_end|>

      <|im_start|>user

      Hello!<|im_end|>

      <|im_start|>assistant

      Hi! How can I help you today?<|im_end|>

      <|im_start|>user

      Take a look at the info below.


      - The tape inside the VHS cassettes is very delicate and can be easily
      ruined, making them unplayable and unrepairable. The reason the tape
      deteriorates is that the magnetic charge needed for them to work is not
      permanent, and the magnetic particles end up losing their charge in a
      process known as remanence decay. These particles could also become
      demagnetised via being stored too close to a magnetic source.

      - One of the most significant issues with VHS tapes is that they have
      moving parts, meaning that there are more occasions when something can go
      wrong, damaging your footage or preventing it from playing back. The tape
      itself is a prominent cause of this, and tape slippage can occur. Tape
      slippage can be caused when the tape loses its tension, or it has become
      warped. These problems can occur in storage due to high temperatures or
      frequent changes in humidity.

      - VHS tapes deteriorate over time from infrequent use or overuse. Neglect
      means mold and dirt, while overuse can lead to scratches and technical
      difficulties. This is why old VHS tapes inevitably experience malfunctions
      after a long period of time, usually anywhere from 10 to 25+ years.

      - Some VHS tapes like newer mini DVs and Digital 8 tapes can suffer from
      digital corruption, meaning that the footage becomes lost and cannot be
      recovered. These tapes were the stepping stone from VHS to the digital age
      when capturing footage straight to digital became the norm.
      Unfortunately, they are susceptible to digital corruption, which causes
      video pixelation and/or loss of audio.<|im_end|>

      <|im_start|>assistant

      Alright!<|im_end|>

      <|im_start|>user

      Now I'm going to write my question, and if the info above is useful, you
      can use it in your response.

      Ready?<|im_end|>

      <|im_start|>assistant

      Ready for your question!<|im_end|>

      <|im_start|>user

      Why do VHS tapes deteriorate over time?<|im_end|>

      <|im_start|>assistant
inference:
  parameters:
    max_new_tokens: 250
    penalty_alpha: 0.5
    top_k: 4
    repetition_penalty: 1.105

smol_llama-4x220M-MoE

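smol_llama-4x220M-MoE is a Mixture of Experts (MoE) made with the following models using LazyMergekit:

- BEE-spoke-data/smol_llama-220M-openhermes
- BEE-spoke-data/beecoder-220M-python
- BEE-spoke-data/zephyr-220m-sft-full
- BEE-spoke-data/zephyr-220m-dpo-full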

💻 Usage

# Notebook cell syntax; in a regular shell, drop the leading "!".
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Isotonic/smol_llama-4x220M-MoE"

# Load the tokenizer and pass it to the pipeline so its chat template is the one used below.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.bfloat16},
)

# Render the chat into a single prompt string, then sample a completion.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
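
The widget examples in the metadata use ChatML-style `<|im_start|>` / `<|im_end|>` markers, and the `inference` block uses different decoding settings from the sampling call in the usage code. The sketch below shows one way to reproduce the widget setup locally. It rests on several assumptions: `build_chatml_prompt` and `generate_like_widget` are hypothetical helper names, the trailing newline after the open assistant turn follows the usual ChatML layout rather than anything confirmed by this card, and whether `penalty_alpha` is accepted unchanged may depend on the installed transformers version.

```python
# Decoding settings copied from the `inference.parameters` metadata of this card.
WIDGET_GENERATION_KWARGS = {
    "max_new_tokens": 250,
    "penalty_alpha": 0.5,
    "top_k": 4,
    "repetition_penalty": 1.105,
}


def build_chatml_prompt(messages, add_generation_prompt=True):
    """Format messages with the <|im_start|>/<|im_end|> markers seen in the widget examples."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        # The trailing newline is an assumption based on the usual ChatML layout.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def generate_like_widget(messages, model_id="Isotonic/smol_llama-4x220M-MoE"):
    """Hypothetical helper: run the model with the widget's settings (downloads weights)."""
    # Imported lazily so the prompt builder can be used without these packages.
    import torch
    import transformers

    pipe = transformers.pipeline(
        "text-generation",
        model=model_id,
        model_kwargs={"torch_dtype": torch.bfloat16},
    )
    prompt = build_chatml_prompt(messages)
    return pipe(prompt, **WIDGET_GENERATION_KWARGS)[0]["generated_text"]


# Build the prompt for one of the widget examples without loading the model.
example = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant, who always answers with empathy."},
    {"role": "user", "content": "List the pros and cons of social media."},
])
print(example)
```

In practice `apply_chat_template` is the safer way to build the prompt, since it uses whatever template ships with the tokenizer. The manual builder is mainly useful for seeing what the rendered string is expected to look like.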

🧩 Configuration

experts:
  - source_model: BEE-spoke-data/smol_llama-220M-openhermes
    positive_prompts:
    - "reasoning"
    - "logic"
    - "problem-solving"
    - "critical thinking"
    - "analysis"
    - "synthesis"
    - "evaluation"
    - "decision-making"
    - "judgment"
    - "insight"

  - source_model: BEE-spoke-data/beecoder-220M-python
    positive_prompts:
    - "program"
    - "software"
    - "develop"
    - "build"
    - "create"
    - "design"
    - "implement"
    - "debug"
    - "test"
    - "code"
    - "python"
    - "programming"
    - "algorithm"
    - "function"

  - source_model: BEE-spoke-data/zephyr-220m-sft-full
    positive_prompts:
    - "storytelling"
    - "narrative"
    - "fiction"
    - "creative writing"
    - "plot"
    - "characters"
    - "dialogue"
    - "setting"
    - "emotion"
    - "imagination"
    - "scene"
    - "story"
    - "character"
    
  - source_model: BEE-spoke-data/zephyr-220m-dpo-full
    positive_prompts:
    - "chat"
    - "conversation"
    - "dialogue"
    - "discuss"
    - "ask questions"
    - "share thoughts"
    - "explore ideas"
    - "learn new things"
    - "personal assistant"
    - "friendly helper"
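
The `positive_prompts` above are hints about which kind of input each expert should handle. At inference time, a router scores the experts for each token and mixes the outputs of the best-scoring ones. The following is a minimal sketch of that gating step only, assuming a top-2 router with softmax weights. The number of experts used per token is not stated in this card, the router scores are made up, and `top_k_gate` is a hypothetical name, not a mergekit or transformers function.

```python
import math

# Expert order follows the configuration above.
EXPERTS = [
    "BEE-spoke-data/smol_llama-220M-openhermes",
    "BEE-spoke-data/beecoder-220M-python",
    "BEE-spoke-data/zephyr-220m-sft-full",
    "BEE-spoke-data/zephyr-220m-dpo-full",
]


def top_k_gate(logits, k=2):
    """Return {expert_index: weight} for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Made-up router scores for one token, e.g. from a coding-related prompt.
router_logits = [0.1, 2.0, -0.5, 1.0]
for index, weight in top_k_gate(router_logits).items():
    print(f"{EXPERTS[index]}: {weight:.3f}")
```

Only the selected experts run for a given token, which is why a 4x220M MoE costs less per token than running all four models and averaging them.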