- Alpha release: this checkpoint was saved before spikes in the train and eval loss. The model's alignment is also light and easily jailbroken.
💵 Donate to OpenAccess AI Collective to help us keep building great tools and models!
Manticore 30B Chat builds on Manticore v1 with new datasets, including a de-duped subset of the Pygmalion dataset. It also removes all Alpaca-style prompts using `###` in favor of chat-only style prompts using `ASSISTANT:`, as well as pygmalion/metharme prompting using `<|system|>`, `<|user|>` and `<|model|>` tokens.
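As a rough illustration of the two prompting styles above, the following sketch builds prompt strings in each format. The helper names, the `USER:` turn marker, and the exact whitespace are assumptions, not part of the released tokenizer config; check the repo's config folder for the authoritative format.

```python
def chat_prompt(user_message: str) -> str:
    """Chat-only style prompt ending with an ASSISTANT: turn marker
    (USER: as the opening marker is an assumption)."""
    return f"USER: {user_message}\nASSISTANT:"

def metharme_prompt(system: str, user_message: str) -> str:
    """Pygmalion/metharme style prompt using the special
    <|system|>, <|user|> and <|model|> tokens."""
    return f"<|system|>{system}<|user|>{user_message}<|model|>"

# Example: the model is expected to continue generation after the
# final ASSISTANT: / <|model|> marker.
print(chat_prompt("What are the three primary colors?"))
print(metharme_prompt("Enter RP mode.", "Hello!"))
```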
Manticore 30B Chat is a Llama 30B model fine-tuned on the following datasets along with the datasets from the original Manticore 30B.
**Manticore 30B Chat was trained on effectively 40% of the datasets below due to only training for 0.4 epochs.**
- de-duped pygmalion dataset, filtered down to RP data
- riddle_sense - instruct augmented
- hellaswag, updated for detailed explanations with 30K+ rows
- gsm8k - instruct augmented
- ShareGPT - based on a cleaned and de-duped subset
- subset of QingyiSi/Alpaca-CoT for roleplay and CoT
- ARC-Easy & ARC-Challenge - instruct augmented for detailed responses, derived from the `train` split
- hellaswag - 5K row subset of instruct augmented for concise responses, derived from the `train` split
- metaeval/ScienceQA_text_only - instruct for concise responses
- openai/summarize_from_feedback - instruct augmented tl;dr summarization
Not added from Manticore 13B:
- mmlu - the mmlu datasets were not added to this model as the test split is used for benchmarks
Special thanks to Nanobit for helping with Axolotl, TheBloke for quantizing these models to make them more accessible to all, ehartford for cleaned datasets, and 0x000011b for the RP dataset.
Try out the model in HF Spaces. The demo uses a quantized GGML version of the model to quickly return predictions on smaller GPUs (and even CPUs). Quantized GGML may have some minimal loss of model quality.
Manticore was built with Axolotl on 8x A100 80GB GPUs
- 0.4 epochs, taking approximately 14 hours. No further epochs will be released for the alpha.
- The configuration to duplicate this build is provided in this repo's /config folder.
Manticore has not been aligned to human preferences with techniques like RLHF or deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). Manticore was fine-tuned from the base model Llama 30B; please refer to its model card's Limitations section for relevant information.