---
license: llama2
datasets:
- tiiuae/falcon-refinedweb
- EleutherAI/pile
- meta-math/MetaMathQA
language:
- en
library_name: transformers
---

# Saily 220B
---

## Announcements

**1.** Date: 17th December, 2023

Releasing v1. Saily_220B is a powerful AI model built on top of Llama2-70B merges.

We created 10 fine-tuned **Llama2 70B** models. Each model was fine-tuned on a part of the RefinedWeb dataset (common to all) and, individually, on a niche-specific dataset:
- Code
- Humor
- Maths
- Logical Understanding
- Physics
- Reasoning
- Psychology
- Roleplay

We then created 4 linear merges, keeping the **Logical Understanding** and **Reasoning** models constant across all of them, and finally combined the resulting models with a passthrough merge.

Public datasets used:
1. [RefinedWeb](https://hf.co/datasets/tiiuae/falcon-refinedweb) (part of it)
2. Pile (part of it)
3. [MetaMathQA](https://hf.co/datasets/meta-math/MetaMathQA)
4. Unnatural Code (Javascript, Python, C++)

### How did we create the private dataset?
We recorded many internal brainstorming sessions where we just talked about random things. We also invited experts from different fields:
- Mathematicians
- Developers
- Bio-Engineers
- Authors
- Psychologists
- and others...

We discussed a range of topics with them, recorded the sessions, and transcribed the audio to create the datasets.

---

### Please don't refer to the config.json in the files; it isn't accurate.
You can run:

```python
from transformers import AutoModelForCausalLM as amclm

model = amclm.from_pretrained("deepnight-research/saily_220b", device_map="auto")

# Inspect the model's configuration
print(model.config)
```

to check out the model's configuration.

---

### Try it:
You definitely need GPUs here (that goes without saying).
* We have tried it on **4 x A100 80GB** and **2 x A100 80GB**.
* You will have to load the model in **4-bit** to fit it on **2 x A100 (80GB)**; a minimal 4-bit loading sketch is included at the end of this card.

```python
from transformers import AutoModelForCausalLM as amclm
from transformers import AutoTokenizer

model_name = "deepnight-research/saily_220b"
model = amclm.from_pretrained(model_name, device_map="auto")

# To load in 8-bit, make sure you have bitsandbytes installed.
# model = amclm.from_pretrained(model_name,
#                               device_map="auto",
#                               load_in_8bit=True
#                               )

# Float16
# import torch
# model = amclm.from_pretrained(model_name,
#                               device_map="auto",
#                               torch_dtype=torch.float16
#                               )

tokenizer = AutoTokenizer.from_pretrained(model_name)

input_ids = tokenizer.encode(
    "[INST]\nWrite a poem about cats\n[/INST]\n\n",
    return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_length=128,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.1,
    top_p=0.7,
    top_k=50
)

output_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(output_text)
```

We recommend following the **Alpaca Prompt Format** (a sketch of the template is included at the end of this card), and if you're trying it out in Text-Generation-WebUI, please use **INSTRUCT** or **CHAT-INSTRUCT** mode.

---

## Limitations and Bias
As with all language models, Saily_220B may generate incorrect or biased content. It's important to keep this in mind when using the model.

---

## Wanna Talk?
Reach out to us at [research@deepnight.tech](mailto:research@deepnight.tech) or [hello@deepnight.tech](mailto:hello@deepnight.tech)
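
---

### Loading in 4-bit (sketch)
The commented-out variants in the *Try it* snippet cover 8-bit and float16, but the card notes that 4-bit loading is required to fit the model on **2 x A100 (80GB)**. Below is a minimal sketch, assuming `bitsandbytes` is installed; the specific quantization settings (`nf4`, float16 compute dtype) are illustrative defaults, not values tuned or validated for this model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "deepnight-research/saily_220b"

# Illustrative 4-bit quantization settings (requires the `bitsandbytes` package)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# device_map="auto" spreads the quantized layers across the available GPUs
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=bnb_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```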
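
### Alpaca prompt format (sketch)
The card recommends the **Alpaca Prompt Format**, while the generation example above wraps the request in `[INST]` tags. The sketch below shows the standard Alpaca instruction template; the instruction text and sampling parameters are only placeholders, and `model` / `tokenizer` are assumed to be loaded as shown earlier.

```python
# Standard Alpaca-style prompt template (instruction-only variant)
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(instruction="Write a poem about cats")

# `model` and `tokenizer` as loaded in the examples above
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    input_ids,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.7,
    top_k=50,
    repetition_penalty=1.1,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```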