RichardErkhov/frankenmerger_-_cosmo-3b-test-gguf

Quantization made by Richard Erkhov.

cosmo-3b-test - GGUF

Model creator: https://huggingface.co/frankenmerger/
Original model: https://huggingface.co/frankenmerger/cosmo-3b-test/

Name	Quant method	Size
cosmo-3b-test.Q2_K.gguf	Q2_K	1.03GB
cosmo-3b-test.IQ3_XS.gguf	IQ3_XS	1.14GB
cosmo-3b-test.IQ3_S.gguf	IQ3_S	1.21GB
cosmo-3b-test.Q3_K_S.gguf	Q3_K_S	1.21GB
cosmo-3b-test.IQ3_M.gguf	IQ3_M	1.26GB
cosmo-3b-test.Q3_K.gguf	Q3_K	1.34GB
cosmo-3b-test.Q3_K_M.gguf	Q3_K_M	1.34GB
cosmo-3b-test.Q3_K_L.gguf	Q3_K_L	1.46GB
cosmo-3b-test.IQ4_XS.gguf	IQ4_XS	1.49GB
cosmo-3b-test.Q4_0.gguf	Q4_0	1.56GB
cosmo-3b-test.IQ4_NL.gguf	IQ4_NL	1.57GB
cosmo-3b-test.Q4_K_S.gguf	Q4_K_S	1.57GB
cosmo-3b-test.Q4_K.gguf	Q4_K	1.67GB
cosmo-3b-test.Q4_K_M.gguf	Q4_K_M	1.67GB
cosmo-3b-test.Q4_1.gguf	Q4_1	1.73GB
cosmo-3b-test.Q5_0.gguf	Q5_0	1.9GB
cosmo-3b-test.Q5_K_S.gguf	Q5_K_S	1.9GB
cosmo-3b-test.Q5_K.gguf	Q5_K	1.95GB
cosmo-3b-test.Q5_K_M.gguf	Q5_K_M	1.95GB
cosmo-3b-test.Q5_1.gguf	Q5_1	2.07GB
cosmo-3b-test.Q6_K.gguf	Q6_K	2.25GB
cosmo-3b-test.Q8_0.gguf	Q8_0	2.92GB
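
As a rough aid for choosing among the files above, the sketch below picks the largest quant whose file fits a given memory budget. The sizes are copied from the table. The helper names `pick_quant` and `filename_for` and the 0.5 GB headroom default are illustrative assumptions, not part of this repository. Actual memory use at runtime is higher than the file size and depends on context length, so treat the result as a starting point only.

```python
# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 1.03, "IQ3_XS": 1.14, "IQ3_S": 1.21, "Q3_K_S": 1.21,
    "IQ3_M": 1.26, "Q3_K": 1.34, "Q3_K_M": 1.34, "Q3_K_L": 1.46,
    "IQ4_XS": 1.49, "Q4_0": 1.56, "IQ4_NL": 1.57, "Q4_K_S": 1.57,
    "Q4_K": 1.67, "Q4_K_M": 1.67, "Q4_1": 1.73, "Q5_0": 1.9,
    "Q5_K_S": 1.9, "Q5_K": 1.95, "Q5_K_M": 1.95, "Q5_1": 2.07,
    "Q6_K": 2.25, "Q8_0": 2.92,
}


def pick_quant(budget_gb, headroom_gb=0.5):
    """Return the largest quant whose file fits in the budget, or None.

    headroom_gb is an assumed allowance for runtime overhead (hypothetical
    default, not a measured value).
    """
    limit = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= limit}
    if not fitting:
        return None
    # When two quants have the same size, the one listed first wins.
    return max(fitting, key=fitting.get)


def filename_for(quant):
    """Build the file name used in the table for a given quant method."""
    return f"cosmo-3b-test.{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(2.0)
    print(choice, filename_for(choice) if choice else "nothing fits")
```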

Original model description:

widget:
  - text: 'Artificial Intelligence is'
    example_title: Textbook
    group: Completion
  - text: ' [INST] How to take care of exotic cars? [/INST] '
    example_title: Wikihow
    group: Completion
  - text: ' [INST] Generate a story about a Dark Knight [/INST] '
    example_title: Story
    group: Completion
inference:
  parameters:
    temperature: 0.6
    top_p: 0.9
    top_k: 30
    repetition_penalty: 1.2
license: apache-2.0
language:
  - en
pipeline_tag: text-generation

💻 Usage

    # Notebook cell: install the dependencies
    !pip install -qU transformers accelerate

    from transformers import AutoTokenizer
    import transformers
    import torch

    model = "gmonsoon/frankencosmo-test"
    messages = [{"role": "user", "content": "What is a large language model?"}]

    # Render the chat messages into a single prompt string
    tokenizer = AutoTokenizer.from_pretrained(model)
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

    # Build a text-generation pipeline in half precision
    pipeline = transformers.pipeline(
        "text-generation",
        model=model,
        torch_dtype=torch.float16,
        device_map="auto",
    )

    # Sample a completion and print it
    outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
    print(outputs[0]["generated_text"])
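
The snippet above loads the original model through transformers. For the GGUF files listed in this repository, one possible route is the `llama-cpp-python` package. The following is a minimal sketch, not a tested recipe for this model. It assumes the package is installed, that a quantized file has already been downloaded to a local path, and that the `[INST] ... [/INST]` wrapper from the widget examples is a suitable prompt format. The helper names, the `n_ctx=2048` context size, and the reuse of the widget's sampling values are illustrative choices.

```python
def build_prompt(instruction):
    """Wrap an instruction the way the widget examples do (assumed format)."""
    return f" [INST] {instruction} [/INST] "


# Widget sampling settings, renamed to llama-cpp-python keyword names
# (repetition_penalty in the model card maps to repeat_penalty here).
SAMPLING = {"temperature": 0.6, "top_p": 0.9, "top_k": 30, "repeat_penalty": 1.2}


def generate(model_path, instruction, max_tokens=256):
    """Run one completion against a local GGUF file and return the text."""
    # Imported lazily so the helpers above work without the package installed.
    from llama_cpp import Llama

    # n_ctx=2048 is an assumed context size, not a value from the model card.
    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)
    out = llm(build_prompt(instruction), max_tokens=max_tokens, **SAMPLING)
    return out["choices"][0]["text"]


# Example call, assuming the file sits in the working directory:
# print(generate("cosmo-3b-test.Q4_K_M.gguf", "What is a large language model?"))
```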