
Model Card for Luxeai-anu-1-bit-70M

Model Name

Luxeai-anu-1-bit-70M

Model Description

The Luxeai-anu-1-bit-70M Large Language Model (LLM) is my first attempt at implementing a one-bit LLM, based on the paper "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits". I used the pre-trained Mistral-7B-v0.3 and the abideen/Cosmopedia-100k-pretrain dataset. This initial model was trained on a Microsoft Azure Standard_NC6s_v3 instance (6 cores, 112 GB RAM, 736 GB storage, 1 x NVIDIA Tesla V100). I plan to train on a much larger dataset once I obtain sponsorship for an 8x DGX system. Testing was done on a subset of the same dataset.
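
For readers unfamiliar with the referenced paper, here is a minimal sketch of the absmean ternary quantization it describes: each weight is divided by the mean absolute value of the weight matrix, rounded to the nearest integer, and clipped to {-1, 0, +1}. The function name `absmean_ternary_quantize` and the sample weights are hypothetical, chosen for illustration. This card does not state that this model's training code follows this exact form.

```python
def absmean_ternary_quantize(weights, eps=1e-8):
    """Quantize a 2-D list of floats to {-1, 0, +1} with one per-matrix scale.

    Illustrative sketch of the absmean scheme from the BitNet b1.58 paper;
    not taken from this model's training code.
    """
    flat = [w for row in weights for w in row]
    # gamma is the mean absolute value over the whole weight matrix.
    gamma = sum(abs(w) for w in flat) / len(flat)
    # Scale, round to the nearest integer, then clip to the range [-1, 1].
    # Note: Python's round() resolves exact .5 ties to the nearest even integer.
    quantized = [
        [max(-1, min(1, round(w / (gamma + eps)))) for w in row]
        for row in weights
    ]
    return quantized, gamma


# Hypothetical 2 x 3 weight matrix.
sample = [[0.9, -0.05, 0.4], [-1.2, 0.02, -0.45]]
q, gamma = absmean_ternary_quantize(sample)
print(q)  # [[1, 0, 1], [-1, 0, -1]]
```

Small weights collapse to 0 and larger ones keep only their sign, so each weight carries at most log2(3), roughly 1.58 bits, of information, which is where the "1.58 bits" in the paper's title comes from.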

Intended Use

  • Task: text generation

How to Use

Run the following code in a Python Jupyter notebook to load and test the model:


from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the tokenizer and model from the Hugging Face Hub
model_id = "arunb74/Luxeai-anu-1-bit-70M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)


# Create a text generation pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device_map="auto",
)

prompt = "The LISA Pathfinder scientific collaboration will meet in Trento"

sequences = pipe(
    f"<s>[INST] {prompt} [/INST]",
    do_sample=True,
    max_new_tokens=100, 
    temperature=0.7, 
    top_k=50, 
    top_p=0.95,
    num_return_sequences=1,
)

print(sequences[0]['generated_text'])

"""
Example output (sampling is enabled, so results will vary between runs):

<s>[INST] The LISA Pathfinder scientific collaboration will meet in Trento [/INST]

The LISA Pathfinder Biology, a leading provider of biochemistry and molecular biology, provides a comprehensive understanding of the mechanisms and mechanisms of the LISA pathways. The LISA Pathfinder Biology, a researcher specializing in molecular biology, is a clinical trial of the disease, and its pathophysiology, and a combination of the most commonly used and widely used treatments. It is a relatively simple procedure that involves two steps.
"""

I am looking for community members who can help with feedback, suitable datasets for further training, testing, and evaluation.
Model size: 71.6M params (Safetensors, tensor type F32)
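
Since the published checkpoint stores its 71.6M parameters as F32, the arithmetic below gives a rough comparison between that footprint and a hypothetical packed ternary encoding. It is a back-of-the-envelope sketch, assuming every parameter is ternary. That assumption overstates the savings, because 1.58-bit schemes usually keep embeddings and normalization layers in higher precision. The helper `footprint_mb` is a hypothetical name introduced for illustration.

```python
PARAMS = 71.6e6  # parameter count reported for this model


def footprint_mb(num_params, bits_per_param):
    """Return the storage size in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_param / 8 / 1e6


fp32_mb = footprint_mb(PARAMS, 32)       # weights as currently stored (F32)
ternary_mb = footprint_mb(PARAMS, 1.58)  # idealized 1.58 bits per weight

print(f"F32:     {fp32_mb:.1f} MB")     # F32:     286.4 MB
print(f"Ternary: {ternary_mb:.1f} MB")  # Ternary: 14.1 MB
```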
