metadata

license: apache-2.0
datasets:
  - allenai/dolma

OLMo-Bitnet-1B

OLMo-Bitnet-1B is a 1B parameter model trained using the method described in The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits.

It was trained on the first 60B tokens of the Dolma dataset, so it is merely a research proof-of-concept to test out the methodolgy.

A separate training run was run with the exact same hyperparameters, but using standard fp16 weights. The comparison can be found in this wandb report.

Sample inference code

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/OLMo-Bitnet-1B")
model = AutoModelForCausalLM.from_pretrained("NousResearch/OLMo-Bitnet-1B",
    torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")

streamer = TextStreamer(tokenizer)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, pad_token_id=tokenizer.eos_token_id,
    temperature=0.8, repetition_penalty=1.1, do_sample=True,streamer=streamer)
pipe("The capitol of Paris is",  max_new_tokens=256)

Training was performed using OLMo.