Llama-2-70b-chat-hf-2bit_g16_s128-HQQ

This is a version of the Llama-2-70b-chat-hf model quantized to 2 bits with Half-Quadratic Quantization (HQQ): https://mobiusml.github.io/hqq_blog/
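
As a rough illustration of what group-wise 2-bit quantization means, the sketch below stores each group of weights as integer codes in the range 0 to 3 plus one scale and one zero-point per group. It uses plain round-to-nearest and is not the HQQ solver, which optimizes the quantization parameters as described in the blog post. The function names are hypothetical, and reading `g16` in the model name as a group size of 16 is an assumption.

```python
def quantize_group(weights, n_bits=2):
    """Round-to-nearest affine quantization of one group of floats."""
    levels = 2 ** n_bits - 1                      # 3 for 2-bit codes
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    # Map each weight to the nearest integer code and clamp to [0, levels].
    codes = [min(levels, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min                    # w_min acts as the zero-point


def dequantize_group(codes, scale, zero):
    """Reconstruct approximate weights from the codes and group metadata."""
    return [c * scale + zero for c in codes]


def quantize(weights, group_size=16, n_bits=2):
    """Split a flat list of weights into groups and quantize each one."""
    return [
        quantize_group(weights[start:start + group_size], n_bits)
        for start in range(0, len(weights), group_size)
    ]


weights = [i / 10 for i in range(32)]
groups = quantize(weights)                        # two groups of 16 weights
restored = [w for g in groups for w in dequantize_group(*g)]
```

Smaller groups give each scale and zero-point fewer weights to cover, which lowers the reconstruction error at the cost of more metadata per weight.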

Basic Usage

To run the model, install the HQQ library from https://github.com/mobiusml/hqq and use it as follows:

model_id = 'mobiuslabsgmbh/Llama-2-70b-chat-hf-2bit_g16_s128-HQQ'

from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model     = HQQModelForCausalLM.from_quantized(model_id)

Basic Chat Example

model_id = 'mobiuslabsgmbh/Llama-2-70b-chat-hf-2bit_g16_s128-HQQ'

from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model     = HQQModelForCausalLM.from_quantized(model_id)

##########################################################################################################
import transformers
from threading import Thread

from sys import stdout
def print_flush(data):
    stdout.write("\r" + data)
    stdout.flush()

#Adapted from https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat/blob/main/app.py
def process_conversation(chat):
    system_prompt = chat['system_prompt']
    chat_history  = chat['chat_history']
    message       = chat['message']

    conversation = []
    if system_prompt:
        conversation.append({"role": "system", "content": system_prompt})
    for user, assistant in chat_history:
        conversation.extend([{"role": "user", "content": user}, {"role": "assistant", "content": assistant}])
    conversation.append({"role": "user", "content": message})

    return tokenizer.apply_chat_template(conversation, return_tensors="pt").to('cuda')

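def chat_processor(chat, max_new_tokens=100, do_sample=True):
    tokenizer.use_default_system_prompt = False
    # The streamer yields decoded text as soon as generate() produces new tokens.
    streamer = transformers.TextIteratorStreamer(tokenizer, timeout=10.0, skip_prompt=True, skip_special_tokens=True)

    # The sampling settings (top_p, top_k, temperature) only take effect when do_sample=True.
    generate_params = dict(
        input_ids=process_conversation(chat),
        streamer=streamer,
        max_new_tokens=max_new_tokens,
        do_sample=do_sample,
        top_p=0.90,
        top_k=50,
        temperature=0.6,
        num_beams=1,
        repetition_penalty=1.2,
    )

    # Run generation in a background thread so the main thread can consume the stream.
    t = Thread(target=model.generate, kwargs=generate_params)
    t.start()

    outputs = []
    for text in streamer:
        outputs.append(text)
        print_flush("".join(outputs))

    t.join()  # Wait for generation to finish before returning.
    return outputs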

###################################################################################################

outputs = chat_processor({'system_prompt': "You are a helpful assistant.",
                          'chat_history': [],
                          'message': "How can I build a car?"},
                         max_new_tokens=1000, do_sample=False)

Output:

Building a car is a complex process that involves designing, prototyping, testing, and manufacturing. Here are some general steps you can follow to build a car:

  1. Design the car: Determine the type of car you want to build, including the size, shape, and features. Create a detailed set of blueprints or computer-aided design (CAD) drawings to guide your building process.
  2. Source materials: Purchase or gather all the necessary materials, such as steel, aluminum, rubber, plastics, and any other components required for the car's body, frame, and engine.
  3. Build the frame: Construct the frame, which is the foundation of the car. This includes creating the chassis, suspension, and steering systems.
  4. Install the engine: Choose an appropriate engine and install it in the frame. Connect the engine to the transmission, exhaust system, and cooling system.
  5. Add the body: Attach the body panels to the frame, including the hood, doors, trunk lid, and roof. Ensure proper alignment and fitment.
  6. Install the electrical system: Connect the battery, starter, alternator, and wiring harness to the engine and other components. Install headlights, taillights, and other electrical accessories.
  7. Add the brakes: Install the brake system, including the brake pads, rotors, calipers, and master cylinder. Connect the brake lines and bleed the system to remove air bubbles.
  8. Install the interior: Fit the seats, dashboard, carpeting, and other interior components. Install the steering column, pedals, and shifter.
  9. Test and inspect: Check the car's systems, including the brakes, suspension, and engine performance. Make sure everything is functioning properly and safely.
  10. Register and insure: Obtain registration and insurance for your newly built car. Comply with local regulations and laws regarding vehicle ownership and operation.

Please note that this is a high-level overview of the process, and building a car can be a complex and time-consuming task. It requires specialized knowledge, skills, and tools, as well as a clean and organized workspace. Additionally, safety precautions should always be taken when working on vehicles, as they can be dangerous if mishandled.

If you are not experienced in automotive construction, it may be advisable to seek guidance from professionals or take a course in automotive mechanics before attempting to build a car.
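
The chat example above runs `model.generate` on a background thread while the main thread iterates over the streamer and prints text as it arrives. The standard-library sketch below shows that producer/consumer pattern in isolation. `ToyStreamer` and `fake_generate` are hypothetical stand-ins written for this illustration; they are not part of transformers or HQQ.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the stream


class ToyStreamer:
    """Hypothetical stand-in for a text streamer: a blocking iterator over a queue."""

    def __init__(self, timeout=10.0):
        self._queue = queue.Queue()
        self._timeout = timeout

    def put(self, text):
        self._queue.put(text)

    def end(self):
        self._queue.put(_DONE)

    def __iter__(self):
        return self

    def __next__(self):
        # Block until the producer delivers the next piece (or the timeout expires).
        item = self._queue.get(timeout=self._timeout)
        if item is _DONE:
            raise StopIteration
        return item


def fake_generate(streamer, pieces):
    """Stands in for model.generate: pushes each piece to the streamer, then signals the end."""
    for piece in pieces:
        streamer.put(piece)
    streamer.end()


def stream(pieces):
    streamer = ToyStreamer()
    worker = threading.Thread(target=fake_generate, args=(streamer, pieces))
    worker.start()
    received = [piece for piece in streamer]  # consumed while the worker is still producing
    worker.join()
    return received


result = stream(["Building", " a", " car"])
```

Without the background thread, the call that produces the text would block until it finished, and nothing could be printed incrementally.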


Limitations:
- Only supports single-GPU runtime.
- Not compatible with Hugging Face's PEFT.
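
Because the model has to fit on one GPU, a back-of-the-envelope estimate of the weight storage can be useful. The sketch below is simple arithmetic under stated assumptions: a nominal 70e9 parameters, 2-bit codes, groups of 16 weights, and a hypothetical 16 bits of scale and zero-point metadata per group. This card does not state the actual metadata layout, and the estimate ignores activations, the KV cache, and any layers left unquantized.

```python
def bits_per_weight(n_bits, group_size, meta_bits_per_group):
    """Average storage per weight: the code itself plus its share of group metadata."""
    return n_bits + meta_bits_per_group / group_size


def footprint_gb(n_params, n_bits, group_size, meta_bits_per_group):
    """Approximate size in gigabytes (10**9 bytes) of the quantized weights alone."""
    total_bits = n_params * bits_per_weight(n_bits, group_size, meta_bits_per_group)
    return total_bits / 8 / 1e9


# Assumed values, for illustration only (see the paragraph above).
estimate = footprint_gb(70e9, n_bits=2, group_size=16, meta_bits_per_group=16)
print(f"{estimate:.2f} GB")  # prints "26.25 GB" under these assumptions
```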
