# llama-65b-4bit
This works with my branch of GPTQ-for-LLaMa: https://github.com/catid/GPTQ-for-LLaMa-65B-2GPU
To test it out you need two RTX 4090 GPUs and 64 GB of system RAM (a large swap file might also work, but this is untested). The steps below walk through setup and a test generation.
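As a sanity check on why two 24 GB cards are enough, here is a back-of-the-envelope estimate of the quantized weight footprint. The round 65e9 parameter count and the per-group overhead (one fp16 scale plus a packed 4-bit zero point per 128 weights) are assumptions for this sketch, not measured values:

```python
# Rough VRAM estimate for 4-bit GPTQ weights (assumptions, not measurements):
# ~65e9 parameters, groupsize 128, and per-group overhead of one fp16
# scale (2 bytes) plus a packed 4-bit zero point (0.5 bytes).
params = 65e9
wbits = 4
groupsize = 128

weight_gb = params * wbits / 8 / 1e9          # packed 4-bit weights
overhead_gb = params / groupsize * 2.5 / 1e9  # per-group scales + zero points
total_gb = weight_gb + overhead_gb

print(f"~{total_gb:.1f} GB of quantized weights")  # roughly 34 GB
```

Roughly 34 GB of weights split across two 24 GB cards leaves headroom for activations and the KV cache, which is why this model card targets a two-GPU setup.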
```shell
# Install git and git-lfs
sudo apt install git git-lfs

# Clone the code
git clone https://github.com/catid/GPTQ-for-LLaMa-65B-2GPU
cd GPTQ-for-LLaMa-65B-2GPU

# Clone the model weights
git lfs install
git clone https://huggingface.co/catid/llama-65b-4bit

# Set up a conda environment
conda create -n gptq python=3.10
conda activate gptq

# Install script dependencies
pip install -r requirements.txt

# Work around a protobuf error by forcing the pure-Python backend
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python

# Run a test generation
python llama_inference.py llama-65b-4bit --load llama-65b-4bit/llama65b-4bit-128g.safetensors --groupsize 128 --wbits 4 --text "I woke up with a dent in my forehead. " --max_length 128 --min_length 32
```
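The `export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` workaround only affects the current shell. If you launch the script some other way (an IDE, a notebook, a service unit), the same backend selection can be made in-process, as a minimal sketch:

```python
import os

# Select protobuf's pure-Python backend. This must run before the first
# `google.protobuf` import anywhere in the process; it is equivalent to
# the shell `export` in the steps above.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
```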
License: BSD-3-Clause