Be.FM 1.5-70B Model Card
Overview
Be.FM 1.5-70B is an open foundation model for human behavior modeling, built on Llama-3.3-70B-Instruct and fine-tuned via LoRA on diverse behavioral datasets. It is designed for predicting human survey responses, personality scores, demographic attributes, and behavior in economic and strategic games.
Paper: The Be.FM 1.5 paper link will be added here when it is released.
You will need to accept the Llama 3.3 Community License on Meta's repository before downloading the base model.
Usage
Be.FM 1.5-70B is a LoRA adapter on top of meta-llama/Llama-3.3-70B-Instruct. The base model's weights alone need roughly 140 GB of GPU memory in bfloat16 (about 70 billion parameters at 2 bytes each); use device_map="auto" to spread them across multiple GPUs.
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_id = "meta-llama/Llama-3.3-70B-Instruct"
peft_model_id = "befm/BeFM1.5-70B"

# Load the tokenizer from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained(peft_model_id)

# Load the base model in bfloat16, sharded across the available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    base_model_id, device_map="auto", torch_dtype="bfloat16"
)

# Attach the Be.FM 1.5 LoRA adapter to the base model.
model = PeftModel.from_pretrained(model, peft_model_id)
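The 140 GB figure quoted above can be reproduced with a quick back-of-the-envelope calculation. The sketch below is only an estimate under stated assumptions: it uses a nominal 70 billion parameters, counts weights only (no activations, KV cache, or adapter weights), and the helper name `estimate_weight_memory_gb` is hypothetical, not part of any library.

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights alone, in decimal GB.

    Assumption: activations, KV cache, and framework overhead are ignored,
    so the real requirement at inference time is higher.
    """
    return num_params * bytes_per_param / 1e9


NOMINAL_PARAMS = 70e9  # assumed round figure for a "70B" model

# bfloat16 stores each parameter in 2 bytes.
bf16_gb = estimate_weight_memory_gb(NOMINAL_PARAMS, 2)
print(f"bfloat16 weights: ~{bf16_gb:.0f} GB")
```

Dividing the result by the memory of a single GPU gives a rough lower bound on how many devices device_map="auto" will need to shard across.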
Inference
Be.FM 1.5 uses the tokenizer's standard chat template. Format each prompt as a system message followed by a user message.
messages = [
    {"role": "system", "content": "You are a participant in a behavioral study."},
    {"role": "user", "content": "<your question here>"},
]

# Render the messages with the chat template and append the assistant header.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a completion with the recommended settings.
outputs = model.generate(
    **inputs, max_new_tokens=64,
    temperature=0.6, top_p=0.9, do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
Recommended sampling: temperature=0.6, top_p=0.9.
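Because generation uses do_sample=True, repeated calls with the same prompt can return different answers. One possible way to work with that, sketched here as an assumption rather than a documented Be.FM workflow, is to draw several samples and tally them into an empirical response distribution. The names `sample_response_distribution` and `generate_fn` are hypothetical; in real use `generate_fn` would wrap the tokenizer and model.generate call shown above, while the stub below stands in for the model so the sketch runs on its own.

```python
import itertools
from collections import Counter
from typing import Callable, Dict


def sample_response_distribution(
    generate_fn: Callable[[], str], n_samples: int = 20
) -> Dict[str, float]:
    """Call generate_fn n_samples times and return relative answer frequencies.

    Answers are stripped and lower-cased before counting, so "Accept" and
    "accept " are treated as the same response.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    counts = Counter(generate_fn().strip().lower() for _ in range(n_samples))
    return {answer: count / n_samples for answer, count in counts.items()}


# Stub generator standing in for the model (hypothetical canned answers).
canned = itertools.cycle(["Accept", "accept ", "Reject", "Accept"])
print(sample_response_distribution(lambda: next(canned), n_samples=4))
# → {'accept': 0.75, 'reject': 0.25}
```

The simple normalization here only suits short categorical answers; free-text responses would need their own parsing step.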
More examples can be found in the appendix of the paper.
Citation, Terms of Use, and Feedback
The Be.FM 1.5 paper will be linked here when it is released.
By using this model, you agree to the Be.FM Terms of Use.
License: Llama 3.3 Community License, inherited from the Llama-3.3-70B-Instruct base model. See LICENSE. Use is also subject to Meta's Acceptable Use Policy.
We welcome your feedback on model performance as you apply Be.FM 1.5 to your work. Please share it via the feedback form.