Phi-1.5 TOFU Unlearning Model
IMPORTANT: This model's checkpoints are stored in separate branches. You MUST specify a revision when loading the model to access a specific checkpoint.
This model is a variant of the Phi-1.5 model, fine-tuned on the TOFU (Task of Fictitious Unlearning) dataset and then subjected to various unlearning algorithms.
Model Details
- Base Model: Phi-1.5
- Training: Fine-tuned on TOFU dataset
- Unlearning: Applied various unlearning algorithms
Unlearning Algorithm
This model uses the KL_1e-05 unlearning algorithm with the following parameters:
- Learning Rate: forget01
- Forget Percentage: 10%
Revisions
The model is organized into multiple revisions, each representing a checkpoint during the unlearning process. The revision names follow the pattern checkpoint-X, where X is the checkpoint number. Each revision is stored in a separate branch.
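As a rough sketch of how the checkpoint-X branches could be enumerated, the snippet below queries the Hub with `list_repo_refs` from the `huggingface_hub` package and sorts the branch names by checkpoint number. The helper names and the `repo_id` argument are illustrative assumptions and are not part of this model card.

```python
import re

CHECKPOINT_PATTERN = re.compile(r"^checkpoint-(\d+)$")


def checkpoint_number(name):
    """Return the integer X from a 'checkpoint-X' name, or None if it does not match."""
    match = CHECKPOINT_PATTERN.match(name)
    return int(match.group(1)) if match else None


def sort_checkpoints(branch_names):
    """Keep only 'checkpoint-X' names and order them by X (numeric, not lexicographic)."""
    numbered = [(checkpoint_number(name), name) for name in branch_names]
    return [name for number, name in sorted(p for p in numbered if p[0] is not None)]


def list_checkpoint_revisions(repo_id):
    """Query the Hub for the branches of repo_id and return its checkpoint revisions."""
    # Imported lazily so the pure helpers above work without huggingface_hub installed.
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    return sort_checkpoints(branch.name for branch in refs.branches)
```

Sorting numerically matters because a plain string sort would place checkpoint-12 before checkpoint-6.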
Loading the Model
To load a specific revision of this model, you MUST specify the revision parameter. Use the following code:
from transformers import AutoModelForCausalLM, AutoTokenizer
# The 'revision' parameter is required. Replace 'checkpoint-X' with the desired revision (e.g., 'checkpoint-12').
# '{model_name}' below is a placeholder for this repository's name.
revision = "checkpoint-X"
model = AutoModelForCausalLM.from_pretrained("locuslab/{model_name}", revision=revision)
tokenizer = AutoTokenizer.from_pretrained("locuslab/{model_name}", revision=revision)
Note: If you don't specify a revision, you will not be able to load the model correctly.
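Because a missing or mistyped revision is the most likely loading error, it may help to validate the revision string before calling `from_pretrained`. The following is a minimal sketch under that assumption; `validate_revision` and `load_checkpoint` are hypothetical helper names, and `repo_id` stands in for the full repository name.

```python
import re


def validate_revision(revision):
    """Return the revision unchanged if it looks like 'checkpoint-<number>', else raise."""
    if not isinstance(revision, str) or not re.fullmatch(r"checkpoint-\d+", revision):
        raise ValueError(
            f"Expected a revision such as 'checkpoint-12', got {revision!r}"
        )
    return revision


def load_checkpoint(repo_id, revision):
    """Load the model and tokenizer for one checkpoint branch of repo_id."""
    # Fail fast on a bad revision before importing or downloading anything.
    revision = validate_revision(revision)

    # Imported lazily so validate_revision can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
    return model, tokenizer
```

With this wrapper, passing the literal placeholder "checkpoint-X" raises a `ValueError` immediately instead of failing later during download.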
TOFU Dataset
TOFU (Task of Fictitious Unlearning) is a dataset designed for training and evaluating unlearning algorithms in language models. It simulates scenarios where certain information needs to be "forgotten" or removed from the model's knowledge.
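For readers who want to inspect the data, the sketch below shows one way the forget and retain subsets might be loaded with the `datasets` library. It assumes the dataset is published on the Hub as `locuslab/TOFU` with configurations named like `forget10` and `retain90` and a `train` split; none of these names are stated in this model card, so verify them against the dataset page before relying on them.

```python
def tofu_config_names(forget_percent):
    """Map a forget percentage to the assumed (forget, retain) configuration names."""
    if forget_percent not in (1, 5, 10):
        raise ValueError("this sketch assumes forget sets of 1, 5, or 10 percent")
    return f"forget{forget_percent:02d}", f"retain{100 - forget_percent:02d}"


def load_tofu_splits(forget_percent, dataset_id="locuslab/TOFU"):
    """Load the forget and retain subsets; dataset_id and split name are assumptions."""
    # Imported lazily so tofu_config_names works without the datasets package installed.
    from datasets import load_dataset

    forget_name, retain_name = tofu_config_names(forget_percent)
    forget = load_dataset(dataset_id, forget_name, split="train")
    retain = load_dataset(dataset_id, retain_name, split="train")
    return forget, retain
```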
Unlearning Process
- The base Phi-1.5 model was first fine-tuned on the TOFU dataset (checkpoint-625).
- Various unlearning algorithms were then applied to this fine-tuned model to selectively "forget" certain information.
- The results of these unlearning processes are captured in the different revisions (branches) of this model.
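This card does not spell out the KL algorithm, so the following is only a toy illustration of one common formulation: raise the loss on forget examples while penalizing the KL divergence between the fine-tuned reference model and the current model on retain examples. It works on a single categorical distribution in pure Python, and all function and argument names are hypothetical; it is not the training code used for these checkpoints.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two categorical distributions given as equal-length lists.

    Assumes q is strictly positive wherever p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def kl_unlearning_loss(forget_nll, reference_probs, current_probs):
    """Toy KL-regularized unlearning objective for a single token position.

    forget_nll:      current model's negative log-likelihood on a forget example
    reference_probs: next-token distribution of the fine-tuned (pre-unlearning)
                     model on a retain example
    current_probs:   the same distribution from the model being unlearned
    """
    # Minimizing -forget_nll pushes the forget loss up (gradient ascent on the
    # forget set), while the KL term penalizes drift from the reference model
    # on retained data.
    return -forget_nll + kl_divergence(reference_probs, current_probs)
```

When the current model still matches the reference on retain data, the KL term is zero and the objective reduces to the negated forget loss; any drift on retain data adds a positive penalty.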
Usage and Limitations
This model is primarily intended for research purposes, particularly in the field of machine unlearning and privacy in language models. It may not be suitable for general-purpose language tasks without further evaluation.
Citation
If you use this model in your research, please cite:
@misc{tofu2024,
  title={TOFU: A Task of Fictitious Unlearning for LLMs},
  author={Pratyush Maini and Zhili Feng and Avi Schwarzschild and Zachary C. Lipton and J. Zico Kolter},
  year={2024},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
Contact
For questions or issues regarding this model, please contact pratyushmaini@cmu.edu.