	---
	license: apache-2.0
	datasets:
	- allenai/qasper
	metrics:
	- f1
	---

	# LLoCO: Learning Long Contexts Offline
	[Paper](https://arxiv.org/abs/2404.07979) | [Code](https://github.com/jeffreysijuntan/lloco)

	Lloco-7b-qasper is a LoRA adapter checkpoint finetuned from [AutoCompressor-Llama-2-7b-6k](https://huggingface.co/princeton-nlp/AutoCompressor-Llama-2-7b-6k/) and [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
	using the LLoCO method introduced in [LLoCO: Learning Long Contexts Offline](https://arxiv.org/abs/2404.07979). It is instruction-tuned on the Qasper training set.

	LLoCO enables LLMs to process long contexts efficiently by learning them offline through context compression and in-domain parameter-efficient finetuning with LoRA. This approach extends the effective context window of a 4k-token LLaMA2-7B model to handle up to 128k tokens, while using
	30x fewer tokens and achieving up to a 7.62x inference speed-up.
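
To make the token savings concrete, here is a rough back-of-the-envelope sketch. It treats the 30x figure as a uniform compression ratio, which is a simplifying assumption and not a description of how the compressor works internally; the function name `compressed_token_count` is hypothetical and not part of the LLoCO code.

```python
import math


def compressed_token_count(context_tokens: int, compression_ratio: float = 30.0) -> int:
    """Estimate how many tokens remain after compression, rounded up.

    Assumes a uniform compression ratio (30x by default, the figure
    quoted in this model card). Real ratios may vary with the input.
    """
    if context_tokens < 0:
        raise ValueError("context_tokens must be non-negative")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return math.ceil(context_tokens / compression_ratio)


if __name__ == "__main__":
    # Print the estimated compressed length for a few context sizes.
    for n in (4_000, 32_000, 128_000):
        print(f"{n:>7} original tokens -> ~{compressed_token_count(n)} compressed tokens")
```

Under this assumption, a 128,000-token document shrinks to roughly 4,300 tokens, which is on the order of the base model's 4k-token window.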

	## Released LoRA Checkpoint
	| Model | LoRA Rank | Dataset | Link |
	|:-----------------|-----------|-------------|---------------------------------------------------------|
	| Lloco-7b-quality | 8 | QuALITY | [link](https://huggingface.co/xiuyul/Lloco-7b-quality/) |
	| Lloco-7b-qasper | 8 | Qasper | [link](https://huggingface.co/xiuyul/Lloco-7b-qasper/) |
	| Lloco-7b-qmsum | 8 | QMSum | [link](https://huggingface.co/xiuyul/Lloco-7b-qmsum/) |
	| Lloco-7b-nqa | 8 | NarrativeQA | [link](https://huggingface.co/xiuyul/Lloco-7b-nqa/) |
	| Lloco-7b-hqa | 8 | HotpotQA | [link](https://huggingface.co/xiuyul/Lloco-7b-hqa/) |
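
For a sense of what LoRA rank 8 means in terms of size, the sketch below counts the trainable parameters a rank-8 adapter adds to a single square projection matrix. It assumes a 4096 x 4096 matrix, matching the hidden size of Llama-2-7B. Which matrices these checkpoints actually adapt is not stated in this card, so the numbers are per matrix only, and the function names are hypothetical.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a LoRA update B @ A, with A: (rank, d_in) and B: (d_out, rank)."""
    return rank * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the full dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


HIDDEN_SIZE = 4096  # hidden size of Llama-2-7B
RANK = 8            # LoRA rank listed in the table above

lora = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
full = full_param_count(HIDDEN_SIZE, HIDDEN_SIZE)

if __name__ == "__main__":
    print(f"LoRA parameters per matrix: {lora:,}")
    print(f"Full parameters per matrix: {full:,}")
    print(f"Reduction factor:           {full // lora}x")
```

Per adapted matrix, the rank-8 update trains 65,536 parameters instead of 16,777,216, a 256x reduction.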

	## Citation
	If you find this project useful, please consider citing:

	```
	@article{tan2024lloco,
	title={LLoCO: Learning Long Contexts Offline},
	author={Tan, Sijun and Li, Xiuyu and Patil, Shishir and Wu, Ziyang and Zhang, Tianjun and Keutzer, Kurt and Gonzalez, Joseph E and Popa, Raluca Ada},
	journal={arXiv preprint arXiv:2404.07979},
	year={2024}
	}
	```

	## Evaluation
	See [LLoCO: Learning Long Contexts Offline](https://arxiv.org/abs/2404.07979) for evaluation results on long-context tasks such as long-document question answering and summarization.