Tags: Text Generation · Transformers · PyTorch · longllama · code · text-generation-inference · custom_code · Eval Results
syzymon committed
Commit bea3280
1 Parent(s): 76ac8b7

Update README.md

Files changed (1): README.md (+8, -1)
README.md CHANGED

@@ -34,6 +34,9 @@ model-index:
 <div align="center">
 
 
+<p align="center" width="100%">
+<img src="https://raw.githubusercontent.com/CStanKonrad/long_llama/main/assets/results.png" alt="LongLLaMA" style="width: 70%; min-width: 300px; display: block; margin: auto;">
+</p>
 
 
 
@@ -69,13 +72,13 @@ model-index:
 
 </div>
 
+
 ## TLDR
 This repository contains the research preview of **LongLLaMA, a large language model capable of handling long contexts of 256k tokens or even more**.
 
 LongLLaMA is built upon the foundation of [OpenLLaMA](https://github.com/openlm-research/open_llama) and fine-tuned using the [Focused Transformer (FoT)](https://arxiv.org/abs/2307.03170) method.
 LongLLaMA Code is built upon the foundation of [Code Llama](https://huggingface.co/codellama/CodeLlama-7b-hf).
 
-
 ## Overview
 
 ### Base models
@@ -98,6 +101,10 @@ with three layers used for context extension. **Crucially, LongLLaMA is able to
 
 </div>
 
+## Results
+
+
+
 
 ## Usage
 
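This commit only touches the README's presentation: it adds the results figure near the top and a `## Results` heading before `## Usage`. For readers checking out the model from this commit, here is a minimal sketch of loading a LongLLaMA checkpoint with Transformers. Treat it as an illustration, not the model card's official snippet: the checkpoint id `syzymon/long_llama_3b` is an assumption (this repo may be a different variant), and `trust_remote_code=True` reflects the repo's `custom_code` tag.

```python
# Minimal sketch: loading a LongLLaMA checkpoint with Hugging Face Transformers.
# ASSUMPTION: the checkpoint id below is illustrative; substitute the actual repo id.
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer

MODEL_ID = "syzymon/long_llama_3b"  # assumed checkpoint id

tokenizer = LlamaTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float32,
    trust_remote_code=True,  # longllama ships custom modeling code (custom_code tag)
)

prompt = "My name is Julien and I like to"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

A long document can be passed in the same way; per the TLDR above, the FoT fine-tuning is what lets the model make use of contexts far beyond those seen in standard training.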