yinsong1986 committed
Commit 66d630e
Parent: edaa6d1

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -5,13 +5,13 @@ inference: false
 
 # MistralLite Model
 
-MistralLite is a fine-tuned [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) language model, with enhanced capabilities for processing long contexts (up to 36K tokens). By utilizing an adapted Rotary Embedding and sliding window during fine-tuning, MistralLite is able to **perform significantly better on several long-context retrieval and question-answering tasks**, while keeping the simple model structure of the original model. MistralLite is useful for applications such as long-context line and topic retrieval, summarization, and question answering. MistralLite can be deployed on a single AWS `g5.2x` instance with a SageMaker [Hugging Face Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) endpoint, making it suitable for applications that require high performance in resource-constrained environments. You can also serve the MistralLite model directly using TGI Docker containers. MistralLite also supports other serving options such as [vLLM](https://github.com/vllm-project/vllm), and you can use MistralLite in Python with the [HuggingFace transformers](https://huggingface.co/docs/transformers/index) and [FlashAttention-2](https://github.com/Dao-AILab/flash-attention) libraries.
+MistralLite is a fine-tuned [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) language model, with enhanced capabilities for processing long contexts (up to 32K tokens). By utilizing an adapted Rotary Embedding and sliding window during fine-tuning, MistralLite is able to **perform significantly better on several long-context retrieval and question-answering tasks**, while keeping the simple model structure of the original model. MistralLite is useful for applications such as long-context line and topic retrieval, summarization, and question answering. MistralLite can be deployed on a single AWS `g5.2x` instance with a SageMaker [Hugging Face Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) endpoint, making it suitable for applications that require high performance in resource-constrained environments. You can also serve the MistralLite model directly using TGI Docker containers. MistralLite also supports other serving options such as [vLLM](https://github.com/vllm-project/vllm), and you can use MistralLite in Python with the [HuggingFace transformers](https://huggingface.co/docs/transformers/index) and [FlashAttention-2](https://github.com/Dao-AILab/flash-attention) libraries.
 
 MistralLite evolves from [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1), and their similarities and differences are summarized below:
-|Model|Fine-tuned on long contexts| Quantization | Max context length| RotaryEmbedding adaptation| Sliding Window Size|
+|Model|Fine-tuned on long contexts| Max context length| RotaryEmbedding adaptation| Sliding Window Size|
 |----------|-------------:|-------------:|------------:|-----------:|-----------:|
-| Mistral-7B-v0.1 | No | No | 36K | rope_theta = 10000 | 4096 |
-| MistralLite | Yes | No | 36K | **rope_theta = 1000000** | **16384** |
+| Mistral-7B-v0.1 | No | 32K | rope_theta = 10000 | 4096 |
+| MistralLite | Yes | 32K | **rope_theta = 1000000** | **16384** |
 
 ## Motivation of Developing MistralLite
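For the Python route mentioned in the paragraph above (HuggingFace transformers plus FlashAttention-2), a minimal sketch follows. It assumes the model is published on the Hub as `amazon/MistralLite` and uses an OpenAssistant-style `<|prompter|>`/`<|assistant|>` prompt template; neither detail appears in this diff, so check the model card before relying on them.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub id is an assumption -- verify against the model card.
model_id = "amazon/MistralLite"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    # Requires the flash-attn package and a compatible GPU; on older
    # transformers releases this flag was use_flash_attention_2=True.
    attn_implementation="flash_attention_2",
    device_map="auto",
)

# Assumed prompt template; the long-context gains come from the fine-tuned
# weights, so generation is invoked the same way as for base Mistral-7B-v0.1.
prompt = "<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=400, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Using `bfloat16` together with FlashAttention-2 keeps memory usage low enough for long inputs on a single GPU, which matches the resource-constrained deployment story above.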
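The `rope_theta` and sliding-window values in the comparison table are ordinary fields of the Mistral model config, so the difference between the two models can be checked programmatically. A small sketch, again assuming the `amazon/MistralLite` Hub id; the expected values are the ones from the table.

```python
from transformers import AutoConfig

base = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
lite = AutoConfig.from_pretrained("amazon/MistralLite")  # assumed Hub id

# Per the comparison table: rope_theta 10000 -> 1000000,
# sliding window 4096 -> 16384.
print("base:", base.rope_theta, base.sliding_window)
print("lite:", lite.rope_theta, lite.sliding_window)
```

A larger `rope_theta` slows the rotation of the rotary position embedding, so relative positions far beyond the base model's 4096-token sliding window remain distinguishable; that adaptation, plus the wider window, is what the table is summarizing.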