---
license: apache-2.0
language:
- en
library_name: elm
tags:
- elm
pipeline_tag: text-generation
---
# SliceX AI™ ELM (Efficient Language Models)
**ELM** (which stands for **E**fficient **L**anguage **M**odels) is the first in a series of cutting-edge language models from [SliceX AI](https://slicex.ai), designed to achieve best-in-class performance in terms of _quality_, _throughput_, and _memory_.

<div align="center">
  <img src="elm-rambutan.png" width="256"/>
</div>

ELM is designed to be a modular and customizable family of neural networks that are highly efficient and performant. Today we are sharing the first version in this series: **ELM-v0.1** models (named _Rambutan_). 

_Model:_ ELM introduces a new type of _(de)-composable LLM model architecture_, along with the algorithmic optimizations required to learn (train) and run (inference) these models. At a high level, we train a single ELM model in a self-supervised manner during the pre-training phase, but once trained, the ELM model can be sliced in many ways to fit different user/task needs. These optimizations can be applied during pre-training, fine-tuning, or both.

_Fast Inference with Customization:_ Once trained, the ELM model architecture permits flexible inference strategies at runtime, depending on deployment needs. For instance, the ELM model can be _decomposed_ into smaller slices, i.e., smaller (or larger) models can be extracted from the original model to create multiple inference endpoints. Alternatively, the original (single) ELM model can be loaded _as is_ for inference, and different slices within it can be queried directly to power faster inference. This gives users an additional level of flexibility to make compute/memory tradeoffs depending on their application and runtime needs.
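To make the (de)-composition idea above a bit more concrete, the toy sketch below extracts a narrower layer from a trained one by reusing a subset of its weights. This is purely illustrative and uses plain PyTorch; it is not ELM's actual slicing procedure.

```python
import torch
import torch.nn as nn

# Toy illustration of width-slicing: keep a fraction of a trained layer's
# output units to obtain a smaller model that reuses the original weights.
# This only conveys the general idea of decomposing a trained model into
# slices; it is NOT the actual ELM slicing algorithm.
full = nn.Linear(in_features=16, out_features=16)   # stand-in for a trained block
fraction = 0.75                                      # analogous to the elm-0.75 slice
k = int(full.out_features * fraction)                # keep 12 of 16 output units

sliced = nn.Linear(in_features=16, out_features=k)
with torch.no_grad():
    sliced.weight.copy_(full.weight[:k, :])          # reuse a subset of trained weights
    sliced.bias.copy_(full.bias[:k])

x = torch.randn(1, 16)
print(full(x).shape, sliced(x).shape)                # torch.Size([1, 16]) torch.Size([1, 12])
```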

- **Blog:** [Medium](https://medium.com/sujith-ravi/introducing-elm-efficient-customizable-privacy-preserving-llms-cea56e4f727d)

- **Github:** https://github.com/slicex-ai/elm

- **Demo** (try it out): https://huggingface.co/spaces/slicexai/elm-demo-v1

- **HuggingFace** (access ELM Model cards, code & app from HF): https://huggingface.co/slicexai

## ELM-v0.1 Model Release
This repository contains code to run our ELM models. The current ELM model, `elm-v0.1` (named _Rambutan_), was pre-trained (using an intermediate checkpoint) and then instruction fine-tuned for downstream tasks.

ELM models in this repository (in the `models` folder) come in three sizes (`elm-1.0`, `elm-0.75`, and `elm-0.25`). **All of these slices are extracted from the same fine-tuned ELM checkpoint for inference** and support the following use cases:
- news_classification
- toxicity_detection
- news_content_generation
- news_summarization

**NOTE: The ELM-v0.1 release is an early version, fine-tuned from an intermediate pre-trained checkpoint, with no KV caching, decoding optimizations, or quantization applied.**

## Setup ELM
### Download ELM repo
```bash
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/slicexai/elm-v0.1
```
### Installation
```bash
cd elm-v0.1
pip install -r requirements.txt
```



## Download ELM task-specific model checkpoints
### Install git-lfs
```bash
sudo apt-get install git-lfs
git lfs install
```
On macOS, replace `sudo apt-get install git-lfs` with `brew install git-lfs`.

(Optional) To install git-lfs without sudo:
```bash
wget https://github.com/git-lfs/git-lfs/releases/download/v3.2.0/git-lfs-linux-amd64-v3.2.0.tar.gz
tar -xzf git-lfs-linux-amd64-v3.2.0.tar.gz
PATH=$PATH:/<absolute-path>/git-lfs-3.2.0/
git lfs install
```
### Download ELM checkpoints

To download all checkpoints:
```bash
git lfs pull
```
**NOTE:** Please allow a few minutes for all model checkpoints to download.

To download `elm-1.0` model checkpoints individually:
```bash
git lfs pull -I elm-1.0_news_classification/ckpt.pt
git lfs pull -I elm-1.0_toxicity_detection/ckpt.pt
git lfs pull -I elm-1.0_news_content_generation/ckpt.pt
git lfs pull -I elm-1.0_news_summarization/ckpt.pt
```

To download `elm-0.75` model checkpoints individually:
```bash
git lfs pull -I elm-0.75_news_classification/ckpt.pt
git lfs pull -I elm-0.75_toxicity_detection/ckpt.pt
git lfs pull -I elm-0.75_news_content_generation/ckpt.pt
git lfs pull -I elm-0.75_news_summarization/ckpt.pt
```

To download `elm-0.25` model checkpoints individually:
```bash
git lfs pull -I elm-0.25_news_classification/ckpt.pt
git lfs pull -I elm-0.25_toxicity_detection/ckpt.pt
git lfs pull -I elm-0.25_news_content_generation/ckpt.pt
```
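If you would rather script these per-checkpoint downloads, a small helper like the one below issues the same `git lfs pull -I` commands from the repository root. The paths are copied from the commands above; trim the list to the slices and tasks you need. This is a convenience sketch, not part of the official tooling.

```python
# Convenience sketch: pull selected task-specific checkpoints by shelling out
# to the same `git lfs pull -I <path>` commands shown above. Run this from the
# root of the cloned elm-v0.1 repository.
import subprocess

checkpoints = [
    "elm-0.75_news_classification/ckpt.pt",
    "elm-0.75_toxicity_detection/ckpt.pt",
    "elm-0.75_news_content_generation/ckpt.pt",
    "elm-0.75_news_summarization/ckpt.pt",
]

for path in checkpoints:
    subprocess.run(["git", "lfs", "pull", "-I", path], check=True)
```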




## How to use: Run ELM on a sample task (e.g., news classification)
```bash
python run.py <elm-model-directory>
# e.g., python run.py elm-0.75_news_classification
```
Prompts for the specific tasks can be found in the corresponding checkpoint directory. See an example below from `models/elm-0.75_news_classification/example_prompts.json`.
```json
{
    "inputs": ["GM May Close Plant in Europe  DETROIT (Reuters) - General Motors Corp. &lt;A HREF=\"http://www.investor.reuters.com/FullQuote.aspx?ticker=GM.N target=/stocks/quickinfo/fullquote\"&gt;GM.N&lt;/A&gt; will likely  cut some jobs in Europe and may close a plant there as part of  a restructuring plan under development to try to return the  region to profitability, the U.S. automaker said on Wednesday."],
    "template": "[INST]Below is a news article. Please classify it under one of the following classes (World, Business, Sports, Sci/Tech). Please format your response as a JSON payload.\n\n### Article: {input}\n\n### JSON Response:[/INST]"
}
```
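The prompt sent to the model is simply the `template` with the article text substituted for the `{input}` placeholder. The snippet below shows that substitution for illustration; `run.py` performs this step for you, and its exact implementation may differ.

```python
# Illustrative only: build the final prompt from example_prompts.json by
# substituting the article text for the {input} placeholder in the template.
import json

with open("models/elm-0.75_news_classification/example_prompts.json") as f:
    example = json.load(f)

# A plain string replace avoids surprises if an article itself contains braces.
prompts = [example["template"].replace("{input}", article) for article in example["inputs"]]
print(prompts[0])
```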

Running the above command returns the following response:

```json
{
    "prompt": "[INST]Below is a news article. Please classify it under one of the following classes (World, Business, Sports, Sci/Tech). Please format your response as a JSON payload.\n\n### Article: GM May Close Plant in Europe  DETROIT (Reuters) - General Motors Corp. &lt;A HREF=\"http://www.investor.reuters.com/FullQuote.aspx?ticker=GM.N target=/stocks/quickinfo/fullquote\"&gt;GM.N&lt;/A&gt; will likely  cut some jobs in Europe and may close a plant there as part of  a restructuring plan under development to try to return the  region to profitability, the U.S. automaker said on Wednesday.\n\n### JSON Response:[/INST]",
    "response": "{'text_label': 'Business'}"
}
```
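
Note that the `response` field here is a Python-style dict rendered as a string (single quotes), not nested JSON. Assuming the output format shown above, the predicted label can be recovered like this:

```python
# Assumes the output format shown above: a JSON object whose "response" field
# is a Python-literal dict string such as "{'text_label': 'Business'}".
import ast
import json

raw = '''{"prompt": "...", "response": "{'text_label': 'Business'}"}'''
result = json.loads(raw)
label = ast.literal_eval(result["response"])["text_label"]
print(label)  # Business
```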