mlabonne committed · verified
Commit d300d6a · 1 Parent(s): ab26029

Update README.md

Files changed (1)
  1. README.md +48 -16
README.md CHANGED
@@ -74,7 +74,7 @@ tags:
 
 LFM2 is a new generation of hybrid models developed by [Liquid AI](https://www.liquid.ai/), specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency.
 
-We're releasing the weights of three post-trained checkpoints with 350M, 700M, and 1.2B parameters. They provide the following key features to create AI-powered edge applications:
+We're releasing the weights of four post-trained checkpoints with 350M, 700M, 1.2B, and 2.6B parameters. They provide the following key features to create AI-powered edge applications:
 
 * **Fast training & inference** – LFM2 achieves 3x faster training compared to its previous generation. It also benefits from 2x faster decode and prefill speed on CPU compared to Qwen3.
 * **Best performance** – LFM2 outperforms similarly-sized models across multiple benchmark categories, including knowledge, mathematics, instruction following, and multilingual capabilities.
@@ -89,15 +89,15 @@ Due to their small size, **we recommend fine-tuning LFM2 models on narrow use ca
 They are particularly suited for agentic tasks, data extraction, RAG, creative writing, and multi-turn conversations.
 However, we do not recommend using them for tasks that are knowledge-intensive or require programming skills.
 
-| Property | [**LFM2-350M**](https://huggingface.co/LiquidAI/LFM2-350M) | [**LFM2-700M**](https://huggingface.co/LiquidAI/LFM2-700M) | [**LFM2-1.2B**](https://huggingface.co/LiquidAI/LFM2-1.2B) |
-| ------------------- | ----------------------------- | ----------------------------- | ----------------------------- |
-| **Parameters** | 354,483,968 | 742,489,344 | 1,170,340,608 |
-| **Layers** | 16 (10 conv + 6 attn) | 16 (10 conv + 6 attn) | 16 (10 conv + 6 attn) |
-| **Context length** | 32,768 tokens | 32,768 tokens | 32,768 tokens |
-| **Vocabulary size** | 65,536 | 65,536 | 65,536 |
-| **Precision** | bfloat16 | bfloat16 | bfloat16 |
-| **Training budget** | 10 trillion tokens | 10 trillion tokens | 10 trillion tokens |
-| **License** | LFM Open License v1.0 | LFM Open License v1.0 | LFM Open License v1.0 |
+| Property | [**LFM2-350M**](https://huggingface.co/LiquidAI/LFM2-350M) | [**LFM2-700M**](https://huggingface.co/LiquidAI/LFM2-700M) | [**LFM2-1.2B**](https://huggingface.co/LiquidAI/LFM2-1.2B) | [**LFM2-2.6B**](https://huggingface.co/LiquidAI/LFM2-2.6B) |
+| ------------------- | ----------------------------- | ----------------------------- | ----------------------------- | ----------------------------- |
+| **Parameters** | 354,483,968 | 742,489,344 | 1,170,340,608 | 2,569,272,320 |
+| **Layers** | 16 (10 conv + 6 attn) | 16 (10 conv + 6 attn) | 16 (10 conv + 6 attn) | 30 (22 conv + 8 attn) |
+| **Context length** | 32,768 tokens | 32,768 tokens | 32,768 tokens | 32,768 tokens |
+| **Vocabulary size** | 65,536 | 65,536 | 65,536 | 65,536 |
+| **Precision** | bfloat16 | bfloat16 | bfloat16 | bfloat16 |
+| **Training budget** | 10 trillion tokens | 10 trillion tokens | 10 trillion tokens | 10 trillion tokens |
+| **License** | LFM Open License v1.0 | LFM Open License v1.0 | LFM Open License v1.0 | LFM Open License v1.0 |
 
 **Supported languages**: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.
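The parameter counts in the table are exact totals, so they double as a quick sanity check after downloading a checkpoint. Below is a minimal sketch, not part of this commit, assuming the checkpoint loads with a recent `transformers`:

```python
# Sketch (not from the commit): verify a checkpoint's parameter count
# against the "Parameters" row of the table above.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2-350M")
total = sum(p.numel() for p in model.parameters())
print(f"{total:,}")  # expected: 354,483,968
```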
@@ -152,13 +152,11 @@ The candidate with ID 12345 is currently in the "Interview Scheduled" stage for
 
 ## 🏃 How to run LFM2
 
-You can run LFM2 with transformers and llama.cpp. vLLM support is coming.
-
 ### 1. Transformers
 
-To run LFM2, you need to install Hugging Face [`transformers`](https://github.com/huggingface/transformers) v4.55 or more recent as follows:
+To run LFM2, you need to install Hugging Face [`transformers`](https://github.com/huggingface/transformers) v4.55 or a more recent version as follows:
 
-```python
+```bash
 pip install -U transformers
 ```
 
@@ -168,7 +166,7 @@ Here is an example of how to generate an answer with transformers in Python:
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
 # Load model and tokenizer
-model_id = "LiquidAI/LFM2-700M"
+model_id = "LiquidAI/LFM2-1.2B"
 model = AutoModelForCausalLM.from_pretrained(
     model_id,
     device_map="auto",
@@ -206,7 +204,41 @@ print(tokenizer.decode(output[0], skip_special_tokens=False))
 
 You can directly run and test the model with this [Colab notebook](https://colab.research.google.com/drive/1_q3jQ6LtyiuPzFZv7Vw8xSfPU5FwkKZY?usp=sharing).
 
-### 2. Llama.cpp
+### 2. vLLM
+
+You need to install [`vLLM`](https://github.com/vllm-project/vllm) v0.10.2 or a more recent version as follows:
+
+```bash
+uv pip install vllm==0.10.2 --extra-index-url https://wheels.vllm.ai/0.10.2/ --torch-backend=auto
+```
+
+Here is an example of how to use it for inference:
+
+```python
+from vllm import LLM, SamplingParams
+
+prompts = [
+    "What is C. elegans?",
+    "Say hi in JSON format",
+    "Define AI in Spanish"
+]
+sampling_params = SamplingParams(
+    temperature=0.3,
+    min_p=0.15,
+    repetition_penalty=1.05
+)
+
+llm = LLM(model="LiquidAI/LFM2-700M")
+
+outputs = llm.generate(prompts, sampling_params)
+
+for output in outputs:
+    prompt = output.prompt
+    generated_text = output.outputs[0].text
+    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
+```
+
+### 3. llama.cpp
 
 You can run LFM2 with llama.cpp using its [GGUF checkpoint](https://huggingface.co/LiquidAI/LFM2-700M-GGUF). Find more information in the model card.
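Beyond the offline `LLM` API added above, the same checkpoint can also be served over HTTP with vLLM's OpenAI-compatible server. A hedged sketch, not part of this commit (the server defaults to port 8000; flags may vary by vLLM version):

```bash
# Sketch (not from the commit): serve the model with an OpenAI-compatible API.
vllm serve LiquidAI/LFM2-700M

# In another terminal, query the chat completions endpoint:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "LiquidAI/LFM2-700M", "messages": [{"role": "user", "content": "What is C. elegans?"}]}'
```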
 
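For the llama.cpp path referenced in the last hunk, a minimal invocation could look like the following. This is a sketch, not part of the commit, and assumes a llama.cpp build recent enough to support both the LFM2 architecture and the `-hf` flag for pulling a GGUF directly from a Hugging Face repo:

```bash
# Sketch (not from the commit): fetch the GGUF from Hugging Face and run one prompt.
llama-cli -hf LiquidAI/LFM2-700M-GGUF -p "What is C. elegans?"
```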