yi-01-ai committed
Commit 1a5af72
Parent: ba5924d

Auto Sync from git://github.com/01-ai/Yi.git/commit/135a9210d8028e6c48a224ec3eef6a00db6e425b

Files changed (1)
  1. README.md +42 -9
README.md CHANGED
@@ -120,7 +120,7 @@ pipeline_tag: text-generation
120
 
121
  - For English language capability, the Yi series models ranked 2nd (just behind GPT-4), outperforming other LLMs (such as LLaMA2-chat-70B, Claude 2, and ChatGPT) on the [AlpacaEval Leaderboard](https://tatsu-lab.github.io/alpaca_eval/) in Dec 2023.
122
 
123
- - For Chinese language capability, the Yi series models landed in 2nd place (following GPT4), surpassing other LLMs (such as Baidu ERNIE, Qwen, and Baichuan) on the [SuperCLUE](https://www.superclueai.com/) in Oct 2023.
123
+ - For Chinese language capability, the Yi series models landed in 2nd place (following GPT-4), surpassing other LLMs (such as Baidu ERNIE, Qwen, and Baichuan) on the [SuperCLUE](https://www.superclueai.com/) in Oct 2023.
124
 
125
 - 🙏 (Credits to LLaMA) Thanks to the Transformer and LLaMA open-source communities for reducing the effort required to build from scratch and enabling the use of the same tools within the AI ecosystem. If you're interested in Yi's adoption of the LLaMA architecture and license usage policy, see [Yi's relation with LLaMA](https://github.com/01-ai/Yi/blob/main/docs/yi_relation_llama.md).
126
 
@@ -130,7 +130,7 @@ pipeline_tag: text-generation
130
 
131
  Yi models come in multiple sizes and cater to different use cases. You can also fine-tune Yi models to meet your specific requirements.
132
 
133
- For detailed deployment requirements, see [hardware requirements](https://github.com/01-ai/Yi/blob/main/docs/deployment.md#hardware-requirements).
133
+ If you want to deploy Yi models, see [software and hardware requirements](https://github.com/01-ai/Yi/blob/main/docs/deployment.md#hardware-requirements).
134
 
135
  ### Chat models
136
 
@@ -296,15 +296,14 @@ If you want to chat with Yi with more customizable options (e.g., system prompt,
296
 
297
  ### pip
298
 
299
- This tutorial guides you through every step of running Yi (Yi-34B-Chat) locally and then performing inference.
299
+ This tutorial guides you through every step of running **Yi-34B-Chat locally on an A800 (80G)** and then performing inference.
300
 
301
 #### Step 0: Prerequisites
302
-
303
- - This tutorial assumes you are running the **Yi-34B-Chat** with an **A800 (80G)** GPU.
304
- - For detailed deployment requirements to run Yi models, see [hardware requirements]( https://github.com/01-ai/Yi/blob/main/docs/deployment.md).
305
 
306
 - Make sure Python 3.10 or a later version is installed.
307
 
305
+ - If you want to run other Yi models, see [software and hardware requirements](https://github.com/01-ai/Yi/blob/main/docs/deployment.md).
306
+
308
  #### Step 1: Prepare your environment
309
 
310
  To set up the environment and install the required packages, execute the following command.
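> Editor's note: the command itself falls outside this hunk. As a rough sketch only, under the assumption that setup follows the usual clone-and-install pattern (the `requirements.txt` file name is an assumption, not shown in this diff):

```bash
# Sketch only: clone the Yi repo and install its Python dependencies.
# Assumes Python 3.10+ (per Step 0) and a requirements.txt at the repo root.
git clone https://github.com/01-ai/Yi.git
cd Yi
pip install -r requirements.txt
```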
@@ -372,7 +371,7 @@ You can perform inference with Yi chat or base models as below.
372
 
373
  ##### Perform inference with Yi base model
374
 
375
- The steps are similar to [Run Yi chat model](#run-yi-chat-model).
374
+ The steps are similar to [pip - Perform inference with Yi chat model](#perform-inference-with-yi-chat-model).
376
 
377
  You can use the existing file [`text_generation.py`](https://github.com/01-ai/Yi/tree/main/demo).
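> Editor's note: for illustration, a plausible invocation is sketched below. The `--model` flag is inferred from the Docker note later in this diff ("set `--model <your-model-mount-path>`"); any other flags the script accepts are not documented here.

```bash
# Sketch: run base-model text generation against a local checkpoint.
python demo/text_generation.py --model <your-model-path>
```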
378
 
@@ -394,11 +393,45 @@ Then you can see an output similar to the one below. 🥳
394
 
395
  </details>
396
 
396
+ ### Docker
397
+
398
+ This tutorial guides you through every step of running **Yi-34B-Chat on an A800 GPU** locally and then performing inference.
399
+
400
+ #### Step 0: Prerequisites
401
+
402
+ - Make sure you've installed [Docker](https://docs.docker.com/engine/install/?open_in_browser=true) and [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
403
+
404
+ #### Step 1: Start Docker
405
+
406
+ ```bash
407
+ docker run -it --gpus all \
408
+     -v <your-model-path>:/models \
409
+     ghcr.io/01-ai/yi:latest
410
+ ```
411
+
412
+ Alternatively, you can pull the Yi Docker image from `registry.lingyiwanwu.com/ci/01-ai/yi:latest`.
413
+
414
+ #### Step 2: Perform inference
415
+
416
+ You can perform inference with Yi chat or base models as below.
417
+
418
+ ##### Perform inference with Yi chat model
419
+
420
+ The steps are similar to [pip - Perform inference with Yi chat model](#perform-inference-with-yi-chat-model).
421
+
422
+ **Note** that the only difference is to set `model_path = '<your-model-mount-path>'` instead of `model_path = '<your-model-path>'`.
423
+
424
+ ##### Perform inference with Yi base model
425
+
426
+ The steps are similar to [pip - Perform inference with Yi base model](#perform-inference-with-yi-base-model).
427
+
428
+ **Note** that the only difference is to set `--model <your-model-mount-path>` instead of `--model <your-model-path>`.
429
+
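> Editor's note: to make the mount-path note above concrete, here is a minimal sketch of chat inference inside the container. It assumes the image ships `transformers` and follows the standard Hugging Face chat-template workflow; it is not the repo's exact demo code.

```python
# Minimal sketch, not the repo's exact demo code.
# model_path points at the container mount from Step 1 (-v <your-model-path>:/models).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/models"  # i.e. '<your-model-mount-path>'

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_path, device_map="auto", torch_dtype="auto"
).eval()

# Build a single-turn chat prompt via the tokenizer's chat template.
messages = [{"role": "user", "content": "hi"}]
input_ids = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
)
output_ids = model.generate(input_ids.to(model.device), max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```
397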
  ### Run Yi with llama.cpp
398
 
399
 If you have limited resources, you can try [llama.cpp](https://github.com/ggerganov/llama.cpp) or [ollama](https://ollama.ai/) (especially for Chinese users) to run Yi models locally in a few minutes.
400
 
401
- For a step-by-step tutorial,, see [Run Yi with llama.cpp](https://github.com/01-ai/Yi/edit/main/docs/yi_llama.cpp.md).
434
+ For a step-by-step tutorial, see [Run Yi with llama.cpp](https://github.com/01-ai/Yi/edit/main/docs/yi_llama.cpp.md).
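> Editor's note: as a rough illustration only (the quantized GGUF file name below is hypothetical, and the real steps are in the linked tutorial), running a Yi model with llama.cpp looks something like:

```bash
# Illustrative sketch: the quantized model file name is assumed, not from the docs.
./main -m yi-chat.Q4_0.gguf -p "Hello" -n 128
```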
402
 
403
  ### Web demo
404
 
@@ -411,7 +444,7 @@ You can build a web UI demo for Yi **chat** models (note that Yi base models are
411
  Step 3. To start a web service locally, run the following command.
412
 
413
  ```bash
414
- python demo/web_demo.py --checkpoint-path <your-model-path>
447
+ python demo/web_demo.py -c <your-model-path>
415
  ```
416
 
417
  You can access the web UI by entering the address provided in the console into your browser.
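> Editor's note: for example, using the new `-c` flag introduced in this diff (the checkpoint path below is hypothetical):

```bash
# Hypothetical local path; substitute the directory you downloaded the chat model to.
python demo/web_demo.py -c /path/to/Yi-34B-Chat
```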
 