Commit 72605cf by writinwaters · 1 parent: c881efa
[doc] Updated document on max map count (#1037)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Documentation Update
- README.md +1 -1
- README_ja.md +1 -1
- README_zh.md +1 -1
- docs/guides/max_map_count.md +0 -71
- docs/{quickstart.md → quickstart.mdx} +98 -23
- docs/references/faq.md +14 -18
README.md
CHANGED
@@ -113,7 +113,7 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
|
|
113 |
|
114 |
### 🚀 Start up the server
|
115 |
|
116 |
-
1. Ensure `vm.max_map_count` >= 262144
|
117 |
|
118 |
> To check the value of `vm.max_map_count`:
|
119 |
>
|
|
|
113 |
|
114 |
### 🚀 Start up the server
|
115 |
|
116 |
+
1. Ensure `vm.max_map_count` >= 262144:
|
117 |
|
118 |
> To check the value of `vm.max_map_count`:
|
119 |
>
|
README_ja.md
CHANGED
@@ -95,7 +95,7 @@
|
|
95 |
|
96 |
### 🚀 サーバーを起動
|
97 |
|
98 |
-
1. `vm.max_map_count` >= 262144
|
99 |
|
100 |
> `vm.max_map_count` の値をチェックするには:
|
101 |
>
|
|
|
95 |
|
96 |
### 🚀 サーバーを起動
|
97 |
|
98 |
+
1. `vm.max_map_count` >= 262144 であることを確認する:
|
99 |
|
100 |
> `vm.max_map_count` の値をチェックするには:
|
101 |
>
|
README_zh.md
CHANGED
@@ -94,7 +94,7 @@
|
|
94 |
|
95 |
### 🚀 启动服务器
|
96 |
|
97 |
-
1. 确保 `vm.max_map_count` 不小于 262144
|
98 |
|
99 |
> 如需确认 `vm.max_map_count` 的大小:
|
100 |
>
|
|
|
94 |
|
95 |
### 🚀 启动服务器
|
96 |
|
97 |
+
1. 确保 `vm.max_map_count` 不小于 262144:
|
98 |
|
99 |
> 如需确认 `vm.max_map_count` 的大小:
|
100 |
>
|
docs/guides/max_map_count.md
DELETED
@@ -1,71 +0,0 @@
|
|
1 |
-
---
|
2 |
-
sidebar_position: 7
|
3 |
-
slug: /max_map_count
|
4 |
-
---
|
5 |
-
|
6 |
-
# Update vm.max_map_count
|
7 |
-
|
8 |
-
## Linux
|
9 |
-
|
10 |
-
To check the value of `vm.max_map_count`:
|
11 |
-
|
12 |
-
```bash
|
13 |
-
$ sysctl vm.max_map_count
|
14 |
-
```
|
15 |
-
|
16 |
-
Reset `vm.max_map_count` to a value at least 262144 if it is not.
|
17 |
-
|
18 |
-
```bash
|
19 |
-
# In this case, we set it to 262144:
|
20 |
-
$ sudo sysctl -w vm.max_map_count=262144
|
21 |
-
```
|
22 |
-
|
23 |
-
This change will be reset after a system reboot. To ensure your change remains permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:
|
24 |
-
|
25 |
-
```bash
|
26 |
-
vm.max_map_count=262144
|
27 |
-
```
|
28 |
-
|
29 |
-
## Mac
|
30 |
-
|
31 |
-
```bash
|
32 |
-
$ screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty
|
33 |
-
$ sysctl -w vm.max_map_count=262144
|
34 |
-
```
|
35 |
-
To exit the screen session, type Ctrl a d.
|
36 |
-
|
37 |
-
## Windows and macOS with Docker Desktop
|
38 |
-
|
39 |
-
The vm.max_map_count setting must be set via docker-machine:
|
40 |
-
|
41 |
-
```bash
|
42 |
-
$ docker-machine ssh
|
43 |
-
$ sudo sysctl -w vm.max_map_count=262144
|
44 |
-
```
|
45 |
-
|
46 |
-
## Windows with Docker Desktop WSL 2 backend
|
47 |
-
|
48 |
-
To manually set it every time you reboot, you must run the following commands in a command prompt or PowerShell window every time you restart Docker:
|
49 |
-
|
50 |
-
```bash
|
51 |
-
$ wsl -d docker-desktop -u root
|
52 |
-
$ sysctl -w vm.max_map_count=262144
|
53 |
-
```
|
54 |
-
If you are on these versions of WSL and you do not want to have to run those commands every time you restart Docker, you can globally change every WSL distribution with this setting by modifying your %USERPROFILE%\.wslconfig as follows:
|
55 |
-
|
56 |
-
```bash
|
57 |
-
[wsl2]
|
58 |
-
kernelCommandLine = "sysctl.vm.max_map_count=262144"
|
59 |
-
```
|
60 |
-
This will cause all WSL2 VMs to have that setting assigned when they start.
|
61 |
-
|
62 |
-
If you are on Windows 11, or Windows 10 version 22H2 and have installed the Microsoft Store version of WSL, you can modify the /etc/sysctl.conf within the "docker-desktop" WSL distribution, perhaps with commands like this:
|
63 |
-
|
64 |
-
```bash
|
65 |
-
$ wsl -d docker-desktop -u root
|
66 |
-
$ vi /etc/sysctl.conf
|
67 |
-
```
|
68 |
-
and appending a line which reads:
|
69 |
-
```bash
|
70 |
-
vm.max_map_count = 262144
|
71 |
-
```
|
|
docs/{quickstart.md → quickstart.mdx}
RENAMED
@@ -4,6 +4,8 @@ slug: /
|
|
4 |
---
|
5 |
|
6 |
# Quick start
|
|
|
|
|
7 |
|
8 |
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. When integrated with LLMs, it is capable of providing truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.
|
9 |
|
@@ -25,29 +27,102 @@ This quick start guide describes a general process from:
|
|
25 |
|
26 |
## Start up the server
|
27 |
|
28 |
-
This section provides instructions on setting up the RAGFlow server on Linux. If you are on a different operating system, no worries. Most steps are alike.
|
29 |
-
|
30 |
-
|
31 |
-
|
32 |
-
|
33 |
-
|
34 |
-
|
35 |
-
|
36 |
-
|
37 |
-
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
|
42 |
-
|
43 |
-
|
44 |
-
|
45 |
-
|
46 |
-
|
47 |
-
|
48 |
-
|
49 |
-
|
50 |
-
|
51 |
|
52 |
2. Clone the repo:
|
53 |
|
|
|
4 |
---
|
5 |
|
6 |
# Quick start
|
7 |
+
import Tabs from '@theme/Tabs';
|
8 |
+
import TabItem from '@theme/TabItem';
|
9 |
|
10 |
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. When integrated with LLMs, it is capable of providing truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.
|
11 |
|
|
|
27 |
|
28 |
## Start up the server
|
29 |
|
30 |
+
This section provides instructions on setting up the RAGFlow server on Linux. If you are on a different operating system, no worries. Most steps are alike.
|
31 |
+
|
32 |
+
<details>
|
33 |
+
<summary>1. Ensure <code>vm.max_map_count</code> >= 262144:</summary>
|
34 |
+
|
35 |
+
`vm.max_map_count`. This value sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abnormal behaviors, and the system will throw out-of-memory errors when a process reaches the limit.
|
36 |
+
|
37 |
+
RAGFlow v0.7.0 uses Elasticsearch for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
|
38 |
+
|
39 |
+
<Tabs
|
40 |
+
defaultValue="linux"
|
41 |
+
values={[
|
42 |
+
{label: 'Linux', value: 'linux'},
|
43 |
+
{label: 'macOS', value: 'macos'},
|
44 |
+
{label: 'Windows', value: 'windows'},
|
45 |
+
]}>
|
46 |
+
<TabItem value="linux">
|
47 |
+
1.1. Check the value of `vm.max_map_count`:
|
48 |
+
|
49 |
+
```bash
|
50 |
+
$ sysctl vm.max_map_count
|
51 |
+
```
|
52 |
+
|
53 |
+
1.2. If the value is less than 262144, reset `vm.max_map_count` to at least 262144:
|
54 |
+
|
55 |
+
```bash
|
56 |
+
$ sudo sysctl -w vm.max_map_count=262144
|
57 |
+
```
|
58 |
+
|
59 |
+
:::caution WARNING
|
60 |
+
This change will be reset after a system reboot. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
|
61 |
+
:::
|
62 |
+
|
63 |
+
1.3. To ensure your change remains permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:
|
64 |
+
|
65 |
+
```bash
|
66 |
+
vm.max_map_count=262144
|
67 |
+
```
|
68 |
+
</TabItem>
|
69 |
+
<TabItem value="macos">
|
70 |
+
If you are on macOS with Docker Desktop, then you *must* use docker-machine to update `vm.max_map_count`:
|
71 |
+
|
72 |
+
```bash
|
73 |
+
$ docker-machine ssh
|
74 |
+
$ sudo sysctl -w vm.max_map_count=262144
|
75 |
+
```
|
76 |
+
|
77 |
+
:::caution WARNING
|
78 |
+
This change will be reset after a system reboot. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
|
79 |
+
:::
|
80 |
+
</TabItem>
|
81 |
+
<TabItem value="windows">
|
82 |
+
|
83 |
+
#### If you are on Windows with Docker Desktop, then you *must* use docker-machine to set `vm.max_map_count`:
|
84 |
+
|
85 |
+
```bash
|
86 |
+
$ docker-machine ssh
|
87 |
+
$ sudo sysctl -w vm.max_map_count=262144
|
88 |
+
```
|
89 |
+
#### If you are on Windows with Docker Desktop WSL 2 backend, then use docker-desktop to set `vm.max_map_count`:
|
90 |
+
|
91 |
+
1.1. Run the following in WSL:
|
92 |
+
```bash
|
93 |
+
$ wsl -d docker-desktop -u root
|
94 |
+
$ sysctl -w vm.max_map_count=262144
|
95 |
+
```
|
96 |
+
|
97 |
+
:::caution WARNING
|
98 |
+
This change will be reset after you restart Docker. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
|
99 |
+
:::
|
100 |
+
|
101 |
+
1.2. If you do not wish to run those commands each time you restart Docker, you can update your `%USERPROFILE%\.wslconfig` as follows to make the change permanent and global for all WSL distributions:
|
102 |
+
|
103 |
+
```bash
|
104 |
+
[wsl2]
|
105 |
+
kernelCommandLine = "sysctl.vm.max_map_count=262144"
|
106 |
+
```
|
107 |
+
*This causes all WSL2 virtual machines to have that setting assigned when they start.*
|
108 |
+
|
109 |
+
:::note
|
110 |
+
If you are on Windows 11 or Windows 10 version 22H2, and have installed the Microsoft Store version of WSL, you can also update the **/etc/sysctl.conf** within the docker-desktop WSL distribution to keep your change permanent:
|
111 |
+
|
112 |
+
```bash
|
113 |
+
$ wsl -d docker-desktop -u root
|
114 |
+
$ vi /etc/sysctl.conf
|
115 |
+
```
|
116 |
+
|
117 |
+
```bash
|
118 |
+
# Append a line, which reads:
|
119 |
+
vm.max_map_count = 262144
|
120 |
+
```
|
121 |
+
:::
|
122 |
+
</TabItem>
|
123 |
+
</Tabs>
|
124 |
+
|
125 |
+
</details>
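
The Linux check in step 1 can be sketched as a small script. This is a minimal sketch, not part of RAGFlow: the function name `needs_update` and the `REQUIRED` variable are hypothetical, and it assumes a Linux host where `/proc/sys/vm/max_map_count` (the file behind the sysctl key) is readable. It only reports what to do and changes nothing.

```bash
#!/usr/bin/env bash
# Minimum value the guide asks for.
REQUIRED=262144

# needs_update CURRENT REQUIRED
# Hypothetical helper: succeeds (exit 0) when CURRENT is below REQUIRED.
needs_update() {
  local current="$1" required="$2"
  [ "$current" -lt "$required" ]
}

# On Linux, the sysctl key vm.max_map_count is exposed under /proc/sys.
if [ -r /proc/sys/vm/max_map_count ]; then
  current="$(cat /proc/sys/vm/max_map_count)"
  if needs_update "$current" "$REQUIRED"; then
    echo "vm.max_map_count is $current; run: sudo sysctl -w vm.max_map_count=$REQUIRED"
  else
    echo "vm.max_map_count is $current; no change needed"
  fi
else
  echo "vm.max_map_count is not readable on this system"
fi
```

Keeping the comparison in its own function makes it easy to reuse the same check before each server start, since the value set with `sysctl -w` does not survive a reboot.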
|
126 |
|
127 |
2. Clone the repo:
|
128 |
|
docs/references/faq.md
CHANGED
@@ -194,11 +194,7 @@ Ignore this warning and continue. All system warnings can be ignored.
|
|
194 |
|
195 |

|
196 |
|
197 |
-
#### 4.3 Why does
|
198 |
-
|
199 |
-
Parsing requests have to wait in queue due to limited server resources. We are currently enhancing our algorithms and increasing computing power.
|
200 |
-
|
201 |
-
#### 4.4 Why does my document parsing stall at under one percent?
|
202 |
|
203 |

|
204 |
|
@@ -211,7 +207,7 @@ docker logs -f ragflow-server
|
|
211 |
2. Check if the **task_executor.py** process exists.
|
212 |
3. Check if your RAGFlow server can access hf-mirror.com or huggingface.com.
|
213 |
|
214 |
-
#### 4.
|
215 |
|
216 |
If your RAGFlow is deployed *locally*, the parsing process is likely killed due to insufficient RAM. Try increasing your memory allocation by increasing the `MEM_LIMIT` value in **docker/.env**.
|
217 |
|
@@ -225,17 +221,17 @@ If your RAGFlow is deployed *locally*, the parsing process is likely killed due
|
|
225 |
|
226 |

|
227 |
|
228 |
-
#### 4.
|
229 |
|
230 |
An index failure usually indicates an unavailable Elasticsearch service.
|
231 |
|
232 |
-
#### 4.
|
233 |
|
234 |
```bash
|
235 |
tail -f path_to_ragflow/docker/ragflow-logs/rag/*.log
|
236 |
```
|
237 |
|
238 |
-
#### 4.
|
239 |
|
240 |
```bash
|
241 |
$ docker ps
|
@@ -249,7 +245,7 @@ d8c86f06c56b mysql:5.7.18 "docker-entrypoint.s…" 7 days ago Up
|
|
249 |
cd29bcb254bc quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z "/usr/bin/docker-ent…" 2 weeks ago Up 11 hours 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp ragflow-minio
|
250 |
```
|
251 |
|
252 |
-
#### 4.
|
253 |
|
254 |
1. Check the status of your Elasticsearch component:
|
255 |
|
@@ -276,26 +272,26 @@ $ docker ps
|
|
276 |
curl http://<IP_OF_ES>:<PORT_OF_ES>
|
277 |
```
|
278 |
|
279 |
-
#### 4.
|
280 |
|
281 |
This is because you forgot to update the `vm.max_map_count` value in **/etc/sysctl.conf** and your change to this value was reset after a system reboot.
|
282 |
|
283 |
-
#### 4.
|
284 |
|
285 |
Your IP address or port number may be incorrect. If you are using the default configurations, enter `http://<IP_OF_YOUR_MACHINE>` (**NOT 9380, AND NO PORT NUMBER REQUIRED!**) in your browser. This should work.
|
286 |
|
287 |
-
#### 4.
|
288 |
|
289 |
A correct Ollama IP address and port is crucial to adding models to Ollama:
|
290 |
|
291 |
- If you are on demo.ragflow.io, ensure that the server hosting Ollama has a publicly accessible IP address. Note that 127.0.0.1 is not a publicly accessible IP address.
|
292 |
- If you deploy RAGFlow locally, ensure that Ollama and RAGFlow are in the same LAN and can communicate with each other.
|
293 |
|
294 |
-
#### 4.
|
295 |
|
296 |
Yes, we do. See the Python files under the **rag/app** folder.
|
297 |
|
298 |
-
#### 4.
|
299 |
|
300 |
You probably forgot to update the **MAX_CONTENT_LENGTH** environment variable:
|
301 |
|
@@ -314,7 +310,7 @@ docker compose up ragflow -d
|
|
314 |
```
|
315 |
*Now you should be able to upload files of sizes less than 100MB.*
|
316 |
|
317 |
-
#### 4.
|
318 |
|
319 |
This exception occurs when starting up the RAGFlow server. Try the following:
|
320 |
|
@@ -337,7 +333,7 @@ This exception occurs when starting up the RAGFlow server. Try the following:
|
|
337 |
docker compose up
|
338 |
```
|
339 |
|
340 |
-
#### 4.
|
341 |
|
342 |

|
343 |
|
@@ -345,7 +341,7 @@ This exception occurs when starting up the RAGFlow server. Try the following:
|
|
345 |
2. Do not forget to append **/v1/** to **http://IP:port**:
|
346 |
**http://IP:port/v1/**
|
347 |
|
348 |
-
#### 4.
|
349 |
|
350 |
1. Check if the status of your minio container is healthy:
|
351 |
```bash
|
|
|
194 |
|
195 |

|
196 |
|
197 |
+
#### 4.3 Why does my document parsing stall at under one percent?
|
|
|
|
|
|
|
|
|
198 |
|
199 |

|
200 |
|
|
|
207 |
2. Check if the **task_executor.py** process exists.
|
208 |
3. Check if your RAGFlow server can access hf-mirror.com or huggingface.com.
|
209 |
|
210 |
+
#### 4.4 Why does my pdf parsing stall near completion, while the log does not show any error?
|
211 |
|
212 |
If your RAGFlow is deployed *locally*, the parsing process is likely killed due to insufficient RAM. Try increasing your memory allocation by increasing the `MEM_LIMIT` value in **docker/.env**.
|
213 |
|
|
|
221 |
|
222 |

|
223 |
|
224 |
+
#### 4.5 `Index failure`
|
225 |
|
226 |
An index failure usually indicates an unavailable Elasticsearch service.
|
227 |
|
228 |
+
#### 4.6 How to check the log of RAGFlow?
|
229 |
|
230 |
```bash
|
231 |
tail -f path_to_ragflow/docker/ragflow-logs/rag/*.log
|
232 |
```
|
233 |
|
234 |
+
#### 4.7 How to check the status of each component in RAGFlow?
|
235 |
|
236 |
```bash
|
237 |
$ docker ps
|
|
|
245 |
cd29bcb254bc quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z "/usr/bin/docker-ent…" 2 weeks ago Up 11 hours 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp ragflow-minio
|
246 |
```
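
To pick out problem containers from that listing, the status column can be filtered. The following is a rough sketch under stated assumptions: `flag_not_up` is a hypothetical helper, it expects tab-separated `name<TAB>status` lines on stdin, and it treats any status that does not start with `Up`, or that contains `(unhealthy)`, as worth a look.

```bash
# flag_not_up: hypothetical helper.
# Reads "name<TAB>status" lines on stdin and prints the names of
# containers that are not up or that report themselves unhealthy.
flag_not_up() {
  awk -F '\t' '$2 !~ /^Up/ || $2 ~ /\(unhealthy\)/ { print $1 }'
}

# Possible usage (requires Docker; -a also lists stopped containers):
#   docker ps -a --format '{{.Names}}\t{{.Status}}' | flag_not_up
```

An empty result suggests that every listed container is up; any name printed is a candidate for `docker logs`.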
|
247 |
|
248 |
+
#### 4.8 `Exception: Can't connect to ES cluster`
|
249 |
|
250 |
1. Check the status of your Elasticsearch component:
|
251 |
|
|
|
272 |
curl http://<IP_OF_ES>:<PORT_OF_ES>
|
273 |
```
|
274 |
|
275 |
+
#### 4.9 Can't start ES container and get `Elasticsearch did not exit normally`
|
276 |
|
277 |
This is because you forgot to update the `vm.max_map_count` value in **/etc/sysctl.conf** and your change to this value was reset after a system reboot.
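
One way to avoid forgetting is to persist the setting with a small, repeatable helper. This is a sketch only: `persist_sysctl` is a hypothetical function, not a RAGFlow or system tool. It takes the target file as an argument so it can be tried on a scratch copy first. The key is used as a regular expression, so the dots in `vm.max_map_count` match loosely.

```bash
# persist_sysctl FILE KEY VALUE
# Hypothetical helper: removes any existing assignment of KEY in FILE
# (with or without spaces around "="), then appends KEY=VALUE once.
persist_sysctl() {
  local file="$1" key="$2" value="$3" tmp
  touch "$file"
  tmp="$(mktemp)"
  # grep -v exits non-zero when nothing is left to print; that is fine here.
  grep -v -E "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Overwrite in place so the file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Possible usage on a scratch copy:
#   cp /etc/sysctl.conf /tmp/sysctl.conf.test
#   persist_sysctl /tmp/sysctl.conf.test vm.max_map_count 262144
```

Writing to **/etc/sysctl.conf** itself requires root. After the file is updated, `sudo sysctl -p` reloads it without a reboot.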
|
278 |
|
279 |
+
#### 4.10 `{"data":null,"retcode":100,"retmsg":"<NotFound '404: Not Found'>"}`
|
280 |
|
281 |
Your IP address or port number may be incorrect. If you are using the default configurations, enter `http://<IP_OF_YOUR_MACHINE>` (**NOT 9380, AND NO PORT NUMBER REQUIRED!**) in your browser. This should work.
|
282 |
|
283 |
+
#### 4.11 `Ollama - Mistral instance running at 127.0.0.1:11434 but cannot add Ollama as model in RagFlow`
|
284 |
|
285 |
A correct Ollama IP address and port is crucial to adding models to Ollama:
|
286 |
|
287 |
- If you are on demo.ragflow.io, ensure that the server hosting Ollama has a publicly accessible IP address. Note that 127.0.0.1 is not a publicly accessible IP address.
|
288 |
- If you deploy RAGFlow locally, ensure that Ollama and RAGFlow are in the same LAN and can communicate with each other.
|
289 |
|
290 |
+
#### 4.12 Do you offer examples of using deepdoc to parse PDF or other files?
|
291 |
|
292 |
Yes, we do. See the Python files under the **rag/app** folder.
|
293 |
|
294 |
+
#### 4.13 Why did I fail to upload a 10MB+ file to my locally deployed RAGFlow?
|
295 |
|
296 |
You probably forgot to update the **MAX_CONTENT_LENGTH** environment variable:
|
297 |
|
|
|
310 |
```
|
311 |
*Now you should be able to upload files of sizes less than 100MB.*
|
312 |
|
313 |
+
#### 4.14 `Table 'rag_flow.document' doesn't exist`
|
314 |
|
315 |
This exception occurs when starting up the RAGFlow server. Try the following:
|
316 |
|
|
|
333 |
docker compose up
|
334 |
```
|
335 |
|
336 |
+
#### 4.15 `hint : 102 Fail to access model Connection error`
|
337 |
|
338 |

|
339 |
|
|
|
341 |
2. Do not forget to append **/v1/** to **http://IP:port**:
|
342 |
**http://IP:port/v1/**
|
343 |
|
344 |
+
#### 4.16 `FileNotFoundError: [Errno 2] No such file or directory`
|
345 |
|
346 |
1. Check if the status of your minio container is healthy:
|
347 |
```bash
|