writinwaters committed on
Commit
72605cf
·
1 Parent(s): c881efa

[doc] Updated document on max map count (#1037)

### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Documentation Update

README.md CHANGED
@@ -113,7 +113,7 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
113
 
114
  ### 🚀 Start up the server
115
 
116
- 1. Ensure `vm.max_map_count` >= 262144 ([more](./docs/guides/max_map_count.md)):
117
 
118
  > To check the value of `vm.max_map_count`:
119
  >
 
113
 
114
  ### 🚀 Start up the server
115
 
116
+ 1. Ensure `vm.max_map_count` >= 262144:
117
 
118
  > To check the value of `vm.max_map_count`:
119
  >
README_ja.md CHANGED
@@ -95,7 +95,7 @@
95
 
96
  ### 🚀 Start up the server
97
 
98
- 1. Ensure that `vm.max_map_count` >= 262144 ([more](./docs/guides/max_map_count.md)):
99
 
100
  > To check the value of `vm.max_map_count`:
101
  >
 
95
 
96
  ### 🚀 Start up the server
97
 
98
+ 1. Ensure that `vm.max_map_count` >= 262144:
99
 
100
  > To check the value of `vm.max_map_count`:
101
  >
README_zh.md CHANGED
@@ -94,7 +94,7 @@
94
 
95
  ### 🚀 Start up the server
96
 
97
- 1. Ensure `vm.max_map_count` is no less than 262144 ([more](./docs/guides/max_map_count.md)):
98
 
99
  > To check the value of `vm.max_map_count`:
100
  >
 
94
 
95
  ### 🚀 Start up the server
96
 
97
+ 1. Ensure `vm.max_map_count` is no less than 262144
98
 
99
  > To check the value of `vm.max_map_count`:
100
  >
docs/guides/max_map_count.md DELETED
@@ -1,71 +0,0 @@
1
- ---
2
- sidebar_position: 7
3
- slug: /max_map_count
4
- ---
5
-
6
- # Update vm.max_map_count
7
-
8
- ## Linux
9
-
10
- To check the value of `vm.max_map_count`:
11
-
12
- ```bash
13
- $ sysctl vm.max_map_count
14
- ```
15
-
16
- Reset `vm.max_map_count` to a value at least 262144 if it is not.
17
-
18
- ```bash
19
- # In this case, we set it to 262144:
20
- $ sudo sysctl -w vm.max_map_count=262144
21
- ```
22
-
23
- This change will be reset after a system reboot. To ensure your change remains permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:
24
-
25
- ```bash
26
- vm.max_map_count=262144
27
- ```
28
-
29
- ## Mac
30
-
31
- ```bash
32
- $ screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty
33
- $ sysctl -w vm.max_map_count=262144
34
- ```
35
- To exit the screen session, type Ctrl a d.
36
-
37
- ## Windows and macOS with Docker Desktop
38
-
39
- The vm.max_map_count setting must be set via docker-machine:
40
-
41
- ```bash
42
- $ docker-machine ssh
43
- $ sudo sysctl -w vm.max_map_count=262144
44
- ```
45
-
46
- ## Windows with Docker Desktop WSL 2 backend
47
-
48
- To manually set it every time you reboot, you must run the following commands in a command prompt or PowerShell window every time you restart Docker:
49
-
50
- ```bash
51
- $ wsl -d docker-desktop -u root
52
- $ sysctl -w vm.max_map_count=262144
53
- ```
54
- If you are on these versions of WSL and you do not want to have to run those commands every time you restart Docker, you can globally change every WSL distribution with this setting by modifying your %USERPROFILE%\.wslconfig as follows:
55
-
56
- ```bash
57
- [wsl2]
58
- kernelCommandLine = "sysctl.vm.max_map_count=262144"
59
- ```
60
- This will cause all WSL2 VMs to have that setting assigned when they start.
61
-
62
- If you are on Windows 11, or Windows 10 version 22H2 and have installed the Microsoft Store version of WSL, you can modify the /etc/sysctl.conf within the "docker-desktop" WSL distribution, perhaps with commands like this:
63
-
64
- ```bash
65
- $ wsl -d docker-desktop -u root
66
- $ vi /etc/sysctl.conf
67
- ```
68
- and appending a line which reads:
69
- ```bash
70
- vm.max_map_count = 262144
71
- ```
docs/{quickstart.md → quickstart.mdx} RENAMED
@@ -4,6 +4,8 @@ slug: /
4
  ---
5
 
6
  # Quick start
 
 
7
 
8
  RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. When integrated with LLMs, it is capable of providing truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.
9
 
@@ -25,29 +27,102 @@ This quick start guide describes a general process from:
25
 
26
  ## Start up the server
27
 
28
- This section provides instructions on setting up the RAGFlow server on Linux. If you are on a different operating system, no worries. Most steps are alike.
29
-
30
- 1. Ensure `vm.max_map_count` >= 262144:
31
-
32
- > To check the value of `vm.max_map_count`:
33
- >
34
- > ```bash
35
- > $ sysctl vm.max_map_count
36
- > ```
37
- >
38
- > Reset `vm.max_map_count` to a value at least 262144 if it is not.
39
- >
40
- > ```bash
41
- > # In this case, we set it to 262144:
42
- > $ sudo sysctl -w vm.max_map_count=262144
43
- > ```
44
- >
45
- > This change will be reset after a system reboot. To ensure your change remains permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:
46
- >
47
- > ```bash
48
- > vm.max_map_count=262144
49
- > ```
50
- > See [this guide](./guides/max_map_count.md) for instructions on permanently setting `vm.max_map_count` on an operating system other than Linux.
51
 
52
  2. Clone the repo:
53
 
 
4
  ---
5
 
6
  # Quick start
7
+ import Tabs from '@theme/Tabs';
8
+ import TabItem from '@theme/TabItem';
9
 
10
  RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. When integrated with LLMs, it is capable of providing truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.
11
 
 
27
 
28
  ## Start up the server
29
 
30
+ This section provides instructions on setting up the RAGFlow server on Linux. If you are on a different operating system, no worries. Most steps are alike.
31
+
32
+ <details>
33
+ <summary>1. Ensure <code>vm.max_map_count</code> >= 262144:</summary>
34
+
35
+ `vm.max_map_count` sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abnormal behaviors, and the system will throw out-of-memory errors when a process reaches the limit.
36
+
37
+ RAGFlow v0.7.0 uses Elasticsearch for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.
38
+
39
+ <Tabs
40
+ defaultValue="linux"
41
+ values={[
42
+ {label: 'Linux', value: 'linux'},
43
+ {label: 'macOS', value: 'macos'},
44
+ {label: 'Windows', value: 'windows'},
45
+ ]}>
46
+ <TabItem value="linux">
47
+ 1.1. Check the value of `vm.max_map_count`:
48
+
49
+ ```bash
50
+ $ sysctl vm.max_map_count
51
+ ```
52
+
53
+ 1.2. If the value is less than 262144, reset `vm.max_map_count` to at least 262144:
54
+
55
+ ```bash
56
+ $ sudo sysctl -w vm.max_map_count=262144
57
+ ```
58
+
59
+ :::caution WARNING
60
+ This change will be reset after a system reboot. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
61
+ :::
62
+
63
+ 1.3. To ensure your change remains permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:
64
+
65
+ ```bash
66
+ vm.max_map_count=262144
67
+ ```
68
+ </TabItem>
69
+ <TabItem value="macos">
70
+ If you are on macOS with Docker Desktop, then you *must* use docker-machine to update `vm.max_map_count`:
71
+
72
+ ```bash
73
+ $ docker-machine ssh
74
+ $ sudo sysctl -w vm.max_map_count=262144
75
+ ```
76
+
77
+ :::caution WARNING
78
+ This change will be reset after a system reboot. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
79
+ :::
80
+ </TabItem>
81
+ <TabItem value="windows">
82
+
83
+ #### If you are on Windows with Docker Desktop, then you *must* use docker-machine to set `vm.max_map_count`:
84
+
85
+ ```bash
86
+ $ docker-machine ssh
87
+ $ sudo sysctl -w vm.max_map_count=262144
88
+ ```
89
+ #### If you are on Windows with Docker Desktop WSL 2 backend, then use docker-desktop to set `vm.max_map_count`:
90
+
91
+ 1.1. Run the following in WSL:
92
+ ```bash
93
+ $ wsl -d docker-desktop -u root
94
+ $ sysctl -w vm.max_map_count=262144
95
+ ```
96
+
97
+ :::caution WARNING
98
+ This change will be reset after you restart Docker. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
99
+ :::
100
+
101
+ 1.2. If you do not wish to run those commands each time you restart Docker, you can update your `%USERPROFILE%\.wslconfig` as follows to make the change permanent and global across all WSL distributions:
102
+
103
+ ```bash
104
+ [wsl2]
105
+ kernelCommandLine = "sysctl.vm.max_map_count=262144"
106
+ ```
107
+ *This causes all WSL2 virtual machines to have that setting assigned when they start.*
108
+
109
+ :::note
110
+ If you are on Windows 11 or Windows 10 version 22H2, and have installed the Microsoft Store version of WSL, you can also update the **/etc/sysctl.conf** within the docker-desktop WSL distribution to keep your change permanent:
111
+
112
+ ```bash
113
+ $ wsl -d docker-desktop -u root
114
+ $ vi /etc/sysctl.conf
115
+ ```
116
+
117
+ ```bash
118
+ # Append a line, which reads:
119
+ vm.max_map_count = 262144
120
+ ```
121
+ :::
122
+ </TabItem>
123
+ </Tabs>
124
+
125
+ </details>
126
 
127
  2. Clone the repo:
128
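
The Linux steps in the new quick start (check the value, raise it if it is too low, then persist it in **/etc/sysctl.conf**) could be scripted roughly as follows. This is only a sketch and not part of the commit: the helper names `map_count_ok` and `persist_map_count` are hypothetical, and the script assumes a POSIX shell with standard `grep` and `sed`, plus a config file that ends with a newline.

```bash
#!/bin/sh
# Minimum value named in the quick start guide.
REQUIRED_MAP_COUNT=262144

# Hypothetical helper: succeeds (exit status 0) when the given value
# meets the required minimum.
map_count_ok() {
    [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# Hypothetical helper: add or update the vm.max_map_count entry in a
# sysctl-style config file without creating duplicate entries.
# Assumes the file, if it exists, ends with a trailing newline.
persist_map_count() {
    conf_file=$1
    value=$2
    if grep -q '^[[:space:]]*vm\.max_map_count[[:space:]]*=' "$conf_file" 2>/dev/null; then
        # Rewrite the existing entry through a temporary copy, then write it
        # back with cat so the original file keeps its permissions.
        sed "s|^[[:space:]]*vm\.max_map_count[[:space:]]*=.*|vm.max_map_count=$value|" \
            "$conf_file" > "$conf_file.tmp" &&
            cat "$conf_file.tmp" > "$conf_file" &&
            rm -f "$conf_file.tmp"
    else
        # No active entry yet (commented-out lines are ignored): append one.
        echo "vm.max_map_count=$value" >> "$conf_file"
    fi
}
```

With these helpers loaded, something like `map_count_ok "$(sysctl -n vm.max_map_count)" || sudo sysctl -w vm.max_map_count=262144` would apply the runtime change only when needed, and `persist_map_count /etc/sysctl.conf 262144` (run as root) would make it survive a reboot. Writing through a temporary copy rather than editing in place is a portability choice, since in-place flags differ between `sed` implementations.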
 
docs/references/faq.md CHANGED
@@ -194,11 +194,7 @@ Ignore this warning and continue. All system warnings can be ignored.
194
 
195
  ![](https://github.com/infiniflow/ragflow/assets/93570324/ef5a6194-084a-4fe3-bdd5-1c025b40865c)
196
 
197
- #### 4.3 Why does it take so long to parse a 2MB document?
198
-
199
- Parsing requests have to wait in queue due to limited server resources. We are currently enhancing our algorithms and increasing computing power.
200
-
201
- #### 4.4 Why does my document parsing stall at under one percent?
202
 
203
  ![stall](https://github.com/infiniflow/ragflow/assets/93570324/3589cc25-c733-47d5-bbfc-fedb74a3da50)
204
 
@@ -211,7 +207,7 @@ docker logs -f ragflow-server
211
  2. Check if the **task_executor.py** process exists.
212
  3. Check if your RAGFlow server can access hf-mirror.com or huggingface.com.
213
 
214
- #### 4.5 Why does my pdf parsing stall near completion, while the log does not show any error?
215
 
216
  If your RAGFlow is deployed *locally*, the parsing process is likely killed due to insufficient RAM. Try increasing your memory allocation by increasing the `MEM_LIMIT` value in **docker/.env**.
217
 
@@ -225,17 +221,17 @@ If your RAGFlow is deployed *locally*, the parsing process is likely killed due
225
 
226
  ![nearcompletion](https://github.com/infiniflow/ragflow/assets/93570324/563974c3-f8bb-4ec8-b241-adcda8929cbb)
227
 
228
- #### 4.6 `Index failure`
229
 
230
  An index failure usually indicates an unavailable Elasticsearch service.
231
 
232
- #### 4.7 How to check the log of RAGFlow?
233
 
234
  ```bash
235
  tail -f path_to_ragflow/docker/ragflow-logs/rag/*.log
236
  ```
237
 
238
- #### 4.8 How to check the status of each component in RAGFlow?
239
 
240
  ```bash
241
  $ docker ps
@@ -249,7 +245,7 @@ d8c86f06c56b mysql:5.7.18 "docker-entrypoint.s…" 7 days ago Up
249
  cd29bcb254bc quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z "/usr/bin/docker-ent…" 2 weeks ago Up 11 hours 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp ragflow-minio
250
  ```
251
 
252
- #### 4.9 `Exception: Can't connect to ES cluster`
253
 
254
  1. Check the status of your Elasticsearch component:
255
 
@@ -276,26 +272,26 @@ $ docker ps
276
  curl http://<IP_OF_ES>:<PORT_OF_ES>
277
  ```
278
 
279
- #### 4.10 Can't start ES container and get `Elasticsearch did not exit normally`
280
 
281
  This is because you forgot to update the `vm.max_map_count` value in **/etc/sysctl.conf** and your change to this value was reset after a system reboot.
282
 
283
- #### 4.11 `{"data":null,"retcode":100,"retmsg":"<NotFound '404: Not Found'>"}`
284
 
285
  Your IP address or port number may be incorrect. If you are using the default configurations, enter `http://<IP_OF_YOUR_MACHINE>` (**NOT 9380, AND NO PORT NUMBER REQUIRED!**) in your browser. This should work.
286
 
287
- #### 4.12 `Ollama - Mistral instance running at 127.0.0.1:11434 but cannot add Ollama as model in RagFlow`
288
 
289
  A correct Ollama IP address and port is crucial to adding models to Ollama:
290
 
291
  - If you are on demo.ragflow.io, ensure that the server hosting Ollama has a publicly accessible IP address. Note that 127.0.0.1 is not a publicly accessible IP address.
292
  - If you deploy RAGFlow locally, ensure that Ollama and RAGFlow are in the same LAN and can communicate with each other.
293
 
294
- #### 4.13 Do you offer examples of using deepdoc to parse PDF or other files?
295
 
296
  Yes, we do. See the Python files under the **rag/app** folder.
297
 
298
- #### 4.14 Why did I fail to upload a 10MB+ file to my locally deployed RAGFlow?
299
 
300
  You probably forgot to update the **MAX_CONTENT_LENGTH** environment variable:
301
 
@@ -314,7 +310,7 @@ docker compose up ragflow -d
314
  ```
315
  *Now you should be able to upload files of sizes less than 100MB.*
316
 
317
- #### 4.15 `Table 'rag_flow.document' doesn't exist`
318
 
319
  This exception occurs when starting up the RAGFlow server. Try the following:
320
 
@@ -337,7 +333,7 @@ This exception occurs when starting up the RAGFlow server. Try the following:
337
  docker compose up
338
  ```
339
 
340
- #### 4.16 `hint : 102 Fail to access model Connection error`
341
 
342
  ![hint102](https://github.com/infiniflow/ragflow/assets/93570324/6633d892-b4f8-49b5-9a0a-37a0a8fba3d2)
343
 
@@ -345,7 +341,7 @@ This exception occurs when starting up the RAGFlow server. Try the following:
345
  2. Do not forget to append **/v1/** to **http://IP:port**:
346
  **http://IP:port/v1/**
347
 
348
- #### 4.17 `FileNotFoundError: [Errno 2] No such file or directory`
349
 
350
  1. Check if the status of your minio container is healthy:
351
  ```bash
 
194
 
195
  ![](https://github.com/infiniflow/ragflow/assets/93570324/ef5a6194-084a-4fe3-bdd5-1c025b40865c)
196
 
197
+ #### 4.3 Why does my document parsing stall at under one percent?
 
 
 
 
198
 
199
  ![stall](https://github.com/infiniflow/ragflow/assets/93570324/3589cc25-c733-47d5-bbfc-fedb74a3da50)
200
 
 
207
  2. Check if the **task_executor.py** process exists.
208
  3. Check if your RAGFlow server can access hf-mirror.com or huggingface.com.
209
 
210
+ #### 4.4 Why does my PDF parsing stall near completion, while the log does not show any error?
211
 
212
  If your RAGFlow is deployed *locally*, the parsing process is likely killed due to insufficient RAM. Try increasing your memory allocation by increasing the `MEM_LIMIT` value in **docker/.env**.
213
 
 
221
 
222
  ![nearcompletion](https://github.com/infiniflow/ragflow/assets/93570324/563974c3-f8bb-4ec8-b241-adcda8929cbb)
223
 
224
+ #### 4.5 `Index failure`
225
 
226
  An index failure usually indicates an unavailable Elasticsearch service.
227
 
228
+ #### 4.6 How to check the log of RAGFlow?
229
 
230
  ```bash
231
  tail -f path_to_ragflow/docker/ragflow-logs/rag/*.log
232
  ```
233
 
234
+ #### 4.7 How to check the status of each component in RAGFlow?
235
 
236
  ```bash
237
  $ docker ps
 
245
  cd29bcb254bc quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z "/usr/bin/docker-ent…" 2 weeks ago Up 11 hours 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp ragflow-minio
246
  ```
247
 
248
+ #### 4.8 `Exception: Can't connect to ES cluster`
249
 
250
  1. Check the status of your Elasticsearch component:
251
 
 
272
  curl http://<IP_OF_ES>:<PORT_OF_ES>
273
  ```
274
 
275
+ #### 4.9 Can't start ES container and get `Elasticsearch did not exit normally`
276
 
277
  This is because you forgot to update the `vm.max_map_count` value in **/etc/sysctl.conf** and your change to this value was reset after a system reboot.
278
 
279
+ #### 4.10 `{"data":null,"retcode":100,"retmsg":"<NotFound '404: Not Found'>"}`
280
 
281
  Your IP address or port number may be incorrect. If you are using the default configurations, enter `http://<IP_OF_YOUR_MACHINE>` (**NOT 9380, AND NO PORT NUMBER REQUIRED!**) in your browser. This should work.
282
 
283
+ #### 4.11 `Ollama - Mistral instance running at 127.0.0.1:11434 but cannot add Ollama as model in RagFlow`
284
 
285
  A correct Ollama IP address and port is crucial to adding models to Ollama:
286
 
287
  - If you are on demo.ragflow.io, ensure that the server hosting Ollama has a publicly accessible IP address. Note that 127.0.0.1 is not a publicly accessible IP address.
288
  - If you deploy RAGFlow locally, ensure that Ollama and RAGFlow are in the same LAN and can communicate with each other.
289
 
290
+ #### 4.12 Do you offer examples of using deepdoc to parse PDF or other files?
291
 
292
  Yes, we do. See the Python files under the **rag/app** folder.
293
 
294
+ #### 4.13 Why did I fail to upload a 10MB+ file to my locally deployed RAGFlow?
295
 
296
  You probably forgot to update the **MAX_CONTENT_LENGTH** environment variable:
297
 
 
310
  ```
311
  *Now you should be able to upload files of sizes less than 100MB.*
312
 
313
+ #### 4.14 `Table 'rag_flow.document' doesn't exist`
314
 
315
  This exception occurs when starting up the RAGFlow server. Try the following:
316
 
 
333
  docker compose up
334
  ```
335
 
336
+ #### 4.15 `hint : 102 Fail to access model Connection error`
337
 
338
  ![hint102](https://github.com/infiniflow/ragflow/assets/93570324/6633d892-b4f8-49b5-9a0a-37a0a8fba3d2)
339
 
 
341
  2. Do not forget to append **/v1/** to **http://IP:port**:
342
  **http://IP:port/v1/**
343
 
344
+ #### 4.16 `FileNotFoundError: [Errno 2] No such file or directory`
345
 
346
  1. Check if the status of your minio container is healthy:
347
  ```bash