writinwaters committed
Commit e587fd6 · 1 Parent(s): 4c39067

DRAFT: Miscellaneous proofedits on Python APIs (#2903)

### What problem does this PR solve?



### Type of change


- [x] Documentation Update

Files changed (1)
  1. api/python_api_reference.md +167 -125
api/python_api_reference.md CHANGED
@@ -2,10 +2,14 @@
2
 
3
  **THE API REFERENCES BELOW ARE STILL UNDER DEVELOPMENT.**
4
 
 
 
5
  :::tip NOTE
6
  Dataset Management
7
  :::
8
 
 
 
9
  ## Create dataset
10
 
11
  ```python
@@ -55,11 +59,24 @@ The language setting of the dataset to create. Available options:
55
 
56
  #### permission
57
 
58
- Specifies who can operate on the dataset. You can set it only to `"me"` for now.
59
 
60
  #### chunk_method, `str`
61
 
62
- The default parsing method of the knwoledge . Defaults to `"naive"`.
63
 
64
  #### parser_config
65
 
@@ -67,7 +84,7 @@ The parser configuration of the dataset. A `ParserConfig` object contains the fo
67
 
68
  - `chunk_token_count`: Defaults to `128`.
69
  - `layout_recognize`: Defaults to `True`.
70
- - `delimiter`: Defaults to `'\n!?。;!?'`.
71
  - `task_page_size`: Defaults to `12`.
72
 
73
  ### Returns
@@ -81,7 +98,7 @@ The parser configuration of the dataset. A `ParserConfig` object contains the fo
81
  from ragflow import RAGFlow
82
 
83
  rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
84
- ds = rag_object.create_dataset(name="kb_1")
85
  ```
86
 
87
  ---
@@ -92,13 +109,13 @@ ds = rag_object.create_dataset(name="kb_1")
92
  RAGFlow.delete_datasets(ids: list[str] = None)
93
  ```
94
 
95
- Deletes datasets by name or ID.
96
 
97
  ### Parameters
98
 
99
- #### ids
100
 
101
- The IDs of the datasets to delete.
102
 
103
  ### Returns
104
 
@@ -108,7 +125,7 @@ The IDs of the datasets to delete.
108
  ### Examples
109
 
110
  ```python
111
- rag.delete_datasets(ids=["id_1","id_2"])
112
  ```
113
 
114
  ---
@@ -132,15 +149,18 @@ Retrieves a list of datasets.
132
 
133
  #### page: `int`
134
 
135
- The current page number to retrieve from the paginated results. Defaults to `1`.
136
 
137
  #### page_size: `int`
138
 
139
- The number of records on each page. Defaults to `1024`.
140
 
141
- #### order_by: `str`
142
 
143
- The field by which the records should be sorted. This specifies the attribute or column used to order the results. Defaults to `"create_time"`.
 
 
 
144
 
145
  #### desc: `bool`
146
 
@@ -148,15 +168,15 @@ Indicates whether the retrieved datasets should be sorted in descending order. D
148
 
149
  #### id: `str`
150
 
151
- The id of the dataset to be got. Defaults to `None`.
152
 
153
  #### name: `str`
154
 
155
- The name of the dataset to be got. Defaults to `None`.
156
 
157
  ### Returns
158
 
159
- - Success: A list of `DataSet` objects representing the retrieved datasets.
160
  - Failure: `Exception`.
161
 
162
  ### Examples
@@ -164,8 +184,8 @@ The name of the dataset to be got. Defaults to `None`.
164
  #### List all datasets
165
 
166
  ```python
167
- for ds in rag_object.list_datasets():
168
- print(ds)
169
  ```
170
 
171
  #### Retrieve a dataset by ID
@@ -183,16 +203,18 @@ print(dataset[0])
183
  DataSet.update(update_message: dict)
184
  ```
185
 
186
- Updates the current dataset.
187
 
188
  ### Parameters
189
 
190
  #### update_message: `dict[str, str|int]`, *Required*
191
 
 
 
192
  - `"name"`: `str` The name of the dataset to update.
193
- - `"embedding_model"`: `str` The embedding model for generating vector embeddings.
194
  - Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
195
- - `"chunk_method"`: `str` The default parsing method for the dataset.
196
  - `"naive"`: General
197
 - `"manual"`: Manual
198
  - `"qa"`: Q&A
@@ -216,8 +238,8 @@ Updates the current dataset.
216
  ```python
217
  from ragflow import RAGFlow
218
 
219
- rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
220
- dataset = rag.list_datasets(name="kb_name")
221
  dataset.update({"embedding_model":"BAAI/bge-zh-v1.5", "chunk_method":"manual"})
222
  ```
223
 
@@ -239,7 +261,7 @@ Uploads documents to the current dataset.
239
 
240
  ### Parameters
241
 
242
- #### document_list
243
 
244
  A list of dictionaries representing the documents to upload, each containing the following keys:
245
 
@@ -272,6 +294,8 @@ Updates configurations for the current document.
272
 
273
  #### update_message: `dict[str, str|dict[]]`, *Required*
274
 
 
 
275
  - `"name"`: `str` The name of the document to update.
276
  - `"parser_config"`: `dict[str, Any]` The parsing configuration for the document:
277
  - `"chunk_token_count"`: Defaults to `128`.
@@ -302,9 +326,9 @@ Updates configurations for the current document.
302
  ```python
303
  from ragflow import RAGFlow
304
 
305
- rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
306
- dataset=rag.list_datasets(id='id')
307
- dataset=dataset[0]
308
  doc = dataset.list_documents(id="wdfxb5t547d")
309
  doc = doc[0]
310
  doc.update([{"parser_config": {"chunk_token_count": 256}}, {"chunk_method": "manual"}])
@@ -318,7 +342,7 @@ doc.update([{"parser_config": {"chunk_token_count": 256}}, {"chunk_method": "man
318
  Document.download() -> bytes
319
  ```
320
 
321
- Downloads the current document from RAGFlow.
322
 
323
  ### Returns
324
 
@@ -350,30 +374,30 @@ Retrieves a list of documents from the current dataset.
350
 
351
  ### Parameters
352
 
353
- #### id
354
 
355
  The ID of the document to retrieve. Defaults to `None`.
356
 
357
- #### keywords
358
 
359
  The keywords to match document titles. Defaults to `None`.
360
 
361
- #### offset
362
 
363
- The beginning number of records for paging. Defaults to `0`.
364
 
365
- #### limit
366
 
367
- Records number to return, `-1` means all of them. Records number to return, `-1` means all of them.
368
 
369
- #### orderby
370
 
371
- The field by which the documents should be sorted. Available options:
372
 
373
- - `"create_time"` (Default)
374
  - `"update_time"`
375
 
376
- #### desc
377
 
378
  Indicates whether the retrieved documents should be sorted in descending order. Defaults to `True`.
379
 
@@ -384,22 +408,24 @@ Indicates whether the retrieved documents should be sorted in descending order.
384
 
385
  A `Document` object contains the following attributes:
386
 
387
- - `id` Id of the retrieved document. Defaults to `""`.
388
- - `thumbnail` Thumbnail image of the retrieved document. Defaults to `""`.
389
- - `knowledgebase_id` Dataset ID related to the document. Defaults to `""`.
390
- - `chunk_method` Method used to parse the document. Defaults to `""`.
391
- - `parser_config`: `ParserConfig` Configuration object for the parser. Defaults to `None`.
392
- - `source_type`: Source type of the document. Defaults to `""`.
393
- - `type`: Type or category of the document. Defaults to `""`.
394
- - `created_by`: `str` Creator of the document. Defaults to `""`.
395
- - `name` Name or title of the document. Defaults to `""`.
396
- - `size`: `int` Size of the document in bytes or some other unit. Defaults to `0`.
397
- - `token_count`: `int` Number of tokens in the document. Defaults to `""`.
398
- - `chunk_count`: `int` Number of chunks the document is split into. Defaults to `0`.
399
- - `progress`: `float` Current processing progress as a percentage. Defaults to `0.0`.
400
- - `progress_msg`: `str` Message indicating current progress status. Defaults to `""`.
401
- - `process_begin_at`: `datetime` Start time of the document processing. Defaults to `None`.
402
- - `process_duation`: `float` Duration of the processing in seconds or minutes. Defaults to `0.0`.
 
 
403
 
404
  ### Examples
405
 
@@ -410,11 +436,10 @@ rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
410
  dataset = rag.create_dataset(name="kb_1")
411
 
412
  filename1 = "~/ragflow.txt"
413
- blob=open(filename1 , "rb").read()
414
- list_files=[{"name":filename1,"blob":blob}]
415
- dataset.upload_documents(list_files)
416
- for d in dataset.list_documents(keywords="rag", offset=0, limit=12):
417
- print(d)
418
  ```
419
 
420
  ---
@@ -425,7 +450,13 @@ for d in dataset.list_documents(keywords="rag", offset=0, limit=12):
425
  DataSet.delete_documents(ids: list[str] = None)
426
  ```
427
 
428
- Deletes specified documents or all documents from the current dataset.
429
 
430
  ### Returns
431
 
@@ -437,10 +468,10 @@ Deletes specified documents or all documents from the current dataset.
437
  ```python
438
  from ragflow import RAGFlow
439
 
440
- rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
441
- ds = rag.list_datasets(name="kb_1")
442
- ds = ds[0]
443
- ds.delete_documents(ids=["id_1","id_2"])
444
  ```
445
 
446
  ---
@@ -453,7 +484,7 @@ DataSet.async_parse_documents(document_ids:list[str]) -> None
453
 
454
  ### Parameters
455
 
456
- #### document_ids: `list[str]`
457
 
458
  The IDs of the documents to parse.
459
 
@@ -465,23 +496,20 @@ The IDs of the documents to parse.
465
  ### Examples
466
 
467
  ```python
468
- #documents parse and cancel
469
- rag = RAGFlow(API_KEY, HOST_ADDRESS)
470
- ds = rag.create_dataset(name="dataset_name")
471
  documents = [
472
  {'name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
473
  {'name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
474
  {'name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
475
  ]
476
- ds.upload_documents(documents)
477
- documents=ds.list_documents(keywords="test")
478
- ids=[]
479
  for document in documents:
480
  ids.append(document.id)
481
- ds.async_parse_documents(ids)
482
- print("Async bulk parsing initiated")
483
- ds.async_cancel_parse_documents(ids)
484
- print("Async bulk parsing cancelled")
485
  ```
486
 
487
  ---
@@ -494,9 +522,9 @@ DataSet.async_cancel_parse_documents(document_ids:list[str])-> None
494
 
495
  ### Parameters
496
 
497
- #### document_ids: `list[str]`
498
 
499
- The IDs of the documents to stop parsing.
500
 
501
  ### Returns
502
 
@@ -506,23 +534,22 @@ The IDs of the documents to stop parsing.
506
  ### Examples
507
 
508
  ```python
509
- #documents parse and cancel
510
- rag = RAGFlow(API_KEY, HOST_ADDRESS)
511
- ds = rag.create_dataset(name="dataset_name")
512
  documents = [
513
  {'name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
514
  {'name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
515
  {'name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
516
  ]
517
- ds.upload_documents(documents)
518
- documents=ds.list_documents(keywords="test")
519
- ids=[]
520
  for document in documents:
521
  ids.append(document.id)
522
- ds.async_parse_documents(ids)
523
- print("Async bulk parsing initiated")
524
- ds.async_cancel_parse_documents(ids)
525
- print("Async bulk parsing cancelled")
526
  ```
527
 
528
  ---
@@ -533,19 +560,21 @@ print("Async bulk parsing cancelled")
533
  Document.list_chunks(keywords: str = None, offset: int = 0, limit: int = -1, id : str = None) -> list[Chunk]
534
  ```
535
 
 
 
536
  ### Parameters
537
 
538
- #### keywords
539
 
540
  List chunks whose name has the given keywords. Defaults to `None`
541
 
542
- #### offset
543
 
544
- The beginning number of records for paging. Defaults to `1`
545
 
546
  #### limit
547
 
548
- Records number to return. Default: `30`
549
 
550
  #### id
551
 
@@ -553,19 +582,20 @@ The ID of the chunk to retrieve. Default: `None`
553
 
554
  ### Returns
555
 
556
- list[chunk]
 
557
 
558
  ### Examples
559
 
560
  ```python
561
  from ragflow import RAGFlow
562
 
563
- rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
564
- ds = rag.list_datasets("123")
565
- ds = ds[0]
566
- ds.async_parse_documents(["wdfxb5t547d"])
567
- for c in doc.list_chunks(keywords="rag", offset=0, limit=12):
568
- print(c)
569
  ```
570
 
571
  ## Add chunk
@@ -578,7 +608,7 @@ Document.add_chunk(content:str) -> Chunk
578
 
579
  #### content: *Required*
580
 
581
- The main text or information of the chunk.
582
 
583
  #### important_keywords :`list[str]`
584
 
@@ -609,11 +639,13 @@ chunk = doc.add_chunk(content="xxxxxxx")
609
  Document.delete_chunks(chunk_ids: list[str])
610
  ```
611
 
 
 
612
  ### Parameters
613
 
614
- #### chunk_ids:`list[str]`
615
 
616
- A list of chunk_id.
617
 
618
  ### Returns
619
 
@@ -642,15 +674,17 @@ doc.delete_chunks(["id_1","id_2"])
642
  Chunk.update(update_message: dict)
643
  ```
644
 
645
- Updates the current chunk.
646
 
647
  ### Parameters
648
 
649
  #### update_message: `dict[str, str|list[str]|int]` *Required*
650
 
 
 
651
  - `"content"`: `str` Content of the chunk.
652
  - `"important_keywords"`: `list[str]` A list of key terms to attach to the chunk.
653
- - `"available"`: `int` The chunk's availability status in the dataset.
654
  - `0`: Unavailable
655
  - `1`: Available
656
 
@@ -697,11 +731,11 @@ The documents to search from. `None` means no limitation. Defaults to `None`.
697
 
698
  #### offset: `int`
699
 
700
- The beginning point of retrieved chunks. Defaults to `0`.
701
 
702
  #### limit: `int`
703
 
704
- The maximum number of chunks to return. Defaults to `6`.
705
 
706
  #### Similarity_threshold: `float`
707
 
@@ -764,6 +798,8 @@ for c in rag_object.retrieve(question="What's ragflow?",
764
  Chat Assistant Management
765
  :::
766
 
 
 
767
  ## Create chat assistant
768
 
769
  ```python
@@ -856,15 +892,17 @@ assi = rag.create_chat("Miss R", knowledgebases=list_kb)
856
  Chat.update(update_message: dict)
857
  ```
858
 
859
- Updates the current chat assistant.
860
 
861
  ### Parameters
862
 
863
- #### update_message: `dict[str, Any]`, *Required*
 
 
864
 
865
  - `"name"`: `str` The name of the chat assistant to update.
866
  - `"avatar"`: `str` Base64 encoding of the avatar. Defaults to `""`
867
- - `"knowledgebases"`: `list[str]` datasets to update.
868
  - `"llm"`: `dict` The LLM settings:
869
  - `"model_name"`, `str` The chat model name.
870
  - `"temperature"`, `float` Controls the randomness of the model's predictions.
@@ -906,17 +944,17 @@ assistant.update({"name": "Stefan", "llm": {"temperature": 0.8}, "prompt": {"top
906
 
907
  ## Delete chats
908
 
909
- Deletes specified chat assistants.
910
-
911
  ```python
912
  RAGFlow.delete_chats(ids: list[str] = None)
913
  ```
914
 
 
 
915
  ### Parameters
916
 
917
- #### ids
918
 
919
- IDs of the chat assistants to delete. If not specified, all chat assistants will be deleted.
920
 
921
  ### Returns
922
 
@@ -953,11 +991,11 @@ Retrieves a list of chat assistants.
953
 
954
  #### page
955
 
956
- Specifies the page on which the records will be displayed. Defaults to `1`.
957
 
958
  #### page_size
959
 
960
- The number of records on each page. Defaults to `1024`.
961
 
962
  #### order_by
963
 
@@ -985,8 +1023,8 @@ The name of the chat to retrieve. Defaults to `None`.
985
  ```python
986
  from ragflow import RAGFlow
987
 
988
- rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
989
- for assistant in rag.list_chats():
990
  print(assistant)
991
  ```
992
 
@@ -996,6 +1034,8 @@ for assistant in rag.list_chats():
996
  Chat-session APIs
997
  :::
998
 
 
 
999
  ## Create session
1000
 
1001
  ```python
@@ -1036,12 +1076,14 @@ session = assistant.create_session()
1036
  Session.update(update_message: dict)
1037
  ```
1038
 
1039
- Updates the current session.
1040
 
1041
  ### Parameters
1042
 
1043
  #### update_message: `dict[str, Any]`, *Required*
1044
 
 
 
1045
  - `"name"`: `str` The name of the session to update.
1046
 
1047
  ### Returns
@@ -1169,17 +1211,17 @@ Lists sessions associated with the current chat assistant.
1169
 
1170
  #### page
1171
 
1172
- Specifies the page on which records will be displayed. Defaults to `1`.
1173
 
1174
  #### page_size
1175
 
1176
- The number of records on each page. Defaults to `1024`.
1177
 
1178
  #### orderby
1179
 
1180
- The field by which the sessions should be sorted. Available options:
1181
 
1182
- - `"create_time"` (Default)
1183
  - `"update_time"`
1184
 
1185
  #### desc
@@ -1204,8 +1246,8 @@ The name of the chat to retrieve. Defaults to `None`.
1204
  ```python
1205
  from ragflow import RAGFlow
1206
 
1207
- rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
1208
- assistant = rag.list_chats(name="Miss R")
1209
  assistant = assistant[0]
1210
  for session in assistant.list_sessions():
1211
  print(session)
@@ -1219,13 +1261,13 @@ for session in assistant.list_sessions():
1219
  Chat.delete_sessions(ids:list[str] = None)
1220
  ```
1221
 
1222
- Deletes specified sessions or all sessions associated with the current chat assistant.
1223
 
1224
  ### Parameters
1225
 
1226
- #### ids
1227
 
1228
- IDs of the sessions to delete. If not specified, all sessions associated with the current chat assistant will be deleted.
1229
 
1230
  ### Returns
1231
 
 
2
 
3
  **THE API REFERENCES BELOW ARE STILL UNDER DEVELOPMENT.**
4
 
5
+ ---
6
+
7
  :::tip NOTE
8
  Dataset Management
9
  :::
10
 
11
+ ---
12
+
13
  ## Create dataset
14
 
15
  ```python
 
59
 
60
  #### permission
61
 
62
+ Specifies who can access the dataset to create. You can set it only to `"me"` for now.
63
 
64
  #### chunk_method, `str`
65
 
66
+ The chunking method of the dataset to create. Available options:
67
+
68
+ - `"naive"`: General (default)
69
+ - `"manual"`: Manual
70
+ - `"qa"`: Q&A
71
+ - `"table"`: Table
72
+ - `"paper"`: Paper
73
+ - `"book"`: Book
74
+ - `"laws"`: Laws
75
+ - `"presentation"`: Presentation
76
+ - `"picture"`: Picture
77
+ - `"one"`: One
78
+ - `"knowledge_graph"`: Knowledge Graph
79
+ - `"email"`: Email
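The option strings above are easy to mistype, and an invalid value only fails at request time. A small client-side guard (hypothetical; not part of the SDK) can reject an unknown `chunk_method` before any call is made:

```python
# Hypothetical helper; the option set mirrors the list documented above.
CHUNK_METHODS = {
    "naive", "manual", "qa", "table", "paper", "book", "laws",
    "presentation", "picture", "one", "knowledge_graph", "email",
}

def validate_chunk_method(method: str) -> str:
    """Return method unchanged, or raise if it is not a documented option."""
    if method not in CHUNK_METHODS:
        raise ValueError(f"unknown chunk_method: {method!r}")
    return method
```

For example, `validate_chunk_method("naive")` passes through, while a typo such as `"navie"` raises a `ValueError`.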
80
 
81
  #### parser_config
82
 
 
84
 
85
  - `chunk_token_count`: Defaults to `128`.
86
  - `layout_recognize`: Defaults to `True`.
87
+ - `delimiter`: Defaults to `"\n!?。;!?"`.
88
  - `task_page_size`: Defaults to `12`.
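For intuition, the `delimiter` value is a string of candidate split characters rather than a single literal token. A rough sketch of that interpretation (an assumption about the server-side chunker, illustrated with Python's `re`):

```python
import re

# Each character in the delimiter string is treated as a potential split
# point (assumed interpretation; actual chunking happens server-side).
delimiters = "\n!?。;!?"
pattern = "[" + re.escape(delimiters) + "]+"

text = "First sentence!Second sentence?Third sentence。"
pieces = [p for p in re.split(pattern, text) if p]
# pieces: ['First sentence', 'Second sentence', 'Third sentence']
```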
89
 
90
  ### Returns
 
98
  from ragflow import RAGFlow
99
 
100
  rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
101
+ dataset = rag_object.create_dataset(name="kb_1")
102
  ```
103
 
104
  ---
 
109
  RAGFlow.delete_datasets(ids: list[str] = None)
110
  ```
111
 
112
+ Deletes specified datasets or all datasets in the system.
113
 
114
  ### Parameters
115
 
116
+ #### ids: `list[str]`
117
 
118
+ The IDs of the datasets to delete. Defaults to `None`. If not specified, all datasets in the system will be deleted.
119
 
120
  ### Returns
121
 
 
125
  ### Examples
126
 
127
  ```python
128
+ rag_object.delete_datasets(ids=["id_1","id_2"])
129
  ```
130
 
131
  ---
 
149
 
150
  #### page: `int`
151
 
152
+ Specifies the page on which the datasets will be displayed. Defaults to `1`.
153
 
154
  #### page_size: `int`
155
 
156
+ The number of datasets on each page. Defaults to `1024`.
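The `page`/`page_size` pair follows the usual 1-based pagination convention. A one-line sketch (illustrative only) of which record indices a given page covers:

```python
def page_bounds(page: int, page_size: int) -> tuple[int, int]:
    """Half-open [start, end) index range covered by a 1-based page."""
    start = (page - 1) * page_size
    return start, start + page_size
```

So the default `page=1, page_size=1024` covers records 0 through 1023, and page 2 starts at record 1024.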
157
 
158
+ #### orderby: `str`
159
 
160
+ The field by which datasets should be sorted. Available options:
161
+
162
+ - `"create_time"` (default)
163
+ - `"update_time"`
164
 
165
  #### desc: `bool`
166
 
 
168
 
169
  #### id: `str`
170
 
171
+ The ID of the dataset to retrieve. Defaults to `None`.
172
 
173
  #### name: `str`
174
 
175
+ The name of the dataset to retrieve. Defaults to `None`.
176
 
177
  ### Returns
178
 
179
+ - Success: A list of `DataSet` objects.
180
  - Failure: `Exception`.
181
 
182
  ### Examples
 
184
  #### List all datasets
185
 
186
  ```python
187
+ for dataset in rag_object.list_datasets():
188
+ print(dataset)
189
  ```
190
 
191
  #### Retrieve a dataset by ID
 
203
  DataSet.update(update_message: dict)
204
  ```
205
 
206
+ Updates configurations for the current dataset.
207
 
208
  ### Parameters
209
 
210
  #### update_message: `dict[str, str|int]`, *Required*
211
 
212
+ A dictionary representing the attributes to update, with the following keys:
213
+
214
  - `"name"`: `str` The name of the dataset to update.
215
+ - `"embedding_model"`: `str` The embedding model name to update.
216
  - Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
217
+ - `"chunk_method"`: `str` The chunking method for the dataset. Available options:
218
  - `"naive"`: General
219
 - `"manual"`: Manual
220
  - `"qa"`: Q&A
 
238
  ```python
239
  from ragflow import RAGFlow
240
 
241
+ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
242
+ dataset = rag_object.list_datasets(name="kb_name")
243
  dataset.update({"embedding_model":"BAAI/bge-zh-v1.5", "chunk_method":"manual"})
244
  ```
245
 
 
261
 
262
  ### Parameters
263
 
264
+ #### document_list: `list[dict]`, *Required*
265
 
266
  A list of dictionaries representing the documents to upload, each containing the following keys:
267
 
 
294
 
295
  #### update_message: `dict[str, str|dict[]]`, *Required*
296
 
297
+ A dictionary representing the attributes to update, with the following keys:
298
+
299
  - `"name"`: `str` The name of the document to update.
300
  - `"parser_config"`: `dict[str, Any]` The parsing configuration for the document:
301
  - `"chunk_token_count"`: Defaults to `128`.
 
326
  ```python
327
  from ragflow import RAGFlow
328
 
329
+ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
330
+ dataset = rag_object.list_datasets(id='id')
331
+ dataset = dataset[0]
332
  doc = dataset.list_documents(id="wdfxb5t547d")
333
  doc = doc[0]
334
  doc.update([{"parser_config": {"chunk_token_count": 256}}, {"chunk_method": "manual"}])
 
342
  Document.download() -> bytes
343
  ```
344
 
345
+ Downloads the current document.
346
 
347
  ### Returns
348
 
 
374
 
375
  ### Parameters
376
 
377
+ #### id: `str`
378
 
379
  The ID of the document to retrieve. Defaults to `None`.
380
 
381
+ #### keywords: `str`
382
 
383
  The keywords to match document titles. Defaults to `None`.
384
 
385
+ #### offset: `int`
386
 
387
+ The starting index for the documents to retrieve. Typically used in conjunction with `limit`. Defaults to `0`.
388
 
389
+ #### limit: `int`
390
 
391
+ The maximum number of documents to retrieve. Defaults to `1024`. A value of `-1` indicates that all documents should be returned.
392
 
393
+ #### orderby: `str`
394
 
395
+ The field by which documents should be sorted. Available options:
396
 
397
+ - `"create_time"` (default)
398
  - `"update_time"`
399
 
400
+ #### desc: `bool`
401
 
402
  Indicates whether the retrieved documents should be sorted in descending order. Defaults to `True`.
403
 
 
408
 
409
  A `Document` object contains the following attributes:
410
 
411
+ - `id`: The document ID. Defaults to `""`.
412
+ - `name`: The document name. Defaults to `""`.
413
+ - `thumbnail`: The thumbnail image of the document. Defaults to `None`.
414
+ - `knowledgebase_id`: The dataset ID associated with the document. Defaults to `None`.
415
+ - `chunk_method`: The chunk method name. Defaults to `"naive"`.
416
+ - `parser_config`: `ParserConfig` Configuration object for the parser. Defaults to `{"pages": [[1, 1000000]]}`.
417
+ - `source_type`: The source type of the document. Defaults to `"local"`.
418
+ - `type`: The type or category of the document. Defaults to `""`.
419
+ - `created_by`: `str` The creator of the document. Defaults to `""`.
420
+ - `size`: `int` The document size in bytes. Defaults to `0`.
421
+ - `token_count`: `int` The number of tokens in the document. Defaults to `0`.
422
+ - `chunk_count`: `int` The number of chunks that the document is split into. Defaults to `0`.
423
+ - `progress`: `float` The current processing progress as a percentage. Defaults to `0.0`.
424
+ - `progress_msg`: `str` A message indicating the current progress status. Defaults to `""`.
425
+ - `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
426
+ - `process_duation`: `float` Duration of the document processing. Defaults to `0.0`.
427
+ - `run`: `str` ?????????????????? Defaults to `"0"`.
428
+ - `status`: `str` ??????????????????? Defaults to `"1"`.
429
 
430
  ### Examples
431
 
 
436
  dataset = rag.create_dataset(name="kb_1")
437
 
438
  filename1 = "~/ragflow.txt"
439
+ blob = open(filename1, "rb").read()
440
+ dataset.upload_documents([{"name":filename1,"blob":blob}])
441
+ for doc in dataset.list_documents(keywords="rag", offset=0, limit=12):
442
+ print(doc)
 
443
  ```
444
 
445
  ---
 
450
  DataSet.delete_documents(ids: list[str] = None)
451
  ```
452
 
453
+ Deletes documents by ID.
454
+
455
+ ### Parameters
456
+
457
+ #### ids: `list[str]`
458
+
459
+ The IDs of the documents to delete. Defaults to `None`. If not specified, all documents in the dataset will be deleted.
460
 
461
  ### Returns
462
 
 
468
  ```python
469
  from ragflow import RAGFlow
470
 
471
+ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
472
+ dataset = rag_object.list_datasets(name="kb_1")
473
+ dataset = dataset[0]
474
+ dataset.delete_documents(ids=["id_1","id_2"])
475
  ```
476
 
477
  ---
 
484
 
485
  ### Parameters
486
 
487
+ #### document_ids: `list[str]`, *Required*
488
 
489
  The IDs of the documents to parse.
490
 
 
496
  ### Examples
497
 
498
  ```python
499
+ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
500
+ dataset = rag_object.create_dataset(name="dataset_name")
 
501
  documents = [
502
  {'name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
503
  {'name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
504
  {'name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
505
  ]
506
+ dataset.upload_documents(documents)
507
+ documents = dataset.list_documents(keywords="test")
508
+ ids = []
509
  for document in documents:
510
  ids.append(document.id)
511
+ dataset.async_parse_documents(ids)
512
+ print("Async bulk parsing initiated.")
 
 
513
  ```
514
 
515
  ---
 
522
 
523
  ### Parameters
524
 
525
+ #### document_ids: `list[str]`, *Required*
526
 
527
+ The IDs of the documents for which parsing should be stopped.
528
 
529
  ### Returns
530
 
 
534
  ### Examples
535
 
536
  ```python
537
+ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
538
+ dataset = rag_object.create_dataset(name="dataset_name")
 
539
  documents = [
540
  {'name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
541
  {'name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
542
  {'name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
543
  ]
544
+ dataset.upload_documents(documents)
545
+ documents = dataset.list_documents(keywords="test")
546
+ ids = []
547
  for document in documents:
548
  ids.append(document.id)
549
+ dataset.async_parse_documents(ids)
550
+ print("Async bulk parsing initiated.")
551
+ dataset.async_cancel_parse_documents(ids)
552
+ print("Async bulk parsing cancelled.")
553
  ```
554
 
555
  ---
 
560
  Document.list_chunks(keywords: str = None, offset: int = 0, limit: int = -1, id : str = None) -> list[Chunk]
561
  ```
562
 
563
+ Retrieves a list of document chunks.
564
+
565
  ### Parameters
566
 
567
+ #### keywords: `str`
568
 
569
 The keywords used to match chunk content. Defaults to `None`.
570
 
571
+ #### offset: `int`
572
 
573
+ The starting index for the chunks to retrieve. Defaults to `0`.
574
 
575
 #### limit: `int`
576
 
577
+ The maximum number of chunks to retrieve. Defaults to `-1`, which returns all chunks.
578
 
579
  #### id
580
 
 
582
 
583
  ### Returns
584
 
585
+ - Success: A list of `Chunk` objects.
586
+ - Failure: `Exception`.
587
 
588
  ### Examples
589
 
590
  ```python
591
  from ragflow import RAGFlow
592
 
593
+ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
594
+ dataset = rag_object.list_datasets(id="123")
595
+ dataset = dataset[0]
596
+ dataset.async_parse_documents(["wdfxb5t547d"])
597
+ doc = dataset.list_documents(id="wdfxb5t547d")[0]
+ for chunk in doc.list_chunks(keywords="rag", offset=0, limit=12):
598
+     print(chunk)
599
  ```
600
 
601
  ## Add chunk
 
608
 
609
  #### content: *Required*
610
 
611
+ The text content of the chunk.
612
 
613
  #### important_keywords :`list[str]`
614
 
 
639
  Document.delete_chunks(chunk_ids: list[str])
640
  ```
641
 
642
+ Deletes chunks by ID.
643
+
644
  ### Parameters
645
 
646
+ #### chunk_ids: `list[str]`
647
 
648
+ The IDs of the chunks to delete. Defaults to `None`. If not specified, all chunks of the current document will be deleted.
649
 
650
  ### Returns
651
 
 
674
  Chunk.update(update_message: dict)
675
  ```
676
 
677
+ Updates content or configurations for the current chunk.
678
 
679
  ### Parameters
680
 
681
  #### update_message: `dict[str, str|list[str]|int]` *Required*
682
 
683
+ A dictionary representing the attributes to update, with the following keys:
684
+
685
  - `"content"`: `str` Content of the chunk.
686
  - `"important_keywords"`: `list[str]` A list of key terms to attach to the chunk.
687
+ - `"available"`: `int` The chunk's availability status in the dataset. Value options:
688
  - `0`: Unavailable
689
  - `1`: Available
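Because `update_message` is a plain dict, a mistyped key or an out-of-range availability value fails only at request time. A hypothetical client-side check against the documented keys (not part of the SDK):

```python
# Documented keys and value constraints for Chunk.update, per the list above.
_CHUNK_UPDATE_KEYS = {"content", "important_keywords", "available"}

def make_chunk_update(**fields):
    """Validate and return an update_message dict for Chunk.update."""
    unknown = set(fields) - _CHUNK_UPDATE_KEYS
    if unknown:
        raise ValueError(f"unsupported keys: {sorted(unknown)}")
    if fields.get("available") not in (None, 0, 1):
        raise ValueError("available must be 0 (unavailable) or 1 (available)")
    return fields
```

For example, `make_chunk_update(content="x", available=1)` returns the dict unchanged, while `make_chunk_update(availabe=1)` raises on the misspelled key.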
690
 
 
731
 
732
  #### offset: `int`
733
 
734
+ The starting index for the chunks to retrieve. Defaults to `0`.
735
 
736
  #### limit: `int`
737
 
738
+ The maximum number of chunks to retrieve. Defaults to `6`.
739
 
740
  #### Similarity_threshold: `float`
741
 
 
798
  Chat Assistant Management
799
  :::
800
 
801
+ ---
802
+
803
  ## Create chat assistant
804
 
805
  ```python
 
892
  Chat.update(update_message: dict)
893
  ```
894
 
895
+ Updates configurations for the current chat assistant.
896
 
897
  ### Parameters
898
 
899
+ #### update_message: `dict[str, str|list[str]|dict[]]`, *Required*
900
+
901
+ A dictionary representing the attributes to update, with the following keys:
902
 
903
  - `"name"`: `str` The name of the chat assistant to update.
904
  - `"avatar"`: `str` Base64 encoding of the avatar. Defaults to `""`
905
+ - `"knowledgebases"`: `list[str]` The datasets to update.
906
  - `"llm"`: `dict` The LLM settings:
907
  - `"model_name"`, `str` The chat model name.
908
  - `"temperature"`, `float` Controls the randomness of the model's predictions.
 
944
 
945
  ## Delete chats
946
 
 
 
947
  ```python
948
  RAGFlow.delete_chats(ids: list[str] = None)
949
  ```
950
 
951
+ Deletes chat assistants by ID.
952
+
953
  ### Parameters
954
 
955
+ #### ids: `list[str]`
956
 
957
+ The IDs of the chat assistants to delete. Defaults to `None`. If not specified, all chat assistants in the system will be deleted.
958
 
959
  ### Returns
960
 
 
991
 
992
  #### page
993
 
994
+ Specifies the page on which the chat assistants will be displayed. Defaults to `1`.
995
 
996
  #### page_size
997
 
998
+ The number of chat assistants on each page. Defaults to `1024`.
999
 
1000
  #### order_by
1001
 
 
1023
  ```python
1024
  from ragflow import RAGFlow
1025
 
1026
+ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
1027
+ for assistant in rag_object.list_chats():
1028
  print(assistant)
1029
  ```
1030
 
 
1034
  Chat-session APIs
1035
  :::
1036
 
1037
+ ---
1038
+
1039
  ## Create session
1040
 
1041
  ```python
 
1076
  Session.update(update_message: dict)
1077
  ```
1078
 
1079
+ Updates the current session name.
1080
 
1081
  ### Parameters
1082
 
1083
  #### update_message: `dict[str, Any]`, *Required*
1084
 
1085
+ A dictionary representing the attributes to update, with only one key:
1086
+
1087
  - `"name"`: `str` The name of the session to update.
1088
 
1089
  ### Returns
 
1211
 
1212
  #### page
1213
 
1214
+ Specifies the page on which the sessions will be displayed. Defaults to `1`.
1215
 
1216
  #### page_size
1217
 
1218
+ The number of sessions on each page. Defaults to `1024`.
1219
 
1220
  #### orderby
1221
 
1222
+ The field by which sessions should be sorted. Available options:
1223
 
1224
+ - `"create_time"` (default)
1225
  - `"update_time"`
1226
 
1227
  #### desc
 
1246
  ```python
1247
  from ragflow import RAGFlow
1248
 
1249
+ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
1250
+ assistant = rag_object.list_chats(name="Miss R")
1251
  assistant = assistant[0]
1252
  for session in assistant.list_sessions():
1253
  print(session)
 
1261
  Chat.delete_sessions(ids:list[str] = None)
1262
  ```
1263
 
1264
+ Deletes sessions by ID.
1265
 
1266
  ### Parameters
1267
 
1268
+ #### ids: `list[str]`
1269
 
1270
+ The IDs of the sessions to delete. Defaults to `None`. If not specified, all sessions associated with the current chat assistant will be deleted.
1271
 
1272
  ### Returns
1273