	---
	title: '📊 add'
	---

	The `add()` method loads data from different data sources into a RAG pipeline. Its parameters are described below:

	### Parameters

	<ParamField path="source" type="str">
	The data to embed. It can be a URL, a local file, or raw content, depending on the data type. You can find the full list of supported data sources [here](/components/data-sources/overview).
	</ParamField>
	<ParamField path="data_type" type="str" optional>
	Type of the data source. It can be detected automatically, but you can set it explicitly to force the source to be loaded as a specific data type.
	</ParamField>
	<ParamField path="metadata" type="dict" optional>
	Any metadata that you want to store with the data source. Metadata lets you apply metadata filtering on top of semantic search, which can yield faster searches and better results.
	</ParamField>
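
To illustrate how the three parameters fit together, here is a minimal sketch that forwards a list of source descriptions to `add()`. The helper name `add_sources` and the dictionary layout are hypothetical and not part of embedchain. The sketch only assumes that `app` exposes `add(source, data_type=..., metadata=...)` as described above.

```python
from typing import Any, Dict, List


def add_sources(app: Any, sources: List[Dict[str, Any]]) -> List[Any]:
    """Call app.add() once per source description (hypothetical helper).

    Each entry needs a "source" key and may carry optional "data_type"
    and "metadata" keys, mirroring the parameters documented above.
    """
    results = []
    for entry in sources:
        # Pass only the optional arguments that were actually provided,
        # so automatic data type detection still applies when omitted.
        kwargs = {}
        if entry.get("data_type") is not None:
            kwargs["data_type"] = entry["data_type"]
        if entry.get("metadata") is not None:
            kwargs["metadata"] = entry["metadata"]
        results.append(app.add(entry["source"], **kwargs))
    return results


# Possible usage with a real app (assumes embedchain is installed and configured):
#
#   from embedchain import App
#   app = App()
#   add_sources(app, [
#       {"source": "https://www.forbes.com/profile/elon-musk",
#        "metadata": {"category": "profile"}},  # hypothetical metadata
#       {"source": "https://python.langchain.com/sitemap.xml",
#        "data_type": "sitemap"},
#   ])
```

Omitting optional keys, rather than passing `None`, keeps each call equivalent to the plain `app.add(source)` form whenever no data type or metadata is supplied.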

	## Usage

	### Load data from webpage

	```python Code example
	from embedchain import App

	app = App()
	app.add("https://www.forbes.com/profile/elon-musk")
	# Inserting batches in chromadb: 100%|███████████████| 1/1 [00:00<00:00, 1.19it/s]
	# Successfully saved https://www.forbes.com/profile/elon-musk (DataType.WEB_PAGE). New chunks count: 4
	```

	### Load data from sitemap

	```python Code example
	from embedchain import App

	app = App()
	app.add("https://python.langchain.com/sitemap.xml", data_type="sitemap")
	# Loading pages: 100%|█████████████| 1108/1108 [00:47<00:00, 23.17it/s]
	# Inserting batches in chromadb: 100%|█████████| 111/111 [04:41<00:00, 2.54s/it]
	# Successfully saved https://python.langchain.com/sitemap.xml (DataType.SITEMAP). New chunks count: 11024
	```

	You can find the complete list of supported data sources [here](/components/data-sources/overview).