docs: base README.md introduction to the space

#1
by laverdes - opened
Files changed (1)
  1. README.md +21 -3
README.md CHANGED
@@ -1,10 +1,28 @@
  ---
  title: README
- emoji: 🐠
  colorFrom: yellow
  colorTo: indigo
- sdk: static
  pinned: false
  ---
 
- Edit this `README.md` markdown file to author your organization card 🔥

  ---
  title: README
+ emoji: 💠
  colorFrom: yellow
  colorTo: indigo
+ sdk: streamlit
  pinned: false
  ---
 
+ Welcome to our space! 🎊
+
+ The [Unstructured.io](www.unstructured.io) Team provides libraries with open-source components for pre-processing text documents
+ such as **PDFs**, **HTML** and **Word** Documents. These components are packaged as *bricks* 🧱, which provide
+ users the building blocks they need to build pipelines targeted at the documents they care
+ about. Bricks in the library fall into three categories:
+
+ - 🧩 ***Partitioning bricks*** that break raw documents down into standard, structured
+ elements.
+ - 🧹 ***Cleaning bricks*** that remove unwanted text from documents, such as boilerplate and
+ sentence
+ fragments.
+ - 🎭 ***Staging bricks*** that format data for downstream tasks, such as ML inference
+ and data labeling.
+
+ In this space we explore different settings of deep-learning models fine-tuned with several datasets containing a
+ specific document type and corresponding annotations.
+
+ Main GitHub repository link: [here](https://github.com/Unstructured-IO/unstructured)
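
The three brick categories named in the README text above (partitioning, cleaning, staging) can be pictured as a small pipeline. The sketch below is only an illustration using the Python standard library: `Element`, `partition_text`, `clean_element`, and `stage_as_json` are hypothetical names invented for this example and are not the `unstructured` library's API, and the title heuristic is a toy assumption.

```python
import json
import re
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Element:
    """Hypothetical stand-in for a structured document element."""
    category: str  # "Title" or "NarrativeText" in this toy example
    text: str


def partition_text(raw: str) -> List[Element]:
    """Partitioning step: split raw text on blank lines into typed elements."""
    elements = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        block = block.strip()
        if not block:
            continue
        # Toy heuristic (an assumption): a short block with no final period is a title.
        is_title = len(block.split()) <= 6 and not block.endswith(".")
        elements.append(Element("Title" if is_title else "NarrativeText", block))
    return elements


def clean_element(element: Element) -> Element:
    """Cleaning step: collapse whitespace runs and drop a leading bullet marker."""
    text = re.sub(r"\s+", " ", element.text).strip()
    text = re.sub(r"^[\-\*•]\s*", "", text)
    return Element(element.category, text)


def stage_as_json(elements: List[Element]) -> str:
    """Staging step: serialize elements for a downstream labeling or inference tool."""
    return json.dumps([asdict(e) for e in elements], ensure_ascii=False)


raw_document = """Quarterly Report

Revenue   grew in the
third quarter.

- Costs stayed flat."""

# Partition first, then clean each element, then stage the result.
elements = [clean_element(e) for e in partition_text(raw_document)]
staged = stage_as_json(elements)
print(staged)
```

Keeping each stage as a separate function mirrors the "building blocks" idea in the README: a stage can be swapped or skipped without touching the others.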