PierreBrunelle committed on
Commit bc20728 · verified · 1 Parent(s): f720f89

Update README.md

Files changed (1):
  1. README.md +1 -186
README.md CHANGED
@@ -13,21 +13,6 @@ thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/669ee023c7e62283cb5c51e0/MpLp6QMlriY25tezXwOYr.png
  ---
 
- <div align="center">
- <img src="https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/source/data/pixeltable-logo-large.png" alt="Pixeltable" width="50%" />
- <br></br>
-
- [![License](https://img.shields.io/badge/License-Apache%202.0-darkblue.svg)](https://opensource.org/licenses/Apache-2.0)
- ![PyPI - Python Version](https://img.shields.io/pypi/pyversions/pixeltable?logo=python&logoColor=white)
- ![Platform Support](https://img.shields.io/badge/platform-Linux%20%7C%20macOS%20%7C%20Windows-8A2BE2)
- <br>
- [![tests status](https://github.com/pixeltable/pixeltable/actions/workflows/pytest.yml/badge.svg)](https://github.com/pixeltable/pixeltable/actions/workflows/pytest.yml)
- [![tests status](https://github.com/pixeltable/pixeltable/actions/workflows/nightly.yml/badge.svg)](https://github.com/pixeltable/pixeltable/actions/workflows/nightly.yml)
- [![PyPI Package](https://img.shields.io/pypi/v/pixeltable?color=darkorange)](https://pypi.org/project/pixeltable/)
-
- [Installation](https://pixeltable.github.io/pixeltable/getting-started/) | [Documentation](https://pixeltable.readme.io/) | [API Reference](https://pixeltable.github.io/pixeltable/) | [Code Samples](https://github.com/pixeltable/pixeltable?tab=readme-ov-file#-code-samples) | [Computer Vision](https://docs.pixeltable.com/docs/object-detection-in-videos) | [LLM](https://docs.pixeltable.com/docs/document-indexing-and-rag)
- </div>
-
  Pixeltable is a Python library providing a declarative interface for multimodal data (text, images, audio, video). It features built-in versioning, lineage tracking, and incremental updates, enabling users to **store**, **transform**, **index**, and **iterate** on data for their ML workflows.
 
  Data transformations, model inference, and custom logic are embedded as **computed columns**.
@@ -44,164 +29,6 @@ pip install pixeltable
  ```
  **Pixeltable is persistent. Unlike in-memory Python libraries such as Pandas, Pixeltable is a database.**
 
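To illustrate the persistence claim: a table created in one session can be reopened by name in a later one. The sketch below is hypothetical (the `demo.notes` table is illustrative); it assumes `pxt.get_table`, which this README's documentation links describe.

```python
import pixeltable as pxt

# Hypothetical sketch: tables live on disk, not in process memory.
t = pxt.create_table('demo.notes', {'note': pxt.StringType()})
t.insert([{'note': 'hello'}])

# ... later, possibly in a fresh Python process:
t2 = pxt.get_table('demo.notes')   # reopen by name
t2.collect()                       # the row inserted above is still there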
- ## 💡 Getting Started
- Learn how to create tables, populate them with data, and enhance them with built-in or user-defined transformations.
-
- | Topic | Notebook | Topic | Notebook |
- |:----------|:-----------------|:-------------------------|:---------------------------------:|
- | 10-Minute Tour of Pixeltable | <a target="_blank" href="https://colab.research.google.com/github/pixeltable/pixeltable/blob/release/docs/release/tutorials/pixeltable-basics.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a> | Tables and Data Operations | <a target="_blank" href="https://colab.research.google.com/github/pixeltable/pixeltable/blob/release/docs/release/fundamentals/tables-and-data-operations.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>
- | User-Defined Functions (UDFs) | <a target="_blank" href="https://colab.research.google.com/github/pixeltable/pixeltable/blob/release/docs/release/howto/udfs-in-pixeltable.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a> | Object Detection Models | <a target="_blank" href="https://colab.research.google.com/github/pixeltable/pixeltable/blob/release/docs/release/tutorials/object-detection-in-videos.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>
- | Experimenting with Chunking (RAG) | <a target="_blank" href="https://colab.research.google.com/github/pixeltable/pixeltable/blob/release/docs/release/tutorials/rag-operations.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a> | Working with External Files | <a target="_blank" href="https://colab.research.google.com/github/pixeltable/pixeltable/blob/release/docs/release/howto/working-with-external-files.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>
- | Integrating with Label Studio | <a target="_blank" href="https://pixeltable.readme.io/docs/label-studio"> <img src="https://img.shields.io/badge/Docs-Label Studio-blue" alt="Visit our documentation"/></a> | Audio/Video Transcript Indexing | <a target="_blank" href="https://colab.research.google.com/github/pixeltable/pixeltable/blob/release/docs/release/tutorials/audio-transcriptions.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> </a>
-
- ## 🧱 Code Samples
-
- ### Import media data into Pixeltable (videos, images, audio...)
- ```python
- import pixeltable as pxt
-
- v = pxt.create_table('external_data.videos', {'video': pxt.VideoType()})
-
- prefix = 's3://multimedia-commons/'
- paths = [
-     'data/videos/mp4/ffe/ffb/ffeffbef41bbc269810b2a1a888de.mp4',
-     'data/videos/mp4/ffe/feb/ffefebb41485539f964760e6115fbc44.mp4',
-     'data/videos/mp4/ffe/f73/ffef7384d698b5f70d411c696247169.mp4'
- ]
- v.insert({'video': prefix + p} for p in paths)
- ```
- Learn how to [work with data in Pixeltable](https://pixeltable.readme.io/docs/working-with-external-files).
-
- ### Object detection in images using DETR model
- ```python
- import pixeltable as pxt
- from pixeltable.functions import huggingface
-
- # Create a table to store data persistently
- t = pxt.create_table('image', {'image': pxt.ImageType()})
-
- # Insert some images
- prefix = 'https://upload.wikimedia.org/wikipedia/commons'
- paths = [
-     '/1/15/Cat_August_2010-4.jpg',
-     '/e/e1/Example_of_a_Dog.jpg',
-     '/thumb/b/bf/Bird_Diversity_2013.png/300px-Bird_Diversity_2013.png'
- ]
- t.insert({'image': prefix + p} for p in paths)
-
- # Add a computed column for object detection
- t['classification'] = huggingface.detr_for_object_detection(
-     t.image, model_id='facebook/detr-resnet-50'
- )
-
- # Retrieve the rows where cats have been identified
- t.select(animal=t.image,
-          classification=t.classification.label_text[0]) \
-  .where(t.classification.label_text[0] == 'cat').head()
- ```
- Learn about computed columns and object detection: [Comparing object detection models](https://pixeltable.readme.io/docs/object-detection-in-videos).
-
- ### Extend Pixeltable's capabilities with user-defined functions
- ```python
- @pxt.udf
- def draw_boxes(img: PIL.Image.Image, boxes: list[list[float]]) -> PIL.Image.Image:
-     result = img.copy()  # Create a copy of `img`
-     d = PIL.ImageDraw.Draw(result)
-     for box in boxes:
-         d.rectangle(box, width=3)  # Draw bounding box rectangles on the copied image
-     return result
- ```
- Learn more about user-defined functions: [UDFs in Pixeltable](https://pixeltable.readme.io/docs/user-defined-functions-udfs).
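The body of a `@pxt.udf`-decorated function is ordinary Python, so it can be sanity-checked outside Pixeltable. A quick standalone sketch with a synthetic image (the decorator is dropped here; the box coordinates are illustrative):

```python
import PIL.Image
import PIL.ImageDraw

def draw_boxes(img: PIL.Image.Image, boxes: list[list[float]]) -> PIL.Image.Image:
    result = img.copy()  # work on a copy so the input image is untouched
    d = PIL.ImageDraw.Draw(result)
    for box in boxes:
        d.rectangle(box, width=3)  # box = [x_min, y_min, x_max, y_max]
    return result

img = PIL.Image.new('RGB', (100, 100), 'white')
out = draw_boxes(img, [[10, 10, 50, 50]])  # same size, with one box drawn
```

Because the function returns a new image, the original rows' media are never mutated.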
-
- ### Automate data operations with views, e.g., split documents into chunks
- ```python
- # In this example, the view is defined by iteration over the chunks of a DocumentSplitter
- chunks_table = pxt.create_view(
-     'rag_demo.chunks',
-     documents_table,
-     iterator=DocumentSplitter.create(
-         document=documents_table.document,
-         separators='token_limit', limit=300)
- )
- ```
- Learn how to leverage views to build your [RAG workflow](https://pixeltable.readme.io/docs/document-indexing-and-rag).
-
- ### Evaluate model performance
- ```python
- # The computation of the mAP metric can become a query over the evaluation output
- frames_view.select(mean_ap(frames_view.eval_yolox_tiny), mean_ap(frames_view.eval_yolox_m)).show()
- ```
- Learn how to leverage Pixeltable for [Model analytics](https://pixeltable.readme.io/docs/object-detection-in-videos).
-
- ### Working with inference services
- ```python
- chat_table = pxt.create_table('together_demo.chat', {'input': pxt.StringType()})
-
- # The chat-completions API expects JSON-formatted input:
- messages = [{'role': 'user', 'content': chat_table.input}]
-
- # This example shows how additional parameters from the Together API can be used in Pixeltable
- chat_table['output'] = chat_completions(
-     messages=messages,
-     model='mistralai/Mixtral-8x7B-Instruct-v0.1',
-     max_tokens=300,
-     stop=['\n'],
-     temperature=0.7,
-     top_p=0.9,
-     top_k=40,
-     repetition_penalty=1.1,
-     logprobs=1,
-     echo=True
- )
- chat_table['response'] = chat_table.output.choices[0].message.content
-
- # Start a conversation
- chat_table.insert([
-     {'input': 'How many species of felids have been classified?'},
-     {'input': 'Can you make me a coffee?'}
- ])
- chat_table.select(chat_table.input, chat_table.response).head()
- ```
- Learn how to interact with inference services such as [Together AI](https://pixeltable.readme.io/docs/together-ai) in Pixeltable.
-
- ### Text and image similarity search on video frames with embedding indexes
- ```python
- import pixeltable as pxt
- from pixeltable.functions.huggingface import clip_image, clip_text
- from pixeltable.iterators import FrameIterator
- import PIL.Image
-
- video_table = pxt.create_table('videos', {'video': pxt.VideoType()})
-
- video_table.insert([{'video': '/video.mp4'}])
-
- frames_view = pxt.create_view(
-     'frames', video_table, iterator=FrameIterator.create(video=video_table.video))
-
- @pxt.expr_udf
- def embed_image(img: PIL.Image.Image):
-     return clip_image(img, model_id='openai/clip-vit-base-patch32')
-
- @pxt.expr_udf
- def str_embed(s: str):
-     return clip_text(s, model_id='openai/clip-vit-base-patch32')
-
- # Create an index on the 'frame' column that allows text and image search
- frames_view.add_embedding_index('frame', string_embed=str_embed, image_embed=embed_image)
-
- # Retrieve frames similar to a sample image
- sample_image = '/image.jpeg'
- sim = frames_view.frame.similarity(sample_image)
- frames_view.order_by(sim, asc=False).limit(5).select(frames_view.frame, sim=sim).collect()
-
- # Retrieve frames matching a text query
- sample_text = 'red truck'
- sim = frames_view.frame.similarity(sample_text)
- frames_view.order_by(sim, asc=False).limit(5).select(frames_view.frame, sim=sim).collect()
- ```
- Learn how to work with [Embedding and Vector Indexes](https://docs.pixeltable.com/docs/embedding-vector-indexes).
-
  ## ❓ FAQ
 
  ### What is Pixeltable?
@@ -236,16 +63,4 @@ Today's solutions for AI app development require extensive custom coding and inf
  ### What is Pixeltable not providing?
 
  - Pixeltable is not a low-code, prescriptive AI solution. We empower you to use the best frameworks and techniques for your specific needs.
- - We do not aim to replace your existing AI toolkit, but rather enhance it by streamlining the underlying data infrastructure and orchestration.
-
- > [!TIP]
- > Check out the [Integrations](https://pixeltable.readme.io/docs/working-with-openai) section, and feel free to submit a request for additional ones.
-
- ## 🐛 Contributions & Feedback
-
- Are you experiencing issues or bugs with Pixeltable? File an [Issue](https://github.com/pixeltable/pixeltable/issues).
- <br>Do you want to contribute? Feel free to open a [PR](https://github.com/pixeltable/pixeltable/pulls).
-
- ## :classical_building: License
-
- This library is licensed under the Apache 2.0 License.
 
+ - We do not aim to replace your existing AI toolkit, but rather enhance it by streamlining the underlying data infrastructure and orchestration.