comet-team committed
Commit 83d3fa2 (0 parents)
Duplicate from comet-team/kangas-direct
- .gitattributes +35 -0
- Dockerfile +12 -0
- README.md +107 -0
- coco-500.datagrid +3 -0
- cppe-5-test.datagrid +3 -0
- requirements.txt +2 -0
.gitattributes
ADDED
@@ -0,0 +1,35 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+*.datagrid filter=lfs diff=lfs merge=lfs -text
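Each rule in this `.gitattributes` routes matching paths through Git LFS (`filter=lfs diff=lfs merge=lfs`) and unsets the `text` attribute (`-text`). As a rough illustration of which files in this commit those rules catch, the sketch below checks basenames against a few of the simple `*.ext` patterns using Python's `fnmatch`. This is only an approximation: gitattributes matching has its own rules (for example for `saved_model/**/*`), and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A subset of the simple basename patterns from the .gitattributes file.
LFS_PATTERNS = ["*.zip", "*.parquet", "*.safetensors", "*.datagrid"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: does the file's basename match any listed pattern?"""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in LFS_PATTERNS)


# Files added by this commit.
for path in ["coco-500.datagrid", "cppe-5-test.datagrid", "README.md", "Dockerfile"]:
    print(path, is_lfs_tracked(path))
```

For an authoritative answer on a real checkout, `git check-attr filter -- <path>` reports the attribute Git actually applies.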
Dockerfile
ADDED
@@ -0,0 +1,12 @@
+FROM python:3.9
+
+WORKDIR /code
+
+COPY ./requirements.txt /code/requirements.txt
+
+RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
+RUN kangas upgrade /code/*.datagrid
+
+COPY . .
+
+CMD kangas server cppe-5-test.datagrid --frontend-port=7860
README.md
ADDED
@@ -0,0 +1,107 @@
+---
+title: Kangas Datagrid
+emoji: 👁
+colorFrom: indigo
+colorTo: red
+sdk: docker
+pinned: false
+license: apache-2.0
+duplicated_from: comet-team/kangas-direct
+---
+
+# Kangas: Explore Multimedia Datasets at Scale :kangaroo:
+
+Kangas is a tool for exploring, analyzing, and visualizing large-scale multimedia data. It provides a straightforward Python API
+for logging large tables of data, along with an intuitive visual interface for performing complex queries against your dataset.
+
+The key features of Kangas include:
+
+- **Scalability**. Kangas DataGrid, the fundamental class for representing datasets, can easily store millions of rows of data.
+- **Performance**. Group, sort, and filter across millions of data points in seconds with a simple, fast UI.
+- **Interoperability**. Any data, any environment. Kangas can run in a notebook or as a standalone app, both locally and remotely.
+- **Integrated computer vision support**. Visualize and filter bounding boxes, labels, and metadata without any extra setup.
+
+You can access a live demo of Kangas at <a href="https://kangas.comet.com?datagrid=/data/coco-500.datagrid">kangas.comet.com</a>.
+
+## Getting Started
+
+Kangas is available as a Python library via pip:
+```
+pip install kangas
+```
+
+Once installed, there are many ways to load or create a DataGrid.
+
+Without writing any code, you can even download a DataGrid and begin exploring the data. At the console:
+
+```
+kangas server https://github.com/caleb-kaiser/kangas_examples/raw/master/coco-500.datagrid.zip
+```
+
+That's it!
+
+In the next example, we load a publicly available DataGrid file, but the Kangas API also provides methods for ingesting CSVs and Pandas DataFrames, and for manually constructing a new DataGrid:
+
+```python
+import kangas as kg
+
+# Load an existing DataGrid
+dg = kg.read_datagrid("https://github.com/caleb-kaiser/kangas_examples/raw/master/coco-500.datagrid.zip")
+```
+
+After your DataGrid is initialized, you can render it within the Kangas Viewer directly from Python:
+
+```python
+dg.show()
+```
+<img width="1789" alt="image" src="https://user-images.githubusercontent.com/42076840/197875668-5519d504-2209-472f-952e-ed09554ecb7a.png">
+
+From the Kangas Viewer, you can group, sort, and filter data. In addition, Kangas will do its best to parse any metadata attached to your assets. For example, if you're using the COCO-500 DataGrid from the quickstart above, Kangas will automatically parse labels and scores for each image:
+
+<img src="https://github.com/caleb-kaiser/kangas_examples/blob/master/Oct-25-2022%2016-43-56.gif">
+
+And voilà! You're now up and running with Kangas.
+
+### Pandas DataFrames
+
+Kangas can also read Pandas DataFrame objects directly:
+
+```python
+import kangas as kg
+import pandas as pd
+
+df = pd.DataFrame({"hidden_layer_size": [8, 16, 64], "loss": [0.97, 0.53, 0.12]})
+dg = kg.read_dataframe(df)
+```
+### HuggingFace Datasets
+
+HuggingFace's datasets can also be loaded into DataGrid directly because they use
+rows of dictionaries, and images are represented by PIL images. DataGrid will
+automatically convert PIL images into a [Kangas Image](https://github.com/comet-ml/kangas/wiki/Image#image):
+
+```python
+import kangas as kg
+from datasets import load_dataset
+
+dataset = load_dataset("beans", split="train")
+dg = kg.DataGrid(dataset)
+```
+
+### Parquet files
+
+> **Note**: You will need to have pyarrow installed to read parquet files.
+
+```python
+import kangas as kg
+
+dg = kg.read_parquet("https://github.com/Teradata/kylo/raw/master/samples/sample-data/parquet/userdata5.parquet")
+```
+
+If you'd like to explore further, take a look at our example notebooks below:
+
+## Documentation
+
+1. <a href="https://github.com/comet-ml/kangas/wiki">Documentation Homepage</a>
+2. <a href="https://github.com/comet-ml/kangas/blob/main/notebooks/DataGrid-Getting%20Started.ipynb">Quickstart Notebook</a> <a href="https://colab.research.google.com/github/comet-ml/kangas/blob/main/notebooks/DataGrid-Getting%20Started.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg"></a>
+3. <a href="https://github.com/comet-ml/kangas/blob/main/notebooks/Integrations.ipynb">Integrations Notebook</a> <a href="https://colab.research.google.com/github/comet-ml/kangas/blob/main/notebooks/Integrations.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg"></a>
+4. <a href="https://github.com/comet-ml/kangas/blob/main/examples/mnist_script.py">MNIST Classification Example</a>
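The README mentions that Kangas can ingest CSVs and that a DataGrid can be constructed manually, but only shows `read_datagrid`, `read_dataframe`, and `read_parquet`. The following is a minimal sketch of the other two paths, assuming `kg.read_csv`, the `kg.DataGrid(name=..., columns=...)` constructor, and `DataGrid.append` behave as described in the Kangas documentation. The helper names, the `demo.csv` file, and the sample rows (borrowed from the Pandas example) are illustrative only. Kangas is imported lazily inside the functions so the CSV-writing part runs on its own.

```python
import csv
import os
import tempfile

# Illustrative data; the column names mirror the Pandas example in the README.
COLUMNS = ["hidden_layer_size", "loss"]
ROWS = [[8, 0.97], [16, 0.53], [64, 0.12]]


def write_demo_csv(path):
    """Write the illustrative rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(ROWS)


def datagrid_from_csv(path):
    """Ingest a CSV file into a DataGrid (requires kangas to be installed)."""
    import kangas as kg  # lazy import: only needed when this function is called

    return kg.read_csv(path)


def datagrid_from_rows():
    """Construct a DataGrid manually, appending one row at a time."""
    import kangas as kg  # lazy import: only needed when this function is called

    dg = kg.DataGrid(name="demo", columns=COLUMNS)
    for row in ROWS:
        dg.append(row)
    return dg


# Create the CSV file that datagrid_from_csv() would read.
csv_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
write_demo_csv(csv_path)
```

With Kangas installed, either `datagrid_from_csv(csv_path)` or `datagrid_from_rows()` should return a DataGrid on which `show()` can be called, as in the README.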
coco-500.datagrid
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0a1cb31af558c1e80e54599685f70770e7d9adc0bba9387e06d460710c3c6d98
+size 93057024
cppe-5-test.datagrid
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e9252eac21a9bda4a8c8b7cb4e162e08e2b5bcc55dbb58d6b18b71375523077
+size 25305088
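The two `.datagrid` entries are Git LFS pointer files rather than the DataGrids themselves: each records the pointer spec version, the SHA-256 object ID of the real file, and its size in bytes. As a small sketch of how such a pointer can be read, the snippet below parses the `cppe-5-test.datagrid` pointer shown in this commit. The function name `parse_lfs_pointer` and the returned dictionary keys are hypothetical.

```python
# Pointer contents copied from the cppe-5-test.datagrid diff in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5e9252eac21a9bda4a8c8b7cb4e162e08e2b5bcc55dbb58d6b18b71375523077
size 25305088
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into a small record."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
# Report the size of the real file in MiB.
print(f"{pointer['size_bytes'] / 1024**2:.1f} MiB")
```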
requirements.txt
ADDED
@@ -0,0 +1,2 @@
+kangas>=2.3.5
+datasets