CalebCometML comet-team committed on
Commit dbac830 (0 parents)

Duplicate from comet-team/kangas-direct

Co-authored-by: Comet ML Team <comet-team@users.noreply.huggingface.co>

Files changed (6)
  1. .gitattributes +35 -0
  2. Dockerfile +12 -0
  3. README.md +107 -0
  4. coco-500.datagrid +3 -0
  5. cppe-5-test.datagrid +3 -0
  6. requirements.txt +2 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.datagrid filter=lfs diff=lfs merge=lfs -text
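As a rough illustration of what these patterns mean for this commit, the sketch below approximates gitattributes matching with Python's `fnmatch` applied to each file's base name. This is a simplification and not how Git itself evaluates the file: real gitattributes matching has extra rules (for example for path patterns such as `saved_model/**/*`), only a subset of the patterns is included, and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A subset of the base-name patterns from the .gitattributes diff.
LFS_PATTERNS = ["*.7z", "*.bin", "*.parquet", "*.zip", "*tfevents*", "*.datagrid"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: does the file's base name match any LFS pattern?"""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in patterns)


# The six files changed in this commit.
changed_files = [
    ".gitattributes",
    "Dockerfile",
    "README.md",
    "coco-500.datagrid",
    "cppe-5-test.datagrid",
    "requirements.txt",
]

# Only the two .datagrid files match, via the final `*.datagrid` rule.
lfs_files = [f for f in changed_files if is_lfs_tracked(f)]
print(lfs_files)
```

This is consistent with the rest of the diff, where the two `.datagrid` files appear as three-line LFS pointers and not as binary content.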
Dockerfile ADDED
@@ -0,0 +1,12 @@
+ FROM python:3.9
+
+ WORKDIR /code
+
+ COPY ./requirements.txt /code/requirements.txt
+
+ RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
+ RUN kangas upgrade /code/*.datagrid
+
+ COPY . .
+
+ CMD kangas server cppe-5-test.datagrid --frontend-port=7860
README.md ADDED
@@ -0,0 +1,107 @@
+ ---
+ title: Kangas Datagrid
+ emoji: 👁
+ colorFrom: indigo
+ colorTo: red
+ sdk: docker
+ pinned: false
+ license: apache-2.0
+ duplicated_from: comet-team/kangas-direct
+ ---
+
+ # Kangas: Explore Multimedia Datasets at Scale :kangaroo:
+
+ Kangas is a tool for exploring, analyzing, and visualizing large-scale multimedia data. It provides a straightforward Python API
+ for logging large tables of data, along with an intuitive visual interface for performing complex queries against your dataset.
+
+ The key features of Kangas include:
+
+ - **Scalability**. Kangas DataGrid, the fundamental class for representing datasets, can easily store millions of rows of data.
+ - **Performance**. Group, sort, and filter across millions of data points in seconds with a simple, fast UI.
+ - **Interoperability**. Any data, any environment. Kangas can run in a notebook or as a standalone app, both locally and remotely.
+ - **Integrated computer vision support**. Visualize and filter bounding boxes, labels, and metadata without any extra setup.
+
+ You can access a live demo of Kangas at <a href="https://kangas.comet.com?datagrid=/data/coco-500.datagrid">kangas.comet.com</a>.
+
+ ## Getting Started
+
+ Kangas is accessible as a Python library via pip:
+ ```
+ pip install kangas
+ ```
+
+ Once installed, there are many ways to load or create a DataGrid.
+
+ Without writing any code, you can even download a DataGrid and begin exploring the data. At the console:
+
+ ```
+ kangas server https://github.com/caleb-kaiser/kangas_examples/raw/master/coco-500.datagrid.zip
+ ```
+
+ That's it!
+
+ In the next example, we load a publicly available DataGrid file, but the Kangas API also provides methods for ingesting CSVs, Pandas DataFrames, and for manually constructing a new DataGrid:
+
+ ```python
+ import kangas as kg
+
+ # Load an existing DataGrid
+ dg = kg.read_datagrid("https://github.com/caleb-kaiser/kangas_examples/raw/master/coco-500.datagrid.zip")
+ ```
+
+ After your DataGrid is initialized, you can render it within the Kangas Viewer directly from Python:
+
+ ```python
+ dg.show()
+ ```
+ <img width="1789" alt="image" src="https://user-images.githubusercontent.com/42076840/197875668-5519d504-2209-472f-952e-ed09554ecb7a.png">
+
+ From the Kangas Viewer, you can group, sort, and filter data. In addition, Kangas will do its best to parse any metadata attached to your assets. For example, if you're using the COCO-500 DataGrid from the quickstart above, Kangas will automatically parse labels and scores for each image:
+
+ <img src="https://github.com/caleb-kaiser/kangas_examples/blob/master/Oct-25-2022%2016-43-56.gif">
+
+ And voil&agrave;! Now you've started using Kangas.
+
+ ### Pandas DataFrames
+
+ Kangas can also read Pandas DataFrame objects directly:
+
+ ```python
+ import kangas as kg
+ import pandas as pd
+
+ df = pd.DataFrame({"hidden_layer_size": [8, 16, 64], "loss": [0.97, 0.53, 0.12]})
+ dg = kg.read_dataframe(df)
+ ```
+ ### HuggingFace Datasets
+
+ HuggingFace's datasets can also be loaded into DataGrid directly because they use
+ rows of dictionaries, and images are represented by PIL images. DataGrid will
+ automatically convert PIL images into a [Kangas Image](https://github.com/comet-ml/kangas/wiki/Image#image):
+
+ ```python
+ import kangas as kg
+ from datasets import load_dataset
+
+ dataset = load_dataset("beans", split="train")
+ dg = kg.DataGrid(dataset)
+ ```
+
+ ### Parquet files
+
+ > **Note**: You will need to have pyarrow installed to read parquet files.
+
+ ```python
+ import kangas as kg
+
+ dg = kg.read_parquet("https://github.com/Teradata/kylo/raw/master/samples/sample-data/parquet/userdata5.parquet")
+ ```
+
+ If you'd like to explore further, take a look at our example notebooks below:
+
+ ## Documentation
+
+ 1. <a href="https://github.com/comet-ml/kangas/wiki">Documentation Homepage</a>
+ 2. <a href="https://github.com/comet-ml/kangas/blob/main/notebooks/DataGrid-Getting%20Started.ipynb">Quickstart Notebook</a> <a href="https://colab.research.google.com/github/comet-ml/kangas/blob/main/notebooks/DataGrid-Getting%20Started.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg"></a>
+ 3. <a href="https://github.com/comet-ml/kangas/blob/main/notebooks/Integrations.ipynb">Integrations Notebook</a> <a href="https://colab.research.google.com/github/comet-ml/kangas/blob/main/notebooks/Integrations.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg"></a>
+ 4. <a href="https://github.com/comet-ml/kangas/blob/main/examples/mnist_script.py"> MNIST Classification Example</a>
coco-500.datagrid ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0a1cb31af558c1e80e54599685f70770e7d9adc0bba9387e06d460710c3c6d98
+ size 93057024
cppe-5-test.datagrid ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5e9252eac21a9bda4a8c8b7cb4e162e08e2b5bcc55dbb58d6b18b71375523077
+ size 25305088
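The two `.datagrid` entries are Git LFS pointer files, not the datasets themselves: each records a spec version, a content hash, and a size in bytes. A minimal sketch of reading one follows, assuming the three-line `key value` layout shown in this diff; the function name `parse_lfs_pointer` and the returned field names are hypothetical, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the cppe-5-test.datagrid entry in this commit.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5e9252eac21a9bda4a8c8b7cb4e162e08e2b5bcc55dbb58d6b18b71375523077
size 25305088
"""

info = parse_lfs_pointer(pointer)
size_mib = info["size_bytes"] / 2**20  # convert bytes to MiB
print(f'{info["hash_algo"]} digest, {size_mib:.1f} MiB')
```

The `size` field gives the size of the real file held in LFS storage, so the same helper applied to the `coco-500.datagrid` pointer would report its 93057024 bytes.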
requirements.txt ADDED
@@ -0,0 +1,2 @@
+ kangas>=2.3.5
+ datasets