kmewhort committed
Commit dff82eb
1 Parent(s): 7a3df5e

Update README.md

Files changed (1)
  1. README.md +52 -9
README.md CHANGED
@@ -14,22 +14,65 @@ should probably proofread and complete it, then remove this comment. -->
  # beit-sketch-classifier

- This model is a fine-tuned version of [microsoft/beit-base-patch16-224-pt22k-ft22k](https://huggingface.co/microsoft/beit-base-patch16-224-pt22k-ft22k) on the None dataset.
+ This model is a version of [microsoft/beit-base-patch16-224-pt22k-ft22k](https://huggingface.co/microsoft/beit-base-patch16-224-pt22k-ft22k) fine-tuned on a dataset of Quick!Draw! sketches ([1 percent of the 50M sketches](https://huggingface.co/datasets/kmewhort/quickdraw-bins-1pct-sample)).
  It achieves the following results on the evaluation set:
  - Loss: 1.6083
  - Accuracy: 0.7480

- ## Model description
-
- More information needed
-
  ## Intended uses & limitations

- More information needed
-
- ## Training and evaluation data
-
- More information needed
+ It's intended to be used to classify sketches supplied in a line-segment (vector) input format. There is no data augmentation in the fine-tuning, so input raster images ideally need to be rendered from the line-vector format in the same way as the training images.
+
+ You can generate the requisite PIL images from the Quickdraw `bin` format with the following:
+
+ ```python
+ import io
+ from struct import unpack
+
+ import cv2
+ import numpy as np
+ from PIL import Image
+
+ # packed bytes -> dict (from https://github.com/googlecreativelab/quickdraw-dataset/blob/master/examples/binary_file_parser.py)
+ def unpack_drawing(file_handle):
+     key_id, = unpack('Q', file_handle.read(8))
+     country_code, = unpack('2s', file_handle.read(2))
+     recognized, = unpack('b', file_handle.read(1))
+     timestamp, = unpack('I', file_handle.read(4))
+     n_strokes, = unpack('H', file_handle.read(2))
+     image = []
+     n_bytes = 17
+     for i in range(n_strokes):
+         n_points, = unpack('H', file_handle.read(2))
+         fmt = str(n_points) + 'B'
+         x = unpack(fmt, file_handle.read(n_points))
+         y = unpack(fmt, file_handle.read(n_points))
+         image.append((x, y))
+         n_bytes += 2 + 2*n_points
+     result = {
+         'key_id': key_id,
+         'country_code': country_code,
+         'recognized': recognized,
+         'timestamp': timestamp,
+         'image': image,
+     }
+     return result
+
+ # packed bin record -> 224x224 RGB PIL image (strokes drawn in black on a white canvas)
+ def binToPIL(packed_drawing):
+     padding = 8
+     radius = 7
+     scale = (224.0 - (2 * padding)) / 256
+
+     unpacked = unpack_drawing(io.BytesIO(packed_drawing))
+     image = np.full((224, 224), 255, np.uint8)
+     for stroke in unpacked['image']:
+         prevX = round(stroke[0][0] * scale)
+         prevY = round(stroke[1][0] * scale)
+         for i in range(1, len(stroke[0])):
+             x = round(stroke[0][i] * scale)
+             y = round(stroke[1][i] * scale)
+             cv2.line(image, (padding + prevX, padding + prevY), (padding + x, padding + y), 0, radius, -1)
+             prevX = x
+             prevY = y
+     pilImage = Image.fromarray(image).convert("RGB")
+     return pilImage
+ ```
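For reference, here is a minimal sketch of how a converted image might be fed to the classifier using the `transformers` image-classification pipeline. The repo id (`kmewhort/beit-sketch-classifier`), the local `.bin` path, and the use of the pipeline API are illustrative assumptions rather than instructions from the model card:

```python
from transformers import pipeline

# Assumed Hub repo id for this model (matches the card title; adjust if needed).
classifier = pipeline("image-classification", model="kmewhort/beit-sketch-classifier")

# Placeholder path to a Quick, Draw! binary file; each .bin concatenates packed drawings.
# unpack_drawing() (called inside binToPIL) only consumes the first record, so this
# renders the first sketch in the file and ignores the rest.
with open("cat.bin", "rb") as f:
    pil_img = binToPIL(f.read())

# The pipeline applies the model's own image processor (resize + normalization),
# so the stroke rendering above is the only preprocessing you need to reproduce.
print(classifier(pil_img, top_k=5))
```

Each prediction returned by the pipeline is a dict with a `label` and a `score`.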

  ## Training procedure
78