bawolf committed
Commit c850c95 · 1 Parent(s): e2a6f07
README.md CHANGED
@@ -44,6 +44,12 @@ cog push
 
 ## Training
 
+Download the training data:
+
+```bash
+gdown "https://drive.google.com/uc?id=11M6nSuSuvoU2wpcV_-6KFqCzEMGP75q6" -O ./data/
+```
+
 ```bash
 # Run training with default configuration
 python scripts/train.py
@@ -105,7 +111,11 @@ To run predictions with cog or locally on an existing checkpoint, you can find a
 
 ## License
 
-[Your License Here]
+MIT License
+
+Copyright (c) 2024 Bryant Wolf
+
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
 
 ## Citation
 
@@ -113,4 +123,4 @@ If you use this model in your research, please cite:
 
 ```bibtex
 [Your Citation Here]
-```
+```
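The gdown command added to the Training section uses the bare `uc?id=` form, which is what the CLI expects for a direct file download. If the shell mangles the URL, gdown also exposes a Python API; a minimal sketch under the assumption that `./data/` already exists (same file id as above):

```python
import gdown

# Same Drive file id as the CLI command in the Training section.
url = "https://drive.google.com/uc?id=11M6nSuSuvoU2wpcV_-6KFqCzEMGP75q6"

# A trailing slash asks gdown to treat the output as a directory;
# this assumes ./data/ already exists.
gdown.download(url, output="./data/", quiet=False)
```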
model-card.md ADDED
@@ -0,0 +1,102 @@
+---
+language: en
+tags:
+- clip
+- breakdance
+- video-classification
+- dance
+license: mit
+datasets:
+- custom
+---
+
+# CLIP-Based Break Dance Move Classifier
+
+This model is a fine-tuned version of CLIP (ViT-Large/14) specialized in classifying break dance power moves from video frames, including windmills, halos, and swipes.
+
+## Model Description
+
+- **Model Type:** Fine-tuned CLIP model
+- **Base Model:** ViT-Large/14
+- **Task:** Video Classification
+- **Training Data:** Custom break dance video dataset
+- **Output:** 3 classes of break dance moves
+
+## Usage
+
+```python
+from transformers import CLIPProcessor, CLIPModel
+import torch
+import cv2
+from PIL import Image
+
+# Load model and processor
+processor = CLIPProcessor.from_pretrained("[your-username]/clip-breakdance-classifier")
+model = CLIPModel.from_pretrained("[your-username]/clip-breakdance-classifier")
+
+# Candidate class prompts (wording is illustrative; match the prompts used in training)
+labels = ["windmill", "halo", "swipe"]
+
+# Load video and process frames
+video = cv2.VideoCapture("breakdance_move.mp4")
+predictions = []
+
+while video.isOpened():
+    ret, frame = video.read()
+    if not ret:
+        break
+
+    # Convert BGR to RGB and to PIL Image
+    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+    frame_pil = Image.fromarray(frame_rgb)
+
+    # Score the frame against the class prompts (CLIP needs both text
+    # and images to produce logits_per_image)
+    inputs = processor(text=labels, images=frame_pil, return_tensors="pt", padding=True)
+    with torch.no_grad():
+        outputs = model(**inputs)
+    predictions.append(outputs.logits_per_image.softmax(dim=1))
+
+video.release()
+```
+
+## Limitations
+
+- Model performance may vary with video quality and lighting conditions
+- Best results are achieved with clear, centered shots of the dance moves
+- May have difficulty distinguishing between similar power moves
+- Performance may be affected by unusual camera angles or partial views
+- Currently only supports three specific power moves (windmills, halos, and swipes)
+
+## Training Procedure
+
+- Fine-tuned on CLIP ViT-Large/14 architecture
+- Training dataset: Custom dataset of break dance videos
+- Dataset size: [specify number] frames from [specify number] different videos
+- Training epochs: [specify number]
+- Learning rate: [specify rate]
+- Batch size: [specify size]
+- Hardware used: [specify GPU/CPU details]
+
+## Evaluation Results
+
+- Overall accuracy: [specify %]
+- Per-class performance:
+  - Windmills: [specify precision/recall]
+  - Halos: [specify precision/recall]
+  - Swipes: [specify precision/recall]
+
+## Citation
+
+If you use this model in your research or project, please cite:
+
+```bibtex
+@misc{clip-breakdance-classifier,
+  author = {Bryant Wolf},
+  title = {CLIP-Based Break Dance Move Classifier},
+  year = {2024},
+  publisher = {Hugging Face},
+  journal = {Hugging Face Model Hub},
+  howpublished = {\url{https://huggingface.co/[your-username]/clip-breakdance-classifier}}
+}
+```
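The Usage loop in the model card above collects one probability row per frame but stops short of a single video-level answer. A minimal sketch of one way to finish the job, reusing `predictions` and `labels` from that snippet (mean pooling over frames is an assumption, not something the card prescribes):

```python
import torch

# Stack the per-frame probability rows into a (num_frames, 3) tensor,
# then average across frames.
all_probs = torch.cat(predictions, dim=0)
video_probs = all_probs.mean(dim=0)

# The class with the highest mean probability wins.
predicted_move = labels[video_probs.argmax().item()]
print(f"Predicted move: {predicted_move} ({video_probs.max().item():.1%} mean probability)")
```

A majority vote over per-frame argmaxes would work as well; mean pooling is simply less sensitive to a few noisy frames.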
{script → scripts}/hyperparameter_tuning.py RENAMED
@@ -8,7 +8,7 @@ import math
 
 import sys
 sys.path.append(os.path.dirname(os.path.dirname(__file__)))
-from script.train import train_and_evaluate
+from scripts.train import train_and_evaluate
 from src.utils.utils import create_run_directory
 
 def create_hyperparam_directory():
{script → scripts}/inference.py RENAMED
File without changes
{script → scripts}/train.py RENAMED
File without changes
scripts/upload_to_hub.py ADDED
@@ -0,0 +1,13 @@
+from transformers import CLIPProcessor, CLIPModel
+
+def upload_model_to_hub():
+    # Load the fine-tuned weights; the processor is unchanged from the base model
+    model = CLIPModel.from_pretrained("./checkpoints/")
+    processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
+
+    # Push model and processor to the same Hub repo
+    model.push_to_hub("[your-username]/clip-breakdance-classifier")
+    processor.push_to_hub("[your-username]/clip-breakdance-classifier")
+
+if __name__ == "__main__":
+    upload_model_to_hub()
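One caveat on the new upload script: `push_to_hub` requires an authenticated session. A minimal pre-step, assuming a standard `huggingface_hub` installation:

```python
from huggingface_hub import login

# Prompts for a token interactively; running `huggingface-cli login`
# once in a shell works just as well.
login()
```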
{script → scripts}/visualization/analyze_trials.py RENAMED
File without changes
{script → scripts}/visualization/miscalculations_report.py RENAMED
File without changes
{script → scripts}/visualization/visualize.py RENAMED
File without changes
{script → scripts}/visualization/viz_cross_compare.py RENAMED
File without changes