Commit 922645d
Author: SeoyeonPark1223
Parent(s): 4fac4a3

Update README.md

README.md CHANGED
@@ -75,7 +75,7 @@ tags:
 - The model processes four distinct inputs, each corresponding to a specific set of features derived from video keypoints. Here’s how they are divided:
 - Input 1, Input 2, Input 3:
 - For each detected individual (up to 3 people), the model extracts 30 keypoints using MediaPipe.
-- Each keypoint contains 3 features (x, y, z), resulting in 30
+- Each keypoint contains 3 features (x, y, z), resulting in 30 x 3 = 90 features per frame.
 - Input 4:
 - Represents relative positional coordinates calculated from the 10 most important key joints (e.g., shoulders, elbows, knees) for all 3 individuals.
 - These relative coordinates capture spatial relationships among individuals, crucial for contextual understanding.
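
The input layout described in this hunk can be sketched in plain Python. This is a minimal illustration, not the repository's preprocessing code. Several details are assumptions that the README does not state: the names `flatten_person`, `relative_features`, and `build_frame_inputs` are hypothetical, the key-joint indices in `KEY_JOINT_IDS` are placeholders, missing people are zero-padded, and the "relative positional coordinates" are taken to be pairwise differences of the same joint between two people.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # one keypoint: (x, y, z)

MAX_PEOPLE = 3      # from the README: up to 3 people
NUM_KEYPOINTS = 30  # from the README: 30 keypoints per person
# Hypothetical indices for the 10 key joints; the README does not list them.
KEY_JOINT_IDS = tuple(range(10))


def flatten_person(keypoints: Sequence[Point]) -> List[float]:
    """Flatten 30 (x, y, z) keypoints into 30 x 3 = 90 features."""
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}")
    return [coord for point in keypoints for coord in point]


def relative_features(people: Sequence[Sequence[Point]]) -> List[float]:
    """Assumed definition: per-joint coordinate differences for each pair of people."""
    features: List[float] = []
    for first, second in combinations(people, 2):
        for joint in KEY_JOINT_IDS:
            features.extend(b - a for a, b in zip(first[joint], second[joint]))
    return features


def build_frame_inputs(detected: Sequence[Sequence[Point]]) -> List[List[float]]:
    """Return [input_1, input_2, input_3, input_4] for a single frame."""
    people = [list(person) for person in detected[:MAX_PEOPLE]]
    # Assumption: pad with all-zero keypoints when fewer than 3 people are detected.
    while len(people) < MAX_PEOPLE:
        people.append([(0.0, 0.0, 0.0)] * NUM_KEYPOINTS)
    return [flatten_person(p) for p in people] + [relative_features(people)]
```

Calling `build_frame_inputs` with one to three people, each given as a list of 30 `(x, y, z)` tuples, returns four flat lists. The first three each hold 90 values, matching the corrected line in the diff. The length of the fourth depends on the assumed pairwise definition and should not be read as the model's actual Input 4 size.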