SilverAvocado/Silver-Multimodal


SeoyeonPark1223 committed 16 days ago

Commit 922645d (1 parent: 4fac4a3)

Update README.md

Files changed (1): README.md (+1 -1)

@@ -75,7 +75,7 @@ tags:
             - The model processes four distinct inputs, each corresponding to a specific set of features derived from video keypoints. Here’s how they are divided:
             - Input 1, Input 2, Input 3:
                 - For each detected individual (up to 3 people), the model extracts 30 keypoints using MediaPipe.
-                - Each keypoint contains 3 features (x, y, z), resulting in 30 \times 3 = 90 features per frame.
+                - Each keypoint contains 3 features (x, y, z), resulting in 30 x 3 = 90 features per frame.
             - Input 4:
                 - Represents relative positional coordinates calculated from the 10 most important key joints (e.g., shoulders, elbows, knees) for all 3 individuals.
                 - These relative coordinates capture spatial relationships among individuals, crucial for contextual understanding.
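
The feature arithmetic described in this hunk can be sketched in plain Python. This is only an illustration of the 30 x 3 = 90 layout, not the repository's actual preprocessing: the helper names (`flatten_keypoints`, `pairwise_offsets`), the dummy coordinates, and the joint indices are all hypothetical. The README does not say how the relative coordinates for Input 4 are computed, so the pairwise-offset approach below is an assumption.

```python
from itertools import combinations

NUM_KEYPOINTS = 30  # keypoints per person, as stated in the README
NUM_COORDS = 3      # (x, y, z) per keypoint
MAX_PEOPLE = 3      # Inputs 1-3, one per detected individual


def flatten_keypoints(keypoints):
    """Flatten 30 (x, y, z) tuples into one 90-value feature vector for a frame."""
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}")
    return [coord for point in keypoints for coord in point]


def pairwise_offsets(people, joint_indices):
    """ASSUMPTION: relative coordinates as per-joint offsets between each pair of people."""
    features = []
    for person_a, person_b in combinations(people, 2):
        for j in joint_indices:
            features.extend(b - a for a, b in zip(person_a[j], person_b[j]))
    return features


# Dummy frame: person p has keypoint k at (p, k, 0.0). Real values would come from MediaPipe.
people = [
    [(float(p), float(k), 0.0) for k in range(NUM_KEYPOINTS)]
    for p in range(MAX_PEOPLE)
]

# Inputs 1-3: one 90-feature vector per person for this frame.
per_person_features = [flatten_keypoints(person) for person in people]

# Input 4 (hypothetical): placeholder indices stand in for the 10 key joints.
key_joint_indices = list(range(10))
relative_features = pairwise_offsets(people, key_joint_indices)
```

Each entry of `per_person_features` has 30 x 3 = 90 values, matching the per-frame count in the changed line. The size of `relative_features` depends entirely on the assumed pairing scheme and should not be read as the model's real Input 4 dimension.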