vagheshpatel committed
Commit 69e7b1f · verified · 1 Parent(s): a611ddc

Sync motion-tracking from metro-analytics-catalog
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+expected_output_dlstreamer.gif filter=lfs diff=lfs merge=lfs -text
+expected_output_openvino.gif filter=lfs diff=lfs merge=lfs -text
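The new `.gitattributes` lines route the two expected-output GIFs through Git LFS alongside the pre-existing globs. As a quick sanity check, Python's `fnmatch` roughly approximates how these simple patterns match filenames (an approximation only -- git's gitattributes matcher has its own path-matching edge cases):

```python
from fnmatch import fnmatch

# LFS-routed patterns: pre-existing globs plus the two GIFs added here.
PATTERNS = [
    "*.zip",
    "*.zst",
    "*tfevents*",
    "expected_output_dlstreamer.gif",
    "expected_output_openvino.gif",
]

def lfs_tracked(filename: str) -> bool:
    """Return True if any of the .gitattributes patterns matches the filename."""
    return any(fnmatch(filename, pattern) for pattern in PATTERNS)

print(lfs_tracked("expected_output_openvino.gif"))  # True
print(lfs_tracked("README.md"))                     # False
```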
README.md CHANGED
@@ -1,7 +1,5 @@
 # Motion Tracking
 
-> **Validated with:** OpenVINO 2026.1.0, NNCF 3.0.0, DLStreamer 2026.0, Ultralytics 8.4.46, Python 3.11+
-
 | Property | Value |
 |---|---|
 | **Category** | Object Detection + Multi-Object Tracking |
@@ -19,7 +17,6 @@
 Motion Tracking is a Metro Analytics use case that detects objects and assigns persistent track IDs across frames, enabling trajectory analysis and temporal event detection.
 It is built on [YOLO26](https://docs.ultralytics.com/models/yolo26/), a state-of-the-art real-time object detector quantized to INT8, paired with a multi-object tracker:
 
-- **OpenVINO pipeline:** YOLO26 INT8 detection + Ultralytics built-in [BoT-SORT](https://github.com/NirAharon/BoT-SORT) or [ByteTrack](https://github.com/FoundationVision/ByteTrack) tracker via `model.track()`.
 - **DLStreamer pipeline:** YOLO26 FP16 detection via `gvadetect` + `gvatrack` element with `tracking-type=short-term-imageless`.
 
 Each detected object receives a unique `track_id` that persists across frames as long as the object remains visible.
@@ -42,7 +39,7 @@ The default tracker is BoT-SORT; ByteTrack is available as an alternative with l
 ## Prerequisites
 
 - Python 3.11+
-- `ffmpeg` (`sudo apt install ffmpeg`) used by both samples to encode output video
+- `ffmpeg` (`sudo apt install ffmpeg`) -- used by both samples to encode output video
 - [Install OpenVINO](https://docs.openvino.ai/2026/get-started/install-openvino.html) (latest version)
 - [Install Intel DLStreamer](https://docs.openedgeplatform.intel.com/2026.0/edge-ai-libraries/dlstreamer/get_started/install/install_guide_ubuntu.html) (latest version)
@@ -86,7 +83,7 @@ The second argument selects the precision (`FP32`, `FP16`, `INT8`); the default
 The script performs the following steps:
 
 1. Installs dependencies (`openvino`, `ultralytics`; adds `nncf` for INT8).
-2. Downloads the sample test video (`test_video.mp4`).
+2. Downloads the sample test video (`test_video.mp4`) and a sample test image (`test.jpg`).
 3. Downloads the PyTorch weights and exports to OpenVINO IR.
 4. *(INT8 only)* Quantizes the model using NNCF post-training quantization.
@@ -107,128 +104,13 @@ Output files:
 > For production accuracy, replace it with a representative set of frames from
 > the target deployment site.
 
-### OpenVINO Sample
-
-The sample below uses the Ultralytics `model.track()` API with the PyTorch
-weights to detect and track objects in a video, assigning persistent track IDs
-via the built-in BoT-SORT tracker.
-Each annotated frame -- with bounding boxes, track IDs, and per-track
-trajectory polylines -- is written to `output.mp4`.
-
-> **Important:** The `model.track()` API requires PyTorch weights (`.pt`).
-> Using the OpenVINO model directory with `model.track()` produces zero
-> detections in Ultralytics 8.4.x due to an incompatibility in the tracker
-> integration. Use `model.predict()` for single-frame inference with the
-> OpenVINO backend, or use the DLStreamer sample below for OpenVINO-accelerated
-> tracking.
->
-> The INT8 model (`yolo26n_tracking_int8.xml`) can be used directly with the
-> OpenVINO Python API but not with the Ultralytics `YOLO()` wrapper.
-
-```python
-import subprocess
-from collections import defaultdict
-
-import cv2
-import numpy as np
-from ultralytics import YOLO
-
-# Use PyTorch weights for tracking -- model.track() requires the .pt backend.
-# The OpenVINO model directory works with model.predict() but not model.track().
-model = YOLO("yolo26n.pt", task="detect")
-
-video_path = "test_video.mp4"
-cap = cv2.VideoCapture(video_path)
-fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
-width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
-height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
-
-# Pipe frames to ffmpeg for H.264 output (universally playable).
-proc = subprocess.Popen(
-    ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "bgr24",
-     "-s", f"{width}x{height}", "-r", str(fps),
-     "-i", "pipe:0", "-c:v", "libx264", "-pix_fmt", "yuv420p",
-     "-movflags", "+faststart", "output.mp4"],
-    stdin=subprocess.PIPE, stderr=subprocess.DEVNULL,
-)
-
-# Distinct colors for trajectory lines (one per track ID).
-COLORS = [
-    (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0),
-    (255, 0, 255), (0, 255, 255), (128, 0, 255), (255, 128, 0),
-]
-track_history: dict[int, list[tuple[float, float]]] = defaultdict(list)
-
-while cap.isOpened():
-    success, frame = cap.read()
-    if not success:
-        break
-
-    # Run YOLO26 tracking with BoT-SORT (default).
-    # Use tracker="bytetrack.yaml" for ByteTrack alternative.
-    results = model.track(frame, persist=True, conf=0.4, tracker="botsort.yaml")
-    result = results[0]
-
-    annotated = result.plot()
-
-    if result.boxes and result.boxes.is_track:
-        boxes = result.boxes.xywh.cpu()
-        track_ids = result.boxes.id.int().cpu().tolist()
-        classes = result.boxes.cls.int().cpu().tolist()
-
-        for box, track_id in zip(boxes, track_ids):
-            x, y, _w, _h = box
-            track = track_history[track_id]
-            track.append((float(x), float(y)))
-            if len(track) > 30:
-                track.pop(0)
-
-            color = COLORS[track_id % len(COLORS)]
-            points = np.array(track, dtype=np.int32).reshape((-1, 1, 2))
-            cv2.polylines(annotated, [points], False, color, 2)
-
-        for tid, cls_id in zip(track_ids, classes):
-            cx, cy = track_history[tid][-1]
-            print(f" Track {tid}: class={cls_id} center=({cx:.0f},{cy:.0f})", flush=True)
-
-    proc.stdin.write(annotated.tobytes())
-
-cap.release()
-proc.stdin.close()
-proc.wait()
-print("Wrote output.mp4", flush=True)
-```
-
-**Device targets:**
-
-- Default runs on CPU via OpenVINO.
-- For GPU: set `device="gpu:0"` in the `model.track()` call.
-- For NPU: set `device="npu:0"` (validate availability with `benchmark_app -d NPU`).
-
-### Try It on a Sample Video
-
-The `export_and_quantize.sh` script downloads `test_video.mp4` automatically.
-Run the OpenVINO sample above.
-The script processes each frame, prints per-track positions to the console,
-and writes the annotated video to `output.mp4`.
-
-Expected console output (representative):
-
-```text
- Track 1: class=0 center=(320,240)
- Track 2: class=0 center=(450,300)
-```
-
-`output.mp4` shows bounding boxes with track IDs and colored trajectory
-polylines for each tracked object.
-
 ### DLStreamer Sample
 
 The pipeline below runs the YOLO26 FP16 detector via `gvadetect` on
 `test_video.mp4`, attaches persistent track IDs with `gvatrack`
 (`short-term-imageless` tracker), and overlays bounding boxes with
 `gvawatermark`. Frames are pulled from an `appsink`, per-track trajectory
-polylines are drawn with OpenCV, and the result is muxed to `output.mp4`
+polylines are drawn with OpenCV, and the result is muxed to `output_dlstreamer.mp4`
 (H.264 via ffmpeg).
 
 > **Notes on running this sample:**
@@ -259,15 +141,14 @@ from gstgva import VideoFrame
 
 Gst.init(None)
 
-# For GPU: change device=CPU to device=GPU, add vapostproc !
-# video/x-raw(memory:VASurface) after decodebin3, and set
-# pre-process-backend=vaapi-surface-sharing on gvadetect.
-# For NPU: change device=CPU to device=NPU (batch-size=1, nireq=4 recommended).
+# For CPU: change device=GPU to device=CPU.
+# For NPU: change device=GPU to device=NPU (batch-size=1, nireq=4 recommended).
 pipeline_str = (
-    "filesrc location=test_video.mp4 ! decodebin3 ! videoconvert ! "
-    "video/x-raw,format=BGR ! "
+    "filesrc location=test_video.mp4 ! decodebin3 ! "
+    "videoconvert ! "
     "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
-    "device=CPU threshold=0.4 ! queue ! "
+    "device=GPU "
+    "threshold=0.4 ! queue ! "
     "gvatrack tracking-type=short-term-imageless ! queue ! "
    "gvawatermark ! appsink name=sink emit-signals=false sync=false"
 )
@@ -304,7 +185,7 @@ while True:
     ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "bgr24",
      "-s", f"{width}x{height}", "-r", str(fps),
      "-i", "pipe:0", "-c:v", "libx264", "-pix_fmt", "yuv420p",
-     "-movflags", "+faststart", "output.mp4"],
+     "-movflags", "+faststart", "output_dlstreamer.mp4"],
    stdin=subprocess.PIPE, stderr=subprocess.DEVNULL,
 )
 
@@ -344,14 +225,18 @@ pipeline.set_state(Gst.State.NULL)
 if proc:
     proc.stdin.close()
     proc.wait()
-    print("Wrote output.mp4", flush=True)
+    print("Wrote output_dlstreamer.mp4", flush=True)
 ```
 
+#### Expected Output
+
+![DLStreamer expected output](expected_output_dlstreamer.gif)
+
 **Device targets:**
 
-- `device=CPU` -- default in the sample code.
-- `device=GPU` -- add `vapostproc ! video/x-raw(memory:VASurface)` after `decodebin3` and set `pre-process-backend=vaapi-surface-sharing` on `gvadetect`.
-- `device=NPU` -- use `batch-size=1` and `nireq=4` for best NPU utilization.
+- `device=GPU` -- default in the sample code.
+- `device=CPU` -- change `device=GPU` to `device=CPU`.
+- `device=NPU` -- change `device=GPU` to `device=NPU`; use `batch-size=1` and `nireq=4` for best NPU utilization.
 
 ---
 
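The device-target edits in the README diff boil down to swapping one `device=` property inside the `gvadetect` stage of a single pipeline string. A minimal sketch of that assembly -- `build_pipeline` is an illustrative helper, not part of the repo:

```python
def build_pipeline(device: str = "GPU", threshold: float = 0.4) -> str:
    """Assemble the gst-launch-style string from the DLStreamer sample.

    `device` maps straight onto gvadetect's device property (CPU, GPU, NPU).
    """
    if device not in ("CPU", "GPU", "NPU"):
        raise ValueError(f"unsupported device: {device}")
    return (
        "filesrc location=test_video.mp4 ! decodebin3 ! "
        "videoconvert ! "
        "gvadetect model=yolo26n_openvino_model/yolo26n.xml "
        f"device={device} "
        f"threshold={threshold} ! queue ! "
        "gvatrack tracking-type=short-term-imageless ! queue ! "
        "gvawatermark ! appsink name=sink emit-signals=false sync=false"
    )

print("device=NPU" in build_pipeline("NPU"))  # True
```

For NPU the README additionally recommends `batch-size=1` and `nireq=4` on `gvadetect`; those properties are not wired into this sketch.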
expected_output_dlstreamer.gif ADDED

Git LFS Details

  • SHA256: 987a4b6152414e29729778e4a71290c567d518f19c556badce1d2574ebbedf8c
  • Pointer size: 132 Bytes
  • Size of remote file: 6.77 MB
expected_output_openvino.gif ADDED

Git LFS Details

  • SHA256: 001ae7d6abfa928172ecdd570c9d1757a62e93ffdc1dfbbd539ac313091cbeec
  • Pointer size: 132 Bytes
  • Size of remote file: 3.42 MB
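The trajectory polylines visible in these expected-output GIFs come from a bounded per-track history of box centers, as in the sample code in the README diff above. The buffer logic in isolation -- `MAX_POINTS` and `record` are illustrative names for the 30-point cap the samples use:

```python
from collections import defaultdict

MAX_POINTS = 30  # matches the 30-point cap in the sample code

track_history: dict[int, list[tuple[float, float]]] = defaultdict(list)

def record(track_id: int, cx: float, cy: float) -> list[tuple[float, float]]:
    """Append a box center to the track's history, keeping only the
    newest MAX_POINTS points so trajectory polylines stay short."""
    track = track_history[track_id]
    track.append((cx, cy))
    if len(track) > MAX_POINTS:
        track.pop(0)
    return track

# Feed 40 centers; the history retains only the most recent 30.
for i in range(40):
    record(1, float(i), float(i))
print(len(track_history[1]))  # 30
```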
export_and_quantize.sh CHANGED
@@ -47,6 +47,14 @@ else
   echo "Already present: test_video.mp4"
 fi
 
+echo "--- Downloading sample test image ---"
+if [[ ! -f test.jpg ]]; then
+  wget -q -O test.jpg https://ultralytics.com/images/bus.jpg
+  echo "Downloaded: test.jpg"
+else
+  echo "Already present: test.jpg"
+fi
+
 if [[ "${PRECISION}" == "FP32" ]]; then
   HALF_FLAG="False"
   EXPORT_LABEL="FP32"
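The added shell block guards the download so reruns are cheap. The same idempotent pattern in Python, should a caller prefer it -- `ensure_file` is a hypothetical helper, not part of the script:

```python
import os
import urllib.request

def ensure_file(path: str, url: str) -> bool:
    """Download url to path only if path does not exist yet.

    Returns True when a download happened, False when the file was
    already present (mirrors the wget guard in export_and_quantize.sh).
    """
    if os.path.exists(path):
        print(f"Already present: {path}")
        return False
    urllib.request.urlretrieve(url, path)
    print(f"Downloaded: {path}")
    return True
```

Calling it twice with the same path performs the network fetch at most once, matching the script's "Already present" branch.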