# Inference hrnet
Run inference with deep-high-resolution-net.pytorch without using Docker.
## Prep
1. Download the researchers' pretrained pose estimator from [Google Drive](https://drive.google.com/drive/folders/1hOTihvbyIxsm5ygDpbUuJ7O_tzv4oXjC?usp=sharing) to this directory under `models/`.
2. Put the video file you'd like to run inference on in this directory under `videos`.
3. (Optional) Build the Docker container in this directory with `./build-docker.sh` (this can take a while because it involves compiling OpenCV).
4. Update the `inference-config.yaml` file to reflect the number of GPUs you have available and which trained model you want to use.
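The prep steps above can be sanity-checked before launching inference. The following is a minimal sketch, assuming the layout described above (`models/` for the weights, `videos` for the input). The helper name `check_prep` and the list of video extensions are hypothetical and not part of the repository.

```python
from pathlib import Path

# Assumed set of video extensions; adjust to whatever your files use.
VIDEO_SUFFIXES = {".mp4", ".avi", ".mov", ".mkv"}


def check_prep(root="."):
    """Hypothetical helper: list the prep steps that still look incomplete."""
    root = Path(root)
    problems = []

    # Step 1: pretrained weights (.pth) are expected somewhere under models/.
    models_dir = root / "models"
    if not (models_dir.is_dir() and any(models_dir.rglob("*.pth"))):
        problems.append(f"no .pth weights found under {models_dir}")

    # Step 2: at least one video file is expected directly under videos.
    videos_dir = root / "videos"
    has_video = videos_dir.is_dir() and any(
        p.suffix.lower() in VIDEO_SUFFIXES for p in videos_dir.iterdir()
    )
    if not has_video:
        problems.append(f"no video file found under {videos_dir}")

    return problems


if __name__ == "__main__":
    for problem in check_prep():
        print("missing:", problem)
```

An empty list means both directories contain what the later commands expect. The check does not validate the contents of `inference-config.yaml`.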
## Running the Model
### 1. Running on the video
```
python demo/inference.py --cfg demo/inference-config.yaml \
--videoFile ../../multi_people.mp4 \
--writeBoxFrames \
--outputDir output \
TEST.MODEL_FILE ../models/pytorch/pose_coco/pose_hrnet_w32_256x192.pth
```
The above command creates a video under the *output* directory and many pose images under the *output/pose* directory.
Even on a GPU (a GTX 1080 in my case), person detection takes nearly **0.06 sec** per frame and the pose estimation step
takes nearly **0.07 sec**. In total, inference takes about **0.13 sec** per frame, or roughly 8 fps. So if you need real-time (fps >= 20)
pose estimation, you should try another approach.
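The throughput follows directly from the two stage timings quoted above. As a small worked example (the function name `frames_per_second` is illustrative, and it assumes the stages run one after another on each frame with no overlap):

```python
# Per-frame stage timings quoted above (seconds, GTX 1080).
DETECTION_SEC = 0.06
POSE_SEC = 0.07


def frames_per_second(*stage_seconds):
    """Throughput when the given stages run sequentially on every frame."""
    return 1.0 / sum(stage_seconds)


fps = frames_per_second(DETECTION_SEC, POSE_SEC)
print(f"{fps:.1f} fps")  # prints "7.7 fps"

# Per-frame time budget needed to reach the real-time target of 20 fps.
budget_sec = 1.0 / 20
print(f"{budget_sec:.2f} sec per frame")  # prints "0.05 sec per frame"
```

Reaching 20 fps would therefore require both stages together to fit in 0.05 sec, less than half the measured total.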
**===Result===**
Some sample output images:
![1 person](inference_1.jpg)
Fig: 1 person inference
![3 person](inference_3.jpg)
Fig: 3 person inference
![3 person](inference_5.jpg)
Fig: 3 person inference
### 2. Demo with more common functions
Remember to update `TEST.MODEL_FILE` in `demo/inference-config.yaml` according to your model path.
`demo.py` provides the following functions:
- use `--webcam` when the input is a real-time camera.
- use `--video [video-path]` when the input is a video.
- use `--image [image-path]` when the input is an image.
- use `--write` to save the image, camera or video result.
- use `--showFps` to show the fps (this fps includes the detection part).
- draw connections between joints.
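To make the flag combinations concrete, here is a minimal `argparse` sketch that mirrors the options listed above. It is not the actual parser in `demo.py`: the function name `build_parser` is hypothetical, and treating the three input sources as mutually exclusive is an assumption made for illustration.

```python
import argparse


def build_parser():
    # Illustrative only: mirrors the documented flags, not demo.py's own code.
    parser = argparse.ArgumentParser(description="pose demo (sketch)")

    # Assumption: exactly one input source is chosen per run.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--webcam", action="store_true",
                        help="read frames from a real-time camera")
    source.add_argument("--video", metavar="VIDEO_PATH",
                        help="read frames from a video file")
    source.add_argument("--image", metavar="IMAGE_PATH",
                        help="run on a single image")

    parser.add_argument("--write", action="store_true",
                        help="save the image, camera or video result")
    parser.add_argument("--showFps", action="store_true",
                        help="show the fps, including the detection part")
    return parser


args = build_parser().parse_args(["--video", "test.mp4", "--showFps", "--write"])
print(args.video, args.showFps, args.write)  # prints "test.mp4 True True"
```

With this layout, passing two sources at once (for example `--webcam` together with `--image`) is rejected by the parser rather than silently picking one.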
#### (1) the input is a real-time camera
```
python demo/demo.py --webcam --showFps --write
```
#### (2) the input is a video
```
python demo/demo.py --video test.mp4 --showFps --write
```
#### (3) the input is an image
```
python demo/demo.py --image test.jpg --showFps --write
```
**===Result===**
![show_fps](inference_6.jpg)
Fig: show fps
![multi-people](inference_7.jpg)
Fig: multi-people