---
license: apache-2.0
tags:
- RyzenAI
- object-detection
- vision
- YOLO
- anchor-free
- pytorch
datasets:
- coco
metrics:
- mAP
---

# YOLOX-small model trained on COCO

YOLOX-small is the small variant of the YOLOX model, trained for COCO object detection (118k annotated images) at resolution 640x640. It was introduced in the paper [YOLOX: Exceeding YOLO Series in 2021](https://arxiv.org/abs/2107.08430) by Zheng Ge et al. and first released in [this repository](https://github.com/Megvii-BaseDetection/YOLOX).

We have developed a modified version that is supported by [AMD Ryzen AI](https://ryzenai.docs.amd.com).

## Model description

Building on the YOLO detector family, YOLOX adopts an anchor-free head and incorporates other advanced detection techniques, including a decoupled head and the leading label assignment strategy SimOTA, to achieve state-of-the-art results across a wide range of model scales. The series of models was developed by Megvii Inc. and won 1st place in the Streaming Perception Challenge (Workshop on Autonomous Driving at CVPR 2021).

## Intended uses & limitations

You can use the raw model for object detection. See the [model hub](https://huggingface.co/models?search=amd/yolox) to look for all available YOLOX models.

## How to use

### Installation

Follow the [Ryzen AI Installation](https://ryzenai.docs.amd.com/en/latest/inst.html) guide to prepare the environment for Ryzen AI. Then run the following script to install the prerequisites for this model:

```sh
pip install -r requirements.txt
```

### Data Preparation (optional: for accuracy evaluation)

The MS COCO 2017 dataset contains 118,287 images for training and 5,000 images for validation. Download the validation images ([val2017.zip](http://images.cocodataset.org/zips/val2017.zip)) and the annotations ([annotations_trainval2017.zip](http://images.cocodataset.org/annotations/annotations_trainval2017.zip)), then unzip the files and move them to the following directories (or create soft links):

```plain
└── data
    └── COCO
        ├── annotations
        |   ├── instances_val2017.json
        |   └── ...
        └── val2017
            ├── 000000000139.jpg
            ├── 000000000285.jpg
            └── ...
```

### Test & Evaluation

- Code snippet from [`infer_onnx.py`](infer_onnx.py) showing how to use the model (the `preprocess` and `postprocess` helpers it relies on are sketched after the snippet):

```python
import os

import cv2
import numpy as np
import onnxruntime as ort

# make_parser, preprocess, postprocess, vis, COCO_CLASSES, and mkdir
# are defined elsewhere in this repository.

args = make_parser().parse_args()
input_shape = tuple(map(int, args.input_shape.split(',')))
origin_img = cv2.imread(args.image_path)
img, ratio = preprocess(origin_img, input_shape)

if args.ipu:
    providers = ["VitisAIExecutionProvider"]
    provider_options = [{"config_file": args.provider_config}]
else:
    providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
    provider_options = None
session = ort.InferenceSession(args.model, providers=providers,
                               provider_options=provider_options)

# NCHW format
# ort_inputs = {session.get_inputs()[0].name: img[None, :, :, :]}
# NHWC format
ort_inputs = {session.get_inputs()[0].name: np.transpose(img[None, :, :, :], (0, 2, 3, 1))}
outputs = session.run(None, ort_inputs)
outputs = [np.transpose(out, (0, 3, 1, 2)) for out in outputs]  # NHWC outputs back to NCHW

dets = postprocess(outputs, input_shape, ratio)
if dets is not None:
    final_boxes, final_scores, final_cls_inds = dets[:, :4], dets[:, 4], dets[:, 5]
    origin_img = vis(origin_img, final_boxes, final_scores, final_cls_inds,
                     conf=args.score_thr, class_names=COCO_CLASSES)
mkdir(args.output_dir)
output_path = os.path.join(args.output_dir, os.path.basename(args.image_path))
cv2.imwrite(output_path, origin_img)
```
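`preprocess` letterboxes the input image to the network resolution while preserving aspect ratio. The exact helper in this repository is not reproduced above; below is a minimal sketch, assuming it mirrors the `preproc` utility from the upstream YOLOX demo code (pad with the constant 114, scale by a single ratio, convert HWC to CHW float32):

```python
import cv2
import numpy as np

def preprocess(img, input_size):
    """Letterbox `img` (HWC, BGR) to `input_size` (height, width).

    Returns the CHW float32 tensor and the resize ratio needed to map
    detections back to the original image.
    """
    # Gray (114) canvas at the target resolution.
    padded_img = np.ones((input_size[0], input_size[1], 3), dtype=np.uint8) * 114
    # Single scale factor that fits the image inside the canvas.
    r = min(input_size[0] / img.shape[0], input_size[1] / img.shape[1])
    resized = cv2.resize(
        img,
        (int(img.shape[1] * r), int(img.shape[0] * r)),
        interpolation=cv2.INTER_LINEAR,
    ).astype(np.uint8)
    padded_img[: int(img.shape[0] * r), : int(img.shape[1] * r)] = resized
    # HWC -> CHW; the snippet above transposes to NHWC before feeding the model.
    return padded_img.transpose((2, 0, 1)).astype(np.float32), r
```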
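`postprocess` turns the raw anchor-free head outputs into boxes: each grid cell on the stride-8, 16, and 32 feature maps predicts a center offset relative to its cell plus log-scale width/height, which are scaled back to input pixels before score filtering and NMS. The repo's implementation is not reproduced here; the sketch below (hypothetical function name, following the decoding in the upstream YOLOX demo utilities) illustrates the decoding step for predictions flattened across the three levels:

```python
import numpy as np

def decode_yolox_outputs(flat_preds, img_size=(640, 640), strides=(8, 16, 32)):
    """Decode flattened YOLOX predictions of shape (N, 5 + num_classes).

    Rows are ordered level by level (stride 8, then 16, then 32), row-major
    within each level; columns 0-3 hold (x, y, w, h) in grid units, column 4
    is objectness, and the remaining columns are class scores.
    """
    grids, expanded_strides = [], []
    for stride in strides:
        hsize, wsize = img_size[0] // stride, img_size[1] // stride
        xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
        grids.append(np.stack((xv, yv), axis=2).reshape(-1, 2))
        expanded_strides.append(np.full((hsize * wsize, 1), stride))
    grids = np.concatenate(grids, axis=0)
    expanded_strides = np.concatenate(expanded_strides, axis=0)

    decoded = flat_preds.copy()
    # Center: predicted offset plus cell index, scaled to input pixels.
    decoded[:, :2] = (flat_preds[:, :2] + grids) * expanded_strides
    # Size: exponentiated log-scale prediction, scaled by the level's stride.
    decoded[:, 2:4] = np.exp(flat_preds[:, 2:4]) * expanded_strides
    return decoded
```

After decoding, boxes are divided by the `ratio` returned from `preprocess` to map them back to the original image, then filtered by score and passed through NMS.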
- Run inference for a single image:

```sh
python infer_onnx.py -m yolox-s-int8.onnx -i Path\To\Your\Image --ipu --provider_config Path\To\vaip_config.json
```

*Note: __vaip_config.json__ is located in the setup package of Ryzen AI (refer to [Installation](#installation)).*

- Test the accuracy of the quantized model:

```sh
python eval_onnx.py -m yolox-s-int8.onnx --ipu --provider_config Path\To\vaip_config.json
```

### Performance

| Metric | Accuracy on IPU |
| :----: | :----: |
| AP\@0.50:0.95 | 0.370 |

### Citation

```bibtex
@article{yolox2021,
  title={YOLOX: Exceeding YOLO Series in 2021},
  author={Ge, Zheng and Liu, Songtao and Wang, Feng and Li, Zeming and Sun, Jian},
  journal={arXiv preprint arXiv:2107.08430},
  year={2021}
}
```