OSError
I keep getting this error: “OSError: Couldn't reach server at './model/layoutlmv3-base-finetuned-publaynet/config.json' to download configuration file or configuration file is not a valid JSON file. Please check network or file content here: ./model/layoutlmv3-base-finetuned-publaynet/config.json.”
It happens when I run the command suggested in the README at https://github.com/microsoft/unilm/tree/master/layoutlmv3 for testing the fine-tuned model for object detection:
python train_net.py --config-file cascade_layoutlmv3.yaml --eval-only --num-gpus 1
MODEL.WEIGHTS /path/to/layoutlmv3-base-finetuned-publaynet/model_final.pth
OUTPUT_DIR /path/to/layoutlmv3-base-finetuned-publaynet
I downloaded the model by using wget https://huggingface.co/HYPJUDY/layoutlmv3-base-finetuned-publaynet
As the error message suggests, your './model/layoutlmv3-base-finetuned-publaynet/config.json' path seems to be invalid. Have you checked the content of the file at that path? If there is no valid config.json file there, please download the files (config.json, config.yaml, model_final.pth) manually and save them to your path.
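One thing worth checking: running wget on a Hugging Face repository page URL saves the HTML of that page, not the model files, so the local path may not contain a real config.json at all. Below is a minimal sketch of a local sanity check, assuming the three file names listed above. The helper name check_model_dir is hypothetical and is not part of the unilm repository.

```python
import json
from pathlib import Path

# File names taken from the reply above; adjust if your setup differs.
REQUIRED_FILES = ("config.json", "config.yaml", "model_final.pth")


def check_model_dir(model_dir):
    """Return a list of human-readable problems found in model_dir."""
    root = Path(model_dir)
    if root.is_file():
        # wget on a repository page URL produces a single file with this name.
        return [f"{root.name} is a file, not a directory"]

    problems = []
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")

    config_path = root / "config.json"
    if config_path.is_file():
        text = config_path.read_text(encoding="utf-8", errors="replace")
        try:
            json.loads(text)
        except json.JSONDecodeError:
            # A saved web page starts with an HTML tag instead of "{".
            if text.lstrip().lower().startswith(("<!doctype", "<html")):
                problems.append("config.json is an HTML page, not JSON")
            else:
                problems.append("config.json is not valid JSON")
    return problems
```

Calling check_model_dir("./model/layoutlmv3-base-finetuned-publaynet") returns an empty list when all three files are present and config.json parses as JSON. Otherwise it lists what is wrong.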
Good luck with your experiments!
The model needs to be used for the object detection task. For this task, the model needs segmentation, but our dataset is annotated only with bounding boxes around the objects. Is there a recommended tool for producing segmentation in the form this model expects?
We did not use segmentation data for the object detection task. However, if you only have bounding boxes and need segmentation data, you can also treat the bounding boxes as rectangular segmentation boxes.
When I ran the pipeline from the repository for the object detection task (https://github.com/microsoft/unilm/tree/master/layoutlmv3/examples/object_detectio), it threw an error asking for the segmentation values in the JSON file for the data. The paper also describes the PubLayNet dataset, which is used for the object detection task, this way: "PubLayNet dataset [51]. The dataset contains research paper images, each annotated with bounding boxes and polygonal segmentation across five typical document layout categories: text, title, list, figure, and table." Anyway, if you think that treating the bounding boxes as rectangular segmentation boxes works, how can we generate the x,y coordinates of the pixels within the bounding boxes to use them as segmentations for training?
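A rough sketch of the rectangular-segmentation idea, assuming the annotations are in COCO-style JSON (the format PubLayNet uses) with each bbox stored as [x, y, width, height]. In that format a polygon is a flat list of vertex coordinates, so only the four corners of the box are needed, not every pixel inside it. The function names are hypothetical and not part of the repository, and filling in "area" and "iscrowd" when they are missing is an extra assumption about what the data loader expects.

```python
def bbox_to_polygon(bbox):
    """Convert a COCO bbox [x, y, width, height] into a one-polygon
    segmentation: [[x1, y1, x2, y1, x2, y2, x1, y2]]."""
    x, y, w, h = bbox
    return [[x, y, x + w, y, x + w, y + h, x, y + h]]


def add_rectangular_segmentations(coco):
    """Fill in a rectangular segmentation for every annotation that lacks one.

    `coco` is the dict loaded from a COCO-style annotation JSON file.
    The dict is modified in place and also returned.
    """
    for ann in coco["annotations"]:
        if not ann.get("segmentation"):
            ann["segmentation"] = bbox_to_polygon(ann["bbox"])
        # Assumption: these two fields may also be expected downstream.
        ann.setdefault("area", ann["bbox"][2] * ann["bbox"][3])
        ann.setdefault("iscrowd", 0)
    return coco
```

To apply it, load the annotation file with json.load, pass the resulting dict to add_rectangular_segmentations, and write it back with json.dump. Annotations that already have a segmentation are left untouched.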