Spaces: Netrava/omniparser-api

Netrava committed on Aug 3, 2025

Commit aa9ab92 verified · 1 Parent(s): 0b851ec

Delete README.md

Files changed (1)

README.md +0 -101

README.md DELETED

@@ -1,101 +0,0 @@
----
-title: OmniParser v2.0 API
-emoji: 🖼️
-colorFrom: blue
-colorTo: indigo
-sdk: gradio
-sdk_version: 4.0.0
-app_file: app.py
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
-# OmniParser v2.0 API
-This is a public API endpoint for Microsoft's OmniParser v2.0, which can parse UI screenshots and return structured data.
-## Features
-- Parses UI screenshots into structured JSON data
-- Identifies interactive elements (buttons, menus, icons, etc.)
-- Provides captions describing the functionality of each element
-- Returns visualization of detected elements
-- Accessible via a simple REST API
-## API Usage
-You can use this API by sending a POST request with a file upload:
-```python
-import requests
-# Replace with your actual API URL after deployment
-OMNIPARSER_API_URL = "https://your-username-omniparser-api.hf.space/api/parse"
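-# Upload the screenshot; the with-block closes the file after the request
-with open('screenshot.png', 'rb') as image_file:
-    response = requests.post(OMNIPARSER_API_URL, files={'image': image_file})
-# Fail early on HTTP errors, then parse the JSON result
-response.raise_for_status()
-result = response.json()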
-# Access parsed elements
-elements = result["elements"]
-for element in elements:
-    print(f"Element {element['id']}: {element['text']} - {element['caption']}")
-    print(f"Coordinates: {element['coordinates']}")
-    print(f"Interactable: {element['is_interactable']}")
-    print(f"Confidence: {element['confidence']}")
-    print("---")
-# Access visualization (base64 encoded image)
-visualization_base64 = result["visualization"]
-```
-## Response Format
-The API returns a JSON object with the following structure:
-```json
-{
-  "status": "success",
-  "elements": [
-    {
-      "id": 0,
-      "text": "Button 1",
-      "caption": "Click to submit form",
-      "coordinates": [0.1, 0.1, 0.3, 0.2],
-      "is_interactable": true,
-      "confidence": 0.95
-    },
-    {
-      "id": 1,
-      "text": "Menu",
-      "caption": "Navigation menu",
-      "coordinates": [0.4, 0.5, 0.6, 0.6],
-      "is_interactable": true,
-      "confidence": 0.87
-    }
-  ],
-  "visualization": "base64_encoded_image_string"
-}
-```
-## Deployment
-This API is deployed on Hugging Face Spaces using Gradio. The deployment is free and provides a public URL that you can use in your applications.
-## Credits
-This API uses Microsoft's OmniParser v2.0, which is a screen parsing tool for pure vision-based GUI agents. For more information, visit the [OmniParser GitHub repository](https://github.com/microsoft/OmniParser).
-## License
-Please note that the OmniParser models have specific licenses:
-- The icon_detect model is under the AGPL license
-- The icon_caption model is under the MIT license
-Please refer to the LICENSE file in the folder of each model in the original repository.
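
For readers who still consume responses in the shape the deleted README documented, the following is a minimal sketch of post-processing such a payload. It is not part of the original file. The helper names (`to_pixel_box`, `interactable_elements`, `decode_visualization`) are hypothetical, the 0.5 default threshold is arbitrary, and the code assumes `coordinates` holds a normalized `[x1, y1, x2, y2]` box. The sample values suggest that layout, but the README never states it.

```python
import base64


def to_pixel_box(coordinates, width, height):
    """Scale a box to integer pixels.

    Assumption: coordinates is a normalized [x1, y1, x2, y2] list in the 0..1 range.
    """
    x1, y1, x2, y2 = coordinates
    return (round(x1 * width), round(y1 * height),
            round(x2 * width), round(y2 * height))


def interactable_elements(result, min_confidence=0.5):
    """Return elements marked interactable with confidence >= min_confidence."""
    if result.get("status") != "success":
        raise ValueError(f"unexpected status: {result.get('status')!r}")
    return [
        element
        for element in result.get("elements", [])
        if element.get("is_interactable")
        and element.get("confidence", 0.0) >= min_confidence
    ]


def decode_visualization(result):
    """Decode the base64 'visualization' field into raw image bytes."""
    return base64.b64decode(result["visualization"])
```

With the sample response from the README and a 1000x500 screenshot, `to_pixel_box(result["elements"][0]["coordinates"], 1000, 500)` would give the pixel box for "Button 1", and the bytes returned by `decode_visualization(result)` could be written to a file opened in binary mode.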