@Cuiunbo on Hugging Face: "Introducing GUICourse! 🎉 By leveraging extensive OCR pretraining with…"

Cuiunbo

posted an update Jun 11, 2024

Introducing GUICourse! 🎉
By combining extensive OCR pretraining with grounding ability, we unlock the potential of parsing-free methods for GUI agents.
📄 Paper: GUICourse: From General Vision Language Models to Versatile GUI Agents (2406.11317)
🌐 GitHub Repo: https://github.com/yiye3/GUICourse
📖 Dataset: yiye2023/GUIAct / yiye2023/GUIChat / yiye2023/GUIEnv
🎯 Model: RhapsodyAI/minicpm-guidance / RhapsodyAI/qwen_vl_guidance

anothercoder2

Jul 1, 2024

Any idea when the models might be released?
Also, the GUIEnv-local dataset download and viewer both have errors.
Any plans to release the GUIEnv-global dataset?
Thanks

Cuiunbo

Jul 1, 2024

We'll be releasing our model this week!
But because GUIEnv-Global is too large, we will only open-source our rendering code.
