Update README.md
README.md
CHANGED
@@ -87,8 +87,13 @@ We enhance the model's contextual understanding using image-based question-answe
 **Direct Preference Optimization (DPO):**
 The final stage implements DPO by first generating responses to images using the base model. A teacher model then produces minimally edited corrections while maintaining high semantic similarity with the original responses, focusing specifically on accuracy-critical elements. These original and corrected outputs form chosen-rejected pairs. The fine-tuning targets essential model output improvements without altering the model's core response characteristics.
 
-## What's next?
-
+## What's next for Omnivision?
+Omnivision is in early development and we are working to address current limitations:
+- Expand DPO Training: Increase the scope of DPO (Direct Preference Optimization) training in an iterative process to continually improve model performance and response quality.
+- Develop an Action + Conversation Model: Leverage Omnivision’s vision and conversational capacities to build an action model capable of understanding and interacting with visual and text inputs.
+- Improve document and text understanding
+
+In the long term, we aim to develop Omnivision as a fully optimized, production-ready solution for edge AI multimodal applications.
 
 ### Follow us
 [Blogs](https://nexa.ai) | [Discord](https://discord.gg/nexa-ai) | [X(Twitter)](https://x.com/alanzhuly)
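The DPO stage described in the README text (base-model responses, minimally edited teacher corrections, chosen-rejected pairs) could be sketched roughly as follows. This is an illustrative sketch, not the project's actual pipeline: `PreferencePair`, `build_pair`, `dpo_loss`, the `min_similarity` threshold, and the `beta` default are all hypothetical names and values. It assumes the teacher's correction is the "chosen" response and the original output is the "rejected" one, and it uses a character-level `difflib` ratio as a crude stand-in for whatever semantic-similarity measure is really used.

```python
import math
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # teacher-corrected response (assumed to be the preferred one)
    rejected: str  # original base-model response


def build_pair(prompt, original, corrected, min_similarity=0.8):
    """Return a pair only if the correction is a small, non-trivial edit."""
    # Character-level ratio: a crude stand-in for semantic similarity.
    similarity = SequenceMatcher(None, original, corrected).ratio()
    if original == corrected or similarity < min_similarity:
        return None
    return PreferencePair(prompt, chosen=corrected, rejected=original)


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair, given summed sequence log-probabilities."""
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)), written as softplus(-margin) with an overflow guard.
    return math.log1p(math.exp(-margin)) if margin > -30 else -margin
```

Filtering on similarity keeps only pairs where the two responses differ in a few tokens, so the preference signal concentrates on the corrected details rather than on overall style. The loss decreases as the policy assigns relatively more probability to the chosen response than the frozen reference model does.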