alanzhuly committed on
Commit
48fbc9a
1 Parent(s): 39535c4

Update README.md

  1. README.md +7 -2
README.md CHANGED
@@ -87,8 +87,13 @@ We enhance the model's contextual understanding using image-based question-answe
  **Direct Preference Optimization (DPO):**
  The final stage implements DPO by first generating responses to images using the base model. A teacher model then produces minimally edited corrections that maintain high semantic similarity with the original responses, focusing specifically on accuracy-critical elements. The corrected outputs (chosen) and the original outputs (rejected) form the preference pairs. The fine-tuning targets essential improvements to model output without altering the model's core response characteristics.
 
- ## What's next?
- We are continually improving Omnivision for better on-device performance. Stay tuned.
+ ## What's next for Omnivision?
+ Omnivision is in early development and we are working to address current limitations:
+ - Expand DPO Training: Increase the scope of DPO (Direct Preference Optimization) training in an iterative process to continually improve model performance and response quality.
+ - Develop an Action + Conversation Model: Leverage Omnivision’s vision and conversational capacities to build an action model capable of understanding and interacting with visual and text inputs.
+ - Improve document and text understanding
+
+ In the long term, we aim to develop Omnivision as a fully optimized, production-ready solution for edge AI multimodal applications.
 
  ### Follow us
  [Blogs](https://nexa.ai) | [Discord](https://discord.gg/nexa-ai) | [X(Twitter)](https://x.com/alanzhuly)
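
Reviewer note: the DPO paragraph in the context lines of this diff describes building preference pairs from base-model responses and minimally edited teacher corrections. The following is a minimal sketch of that idea, not Omnivision's actual training code. The names `PreferencePair`, `build_pair`, `dpo_loss`, `min_similarity`, and `beta` are hypothetical, and the character-level `difflib.SequenceMatcher` ratio is only a crude stand-in for whatever semantic-similarity measure the authors use. The loss is the standard DPO objective for a single pair, computed from sequence log-probabilities under the policy and a frozen reference model.

```python
import math
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # teacher-corrected response
    rejected: str  # original base-model response


def build_pair(prompt: str, original: str, corrected: str,
               min_similarity: float = 0.8) -> Optional[PreferencePair]:
    """Return a preference pair only if the correction is a small, real edit."""
    if corrected == original:
        return None  # nothing was corrected, so there is no preference signal
    # Assumption: character-level ratio as a proxy for semantic similarity.
    similarity = SequenceMatcher(None, original, corrected).ratio()
    if similarity < min_similarity:
        return None  # the edit is too large to count as "minimal"
    return PreferencePair(prompt=prompt, chosen=corrected, rejected=original)


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(margin))."""
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    pair = build_pair(
        prompt="What color is the car?",
        original="The car in the image is red.",
        corrected="The car in the image is blue.",
    )
    print(pair)
    # Illustrative log-probabilities only; real values come from the models.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

Filtering on similarity keeps the chosen and rejected responses close to each other, so the preference signal concentrates on the corrected detail rather than on style or length differences.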