Image-Text-to-Text
nexaml committed
Commit 3e8cdd1 · verified · 1 Parent(s): 6eda227

Update README.md

Files changed (1): README.md (+7 -3)
README.md CHANGED
@@ -2,7 +2,7 @@
 
 ## **Introduction**
 
-**AutoNeural** is a next-generation, **NPU-native multimodal vision–language model** co-designed from the ground up for real-time, on-device inference. Instead of adapting GPU-first architectures, AutoNeural redesigns both **vision encoding** and **language modeling** for the constraints and capabilities of NPUs—achieving **14× faster latency**, **7× lower quantization error**, and **real-time automotive performance** even under aggressive low-precision settings.
+**AutoNeural** is an **NPU-native multimodal vision–language model** co-designed from the ground up for real-time, on-device inference on NPU. Instead of adapting GPU-first architectures, AutoNeural redesigns both **vision encoding** and **language modeling** for the constraints and capabilities of NPUs—achieving **14× faster latency**, **3× higher input resolution**, **7× lower quantization error**, and **real-time automotive performance** even under aggressive low-precision settings.
 
 AutoNeural integrates:
 
@@ -11,13 +11,13 @@ AutoNeural integrates:
 * A **normalization-free MLP connector** tailored for quantization stability.
 * Mixed-precision **W8A16 (vision)** and **W4A16 (language)** inference validated on real Qualcomm NPUs.
 
-AutoNeural powers real-time cockpit intelligence including **in-cabin safety**, **out-of-cabin awareness**, **HMI understanding**, and **visual + conversational function calls**.
-
 ---
 
 ## Use Cases
 
+AutoNeural powers real-time cockpit intelligence including **in-cabin detection**, **out-cabin awareness**, **HMI understanding**, and **visual + conversational function calls**.
 
+![Use_Cases](https://cdn-uploads.huggingface.co/production/uploads/6851901ea43b4824f79e27a9/hBbX11v67NNsZ7XLsm_RD.png)
 
 ---
 
@@ -33,6 +33,8 @@ Validated on **Qualcomm SA8295P NPU**:
 | **Decode Throughput** | 15 tok/s | **44 tok/s** |
 | **Context Length** | 1024 | **4096** |
 
+
+
 ---
 
 # **How to Use**
@@ -70,6 +72,8 @@ Multiple images can be processed with a single query.
 
 ## **Key Features**
 
+![Model Architecture](https://cdn-uploads.huggingface.co/production/uploads/6851901ea43b4824f79e27a9/eHNdopWWaoir2IP3Cu_AF.png)
+
 ### 🔍 **MobileNetV5 Vision Encoder (300M)**
 
 Optimized for edge hardware, with:
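
For readers skimming the diff, the **normalization-free MLP connector** mentioned in the README is the module that projects vision-encoder features into the language model's embedding space while omitting LayerNorm/RMSNorm, which helps keep activation ranges stable under low-bit quantization. The sketch below is only an illustration of that general idea, assuming placeholder layer sizes and a GELU activation; it is not AutoNeural's actual implementation or configuration.

```python
# Illustrative sketch only: a "normalization-free" MLP connector that maps
# vision-encoder patch features into the language model's embedding space.
# vision_dim / llm_dim below are placeholder assumptions, not AutoNeural's real values.
import torch
import torch.nn as nn


class MLPConnector(nn.Module):
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 2048):
        super().__init__()
        # Two linear layers with a GELU in between; no LayerNorm/RMSNorm,
        # so the connector avoids normalization statistics that can amplify
        # quantization error at low precision.
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, vision_tokens: torch.Tensor) -> torch.Tensor:
        # vision_tokens: (batch, num_patches, vision_dim) -> (batch, num_patches, llm_dim)
        return self.proj(vision_tokens)


# Example: project 196 patch embeddings into the LLM token-embedding space.
tokens = torch.randn(1, 196, 1024)
print(MLPConnector()(tokens).shape)  # torch.Size([1, 196, 2048])
```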