Me: I want on-device AI: fast, low-latency, truly private, and convenient to use and develop for.
Microsoft: The best I can do is Copilot+. You need a special Qualcomm chip and Windows 11 24H2. Today I can give you only Recall, which takes screenshots and runs a visual model to write context about what you are doing into the unencrypted Semantic Index database for embeddings. I'm giving you the Phi Silica SLMs, accessible only via an API and SDK. In the autumn I can give you the developer tools for C#/C++ and you can use them.
Apple: The best I can do is Apple Intelligence. You need a special Apple chip and macOS 15. Today I can give you only marketing. In the autumn I can give you mysterious on-device ~3B SLMs quantized to 3.5 bits, plus diffusion models with LoRA adapters. We will have an encrypted Semantic Index database for embeddings and agentic flows with function calling, all under different names. In the autumn I will give you the developer tools in Swift and you can use them.
Open Source: The best I can do is llama.cpp. You can run it on any chip and OS. Today you can run AI inference on device and add other open-source components to your solution. I can give you local AI models, SLMs/LLMs from Qwen2-0.5B to Llama3-70B. You can have an encrypted local embeddings database with PostgreSQL/pgvector or SQLite-Vec. I can give you a wide choice of integrations and open-source components, from UIs to agentic workflows with function calling. Today I can give you the developer tools in Python/C/C++/Rust/Go/Node.js/JS/C#/Scala/Java, and you can use them.
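And "today" means today. A minimal sketch of what that looks like in practice, assuming you have built llama.cpp and started its bundled llama-server with a GGUF model of your choice (the model file name below is just an example); the server exposes an OpenAI-compatible HTTP API on localhost, port 8080 by default:

```python
# A minimal sketch: chat with a local llama.cpp server over its
# OpenAI-compatible HTTP API. Assumes you started the server yourself, e.g.:
#   ./llama-server -m qwen2-0_5b-instruct-q4_k_m.gguf   # example model file
import requests

response = requests.post(
    "http://localhost:8080/v1/chat/completions",  # llama-server default address
    json={
        "messages": [
            {"role": "user", "content": "Why run AI on device?"}
        ],
        "temperature": 0.7,
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])
```

The weights, the prompt, and the response never leave your machine; swap the model file and the same few lines keep working.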
Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.