I'm thrilled to be releasing the first iteration of a project I've been working on for quite a while now. It's called Taproot: a seamlessly scalable, open-source AI/ML inference engine designed to let developers build real-time experiences on small-to-mid-sized clusters, without the burden of hyperscale infrastructure.
Along with the server and task framework is a client library for Node and the browser. And what good is a server and client without an app to go alongside them? To that end, I'm also releasing Anachrovox, a fun, real-time, hands-free voice assistant that can run on mid-level devices in under 12 GB of VRAM, with web search, weather, and other tools. It uses my real-time browser wake-word library to detect utterances of 'Hey Vox', 'Hi Vox', 'Okay Vox', 'Anachrovox', or just 'Vox' (alongside some others).
Releasing this many things at once will definitely result in bugs, so please report them when sighted! Thank you all!
The Anachrovox Spaces are networked together, balancing load across them to keep every front-end responsive. You only have to choose the color you like most!
Meta teams use a fine-tuned Llama model to fix production issues in seconds
One of Meta's engineering teams shared how they use a fine-tuned small Llama (Llama-2-7B, so not even a very recent model) to identify the root cause of production issues with 42% accuracy.
42%, is that not too low? Usually, whenever there's an issue in production, engineers dive into recent code changes to find the offending commit. At Meta's scale (thousands of daily changes), this is like finding a needle in a haystack. So when the LLM-based suggestion is right, it cuts incident resolution time from hours to seconds!
How did they do it?
Two-step approach:
- Heuristics (code ownership, directory structure, runtime graphs) reduce thousands of potential changes to a manageable set
- A fine-tuned Llama 2 7B ranks the most likely culprits
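The two-step funnel can be sketched in a few lines. Everything here is hypothetical: the `CodeChange` fields, the heuristic rules, and `score_fn` (a stand-in for the fine-tuned model call) are illustrative assumptions, not Meta's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class CodeChange:
    """Hypothetical record of one code change under suspicion."""
    id: str
    directory: str
    owner_team: str
    touches_runtime_path: bool

def heuristic_filter(changes, affected_dir, affected_team):
    """Step 1: cheap heuristics (ownership, directory, runtime graph)
    shrink thousands of candidates to a manageable set."""
    return [
        c for c in changes
        if c.directory.startswith(affected_dir)
        or c.owner_team == affected_team
        or c.touches_runtime_path
    ]

def llm_rank(candidates, incident_summary, score_fn):
    """Step 2: the fine-tuned model scores each surviving candidate;
    score_fn here is a placeholder for the real model inference."""
    return sorted(
        candidates,
        key=lambda c: score_fn(incident_summary, c),
        reverse=True,
    )
```

The point of the split is cost: the heuristics are nearly free, so the expensive model only ever sees a short list.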
Training pipeline:
- Continued pre-training on Meta's internal docs and wikis
- Supervised fine-tuning on past incident investigations
- Training data mimicked real-world constraints (2-20 potential changes per incident)
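A minimal sketch of how one supervised fine-tuning record might be assembled under that 2-20 candidate constraint. The prompt/target format and field names are my assumptions for illustration; the post doesn't describe Meta's actual data schema.

```python
import random

def make_sft_example(incident_summary, culprit, distractors, rng=None):
    """Build one training record: the true culprit plus 1-19 distractor
    changes, for 2-20 candidates total (format is hypothetical)."""
    rng = rng or random.Random(0)  # seeded default for reproducibility
    n = rng.randint(1, min(19, len(distractors)))
    candidates = rng.sample(distractors, n) + [culprit]
    rng.shuffle(candidates)  # don't leak the answer via position
    prompt = (
        incident_summary
        + "\nCandidates:\n"
        + "\n".join(f"- {c}" for c in candidates)
    )
    return {"prompt": prompt, "target": culprit}
```

Matching the candidate-count distribution the model will see in production is the key idea: training on thousands of candidates when inference always sees a heuristic-filtered handful would be a train/serve mismatch.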
Now future developments await:
- Language models could handle more of the incident-response workflow (runbooks, mitigation, post-mortems)
- Improvements in model reasoning should boost accuracy further