AI & ML interests
ETL, AI, Agents, NLP
Virgo Machine Labs
We build intelligence engines for data pipelines.
Data engineers are the plumbers of the internet. When a pipeline breaks at 2am, someone gets paged. They open four terminals, dig through four log formats, and spend 47 minutes finding what should take 8 seconds. We built the tool that lets them stop being on call at 2am.
Toph Engine
Toph Engine is an automated root cause analysis system for enterprise ETL pipelines. It reads logs across every step of a pipeline simultaneously, identifies the origin of a failure, and files a plain-English ticket with the root cause and fix. Nobody gets paged.
Built for health tech and government data teams where the pipeline can't go down.
Open Source
toph-eval
An open taxonomy and evaluation framework for automated pipeline root cause analysis. 63 failure types across 10 categories, derived from observed failure patterns in enterprise health technology pipelines.
Pipeline RCA has no scoring function. The field cannot improve in a direction it cannot measure. By defining the scoring function in the open, we establish a standard against which all systems in this category can be measured.
→ github.com/vaishsagar-cfo/toph-eval
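
To make the idea of an open scoring function concrete, here is a minimal sketch of how an RCA system's output could be graded against a ground truth answer key. This is an illustration only, not the toph-eval API: the `Label` type, the `score` and `mean_score` functions, the partial-credit weights, and the category and failure type names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """A root cause label. Both field values here are made-up examples."""
    category: str
    failure_type: str


def score(predicted: Label, truth: Label) -> float:
    """Assumed rubric: 1.0 for the exact failure type,
    0.5 for the right category only, 0.0 otherwise."""
    if predicted == truth:
        return 1.0
    if predicted.category == truth.category:
        return 0.5
    return 0.0


def mean_score(predictions: dict[str, Label], answer_key: dict[str, Label]) -> float:
    """Average score over every scenario in the answer key.
    A scenario with no prediction counts as 0.0."""
    if not answer_key:
        raise ValueError("answer key is empty")
    total = 0.0
    for scenario_id, truth in answer_key.items():
        predicted = predictions.get(scenario_id)
        if predicted is not None:
            total += score(predicted, truth)
    return total / len(answer_key)


# Hypothetical answer key and system output for three scenarios.
answer_key = {
    "s1": Label("schema", "column_dropped"),
    "s2": Label("auth", "token_expired"),
    "s3": Label("resource", "disk_full"),
}
predictions = {
    "s1": Label("schema", "column_dropped"),   # exact match
    "s2": Label("auth", "permission_denied"),  # right category only
    # "s3" is missing, so it scores 0.0
}

result = mean_score(predictions, answer_key)
print(result)
```

Giving partial credit for the right category is one possible design choice; a stricter rubric could score exact matches only.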
Datasets
| Dataset | Description |
|---|---|
| toph-eval-scenarios | Eval benchmark: 10 simulated pipeline failure scenarios with ground truth answer keys |
| toph-eval-knowledge | RAG knowledge base: structured reference documents for 10 pipeline failure types |