Santiago Viquez

santiviquez

AI & ML interests

ML @ NannyML. A bit of everything. NLP, RL, and, of course, tabular. In the GenAI era, how can you not love tabular data? Educational content and OSS.

Posts 17

More open research updates 🧡

Performance estimation is currently the best way to quantify the impact of data drift on model performance. 💡

I've been benchmarking performance estimation methods (CBPE and M-CBPE) against data drift signals.

I'm using drift results as features for several regression algorithms, which I then use to estimate the model's performance. Finally, I'm measuring the Mean Absolute Error (MAE) between the regression models' predictions and the actual performance.
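Here is a minimal sketch of that comparison, not the actual benchmark code. Everything in it is an assumption for illustration: the data is synthetic, `drift_signals`, `realized_perf`, and `estimated_perf` are hypothetical names, and a plain least-squares fit stands in for the "many regression algorithms". The printed numbers say nothing about the real experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins, one row per data chunk (all values are made up).
n_chunks = 200
drift_signals = rng.normal(size=(n_chunks, 3))  # hypothetical drift scores
realized_perf = (
    0.85 - 0.03 * drift_signals[:, 0] + rng.normal(scale=0.01, size=n_chunks)
)
# Hypothetical output of a performance estimation method for the same chunks.
estimated_perf = realized_perf + rng.normal(scale=0.01, size=n_chunks)


def mae(y_true, y_pred):
    """Mean Absolute Error between two equally sized sequences."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


# Time-ordered split: fit on the first chunks, evaluate on the later ones.
split = 150
X_train = np.column_stack([np.ones(split), drift_signals[:split]])
X_test = np.column_stack([np.ones(n_chunks - split), drift_signals[split:]])

# Ordinary least squares: drift signals -> realized performance.
coef, *_ = np.linalg.lstsq(X_train, realized_perf[:split], rcond=None)
drift_based_pred = X_test @ coef

mae_drift = mae(realized_perf[split:], drift_based_pred)
mae_estimation = mae(realized_perf[split:], estimated_perf[split:])
print(f"MAE, drift signals as features: {mae_drift:.4f}")
print(f"MAE, performance estimation:    {mae_estimation:.4f}")
```

Both approaches are scored on the same held-out chunks against the same realized performance, so the two MAE values are directly comparable.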

So far, for all my experiments, performance estimation methods do better than drift signals. 👨‍🔬

Bear in mind that these are early results; I'm running the flow on more datasets as we speak.

Hopefully, by next week, I will have more results to share 👀

How would you benchmark performance estimation algorithms vs data drift signals?

I'm working on a benchmarking analysis, and I'm currently doing the following:

- Get univariate and multivariate drift signals and measure their correlation with realized performance.
- Use drift signals as features of a regression model to predict the model's performance.
- Use drift signals as features of a classification model to predict a performance drop.
- Compare all the above experiments with results from Performance Estimation algorithms.
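The first and third bullets could be sketched roughly like this. It is an illustration under loud assumptions, not the actual analysis: the data is synthetic, the names (`drift`, `realized`, `reference_perf`, `tolerance`) are hypothetical, and a one-feature threshold rule stands in for a real classification model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins, one row per chunk (all values are made up).
# Column 0: a univariate drift signal, column 1: a multivariate drift signal.
n_chunks = 300
drift = rng.normal(size=(n_chunks, 2))
realized = (
    0.80
    - 0.04 * np.clip(drift[:, 1], 0, None)
    + rng.normal(scale=0.01, size=n_chunks)
)

# Bullet 1: Pearson correlation of each drift signal with realized performance.
corr = [float(np.corrcoef(drift[:, j], realized)[0, 1]) for j in range(2)]

# Bullet 3: label a chunk as a "performance drop" when realized performance
# falls below a reference level minus a tolerance (both values are assumed).
reference_perf = 0.80
tolerance = 0.02
is_drop = realized < reference_perf - tolerance


def threshold_classifier(signal, threshold):
    """Predict a drop whenever the drift signal exceeds the threshold."""
    return signal > threshold


def precision_recall(y_true, y_pred):
    """Precision and recall for boolean arrays; 0.0 when undefined."""
    tp = int(np.sum(y_true & y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Pick the threshold on the first 200 chunks, evaluate on the rest.
train, test = slice(0, 200), slice(200, None)
candidates = np.quantile(drift[train, 1], np.linspace(0.05, 0.95, 19))
accuracies = [
    np.mean(threshold_classifier(drift[train, 1], t) == is_drop[train])
    for t in candidates
]
best_threshold = float(candidates[int(np.argmax(accuracies))])

pred_drop = threshold_classifier(drift[test, 1], best_threshold)
precision, recall = precision_recall(is_drop[test], pred_drop)
print("correlations:", [round(c, 3) for c in corr])
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

The same `is_drop` labels could also be compared against drops flagged by a performance estimation method, which would cover the last bullet.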

Any other ideas?