Director of Machine Learning Insights [Part 4]

Published November 23, 2022

If you're interested in building ML solutions faster, visit hf.co/support today!

👋 Welcome back to our Director of ML Insights Series! If you missed the earlier editions, you can find them here:

🚀 In this fourth installment, you’ll hear what the following top Machine Learning Directors say about Machine Learning’s impact on their respective industries: Javier Mansilla, Shaun Gittens, Samuel Franklin, and Evan Castle. All are currently Directors of Machine Learning with rich field insights.

Disclaimer: All views are from individuals and not from any past or current employers.

Javier Mansilla

Background: Seasoned entrepreneur and leader, Javier was co-founder and CTO of Machinalis, a high-end company building Machine Learning since 2010 (yes, before the breakthrough of neural nets). When Machinalis was acquired by Mercado Libre (NASDAQ: MELI), that small team evolved to enable Machine Learning as a capability for a tech giant with more than 10k devs, impacting the lives of almost 100 million direct users. Daily, Javier leads not only the tech and product roadmap of their Machine Learning Platform, but also their users' tracking system, the AB Testing framework, and the open-source office. Javier is an active member and contributor of PyAr, the Python-Argentina non-profit. He loves hanging out with family and friends, Python, biking, football, carpentry, and slow-paced holidays in nature!

Fun Fact: I love reading science fiction, and my idea of retirement includes resuming the teenage dream of writing short stories.📚

Mercado Libre: The biggest company in Latam and the omnipresent eCommerce & fintech solution for the continent

1. How has ML made a positive impact on e-commerce?

I would say that ML made the impossible possible in specific cases like fraud prevention and optimized processes and flows in ways we couldn't have imagined in a vast majority of other areas.

In the middle, there are applications where ML enabled a next level of UX that otherwise would be very expensive (but maybe possible). For example, the discovery and serendipity added to users' journey navigating between listings and offers.

We run search, recommendations, ads, credit-scoring, moderations, forecasting of several key aspects, logistics, and a lot more core units with Machine Learning optimizing at least one of their fundamental metrics.

We even use ML to optimize the way we reserve and use infrastructure.

2. What are the biggest ML challenges within e-commerce?

Besides all the technical challenges ahead (for instance, more and more real-timeness and personalization), the biggest challenge is the always-present focus on the end-user.

E-commerce is scaling its share of the market year after year, and Machine Learning is always a probabilistic approach that doesn't provide 100% perfection. We need to be careful to keep optimizing our products while still paying attention to the long tail and the experience of each individual person.

Finally, a growing challenge is coordinating and fostering data (inputs and outputs) co-existence in a multi-channel and multi-business world—marketplace, logistics, credits, insurance, payments on brick-and-mortar stores, etc.

3. A common mistake you see people make trying to integrate ML into e-commerce?

The most common mistakes are related to using the wrong tool for the wrong problem.

For instance, starting complex instead of with the simplest baseline possible. For instance, not measuring the impact with and without machine learning. For instance, investing in tech without having a clear idea of the boundaries of the expected gain.

Last but not least: thinking only in the short term, forgetting about the hidden impacts, technical debts, maintenance, and so on.

4. What excites you most about the future of ML?

Speaking from the perspective of being in the trenches, crafting technology with our bare hands like we used to do ten years ago, what I definitely like the most is seeing that we as an industry are solving most of the slow, repetitive and boring pieces of the challenge.

It’s of course an ever-moving target, and new difficulties arise. But we are getting better at incorporating mature tools and practices that will lead to shorter cycles of model-building which, at the end of the day, reduces time to market.

Shaun Gittens

Background: Dr. Shaun Gittens is the Director of the Machine Learning Capability of MasterPeace Solutions, Ltd., a company specializing in providing advanced technology and mission-critical cyber services to its clients. In this role, he is:

Growing the core of machine learning experts and practitioners at the company.
Increasing the knowledge of bleeding-edge machine learning practices among its existing employees.
Ensuring the delivery of effective machine learning solutions and consulting support not only to the company’s clientele but also to the start-up companies currently being nurtured from within MasterPeace.

Before joining MasterPeace, Dr. Gittens served as Principal Data Scientist for the Applied Technology Group, LLC. He built his career on training and deploying machine learning solutions on distributed big data and streaming platforms such as Apache Hadoop, Apache Spark, and Apache Storm. As a postdoctoral fellow at Auburn University, he investigated effective methods for visualizing the knowledge gained from trained non-linear machine-learned models.

Fun Fact: Addicted to playing tennis & Huge anime fan. 🎾

MasterPeace Solutions: MasterPeace Solutions has emerged as one of the fastest-growing advanced technology companies in the Mid-Atlantic region. The company designs and develops software, systems, solutions and products to solve some of the most pressing challenges facing the Intelligence Community.

1. How has ML made a positive impact on Engineering?

Engineering is vast in its applications and can encompass a great many areas. That said, more recently, we are seeing ML affect a range of engineering facets, from obvious fields such as robotics and automobile engineering to not-so-obvious fields such as chemical and civil engineering. ML is so broad in its application that the mere existence of training data consisting of previously recorded labor processes is all that is required to attempt to have ML affect your bottom line. In essence, we are in an age where ML has significantly impacted the automation of all sorts of previously human-only-operated engineering processes.

2. What are the biggest ML challenges within Engineering?

The biggest challenges come with the operationalization and deployment of ML-trained solutions in a manner in which human operations can be replaced with minimal consequences. We’re seeing it now with fully self-driving automobiles. It’s challenging to automate processes with little to no fear of jeopardizing humans or processes that humans rely on. One of the most significant examples of this phenomenon that concerns me is ML and Bias. It is a reality that ML models trained on data containing prejudiced decision-making, even if unintentional, can reproduce said bias in operation. Bias needs to be put front and center in the attempt to incorporate ML into engineering such that systemic racism isn’t propagated into future technological advances to then cause harm to disadvantaged populations. ML systems trained on data emanating from biased processes are doomed to repeat them, particularly if those training the ML solutions aren’t acutely aware of all forms of data present in the process to be automated.

Another critical challenge regarding ML in engineering is that the field is largely characterized by the need for problem-solving, which often requires creativity. As of now, few great cases exist of ML agents being truly “creative” and capable of “thinking out-of-the-box”, since current ML solutions tend to result merely from a search through all possible solutions. In my humble opinion, though a great many solutions can be found via these methods, ML will have somewhat of a ceiling in engineering until ML agents can consistently demonstrate creativity in a variety of problem spaces. That said, that ceiling is still pretty high, and there is much left to be accomplished in ML applications in engineering.

3. What’s a common mistake you see people make when trying to integrate ML into Engineering?

Using an overpowered ML technique on a small problem dataset is one common mistake I see people making in integrating ML into Engineering. Deep Learning, for example, is moving AI and ML to heights unimagined in such a short period, but it may not be one’s best method for solving a problem, depending on your problem space. Often more straightforward methods work just as well or better when working with small training datasets on limited hardware.

Also, not setting up an effective CI/CD (continuous integration/continuous deployment) structure for your ML solution is another mistake I see. Very often, a once-trained model won’t suffice, not only because data changes over time but also because resources and personnel do. Today’s ML practitioner needs to:

secure a consistent flow of data as it changes and continuously retrain new models to keep them accurate and useful,
ensure the structure is in place to allow for seamless replacement of older models by newly trained models, and
allow for minimal disruption to the consumer of the ML model outputs.
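
As an editorial illustration of the three points above (not a description of MasterPeace's tooling), here is a minimal Python sketch of a retrain-validate-swap loop. Every name in it (`ModelServer`, `fit_scale`, `retrain_and_maybe_swap`) is hypothetical, and the "model" is a toy one-parameter fit standing in for a real training job. The point is the shape of the process: retrain on fresh data, check the candidate against held-out data, and replace the live model only if it does better, so that callers of `predict` are never interrupted.

```python
import threading
from typing import Callable, List, Tuple

Model = Callable[[float], float]
Dataset = List[Tuple[float, float]]  # (input, target) pairs


class ModelServer:
    """Holds the live model and lets a newer one replace it in place."""

    def __init__(self, model: Model, version: int = 1) -> None:
        self._lock = threading.Lock()
        self._model = model
        self.version = version

    def predict(self, x: float) -> float:
        # Grab a reference under the lock, then predict outside it,
        # so a swap never blocks on a slow prediction.
        with self._lock:
            model = self._model
        return model(x)

    def swap(self, candidate: Model) -> None:
        with self._lock:
            self._model = candidate
            self.version += 1


def fit_scale(data: Dataset) -> Model:
    """Toy 'training' step: least-squares slope through the origin."""
    slope = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
    return lambda x: slope * x


def mean_abs_error(model: Model, data: Dataset) -> float:
    return sum(abs(model(x) - y) for x, y in data) / len(data)


def retrain_and_maybe_swap(server: ModelServer, fresh: Dataset, holdout: Dataset) -> bool:
    """Retrain on fresh data; promote the candidate only if it beats the live model."""
    candidate = fit_scale(fresh)
    if mean_abs_error(candidate, holdout) < mean_abs_error(server.predict, holdout):
        server.swap(candidate)
        return True
    return False
```

In a real pipeline the validation gate would typically involve more than one metric, and the swap would go through a model registry or a staged rollout rather than an in-process lock; the sketch only shows the control flow.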

4. What excites you most about the future of ML?

The future of ML continues to be exciting, and seemingly every month there are advances reported in the field that wow even the experts. As 1) ML techniques improve and become more accessible to established practitioners and novices alike, 2) everyday hardware becomes faster, 3) power consumption becomes less problematic for miniaturized edge devices, and 4) memory limitations diminish over time, the outlook for ML in Engineering will be bright for years to come.

Samuel Franklin

Background: Samuel is a senior Data Science and ML Engineering leader at Pluralsight with a Ph.D. in cognitive science. He leads talented teams of Data Scientists and ML Engineers building intelligent services that power Pluralsight’s Skills platform.

Outside the virtual office, Dr. Franklin teaches Data Science and Machine Learning seminars for Emory University. He also serves as Chairman of the Board of Directors for the Atlanta Humane Society.

Fun Fact: I live in a log cabin on top of a mountain in the Appalachian range.

Pluralsight: We are a technology workforce development company and our Skills platform is used by 70% of the Fortune 500 to help their employees build business-critical tech skills.

1. How has ML made a positive impact on Education?

Online, on-demand educational content has made lifelong learning more accessible than ever for billions of people globally. Decades of cognitive research show that the relevance, format, and sequence of educational content significantly impact students’ success. Advances in deep learning content search and recommendation algorithms have greatly improved our ability to create customized, efficient learning paths at scale that can adapt to individual students’ needs over time.

2. What are the biggest ML challenges within Education?

I see MLOps technology as a key opportunity area for improving ML across industries. The state of MLOps technology today reminds me of the Container Orchestration Wars circa 2015-16. There are competing visions for the ML Train-Deploy-Monitor stack, each evangelized by enthusiastic communities and supported by large organizations. If a predominant vision eventually emerges, then consensus on MLOps engineering patterns could follow, reducing the decision-making complexity that currently creates friction for ML teams.

3. What’s a common mistake you see people make trying to integrate ML into existing products?

There are two critical mistakes that I’ve seen organizations of all sizes make when getting started with ML. The first mistake is underestimating the importance of investing in senior leaders with substantial hands-on ML experience. ML strategy and operations leadership benefits from a depth of technical expertise beyond what is typically found in the BI / Analytics domain or provided by educational programs that offer a limited introduction to the field. The second mistake is waiting too long to design, test, and implement production deployment pipelines. Effective prototype models can languish in repos for months – even years – while waiting on ML pipeline development. This can impose significant opportunity costs on an organization and frustrate ML teams to the point of increasing attrition risk.

4. What excites you most about the future of ML?

I’m excited about the opportunity to mentor the next generation of ML leaders. My career began when cloud computing platforms were just getting started and ML tooling was much less mature than it is now. It was exciting to explore different engineering patterns for ML experimentation and deployment, since established best practices were rare. But that exploration included learning too many technical and people leadership lessons the hard way. Sharing those lessons with the next generation of ML leaders will help empower them to advance the field farther and faster than what we’ve seen over the past 10+ years.

Evan Castle

Background: Evan has over a decade of leadership experience at the intersection of data science, product, and strategy. He has worked in various industries, from building risk models at Fortune 100s like Capital One to launching ML products at Sisense and Elastic.

Fun Fact: Met Paul McCarthy. 🎤

1. How has ML made a positive impact on SaaS?

Machine learning has become truly operational in SaaS, powering uses that range from personalization, semantic and image search, and recommendations to anomaly detection and a ton of other business scenarios. The real impact is that ML comes baked right into more and more applications. It's becoming an expectation, and more often than not it's invisible to end users. For example, at Elastic we invested in ML for anomaly detection, optimized for endpoint security and SIEM. It delivers some heavy firepower out of the box with an amalgamation of different techniques like time series decomposition, clustering, correlation analysis, and Bayesian distribution modeling. The big benefit for security analysts is that threat detection is automated in many different ways, so anomalies related to temporal deviations, unusual geographic locations, statistical rarity, and many other factors are quickly bubbled up. That's the huge positive impact of integrating ML.

2. What are the biggest ML challenges within SaaS?

To maximize the benefits of ML there is a double challenge of delivering value both to users who are new to machine learning and to seasoned data scientists. There's obviously a huge difference in demands between these two groups. If an ML capability is a total black box, it's likely to be too rigid or simple to have a real impact. On the other hand, if you solely deliver a developer toolkit, it's only useful if you have a data science team in-house. Striking the right balance is about making sure ML is open enough for the data science team to have transparency and control over models, while also packing in battle-tested models that are easy to configure and deploy without being a pro.

3. What’s a common mistake you see people make trying to integrate ML into SaaS?

To get it right, any integrated model has to work at scale, which means support for massive data sets while ensuring results are still performant and accurate. Let's illustrate this with a real example. There has been a surge in interest in vector search. All sorts of things can be represented as vectors, from text and images to events. Vectors can be used to capture similarities between content and are great for things like search relevance and recommendations. The challenge is developing algorithms that can compare vectors while taking into account trade-offs in speed, complexity, and cost. At Elastic, we spent a lot of time evaluating and benchmarking the performance of models for vector search. We decided on an approach for the approximate nearest neighbor (ANN) algorithm called Hierarchical Navigable Small World graphs (HNSW), which basically maps vectors into a graph based on their similarity to each other. HNSW delivers an order of magnitude increase in speed and accuracy across a variety of ANN-benchmarks. This is just one example of the non-trivial decisions more and more product and engineering teams need to make to successfully integrate ML into their products.
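
To make the vector comparison idea concrete, here is a small editorial sketch in Python. It is not HNSW and not Elastic's implementation: it is exact, brute-force nearest-neighbor search using cosine similarity, which is the baseline that approximate methods such as HNSW trade a little accuracy against in order to avoid scanning every vector. The function names and the tiny three-dimensional `catalog` are made up for illustration; real embeddings have hundreds of dimensions.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def exact_top_k(
    query: List[float], index: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best.

    Cost grows linearly with the number of vectors, which is why
    large deployments turn to approximate indexes instead.
    """
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings: similar items point in similar directions.
catalog = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.3, 0.1],
    "coffee maker": [0.0, 0.2, 0.9],
}

# The two footwear items rank above the coffee maker for this query.
for doc_id, score in exact_top_k([1.0, 0.0, 0.0], catalog, k=2):
    print(f"{doc_id}: {score:.3f}")
```

Because every query touches every vector, this approach stops being practical as the index grows, which is the speed, complexity, and cost trade-off described above.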

4. What excites you most about the future of ML?

Machine learning will become as simple as ordering online. The big advances in NLP especially have made ML more human by understanding context, intent, and meaning. I think we are in an era of foundational models that will blossom in many interesting directions. At Elastic we are thrilled with our own integration with Hugging Face and excited to already see how our customers are leveraging NLP for observability, security, and search.

🤗 Thank you for joining us in this fourth installment of ML Director Insights.

Big thanks to Javier Mansilla, Shaun Gittens, Samuel Franklin, and Evan Castle for their brilliant insights and participation in this piece. We look forward to watching your continued success and will be cheering you on each step of the way. 🎉

If you're interested in accelerating your ML roadmap with Hugging Face Experts, please visit hf.co/support to learn more.
