An advanced MLOps lifecycle
Machine learning is exploratory and data-driven [1]. It’s about extracting patterns from data and reapplying those patterns to new data. If all goes well, you’ll get good predictions. The exploratory part is finding the right patterns for the data you intend to make predictions on.
When the data is not well structured or predictable, the MLOps lifecycle can look very different from mainstream DevOps. That is where a series of MLOps-specific methods emerges (Figure 1).
Figure 1: Image source: Seldon/CC BY-SA 4.0 License
Let’s look at each stage in turn, examining the approaches that work and the MLOps needs that motivate them.
Training
Training is about finding the best patterns to extract from training data. We encapsulate these patterns in models. The parameters of the training run can be adjusted to produce different models. Encapsulating the best patterns in the model is exploratory (Figure 2).
Figure 2: Image source: Seldon/CC BY-SA 4.0 License
To explore the parameter space and find the optimal model, it makes sense to run multiple training jobs in parallel. This is done in a hosted training environment running a dedicated training platform. You then need to select and package the best model for deployment.
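To make this concrete, here is a minimal sketch of exploring training parameters in parallel, assuming scikit-learn and joblib are available; the dataset and parameter grid are illustrative placeholders, and a hosted platform would distribute the same idea across a cluster while tracking each run.

```python
# Minimal sketch: train several candidate models in parallel and keep the best.
# The data and parameter grid are illustrative, not a recommendation.
from joblib import Parallel, delayed
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)

param_grid = [
    {"n_estimators": n, "max_depth": d}
    for n in (50, 100, 200)
    for d in (3, 5, None)
]

def train_and_score(params):
    """Train one candidate model and return its mean cross-validated accuracy."""
    model = RandomForestClassifier(random_state=0, **params)
    score = cross_val_score(model, X, y, cv=3).mean()
    return params, score

# Run the candidate training jobs in parallel, then select the best model.
results = Parallel(n_jobs=-1)(delayed(train_and_score)(p) for p in param_grid)
best_params, best_score = max(results, key=lambda r: r[1])
print(f"best params: {best_params}, accuracy: {best_score:.3f}")
```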
Data is a big reason these platforms are hosted rather than run on data scientists’ laptops. The amount of data can be large, and it is rarely ready to be used for training from the start. That means a lot of preparation, which can take significant time and hardware resources. Preparation operations may also need to be tracked for administrative and repeatability reasons.
Deployment
When we choose a new model, we need to work out how to put it into service. This means determining whether it really is better than the version already running. It may perform better on training data, but the real-time data may be different (Figure 3).
Figure 3: Image source: Seldon/CC BY-SA 4.0 License
Rollout strategies in MLOps tend to be cautious. Traffic may be split between the new and old models and monitored for a period of time (A/B testing or a canary release). Or the traffic can be replicated so that the new model receives every request, but its response is only tracked rather than used (shadow deployment). The new model is then fully rolled out only if it proves to perform well.
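The routing logic behind these strategies can be sketched in a few lines. The snippet below is an illustrative stand-in, not Seldon’s implementation: the model objects and the `record_for_comparison` helper are hypothetical placeholders.

```python
# Illustrative sketch of canary and shadow routing decisions; the model
# objects are hypothetical stand-ins with a .predict(request) method.
import random

def record_for_comparison(request, live, shadowed):
    # Placeholder: in practice this would go to a metrics/logging system.
    print(f"live={live} shadow={shadowed}")

def route(request, old_model, new_model, canary_fraction=0.05, shadow=False):
    """Serve a request under a cautious rollout strategy.

    - Canary: a small fraction of traffic is answered by the new model.
    - Shadow: the new model sees every request, but its response is only
      recorded for comparison, never returned to the caller.
    """
    if shadow:
        live = old_model.predict(request)
        shadowed = new_model.predict(request)
        record_for_comparison(request, live, shadowed)  # log only
        return live
    if random.random() < canary_fraction:
        return new_model.predict(request)
    return old_model.predict(request)
```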
Before rolling a model out more widely, we need to know that it is behaving safely. This means that deployment requires monitoring support. Deployment may also need to support a feedback mechanism so that monitoring can be as informative as possible. Sometimes a model makes a prediction that later turns out to be right or wrong, for example, whether a customer accepts a proposal. To take advantage of this, we need a feedback mechanism.
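A feedback mechanism can be as simple as storing each prediction under a request id and matching it to the outcome once that outcome becomes known. The sketch below is illustrative only; the function names are hypothetical.

```python
# Minimal sketch of a feedback mechanism: predictions are stored under a
# request id, and when the true outcome arrives later it is matched back
# so live accuracy can be tracked over a rolling window.
from collections import deque

predictions = {}                      # request_id -> predicted label
recent_outcomes = deque(maxlen=1000)  # rolling window of correct/incorrect

def record_prediction(request_id, predicted_label):
    predictions[request_id] = predicted_label

def record_feedback(request_id, actual_label):
    """Called when the real outcome becomes known, e.g. the customer
    accepted or rejected the proposal."""
    predicted = predictions.pop(request_id, None)
    if predicted is not None:
        recent_outcomes.append(predicted == actual_label)

def live_accuracy():
    """Accuracy over the recent window, or None if no feedback yet."""
    return sum(recent_outcomes) / len(recent_outcomes) if recent_outcomes else None
```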
An advanced example of splitting traffic for optimization is the use of multi-armed bandits. With a multi-armed bandit, the traffic split is adjusted continuously. The best-performing model gets most of the traffic, while the other models continue to receive a small amount. This is handled by an algorithmic router in the inference graph. If the data changes later, a model that previously performed poorly may become the dominant one.
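As a sketch of the idea, an epsilon-greedy bandit keeps most traffic on the best-observed model while still exploring the others; the reward signal is assumed to come from the feedback mechanism described above. Production routers often use more sophisticated strategies (e.g. Thompson sampling), but the shape is the same.

```python
# Sketch of an epsilon-greedy multi-armed bandit router. Names are illustrative.
import random

class BanditRouter:
    def __init__(self, models, epsilon=0.1):
        self.models = models                              # name -> model
        self.epsilon = epsilon
        self.counts = {name: 0 for name in models}        # requests per model
        self.rewards = {name: 0.0 for name in models}     # accumulated reward

    def choose(self):
        """Pick a model name: explore with probability epsilon, else exploit."""
        if random.random() < self.epsilon or not any(self.counts.values()):
            return random.choice(list(self.models))
        return max(self.models, key=lambda n: self.rewards[n] / max(self.counts[n], 1))

    def update(self, name, reward):
        """Feed back the observed reward for a routed request."""
        self.counts[name] += 1
        self.rewards[name] += reward

# Usage sketch:
# router = BanditRouter({"model_a": model_a, "model_b": model_b})
# name = router.choose(); prediction = router.models[name].predict(request)
# ...later, once feedback arrives:
# router.update(name, reward=1.0 if prediction_was_correct else 0.0)
```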
Deployment can be closely tied to monitoring. As a result, deployment tools such as Seldon not only support the capabilities of the deployment phase, but also integrate the MLOps requirements of the monitoring phase.
Monitoring
Monitoring the accuracy of the model is only possible if you have feedback. This is a good example of monitoring functionality requiring deploy-phase functionality. In some cases, real-time accuracy may be the key metric; in others, custom business metrics may be more important. But that’s only part of the picture.
Figure 4: Image source: Seldon/CC BY-SA 4.0 License
The other side of ML monitoring is seeing why a model is performing well or badly. This requires insight into the data.
One of the primary reasons model performance can degrade is changes in real-time data. If the data distribution deviates from the training data, performance degrades. This is called data drift or concept drift.
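A simple way to watch for drift on a single feature, assuming SciPy is available, is a two-sample Kolmogorov-Smirnov test comparing the training distribution against a window of recent live data; production drift detectors typically work across many features at once, but the idea is the same.

```python
# Minimal sketch of data drift detection on a single feature using a
# two-sample Kolmogorov-Smirnov test. The data here is synthetic.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)
live_feature = rng.normal(loc=0.5, scale=1.0, size=500)   # shifted: drift

statistic, p_value = ks_2samp(training_feature, live_feature)
if p_value < 0.01:
    print(f"drift suspected (KS statistic={statistic:.3f}, p={p_value:.4f})")
```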
Even if the overall distribution is consistent with the training data, some predictions can still be wildly wrong. This can happen if some individual data points are out of range. These outliers can be damaging in cases where forecasts need to be fully reliable.
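As an illustration, even a crude check, flagging any input that lies more than a few standard deviations from the training mean, captures the idea of spotting out-of-range points; dedicated outlier detectors are considerably more robust.

```python
# Minimal sketch of flagging out-of-range inputs against the training
# distribution. Thresholds and data are illustrative.
import numpy as np

rng = np.random.default_rng(1)
training_data = rng.normal(size=(5_000, 4))
feature_mean = training_data.mean(axis=0)
feature_std = training_data.std(axis=0)

def is_outlier(x, threshold=3.0):
    """Return True if any feature of x is more than `threshold`
    standard deviations away from the training mean."""
    z_scores = np.abs((x - feature_mean) / feature_std)
    return bool(np.any(z_scores > threshold))

print(is_outlier(np.array([0.1, -0.2, 0.3, 0.0])))   # False: typical point
print(is_outlier(np.array([8.0, 0.0, 0.0, 0.0])))    # True: out of range
```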
Fully understanding why a model makes a certain prediction may require studying how the model makes predictions, not just the input data. Explanation techniques can reveal the key patterns on which a model’s predictions depend and tell us which ones apply to a particular case. Achieving this level of insight is itself a data science challenge.
There are different ways to implement advanced monitoring. At Seldon, we make heavy use of asynchronously logged requests. Logged requests can be fed into detector components to detect drift or outliers. Requests can also be stored for later analysis, such as explanation.
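The queue-based sketch below only illustrates the shape of this pattern, not Seldon’s implementation: the serving path enqueues each request without blocking on analysis, and a background worker feeds the log to the detectors and to storage. The detector callables here are no-op placeholders.

```python
# Illustrative sketch of asynchronous request logging feeding detectors.
import queue
import threading

request_log = queue.Queue()

def serve(request, model):
    """Hot path: predict and enqueue the request without blocking on analysis."""
    prediction = model(request)
    request_log.put((request, prediction))
    return prediction

def detector_worker(drift_detector, outlier_detector, store):
    """Background path: consume logged requests and run the detectors."""
    while True:
        request, prediction = request_log.get()
        drift_detector(request)
        outlier_detector(request)
        store.append((request, prediction))   # keep for later analysis
        request_log.task_done()

store = []
worker = threading.Thread(
    target=detector_worker,
    args=(lambda r: None, lambda r: None, store),  # no-op detectors for the sketch
    daemon=True,
)
worker.start()
```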
Understanding the MLOps lifecycle
There is a lot about working on an MLOps project that we haven’t covered here. We haven’t talked about estimates, schedules, or team composition. We haven’t even looked at the tools in detail [2]. Hopefully, what we have achieved is an understanding of the key motivations.
We have learned to understand the lifecycle of MLOps in terms of a set of requirements. As we have seen, ML is about extracting patterns from data and reapplying those patterns. Data can be unpredictable, which may mean we have to work carefully, and we have to monitor at the level of data, not just errors.
Links and literature
[1] hackernoon.com/why-is-devo…
[2] github.com/EthicalML/a…