MLOps has become a buzzword in the machine learning community. However, it’s often thrown around without a full grasp of its meaning.
Real-world machine learning systems have many components, and the model code itself is only a small part of them. To develop and maintain such complex systems effectively, teams adopted core DevOps principles. This led to the creation of Machine Learning Operations, or MLOps for short.
This article provides a thorough explanation of MLOps and why it matters for machine learning teams.
Machine learning operations (MLOps) is a new paradigm and set of practices that help organize, maintain and build machine learning systems. It aims to move machine learning models from design to production with agility and minimal cost, while also monitoring that models meet the expected goals.
The CD Foundation describes it as “the extension of the DevOps methodology to include machine learning and data science assets as first-class citizens within the DevOps ecology.”
Machine learning operations emphasize automation, reproducibility, traceability, and quality assurance of machine learning pipelines and models.
MLOps pipelines cover several stages of the machine learning lifecycle. However, you may choose to implement MLOps methodologies on only certain parts of the machine learning lifecycle:
Since MLOps was inspired by DevOps, the two share many similarities.
While DevOps focuses on software systems as a whole, MLOps places particular emphasis on machine learning models, which require specialized treatment and deep expertise because data and models play such a significant role in these systems.
To understand MLOps better, it helps to examine the core differences between MLOps and DevOps.
DevOps teams are populated mainly by software engineers and system administrators. In contrast, MLOps teams are more diverse—they must include:
Scoping entails the preparation for the project. Before starting, you must decide whether a given problem requires a machine learning solution—and if it does, what kind of machine learning models are suitable. Are datasets available, or do they need to be gathered? Are they representative of reality, or biased? What tradeoffs need to be respected (e.g., precision vs. inference speed)? Which deployment method fits best? A machine learning operations team needs to address these issues and plan the project’s roadmap accordingly.
Being able to reproduce models, results, and even bugs is essential in any software development project. Using code versioning tools is a must.
However, in machine learning and data science, versioning of datasets and models is also essential. You must ensure your datasets and models are tied to specific code versions.
Open-source data versioning tools such as DVC, or MLOps platforms, are crucial to any machine learning operations pipeline. In contrast, DevOps pipelines rarely need to deal with data or models.
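Tools like DVC handle this at scale, but the core idea can be sketched in a few lines of plain Python: pin each dataset and model artifact by content hash, and record which code commit they belong to. The `write_manifest` helper and its field names below are illustrative, not part of any particular tool.

```python
import hashlib
import json
from pathlib import Path

def file_sha256(path: str) -> str:
    """Content hash used to pin a dataset or model artifact."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(dataset: str, model: str, code_version: str,
                   out: str = "manifest.json") -> dict:
    """Record which dataset/model hashes belong to which code version."""
    manifest = {
        "code_version": code_version,           # e.g. a git commit SHA
        "dataset_sha256": file_sha256(dataset),
        "model_sha256": file_sha256(model),
    }
    Path(out).write_text(json.dumps(manifest, indent=2))
    return manifest
```

Committing such a manifest alongside the code is the minimal version of the guarantee the text describes: given a commit, you can always recover exactly which data and model it was paired with.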
The MLOps community adopted all the basic principles of unit and integration testing from DevOps.
However, the MLOps pipeline must also include tests for both model and data validation. Ensuring the training and serving data are in the correct state to be processed is essential. Moreover, model tests guarantee that deployed models meet the expected criteria for success.
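As a rough illustration of what such checks look like, here is a minimal sketch of data and model validation in plain Python; the schema format and threshold names are assumptions for the example, not a standard API.

```python
def validate_batch(rows, schema):
    """Reject a data batch that violates the expected schema or value ranges.

    `rows` is a list of dicts; `schema` maps column -> (type, (min, max) or None).
    Returns a list of human-readable problems (empty means the batch is valid).
    """
    problems = []
    for i, row in enumerate(rows):
        for col, (typ, bounds) in schema.items():
            if col not in row:
                problems.append(f"row {i}: missing column {col!r}")
                continue
            val = row[col]
            if not isinstance(val, typ):
                problems.append(f"row {i}: {col!r} has type "
                                f"{type(val).__name__}, expected {typ.__name__}")
            elif bounds is not None and not (bounds[0] <= val <= bounds[1]):
                problems.append(f"row {i}: {col!r}={val} outside {bounds}")
    return problems

def model_meets_criteria(metrics, thresholds):
    """Gate deployment: every tracked metric must clear its minimum threshold."""
    return all(metrics.get(name, float("-inf")) >= minimum
               for name, minimum in thresholds.items())
```

In a real pipeline, checks like these run automatically before training (data validation) and before deployment (model validation), and a non-empty problem list or a failed gate stops the pipeline.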
The example above shows an automated ML workflow that compares the performance of two versions of an object detection model. If the results diverge, a data scientist can review them to gather insights.
Deploying offline-trained models as a prediction service is rarely suitable for most ML products. Multi-step ML pipelines responsible for retraining and deployment must be set up instead. This complexity requires automation of tasks previously performed manually by data scientists.
Model evaluation needs to be a continuous process. Not unlike food and other products, machine learning models have expiration dates. Seasonality and data drifts can degrade the performance of a live model. Ensuring production models are up-to-date and on par with the anticipated performance is crucial.
MLOps pipelines must include automated processes that frequently evaluate models and trigger retraining when necessary. This is an essential step in implementing machine learning feedback loops. For example, in computer vision tasks, mean average precision (mAP) can serve as one of the key metrics.
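To make this concrete, here is a minimal sketch of the two pieces involved: computing average precision for a single class from ranked detections (mAP is the mean of these per-class values), and a retraining trigger that fires when the monitored metric stays below a threshold for several consecutive evaluations. The `patience` heuristic is an assumption for illustration, not a prescribed policy.

```python
def average_precision(detections, n_positives):
    """AP for one class: `detections` are (confidence, is_true_positive)
    pairs; `n_positives` is the number of ground-truth objects."""
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp, ap = 0, 0.0
    for k, (_, hit) in enumerate(ranked, start=1):
        if hit:
            tp += 1
            ap += tp / k            # precision at this recall point
    return ap / n_positives if n_positives else 0.0

def should_retrain(metric_history, threshold, patience=3):
    """Trigger retraining when the monitored metric (e.g. mAP) stays below
    the acceptance threshold for `patience` consecutive evaluations."""
    recent = metric_history[-patience:]
    return len(recent) == patience and all(m < threshold for m in recent)
```

A scheduled job would append each evaluation's mAP to the history and kick off the retraining pipeline whenever `should_retrain` returns true.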
Implementing MLOps benefits your organization's machine learning system in the following ways:
Since the field is relatively young and best practices are still being developed, organizations face many challenges in implementing MLOps. Let’s go through the three main areas of challenge.
The current state of ML culture is model-driven. Research revolves around devising intricate models and topping benchmark datasets, while education focuses on mathematics and model training. However, the ML community should devote some of its attention to training on up-to-date open-source production technologies.
Adopting a product-oriented culture in industrial ML is still an ongoing process that meets resistance, which can make seamless adoption across an organization difficult.
Moreover, the multi-disciplinary nature of MLOps teams creates friction. Highly specialized terminology across different IT fields and differing levels of knowledge make communication inside hybrid teams difficult. Additionally, forming hybrid teams consisting of data scientists, MLEs, DevOps engineers, and SWEs is very costly and time-consuming.
Most machine learning models are served in the cloud and queried by users on demand. Demand may be high during certain periods and fall drastically during others.
Dealing with a fluctuating demand in the most cost-efficient way is an ongoing challenge. Architecture and system designers also have to deal with developing infrastructure solutions that offer flexibility and the potential for fast scaling.
Machine learning operations lifecycles generate many artifacts, metadata, and logs. Managing all these artifacts with efficiency and structure is a difficult task.
The reproducibility of operations is still an ongoing challenge. Better practices and tools are being continuously invented.
Let’s go through a few of the MLOps best practices, sorted by the stages of the pipeline.
- Data gathering and validation
- Exploratory data analysis (EDA)
- Data prep and feature engineering
- Model training and tuning
- Model review and governance
- Model deployment and monitoring
- Model inference and serving
- Automated model retraining
According to Google, there are three levels of MLOps maturity, depending on the automation scale for each step of the pipeline. Let’s go through each of them.
Level 0 includes setting up a basic machine learning pipeline.
This is the starting point for most practitioners. The workflow is fully manual. If scripts are used, they are executed by hand and require ad-hoc changes for different experiments. It’s the basic level of maturity and the bare minimum to start building an ML product.
This manual pipeline takes care of EDA, data preparation, model training, evaluation, fine-tuning, and deployment. Deployment is usually in the form of a simple prediction service API. Logging and model/experiment tracking are either absent or implemented inefficiently, such as storing results in .csv files.
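A level 0 workflow can be caricatured in a single script that trains, evaluates, saves the model, and appends metrics to a .csv file. The "model" here is a deliberately trivial threshold classifier so the sketch stays self-contained; everything about it is illustrative, not a recommended design.

```python
import csv
import json
import statistics
from pathlib import Path

def train(samples):
    """Toy 'model': classify as positive when the feature exceeds the
    midpoint between the two class means (stand-in for real training)."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    return {"threshold": (statistics.mean(pos) + statistics.mean(neg)) / 2}

def evaluate(model, samples):
    """Accuracy of the threshold rule on a labeled test set."""
    correct = sum((x > model["threshold"]) == bool(y) for x, y in samples)
    return correct / len(samples)

def run_pipeline(train_set, test_set, workdir="run0"):
    """One manual end-to-end pass: train, evaluate, save the model,
    and append metrics to a .csv file (the inefficient 'tracking'
    typical of level 0)."""
    Path(workdir).mkdir(exist_ok=True)
    model = train(train_set)
    acc = evaluate(model, test_set)
    Path(workdir, "model.json").write_text(json.dumps(model))
    with open(Path(workdir, "experiments.csv"), "a", newline="") as f:
        csv.writer(f).writerow(["accuracy", acc])
    return acc
```

Every step here is triggered by a human rerunning the script, which is exactly what levels 1 and 2 set out to automate.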
Researchers and organizations just starting out often use machine learning as only a small part of their product or service. This pipeline may work if models rarely need to be updated.
Basic deployment: only the model is deployed as a prediction service (such as a REST API).
Any organization that wishes to scale up its machine learning services or requires frequent model updates must implement MLOps at level 1.
This level automates the process of training ML models through continuous training (CT) pipelines. Orchestrated experiments take care of the training, and feedback loops assure high model quality.
New data is automatically processed and prepared for training. Production models are monitored, and retraining pipelines are triggered when performance drops are detected. Manual interventions inside the orchestrated experiments are minimal.
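One simple (and deliberately naive) way to detect a performance-relevant data shift is to compare the serving-time mean of a feature against its training-time distribution. Real systems use richer statistics, but the trigger logic looks similar to this sketch:

```python
import statistics

def feature_drifted(train_values, serving_values, z_threshold=3.0):
    """Flag drift when the serving-time mean deviates from the
    training-time mean by more than `z_threshold` training standard
    deviations. A naive stand-in for proper drift detection."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    if sigma == 0:
        return statistics.mean(serving_values) != mu
    z = abs(statistics.mean(serving_values) - mu) / sigma
    return z > z_threshold
```

A monitoring job would run a check like this per feature on each batch of serving data and kick off the retraining pipeline when drift is flagged.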
Organizations that operate in fast-changing environments, such as trading or media, must update their models constantly (on a daily or even hourly basis). Moreover, their data is often characterized by seasonality, so all trends must be taken into account to ensure high-quality production models.
To achieve level 1 of MLOps, you need to set up:
Complementing continuous training with CI/CD allows data scientists to rapidly experiment with feature engineering, new model architectures, and hyperparameters. The CI/CD pipeline will automatically build, test, and deploy the new pipeline components.
The whole system is very robust, version controlled, reproducible, and easier to scale up.
Level 2 suits any organization whose core product is ML and that requires constant innovation. It allows for rapid experimentation on every part of the ML pipeline while remaining robust and reproducible.
MLOps level 2 builds upon the components introduced in level 1, with the following new core components added:
CI. Continuous integration is responsible for automatically building source code, running tests, and packaging after new code is committed. The result is ML pipeline components ready for deployment. Additionally, it can include tests such as:
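For example, a CI step might run unit tests like the following on every commit; the `normalize` component being tested here is hypothetical:

```python
# test_preprocessing.py — the kind of unit test a CI step runs on every
# commit. The `normalize` component under test is a hypothetical example.

def normalize(values):
    """Scale values to [0, 1]; a typical small pipeline component."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def test_normalize_range():
    out = normalize([2.0, 4.0, 6.0])
    assert min(out) == 0.0 and max(out) == 1.0

def test_normalize_constant_input():
    # Degenerate input must not divide by zero.
    assert normalize([5.0, 5.0]) == [0.0, 0.0]
```

A test runner such as pytest would collect and execute these automatically as part of the CI build, failing the pipeline before a broken component reaches deployment.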
CD. The role of continuous deployment is two-fold. It consists of Pipeline continuous delivery (fig. 5 #3) and Model continuous delivery (fig. 5 #5). The former deploys the whole pipeline with the new model implemented. The latter serves the model as a prediction service.
Implementing MLOps pipelines in your organization allows you to cope with rapid changes in your data and business environment. It fosters innovation and ensures a high-quality ML product. Both small-scale and large-scale organizations should be motivated to set up MLOps pipelines.
Implementing MLOps pipelines and reaching high MLOps maturity levels is a gradual process. MLOps pipelines can be built using open-source tools, but since the cost and time investment are high, exploring platform MLOps solutions is usually a good idea.
Ready to start? Build robust ML pipelines and deploy reliable AI faster with V7.
“Collecting user feedback and using human-in-the-loop methods for quality control are crucial for improving AI models over time and ensuring their reliability and safety. Capturing data on the inputs, outputs, user actions, and corrections can help filter and refine the dataset for fine-tuning and developing secure ML solutions.”