Accountability, Core Machine Learning, and Machine Learning Operations

Anyone following the technology debate through academia, industry, conferences, and the media has noticed that Artificial Intelligence (AI) and its sub-fields are the hottest topics at the moment.

Some companies whose core business runs on digital systems and platforms (and some non-digital ones as well) have understood that Machine Learning (ML) has great potential, both for optimizing the way the company works and for generating revenue directly.

This can be seen in numerous businesses, ranging from banking, through recommendation systems for entertainment, all the way to medical applications.

This post will attempt to briefly describe how ML is shaping many businesses, provide a short reflection on the march of accountability [1], and finally offer some brief considerations regarding Core Machine Learning teams and the Machine Learning Operations (MLOps) approach.

How is Machine Learning shaping some industries and what is the degree of responsibility for engineering teams?

With the adoption of machine learning by industry, a natural movement began where both machine learning and industry are shaping each other.

If, on one hand, industry benefits from Machine Learning platforms to obtain predictions, classifications, inferences, and decision-making at scale with near-zero marginal cost, Machine Learning, on the other hand, benefits from industry through access to research and development resources unimaginable in academia, access to data and computing whose cost would be unfeasible for academic studies, and an increase in the engineering maturity of its methods.

However, what we are ultimately talking about here is the scale at which decisions are made in industry, and how R&D in Machine Learning is advancing at a very high speed.

That said, we can state that today these systems are no longer in the harmless arena of ideas and proofs of concept; rather, they are active elements in interactions between people and businesses on a massive scale.

And due to this scale, a series of new questions that were not as concerning or were hidden in the past now take on greater importance, such as:

As we can see in these examples, problems such as structural human biases, lack of diversity, the structural promotion of injustice, and abuse of authority can be mitigated in ML systems using tools such as fairness, transparency, accountability, and explainability.

And given the points raised above, it goes without saying how important it is, and how much responsibility each ML professional carries, to ensure that an automated decision does not embed and/or amplify these systematic biases.

One of the greatest truths in technology is that computer systems, most of the time, work to amplify behaviors and skills. An ML system that does not take structural biases into account is fated not only to perpetuate but also to amplify those same biases at scale.

And given the enormous authority engineering has in relation to the implementation of these systems, accountability will automatically come with the same intensity as the degree of impact of these solutions.

Accountability will come voluntarily and/or coercively

Given all the scenarios where ML platforms have a direct impact on industry, and all the potential risks and impacts on society, there is a regulatory march coming from numerous fronts that will place much greater accountability on companies and ML engineers.

This accountability will essentially be related to sensitive aspects that concern society as a whole: ethics, fairness, diversity, privacy, security, the right to explanation of algorithmic decisions (for those under the GDPR), besides, of course, specific ML aspects (e.g. reproducibility, model evaluation, etc.).

This, more than ever, places great pressure on all of us, engineers, data scientists, product managers, CTOs, CEOs, and other stakeholders, not only to do our jobs but also to pay attention to all these aspects.

If this scenario sounds distant or detached from reality, I invite the most skeptical to honestly answer the following questions about their current employer:

I could list numerous other cases that are already happening today, but I believe I have made my point. For those who want to know more, I recommend Cathy O'Neil's book Weapons of Math Destruction, which describes some of these scenarios, or the talk based on the book, "The era of blind faith in big data must end".

Furthermore, if this accountability does not come via the market, it will necessarily come through the coercive route of state regulation; the latter is, at this moment, being developed by numerous governments worldwide to assign accountability to both companies and individuals.

This can be seen in numerous observatories and think tanks such as the AI4EU Observatory, in OECD recommendations regarding Artificial Intelligence, and in recent guidelines released by national AI strategies in countries like Estonia, Finland, Germany, China, the United States, and France, as well as by the European Commission itself, which has clearly stated that it will heavily regulate AI from the perspective of risk and transparency.

This means, ultimately, that errors in a system that interacts directly with human beings will trigger a chain of consequences entirely distinct from what we have today.

Given this extremely complex scenario, we can deduce that if the era of the "analyst-with-a-script-on-their-own-machine" has not ended yet, it will end much faster than we can imagine; whether through professionalism and awareness, or through coercion, threats, and/or financial losses.

And do not be fooled by those who say that you are just "a person who must follow orders" and that nothing will happen. The moment your company has any kind of civil, criminal, or public-relations problem, you will be co-responsible. There is already precedent of an engineer going to jail because of bad practices within their craft. Here it is not a question of "if", but of "when" this will reach software engineering in ML.

The message I want to leave here is not one of despair or even inducing situations of corporate confrontation. What I want to leave as a final message is just that we must have situational awareness of this march of accountability/responsibility and why this will be inevitable.

In other words: Critical thinking is an intrinsic part of the job, you are responsible for what you do, and the value of this is already embedded in your salary.

Core Machine Learning

The first time I came into contact with the Core Machine Learning approach was in mid-2015 at the Strata Data Conference, and then in 2016 through some talks by Hussein Mehanna. However, it was only in 2017, at Facebook @Scale, after talking with people from the industry, that I could understand a bit better what this approach was.

Not that there is a formal definition, but basically a Core Machine Learning team would be responsible for developing Machine Learning platforms within the Core Business of organizations; whether embedding algorithms in existing platforms or delivering inference/prediction services via APIs.

Part of this team's mission would be to deal directly with all machine learning initiatives within the company's main activity, ranging from applied research and the adoption of software engineering practices in ML to building the infrastructure behind these applications.

Thinking about the new economy that is here to stay, in my view, we are in the middle of a transition of product development paradigms.

On one side we have a paradigm that focuses on building static applications that are concerned with business flows. On the other side we have a paradigm that inherits the same characteristics but uses data to leverage these applications.

Obviously there is much hype and much solutionism around ML, but I am talking here about companies that manage to apply ML opportunistically and pragmatically to build these applications.

In other words: the algorithm on the platform becomes the product itself.

Let’s see some examples of platforms where the algorithm is the product:

These are some of the most famous public examples of machine learning in companies' core business, both in Brazil and elsewhere.

A very interesting way to understand how some algorithms helped in leveraging products can be seen in the paper Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective by Facebook:

Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective

Of course, we know that internally things are not quite so simple, but we can get an idea of how vital aspects for the Facebook product depend essentially on the implementation of Core Machine Learning.

It may seem the same at first, but the main difference between the duties of a Data Science team and a Core Machine Learning team is that while the former generally focuses on analysis and modeling; the latter places all of this in a way that is scalable and automated within the main core of the business.

Given that Core Machine Learning would ideally be a team/approach that leverages the core business through the application of ML, I will now talk a bit about how all of this is operationalized.

MLOps - Machine Learning Operations

In Software Engineering there is a very high degree of maturity in the way applications are built and their tools, ranging from excellent IDEs, passing through frameworks that handle inversion of control and dependency injection well, mature and battle-tested development methodologies, deployment tools that greatly simplify the CI/CD process, and observability tools that facilitate application monitoring.

In contrast, machine learning shows an abyss in maturity regarding the adoption of these practices, compounded by the fact that machine learning engineers work, most of the time, with data artifacts such as models and datasets.

Some of these artifacts (not exhaustive) are:

  • Data Science Analyses;
  • Data extraction pipelines and feature generation via Data Engineering;
  • Versioning of data that generate the models;
  • Tracking of model training;
  • Tracking of hyperparameters used in experiments;
  • Versioning of Machine Learning models;
  • Serialization and promotion of models to production;
  • Maintenance of privacy of data and the model;
  • Training of models considering security countermeasures (e.g. adversarial attacks);
  • Monitoring of model performance, given the intrinsic performance degradation of these artifacts (data/model drift).
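Several of these artifacts are usually handled by dedicated tooling (experiment trackers, model registries), but the core idea behind tracking and versioning can be sketched in a few lines of plain Python. A minimal sketch, assuming a hypothetical local JSON store (`runs.json` and its field names are illustrative, not a real tool's schema): each training run records a hash of its input data, its hyperparameters, its metrics, and the model version, so any model in production can be traced back to exactly what produced it.

```python
import hashlib
import json
import time
from pathlib import Path

RUN_STORE = Path("runs.json")  # hypothetical local store; real setups use a tracking server


def fingerprint_dataset(path):
    """Hash the raw training data so a run can be traced back to its exact inputs."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def log_run(dataset_path, params, metrics, model_version):
    """Append one training run (data hash + hyperparameters + metrics) to the store."""
    run = {
        "timestamp": time.time(),
        "data_sha256": fingerprint_dataset(dataset_path),
        "params": params,
        "metrics": metrics,
        "model_version": model_version,
    }
    history = json.loads(RUN_STORE.read_text()) if RUN_STORE.exists() else []
    history.append(run)
    RUN_STORE.write_text(json.dumps(history, indent=2))
    return run
```

The point is not the storage mechanism but the contract: no model reaches production without a recorded lineage of data, parameters, and metrics.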

One of the consequences of so many distinctions in terms of processes between these areas is that the operationalization of these resources must also be done in a totally distinct way.

In other words, maybe DevOps might not be enough in these cases.

A figure that well summarizes this point is from Luke Marsden’s talk called “MLOps Lifecycle Description” where he places the difference between these two areas as follows:

MLOps Lifecycle Description

The idea behind it is that while software engineering traditionally deals with functionality and has code as the materialization of flows, the MLOps approach keeps those same concerns but adds many moving parts, such as data, models, and metrics, and the operationalization of all of these aspects [2].

That is, the operationalization of this development and deployment flow requires a new way of delivering these solutions in an end-to-end manner.

A proposal for what an end-to-end application would look like, considering these operational aspects of ML, is presented from a continuous delivery perspective in the article "Continuous Delivery for Machine Learning: Automating the end-to-end lifecycle of Machine Learning applications":

Continuous Delivery for Machine Learning (CD4ML) is a software engineering approach in which a cross-functional team produces machine learning applications based on code, data, and models in small and safe increments that can be reproduced and reliably released at any time, in short adaptation cycles.

Continuous Delivery for Machine Learning: Automating the end-to-end lifecycle of Machine Learning applications

In the same article, there is also a figure of what an end-to-end flow of a machine learning platform would look like:

Continuous Delivery for Machine Learning: Automating the end-to-end lifecycle of Machine Learning applications
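One concrete piece of that flow is the promotion step: in "small and safe increments", a new model should only replace the production one after passing an automated quality gate. A minimal sketch of such a gate (the function names and thresholds are my own illustration, not something defined in the CD4ML article), assuming an offline evaluation metric where higher is better:

```python
def should_promote(candidate_metric, production_metric,
                   min_absolute=0.70, min_gain=0.01):
    """Quality gate run by the CD pipeline before a model is promoted.

    The candidate must clear an absolute quality floor AND improve on the
    current production model by a minimum margin (to avoid noisy swaps).
    """
    if candidate_metric < min_absolute:
        return False
    return candidate_metric - production_metric >= min_gain


# In a pipeline, a failed gate stops the release just like a failed test would:
# if not should_promote(new_auc, prod_auc): raise SystemExit("model rejected")
```

The design choice here mirrors ordinary CI: the model is treated as a releasable artifact with its own acceptance criteria, instead of being copied to production by hand.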

And with these new layers of complexity, combined with the limited software engineering training of a large portion of data scientists, it becomes clear that the spectrum of potential problems in delivering machine learning applications grows much larger.

However, so far we have discussed very high-level aspects such as the impact of ML systems, aspects linked to accountability, Core Machine Learning and its responsibilities, and the MLOps approach.

But I want to deepen the level a bit more and enter into some more specific points where MLOps has a more direct action; that is, shed some light on the dark trail where SysOps, DevOps, Software Engineering, and Data Science generally would not enter.

Source: Christian Collins - shades of mirkwood

Complexity in Machine Learning Systems

In the classic paper Hidden Technical Debt in Machine Learning Systems, there is an image that crystallizes well what a machine learning system really is in relation to complexity and effort for each component of this system:

Hidden Technical Debt in Machine Learning Systems

Even without a direct mention of MLOps, the article makes some considerations about problems specific to machine learning systems that result in technical debt and leave these applications more fragile in terms of scalability and maintenance.

I decided to take some of the seven points from the article and give some practical examples. The idea is to show an MLOps approach in some scenarios (hypothetical or not) as we can see below:

The erosion of boundaries due to complex models
Data dependencies cost more than code dependencies
Feedback Loops
  • The product team asked to run an experimentation strategy with Multi-Armed Bandits over n models. How is data from losing strategies being isolated (given that strategies affect present data and future training)? Is there any log signature that identifies these records? [3]
  • A recommender system returns a list of items ordered by predicted relevance to the user. However, the model's nDCG is very low. How long would it take you to discover that the front-end, instead of respecting the ranking received from the recommender system, is re-sorting the list alphabetically? What would a test or feedback loop between the recommender system and the front-end look like in this case? [3]
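The second scenario can be guarded by a contract test between the recommender and the front-end: whatever the UI renders must preserve the order the model emitted. A minimal sketch, assuming the front-end can report the list of item IDs it actually displayed (the function and variable names here are hypothetical):

```python
def assert_ranking_preserved(recommended_ids, displayed_ids):
    """Fail fast if the front-end re-sorted or shuffled the model's ranking.

    Catches exactly the bug described above: a UI re-sorting the list
    alphabetically silently destroys the relevance ordering (and the
    nDCG with it) without any error ever being raised.
    """
    shown = [i for i in displayed_ids if i in set(recommended_ids)]
    expected = [i for i in recommended_ids if i in set(displayed_ids)]
    if shown != expected:
        raise AssertionError(
            f"front-end order {shown} diverges from model order {expected}"
        )
```

Run as part of an end-to-end test (or sampled in production), this turns a silent metric degradation into an explicit, attributable failure.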
Anti-Patterns in Machine Learning Systems
Configuration Debt
  • Each of your ML microservices has its own local logging configuration, and sending these logs to an ELK stack would mean rewriting scripts and redeploying every one of these services.
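One way out of this debt is a single logging configuration owned by a shared library that every service imports, so a change such as pointing logs at ELK happens in one place. A minimal sketch using the standard library's `logging.config.dictConfig` (the JSON-ish line format is an illustrative choice; a real setup would ship these lines via Filebeat or a dedicated handler to Logstash):

```python
import logging
import logging.config

# One config dict, owned by a shared library, imported by every ML microservice.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        # JSON-like line format so Logstash/Elasticsearch can parse fields.
        "json": {
            "format": '{"ts":"%(asctime)s","svc":"%(name)s",'
                      '"lvl":"%(levelname)s","msg":"%(message)s"}'
        },
    },
    "handlers": {
        "stdout": {"class": "logging.StreamHandler", "formatter": "json"},
    },
    "root": {"level": "INFO", "handlers": ["stdout"]},
}


def setup_logging(service_name):
    """Called once at service start-up; swapping the sink changes all services."""
    logging.config.dictConfig(LOGGING_CONFIG)
    return logging.getLogger(service_name)
```

With the configuration centralized, redirecting every service's logs becomes one change in one library instead of a redeploy of the whole fleet.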
Dealing with changes in the external world
  • An ML system that does anomaly detection/classification, used for alarms and revenue monitoring, is receiving an increasing volume of requests. After some time, the system starts firing numerous revenue-drop alerts, paging developers to investigate. Eventually you discover the reason: the marketing team ran a one-off campaign that inflated revenue non-organically, and the classifier "learned" that those revenue levels were the "new normal". [3]
  • Your recommender system has been offering the same out-of-catalog items for 15 days in a row, not only producing a terrible user experience but also hurting revenue. The reason? There is no monitoring of data (filebeat) nor of application metrics (metricbeat). [3]
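Both failures above reduce to the same missing piece: monitoring what the model sees in production against the distribution it was trained on. A minimal sketch of such a check, comparing the mean of a recent window against a training-time baseline with a z-score (the threshold of 3 is an illustrative choice, and real drift monitors also compare full distributions, not just means):

```python
import math
import statistics


def mean_drift_zscore(baseline, recent):
    """z-score of the recent window's mean under the training-time baseline."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    standard_error = sigma / math.sqrt(len(recent))
    return abs(statistics.mean(recent) - mu) / standard_error


def drifted(baseline, recent, threshold=3.0):
    """True when production inputs have moved far enough from training data to alert."""
    return mean_drift_zscore(baseline, recent) > threshold
```

Wired to an alerting channel, a check like this would have flagged the marketing-campaign revenue spike as a shift in the input distribution before the classifier silently absorbed it as the "new normal".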

The points above were some examples of how machine learning systems carry intrinsic complexities that involve very specific skill sets and that must be taken into account regarding their operationalization.

FINAL CONSIDERATIONS

We are on the march to have more and more Machine Learning systems involved directly or marginally in companies’ core business.

With the increased impact of these systems on people’s lives, society, and businesses, it is a matter of time before we have accountability protocols if something goes out of control; especially in aspects linked to fairness, transparency, and explainability of these systems and algorithms.

Within this, it becomes increasingly clear that the era of the “analyst-with-a-script-on-their-own-machine” has its days numbered when we talk about platforms that have interactivity with people.

While Machine Learning systems do not yet have the same level of software engineering maturity in their development, deployment, and operationalization, among many other specific aspects, there is perhaps an avenue for the growth of what is known today as MLOps, or Machine Learning Operations.

The MLOps approach exists not just to deal with aspects linked to infrastructure or software development; these teams meet a still-latent demand to eliminate or mitigate the problems and debts intrinsic to Machine Learning development.

NOTES

[1] - The terms “Data Scientist”, “Systems/Platforms”, “Product Manager”, “Accountability”, and “Fairness” will be used throughout this text.

[2] - For those interested, Luke Marsden wrote a kind of MLOps Manifesto where some of these ideas are present.

[3] - Events of which I was a witness or that happened to me directly.