A few weeks ago, during a security training for developers given by Marcus from Hackmanit (a very good course, by the way, covering topics from web development to NoSQL vulnerabilities and defensive coding), we discussed white-box attacks on web applications, i.e. attacks where the offender has internal access to the object. That made me curious to check whether similar vulnerabilities exist in ML models. After running a simple script using Scikit-Learn, I noticed there are latent vulnerabilities not only at the object level, but also in terms of having a proper security mindset when developing ML models. But first, let's look at a simple example.
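A classic example of this object-level problem is Python's pickle format, which Scikit-Learn models are commonly serialized with: deserialization can execute arbitrary code. The sketch below uses only the standard library and a harmless `eval` call as a stand-in for real attacker code (the `EvilModel` class is hypothetical, for illustration only):

```python
import pickle

class EvilModel:
    """Looks like a serialized ML model, but runs code on load."""
    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild the object; an attacker
        # can make it return any callable plus its arguments.
        return (eval, ("40 + 2",))

payload = pickle.dumps(EvilModel())

# Simply loading the "model" executes the attacker-controlled code:
result = pickle.loads(payload)
print(result)  # 42 — eval() ran during deserialization
```

Anyone who can tamper with a stored model file can therefore run code on the machine that loads it, which is why untrusted `.pkl` files should never be deserialized.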
From the MIT Tech Review article called “Google shows how AI might detect lung cancer faster and more reliably” we have the following information: Early warning: Daniel Tse, a researcher at Google, developed an algorithm that beat a number of trained radiologists in testing.
In a very insightful article, David Talby discusses the fact that the second a Machine Learning model goes to production, it starts to degrade, because the model comes into contact with reality. The author puts it this way: The key is that, in contrast to a calculator, your ML system does interact with the real world.
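One concrete way to act on Talby's point is to monitor for drift once the model is live. A minimal sketch of the idea, with a hypothetical helper and only the standard library (a real monitoring setup would compare full distributions, e.g. with a KS test, not just means):

```python
from statistics import mean, stdev

def drift_score(train_sample, live_sample):
    """How many training standard deviations the live mean has moved.

    Hypothetical helper for illustration only.
    """
    return abs(mean(live_sample) - mean(train_sample)) / stdev(train_sample)

train = [1.0, 2.0, 3.0, 4.0, 5.0]           # feature values at training time
print(drift_score(train, [3.0, 3.1, 2.9]))  # near 0: no drift
print(drift_score(train, [8.0, 9.0, 8.5]))  # large: the model is degrading
```

A score that keeps growing over time is exactly the "contact with reality" effect the article describes, made measurable.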
Most of the time we rely completely on the default parameters of a Machine Learning algorithm, and this can hide the fact that we are sometimes making wrong statements about the 'efficiency' of that algorithm.
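One simple defensive habit is to make every default explicit: record the full parameter set that was actually used, instead of letting it stay implicit in the library version. A sketch with a hypothetical `train()` function, using only the standard library (Scikit-Learn estimators expose the same idea via `get_params()`):

```python
import inspect

def train(learning_rate=0.1, max_iter=100, penalty="l2"):
    """Hypothetical training routine; the parameters are illustrative."""
    return {"learning_rate": learning_rate, "max_iter": max_iter, "penalty": penalty}

# Capture the defaults we silently relied on, so they can be logged
# and reviewed instead of trusted blindly.
defaults = {
    name: param.default
    for name, param in inspect.signature(train).parameters.items()
}
print(defaults)  # {'learning_rate': 0.1, 'max_iter': 100, 'penalty': 'l2'}
```

Logging this dictionary next to every trained model makes it obvious, later, which choices were deliberate and which were inherited defaults.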
In one experiment using a very large text corpus, at the end of training with train_supervised() in FastText I got a serialized model of more than 1 GB.
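Artifact size is worth checking before a model ships, since a multi-gigabyte binary changes how it can be stored, audited, and transferred. A sketch with a small stand-in object (the dictionary is an assumption; a real FastText model holds large embedding matrices, which is what pushes the file past 1 GB):

```python
import pickle

# Stand-in for a trained model: rows of dense "embedding" vectors.
model = {"vectors": [[0.0] * 100 for _ in range(1_000)]}

blob = pickle.dumps(model)
print(f"serialized size: {len(blob) / 1e6:.2f} MB")
```

The same one-liner on a real model object gives an early warning before the artifact hits the registry or the deployment pipeline.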
Ben Lorica talks about security in terms of Software Engineering, but at least for me the most important aspect of security in Machine Learning going forward is model explainability, about which he says: Model explainability has become an important area of research in machine learning.
It’s a great project that can tackle a huge problem in Machine Learning: reproducibility.
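Reproducibility starts with controlling every source of randomness. A minimal sketch of the idea with the standard library's `random` module (the helper is hypothetical, and real pipelines also need the NumPy and framework seeds pinned):

```python
import random

def reproducible_split(items, seed=42):
    """Shuffle a copy of items deterministically, given a fixed seed."""
    rng = random.Random(seed)  # local RNG: no hidden global state
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled

run1 = reproducible_split(range(10))
run2 = reproducible_split(range(10))
print(run1 == run2)  # True: same seed, same "random" order
```

Using a local `random.Random(seed)` instead of the module-level functions keeps the experiment independent of whatever other code has done to the global RNG.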