The Bursting of the Big Data Bubble

This is probably one of the best posts in the blogosphere on the subject. Cathy O’Neil hits a nerve with many salespeople and sales engineers over the sheer volume of publications, posts, and other White Advertised Papers released about Big Data. The issue as a whole deserves reflection in homeopathic doses, but here are some of the more interesting points from the post:

[…] Unfortunately, this process rarely actually happens the right way, often because the business people ask their data people the wrong questions to begin with, and since they think of their data people as little more than pieces of software – data in, magic out – they don’t get their data people sufficiently involved with working on something that data can address.[…]

[…] Also, since there are absolutely no standards for what constitutes a data scientist, and anyone who’s taken a machine learning class at college can claim to be one, the data scientists walking around often have no clue how to actually form the right questions to ask anyway. They are lopsided data people, and only know how to answer already well-defined questions like the ones that Kaggle comes up with. That’s less than half of what a good data scientist does, but people have no idea what a good data scientist does.[…]

[…] Here’s what I see happening. People have invested some real money in data, and they’ve gotten burned with a lack of medium-term results. Now they’re getting impatient for proof that data is an appropriate place to invest what little money their VC’s have offered them. That means they want really short-term results, which means they’re lowballing data science expertise, which means they only attract people who’ve taken one machine learning class and fancy themselves experts.[…]

[…] In other words, data science expertise has been commodified, and it’s a race to the bottom. Who will solve my business-critical data problem on a short-term consulting basis for less than $5000? Less than $4000?[…]

[…] My forecast is that, once the hype wave of big data is dead and gone, there will emerge reasonable standards of what a data scientist should actually be able to do, and moreover a standard of when and how to hire a good one. It’ll be a rubric, and possibly some tests, of both problem solving and communication.[…]