
Beware the Big Errors in Big Data – Nassim Taleb

After Stephen Few, it is now Nassim Taleb's turn to offer some thoughts on Big Data:

[…] But beyond that, big data means anyone can find fake statistical relationships, since the spurious rises to the surface. 
This is because in large data sets, large deviations are vastly more attributable to variance (or noise) than to information (or signal). It’s a property of sampling: In real life there is no cherry-picking, but on the researcher’s computer, there is. Large deviations are likely to be bogus. […]
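
To make the point in the quote concrete, here is a minimal sketch (not taken from Taleb's article) of how the best-looking correlation grows as more pure-noise variables are screened against a pure-noise target. The function names `pearson` and `max_spurious_correlation`, the 30 observations, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def max_spurious_correlation(n_variables, n_observations=30, seed=0):
    """Largest |r| between a noise target and n_variables independent noise columns.

    Hypothetical helper: every series is independent Gaussian noise, so any
    correlation found here is spurious by construction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_observations)]
    best = 0.0
    for _ in range(n_variables):
        column = [rng.gauss(0, 1) for _ in range(n_observations)]
        best = max(best, abs(pearson(target, column)))
    return best


if __name__ == "__main__":
    # The more candidate variables we screen, the larger the "best" correlation,
    # even though no variable carries any information about the target.
    for k in (10, 100, 1000, 10000):
        print(k, round(max_spurious_correlation(k), 3))
```

Because the seed is fixed, each larger run screens a superset of the columns seen by the smaller runs, so the reported maximum can only stay the same or rise. That is the cherry-picking on the researcher's computer that the quote describes: picking the strongest of many relationships guarantees an impressive-looking number whether or not any signal exists.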

 
 

