Big Data contradicts conventional wisdom in the NFL

In this article, Isaac Lopez presents some of the findings of researcher Jesse Anderson from a Big Data analysis (not really all that ‘big’) of NFL data, in which he concludes that much of the conventional wisdom about the game has very little actual influence. Some of the analyses from the study follow:

Methodology and Data Collection

Using data collected by the website Advanced NFL Stats, Anderson put ten years of NFL play-by-play data into Hadoop to try to extract useful information from the unstructured data. “I spent a good 80% of my time dealing with problems in the data,” he explained, discussing the challenges of working with an unstructured data set that contains 2,898 games with 471,392 plays. The biggest challenge, he explained, was in the natural language processing, and getting useful data out consistently. He says he used regular expressions to parse out the human-generated strings and extract useful info.
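
To give a rough idea of what parsing human-written play descriptions with regular expressions can look like, here is a minimal Python sketch. It is not Anderson's code: the description format, the keyword lists, and the `parse_play` helper are all assumptions made for illustration, and the real Advanced NFL Stats strings may look quite different.

```python
import re
from typing import Optional

# All patterns below assume a hypothetical description format such as
# "(14:56) T.Brady pass short right to W.Welker for 7 yards."
CLOCK_RE = re.compile(r"\((\d{1,2}:\d{2})\)")
PASS_RE = re.compile(r"\bpass\b", re.IGNORECASE)
RUN_RE = re.compile(
    r"\b(up the middle|left end|right end|left tackle|right tackle|"
    r"left guard|right guard)\b",
    re.IGNORECASE,
)
YARDS_RE = re.compile(r"for (-?\d+) yards?\b", re.IGNORECASE)


def parse_play(description: str) -> dict:
    """Extract the game clock, a coarse play type, and yards gained."""
    clock_match = CLOCK_RE.search(description)
    clock: Optional[str] = clock_match.group(1) if clock_match else None

    # Check for a pass first, since pass descriptions can also
    # contain direction words such as "right".
    if PASS_RE.search(description):
        play_type = "pass"
    elif RUN_RE.search(description):
        play_type = "run"
    else:
        play_type = "other"

    yards_match = YARDS_RE.search(description)
    if yards_match:
        yards: Optional[int] = int(yards_match.group(1))
    elif "no gain" in description.lower():
        yards = 0
    else:
        yards = None  # no yardage stated (timeouts, penalties, ...)

    return {"clock": clock, "play_type": play_type, "yards": yards}
```

Anything the patterns do not recognize falls into `"other"` with `yards` set to `None`, which makes it easy to count how many descriptions still need a rule. That kind of bookkeeping is one plausible reason data cleaning takes up so much of the time on a data set like this.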

On the altitude at the Denver Broncos' stadium

Anyone who watches the NFL has seen the images of the players on the sidelines huffing oxygen through masks, while the announcers dramatize the images with talk about the advantage that the Denver Broncos have in their mile high home field. According to the data, the altitude doesn’t really show any discernible effect in either the outcome or how the game is played relative to other stadiums, saving one minor difference: a 1% increase in passes.

On Playing at Home

However, that doesn’t mean that there aren’t real home field advantages to speak of. The home team wins an average of 57% of the time. There are outliers to this number, however. Baltimore was the biggest outlier in the data when they were at home and were playing in weather, winning on average 22-14 in adverse conditions. This makes some visceral sense given the strength of their defense during this period of time, and considering that offenses would have to battle against both it and the weather.

On how teams' play-calling changes when they have the ball

The data revealed some interesting things about the way the game is played. On first downs, 52% of the time it’s a run, and 42% of the time it’s a pass. On second down, it’s 45% run, and 49% pass. And on third downs, this changes dramatically, with runs falling to 26% and passing climbing to 66%. However, the thing that changed the way the game was played the most is the wind. At calm winds, 41% of the plays resulted in passes, and 37% were runs. But when the wind climbed higher than 30 MPH, this virtually flips, with 34% of plays resulting in passes, and 46% resulting in runs.
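
Percentages like these come down to counting play types within each group (down, wind band, and so on). The sketch below shows one way to do that in Python once each play has been labeled; the `play_type_share` function and the tiny sample list are hypothetical and are not taken from the study's data.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def play_type_share(
    plays: Iterable[Tuple[Hashable, str]]
) -> Dict[Hashable, Dict[str, float]]:
    """Return the percentage of each play type within each group.

    `plays` yields (group, play_type) pairs, where the group could be
    the down number or a wind-speed band.
    """
    counts: Dict[Hashable, Counter] = defaultdict(Counter)
    for group, play_type in plays:
        counts[group][play_type] += 1

    shares: Dict[Hashable, Dict[str, float]] = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        shares[group] = {
            play_type: 100.0 * n / total for play_type, n in counter.items()
        }
    return shares


# Made-up sample, for illustration only: (down, play type).
sample = [
    (1, "run"), (1, "run"), (1, "pass"), (1, "other"),
    (3, "pass"), (3, "pass"), (3, "pass"), (3, "run"),
]
shares = play_type_share(sample)
```

Keeping an `"other"` bucket in the totals would explain why the run and pass figures quoted above do not add up to 100%: presumably the remainder covers plays such as kicks and penalties, although the article does not say so.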