Improving a surface interpretation of “big data”

A silly little piece appeared in The New York Times discussing a hypothesis of a Harvard economics professor that Apple might slow down its operating system ahead of major product releases in an attempt to encourage consumers to upgrade.

One of his students used Google Trends data to investigate this hypothesis. The article compares two graphs: one showing Google Trends search volume for “iPhone slow” and the other for “Samsung Galaxy slow”.

[Figure: Google Trends search volume for “iPhone slow”]

The spikes in searches for slow operation of Apple’s products seem to correlate with new iPhone release dates, whereas the Samsung Galaxy data show no comparable search spikes.

[Figure: Google Trends search volume for “Samsung Galaxy slow”]

These graphs are horribly misleading on their own. Both products have grown in popularity over the years, so the increase in search volume over time may reflect nothing more than their growing mainstream popularity. This trend could easily have been removed from the graphs by scaling each series relative to its “base” search, e.g. “iPhone” and “Samsung Galaxy”. In the graphs as shown, it’s hard to tell whether small spikes are hidden within the compressed trendline for the Samsung Galaxy.
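As a rough sketch of that adjustment, the snippet below divides a “slow” search series by the matching “base” search series, period by period. The function name `relative_interest` and all of the numbers are made up for illustration; they are not real Google Trends data, and this is only one simple way the scaling might be done.

```python
def relative_interest(term, base):
    """Divide each term value by the matching base value.

    Periods where the base value is zero yield None instead of a ratio.
    """
    if len(term) != len(base):
        raise ValueError("series must have the same length")
    return [t / b if b else None for t, b in zip(term, base)]


# Toy numbers invented for illustration only; NOT real Google Trends data.
base_searches = [10, 20, 40, 80]  # hypothetical volume for the "base" term
slow_searches = [1, 2, 8, 8]      # hypothetical volume for the "... slow" term

ratios = relative_interest(slow_searches, base_searches)
print(ratios)
```

In this toy series the raw “slow” searches never fall, yet the ratio is flat except for the third period, which is the kind of spike that overall growth in popularity would otherwise obscure.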
