Friday, April 11, 2014

Big Data and Data Science

Big data and data science have been generating a lot of excitement lately.  Excitement is great and all, but more importantly, more substantive articles about their limitations and uses have also been cropping up.  Here are some moderately substantive ones.
This one highlights the general problems with 'big data,' but it's really more about data science as it's practiced in tech firms these days.  The problems aren't explained very comprehensively.  What's nice, though, is that there is an example for each one.
This is a bit more substantive.  It makes the argument that big data needs to go from 'thin data' to 'thick data,' where 'thin data' is just the traces of activities that are getting collected inadvertently.  'Thick data' is more information about the context of actions.  'Thick data' is probably more useful for making decisions but takes more effort to collect, probably requiring one to get out and talk to people.  Then again, it's written by someone who sounds like an advocate of 'the humanities,' who perhaps is trying to justify all the 'qualitative analysis' skills she learned instead of big data analysis.  In fact, its main argument is interesting but largely unsubstantiated.
Good old MIT Technology Review.  The March/April issue had a Business Report on Data and Decision-Making.  It had several articles about trends in how businesses are using A/B testing.

Aside: Yes, committing to blogging about an article (one that takes more than 20 seconds to understand) is the only way I will ever actually read it, much less remember what it said.
