A former co-worker (thanks Elf!) posted the following link on his FB:


What an amazing undertaking of data analysis! This “study” corresponds quite closely to two things I was thinking about this morning. The first is what a time-history plot of my use of tags on this blog would look like (see this graph of last.fm usage for something akin to what I had in mind). The second is what a time-history plot of people logging on and off of chat clients, which I am seeing minute-to-minute on the side of my screen right now in digsby, might look like over the course of an average day.

I have also had multiple discussions with game designers recently where the foremost thing on their minds was getting “real” user feedback by analyzing their metric data, that is, how users actually use the game system. How useful is the world according to metrics? I love that we can do interesting visualizations of the HUGE amount of data being generated on the Internet every second by millions of people… but how useful is the data itself for understanding people’s behavior and/or making decisions about how to interact with people?

It reminds me of my musings on Asimov’s hypothetical field of psychohistory, and also of some fundamental ideas about how individuals differ from populations. There is a well-known pitfall in analysis: when you come up against the barrier of too many dimensions (meaning, tons of different ways to gain perspective on the data), choosing one perspective limits the usefulness of your results. What I mean by this is, you can’t answer every question by looking at the data from one perspective. Seems obvious, I know!

I worry that if this raw metric data keeps becoming more accessible, as it has been recently, particular perspectives may gain undue weight and skew decisions toward something akin to a majority rule of perspective.