Dancing with the Stars seems to be omnipresent on TV these days. I confess that, unlike certain members of my household, I'm not a fan. The lesson I'll draw for this post is one very profound insight: different partners require different approaches in choreography, costumes, and complexity. When you dance with data, it is important to keep that lesson in mind.
I recently completed an experiment that turned out to be educational. Having tried various online learning environments over the years, I decided that the evolution of the platforms warranted another investment to see what had changed. I signed up for Jeff Leek's Data Analysis on Coursera (Jeff Leek is a professor at the Johns Hopkins Bloomberg School of Public Health). My experience turned into an eight-week evening and weekend wilderness tour through applied statistics, R programming, and data analysis. I did the assignments and quizzes, watched the videos, and completed the course.
Here are a handful of the gems (I'll spare you the agonies of getting acquainted with R; picture what happened to Rudy during football practice in that old Notre Dame movie). Jeff reminded us in the early going that data is cheap and everywhere but that understanding it and processing it is a significant challenge. He also reminded us that "If it isn't the right data, volume doesn't matter," which I thought was very helpful given the deluge of reminders that we hear daily about big data. John Tukey, data analysis guru, observed decades ago that "the data may not contain the answers we seek". These data proverbs come in handy when choosing your dance partner.
To the point, here are some of the primary data analysis contenders you may meet. In general, they range from easiest to most difficult in terms of gathering and analysis. Each has a different role, involves different types of analysis, and is suited to particular contexts. Mechanistic analysis, for instance, is not generally applicable to social research, since you can't isolate variables in social life as easily as you can in a laboratory chemical reaction.
- Descriptive—how many of something or some feature
- Exploratory—find relationships, insight, connections you didn't know when you started
- Inferential—small sample used to say something about a larger population
- Predictive—data about something used to predict data about something else
- Causal—what happens to one variable when you change another variable
- Mechanistic—exact changes in one variable lead to exact changes in another variable
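To make the first four partners a little more concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption for illustration: the practice hours and judges' scores are made up, the function names are hypothetical, and the interval uses a normal approximation that is rough for a sample this small. It is not drawn from the course materials. Causal and mechanistic analysis are left out because they depend on controlled experiments or known governing equations rather than on a handful of observations.

```python
import math
import statistics


def describe(values):
    """Descriptive: summarize the data we actually have."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def pearson_r(xs, ys):
    """Exploratory: how strongly do two variables move together?"""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_confidence_interval(sample, z=1.96):
    """Inferential: rough 95% interval for the population mean.

    Assumes a normal approximation; a t-based interval would be
    wider and more appropriate for very small samples.
    """
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (m - z * se, m + z * se)


def fit_line(xs, ys):
    """Predictive: least-squares line, returned as (slope, intercept)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Made-up data: weekly practice hours and judges' scores for six couples.
hours = [2, 4, 5, 7, 8, 10]
scores = [15, 18, 20, 24, 25, 29]

print("Descriptive:", describe(scores))
print("Exploratory r:", pearson_r(hours, scores))
print("Inferential 95% CI for mean score:", mean_confidence_interval(scores))

slope, intercept = fit_line(hours, scores)
print("Predicted score at 6 hours:", slope * 6 + intercept)
```

The same six numbers answer a different question in each function, which is the point of the list: the analysis you choose decides what the data can tell you.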
Data and the analysis that makes sense of it have important differences in texture, purpose, and potential. We use statistical methods to try to read the meaning in data. This is a pervasive and important part of contemporary life. Taking a free course like Data Analysis can be a very useful way of increasing your literacy about the flows of data that are all around us and that we generate constantly as we go about our lives.
As you read about and look at graphs in newspapers, reports, or online, it's worth noting that data is a means to an end and won't, by itself, solve our problems or get us on the Dancing with Data podium. It can, however, prove immensely useful if we understand some of the important features of the data landscape, and it may, under the right kinds of analysis, help us sort out some of the puzzles we face. Though we may be alarmed at the prospect, it may well be that some part of the narrative of our complex society will be a jig danced by a sequin-free analyst loading data sets into R.