Visualizing Data

When an engineering colleague of mine (95% of my immediate surroundings) emailed me a link this past week, I thought for sure it was for a YouTube video.  This link was different, though: it introduced me to Anscombe's quartet, a set of four datasets that share nearly identical summary statistics.  In each dataset the x variable has the same mean and variance, the y variable has the same mean and almost exactly the same variance, and the correlation between x and y is essentially the same.  A little geeky, eh?

What's my point, you might ask?  When these datasets are plotted, they look completely different from one another (just click on the link above and you'll see what I mean).  There are outliers where one would not expect to see them -- pointing to both opportunities and risks in your data, depending on what you are analyzing.  However, one would never see these differences in the data patterns if the data were not plotted in a chart or graph (or analyzed data point by data point).
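
For readers who want to check this themselves, here is a minimal sketch in Python using only the standard library. The data values are the published Anscombe's quartet figures; the helper names (`pearson`, `summarize`, `ascii_scatter`) and the fixed axis ranges are my own choices for illustration, and the text scatterplot is only a crude stand-in for a real chart.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet. Sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics (sample variance) for one dataset."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "r": pearson(xs, ys),
    }


def ascii_scatter(xs, ys, width=40, height=12):
    """Crude text scatterplot on fixed axes: x in [0, 20], y in [0, 14]."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round(x / 20 * (width - 1))
        row = round(y / 14 * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return "\n".join("".join(line).rstrip() for line in grid)


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        s = summarize(xs, ys)
        print(f"Dataset {name}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
              f"mean_y={s['mean_y']:.2f} var_y={s['var_y']:.2f} r={s['r']:.3f}")
        print(ascii_scatter(xs, ys))
        print()
```

The summary lines come out nearly the same for all four datasets, while the four text plots look nothing alike. With a real charting library the contrast is even clearer.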



Looking at your data is just as important as reading your data.  Not all of us can see the obvious by just looking at numbers, even if we don't consider ourselves "visual" people.  I've learned this quite extensively in my current job when presenting data findings to a combination of both finance and product teams.  Some people are "table" folks and others are "chart" folks.  Regardless, the combination of the two data presentation methods jogs the brain and forces you to see the data in new ways and patterns. Here are three reasons why you need to visualize your data:

1.     Visualizing data allows you to segment elements within your results set.  For example, when analyzing campaign data, the average click-through rate or cost per acquisition might meet your campaign goals.  However, deeper segmentation via visualization can help you determine where further optimization opportunities lie via eliminating waste and exploiting upside outliers.

2.     Visualizing data lets you see relationships and correlations within the data.  Scatterplots and grids can help sift out opportunities where you might have huge upside if you can optimize your key trigger points (whether behavioral metrics like CTR or margin management).

3.     Trend analysis over time is more obvious when it is visualized. Trends in data can shift dramatically or slowly over time. Visualizing trends can allow you to see the slope of a metric or group of metrics to identify seasonal and market trends that may not pop out at you when you are just staring at a table of numbers.
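
To make the first reason concrete, here is a small Python sketch of how an acceptable overall click-through rate can hide a weak segment. Everything in it is hypothetical: the segment names, the impression and click counts, and the 2.5% goal are invented for illustration, and the text bars stand in for a proper chart.

```python
# Hypothetical campaign results; names and numbers are illustrative only.
SEGMENTS = {
    "search": {"impressions": 40_000, "clicks": 1_600},
    "social": {"impressions": 30_000, "clicks": 450},
    "email": {"impressions": 10_000, "clicks": 600},
    "display": {"impressions": 20_000, "clicks": 50},
}
CTR_GOAL = 0.025  # assumed campaign goal of 2.5%


def ctr(clicks, impressions):
    """Click-through rate as a fraction (clicks per impression)."""
    return clicks / impressions


def overall_ctr(segments):
    """CTR across all segments combined (total clicks / total impressions)."""
    total_clicks = sum(s["clicks"] for s in segments.values())
    total_impressions = sum(s["impressions"] for s in segments.values())
    return ctr(total_clicks, total_impressions)


def segment_ctrs(segments):
    """CTR per segment, keyed by segment name."""
    return {name: ctr(s["clicks"], s["impressions"])
            for name, s in segments.items()}


def below_goal(segments, goal):
    """Names of segments whose CTR falls short of the goal."""
    return [name for name, rate in segment_ctrs(segments).items()
            if rate < goal]


if __name__ == "__main__":
    # 2,700 clicks on 100,000 impressions: the campaign as a whole meets goal.
    print(f"Overall CTR: {overall_ctr(SEGMENTS):.2%} (goal {CTR_GOAL:.2%})")
    for name, rate in segment_ctrs(SEGMENTS).items():
        bar = "#" * round(rate * 500)  # one '#' per 0.2 percentage points
        print(f"{name:<8} {rate:6.2%} {bar}")
    print("Below goal:", ", ".join(below_goal(SEGMENTS, CTR_GOAL)))
```

In this made-up example the campaign clears its goal overall, yet the bars show at a glance that two segments fall short and one barely registers. That kind of waste is what an average alone will not reveal.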

Segmentation, correlation and trending are just three of the reasons for data visualizations -- but there are so many more.  As analysts, we sometimes do not see the forest for the trees.  Visualizing data forces you to stop and look at the results and ponder the bigger picture.  In this case, perhaps, a picture is worth a thousand data points.

6 comments about "Visualizing Data ".
  1. Eric Melchor from Smart Digital Spending, September 4, 2009 at 1:01 p.m.

    Hi Jodi,

    Interesting post! As an analyst, I always found more meaning in visual charts and graphs.


  2. John Jainschigg from World2Worlds, Inc., September 4, 2009 at 1:21 p.m.

    Awesome post! And what a wonderful tool to use for beating up people in meetings. "Clarifying mathematical verities by simple, readily-understood example? Valuable. Doing so while humiliating some MBA who wasn't listening in stat class and now thinks basic knowledge of Excel makes him/her an analyst of consequence to the organization? Priceless."

  3. Maurice Boissiere from xif Communications, September 4, 2009 at 1:48 p.m.

    Great reminder that data serves decision making and not to rely on any one metric or method of processing information when collaborating.

  4. John Dietz, September 4, 2009 at 3:07 p.m.

    As you say, charts and graphs are a great way to visualize trends or anomalies that would be hard to pick out by reading through a table, but in my experience this works best with a limited number of variables. I've seen too many graphs where too many variables are represented, and you end up trying to decipher lines and colors and shapes all at the same time.

    We spend a lot of time trying to figure out which data sets to show graphically and how to identify trends beyond basic variance analysis, and your example is a perfect example where you should graph it out.

  5. John Grono from GAP Research, September 4, 2009 at 7:34 p.m.

    It reminds me of the tale of the statistician who couldn't swim who was found drowned in a lake.

    He'd calculated that as he was 6 feet tall and the average depth of the lake was only 5 feet he'd be fine ... even in the middle where his body was found at the bottom of twenty feet of water.

  6. Paul Van winkle from FUNCTION, September 5, 2009 at 3:18 p.m.

    The Tufte books and lectures are finally and massively relevant. I guess like many leaps, it takes crisis and crashes to heighten wider awareness of what's painfully obvious to a few focused individuals.

    Anyone who hasn't signed up for the travelling show should do so soon, before Tufte stops doing them. His three great books come with.
