Posts Tagged Software/Tools
In Praise of Mr Data Converter
Posted in Software/Tools on January 18, 2011
An open letter to Mr Data Converter:
Dear Mr Data Converter,
Thank you so much for your awesome and free data converter. As Flowing Data so eloquently stated yesterday, “Data is rarely in the format you want it”. Rarely is probably an overly optimistic word for such a statement! This unfortunate reality is why I was so pleased to come across your tool a few months back.
Much of what I’ve done recently calls for large XML data sets to be loaded into Adobe Flex for interactive visualization projects (public examples embedded here and here). Before stumbling across Mr Data Converter, this required a lot of tedious tweaking on my end. Now it’s as easy as cutting and pasting from Excel to instantly generate well-structured XML, ActionScript, and JSON data files (and more). This has saved me countless hours and late-night headaches.
I will continue to share your converter with colleagues/friends and again thank you for such an easy-to-use data conversion resource.
Kind regards,
Alex Chisholm
DynamicDataDisplay.com
@VisualizingData
How many economists does it take….
Posted in Data Visualization on January 18, 2011
Marginal Revolution links to the new Economist’s Oath, which references the number of economists who work for the federal government (excluding the Fed).
In 2008, there were 4,130 federal employees with the title ‘Economist’. Too few? Too many? Probably depends on which agency you’re talking about. Click the image below to see the full distribution.
_____
Simple coding done with Flare in Flex
Missing Data in Excel Line Chart
Posted in Data Visualization on January 16, 2011
First off – although perhaps unpopular to say – I still think Excel is a great tool for quick analysis and chart making. This became especially true after Office 2007 added user-friendly style and formatting controls.
Take the following example looking at growth of Gross Domestic Product in China since 1990.

The graph is both easy to create and tells the story quite well. Data from the IMF show that GDP in China grew substantially between 1990 and 2010. The rate of growth was even quicker between 2000 and 2010.
With incomplete data sets, however, I always found it difficult not to show something misleading. Take this data from UNESCO on higher education enrollments in Comoros.
1999: 649
2000: 714
2001: Missing
2002: Missing
2003: 1,707
2004: 1,779
2005: Missing
2006: Missing
2007: 2,598
2008: Missing
2009: 3,457
Some might not recommend graphing anything with 5 of 11 years missing, but I think that, even with the limited data, the underlying trend is strong enough to warrant a visual.
If you leave the cells blank, Excel spits out something like this line or column chart.

Not very pleasing. You can tell the trend is upward but most non-data people would be so distracted by the gaps that they wouldn’t even care. Another method I’ve seen used is taking out the missing years.

I find this one even more offensive. The speed (slope) of the trend is exaggerated because time has been compressed. And because the gaps between actual data points vary from one to three years, the distortion is uneven across the chart.
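To put numbers on that distortion, here is a minimal Python sketch using only the known Comoros values listed earlier. The function name `slopes` is my own, purely for illustration. It compares the change per actual year with the change per chart step, which is what you see when the missing years are removed from the axis.

```python
# Known data points only, taken from the UNESCO figures listed earlier.
known = [(1999, 649), (2000, 714), (2003, 1707),
         (2004, 1779), (2007, 2598), (2009, 3457)]

def slopes(points):
    """Return (start, end, per_year, per_step) for each pair of adjacent known points."""
    result = []
    for (y0, v0), (y1, v1) in zip(points, points[1:]):
        per_step = v1 - v0               # compressed axis: every point is one step apart
        per_year = per_step / (y1 - y0)  # true time axis: divide by the years elapsed
        result.append((y0, y1, per_year, per_step))
    return result

segment_slopes = slopes(known)
for y0, y1, per_year, per_step in segment_slopes:
    print(f"{y0}-{y1}: {per_year:.0f} per year, drawn as {per_step} per step")
```

Between 2000 and 2003, for example, enrollment rose by 993 over three years, or 331 per year, yet the compressed chart draws the full 993 as a single step.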
This issue always bothered me, which is why I was so happy last week to discover a simple solution. Instead of leaving the cells with no data blank, enter =NA() as the formula. This changes the cell to read #N/A and causes Excel to draw the line straight through the gap, connecting the data points on either side of any missing values. The end result isn’t perfect because we don’t actually know what happened in 2001, 2002, 2005, 2006, or 2008, but I think it’s a practical solution for the real world, where information is often incomplete.
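For anyone who wants to see the values that straight line implies, here is a rough Python sketch of linear interpolation over the same Comoros data. It is only an approximation of what the chart shows, not a description of how Excel works internally, and the `fill_linear` helper is a name I made up for this example.

```python
# Enrollment figures from the list above; None marks a missing year.
enrollment = {
    1999: 649, 2000: 714, 2001: None, 2002: None, 2003: 1707,
    2004: 1779, 2005: None, 2006: None, 2007: 2598, 2008: None, 2009: 3457,
}

def fill_linear(series):
    """Fill None values by straight-line interpolation between known neighbours."""
    years = sorted(series)
    known = [y for y in years if series[y] is not None]
    filled = {}
    for year in years:
        if series[year] is not None:
            filled[year] = float(series[year])
            continue
        left = max((y for y in known if y < year), default=None)
        right = min((y for y in known if y > year), default=None)
        if left is None or right is None:
            filled[year] = None  # no known value on one side, so leave the gap
            continue
        fraction = (year - left) / (right - left)
        filled[year] = series[left] + fraction * (series[right] - series[left])
    return filled

filled = fill_linear(enrollment)
for year, value in filled.items():
    print(year, "missing" if value is None else round(value))
```

With this data, 2001 and 2002 come out at roughly 1,045 and 1,376. Those are points on a straight line, not actual enrollments, which is exactly the caveat above.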

Strata Conference 2011 – Wordle Cloud for Day 1 Sessions
Posted in Data Visualization on January 14, 2011
Wordle has been around for a while now and lets users create fun (free) word clouds from text. It is super easy to generate polished images.
In a few weeks I’ll be heading to Santa Clara for the inaugural O’Reilly Strata conference on ‘Making Data Work’. I dumped the first day’s session descriptions into Wordle and it spit this out.

What a great way to summarize content. Any uncertainty about what this conference is all about?


