Nathan Yau

May 16, 2013
 

Show ratings for 24

The quality of television shows follows all kinds of patterns. Some shows stink in the beginning and slowly gain steam, whereas others are great at first and then lose momentum towards eventual cancellation. Using data from the Global Episode Opinion Survey, Andrew Clark visualized ratings over time for many popular shows in an interactive.

The graph represents the average ranking for the show over time. The red lines indicate changepoints, estimations of when the properties of the time-series, typically the mean changes. The intensity of the plot varies according to the number of respondents. An episode of a show that is favourably rated tends to get more people ranking as do earlier episodes in long-running show.
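If changepoints are new to you, here is a minimal sketch of the idea in Python. It finds the single split that best separates a series into two segments with different means, using least squares. This is not necessarily the method behind the interactive; the function names and the ratings are made up for illustration.

```python
from statistics import fmean


def sse(segment):
    """Sum of squared deviations of a segment from its own mean."""
    m = fmean(segment)
    return sum((x - m) ** 2 for x in segment)


def best_mean_changepoint(values):
    """Return the index k where splitting values into values[:k] and
    values[k:] gives the lowest total squared error around each
    segment's mean. Handles exactly one changepoint."""
    if len(values) < 2:
        raise ValueError("need at least two values to split")
    best_k, best_cost = None, float("inf")
    for k in range(1, len(values)):
        cost = sse(values[:k]) + sse(values[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical per-episode average ratings (not GEOS data):
# a run in the 8s followed by a run in the 7s.
ratings = [8.4, 8.5, 8.3, 8.6, 8.4, 7.5, 7.4, 7.6, 7.3, 7.5]
print(best_mean_changepoint(ratings))  # 5
```

A real analysis would allow several changepoints and penalize adding too many of them, but the core question is the same: where does the mean shift?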

For example, the chart above shows ratings for 24. The ratings started in the 8s and finished in the 7s, which isn't a huge difference really when you compare it to ratings for The Simpsons.

Simpsons

There's a self-selection challenge here. To participate in the GEOS survey, you have to create an account, so there's probably going to be some polarity in the ratings as well as limited sampling for many episodes. So take it all with some salt. Nevertheless, it's fun to poke around and see how your favorite shows changed over time. Most of the ratings matched my expectations.

The R code is available on GitHub if you want to have a go at the data.

May 15, 2013
 

Arrested development jokes

Watch Arrested Development enough and you start to realize there are a lot of recurring jokes across episodes and seasons. In an interactive by Beutler Ink and Red Edge, Recurring Developments shows which episodes each joke, such as the awkwardness between George Michael and Maeby, appears in. And like the visualization this is based on, you can also go the other way around and look at the recurring themes in each episode.

The interaction is fairly straightforward. Jokes are on the left and a listing of episodes is on the right. Click a joke and orange lines extend to corresponding episodes. Click an episode and lines extend to corresponding jokes.

Excuse me while I go on an Arrested Development binge on Netflix.

May 13, 2013
 

Homophobic tweets

In a follow-up to their map of racist tweets towards Barack Obama, the folks at Floating Sheep took a more rigorous route to get around the challenges of sentiment analysis. Over 150,000 geotagged tweets containing slurs about race, sexuality, and disability were manually classified and mapped.

All together, the students determined over 150,000 geotagged tweets with a hateful slur to be negative. Hateful tweets were aggregated to the county level and then normalized by the total number of tweets in each county. This then shows a comparison of places with disproportionately high amounts of a particular hate word relative to all tweeting activity. For example, Orange County, California has the highest absolute number of tweets mentioning many of the slurs, but because of its significant overall Twitter activity, such hateful tweets are less prominent and therefore do not appear as prominently on our map. So when viewing the map at a broad scale, it’s best not to be covered with the blue smog of hate, as even the lower end of the scale includes the presence of hateful tweeting activity.
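The normalization step is what keeps big counties from dominating the map. Here is a rough sketch of it in Python, assuming you already have per-county counts. The county names, counts, and function name are hypothetical, and the actual map may apply further steps not shown here.

```python
def hateful_tweet_share(hateful_counts, total_counts):
    """Divide each county's hateful tweet count by its total tweet count.

    Counties with no recorded tweets are skipped to avoid dividing by zero.
    """
    shares = {}
    for county, total in total_counts.items():
        if total > 0:
            shares[county] = hateful_counts.get(county, 0) / total
    return shares


# Made-up counts: County A has far more hateful tweets in absolute terms,
# but also far more tweets overall.
hateful = {"County A": 500, "County B": 40}
total = {"County A": 1_000_000, "County B": 10_000}

shares = hateful_tweet_share(hateful, total)
ranked = sorted(shares, key=shares.get, reverse=True)
print(ranked)  # ['County B', 'County A']
```

County A leads on raw counts, but County B ranks higher once you account for overall activity, which is the same effect described for Orange County.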

Hard to believe this stuff is still around. It looks like I might want to steer clear of some parts of Virginia.

Data Points: Sample chapter

May 10, 2013
 

It's hard to believe it's been over a month since Data Points: Visualization That Means Something hit the shelves. Thanks to all of you for the tweets, emails, and pictures of the book in the wild. Every one makes me smile, and I'm glad that people are finding it helpful.

In case you're still deciding, here's a sample chapter from the book. It's Chapter 3 on representing data and should give you a good idea of what to expect. And of course it's way sexier in print.

What others are saying

Of course, don't just take my word for it. Here's a sample of the chatter on Twitter.

@kindraupdates: Got my #DataPoints book by @flowingdata today! Another great #dataviz effort, Nathan! Excellent work! pic.twitter.com/at3VE6OxpS

@blynchdata: Have been delaying pleasure Slowly reading Received Visualize This and Data Points.These 2 are beautiful! Nice job @flowingdata!

@JanWillemTulp: Data Points, @flowingdata's latest book looks beautiful and informative pic.twitter.com/fY5lnZyYNS

@RyanMullins: My parents got me a copy of @flowingdata’s book Data Points for my birthday, excited to read it. pic.twitter.com/2iWr9FT64z

@SusanaAssuad: @flowingdata just got "Data Points" today, looking forward to start working with it ! Thanx Nathan :-)

@joreira: Really good read about Data Visualization! Thanks @ajeets for the recommendation. @flowingdata pic.twitter.com/1SxfPjnfkq

@dseverski: Data Points by @flowingdata is a gorgeous and inspiring collection. So much goodness.

@jcwong86: Can already tell it's gonna be great. Beautiful full bleed graphics. Congrats @flowingdata @nathanyau on its success! pic.twitter.com/6xpGWYlpWf

Thanks again, everyone! Didn't get your copy yet? You can order Data Points on Amazon, Barnes & Noble, or your local bookstore. It's also available in all major ebook formats.

Data Points: Visualization That Means Something is available now. Order your copy.

May 10, 2013
 

Cicada

This is my first time hearing about this, probably because it only happens every 17 years. After 17 years of development in the ground (getting nourishment from tree roots), cicadas are starting to swarm on the East Coast. Hundreds of millions of them mate, make a lot of noise, and then die. Adam Becker and Peter Aldhous for New Scientist mapped data maintained by John Cooley and Chris Simon from the University of Connecticut to show the cycles of the cicada.

There are 17-year broods, which is what's happening now, and there are 13-year broods, with the next one expected next year in Louisiana.

Click the play button on the top right to see the various broods appear over time, and be sure to turn on the audio (in the left panel) for added flavor. [Thanks, Peter]


May 9, 2013
 

Removing geometry by Fathom

Terrence Fradet of Fathom Information Design ponders whether metro maps suffer or benefit from leaving out geography. Geographic accuracy is good, but sometimes it can confuse your audience.

Just how important is it that metro maps represent geography? This piece came from an interest in how metro maps over the past century have tiptoed between geographic and topological representations—topological meaning to forgo all spatial integrity and instead represent the connectivity of a specific environment.


May 9, 2013
 

Here is today

When you focus on all the small events and decisions that happen throughout a single day, those 24 hours can seem like an eternity. Graphic designer Luke Twyman turned that around in Here is Today. It's a straightforward interactive that places one day in the context of all days ever.

You start at today, and as you move forward, the days before this one appear, until today is reduced to a one-pixel sliver on the screen and doesn't seem like much at all.


May 9, 2013
 

Average dissertation

On R is My Friend, as a way to procrastinate on his own dissertation, beckmw took a look at dissertation length via the digital archives at the University of Minnesota.

I've selected the top fifty majors with the highest number of dissertations and created boxplots to show relative distributions. Not many differences are observed among the majors, although some exceptions are apparent. Economics, mathematics, and biostatistics had the lowest median page lengths, whereas anthropology, history, and political science had the highest median page lengths. This distinction makes sense given the nature of the disciplines.
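The original analysis was done in R, but the comparison at its core is simple: group page counts by major and take the median of each group. Here is a small Python sketch of that step. The majors and page counts are invented for illustration and are not from the University of Minnesota archive.

```python
from collections import defaultdict
from statistics import median


def median_pages_by_major(records):
    """Group (major, page_count) pairs by major and return each median."""
    pages = defaultdict(list)
    for major, page_count in records:
        pages[major].append(page_count)
    return {major: median(counts) for major, counts in pages.items()}


# Hypothetical records, one (major, pages) pair per dissertation.
records = [
    ("Major A", 110), ("Major A", 130), ("Major A", 95),
    ("Major B", 280), ("Major B", 310), ("Major B", 250),
]

medians = median_pages_by_major(records)
print(medians)  # {'Major A': 110, 'Major B': 280}
```

The median is the line in the middle of each boxplot, so sorting majors by this value reproduces the ordering you see in the chart.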

I was on the long end of the statistics distribution, around 180 pages. Probably because I had a lot of pictures.

As I was working on my dissertation, people often asked me how many pages I had written and how many pages I had left to write. I never had a good answer, because there's no page limit or required page count. It's just whenever you (and your adviser) feel like there's enough to get a point across. Sometimes that takes 50 pages. Other times it takes 200.

So for those who get that dreaded page-count question, you can wave your finger at this chart and tell people you're somewhere in the distribution.
