Aug 21 2014
 

I’ve come up with a visualization of data uncertainty that seems really obviously useful, but that I’ve never seen before[1]. So I guess some combination of three things must be true:

  1. I am a genius. Deeply unlikely, given that I misspelled “genius” the first time I typed it here.
  2. There’s something wrong with the “new” method that makes it less useful than I think and/or total bunk.
  3. People do use this, and I just haven’t seen it before. Totally possible, given the number of statistical visualizations in most literary studies papers.

Anyway, the idea is to use probability clouds to show a density region around a given line of best fit through the data. I think this avoids some visual-rhetorical pitfalls in the usual ways of showing trends and uncertainty in data, but/and I’d be grateful for thoughts on its value.

Here’s the context and an example: I’m working on a manuscript at the moment for which I need to visualize a bit of data. Nothing fancy; this is one of the basic figures:

Demo 0 data

Yeah, the axes aren’t labeled, etc. The point is, there are two series that are pretty noisy but seem to be doing different things over time (along the x axis).

OK, so to get a handle on the trend, let’s insert a linear fit for each series:

Demo 1 line

Neat! But the fit lines are a little misleadingly precise. I don’t think we want to say that the “true” value of series 2 in 1820 is exactly 0.15, or that the true values cross in exactly 1872. So let’s add a confidence interval at the usual 95% level:

Demo 2 line se
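For anyone who wants the numbers behind a band like this, here is a minimal sketch in base R. The `demo` data frame, its column names, and the slope and noise values are all made up for illustration; they are not the data in the figures.

```r
# Hypothetical stand-in data: one noisy, slowly rising series (not the post's data).
set.seed(42)
demo <- data.frame(year = 1800:1899)
demo$value <- 0.15 + 0.001 * (demo$year - 1800) + rnorm(nrow(demo), sd = 0.05)

# Ordinary least-squares linear fit.
fit <- lm(value ~ year, data = demo)

# 95% confidence band for the fitted mean at a few years.
new_years <- data.frame(year = c(1820, 1850, 1880))
band <- predict(fit, newdata = new_years, interval = "confidence", level = 0.95)

# Columns: year, fit (the line), lwr and upr (edges of the shaded region).
print(cbind(new_years, band))
```

The band is narrowest near the mean year and widens toward either end, which is why the shaded region in a plot like this usually looks pinched in the middle.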

Better, but this manages to be somehow both too precise and not precise enough. Beyond the line of best fit, which still suggests false precision at the center, the shaded 95% confidence region comes to an abrupt end (too precise) and doesn’t have any internal differentiation (not precise enough). The true value, if we want to think of it that way, isn’t equally likely to fall anywhere within the shaded region; it’s probably somewhere near the middle. But there’s also a smallish chance (5%, to be exact) that it falls outside the shaded region entirely.
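The lack of internal differentiation is easy to see numerically. This is a rough sketch, again with made-up data (`demo` and its columns are hypothetical), that compares the half-width of the confidence band at several levels for a single year:

```r
# Hypothetical stand-in data (not the post's data).
set.seed(42)
demo <- data.frame(year = 1800:1899)
demo$value <- 0.15 + 0.001 * (demo$year - 1800) + rnorm(nrow(demo), sd = 0.05)
fit <- lm(value ~ year, data = demo)

# Half-width of the confidence band at one year, for several confidence levels.
levels <- c(0.50, 0.80, 0.95, 0.99)
at     <- data.frame(year = 1850)
half_width <- sapply(levels, function(lv) {
  b <- predict(fit, newdata = at, interval = "confidence", level = lv)
  unname((b[, "upr"] - b[, "lwr"]) / 2)
})

print(data.frame(level = levels, half_width = half_width))
```

With around a hundred points, the 50% band is roughly a third as wide as the 95% band, so a flat grey region hides a lot of structure near the center.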

So why not indicate those facts visually, while getting rid of the fit line entirely? Here’s what this might look like:

Demo 3 cloud

This seems a lot better. It doesn’t draw your eye misleadingly to the fit line or to the edges of an arbitrarily bounded region, but it does suggest where the real fit might be. And it does that while making plain the fuzziness of the whole business. It would be even better in color, too. I like it. Am I missing something?

On the technical side, this is built up by brute force in R with ggplot. The relevant code is:

library(ggplot2)
# `data` must be a data frame with numeric columns x and y. Use real data, of course!
p = ggplot(data, aes(x, y)) + geom_point()
se_limit     = 0.99  # Largest confidence level to show; must be between 0 and 1
se_regions   = 100   # Number of regions in uncertainty cloud. 100 is a lot;
                     #   a little slow, but produces very smooth cloud.
se_alpha_max = 0.5   # How dark to make region at center of uncertainty cloud.
                     #   Nominally 0.5 = 50% grey.
line_type    = 0     # A ggplot2 linetype for fit line; 0 = none, 1 = solid
for (i in seq_len(se_regions)) { # This loop generates the uncertainty density shading
  p = p + geom_smooth(method = "lm", formula = y ~ x, linetype = line_type,
                      fill = "black", level = i*se_limit/se_regions,
                      alpha = se_alpha_max/se_regions)
}
p    # Show the finished plot

That’s it. As you can see, it’s just brute force building up overlapping alpha layers at different confidence levels. I once looked at the denstrip package, but couldn’t make it do the same thing. But I’m dumb, so …
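One caveat about the layering, sketched below under the assumption that the graphics device uses ordinary "over" alpha compositing: stacked transparent layers do not add up linearly, so the center of the cloud ends up somewhat lighter than `se_alpha_max` suggests. The helper names `stacked_opacity` and `alpha_for_target` are mine, not part of ggplot2.

```r
se_regions   <- 100
se_alpha_max <- 0.5
layer_alpha  <- se_alpha_max / se_regions   # per-layer alpha used in the loop

# Opacity of n identical layers stacked on top of each other ("over" compositing).
stacked_opacity <- function(alpha, n) 1 - (1 - alpha)^n

center_opacity <- stacked_opacity(layer_alpha, se_regions)
print(center_opacity)    # about 0.39 rather than 0.5

# Per-layer alpha needed for the stack to reach a target opacity at the center.
alpha_for_target <- function(target, n) 1 - (1 - target)^(1 / n)

corrected_alpha <- alpha_for_target(se_alpha_max, se_regions)
print(corrected_alpha)   # slightly larger than 0.005
```

If the exact darkness at the center matters, passing `corrected_alpha` as the per-layer `alpha` should get closer to the intended value; for a purely qualitative cloud the difference probably doesn't matter much.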


[1] If you’ve learned any undergrad-level physical chemistry, you can probably see where this idea came from. Here’s a bog-standard textbook visualization of the electron probability density of a 2p atomic orbital:

(source)


Filed under: Digital Humanities
Aug 21 2014
 
In one of our blog posts last week, we featured the Wardington Hours, a relative newcomer to our collections. Three other Books of Hours have been acquired by the British Library since 2000, each of particular interest to art historians and scholars. Add MS 74754: ‘The Small Bedford hours’ In...
Aug 20 2014
 

Cross-posted on my personal site

This past year the Scholars’ Lab has implemented many performance upgrades and bug fixes for Prism. The most recent upgrade is particularly exciting: users can now deploy their own personal Prism installations to Heroku with the click of a button. Well – it will take the click of a button and a few other commands. I’ve added instructions for doing so under the “Deploy to Heroku” section of the README in Prism’s GitHub repository.

It was already possible to implement private user communities by marking uploaded Prism games as “unlisted” and then distributing the links to your group of participants. The Heroku deploy function makes this process a bit easier by allowing users to host all of their games in one place. The process also sets you up well to tinker with the Prism codebase using a live app, as Heroku provides instructions for cloning the app to your desktop.

All of this comes on the heels of another exciting announcement: the Praxis Program has a short article on Prism appearing in the Digital Humanities 2013 special conference issue of Literary and Linguistic Computing. In the piece, we summarize Prism’s interventions into conversations on crowdsourcing, with special reference to its user interface.

It’s a good day to e-highlight!

Posted on August 20, 2014
Aug 20 2014
 

 Jønlers Saga of a past future

Bern Stone wanders into an empty family. This often happens in his environment stripped of playful affordances. In his peak of the civilization, nothing brings forth our hero, but the unfulfilling routinely engagement in the lonely supper meal’s slow attachment to the warmth of the stove.


Aug 20 2014
 

Jawbone Sleep

An additional hour of sleep can make a huge difference in how you feel the next day (especially when you have kids). It's the ability to concentrate for long periods of time versus the ability to stare at a clock until your next break. I got the Jawbone UP24 band to try to improve on that, and I still wear it every night to better understand my sleep habits.

So, it only seems natural for Jawbone to look closer at how people sleep as a whole in a couple of interactive graphics. Select your city to see how people sleep in your neck of the woods.

Every now and then we see a set of graphics that shows America's sleep habits, based on data from the American Time Use Survey. The Jawbone data is likely more accurate though, which makes it more interesting. The former depends on survey participants' memories and doesn't factor out things like reading in bed. The latter is actual sleep.


Aug 20 2014
 
Reader, I was wrong.

Five years ago, I wrote a post arguing that museum photo policies should be as open as possible. I believe that the ability to take photographs (no flash) in a museum greatly increases many people's abilities to personalize, memorialize, and enjoy the experience. I still feel that way. Mostly. But this past week, a string of stories from London has changed my perspective.

The posts come from an aptly-named blog: Grumpy Art Historian. Blogger Michael Savage and I rarely see eye-to-eye, and that's why I love reading his posts. Last week, he wrote a series of posts about the British National Gallery's reversal of their photo policy. For the first time, the National Gallery is permitting non-flash photography.

The result appears to be a total mess. Lots of flashes. Mobs of iPads. Dangerous leaning and touching. A swarm of cameras everywhere. The paintings have become beleaguered celebrities, pursued by mobs of novice paparazzi.

Reading Michael's posts carefully, it seems that the cameras are not the ultimate culprits. Cameras weaponize an already unwieldy mob of people. They are the sidearms of packed-in novelty seekers. A scene like the one shown above is not just a mess because of the bevy of phones and cameras. It's a mess because of the crowd.

A packed crowd in a museum turns a free-choice viewing environment into a programmed event. You are stuck with the people around you, in front of you, shoving up behind you. Suddenly, a visual distraction like a camera--innocuous in an uncrowded space--becomes as bad as someone talking in the movie theater. You can't not see their camera. You are all in the same space.

Why is this gallery so crowded? Because it's famous. Michael notes that other parts of the National Gallery are still relatively quiet and manageable. But the star paintings--the Van Gogh sunflowers, the Botticelli virgins--are mobbed.

The cult of celebrity is strongest in fields where the general public knows little. How many opera singers can you name? How many painters? How many museums? The biggest museums get the most traffic, and within them it flows primarily to the big-name artworks in their collections. There are plenty of galleries in the Louvre that are empty. The one with the Mona Lisa will never be one of them.

Museums have exacerbated this cult of celebrity through an emphasis on blockbuster exhibitions and traveling shows that "package" the greatest hits into must-see moments. We push the once-in-a-lifetime experience of seeing the art. And then the crowds show up. They were told they must not miss it. They had better capture the moment however they can! And so the crowds shuffle through, cameras dutifully in hand. The art gets captured like a lame animal in a game park, instead of the wild thing it is.

Thinking about all of this, I remembered Don DeLillo's beautiful bit in White Noise about the most photographed barn in America. Two of the characters in the novel go out to see this barn, and to see all the people taking pictures of it. One of them, Murray, says,
"No one sees the barn... Being here is a kind of spiritual surrender. We see only what the others see.  The thousands who were here in the past, those who will come in the future. We've agreed to be part of a collective perception. It literally colors our vision. A religious experience in a way, like all tourism."
The barn, like Van Gogh's sunflowers, is a tamed thing. With every click, it becomes less a barn and more a likeness of a barn. It is sacrificed to the continuous capture of its likeness.

I'm OK with this happening to a barn in a novel. I'm not sure I'm OK with it happening to art and cultural artifacts.

Is there an alternative?

Michael Savage might say: turn back the photo policy. Get rid of the cameras. But I think the cameras are a distraction. The real thing we have to get rid of is the crowding.

I'm heading out next week on vacation, camping in the high Sierras. To do this, I have to get a wilderness permit. To do that, I either had to plan way in advance (I didn't) or I have to get up at 5am to stand in line for three hours to get a permit (I will).

There are wilderness permits for the same reasons there are restrictions on visitors to museums: to protect the artifacts (nature) and to ensure the safety and positive experiences of the participants.

The permitting system is not primarily based on money; anyone can get a permit for a reasonable rate. It is based on the idea that there is a maximum capacity for safe and positive wilderness experiences, and that there are rules and systems that have to be put in place to ensure that capacity is not exceeded.

There is a maximum capacity for safe and positive experiences with art in museums. The right capacity absorbs diversity in learning styles. Some people can sketch in museums. Some people can take photos. Some people can talk. Some people can look. Any of these actions can be catalysts for deep and meaningful engagement. And they can all do all of these things peaceably if there is enough breathing room among them.

I think of the best museums as generous places. They welcome different people spending different amounts of time doing different things to connect with the work on display. If they are popular museums, they support people visiting at many hours of the day to be able to have a good experience despite the demand.

Crowded places become parsimonious places. They are transactional by necessity. Every deviance from our own preferred mode of engagement becomes more visible and frustrating. Diversity breeds name-calling instead of understanding.

Let's find a way to build generosity back into the operation of the largest museums in the world. Let Van Gogh be Van Gogh. Let the people experience the sunflowers in their own way, with their own bit of space and time. We need to build systems that let visitors, and art, bloom.

Crisis Text Line releases trends and data

Aug 19 2014
 

Crisis Trends

Crisis Text Line is a service that troubled teens can use to find help with suicidal thoughts, depression, anxiety, and other issues via text messaging. The long-term hope was to anonymize and encode these text messages so that researchers and policy-makers could better understand something typically kept private to the individuals.

Following through, the organization recently released a look into their data and a sample of encoded messages. (There's a link to download the data at the bottom of the page.)

The visual part of the release shows when text messages typically come in, and you can subset by issue, state, and days. It could use some work, but it's a good start. Hopefully they keep working on it and release more data as the set grows. It could potentially do a lot of good.
