English (in grey above) is by far the most popular with Spanish (in blue above) taking the top spot amongst the other language groups. Portuguese and Japanese take third and fourth respectively. Midtown Manhattan and JFK International Airport have, perhaps unsurprisingly, the most linguistically diverse tweets whilst specific languages shine through in places such as Brighton Beach (Russian), the Bronx (Spanish) and towards Newark (Portuguese). You can also spot international clusters on Liberty Island and Ellis Island and if you look carefully the tracks of ferry boats between them.
Google, in collaboration with Vizzuality, are trying to catalog endangered languages before they are gone forever in the Endangered Languages Project.
Humanity today is facing a massive extinction: languages are disappearing at an unprecedented pace. And when that happens, a unique vision of the world is lost. With every language that dies we lose an enormous cultural heritage; the understanding of how humans relate to the world around us; scientific, medical and botanical knowledge; and most importantly, we lose the expression of communities’ humor, love and life. In short, we lose the testimony of centuries of life.
A map on the homepage gets the most attention. Each small dot represents a language, and they are color-coded by endangerment risk. Click on one to get more details about the language or add information yourself to improve the records. Zoom out and the counts aggregate for an overview.
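How the site itself aggregates its dots is not described, but one common way to get this zoom-dependent behavior is to bin points into a grid whose cells shrink as you zoom in. The sketch below is only an illustration under that assumption; the function name `aggregate` and the zoom-to-cell-size rule are made up for the example and are not taken from the project.

```python
from collections import Counter
import math

def aggregate(points, zoom):
    """Count (lat, lon) points per grid cell.

    Assumption: the cell size halves with each zoom level,
    starting from 360 degrees at zoom 0.
    """
    cell_deg = 360.0 / (2 ** zoom)
    counts = Counter()
    for lat, lon in points:
        # Shift coordinates so they start at 0, then find the cell index.
        row = math.floor((lat + 90.0) / cell_deg)
        col = math.floor((lon + 180.0) / cell_deg)
        counts[(row, col)] += 1
    return counts

# Three hypothetical language locations (not real data).
points = [(10.0, 10.0), (12.0, 11.0), (-30.0, 100.0)]

zoomed_in = aggregate(points, 5)   # small cells: each dot stands alone
zoomed_out = aggregate(points, 1)  # large cells: dots merge into one count
```

Zoomed in, every language keeps its own dot; zoomed out, nearby dots collapse into a single cell with a count, which is the overview effect described above.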
Simply select a language, a region, and the metric that you want to map, such as word count, number of authors, or the languages themselves, and you've got a view into "local knowledge production and representation" on the encyclopedia. Each dot represents an article with a link to the Wikipedia article. Given the number of dots on the map, a maximum of 800,000, it works surprisingly smoothly, apart from the time it initially takes to load articles.
There are obvious gaps in access to the Internet, particularly the participation gap between those who have their say, and those whose voices are pushed to the sidelines. Despite the rapid increase in Internet access, there are indications that people in the Middle East and North Africa (MENA) region remain largely absent from websites and services that represent the region to the larger world.
I have a problem. I want to talk about the digital humanities in the most relatable terms. I want to make it so accessible that a high school student could grasp the fantastic possibilities of using digital means to describe the world. I want great aunts—who have just learned how to Skype with their great nephews and get to see their great grand nieces smile—to be able to post traditional recipes to a family website, a website that transforms into a digital archive through the richness of its content.
While color is a purely visual phenomenon, the way we see color is not only a matter of our visual systems. It is well known that we are faster at telling apart colors that have different names, but do the names determine the colors or the colors the names? Recent work shows that language has a stronger influence than previously thought.
The Sapir-Whorf Hypothesis
If and how much language shapes our thought has been the subject of many debates over the years. In the 1930s, Edward Sapir and Benjamin Lee Whorf described a view that language determines our thinking: if we don’t have a word for a concept, we cannot think about it. This was a popular view for a while, but fell out of favor in the 1960s. The pendulum then swung the other way, with researchers believing that there was no connection between language and thought, and that language was a purely abstract construct.
In the last 20 years or so, a middle ground has started to develop. While it’s clear that language does not entirely determine our thinking, there is certainly an influence. The surprising thing is how deeply seated that influence can be.
Russian Blues
In their paper, "Russian blues reveal effects of language on color discrimination," Jonathan Winawer, Nathan Witthoft, Michael C. Frank, Lisa Wu, Alex R. Wade, and Lera Boroditsky looked at differences in how native English and Russian speakers distinguish shades of blue.
It turns out that there is no single word for the English “blue” in Russian. The term siniy describes what most other languages know as dark blue, while goluboy is the name for lighter blues. The question is, does that difference mean that there is a difference in color perception between Russian speakers and speakers of other languages, like English?
The test Winawer and colleagues came up with is based on the well-known fact that it is easier for us to distinguish colors that have different names. When shown a reference color and two possible matching colors, we’re much faster when presented with, say, blue and orange than just two shades of orange.
The question is whether that is also true for Russian speakers and their different words for shades of blue. After all, our color names might be based on the same perceptual effects that our color perception uses to distinguish categorically different colors.
The result was that Russian speakers did indeed have an advantage over English speakers in telling siniy and goluboy apart. The authors of that paper then went on to test whether the reason was really language and not some genetic variation or similar. They had the study participants recite nonsense words (to keep their language centers busy) while performing the study, and found that under this condition, the difference went away.
It was clearly the language system interfering with a task that was presumably purely visual: distinguishing between different colors. Categories in our thinking may go much deeper than we think.
The Himba Tribe
A tribe in northern Namibia, named the Himba, have seemingly unusual names for colors. What the video embedded below (linked here for people reading this in their newsreaders) shows is that those names make it easier for them to see some color differences that most other people would find very difficult, whereas they have trouble telling colors apart that look quite different to most of us.
What the video unfortunately does not discuss is why they have these names for colors. There is a slight hint when one of the tribesmen describes several things that are “white,” like milk and water. It seems to me that their color names do not only (or primarily) describe hue, but also the function of the things whose color they name. This is a very pragmatic way of using language, and is not unlike some languages whose grammatical genders are based not on sex, but on more specific classes of things and animals, like large vs. small animals, plants, dead things, etc.
Thoughts
The impact of language and higher-level concepts on visualization is the key to understanding how visualization actually works. Abstract concepts like color, shape, size, etc. seen in isolation elicit associations and embellishments that influence what we see and how we think about it.
The beauty of visualization is not only its visual nature and all the complexity it brings with it, but especially the deep connections we’re only discovering as we dive deeper into it.
Eric Fischer maps language communities on Twitter using Chrome's open-source language detector. Each color, chosen to make differences more visibly obvious, represents a language. English is represented in dark gray, which is used just about everywhere, so it doesn't obscure everything else.
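How the colors were actually picked is not described, so the following is only a guess at the general idea: give English a neutral dark gray and spread the remaining languages evenly around the hue wheel so that neighbors are easy to tell apart. The function name `language_palette` and the specific gray and saturation values are assumptions made up for this sketch.

```python
import colorsys

def language_palette(languages, neutral="en"):
    """Map language codes to RGB tuples (each channel in 0..1).

    Assumption: the neutral language gets dark gray, and every other
    language gets a fully saturated color at an evenly spaced hue.
    """
    others = sorted(lang for lang in set(languages) if lang != neutral)
    palette = {neutral: (0.25, 0.25, 0.25)}  # dark gray
    for i, lang in enumerate(others):
        hue = i / len(others)
        palette[lang] = colorsys.hls_to_rgb(hue, 0.5, 1.0)
    return palette

# Hypothetical language codes detected in a batch of tweets.
palette = language_palette(["en", "es", "pt", "en", "ja"])
```

With a palette like this, the ubiquitous English tweets fade into the background and the other languages stand out against them.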
The emergence of borders without actually drawing them in is interesting. There's a little bit of blending, but the splits are pretty well-defined. This is especially true in the Netherlands, where the tweets seem to be abnormally dense. What's going on over there?
There's also a world version, but Europe is where all the action's at.
Can you picture it? Well, Google are probably well on their way to developing it, but I want to share more doodles and ideas on this blog.
Wouldn't it be brilliant to use this as an app on your phone, or to automatically detect a language from a sender and then translate it into the language you understand at the receiver? It can't be far away from development.
There is a voice-to-text search app from Google on my Android HTC, and I'm sure there is text-to-voice, which I hear students playing with on the Mac. I can appreciate it probably takes a lot of servers to manage, with the global population wanting to converse and communicate in their own language with other businessmen.
If you take the Shannon and Weaver communication model diagram of 1949... the noise in the middle would be the server translating and detecting idioms (a UK example would be: dog and bone, or up north: put wood in 'oil) and dialects.
Example:
English Voice to English text - server translate text (like at this site on toolbar) - Japanese Text to Japanese Voice
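To make the chain above a bit more concrete, here is a rough sketch of it as three stages passed one after another. Everything in it is a stand-in: `speech_to_text`, `translate_en_to_ja`, `text_to_speech` and the tiny dictionary are hypothetical placeholders I made up, not real Google services, and the "audio" is just a tagged string.

```python
from typing import Callable

Stage = Callable[[str], str]

def make_pipeline(*stages: Stage) -> Stage:
    """Chain stages so the output of one becomes the input of the next."""
    def run(message: str) -> str:
        for stage in stages:
            message = stage(message)
        return message
    return run

# Hypothetical stand-ins for the real services.
PREFIX = "audio:"
TOY_DICTIONARY = {"hello": "こんにちは"}

def speech_to_text(audio: str) -> str:
    # Pretend recognition: strip the tag that marks the string as audio.
    return audio[len(PREFIX):] if audio.startswith(PREFIX) else audio

def translate_en_to_ja(text: str) -> str:
    # Pretend translation: look the word up, pass it through if unknown.
    return TOY_DICTIONARY.get(text.lower(), text)

def text_to_speech(text: str) -> str:
    # Pretend synthesis: tag the string as audio again.
    return PREFIX + text

pipeline = make_pipeline(speech_to_text, translate_en_to_ja, text_to_speech)
result = pipeline("audio:Hello")
```

The point of the sketch is only the shape: the middle stage is where the server sits, and it is also where idioms and dialects would get lost if the lookup does not know them.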
I admit the text-to-voice converter might be limited in how well it translates the tone of the message, which comes from the intonation and inflection in the rubato of the spoken word. Maybe in time it can measure the pace, the rise in volume, the length of pauses, irony, but for now the nearest we can get to word-for-word meaning would be excellent.