All The Words In The World

by | February 4, 2016


One of the most useful websites for my Modern Languages degree is Forvo, which claims to feature “All the words in the world. Pronounced.” It is an intriguing claim. Since the site is user-generated, it doesn’t really feature all the words in the world, which makes that declaration something like Activia claiming to “Feed your Inner Smile”. False advertising … but understandable. According to the list of featured languages on the site itself, it currently has a database of 662 languages. Of these, about ten have more than 100,000 pronunciations, while the majority have fewer than 500.

There are 6,909 languages in the world, according to SIL International, which puts my four-and-a-half to shame. It also puts the extent of Forvo’s database in some perspective: its 662 languages amount to just under a tenth of that total. Of the world’s languages, somewhere close to fifty per cent are considered to be endangered, ranging from the severely endangered (like Mexico’s Ayapaneco, whose last two native speakers reportedly wouldn’t talk to each other) to vulnerable languages such as Welsh. Surprisingly, there is quite a range of endangered languages featured on Forvo, although most of them have very small sample lexicons. Crimean Tatar, for instance, has six words pronounced by five different speakers; Suret has just one. Presumably, given time and a combination of academic devotion and popular enthusiasm, Forvo’s range could expand considerably.

This is all very well and good, but also highly inaccurate. Languages, unlike—or perhaps quite like—some national borders, cannot be clearly delineated or even definitively counted. What really is a language? And where do the borders lie between them? Mutual intelligibility is often listed as a defining factor but, like a language itself, intelligibility is not fixed or stable. Sometimes languages grow apart; many exist on a continuum. Recent linguistic approaches have veered away from the classic idea of linguistic ‘branches’, derived from one robust mother tree, towards a more complex wave representation. This acknowledges that dialects, creoles, patois, and sister languages all exist simultaneously and that ‘standard languages’ are essentially artificial.

Forvo’s model is simple: type in a word and you will get user-created sound clips demonstrating how to pronounce it. Among this plethora of voices, different accents and dialects emerge, covering everything from simple pronouns to brand names and regional slang. When I search “Québec” under the French category, six pronunciations come up, of which half are Canadian and half metropolitan French. By contrast, when I search “Marseille”, only metropolitan French speakers appear. Take a common and neutral word such as “Bonjour”: twenty-seven of the thirty-seven speakers list their region as France and all have standard French accents, while the rest are distributed across other French-speaking areas, such as Belgium, Switzerland, and Canada. “Bonjour” has no speakers from Morocco, the Ivory Coast, the Congo, or other French-speaking countries, although I’ve stumbled across them elsewhere on the site.

If minority languages are dying out at a terrifying speed, regional variants may be dying even faster. In a world of instant communication and globalised industry, speaking the Queen’s English or Parisian French has become an important sign of linguistic prestige. From this perspective, Forvo lies ambiguously at the centre of a web of geographical and social intersections. On the one hand, it has the potential to preserve voices from all over the world pronouncing “all the words in the world.” On the other hand, it runs the risk of further neutralising language and promoting one selective standard. In the future, like some kind of Leibnizian fantasy, we might all speak the same way.