Word! The Language Column
How Many Words Does the German Language Have?

Illustration: Tobias Schrank | © Goethe-Institut e. V.

Coining new words is a cinch in German, but not every neologism belongs in a dictionary. A look at how editors sift through long word lists for the next edition of the “Rechtschreibduden”, the standard German spelling dictionary.

 

By Kathrin Kunkel-Razum

Konrad Duden published his Vollständiges orthographisches Wörterbuch der deutschen Sprache (“Complete Orthographic Dictionary of the German Language”) way back in 1880. That was a brilliant stroke of false advertising, of course, because there’s no such thing – and never will be such a thing – as a “complete” dictionary of German.

But how many words does the German language have anyway? This question has been a bone of contention in recent years. Computational linguistics methods now make it possible to determine far more accurately than ever before how big the language actually is. Around the year 2000, during my first years as a Duden editor, the standard German lexis was estimated at 300,000 to 400,000 words. Recently, an analysis of the Dudenkorpus, our electronic collection of texts, came up with a count of 17.4 million base forms (i.e. different words in uninflected form). Has the vocabulary grown so much so fast? How can we explain such a huge difference?

What’s a word?

First of all, we need to settle the question of what a word actually is. Is Müllautohintendraufsteher (i.e. a garbage collector who stands on the back of a garbage truck) a word? Even if you’ve never heard it before, yes, it is. Why? Because it’s a so-called formative (i.e. minimal syntactic unit) with a meaning that we understand. It is capitalized (like any German noun) and written with a space before and after it, you can construct a female form (Müllautohintendraufsteherin), and so on. Some people nevertheless have qualms about accepting this compound coinage as a word, probably owing to how rarely it occurs, which makes it a so-called “nonce word” or “occasionalism”. Is this word part of our standard language, though? Certainly not. It simply isn’t used often enough; in fact, it doesn’t occur at all in our corpus. And there are thousands and thousands of words attested by only a single occurrence in our text database, which means they are rarely, if ever, used.

That explains the big difference. But of course the number of German words is infinite anyway, because we can form new words all the time. Among other things, this has to do with the fact that the system for forming new words in German is perfect for putting together new combinations of words and parts of words whenever we like. So there will never be a complete dictionary of the German language.
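For readers who like to see the arithmetic, here is a minimal sketch of why the two estimates can differ so much. It counts every distinct base form in a tiny invented word list and then sets aside the forms that occur only once. The word list and the names `tokens`, `types`, `hapaxes` and `recurring` are assumptions made for illustration; nothing here reflects how the Dudenkorpus is actually analysed.

```python
from collections import Counter

# Toy list of base forms, invented for illustration (not Dudenkorpus data).
tokens = [
    "Haus", "Haus", "Haus", "gehen", "gehen", "Späti", "Späti",
    "Haustürschlüsselsucher", "Regenschirmvergesser", "Kaffeetassenstapler",
]

counts = Counter(tokens)

# Every distinct base form counts as one word.
types = len(counts)

# Base forms attested by a single occurrence only.
hapaxes = sorted(word for word, n in counts.items() if n == 1)

# Base forms attested at least twice.
recurring = types - len(hapaxes)

print(f"{types} base forms, {len(hapaxes)} of them attested only once")
```

In this toy list, half of the distinct base forms are one-off coinages, so the size of the vocabulary depends heavily on whether such forms are counted.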

Dictionaries mirror the age we live in

Five thousand new entries have been added to each of the last editions of the Rechtschreibduden. How do Duden editors select them from the spate of new words? We zero in on words that have been newly added to our corpus of texts in the three or four years between two editions and are not listed in a previous edition. The result is a very, very long Excel list of thousands of words. The editors go through the list selecting words that might be worth including in a spelling dictionary because they’re hard to spell. This criterion wouldn’t be very important for a defining dictionary, on the other hand. The list also includes plenty of street names, names of football players and the like because they often come up in the newspapers we analyse. But they’re not very important for us either since we’re not putting together a “who’s who” of famous people. What is important, however, is which words have social relevance or belong to people’s everyday vocabulary. So a dictionary, especially a spelling dictionary, is always a reflection of social developments in its day and age. The words we picked in 2017 for the 27th edition of the Rechtschreibduden included Lügenpresse (the “lying press”), Mütterrente (lit. a special “mothers’ pension” to count time spent raising preschool children), Späti (a convenience store that’s open late), Willkommenskultur (“welcoming culture”) and Zipphose (“zip-off trousers”).
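The first, mechanical step of that selection can be sketched in a few lines: keep the words that are new to the corpus and not yet listed in the previous edition. The function name `candidate_list`, the frequency figures and the ordering by frequency are all assumptions made for illustration. The actual choice, as described above, is made by editors reading the list, not by a script.

```python
def candidate_list(current_counts, earlier_words, listed_words):
    """Return (word, count) pairs for words that are new to the corpus and
    absent from the previous edition, most frequent first.
    Sorting by frequency is an illustrative assumption."""
    candidates = [
        (word, n)
        for word, n in current_counts.items()
        if word not in earlier_words and word not in listed_words
    ]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Invented frequencies for illustration only.
current_counts = {
    "Haus": 9000, "Selfie": 300, "Dampflok": 3,
    "Willkommenskultur": 120, "Mütterrente": 80, "Späti": 45, "Zipphose": 12,
}
earlier_words = {"Haus", "Selfie"}    # already in the corpus before this period
listed_words = {"Haus", "Dampflok"}   # already in the previous edition

candidates = candidate_list(current_counts, earlier_words, listed_words)
for word, n in candidates:
    print(word, n)
```

What remains after this filter is still only raw material: proper names have to be set aside, and spelling difficulty and social relevance have to be judged by hand.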
 
My next article will be about integration – integrating foreign “loanwords” into the German language.