my top 5 “most difficult spelling systems” list

English orthography is a beautifully inconsistent, inconsistently beautiful, sadistic excuse for a “system.”  My favorite example is what happens when you add letters to the word “tough.”  (The symbols on the right are how the words are pronounced, and even if you don’t understand what they mean exactly, you can see that more changes than just a single character)
  • tough /tʌf/
  • though /ðoʊ/
  • through /θɹuː/  (notice, so far, not a single sound in common)
  • thought /θɔːt/ (or in my dialect, /θɑːt/)
  • thorough /ˈθʊɹoʊ/
So, there are at least four different vowels and sometimes an /f/ represented by <ough>, and there are two different sounds represented by the <th>.  Add words like “laugh,” “taught,” “draught,” and “drought” and you start to see how this spirals out of control really quickly.  Part of the problem is that we have, depending on the dialect, anywhere between about 35 and 45 distinct sounds (called phonemes), and yet a mere 26 letters to write them. Linguists, very appropriately, call that a “defective script,” and English is astonishingly defective.  Of the 26 letters we do have, several overlap in strange Venn-diagram-like sound mappings such as s-c-k-q-x.  It’s kind of mind-boggling really to think about this, but there is literally one sound in the entire English language that is written with one and only one group of letters, and that’s the so-called “soft th” /ð/ in “though” from above.  The letters <th> can represent multiple different sounds (e.g. Thomas), but the phoneme /ð/ can be written only one way.  Every other phoneme in English has at least two, sometimes more than 10, and, in the case of the sound /eɪ/ as in bay, pain, base, bass, ballet, obey, dossier, resume, or even résumé, etc., more than 20 different graphical representations!  So you can imagine that, if you were a speaker of another language trying to learn English, you might assume that native speakers had devised the system as some sort of cruel joke to keep foreigners from ever mastering it.
To every rule, there is some exception.  For every word like “photograph,” there are words like “haphazard” and “Stephen.”  Chore?  Choir and charade.  Singer?  Finger and angst.  Bureau?  Bureaucracy and beauty. Of course there are hundreds of homophones that are spelled (or spelt) differently but are pronounced the same, like write/rite/right, prince/prints, or soared/sword.  Those are decently common across languages.  It is quite rare, however, to find a language with anywhere near English’s number of heteronyms that are spelled the same and yet pronounced differently, such as having a gaping wound as opposed to being wound up, the bow of a ship or a bow and arrow.  It is not a moderate task to moderate discussions of English spelling.  Everyone knows about our silent “e,” but then there are things like silent “b” (debt, comb), silent “p” (psychology, pneumonia), silent “t” (castle, listen, soften, not to mention words that vary like “often”), silent “s” (island, debris), silent “l” (salmon, talk), and it just goes on and on like that.  It is genuinely pretty hellish, but are other languages any easier?  Are there even more horrific systems?  How does English orthography really stack up against the rest of the world’s written codes?  Well, first we need to survey the landscape and see what’s out there.
Before that, though, if you’re hearing the word “orthography” for the first time, it simply means the whole system of writing.  It comes from the Greek stems orthós-, meaning “standard,” “legal,” or “correct” (think “orthodox”), and -graphéin, meaning “to write.”  Linguists tend to use the term “orthography” rather than “spelling” for two main reasons: first and foremost, it makes us sound smarter and thus feel better about ourselves.  Second and almost equally important is the notion that a strict definition of “spelling” is relatively narrow in scope.  Spelling means the way in which words are written using letters and diacritics (which are accent marks like in résumé), and therefore doesn’t necessarily apply to all languages that have writing.  In contrast, orthography is a very broad term that encompasses many aspects of writing.  Here’s a simple graph:
Put simply, spelling is one part of orthography, but spelling is not everything you’d have to learn in trying to master English writing.  If I “spell out” some of the additional features that orthography covers, it might give a good idea of how daunting it could appear to would-be learners:
  • Punctuation and other unpronounced characters
    • Some languages don’t (or at least, didn’t) use punctuation at all, which might make an adjustment to a fully-punctuated system more taxing. Where there is punctuation, rules on usage are often quite complex, and they differ across languages, even between sister languages like French « Allons-y! » and Spanish «¡Vamos!»
    • Languages that use punctuation often use it to contrast meaningfully different phrases, like the famous “A woman without her man is useless” and “A woman: without her, man is useless.”  Those differences aren’t universal; they have to be learned
    • Emoticons and other non-letter characters like ellipses and dashes often carry meaning or perform important discourse functions, especially in informal contexts.  Anyone who has ever tried learning the stupefying array of Japanese emoticons knows the true meaning of despair. I mean, for crying out loud, there’s a specific emoticon for the act of playing volleyball! (/o^)/ °⊥ \(^o\)  When I first got that in a text message, I didn’t read that as it was intended: an invitation to go to the beach. I thought someone had to go to the hospital
  • Orientation
    • Just to list some examples, English, Russian, and Inuktitut are written horizontally from left to right
    • Arabic and Hebrew are written horizontally right to left
    • Classical Mongolian is written vertically left to right
    • Chữ Nôm (Old Vietnamese) is written vertically right to left
    • Modern Mandarin and Japanese are written in multiple different orientations depending on the format and level of formality
    • Egyptian Hieroglyphs were prototypically horizontal and right to left, but varied based on other stylistic factors and some characters were read in their own special order
    • There are yet other ways of orienting writing, too
  • Penmanship, scripts, and written styles
    • There are sometimes large differences between the way (or ways) in which things are written in the same language depending on the people, place, purpose, and time.  When people study a Chinese language, for example, they have to learn that type-font characters like 書道 can appear radically different in hand-written forms, but they have to recognize that the intent is the same.  Beginning learners of English are often confused by the Times New Roman forms of “g” and “a/ɑ”
    • Block-print and cursive writing are other good examples in English, but even within cursive, there are different styles like Spencerian and D’Nealian, and there are often idiosyncrasies across languages
      • French cursive “1” often has the hook extending all the way down such that, to North American eyes, it looks almost like “Λ”
      • Japanese teachers of English were taught for many years to write “s” with a hook such that many still write it as “ʂ,” which is a different letter in other languages
  • Alternate characters and characters for special purposes
    • In English, accountants and mathematicians often write 0 (as well as 7 and z) with additional slashes to disambiguate or prevent fraud, but Ø is a separate letter in Danish and Norwegian, for example, so accountants writing in those languages tend to write 0 with a dot in the center instead.  Similarly, many Japanese and Chinese legal documents write the numbers 1, 2, and 3 as “壱、弐、参” instead of the more general “一、二、三”
    • Roman numerals (like MMXII) are another part of our writing system that has to be learned in order to be fully literate
For some languages, parts of orthography fall into spelling, too, such as:
  • Spacing
    • Some languages, like modern Korean, put spaces between words; others, like Japanese and Chinese, do not
    • Some languages or language varieties that put spaces between words write compound words as one continuous sequence. German is probably the most famous example with its words like “Geschwindigkeitsbegrenzung,” which English spaces out into “speed limit”
    • English is wildly inconsistent on this one, though. We write fireman and highway in unbroken strings, fire truck and high school with spaces. Rollerskate, roller-skate, and roller skate are all well-attested in corpora, and even the snootiest of dictionaries often have multiple listings
  • Capitalization
    • Most writing systems are unicameral, meaning that they don’t have upper- and lower-case letters, but even closely-related languages with bicameral scripts often differ wildly in this respect; English doesn’t capitalize every noun anymore, but German still does
    • English is perhaps unique among languages in requiring that the pronoun “I,” but not “we,” “you,” or any other pronoun for that matter, is always capitalized
    • Conventions change over time. If you have an old version of Word (which is capitalized) and you run a spellcheck (no space), you might get prompted to capitalize words like “internet,” but that’s no longer the case (if you got that last pun, you can join me in tears of shame)
    • We can capitalize words mid-sentence for emphasis or to make a contrast, such as religion/Religion, or we can sometimes use ALL CAPS
    • French, English, and several other languages also have an interesting trend in the opposite direction, writing common acronyms in all lower-case (e.g. HIV/AIDS in French is “sida”; the word “radar” started as an acronym)
Even after all that, the most important, and in some ways the most obvious difference between the terms “orthography” and “spelling” is that “orthography” can refer to more types of grapheme systems. (A grapheme is just a unit of writing)  For example, it would be a little strange to talk about “spelling” for logographic writing systems like Chinese characters.  There are different ways to represent characters in Chinese, and to be sure, there are strict rules for how to write them with correct stroke order and such, but the characters don’t directly or consistently represent sounds per se.  In fact, there are quite a number of different types of written systems:
  • Alphabets like Georgian (there are actually three different Georgian alphabets) or Korean, where each character (or component part of the character) represents usually only one sound, although even the purest ones like Korean have exceptions and situational rules
    • English is alphabetic in a sense, but there are only two letters that consistently represent only one sound, <v> and <q>.  Both of those represent sounds that are sometimes also written using other letters, and there is an increasingly large number of foreign loanwords that break even those patterns
  • Abjads like Arabic or Hebrew, where the consonant sounds are written but some or all of the vowels are left unspecified, and the reader has to figure it out from context
  • Syllabaries where each character represents a whole syllable, not just a sound, which can be further divided into:
    • Abugidas like Ge’ez in Ethiopia where relations between related syllables are shown by added marks or changes to the same base character.  For example, in Inuktitut, the language of the Inuit people in North America, /ki/, /ka/, /ku/ are written ᑭ, ᑲ, ᑯ
    • “Arbitrary syllabaries” like Cherokee or Japanese kana where there is no visual relationship between the characters for similar-sounding syllables. Using the same syllables, /ki/ /ka/ /ku/ in Japanese hiragana are written き,か,く
  • There are yet other types of absolutely crazy hybrid systems like Mayan that used logograms and syllabics and plentiful rebuses (think of things like “gr8” for “great”) all together in the same characters.  How Mayans were able to read without having an aneurysm is beyond me.
So after all that, there’s quite a lot to consider, really, when we try to compare written systems in terms of difficulty level.  In truth, any objective comparison is flat-out impossible.  How could we compare English to a system like Hong Kong Cantonese that has more than 20,000 characters in common usage, some of which are just brutal like 戲劇 (which means “movie”), but which are nevertheless regular, represent units of meaning and not always sound, are pronounced almost always in only one way, and are composed of a limited set of simpler parts?   Put simply, we can’t.  They’re completely different challenges.  So, instead, I’ve arbitrarily limited my list to languages that either use syllabaries or alphabets, I’ve kept the list to languages that are alive today (otherwise, in my opinion, Mayan is hands-down the most astonishingly complex orthography ever devised), and I’ve excluded exceedingly rare languages or scripts like Afaka.  The remaining list is, I think understandably, not perfect, but I challenge you to come up with a better one, haha!
Last note: I have a few runners-up.  First, Thai orthography is pretty rough.  There are several characters borrowed from Sanskrit that are pronounced the same as other Thai letters but which are not interchangeable.  Thai is also interesting in that it marks tone in writing.  Its Eastern neighbour, Khmer, from which large parts of Thai script are derived, is also highly complex and irregular, with lots of variability depending on the surrounding sounds.  Both of these are, however, not quite so irregular and sadistic as to make it into my subjectively-judged top 5.  The reason is simple, and also explains why English does make the list: spelling reform.
Most languages have undergone at least one, and often multiple, spelling reforms, usually because the government or another authoritative body wants to standardize the language and modernize it.  Languages change over time, and pronunciation in particular is highly variable, but ink blots printed on a page don’t often change shape in response to social trends.  In French, the Académie Française was created in 1635, during the reign of Louis XIII, to standardize the language.  They publish dictionaries, make recommendations for school curricula, and have helped to rein in the chaos (note: rein, not reign, or rain).  The French language gets a lot of flak for words like “ils accueillent” where half the letters are silent, and then words like “lent” where most of those same letters are pronounced, but once you understand a few rules about grammatical categories, reading is not so bad.  Linguists call this a distinction between “encoding” (writing) and “decoding” (making sense of that writing, usually reading), and while French is difficult to encode, the decoding process is much more doable.  That doable-ness is largely thanks to the reforms enforced (sometimes even through violence!) by the French powers that be.  Khmer, Thai, Spanish, Russian, Italian, Swedish, Mandarin, Korean, Hindi, Mongolian, indeed a great many of the world’s languages have undergone systemic reforms to try to make the writing system more regular than it was before.  In fact, another “runner-up” for me would be Danish, which is a particularly interesting example since it’s so closely related to Swedish but hasn’t gone through the same kinds of spelling reforms.  As for English, there have been multiple attempts, but the only major influential spelling reform (Noah Webster’s) only succeeded in creating a chasm between two separate standards with equally absurd inconsistencies and idiosyncrasies.  Hopefully, that provides enough context and caveats.  Here, finally, is my top 5 crazy spelling systems:
5) Uyghur
  • Uyghur has four completely separate alphabets that are all standard in modern usage, and historically there have been quite a few others. Its Arabic-script alphabet is one of very few Persian-based scripts to obligatorily represent vowels, and there are quite a few of them: /y/ /ɪ/ /ø/ /æ/ /ɑ/ /u/ /e/ /o/, but the real irregularities come from its many Chinese loanwords that are often quite difficult to distinguish from other words and follow their own set of phonological rules
4) Burmese
  • The fact that this language’s orthography doesn’t match up with its pronunciation is a point of pride for some Burmese nationals.  There are even different words for “written language” and “spoken language” that mark the distinction.  In many ways, literate Burmese people can be said to practice diglossia, the command of multiple different dialects for different functions.  The spelling system is more or less regular viewed from the inside, but its difficulties and irregularities come from being written in stone hundreds of years ago and far removed from what has happened with the language since then
3) Irish Gaelic
  • The language is growing after it had declined in previous generations, but the writing system reflects its diverse and chaotic history.  There are digraphs and complicated allophonic rules up the wazoo, such that the word for Prime Minister, Taoiseach, is pronounced /ˈt̪ˠiːʃəx/.  A combination of having no standard spelling until the mid-20th century and huge dialectal variation, especially in vowels, has created cases where the orthography is regular in some regions for some words, and in others for other words, but nowhere for all of them
2) English
  • Truly, English is among the craziest spelling systems in the world.  Ruth Shemesh and Sheila Waller’s book explains a great deal of the subtle regularities, and I highly recommend it, but even they can’t make sense of the huge number of exceptions that continue to grow by the day.  There are some theorists who believe that English is becoming more and more like a logographic system, where basically each word has to be memorized as one chunky symbol with component parts, rather than analyzing each word internally from left to right.  That’s not far off, and yet English, I think, still only gets the silver medal in my book
1) Japanese
  • Interestingly, many of the same historical reasons for English’s, shall we say, “diversity” of spelling rules are shared by Japanese: several chronologically-disparate waves of mass importation of foreign words, several of which use entirely foreign writing systems that were only sometimes, and then only partially, regularized.  Japanese uses three largely independent writing systems together, or increasingly four if you include roma-ji, which many academics do because of words like t-shirt, “Tシャツ,” and acronyms.  Two of those systems are mostly faithful phonetic syllabaries, but there are exceptions like particles.  The third system has more than 2,000 characters in common usage, each with numerous, largely unpredictable readings depending on when and where the word originated.  All three are commonly used together in the same sentence, and combinations of two are often together in the same word: サボる (to skip class), アメリカ的 (American-style).  Pitch accent, which is contrastive for most dialects, isn’t marked anywhere (e.g. the “three hashi”: 端、橋、箸), and all the while many other characters are written with several different variants despite no meaningful phonological or even semantic contrast in any dialect (again, I’ll use “hashi”: 橋、槗).  Like English, there are tons of heteronyms, sometimes among quite frequent words like 甘い (umai, delicious / amai, sweet) and 辛い (karai, spicy / tsurai, painful), and then we get into proper nouns and fossilized expressions and what little regularity was left in the system breaks down completely
So, in my opinion, when we ask how bad English spelling is compared to other languages, the answer is: among the worst.  There are other, frighteningly complex systems out there, to be sure, but English finds a way to take its deceptively simple 26 letters and make the absolute most it can out of them.  As a final note, I think it’s important to say that, by “worst,” I don’t mean to say that we should look down on English orthography, or Japanese orthography for that matter.  In fact, in my mind, I could have equally replaced the “most difficult” in the title with “most interesting.”  Irish, English, Uyghur, Burmese, Japanese, and even French are fascinatingly rich and complex, and indeed in many ways I think they are tremendously valuable in their idiosyncrasies.  The spelling of a word in these languages contains an immense amount of information; we can know just by looking at a word like “know” that it is Germanic in origin, whereas a word like “ascertain” came from Latin through French, which in turn tells us more about the connotations of those words, the usage patterns, and the history of English-speaking people.  We’d have “no” way (or at least, no immediately apparent visual way) of doing that if we spelled no and know identically.  These “top 5” are a testament to society and language’s ability to evolve and thrive within the infinitely complex interactions of people and peoples, and while that does imply a lot of baggage, I for one am not upset about having all that stuff to cart around.  I like stuff.
So yes, English spelling is a handful, but it’s not alone, and that’s a good thing.  Perhaps, instead of denouncing others for their poor spelling of a choice few words, we should in fact celebrate the fact that people get so many other highly irregular, largely nonsense spellings correct!  (Like, for instance, “people”)  At the very least, if you deal with foreign learners of English, I hope you can sympathize (or even sympathise) with their struggle.  They truly do have it pretty bad.

why ‘funner’ should be an accepted word

For most people, the litmus test for whether a word is a “real word” or not is its inclusion in or exclusion from a dictionary, and especially a big, fat, haughty-looking paper dictionary.  Erin McKean eloquently describes how the lexicographers who make those dictionaries disagree with that approach here, but it wouldn’t take long for even complete lexicographical amateurs to start to see the holes in that line of logic.  New words are added to the dictionary every year; were they just figments of the imagination until that time?  More to the point, the act of printing a word in a way immortalizes it, such that it remains in the dictionary long after it ceases to have any meaning at all.  For example, it is both tragic and frankly absurd that the Oxford English Dictionary accepts the words “funniment” and even, I kid you not, “funniosity,” but not funner or funnest.  Are those words, which have fewer examples of use in written or spoken English than the number of letters they contain, somehow better or more real?

But we mustn’t forget, those who deny “funner” often state, that “fun” is only a noun, not an adjective.  We only have comparative and superlative forms for adjectives, not for nouns, so therefore “funner” and “funnest” must only be figments of our imagination.  To their credit, certainly the word “fun” is used as a noun quite often.  We can have a lot of fun; we can’t have a lot of enjoyable.  We can “make fun of” someone, just like other noun constructions like “making sense of” something.  On the other hand, there’s nothing about the status of “fun” as a noun that makes it any less viable as an adjective; there are literally hundreds of noun-adjective homonyms in English (e.g. every colour word in the language: a red firetruck or the deep red associated with it).

Still, some dictionaries like Oxford’s American online dictionary or the American Heritage Dictionary stubbornly pigeon-hole “fun” as a noun, often accepting its exceedingly rare use as a verb (meaning something akin to “tease” or “joke with”) while labeling the adjectival form “informal” or “slang.”  That’s a little, well, funny, given that the very same American Heritage Dictionary’s citation for informal use of fun as an adjective is a quote from Margaret Truman, daughter of the US President, in a public speech in the 1950s.  It’s weirder still given that the word “fun,” even when used as a noun, is not exactly among the snootier choices for that concept in the English lexicon.  How is “I’m having fun” any more formal than “this is a fun party”?  Oxford goes even further, though, stating that “the comparative and superlative forms funner and funnest are sometimes used but should be restricted to very informal contexts.”  Notice: “should be.”  Who the hell do those Oxford braggarts think they are?

While I was looking this up, I found several comments from anonymous online contributors saying that adjectival “fun” was some sort of “new development” and that it was only because it was new that it hadn’t been accepted yet.  Well, frankly no.  There are plenty of newer words that have been accepted, like “gramophone” and “photograph,” and even newer adjectives like “toasty,” “photographic,” and even words like “fugly.”  More importantly, however, it would be a mistake to say that the adjectival reading is some sort of neologism, and that it’s only in today’s materialist, consumerist culture that uneducated young people (and Margaret Truman) have started using the word improperly.  First off, that argument’s hard to reconcile with uses like “this is a fun little item,” spoken by a professor during an academic lecture.  Indeed, in that same corpus (called MICASE), the word “fun” was used almost a third of the time in contexts that only allow for an adjectival reading.  Second, in terms of the word’s etymology, funner and funnest both date back to the 18th century by conservative estimates.  So far as anyone can tell, “fun” probably comes from the Middle English word “fon,” from which we get other words like “fondle” and expressions like being “fond of” something.  Interestingly, that word was a verb, noun, and adjective, and even had the -ly suffix attached to it to make an adverb. The adjective form is at least as old as the nominal reading.  (If you ever need entertainment, just try adding -ly to random nouns around you and then try to make sense of the result.  You’ll find that words like “bedly,” “pillowly,” and “windowly” don’t quite roll off the tongue.)

Here’s the real kicker, though.  Since 2010, Merriam-Webster has listed the comparative and superlative forms as legitimate words, and it’s not alone.  The Scrabble Dictionary, which, if not exactly the pinnacle of lexicographic achievement, often plays a key role in word/non-word disputes, has included both funner and funnest since 2008.  In sum, a dictionary is a piss-poor means of determining a word’s status as “real” or not, but even under the dictionary rule, funner and funnest are in a grey area.

Why, then, do ill-informed pedants still swagger about denouncing users of funner as “stupid” or “uncultured”? (Real quotes)  Some insist on “correcting” phrases like “the funnest party ever” to the distinctly less natural “the most fun party ever.”  At least that rules out the possibility of a noun interpretation.  You can’t have “the most enjoyment party ever.”  For adjectives, when we make comparative and superlative forms in English, we follow a pretty straightforward mechanism.  If it’s a single-syllable adjective, it gets -er/-est; if it has three or more syllables, then it’s always more/the most X; if it has two syllables, it’s more complicated and depends on the endings and stress position (cf. narrower vs. *politer), but that’s a side point.  Fun has one syllable.  So why would “fun” behave differently than every other single-syllable adjective in the English language?  (That’s actually an overstatement. There are examples like “bored” that act as adjectives, but I think it’s obvious how that’s quite different)
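For the programmers in the audience, the syllable rule above can be sketched in a few lines of Python.  This is only a rough illustration, not a real morphological analyzer: the function names are made up for this post, the syllable count has to be supplied by hand, and the spelling adjustments (doubling a final consonant, dropping a silent “e,” turning “y” into “i”) are my own simplifying assumptions rather than anything the rule itself specifies.

```python
VOWELS = set("aeiou")


def add_suffix(adjective, suffix):
    """Attach "er" or "est" using a few common spelling adjustments (a simplification)."""
    word = adjective.lower()
    if word.endswith("e"):
        # large -> larger, largest: the final "e" is not repeated
        return word + suffix[1:]
    if word.endswith("y") and len(word) > 1 and word[-2] not in VOWELS:
        # dry -> drier, driest
        return word[:-1] + "i" + suffix
    if (len(word) >= 3
            and word[-1] not in VOWELS and word[-1] not in "wxy"
            and word[-2] in VOWELS
            and word[-3] not in VOWELS):
        # consonant-vowel-consonant ending doubles the last letter: big -> bigger
        return word + word[-1] + suffix
    return word + suffix


def compare(adjective, syllables):
    """Return (comparative, superlative), or None when the simple rule can't decide."""
    if syllables == 1:
        return add_suffix(adjective, "er"), add_suffix(adjective, "est")
    if syllables >= 3:
        return "more " + adjective, "most " + adjective
    # Two syllables: depends on the ending and stress position, so we give up here
    return None


print(compare("fun", 1))
print(compare("beautiful", 3))
print(compare("narrow", 2))
```

Feed it “fun” with one syllable and it treats the word exactly like “big” or “tall.”  It knows nothing about exceptions like “bored,” which is rather the point: the rule itself has no reason to single out “fun.”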
One might say that “it sounds wrong,” but in my opinion that’s probably attributable to our experience of being told that it’s supposed to sound wrong.  I doubt that there’s a native English speaker alive today who hasn’t at some point, probably when they were quite young, said funner or funnest and felt that it was perfectly natural.  We learn that it’s wrong, but we also learn all sorts of things that are later proven incorrect or, at least, vastly oversimplified.  Most of us also learn (from Shirley Jackson or somewhere else) that blindly, unquestioningly following the status quo is not a recipe for success.
The last bastion of hope for the naysayers is to resort to the argument that “only children say funner.”  At least then we could say that the word is an age-based dialectal marker.  At first glance, that might be appealing, but it turns out that it’s not just children who use the word, and not just in informal settings, either.  Bono said “the funnest thing” in his interview with 60 Minutes, and multiple speakers have used funner and/or funnest on NBC’s Meet the Press, hardly the equivalent of schoolyard chats.  In written correspondence, the word “funnest” had a brief period of fairly widespread use as early as the 1820s according to Google Ngrams.  According to COCA, newspapers like The New York Times and USA Today and even academic articles have printed the words dozens of times, but always, of course, quoting someone else saying it, and usually a teenager or young person.  If professionals were to use the word without being facetious or campy, then they would probably be ridiculed for it.  But why?  On what grounds?  So far as I can see, the only reason over-zealous editors continue to stamp out the word is because they’ve had it stamped out of them.
I’m not saying that the words are well-suited to formal academic writing.  They’re clearly not, but first, that’s probably more due to semantics than morphology; it’s very rare that formal writing calls for comparative/superlative, subjective judgments of amusement levels anyway.  Second and more to the point, the word “toasty” (among plenty of others) is even less well-suited to formal written contexts, and yet dictionaries don’t put a derogatory “informal/slang” label on it, nor do people seem to have a problem with that word’s existence.  Funner and funnest are picked on because they’re frequent examples, but it’s precisely because they are so frequent that we should just get over it already and accept the words for what they are: highly useful communicative tools.
As Erin McKean says, if we embraced our language for the diverse, chaotic wonder it is instead of trying to police it, we’d probably lead happier lives.  It’s no wonder that she’s such a bubbly personality; she gets paid to study how words work, and she sees words for what they are: a means of “windowly” viewing into the infinite variability of the human experience.  Viewed in that light, lexicography sounds like one of the funnest professions I could imagine.