how linguists can deal with grammatical “mistakes”

Thank you for the feedback on my previous post about the difference between linguists and grammar snobs!  This time, all of the feedback, positive and negative, was through personal correspondence, and I don’t have permission to make that public, so you’ll have to take my word on this next bit.  In talking about grammar snobbery, one question that came up more than once was (and I’m summarizing crudely) what I’d do if I came across a mistake, and by extension what I think other linguists would do and other people should do.  In other words, if grammar snobbery is wrong, then what’s right?  That’s a fair question.

While the answers (yes, plural) to that question might differ depending on the context and the mistake, I think the means of answering that question can be summarized with a single, overarching statement.  As a good friend of mine often says, “it all comes down to consequences.”

Everything we do has consequences, and our use of language is no different.  What we say, write, or communicate non-verbally can have positive or negative effects on other people, on our relationships with those people, and in some cases, on our relationships with other people we don’t even know.  A simple sentence can rally a city or a whole nation, as in “Tear down this wall,” or “I’m taking my talents to South Beach.”  On a personal level, if I’m consistently able to come up with perfectly worded witty statements on the spot, I might win more arguments, or win over a girl, or win respect from my peers.  On the other hand, if I put my foot in my mouth all the time and say asinine things as a matter of course, then I might become the Governor of Texas.  If I tack on a “just kidding, lol” to the previous sentence’s little paraprosdokian, then it’s likely that fewer people will take issue with my slighting Rick Perry, but other people will take issue with my use of “lol,” and my friends will think I’m writing this while vaguely drunk.

In sum, any language use has consequences, and there are consequences to using forms that snobs deem to be “bad grammar” as well.  Linguists wouldn’t use the term “bad grammar,” but we should be able to understand what those consequences likely will be and respond appropriately.

Before I try to explain what “responding appropriately” means to me, I should first explain why we wouldn’t say “bad grammar.”  In my last post, I tried to clarify why the very phrase “bad grammar” is confusing and almost comical to linguists; it would be like a chemist saying that there are “good” molecules and “bad” molecules. If some chemistry textbook called strychnine a “bad molecule,” you would have to assume that either the author was being facetious or that “bad” was being used figuratively to mean something more like “potentially fatal if ingested by humans.” In other words, the molecule itself isn’t “bad” per se, but its presence might have consequences that we consider negative.  And there you have it.  The way linguists would respond to instances of non-standard language use is in many ways very similar to that.

To reiterate, grammar is the system of context-dependent form-meaning associations in a speech community. In X circumstances, speakers of language Y say A (or B, C, etc.) in order to communicate Z idea. If something is “ungrammatical,” that means either that there is an important mismatch between the intended meaning and the meaning that is understood, or that the phrase wouldn’t be understood consistently or at all.  The key here, though, is that we have to conceive of the phrase “intended meaning” very broadly so that it includes not just the literal meaning of the words, but also the intended social effect.  That’s a bit tricky, so here’s an example.

I work with many international students who have trouble with English noun phrases, especially when talking about abstract nouns.  Sometimes, the presence of a plural-looking marker doesn’t make that much of a difference, as in the following:

  • I like strawberry.
  • I like strawberries.

On some level, there is a difference between liking the flavor of the fruit and liking the fruit itself.  One could imagine some person who likes strawberry-flavored candy but who doesn’t like eating the berries or vice versa.  On the whole, though, if you were talking with a person who said either of those sentences, you probably wouldn’t be confused.  Sometimes, though, there can be a very large difference, as in these two sentences:

  • I like dog.
  • I like dogs.

Suddenly, the difference between liking the thing itself and liking the flavor of it is more consequential.  If you meant to say one and said the other, then there could be a fairly large and important mismatch between what I would understand and what you intended for me to understand.  Now, if you were one of my international students, and you saw me walking my dog on the street and said “I like dog,” I would assume that you were going to pet her and not that you were imagining how my dog would taste, but if we met each other at a potluck and you said “I like dog” while chowing down on some unlabeled casserole, I might think twice before putting some on my own plate.

“You have a dog?  That’s great.  I like dog. By the way, this is good. Want some?”

Typical grammar snobs would probably respond to someone’s uttering “I like dog” by calling it a “mistake.”  Next, they might add “It should be ‘I like dogs’ because…” and then that’s where it gets really tricky.  What could they say after that?

Some of them might say, “In English, we need the plural in that position.”  Well, no we don’t.  No one says “I like ice creams,” or to take John McWhorter’s example, “I like corns.”

So maybe instead they say, “In English, count nouns require the plural in that position.”  You can count dogs: one dog, two dogs.  You can’t count corns.  You can count kernels of corn, or rows of corn, or something else of corn, but we don’t say one corn, two corns.  That would be closer, although there would still be problems.  Certain nouns can be either count or non-count, like strawberries.  “I like democracy,” and “I like democracies” are both possible sentences; it’s just that they mean slightly different things.  And that’s when they’d finally get at what I think is the right answer:

The sentences “I like dog” and “I like dogs” mean different things to native speakers.  Saying “I like dog” is a “mistake” if you intend to say that you like canines as pets because the way that is typically expressed in English is “I like dogs.”  At this point, we’ve largely abandoned any “prescriptivist” argumentation (i.e. “thou shalt do X and thou shalt not do Y”).  We’ve simply described the observable reality of the world, and laid out the consequences of the language so that the language user can make a choice.  If one option is more desirable, then the language user should choose the form that corresponds to that option.  We can only say “you should say ‘I like dogs’” by understanding that behind that “should” is a much more important “If what you mean to communicate is ___.”

At this point, you might say, “But surely you can figure it out from context.  Wouldn’t saying ‘I like dog’ to mean that labradors are cute and lovable be just fine by linguists as long as it’s said in a context where the intended meaning is clear?”  Well… in fact, there are some scholars who make a very similar argument, but I personally wouldn’t go that far, no, for two reasons.

A)  The context often doesn’t disambiguate, or at least doesn’t do so in an objectively clear way.  I work with a law student from China.  I help copy-edit his papers, and I remember once running into trouble with one of the subheadings he’d used.  He knew that he had problems with this very same form, so he had even suggested multiple different options, as if to give me a multiple choice question!  I was immensely entertained, but as it turns out, it wasn’t an easy choice:

  1. A Democracy in China
  2. Democracy in China
  3. The Democracy in China
  4. The Democracy of China

Most people, I think, would agree that (3) doesn’t sound like a plausible English phrase.  It’s unclear what that one would mean, but the other three options are all possible section titles.  Which one he should use really depends on very subtle distinctions like whether or not he is claiming democracy already exists, whether he is referring to democratic principles and structures in society or to a larger government body, whether or not he is being ironic or cute, and so on.  He may or may not be aware that these functions apply to these language structures, and may or may not be intending to use any one or combination of them.  I might be able to disentangle those things by reading the rest of his paper, but I might get that wrong.  I might accidentally attach a meaning to his subheading that he didn’t intend it to have.  He might even have meant to say something completely different, like “Democracies in China,” and as it turns out, after conferencing with him, we decided on “Chinese-style Democracy,” which sounds far removed from any of those previous options unless you add quotation marks as in “The ‘Democracy’ of China.”

The point is that, in many contexts, we can’t simply rely on the audience to figure it out for us.  If we use a form whose meaning is unclear, then our chances of successfully communicating our intent decrease.  Using the clearest, most unambiguous language possible might still not result in a perfect match; indeed, some theorists argue that a perfect match is an impossibility and we can only hope to approximate understanding.  Even if we assume it is possible in principle, listeners might just not get it, some might not be fully paying attention in the first place, or they might not have native-like listening comprehension abilities.  In other words, we almost never have ideal circumstances for communication, but even under ideal circumstances, communication is a probabilistic, messy process, so we’d do well to maximize our chances.

B) “Intended meaning” doesn’t just concern the literal denotation of any one particular set of words.  Let me tell a story to illustrate.  During my first few days living in Japan, I was happy if the food a waitress gave me vaguely corresponded to what I wanted to eat, let alone what I thought I’d ordered.  I had coughed up some garbled string of Japanese-sounding syllables, and victuals were brought to my seat, very likely in part because of my linguistic efforts or at least because they took pity on me and knew I had money.  Still, success!  I was happy and enjoyed my meal until the time came to think about what I had to say to pay the bill.

That sense of satisfaction for completing simple tasks didn’t last long, though.  After all, I didn’t just want physical sustenance.  I also desired to be thought of as a capable, functioning adult.  Often when people start learning a second language, they get frustrated or embarrassed because they feel like they sound like a child (and, in fact, we all do).  We don’t start out in Japanese 101 expounding on the perils of complementary schismogenesis.  We start by learning how to say simple phrases, but the eventual goal for most people is to be able to say what they want to say, how they want to say it.  For me personally, I wanted to be seen as capable of saying what I wanted to say, how I wanted to say it in Japanese, and if that is my intent, then only a very specific set of linguistic forms will do.  (It’s 相補分裂生成の危難, in case you were wondering… although I imagine you weren’t)

So, going back to the example of “I like dog,” a linguist would have a couple different answers depending on the situation.  If one of my students is writing to a potential host family abroad and wants to make a good first impression when introducing his likes and dislikes, I could tell him that “I like dogs” is better.  If he asks or cares to know why, I could even add that saying “I like dog” means that he likes the flavor of dog meat, and that while his future host family could probably figure out that he doesn’t mean to say that, he will sound less competent in English than if he had used the other form.  If he doesn’t care or isn’t present for me to terrorize, I’d probably just “correct” it without making a big deal of the situation.  After all, it looks like a simple sentence, but it’s actually a tough distinction to learn, with lots of exceptions and subtleties, and so I can’t reasonably get angry that he hasn’t learned it despite however long it is that he’s been trying to do so.

If another student says “I like dog” to me as I’m walking my dog on the sidewalk, I’ll probably just smile, ignore it, and move on in the conversation.  It’s not really an appropriate time to launch into an exposition on English noun phrases, and the student’s desire to communicate is probably more central than the desire to achieve native-like accuracy in that specific moment.  If I chose to comment on the statement, that might not help the student remember the form anyway.  More likely, switching the topic of conversation from my dog to English grammar would just cause embarrassment and frustration, and maybe even engender some small amount of resentment.  I could always bring it up at some later time when I thought it would be more helpful, but nitpicking then and there might give the false impression that I care more about formal correctness than about the meaningful exchange of ideas.  In short, it would make me look like a grammar snob, and that is a consequence I definitely want to avoid.


on the difference between linguists and grammar snobs

People often believe that I am the type of person to whom it would be unsafe to write anything containing a grammatical mistake, and while that pains me, I get why. I study Applied Linguistics, and as such I am passionate about language. I think about it often, and I talk about it in casual conversation as if that were a normal thing to do. Moreover–besides being the type of person who would try to get away with using a word like “moreover”–like many in my field, I teach English, which lends even more credence to the notion that I am a linguistic control freak. However, I and, more importantly, most applied linguists would be deeply offended to be grouped together with people who cry in horror at split infinitives, missing apostrophes, and dangling participles (or, for that matter, the “serial comma” I just used), and I think the distinction between what I do and what they do is important to understand.  (For an introduction to this topic in general and to the serial comma specifically, this NPR editorial is a good read)

There are many different names for people who do that sort of thing such as “Grammar Nazis” or “the grammar police.” As you can see in the picture below, they even have their own merchandise, and some attempt to put a positive spin on things by calling themselves “Grammar nerds” or “Grammar geeks.” In her hilarious and wonderfully written book, June Casagrande calls them “Grammar snobs,” and for the sake of consistency, I’ll use the term “snobs” to refer to them here, but really, we all know who (excuse me: “whom”) I’m talking about (excuse me: “about whom I’m talking”). We’ve all at some point either been witness to, victim of, and/or perhaps even complicit in their tirades against “improper usage” or simply “bad grammar,” and on the surface it seems like their passion for correct form resembles the work of linguists, but that is really far from the case. Linguists and grammar snobs are in many ways diametrically opposed.  I’d like to try to show why.

Grammar Police Mug

Mantra of the snob, not linguist

First, let me say that correction itself doesn’t bother me. I edit my own writing (and self-correct in speech) quite often. Even in informal written contexts, I sometimes delete and rewrite my Facebook comments as fast as I can in a futile attempt to make it look like I had originally written what I eventually decided was better. I believe there is such a thing as a better, clearer, more powerful means of expression, and that, all else being equal, we should pursue it. After all, language is a powerful tool, and Spider-Man has taught us that “with great power comes great responsibility.” I do strongly value comprehensibility, force, deftness, and even beauty in language, but that’s not the same thing as conformity to arbitrary, self-contradictory stylistic edicts of the self-proclaimed elite. In so many words, I’m not against all of the snobs’ “corrections,” but I end up disagreeing with most of them because I strongly disagree with their means of judging language as grammatical or ungrammatical, and what they mean by both of those terms. So call those grammar snobs what you may–nerds, nazis, or nitpickers–just don’t call them linguists. They aren’t.

Grammar Snob Cat

I’m being rather nonchalant in using the word “they,” as if grammar snobs were some unified, homogeneous cult, but I’m comfortable doing so here because, no matter how diverse the individual snobs may be, “their” handiwork tends to follow a very distinct, uniform pattern. The following is what I see as the typical modus operandi of a grammar snob:

  • Decide before reading or listening to something that formal accuracy is more important than successful communication
  • Read or listen to a given language sample, paying special attention to particular forms, often (though not always) at the expense of the message itself
  • Ignoring the content, label any form that differs from their conception of the norm as wrong, when possible using linguistic-looking jargon
  • (Optional) Add some haughty-sounding phrase and assert that it constitutes what the original speaker or writer “should have said”
  • (Optional, and less frequent) Insult the original speaker or writer

I think it’s clear from that overview why I don’t much respect that whole process. The first two steps in that list are objectionable enough that whatever happens after them is moot, but that’s actually not the only reason why linguists and grammar snobs differ in their judgments.  My biggest pet peeve with grammar snobs is that, in a surprisingly large number of cases, in the act of trying to “correct” someone else’s “grammar,” snobs commit three separate but related offenses.

  1. They invoke an argument that has nothing to do with linguistics, grammar, or sometimes even language
  2. The argument itself often isn’t true or even internally consistent
  3. The exchange that results distracts from real, underlying issues in language use and deflects people away from otherwise readily available information on language that is genuinely interesting, empowering, and meaningful

That was long and complicated, so let me break it down in a simple example. In a previous post I ranted about the word “funnest.” Use of that word can push grammar snobs into a long diatribe about how funnest isn’t a word because it’s not in a dictionary. Well, the three problems I just mentioned are well-evidenced in this example:

  1. Dictionaries (especially paper ones) aren’t an appropriate source for determining word/non-word status. That’s just not the kind of argumentation you’d use in linguistics at all.
  2. “Funnest” is listed in several prestigious dictionaries. Snobs assume it isn’t in the dictionary because they think it shouldn’t be there, but in fact it is.
  3. The issue of what constitutes a “real word” is a fascinatingly icky problem, but if we look at real usage and gather data from (often free, often online) sources like corpora, the data show that “funnest” is as real a word as any other. If people knew about these relatively straightforward tools, they could find out all sorts of things about their own language and how it works.
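
For the curious, here is a rough sketch in Python of what that kind of corpus lookup amounts to. The three sample sentences are invented purely for illustration and are not real corpus data (a real corpus would contain millions of words), and the helper names `tokenize` and `word_frequencies` are my own, not part of any actual corpus tool.

```python
import re
from collections import Counter

# Invented sample sentences standing in for a corpus (NOT real corpus data).
SAMPLE_CORPUS = [
    "That was the funnest trip we have ever taken.",
    "Honestly, the funnest part was getting lost.",
    "It was the most fun I have had all year.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(sentences):
    """Count how often each word form occurs across all sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(tokenize(sentence))
    return counts


frequencies = word_frequencies(SAMPLE_CORPUS)
print("funnest:", frequencies["funnest"])
print("fun:", frequencies["fun"])
```

Swapping the invented sentences for text from an actual corpus is what would turn this from a toy into evidence.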

Now, “funnest” is one example of this 1-2-3 pattern (bad argument; false anyway; a better argument says the opposite and is insightful), but there are countless others. I’ll describe just one of those “countless others” below, but before that I thought you might be wondering why I’m riled up about this. After all, maybe grammar snobs have nothing better to do with their time. Maybe–indeed, quite probably–they derive a sadistic pleasure from making snide remarks about other people’s language, and who am I to deny other people pleasure? It’s a free internet, and all that.

The problem is, though, language teachers have to deal with the aftermath of their handiwork. The legacy of this conflation of (often misbegotten) “style guidelines” with real “English grammar” (which, properly understood, are two very different things) is such that our students believe not only that they have to learn these “rules,” but that those rules have some intrinsic value. Some of my own students believe that teachers can (and even should) be evaluated based on their mastery of those rules and their ability to foster mastery thereof in their students, and that’s frankly appalling. While belittling someone for a missing apostrophe is trite and objectionable on its own grounds, for me there is the added grievance that these snobs are interfering with good teaching practice. Grammar snobs make my job, my colleagues’ jobs, and the work of my students harder and more complicated, and for that, they have become the subject of my rant.

There are so many examples of fundamentally flawed grammar snob arguments that it’s tough to choose just one. Who/whom, lie/lay, there/they’re/their, sentence-final prepositions, and effect/affect are each worthy of separate rants, but here, in my thinly veiled attempt to reach out to people who might think grammar snobbery is a good thing, I’ll talk about one that’s less close to the typical grammar snob’s heart. It’s called “impersonal they.”

Grammar snobs will tell you that the following sentence is malformed:

  • If someone comes looking for me, tell them I’ll be back soon.

Instead, some of them will insist, straight-faced, that the sentence should be:

  • If someone comes looking for me, tell him or her (that) I’ll be back soon.
  • (Alternatively) If people come looking for me, tell them I’ll be back soon.

Their argument is typically that “they” refers to plural subjects, and “someone” refers to singular subjects, so the pronouns don’t agree.  They typically add on that “young people” or “people nowadays” say “they” but that traditionally, English strictly maintained that distinction.  Well, we can go down that checklist I proposed earlier: 1) Non-linguistic? Yes. 2) Untrue anyway? Yes. 3) Obfuscates interesting language-related issues? Yes. 4) Makes my job harder? Yes. Here’s the breakdown:

1) The argument sounds linguistically based, but it really isn’t. The snobs simply assert that “they” refers to plural subjects only–because snobs said so–and not because that’s how the pronoun actually behaves in the language. Linguists don’t just get to call the shots and say how a language should operate. Our job is to figure out how it actually operates and pass on the relevant parts of that knowledge to our students. If the word “they” is used frequently in the context of an unspecified singular third person–as in fact it is–then that’s part of the grammar of English. Grammar is, very simply, the system of form-meaning associations in a speech community. In certain contexts, certain forms mean certain things and have certain communicative value, and others do not. A linguistic argument against impersonal “they” would have to be phrased like “In formal written contexts, use of ‘they’ or its other case forms to refer to a singular referent is stigmatized and may result in unfavorable reception.”  Even that argument, in my opinion, is flawed (there are quite a few examples of its use in articles published in academic journals like TESOL Quarterly and national newspapers, but they’re usually not salient enough to attract ire). Really, though, that’s not their argument anyway. Their argument can be called any number of things–pretentious, pedantic, petty–but it cannot be called “linguistic.”

2) Their argument is just not true. The distinction between singular and plural pronouns in English has never been that strictly delimited. Many people learn the concept of the “royal we” through Shakespeare, and that’s one example of where a plural form is used in place of the singular. In Shakespearean times, people commonly used the pronoun “thou” to refer to singular persons and “you” for more than one person, but “you” was also used to refer to individual persons formally, and it eventually became the standard for all second-person addresses. More to the point, though, people in the 21st century use “we” in impersonal contexts when saying “I” simply sounds too committal. The sentence “We’re experiencing some cold weather up here” could refer to the people of that area, but really, it doesn’t refer to anyone specifically. It’s just impersonal, like when I said “our students” in a paragraph above. One could just as easily use “I” (or “my students”), but saying “we” depersonalizes the statement. All three pronoun distinctions, then–I/we, thou/you, he/she/they–are (or were) not quite so black-and-white as the grammar snobs would have us believe anyway.  Polysemy (having more than one possible meaning) and situational exceptions in the case of singular and plural forms are not even unusual across languages; English’s fuzziness in that respect is very similar to French, German, and many others. Separately, the snobs’ assumption that they are “preserving” English is also just false. Impersonal “they” was used more than 600 years ago in The Canterbury Tales by Chaucer, and has enjoyed widespread use consistently since then. It is English, and it has been English for quite some time, but even if it were some “new” development, John McWhorter likens the snobs’ practice of trying to preserve the language to trying to stop the tide from coming in by drying the beach with a towel. I find that a powerful image of how absurd what they’re doing really is.

3) If people weren’t scared off by grammar snobs, engaging them in a conversation about pronoun shifts in English might not sound like what it does today: something that should be prohibited under the Geneva Conventions. Pronouns, which you’d think would be rock-solid, are in fact quite fluid and chaotic across languages. Japanese has or had literally dozens of personal pronouns, many of which have shifted meaning drastically over time, and all that despite the fact that pronouns are commonly dropped from speech and writing when not absolutely necessary for comprehension. Previously I discussed how the Japanese second-person omae, or other words like kisama, used to be strictly formal and honorific, but nowadays, only a few generations later, they can be insulting and even vulgar. German, in what I can only imagine is the result of years of its speakers consuming more beer than water, has come to a point where the second-person plural nominative (i.e. “you all”) and third-person feminine singular dative (“to her”) forms are the same, “ihr,” while the third-person feminine singular, third-person plural, and formal second-person nominative and accusative pronouns merged on a different form, “sie” (capitalized as “Sie” in the formal second person).  If that whole sentence sounded like confusing nonsense, then you’ve accurately understood it. Even in English, “you” started out as the accusative case only. A millennium ago, Britons would say “I see you,” but crucially not “*You see me.”  That would sound weird to them, just like saying “Me see you” would sound weird to us. Instead, the form in that position was “ye.”  “Ye see me,” which sounds vaguely pirate-like now, was at one point in history the way people talked in English.

Why would that be?  Shouldn’t pronouns be relatively stable?  We use personal pronouns hundreds of times in daily conversation; they’re some of the most frequent words in the English language. Those are really interesting questions. Indeed, part of the work of linguists is finding answers to those questions.  Unfortunately, though, people don’t tend to think about those questions partially because they think discussions of grammar are exclusively for people who want to feel superior to others.

4) Lastly, and most importantly, this has an impact on English teaching, English teachers, and students learning English. Presumably due to the influence of grammar snobs in the language testing community, I have, to my dismay, seen questions on standardized tests of English that specifically target impersonal “they,” who/whom, lie/lay, and other grammar snob problems. What happens when a high-stakes standardized test like the TOEFL uses items that test mastery of these nonsense maxims?  Am I obligated to teach something I not only don’t believe in, but in fact strongly oppose, all the while sacrificing classroom time that I could have otherwise dedicated to activities I feel would be truly beneficial?

Unfortunately, the answer to that last question is “yes.”  Psychometricians call this effect “washback,” and as much as ETS tries to use its power for good, the TOEFL has a long and storied history of negative washback in the ESL classroom. High-stakes exams often have dramatic, real-world consequences, and failure to pass them can cost students hundreds or even thousands of dollars and months of their time, so if I can get students to pass by having them memorize a few nonsense arguments to spew out for the exam and promptly forget thereafter, I will and probably even should. Now, I’m not helpless as a teacher. There are things conscientious teachers can do to deal with it, but that’s another post, and the point here is that we shouldn’t have to “deal with it” in the first place. Neither should my students, and neither should anyone.

So please, when someone tells you they’re studying linguistics or applied linguistics, understand that grammar snobbery is not part of their required coursework. (Note: I just used impersonal “they” twice) In fact, linguists often work against grammar snobs, advocating for our students, or simply advocating for logic, and I’d like to think we’re winning the war. Slowly but surely, awareness of the hypocrisy of the grammar police is spreading, as evidenced by educational websites, classes, and even comic skits (though I warn you, that last one isn’t family friendly). On the other hand, one could just as easily cite examples of other self-described grammar experts who continue to misinform and miseducate, but even they have to find ways to explain away their many fallacies instead of simply going unquestioned, and that’s good. The end result of those questions is, I believe, a fuller understanding of language, and that is what the job of linguists is all about.  (Excuse me: “That is all about which the job of linguists is.”)

my top 5 “most difficult spelling systems” list

English orthography is a beautifully inconsistent, inconsistently beautiful, sadistic excuse for a “system.”  My favorite example is what happens when you add letters to the word “tough.”  (The symbols on the right are how the words are pronounced, and even if you don’t understand what they mean exactly, you can see that more changes than just a single character)

  • tough /tʌf/
  • though /ðoʊ/
  • through /θɹu:/  (notice, so far, not a single sound in common)
  • thought /θɔ:t/ (or in my dialect, /θɑ:t/)
  • thorough /ˈθʊɹoʊ/

So, there are at least four different vowels and sometimes an /f/ represented by <ough>, and there are two different sounds represented by the <th>.  Add words like “laugh,” “taught,” “draught,” and “drought” and you start to see how this spirals out of control really quickly.  Part of the problem is that we have, depending on the dialect, anywhere between about 35 and 45 distinct sounds (called phonemes), and yet a mere 26 letters to write them. Linguists, very appropriately, call that a “defective script,” and English is astonishingly defective.  Of the 26 letters we do have, several overlap in strange Venn-diagram-like sound mappings such as s-c-k-q-x.  It’s kind of mind-boggling really to think about this, but there is literally one sound in the entire English language that is written with one and only one group of letters, and that’s the so-called “soft th” /ð/ in “though” from above.  The letters <th> can represent multiple different sounds (e.g. Thomas), but the phoneme /ð/ can be written only one way.  Every other phoneme in English has at least two, sometimes more than 10, and, in the case of the sound /eɪ/ as in bay, pain, base, bass, ballet, obey, dossier, resume, or even résumé, etc., more than 20 different graphical representations!  So you can imagine that, if you were a speaker of another language trying to learn English, you might assume that native speakers had devised the system as some sort of cruel joke to keep foreigners from ever mastering it.
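
To make the <ough> problem concrete, here is a minimal Python sketch that groups the five words listed above by the sound their <ough> stands for. The transcriptions come from that list; how I have carved out the <ough> portion of each one is my own rough segmentation (an assumption), and `group_by_sound` is just an illustrative helper name.

```python
from collections import defaultdict

# Word -> the sounds that <ough> contributes, taken from the transcriptions
# in the list above. The segmentation is an assumption made for illustration.
OUGH_WORDS = {
    "tough": "ʌf",
    "though": "oʊ",
    "through": "u:",
    "thought": "ɔ:",
    "thorough": "oʊ",
}


def group_by_sound(mapping):
    """Invert a word -> sound mapping into sound -> list of words."""
    groups = defaultdict(list)
    for word, sound in mapping.items():
        groups[sound].append(word)
    return dict(groups)


groups = group_by_sound(OUGH_WORDS)
for sound, words in groups.items():
    print(f"/{sound}/: {', '.join(words)}")
print("distinct realizations of <ough>:", len(groups))
```

One spelling, several sounds: that one-to-many relationship between letters and phonemes is exactly what makes the system so hard to learn, and the same inversion run in the other direction (sound to spellings) would show the many-to-one side of the mess.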
To every rule, there is some exception.  For every word like “photograph,” there are words like “haphazard” and “Stephen.”  Chore?  Choir and charade.  Singer?  Finger and angst.  Bureau?  Bureaucracy and beauty. Of course there are hundreds of homophones that are spelled (or spelt) differently but are pronounced the same, like write/rite/right, prince/prints, or soared/sword.  Those are decently common across languages.  It is quite rare, however, to find a language with anywhere near English’s number of heteronyms that are spelled the same and yet pronounced differently, such as having a gaping wound as opposed to being wound up, the bow of a ship or a bow and arrow.  It is not a moderate task to moderate discussions of English spelling.  Everyone knows about our silent “e,” but then there are things like silent “b” (debt, comb), silent “p” (psychology, pneumonia), silent “t” (castle, listen, soften, not to mention words that vary like “often”), silent “s” (island, debris), silent “l” (salmon, talk), and it just goes on and on like that.  It is genuinely pretty hellish, but are other languages any easier?  Are there even more horrific systems?  How does English orthography really stack up against the rest of the world’s written codes?  Well, first we need to survey the landscape and see what’s out there.
Before that, though, if you’re hearing the word “orthography” for the first time, it simply means the whole system of writing.  It comes from the Greek stems orthós-, meaning “standard,” “legal,” or “correct” (think “orthodox”), and -graphéin, meaning “to write.”  Linguists tend to use the term “orthography” rather than “spelling” for two main reasons: first and foremost, it makes us sound smarter and thus feel better about ourselves.  Second and almost equally important is the notion that a strict definition of “spelling” is relatively narrow in scope.  Spelling means the way in which words are written using letters and diacritics (which are accent marks like in résumé), and therefore doesn’t necessarily apply to all languages that have writing.  In contrast, orthography is a very broad term that encompasses many aspects of writing.  Here’s a simple graph:
Put simply, spelling is one part of orthography, but spelling is not everything you’d have to learn in trying to master English writing.  If I “spell out” some of the additional features that orthography covers, it might give a good idea of how daunting it could appear to would-be learners:
  • Punctuation and other unpronounced characters
    • Some languages don’t (or at least, didn’t) use punctuation at all, which might make an adjustment to a fully-punctuated system more taxing. Where there is punctuation, rules on usage are often quite complex, and they differ across languages, even between sister languages like French « Allons-y ! » and Spanish «¡Vamos!»
    • Languages that use punctuation often use it to contrast meaningfully different phrases, like the famous “A woman without her man is useless” and “A woman: without her, man is useless.”  Those differences aren’t universal; they have to be learned
    • Emoticons and other non-letter characters like ellipses and dashes often carry meaning or perform important discourse functions, especially in informal contexts.  Anyone who has ever tried learning the stupefying array of Japanese emoticons knows the true meaning of despair. I mean, for crying out loud, there’s a specific emoticon for the act of playing volleyball! (/o^)/ °⊥ \(^o\)  When I first got that in a text message, I didn’t read that as it was intended: an invitation to go to the beach. I thought someone had to go to the hospital
  • Orientation
    • Just to list some examples, English, Russian, and Inuktitut are written horizontally from left to right
    • Arabic and Hebrew are written horizontally right to left
    • Classical Mongolian is written vertically left to right
    • Chữ Nôm (Old Vietnamese) is written vertically right to left
    • Modern Mandarin and Japanese are written in multiple different orientations depending on the format and level of formality
    • Egyptian hieroglyphs were prototypically horizontal and right to left, but varied based on other stylistic factors, and some characters were read in their own special order
    • There are yet other ways of orienting writing, too
  • Penmanship, scripts, and written styles
    • There are sometimes large differences between the way (or ways) in which things are written in the same language depending on the people, place, purpose, and time.  When people study a Chinese language, for example, they have to learn that type-font characters like 書道 can appear radically different in hand-written forms, but they have to recognize that the intent is the same.  Beginning learners of English are often confused by the Times New Roman “g” and “a/ɑ”
    • Block-print and cursive writing are other good examples in English, but even within cursive, there are different styles like Spencerian and D’Nealian, and there are often idiosyncrasies across languages
      • French cursive “1” often has the hook extending all the way down such that, to North American eyes, it looks almost like “Λ”
      • Japanese teachers of English were taught for many years to write “s” with a hook such that many still write it as “ʂ,” which is a different letter in other languages
  • Alternate characters and characters for special purposes
    • In English, accountants and mathematicians often write 0 (as well as 7 and z) with additional slashes to disambiguate or prevent fraud, but Ø is a separate letter in Danish and Norwegian, for example, so accountants there tend to write 0 with a dot in the center instead.  Similarly, many Japanese legal documents write the numbers 1, 2, and 3 as “壱、弐、参” instead of the more general “一、二、三,” and Chinese ones do the same with “壹、貳、參”
    • Roman numerals (like MMXII) are another part of our writing system that has to be learned in order to be fully literate
For some languages, parts of orthography fall into spelling, too, such as:
  • Spacing
    • Some languages, like modern Korean, put spaces between words; others, like Japanese and Chinese, do not
    • Some languages or language varieties that put spaces between words write compound words as one continuous sequence. German is probably the most famous example with its words like “Geschwindigkeitsbegrenzung,” which English spaces out into “speed limit”
    • English is wildly inconsistent on this one, though. We write fireman and highway in unbroken strings, fire truck and high school with spaces. Rollerskate, roller-skate, and roller skate are all well-attested in corpora, and even the snootiest of dictionaries often have multiple listings
  • Capitalization
    • Most writing systems are unicameral, meaning that they don’t have upper- and lower-case letters, but even closely-related languages with bicameral scripts often differ wildly in this respect; English doesn’t capitalize every noun anymore, but German still does
    • English is perhaps unique among languages in requiring that the pronoun “I,” but not “we,” “you,” or any other pronoun for that matter, is always capitalized
    • Conventions change over time. If you have an old version of Word (which is capitalized) and you run a spellcheck (no space), you might get prompted to capitalize words like “internet,” but that’s no longer the case (if you got that last pun, you can join me in tears of shame)
    • We can capitalize words mid-sentence for emphasis or to make a contrast, such as religion/Religion, or we can sometimes use ALL CAPS
    • French, English, and several other languages also have an interesting trend in the opposite direction, writing common acronyms in all lower-case (e.g. AIDS in French is “sida”; the word “radar” started as an acronym)
Even after all that, the most important, and in some ways the most obvious difference between the terms “orthography” and “spelling” is that “orthography” can refer to more types of grapheme systems. (A grapheme is just a unit of writing)  For example, it would be a little strange to talk about “spelling” for logographic writing systems like Chinese characters.  There are different ways to represent characters in Chinese, and to be sure, there are strict rules for how to write them with correct stroke order and such, but the characters don’t directly or consistently represent sounds per se.  In fact, there are quite a number of different types of written systems:
  • Alphabets like Georgian (there are actually three different Georgian alphabets) or Korean, where each character (or component part of the character) represents usually only one sound, although even the purest ones like Korean have exceptions and situational rules
    • English is alphabetic in a sense, but there are only two letters that consistently represent only one sound, <v> and <q>. Both of those represent sounds that are sometimes also written using other letters, and there are an increasingly large number of foreign loanwords that break even those patterns
  • Abjads like Arabic or Hebrew, where the consonant sounds are written but some or all of the vowels are left unspecified, and the reader has to figure it out from context
  • Syllabaries where each character represents a whole syllable, not just a sound, which can be further divided into:
    • Abugidas like Ge’ez in Ethiopia where relations between related syllables are shown by added marks or changes to the same base character.  For example, in Inuktitut, the language of the Inuit people in North America, /ki/, /ka/, /ku/ are written ᑭ, ᑲ, ᑯ
    • “Arbitrary syllabaries” like Cherokee or Japanese kana where there is no relationship between similar sounds. Using the same syllables, /ki/ /ka/ /ku/ in Japanese hiragana are written き,か,く
  • There are yet other types of absolutely crazy hybrid systems like Mayan that used logograms and syllabics and plentiful rebuses (think of things like “gr8” for “great”) all together in the same characters.  How Mayans were able to read without having an aneurysm is beyond me.
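The point above about Korean, where each component part of a character stands for one sound, happens to be visible in how computers store text. Here is a small Python sketch using the standard `unicodedata` module. Bear in mind that it shows how Unicode encodes the characters, which is only a rough proxy for how the scripts themselves work, and the helper name `letter_names` is made up for illustration.

```python
import unicodedata


def letter_names(text):
    """Decompose text with Unicode NFD and return the name of each resulting code point."""
    decomposed = unicodedata.normalize("NFD", text)
    return [unicodedata.name(ch) for ch in decomposed]


if __name__ == "__main__":
    # A Korean syllable block splits into its component letters (jamo).
    print(letter_names("한"))
    # A Japanese hiragana character is a single unit with no internal parts.
    print(letter_names("か"))
```

The Korean block 한 comes apart into three letters (roughly h, a, n), while the hiragana か stays as one indivisible unit for the whole syllable /ka/.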
So after all that, there’s quite a lot to consider, really, when we try to compare written systems in terms of difficulty level.  In truth, any objective comparison is flat-out impossible.  How could we compare English to a system like Hong Kong Cantonese that has more than 20,000 characters in common usage, some of which are just brutal like 戲劇 (which means “drama”), but which are nevertheless regular, represent units of meaning and not always sound, are pronounced almost always in only one way, and are composed of a limited set of simpler parts?  Put simply, we can’t.  They’re completely different challenges.  So, instead, I’ve arbitrarily limited my list to languages that either use syllabaries or alphabets, I’ve kept the list to languages that are alive today (otherwise, in my opinion, Mayan is hands-down the most astonishingly complex orthography ever devised), and I’ve excluded exceedingly rare languages or scripts like Afaka.  The remaining list is, I think understandably, not perfect, but I challenge you to come up with a better one, haha!
Last note: I have a few runners-up.  First, Thai orthography is pretty rough.  There are several characters borrowed from Sanskrit that are pronounced the same as other Thai letters but which are not interchangeable.  Thai is also interesting in that it marks tone in writing.  Its eastern neighbour, Khmer, from which large parts of Thai script are derived, is also highly complex and irregular, with lots of variability depending on the surrounding sounds.  Both of these are, however, not quite so irregular and sadistic as to make it into my subjectively-judged top 5.  The reason is simple, and also explains why English does make the list: spelling reform.
Most languages have undergone at least one, and often multiple spelling reforms, usually because the government or another authoritative body wants to standardize the language and modernize it.  Languages change over time, and pronunciation in particular is highly variable, but ink blots printed on a page don’t often change shape in response to social trends.  In French, the Académie Française was created during the time of Louis XIII to standardize the language.  They publish dictionaries, make recommendations for school curricula, and have helped to rein in the chaos (note: rein, not reign, or rain).  The French language gets a lot of flak for words like “ils accueillent” where half the letters are silent, and then words like “lent” where most of those same letters are pronounced, but once you understand a few rules about grammatical categories, reading is not so bad.  Linguists call this a distinction between “encoding” (writing) and “decoding” (making sense of that writing, usually reading), and while French is difficult to encode, the decoding process is much more doable.  That doable-ness is largely thanks to the reforms enforced (sometimes even through violence!) by the French powers that be.  Khmer, Thai, Spanish, Russian, Italian, Swedish, Mandarin, Korean, Hindi, Mongolian, indeed a great many of the world’s languages have undergone systemic reforms to try to make the writing system more regular than it was before.  In fact, another “runner-up” for me would be Danish, which is a particularly interesting example since it’s so closely related to Swedish but hasn’t gone through the same kinds of spelling reforms.  As for English, there have been multiple attempts, but the only major influential spelling reform (Noah Webster’s) only succeeded in creating a chasm between two separate standards with equally absurd inconsistencies and idiosyncrasies.  Hopefully, that provides enough context and caveats.  Here, finally, is my top 5 crazy spelling systems:
5) Uyghur
  • Uyghur has four completely separate alphabets that are all standard in modern usage, and historically there have been quite a few others. It’s one of very few languages written in a Persian-derived (Perso-Arabic) script to obligatorily represent vowels, and there are quite a few of them: /y/ /ɪ/ /ø/ /æ/ /ɑ/ /u/ /e/ /o/, but the real irregularities come from its many Chinese loanwords that are often quite difficult to distinguish from other words and follow their own set of phonological rules
4) Burmese
  • The fact that this language’s orthography doesn’t match up with its pronunciation is a point of pride for some Burmese nationals.  There are even different words for “written language” and “spoken language” that mark the distinction.  In many ways, literate Burmese people can be said to practice diglossia, the command of multiple different dialects for different functions.  The spelling system is more or less regular viewed from the inside, but its difficulties and irregularities come from being written in stone hundreds of years ago and far removed from what has happened with the language since then
3) Irish Gaelic
  • The language is growing after it had declined in previous generations, but the writing system reflects its diverse and chaotic history.  There are digraphs and complicated allophonic rules up the wazoo, such that the word for Prime Minister, Taoiseach, is pronounced /ˈt̪ˠiːʃəx/.  A combination of having no standard spelling until the mid-20th century and huge dialectal variation, especially in vowels, has created cases where the orthography is regular in some regions for some words, and in others for other words, but nowhere for all of them
2) English
  • Truly, English is among the craziest spelling systems in the world.  Ruth Shemesh and Sheila Waller’s book explains a great deal of the subtle regularities, and I highly recommend it, but even they can’t make sense of the huge number of exceptions that continue to grow by the day.  There are some theorists who believe that English is becoming more and more like a logographic system, where basically each word has to be memorized as one chunky symbol with component parts, rather than analyzing each word internally from left to right.  That’s not far off, and yet English, I think, still only gets the silver medal in my book
1) Japanese
  • Interestingly, many of the same historical reasons for English’s, shall we say, “diversity” of spelling rules are shared by Japanese: several, chronologically-disparate waves of mass importation of foreign words, several of which use entirely foreign writing systems that were only sometimes, and then only partially, regularized.  Japanese uses three largely independent writing systems together, or increasingly four if you include roma-ji, which many academics do because of words like t-shirt, “Tシャツ,” and acronyms.  Two of those systems are mostly faithful phonetic syllabaries, but there are exceptions like particles.  The third system has more than 2,000 characters in common usage, each with numerous, largely unpredictable readings depending on when and where the word originated.  All three are commonly used together in the same sentence, and combinations of two are often together in the same word: サボる (to skip class), アメリカ的 (American-style).  Pitch accent, which is contrastive for most dialects, isn’t marked anywhere (e.g. the “three hashi”: 端、橋、箸), and all the while many other characters are written with several different variants despite no meaningful phonological or even semantic contrast in any dialect (again, I’ll use “hashi”: 橋、槗).  Like English, there are tons of heteronyms, sometimes among quite frequent words like 甘い(umai, delicious / amai, sweet), 辛い(karai, spicy / tsurai, painful), and then we get into proper nouns and fossilized expressions and what little regularity was left in the system breaks down completely
So, in my opinion, when we ask how bad English spelling is compared to other languages, the answer is: among the worst.  There are other, frighteningly complex systems out there, to be sure, but English finds a way to take its deceptively simple 26 letters and make the absolute most it can out of them.  As a final note, I think it’s important to say that, by “worst,” I don’t mean to say that we should look down on English orthography, or Japanese orthography for that matter.  In fact, in my mind, I could have equally replaced the “most difficult” in the title with “most interesting.”  Irish, English, Uyghur, Burmese, Japanese, and even French are fascinatingly rich and complex, and indeed in many ways I think they are tremendously valuable in their idiosyncrasies.  The spelling of a word in these languages contains an immense amount of information; we can know just by looking at a word like “know” that it is Germanic in origin, whereas a word like “ascertain” came from Latin through French, which in turn tells us more about the connotations of those words, the usage patterns, and the history of English-speaking people.  We’d have “no” way (or at least, no immediately apparent visual way) of doing that if we spelled no and know identically.  These “top 5” are a testament to society and language’s ability to evolve and thrive within the infinitely complex interactions of people and peoples, and while that does imply a lot of baggage, I for one am not upset about having all that stuff to cart around.  I like stuff.
So yes, English spelling is a handful, but it’s not alone, and that’s a good thing.  Perhaps, instead of denouncing others for their poor spelling of a choice few words, we should in fact celebrate the fact that people get so many other highly irregular, largely nonsense spellings correct!  (Like, for instance, “people”)  At the very least, if you deal with foreign learners of English, I hope you can sympathize (or even sympathise) with their struggle.  They truly do have it pretty bad.