Tibetan translation

"Meaning" isn't found in grammar

Meaning isn't in words; it isn't in the grammar; language isn't literal.

Take, for example, the innocuous question: "What do you do?"

I've seen this translated literally into Tibetan as: "ག་རེ་བྱེད་ཀྱི་ཡོད།". Now, this obviously misses the mark, and you might be tempted to say it has something to do with the additional "do" that was "not translated".

But this is what's called a "meaningless do". Compare, for example: "How do you do?" to "How are you doing?". The difference in meaning is one of degree (of formality), not of kind. The meaning is the same: "How are you?".

Whereas "What do you do?" has an entirely different meaning from "What are you doing?" — the first being "What's your occupation?" and the second meaning "What are you up to?".

1.) How do you do? = How are you doing? 
2.) What do you do? ≠ What are you doing?

This is a twist on Chomsky's famous example:

1.) John is easy to please = It’s easy to please John
2.) John is eager to please ≠ It’s eager to please John

The point is this: Meaning is not encoded in grammatical syntax. Nor is it found in bare vocabulary (see my previous post on how and why). Language is not literal. 

Thus, we cannot simply use a "words plus grammar" approach to translate languages (Grammar Translation). "Having all the words" does not ensure accuracy. "Having all the meaning" is what ensures accuracy!

If we cannot assess accuracy by simply (and literally) ensuring that each source word is rendered as a target word, how do we do it?

One method is back-translation. We can translate from target to source to see if meaning is retained. (In an ideal and objective assessment, a second translator unfamiliar with the source text is provided the target, and translates back to the source; the translations are compared for meaning).

In our example, "ག་རེ་བྱེད་ཀྱི་ཡོད།" is back-translated as "What are you doing?" — and when compared with "What do you do?", we can see the mistake...
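The round trip can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the two lookup tables stand in for two human translators, the names `literal_forward`, `independent_back`, and `back_translation_check` are invented for this sketch, and comparing strings is a crude stand-in for a human comparing meanings.

```python
# Hypothetical stand-ins for two human translators (not a real translation API).
# The sentence pair is the example discussed in this post.
TIBETAN = "ག་རེ་བྱེད་ཀྱི་ཡོད།"

literal_forward = {"What do you do?": TIBETAN}       # translator 1: source -> target
independent_back = {TIBETAN: "What are you doing?"}  # translator 2: target -> source


def back_translation_check(source, forward, backward):
    """Translate source -> target -> source and report whether the text survived."""
    target = forward[source]
    round_trip = backward[target]
    # Exact string equality is a crude proxy; a real review compares meaning.
    return target, round_trip, source == round_trip


target, round_trip, retained = back_translation_check(
    "What do you do?", literal_forward, independent_back
)
print(round_trip, "| retained:", retained)
```

The useful part is not the code but the shape of the check: the second translator never sees the source, so whatever comes back reflects only what the target text actually says.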

The Myth of the Literal Translation

In the popular imagination, words are real things that stand for real objects. The word “coffee,” for example, stands for the real object “coffee.” The coffee is here in my cup, and I am drinking it. What could be simpler? And what could be more real? Words, and the objects they stand for, are real, concrete entities. They are clear-cut and well-defined. And this is the world we believe we live in.

In this world, translation is the simple task of replacing one word in one language with one word in another. An accurate translation is easy to produce: Each well-defined source word is replaced by an equivalent well-defined target word. If each word in the source language is transferred into the target language, it is “accurate.” That’s definitional.

We even have evidence of the well-defined nature of words. This evidence is collected and spelled out in our dictionaries. Each word has its definition, and it refers to something real. After all, that real something is what we are talking about when we use that word! For the translator, we have evidence of the well-defined nature of word-to-word correspondences. These are collected and spelled out in our translation dictionaries.

This is the kind of story we all tell ourselves. And this simplified version of “what language is” works well enough in the everyday lives of everyday people. What we don’t realize, however, is that words lose their meaning in isolation—that merely the act of uttering them within an everyday context is what imbues them with meaning. And to be a language professional necessitates that we wrestle with these ideas seriously...

A building is in the process of being erected (shīgōng zhōng, 施工中). Of course, it would be absurd to claim that the hilariously translated “Erection in progress” is accurate, even though it is “literally correct.” A translation sensitive to contextual meaning would be: “Under construction,” even though the word "under" (xià, 下) doesn't literally appear in the Chinese...

Words themselves aren’t literal

The biggest stumbling block for literal translations is the simple fact that words themselves aren’t literal. Words aren’t real “things” (at least not in the way we think they are) and they don’t stand for real objects (at least not in the way we think they do). Words are ambiguous, contextual, metaphorical, and more. Like every other phenomenon, words are dependent, not independent. We can see that words are contextual, not literal, by taking a closer look at what words depend on for their meaning:

  1. Words depend on physical contexts

  2. Words depend on sociocultural contexts

  3. Words depend on linguistic contexts

  4. Words depend on their language

1. Words depend on physical contexts

The most obvious way that words betray their un-literal-ness is that they don’t refer to “one thing.” Instead, they refer to a pattern of things: a class of things that share something in common. Any one word represents a whole range of things, and there are always borderline cases. If we’re lucky literal-ists, a word only applies to one thing... But usually, words also have multiple “things” that they refer to. For example, "LAP":

  1. the flat area between the waist and knees of a seated person
  2. one circuit of a track or racetrack
  3. (of an animal) take up (liquid) with the tongue in order to drink
  4. ...
  5. ...

Let’s take a look at some more examples. Let’s say I utter the simple phrase, “Oops! The coffee spilled. Bring me a mop.” And now, you have an image of spilled coffee in your mind. But now I change the physical context of my utterance. I say, “Oops! The coffee spilled. Bring me a broom.” Unless you’re in the habit of brewing your own coffee, or, like me, have spent time working in a coffee shop, the second context might not bring the right image to mind. But it looks something like this:

The coffee spilled; bring me a mop.

The coffee spilled; bring me a broom.

The word “coffee” (like all words) is actually ambiguous. It is the pattern of things that are somehow coffee-like. It’s the beans and the grounds and the liquid brewed from them... If I say, “I’m gonna go pick up some coffee” you literally don’t know exactly what I mean. You need more information. The reason we don’t realize words are ambiguous in this way is that we generally use them in contexts. What makes their meaning clear isn’t the words themselves, but the contexts they are used in.

For example, I might say, “I’ll pick up some coffee at the store.” Now you know I probably mean beans or grounds, and not the hot beverage. “The store” gives some context clues. The key here, though, is that you need to be in on the context clues I’m providing. You need the requisite linguistic experience that has primed you to interpret the word “store” the same way I do. After all, there is nothing inherent in the word “store” that precludes it from being a place that sells brewed liquid coffee!

All words are used by human beings, and all human beings are embedded in a culture. Our language itself is passed down to us. It is our cultural heritage—not an individual invention, but a shared, cultural good. So, of course, the most common way to be in on it is to have similar experiences of similar objects by being embedded in the same culture... 

2. Words depend on sociocultural contexts

The second way in which words betray their lack of literal meaning is that they vary according to sociocultural context. We rarely spell out exactly what we mean (we're rarely "literal"). Instead, we expect others to be sensitive to the contexts in which we use words. When someone asks, "How are you?" the response is nearly automatic: "Fine, how about you?" The question isn't literal, nor is the answer. The person asking generally has no interest in how you actually are, and neither would you tell them that things are going terribly (even if they were).

That's because the question is not a "literal" question. Its real function is social in nature. Its meaning is something like: "a polite recognition of your existence in a shared space where an interaction is about to take place." The same goes for the reply. It serves as a sort of signal between the two of you that, hey, I'm a decent person, and this interaction is going to go just fine. It works to simplify something that could be complicated and messy (the interaction of two people) into something simple and smooth (an everyday interaction).

For another example, I think we're all aware of the difference between "the kind of language we use with our friends" and "the kind of language we use with our parents." Or how about "the kind of language we use with strangers at a formal event"? Part of being a well-socialized human is knowing what kind of language to use where. Word meanings shift according to social context: a word that's funny or natural with friends might be insulting or embarrassing to use with your parents and completely unacceptable in polite company...

Words are not literal because they are things that carry messages. Messages, by their nature, carry information from one person (the speaker or author) to another (the listener, reader, or audience). These people are active participants in assigning meaning to words. In other words, "words" don't exist out there all by their lonesome. They exist in relationship to speaker and audience. They are dependent, not independent! 

3. Words depend on linguistic contexts

First, let's note that "words" can vary in meaning depending on the words they are used with.  Let's begin by asking ourselves a very basic question: Why are we worried about "words" at all? After all, a "word” is an ambiguous and arbitrary unit of meaning; units of meaning can be larger and smaller than “a word.” A "word" can be used alone—lap—in a compound word—laptop—or within a phrase—lap of luxury—and communicate something different each time! 

Words even have connotations that are sometimes hidden. They might "literally" have a neutral meaning, but be used exclusively in negative contexts! In the linguistics literature, this is called semantic prosody. For example, the English word "spread" literally means "to expand or extend across an area." And yet, an investigation of how it is used in context shows us that it has a negative semantic prosody. We talk about diseases, cancers, and viruses as "spreading." When we talk about positive things, we use words like "growing" or "advancing" or "expanding!" 
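One rough way to investigate how a word is used in context is to count the words that appear near it. The sketch below is a minimal version of that idea. The function name `collocates`, the window size, and the three toy sentences are all invented for illustration (the sentences simply echo the disease, cancer, and virus examples above); they are not corpus evidence, and a real study would run the same count over a large corpus.

```python
import re
from collections import Counter


def collocates(sentences, node, window=3):
    """Count the words appearing within `window` tokens of any form of `node`."""
    counts = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for i, token in enumerate(tokens):
            # startswith() also catches "spreads" and "spreading"
            if token.startswith(node):
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


# Invented toy sentences, NOT real corpus data.
TOY_SENTENCES = [
    "The disease spread through the village.",
    "The virus is spreading quickly.",
    "The cancer had spread to the lungs.",
]

print(collocates(TOY_SENTENCES, "spread").most_common(5))
```

If the neighbours that keep turning up are overwhelmingly unpleasant, that is the pattern the term "semantic prosody" describes.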

If we're translators, we need to understand these relationships between words within each of our languages. We need a sense of word connotation, not just of literal meaning, or we may miss an inference the author is making; we may choose an inappropriate translation term; or we may misunderstand the source text (by mistaking a word's secondary use for its primary use, for example).

4. Words depend on their language

On top of all that, even if words were literal, they never "literally" mean anything in another language. Languages are self-contained worlds, worlds that stand on their own, and are self-referential. When I speak in English, I don’t “mean anything” in Tibetan—my words make sense in the English-only context in which they were uttered. Tibetan utterances, likewise, have no inherent English meaning—they only have a Tibetan meaning.

What I’m getting at here is that reality, whatever it is, is a vast sea of undifferentiated phenomena. And that our cognition, it seems, somehow does the work of filtering and categorizing it. Our mind is constantly searching for, and then constructing, patterns out of our raw sensory data. From this perspective, language is a tool we’ve inherited from our ancestors that helps us match patterns, and communicate them (in useful ways) to our companions.

The key point, though, is that languages have solved the problem of “how to split up reality into useful objects” in different ways. And that means words across languages vary in the way they draw boundaries, even in the physical world. These variances aren’t just superficial; they can be fundamentally different, even at the deepest, most subconscious level. 

As we've already discussed, words are inherently ambiguous. Again, if I say, "I'll go get some coffee," you literally don't know what I mean (unless I give you the context, or unless the context is given by experience). Similarly, if I exclaim, "Ow!" you know that I encountered a sudden pain. But, there is still ambiguity—you don't know what kind of sudden pain I experienced. Was I burned? Pinched? Hit? Did I knock into something? 

In Tibetan, there are two possible translations of "ow!"—ཨ་ར་ and ཨ་ཚ་! When a Tibetan speaker encounters a sudden pain that is sharp or hot, they spontaneously utter ཨ་ཚ་ (atsha). If that pain is dull or blunt, the word used is ཨ་ར་ (ara). The point is that all words in all languages are always ambiguous to some degree. Reality is too complicated to be otherwise! But, all words in all languages aren't ambiguous to the same degree! In this case, Tibetan is less ambiguous than English: 

[Image: slide from the student presentation "What is language? + Tibetan & more"]

This puts us in quite a bind. To translate "literally" from Tibetan into English, simply saying, "Ow!" leaves information out that is encoded in the Tibetan. Yet, putting more words in (to explain the context) means adding words that literally aren't there! In other words, there is no way to translate literally because there is no literal correlation between languages! 
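The bind can be made concrete with a small sketch. The two Tibetan words and their glosses come from the paragraph above; the dictionary layout and the function names `to_english` and `back_to_tibetan` are assumptions made up for this illustration.

```python
ATSHA = "ཨ་ཚ་"  # uttered for sharp or hot pain
ARA = "ཨ་ར་"    # uttered for dull or blunt pain

# Each Tibetan word maps to (English rendering, what the Tibetan also tells us).
TIBETAN_TO_ENGLISH = {
    ATSHA: ("Ow!", "sharp or hot pain"),
    ARA: ("Ow!", "dull or blunt pain"),
}


def to_english(tibetan):
    """Word-for-word rendering: the kind of pain is silently dropped."""
    return TIBETAN_TO_ENGLISH[tibetan][0]


def back_to_tibetan(english):
    """Every Tibetan word that the English rendering could have come from."""
    return sorted(t for t, (e, _) in TIBETAN_TO_ENGLISH.items() if e == english)


print(to_english(ATSHA) == to_english(ARA))  # both collapse to the same English
print(len(back_to_tibetan("Ow!")))           # number of Tibetan candidates
```

Because two different Tibetan words collapse into one English word, the mapping cannot be reversed without adding information that the English word does not carry.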

It's an impossible standard because words themselves simply aren't literal. They depend on the physical context in which they are uttered, the sociocultural context in which the author and audience are embedded, the linguistic context of the other words they are being used with, and finally, the specific language in which they are being used.

Automating "Translation"

I've written a very simple script that "translates" Tibetan—or rather, it does as much Tibetan work as many of us do in the university-level classroom.

I'm going to pick on Rockwell here, because that's the book I was taught Tibetan from. All I've done for the script is input the glossary from Rockwell—the script simply converts the Tibetan symbols to Rockwell's given English equivalent.

In other words, I've outsourced the tedious work—work every student is taught to do in the grammar-translation classroom—of "using the glossary to look up words one-by-one" and then "write out the English words one-by-one," and given those jobs to the computer. 

What I've done is given the computer the following tasks:

  1. Recognize the Tibetan symbol(s)
  2. Look the symbol string up in the glossary
  3. Replace the Tibetan symbols with English symbols
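The three tasks above can be sketched roughly as follows. This is not the actual script, and several things are assumptions: the names `GLOSSARY` and `gloss_line`, the greedy longest-match strategy, and the `[?…?]` marker for unknown syllables are all invented here. The five glossary entries are copied from the sample output in this post, not from the full Rockwell glossary.

```python
TSHEG = "\u0f0b"  # the dot that separates Tibetan syllables
SHAD = "\u0f0d"   # the vertical stroke that marks a clause or verse line

# Hypothetical mini-glossary keyed by syllable sequences.
GLOSSARY = {
    ("སངས", "རྒྱས"): "Buddha",
    ("ཆོས",): "dharma",
    ("དང",): "*AND*",
    ("དགེ", "འདུན"): "Buddhist community, sangha",
    ("ལ",): "*TO(la)*",
}


def gloss_line(line, glossary=GLOSSARY, max_len=3):
    """Replace Tibetan syllable strings with their bracketed glossary entries."""
    # 1. Recognize the symbols: drop shad marks, split on the tsheg.
    syllables = [s for s in line.strip().strip(SHAD).split(TSHEG) if s]
    out = []
    i = 0
    while i < len(syllables):
        # 2. Look up the longest run of syllables that the glossary knows.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            chunk = tuple(syllables[i:i + n])
            if chunk in glossary:
                # 3. Replace the Tibetan symbols with English symbols.
                out.append(f"[{glossary[chunk]}]")
                i += n
                break
        else:
            out.append(f"[?{syllables[i]}?]")  # no entry: flag it and move on
            i += 1
    return " ".join(out)


print(gloss_line("།སངས་རྒྱས་ཆོས་དང་དགེ་འདུན་ལ"))
```

Note what the sketch never needs: any idea of what the line means. A fuller version would also have to split case particles that are fused onto a syllable, which this one does not attempt.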

Using the Script

For example, I can feed the following symbol string to the computer:

Input: 

  1. །སངས་རྒྱས་ཆོས་དང་དགེ་འདུན་ལ
    །སྒོ་གསུམ་གུས་པས་སྐྱབས་སུ་མཆི KP 3A:6 (verse)
  2. ་་་བླ་མས་བུ་ཆེན་རྣམས་ལ་ཆོས་དང་གདམས་ངག་གི་དཀོར་མཛོད་རྣམས་ཁ་ཕྱེ་་་ MINT 88:11-2...

What the script spits out is:

Output: 

  1. ། [Buddha] [dharma] [*AND*] [Buddhist community, sangha] [*TO(la)*]
    [./,/;/:/?/!]
    [three gates = body, speech, mind] [respect, devotion] [*AGENT/INST*] [refuge] [*TO*(ladon)] [to go] KP 3A:6 (verse)
  2. ་ [lama] [*AGENT/INST*] [son] [great] [*PLURAL*] [*TO(la)*] [dharma] [*AND*] [oral instruction] [*OF*] [treasury] [*PLURAL*] [pf. to open] ་ MINT 88:11-2...

Now I'm ready to "translate"! Using the grammar clues [*GRAMMAR*], all I need to do now is re-arrange the English words into a sentence that makes sense. (This is exactly what many Tibetan students are still doing to this day!)

But wait... I have to ask, couldn't I give this output to just about any native English speaker, even one who doesn't know a lick of Tibetan? With a few simple instructions (try reading it backwards; word order is SOV; etc.), they, too, could begin "learning Tibetan" without ever actually seeing even one word of Tibetan.

The question I'd like Tibetan teachers and students to think very hard about is this: If the computer can easily provide the English for us, what "Tibetan" work is really being done?

Reading

I've mentioned in a past post that "reading" is a complex process that requires: 1) decoding and 2) comprehension. What the script does is a version of the first bit of work for the student: decoding.

And, I'm arguing, that's all the Tibetan work a student can do in the Grammar-Translation classroom anyway. When we're taught in English, and we decode Tibetan into English, all of our comprehension necessarily comes in English! We never learn how to comprehend Tibetan.

If students don't learn how to comprehend the source text (to truly read it), how can we expect them to learn how to translate?! If students can't learn how to think in Tibetan, how can they begin to understand an author who thinks in Tibetan?!

Machine Translation

One final point: This is not machine translation. Even if we started asking the computer to re-arrange the words according to the grammar (which we could do), it wouldn't be anything like how modern MT (machine translation) works.

Early on, this was exactly the method computer scientists tried to use to build working MT systems. But natural languages are idiomatic; they break their own rules; they are metaphorical, not literal; they are, in the end, so much more than just bare vocabulary and grammar that this approach simply doesn't work.

So much so that working MT systems, like Google Translate, are built on other methods: statistical analysis of multi-word sequences and, more recently, neural networks trained on large amounts of already-translated text.

If even machines can't make word-by-word translation work (and they are much, much faster at looking up words in the glossary than humans are), then why do we still expect Tibetan students to?!
