Tibetan

Myth of "Silent Letters"

There is no such thing as "silent letters"; there are only "outdated spellings". 

I'm sure you remember, like I do, learning how to spell using "silent letters". Like the "k-" in "knife", "know", and "knight". Or the "-gh-" in "though", "through", and "thought". 

But let's stop for just a second. Whose evil idea was this? Are we putting in random silent letters just to torture young children who are trying to learn how to read? Just to laugh when they try to sound out words with sneaky spellings? 

As it turns out, no. These aren't silent letters, but outdated spellings. Outdated spellings are ones that reflect how words used to be pronounced in English. They are spelled that way because they were pronounced that way! It's us moderns that are either 1) saying them wrong or 2) spelling them wrong. 

Let's look at a few examples: 

"Daughter" has a spelling that shows its Germanic roots: in German, the word is spelled (and pronounced) "Tochter". 

We can easily see that English (a Germanic language) used to pronounce the now-silent "-gh-" almost exactly as German still pronounces its "-ch-". 

The word "knife" was borrowed by the French before the "k-" became silent—it's now the word for "pocket knife", "canif" (pronounced with a hard "c"). 

American spellings were, in part, an attempt to address some of these issues, which is why we have "draft" replacing "draught", among others. Before spelling standardization, all American and British spellings had their roots in, believe it or not, Britain. But when the Americans tended towards "English" spellings that reflected English pronunciations, the Brits reverted to French spellings for French words. Which is how we get pairs like "color" and "colour", and "program" and "programme". 

In any case, it's a universal rule that speech naturally changes and evolves. Writing, on the other hand, changes much more slowly. But the more of these outdated spellings that pile up, the harder and harder it gets to learn how to read. Children in English-speaking countries have a more difficult time learning to read than German-speaking children, for example. That's because German is spelled how it is pronounced! 

Tibetan has the same issues. 

You'll notice that, if you've learned Lhasa or Central Tibetan, there are a plethora of "silent letters". The fact is that these weren't always silent letters. 

Just as spoken English has changed over time, leaving us with outdated spellings, Tibetan, too, has changed over time. The Central dialects have "silenced" many of the prefix letters, just as English silenced the "k-" prefix letter. 

Not all Tibetic languages are pronounced the same, however. This video is a bilingual, audio version of names of fruit in Chinese and Tibetan—I like it because we can see how the Tibetan is spelled alongside its pronunciation. You'll very clearly hear letters pronounced in this speaker's dialect that wouldn't be pronounced in Central Tibetan!

(On that note, I believe it's an Amdo dialect; if anyone can identify it with a bit more precision, please let me know and I'll add it here). 

 

Reading: What is it, really?

We all know what reading is, right? It's that thing you're doing right now. But do you really realize what you're doing right now? 

There are a few assumptions about reading that we all have. Most of us are used to reading being silent, private, easy, and fast. We're so well-practiced, that we don't realize what it took to get to the point where reading was simply second nature to us. 

Reading isn't silent!

But reading is a complex process. It takes many skills working in tandem. Although many of us today read "silently", what we are doing isn't silent—it's just very very quiet! 

Perhaps you can remember back to the days when you were learning how to read. Probably your parents and teachers and, if you were lucky, many other friends and family would read out loud to you. You started by looking at the pictures, hearing the words, and, over many repetitions, memorizing the story. 

You might have "read" the storybooks back to yourself or your family, using the pictures as clues, speaking out loud the main narrative in bits and pieces as you turned the pages. Slowly it dawned on you that these strange markings on the page—the letters—actually meant something, and were connected to the sounds of the words being read! 

The clear point here is that letters stand for sounds. Kids with good speaking skills and a natural sense of rhythm make good readers, and adults reading or thinking silently actually move their vocal cords and speech muscles in slight but measurable ways!

We can suppress "reading out loud", but we can't get rid of it. We "hear" a voice in our heads, an author "speaks" to us; we re-present reading orally and aurally! Letters are cues for us to recreate speech sounds.

This first step of the reading process is called decoding. Learning how to turn letters into sounds is a very important skill. BUT DECODING ISN'T READING! 

Again, think of what you're doing right now. Are you just making sounds? Or, are you also making sense?

"Reading" requires comprehension! 

Again, just making the sounds represented on the page isn't "reading". "Sounding out words" is an important skill for beginning readers to learn; but once the word is sounded out, we need to be able to connect it to a meaning! 

The fastest and best way to do this is by already knowing the word through experience. Once again, as you read this, how many times did you look in a dictionary? I'm willing to bet it's zero. How many unknown words can be on a page before you're doing more "looking in the dictionary" than "reading"?

The numbers from the research suggest that readers can cope with up to about 2% unknown vocabulary. Once it rises above 5%, readers can't understand enough of the words to make sense of the sentences! That's why beginning readers need simple, speech-like material to begin reading. 
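
The percentages above are simple arithmetic, and a rough check can be sketched in a few lines of Python. This is only an illustration: the function names, the crude English tokenizer, and the "borderline" label for the band between 2% and 5% are my own assumptions, not something taken from the research.

```python
import re

def unknown_share(text, known_words):
    """Return the fraction of running words in `text` not found in `known_words`."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in known_words)
    return unknown / len(tokens)

def readability(share):
    """Label a text using the 2% and 5% figures quoted above."""
    if share <= 0.02:
        return "readable"
    if share > 0.05:
        return "too hard"
    return "borderline"  # hypothetical label for the in-between band

# Toy example: a reader who knows four words meets a six-word sentence.
known = {"the", "cat", "sat", "on"}
share = unknown_share("The cat sat on the mat", known)
print(f"{share:.0%} unknown: {readability(share)}")
```

A real check would need a proper word list for the learner and a tokenizer suited to the language being read.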

A real problem for Tibetan

That's a real problem for Tibetan. If we look at the current materials that exist for learning to "read" Tibetan, we have to admit that these materials do quite a lot of teaching us how to look in a dictionary—and not a Tibetan dictionary, but a Tibetan-to-English dictionary. 

If we look at the materials that exist for children, we have to admit that those materials contain very un-speech-like writing. And far greater than 2% un-speech-like words! So how do we really actually start reading, not "reading", in Tibetan?! 

Diglossia: Language Change & Standardization

I recently ran across some old notes from a presentation series that I'd like to publish as a resource here. The first topic I'd like to cover is "Diglossia". This is a key term that anyone with an interest in the Tibetan language should know; the term was coined in Ferguson's seminal work of the same name. Here is the citation and a definition: 

Ferguson, “Diglossia,” (1959). Word 15: 325–340.

In linguistics, diglossia (/daɪˈɡlɒsiə/) is a situation in which two dialects or languages are used by a single language community. In addition to the community's everyday or vernacular language variety (labeled "L" or "low" variety), a second, highly codified variety (labeled "H" or "high") is used in certain situations such as literature, formal education, or other specific settings, but not used for ordinary conversation. In most cases, the H variety has no native speakers.

Ferguson expands on this definition of diglossia (di- two, glossia- languages) by offering us 3 aspects common to diglossic languages: 

  1. there is a large body of culturally defining literature;
  2. there are low literacy rates; and
  3. the literature has been around for centuries.

In other words, a large body of culturally defining literature, often religious in nature, tends to have an effect on the linguistic norms of a speech community. This effect is conservative in nature, so that as the spoken form (L) naturally changes and evolves, the literary form (H) remains frozen and fixed. 

This exacerbates a common phenomenon where any and all language change is viewed as "degradation". To quote Steven Pinker from his book "The Sense of Style", 

"Every generation believes that the kids today are degrading the language and taking civilization down with it... Moral panic about the decline of writing may be as old as writing itself. According to the English scholar Richard Lloyd-Jones, some of the clay tablets deciphered from ancient Sumerian include complaints about the deteriorating writing skills of the young". 

For more on this, let's turn to John McWhorter (from his book, "Our Magnificent Bastard Tongue"):

“In ancient times, few societies had achieved widespread literacy. Writing was primarily for high literary, liturgical, and commercial purposes. Spoken language changed always, but the written form rested unchanging on the page. There was not felt to be a need to keep the written form in step with the way people were changing the language with each generation.

"For one, each language was actually spoken as a group of dialects very different from one another, such that there was no single spoken variety to keep up with. As long as the written form was relatively accessible to the general population, however they actually spoke, then the job was done. Old English, for example, came in four flavors: Northumbrian, Mercian, Kentish, and West Saxon.

"Most Old English documents are in the West Saxon dialect, because Wessex happened to become politically dominant early on. But this means that what we know as Old English is mostly in what is properly one dialect of Old English, and the speakers of the other dialects just had to suck it up. They did, and there is no evidence that anyone much minded.

"In addition, there was always a natural tendency, which lives on today, to view the written language as the 'legitimate' or 'true' version, with the spoken forms of the language as degraded or, at best, quaint—certainly not something you would take the trouble of etching onto the page for posterity with quill and ink. As such, the sense we moderns have that language on the page is supposed to more or less reflect the way the language is spoken would have seemed peculiar to a person living a thousand years ago, or even five hundred.

"In Europe, for example, it was the technology of the printing press and the democratic impulses in the wake of the Reformation that led to calls for written material in local languages. Until then, people in France, Spain, Italy, and Portugal readily accepted Latin—a different language entirely from what was spoken 'in the street'—on the page.”

In other words, we can imagine that “diglossia” was the world-wide standard among languages prior to the widespread literacy that was made available by the printing press and a move to written vernaculars... 

And that brings us to our final note for the day: there is another modern diglossic language that closely parallels Tibetan, namely Arabic. Mohamed Maamouri writes: 

“There is a growing awareness among some Arab education specialists that the low levels of educational achievement and high illiteracy (and low literacy) rates in most Arab countries are directly related to the complexities of the standard Arabic language used in formal schooling and in non-formal education. These complexities mostly relate to the diglossic situation of the Arabic language and make reading in Arabic an overly arduous process.”

“A gap was formed between that standardized language of Islam as recorded in the Quran and related religious writings and the Arabic language commonly used by Arabs and non-Arabs alike. This language duality was reflected in the debate between Al-Kufa and Al-Basra, two major schools of Arabic grammar, around issues of language 'degradation' and 'corruption' and the consequent issues of usage over linguistic purity and correctness.”

Diglossias form, in part, because all language change at odds with religious scripture is seen as “degradation”. And this lack of change in written norms leads, naturally and over time, to greater and greater gaps between "how things are spoken" and "how things are written". This gap, in turn, makes it harder and harder to become literate, hence the low literacy rates. 

For more, I suggest the following resources: 

  • McWhorter, John. "Our Magnificent Bastard Tongue". 
  • Gelder, Beatrice. "Speech & Reading: A Comparative Approach". A collection of essays on speech and reading.
  • Aitchison, Jean. "Language Change: Progress or Decay?" A discussion of how and why language changes. Spoiler: the answer to “Progress or Decay” is “neither;” language change is simply a fact. For example, if language is continually in a state of decline and decay (as each subsequent generation of every language that exists usually claims), how has language not ceased to exist? 
  • Arokay, Judit. "Divided Languages? Diglossia, Translation, and the Rise of Modernity in Japan, China, and the Slavic World". An in-depth survey of diglossic languages in Asia (particularly Japan, China, and Eastern Europe/Russia), and how they “dissolved” the gap between formal, traditional literature and informal, modern vernacular by adopting modern vernacular for literature.
  • Linn, Andrew R. "Standardization: Studies from the Germanic Languages"
  • Freeborn, Dennis. "From Old English to Standard English: A Course Book in Language Variation Across Time"
  • Deumert, Ana. "Standard Languages: Taxonomies & histories"

Tibetan isn't special

Tibetan isn't special. Or rather, it isn't any more special than any other language. 

People speak, read, write, and translate Tibetan for many reasons. Some of us fell into it circumstantially; some were born into it; others are inspired, and see depth and beauty in Tibetan culture, religious texts, its people and its literature. 

There are a million different reasons Tibetan might be special to me or you in particular. But if we are even the littlest bit honest with ourselves, we have to admit that none of these qualify Tibetan as "more special" than any other language in general.

There are many beautiful and inspirational literatures. There are many cultural heritages, religious traditions, and speech communities worldwide—and in all of them, there are proponents and believers and translators who swear that their language is special, beautiful, and unique! 

Once we accept the fact that Tibetan isn't special; that Buddhism is just another religion; and that our own personal biases and attachments needn't cloud our judgment on important matters, it opens up so many possibilities for learning and improving our Tibetan language work! 

We can learn from translators of other languages; we can analyze and adopt language practices that work well, are more efficient, or start using technological tools and common-sense solutions that have proven track records in other languages! 

Our relationship with the Tibetan language can actually improve if we look at what other people are doing in other languages, even if we have no relationship with those particular languages. And even if we have a broken relationship with our own language or religious tradition, there are things we can learn from them... 

How long to learn Tibetan? – FSI's "Language Difficulty" Rubric

The Foreign Service Institute (FSI) created a rubric to measure relative language difficulty. The idea behind this rubric is that the more closely related a language is to your mother tongue, the easier it is to learn. For example, Scandinavians have such an easy time learning English (and vice versa) because their native tongues are so similar to English. 

That's why FSI classifies languages like Dutch, Swedish, and Norwegian as "Category I". Learners can expect proficiency in Category I languages relatively quickly: after some 600 hours of language study. 

While Tibetan doesn't make the list, we can make the educated guess that it falls in the most difficult category: "Category V: Languages which are exceptionally difficult for native English speakers." Why? For one, this is how many of its Asian language peers, like Chinese, Japanese, and Korean, are categorized.

For another, the significant difference between "How to speak Tibetan" and "How to read and write Tibetan"—diglossia—makes the language more difficult. And Tibetan's diglossic peer, Arabic, is also categorized as a Category V language (much for this very reason, we can assume). 

The FSI estimates that learning a Category V language takes some 2,200 hours of language instruction. This is a serious number. For comparison's sake, a student in the university system can graduate with a mere 280 hours (taught in the English medium, no less). That's well short of FSI's suggested number needed to attain proficiency... 
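
To make the gap concrete, here is a small worked calculation in Python using only the three figures quoted above. The constant and function names are hypothetical, chosen just for this sketch.

```python
FSI_CATEGORY_I_HOURS = 600    # Category I estimate quoted above
FSI_CATEGORY_V_HOURS = 2200   # Category V estimate quoted above
UNIVERSITY_HOURS = 280        # university figure quoted above

def share_of_target(hours, target):
    """Return the fraction of a target number of hours that has been covered."""
    return hours / target

shortfall = FSI_CATEGORY_V_HOURS - UNIVERSITY_HOURS
covered = share_of_target(UNIVERSITY_HOURS, FSI_CATEGORY_V_HOURS)

print(f"Covered: {covered:.0%} of the Category V estimate")
print(f"Shortfall: {shortfall} hours")
```

In other words, 280 hours is roughly one eighth of the Category V estimate, and it falls short even of the 600 hours suggested for a Category I language.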

Hours of Tibetan Language Instruction

"Meaning" isn't found in grammar

Meaning isn't in words; it isn't in the grammar; language isn't literal.

Take, for example, the innocuous question: "What do you do?"

I've seen this translated literally into Tibetan as: "ག་རེ་བྱེད་ཀྱི་ཡོད།". Now, this obviously misses the mark, and you might be tempted to say it has something to do with the additional "do" that was "not translated".

But this is what's called a "meaningless do". Compare, for example: "How do you do?" to "How are you doing?". The difference in meaning is degree (of formality), but not kind. The meaning is the same: "How are you?".

Whereas "What do you do?" has an entirely different meaning from "What are you doing?" — the first being "What's your occupation?" and the second meaning "What are you up to?".

1.) How do you do? = How are you doing? 
2.) What do you do? ≠ What are you doing?

This is a twist on Chomsky's famous example:

1.) John is easy to please = It’s easy to please John
2.) John is eager to please ≠ It’s eager to please John

The point is this: Meaning is not encoded in grammatical syntax. Nor is it found in bare vocabulary (see my previous post on how and why). Language is not literal. 

Thus, we cannot simply use a "words plus grammar" approach to translate languages (Grammar Translation). "Having all the words" does not ensure accuracy. "Having all the meaning" is what ensures accuracy!

If we cannot assess accuracy by simply (and literally) ensuring that each source word is rendered as a target word, how do we do it?

One method is back-translation. We can translate from target to source to see if meaning is retained. (In an ideal and objective assessment, a second translator unfamiliar with the source text is provided the target, and translates back to the source; the translations are compared for meaning).

In our example, "ག་རེ་བྱེད་ཀྱི་ཡོད།" is back-translated as "What are you doing?" — and when compared with "What do you do?", we can see the mistake...

The Myth of the Literal Translation

In the popular imagination, words are real things that stand for real objects. The word “coffee,” for example, stands for the real object “coffee.” The coffee is here in my cup, and I am drinking it. What could be simpler? And what could be more real? Words, and the objects they stand for, are real, concrete entities. They are clear-cut and well-defined. And this is the world we believe we live in.

In this world, translation is the simple task of replacing one word in one language for one other word in another. An accurate translation is easy to produce: Each well-defined source word is replaced by an equivalent well-defined target word. If each word in the source language is transferred into the target language, it is “accurate.” That’s definitional.

We even have evidence of the well-defined nature of words. This evidence is collected and spelled out in our dictionaries. Each word has its definition, and it refers to something real. After all, that real something is what we are talking about when we use that word! For the translator, we have evidence of the well-defined nature of word-to-word correspondences. These are collected and spelled out in our translation dictionaries.

This is the kind of story we all tell ourselves. And this simplified version of “what language is” works well enough in the everyday lives of everyday people. What we don’t realize, however, is that words lose their meaning in isolation—that merely the act of uttering them within an everyday context is what imbues them with meaning. And to be a language professional necessitates that we wrestle with these ideas seriously...

A building is in the process of being erected (shīgōng zhōng, 施工中). Of course, it would be absurd to claim that the hilariously translated “Erection in progress” is accurate, even though it is “literally correct.” A translation sensitive to contextual meaning would be: “Under construction,” even though the word "under" (xià, 下) doesn't literally appear in the Chinese... 

Words themselves aren’t literal

The biggest stumbling block for literal translations is the simple fact that words themselves aren’t literal. Words aren’t real “things” (at least not in the way we think they are) and they don’t stand for real objects (at least not in the way we think they do). Words are ambiguous; contextual; metaphorical; and more. Like every other phenomenon, words are dependent, not independent. We can see that words are contextual, not literal, by taking a closer look at what words depend on for their meaning:

  1. Words depend on physical contexts

  2. Words depend on sociocultural contexts

  3. Words depend on linguistic contexts

  4. Words depend on their language

1. Words depend on physical contexts

The most obvious way that words betray their un-literal-ness is that they don’t refer to “one thing.” Instead, they refer to a pattern of things: a class of things that share something in common. Any one word represents a whole range of things, and there are always borderline cases. If we’re lucky literal-ists, a word only applies to one thing... But usually, words also have multiple “things” that they refer to. For example, "LAP": 

  1. the flat area between the waist and knees of a seated person
  2. one circuit of a track or racetrack
  3. (of an animal) take up (liquid) with the tongue in order to drink
  4. ...
  5. ...

Let’s take a look at some more examples. Let’s say I utter the simple phrase, “Oops! The coffee spilled. Bring me a mop.” And now, you have an image of spilled coffee in your mind. But now I change the physical context of my utterance. I say, “Oops! The coffee spilled. Bring me a broom.” Unless you’re in the habit of brewing your own coffee, or, like me, spent time working in a coffee shop, the second context might not bring the right image to mind. But it looks something like this:

The coffee spilled; bring me a mop.

The coffee spilled; bring me a broom.

The word “coffee” (like all words) is actually ambiguous. It is the pattern of things that are somehow coffee-like. It’s the beans and the grounds and the liquid brewed from them... If I say, “I’m gonna go pick up some coffee” you literally don’t know exactly what I mean. You need more information. The reason we don’t realize words are ambiguous in this way is that we generally use them in contexts. What makes their meaning clear isn’t the words themselves, but the contexts they are used in.

For example, I might say, “I’ll pick up some coffee at the store.” Now you know I probably mean beans or grounds, and not the hot beverage. “The store” gives some context clues. The key here, though, is that you need to be in on the context clues I’m providing. You need the requisite linguistic experience that has primed you to interpret the word “store” the same way I do. After all, there is nothing inherent in the word “store” that precludes it from being a place that sells brewed liquid coffee!

All words are used by human beings, and all human beings are embedded in a culture. Our language itself is passed down to us. It is our cultural heritage—not an individual invention, but a shared, cultural good. So, of course, the most common way to be in on it is to have similar experiences of similar objects by being embedded in the same culture... 

2. Words depend on sociocultural contexts

The second way in which words show that they don't have literal meaning is that they vary according to sociocultural context. We rarely spell out exactly what we mean (we're rarely "literal"). Instead, we expect others to be sensitive to the contexts in which we use words. When someone asks, "How are you?" the response is nearly automatic: "Fine, how about you?" The question isn't literal, nor is the answer. The person asking generally has no interest in how you actually are, and neither would you tell them that things are going terribly (even if they were). 

That's because the question is not a "literal" question. Its real function is social in nature. Its meaning is something like: "a polite recognition of your existence in a shared space where an interaction is about to take place." As is the reply. It serves as a sort of signal between the two of you that, hey, I'm a person who's a decent person, and this interaction is going to go just fine. It works to simplify something that could be complicated and messy (the interaction of two people) into something simple and smooth (an everyday interaction).

For another example, I think we're all aware of the difference between "the kind of language we use with our friends" and "the kind of language we use with our parents." Or how about "the kind of language we use with strangers at a formal event"? Part of being a well-socialized human is knowing what kind of language to use where. Word meanings shift according to social context: a word that's funny or natural with friends might be insulting or embarrassing to use with your parents and completely unacceptable in polite company... 

Words are not literal because they are things that carry messages. Messages, by their nature, carry information from one person (the speaker or author) to another (the listener, reader, or audience). These people are active participants in assigning meaning to words. In other words, "words" don't exist out there all by their lonesome. They exist in relationship to speaker and audience. They are dependent, not independent! 

3. Words depend on linguistic contexts

First, let's note that "words" can vary in meaning depending on the words they are used with.  Let's begin by asking ourselves a very basic question: Why are we worried about "words" at all? After all, a "word” is an ambiguous and arbitrary unit of meaning; units of meaning can be larger and smaller than “a word.” A "word" can be used alone—lap—in a compound word—laptop—or within a phrase—lap of luxury—and communicate something different each time! 

Words even have connotations that are sometimes hidden. They might "literally" have a neutral meaning, but be used exclusively in negative contexts! In the linguistics literature, this is called semantic prosody. For example, the English word "spread" literally means "to expand or extend across an area." And yet, an investigation of how it is used in context shows us that it has a negative semantic prosody. We talk about diseases, cancers, and viruses as "spreading." When we talk about positive things, we use words like "growing" or "advancing" or "expanding!" 

If we're translators, we need to understand these inter-linguistic relationships in both languages. We need a sense of word connotation, not just of literal meaning, or we may miss an inference the author is making; we may choose an inappropriate translation term; or we may misunderstand the source text (by mistaking a word's secondary use for a primary use, for example). 

4. Words depend on languages

On top of all that, even if words were literal, they never "literally" mean anything in another language. Languages are self-contained worlds, worlds that stand on their own, and are self-referential. When I speak in English, I don’t “mean anything” in Tibetan—my words make sense in the English-only context in which they were uttered. Tibetan utterances, likewise, have no inherent English meaning—they only have a Tibetan meaning.

What I’m getting at here is that reality, whatever it is, is a vast sea of undifferentiated phenomena. And that our cognition, it seems, somehow does the work of filtering and categorizing it. Our mind is constantly searching for, and then constructing, patterns out of our raw sensory data. From this perspective, language is a tool we’ve inherited from our ancestors that helps us match patterns, and communicate them (in useful ways) to our companions.

The key point, though, is that languages have solved the problem of “how to split up reality into useful objects” in different ways. And that means words across languages vary in the way they draw boundaries, even in the physical world. These variances aren’t just superficial; they can be fundamentally different, even at the deepest, most subconscious level. 

As we've already discussed, words are inherently ambiguous. Again, if I say, "I'll go get some coffee," you literally don't know what I mean (unless I give you the context, or unless the context is given by experience). Similarly, if I exclaim, "Ow!" you know that I encountered a sudden pain. But, there is still ambiguity—you don't know what kind of sudden pain I experienced. Was I burned? Pinched? Hit? Did I knock into something? 

In Tibetan, there are two possible translations of "ow!"—ཨ་ར་ and ཨ་ཚ་! When a Tibetan speaker encounters a sudden pain that is sharp or hot, they spontaneously utter ཨ་ཚ་ (atsha). If that pain is dull or blunt, the word used is ཨ་ར་ (ara). The point is that all words in all languages are always ambiguous to some degree. Reality is too complicated to be otherwise! But, all words in all languages aren't ambiguous to the same degree! In this case, Tibetan is less ambiguous than English: 

This puts us in quite a bind. To translate "literally" from Tibetan into English, simply saying, "Ow!" leaves information out that is encoded in the Tibetan. Yet, putting more words in (to explain the context) means adding words that literally aren't there! In other words, there is no way to translate literally because there is no literal correlation between languages! 

It's an impossible standard because words themselves simply aren't literal: they depend on the physical context in which they are uttered, the sociocultural context in which the author and audience are embedded, the linguistic context of the other words they are being used with, and finally, the specific language in which they are being used. 

སྔོན་པོ། Tibetan 'Grue'

Did you ever notice that the colors in Tibetan pretty consistently end in 'po':

དམར་པོ།  སེར་པོ།  དཀར་པོ།  ནག་པོ།  སྔོན་པོ། ་་་་་་  

Except for one: ལྗང་ཁུ། ? Why the odd ending for the color "green"?

There are two curious entries in the བོད་རྒྱ་ཚིག་མཛོད་ཆེན་མོ། related to this:

  • ལྗང་བུ། གྲོ་ནས་ཀྱི་མྱུ་གུ་སྔོན་པོ།
  • སྔོན་པོ། ནམ་མཁའི་དོག རྩྭ་སྔོན་པོ། རས་སྔོན་པོ།

Both entries suggest that the natural color of grass, or barley sprouts, in the Tibetan color scheme is not "green" but "blue." If so, it would make sense that "green"—ལྗང་ཁུ—was coined with the import of Buddhist color schemes from India.

And that སྔོན་པོ is not "blue," but "grue."

This theory gains even more weight when we count the number of hits we get when searching for color terms. Note that, percentage-wise, the frequencies of all terms are comparable across English and Tibetan, with one exception (green):

TBRC (Tibetan Corpus):

  1. WHITE - dkar po - 52,623
  2. BLACK - nag po - 38,938
  3. RED - dmar po - 36,848
  4. BLUE - sngon po - 25,870
  5. YELLOW - ser po - 18,225
  6. GREEN - ljang gu - 7,402 + ljang khu - 3,708 = 11,110

COCA (English Corpus):

  1. WHITE - 208,113
  2. BLACK - 176,735
  3. RED - 90,798
  4. GREEN - 71,948
  5. BLUE - 59,997
  6. YELLOW - 26,990
grue.jpg
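
To make the percentage comparison concrete, here is a minimal Python sketch that turns the hit counts listed above into each term's share of the six-term total. The names `tbrc`, `coca`, and `shares` are my own, chosen for illustration; the numbers are copied from the two lists, and nothing else about the corpora is assumed.

```python
# Hit counts copied from the TBRC and COCA lists above.
# "green" in TBRC is the sum of the two spellings (ljang gu + ljang khu).
tbrc = {"white": 52623, "black": 38938, "red": 36848,
        "blue": 25870, "yellow": 18225, "green": 7402 + 3708}
coca = {"white": 208113, "black": 176735, "red": 90798,
        "green": 71948, "blue": 59997, "yellow": 26990}


def shares(counts):
    """Return each term's percentage of the total hits for these six terms."""
    total = sum(counts.values())
    return {term: 100 * n / total for term, n in counts.items()}


tbrc_pct = shares(tbrc)
coca_pct = shares(coca)

# Side-by-side shares for each color term.
for term in tbrc:
    print(f"{term:<7} TBRC {tbrc_pct[term]:5.1f}%   COCA {coca_pct[term]:5.1f}%")

# Combined blue + green share in each corpus.
tbrc_grue = tbrc_pct["blue"] + tbrc_pct["green"]
coca_grue = coca_pct["blue"] + coca_pct["green"]
print(f"blue+green  TBRC {tbrc_grue:5.1f}%   COCA {coca_grue:5.1f}%")
```

On these counts, blue and green together come to roughly a fifth of the hits in both corpora (about 20% in TBRC and about 21% in COCA), even though the split between the two terms is very different.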

Automating "Translation"

I've written a very simple script that "translates" Tibetan—or rather, it does as much Tibetan work as many of us do in the university-level classroom.

I'm going to pick on Rockwell here, because that's the book I was taught Tibetan from. All I've done for the script is input the glossary from Rockwell—the script simply converts the Tibetan symbols to Rockwell's given English equivalent.

In other words, I've outsourced the tedious work (work every student is taught to do in the grammar-translation classroom) of "using the glossary to look up words one-by-one" and then "writing out the English words one-by-one," and given those jobs to the computer. 

What I've done is given the computer the following tasks:

  1. Recognize the Tibetan symbol(s)
  2. Look the symbol string up in the glossary
  3. Replace the Tibetan symbols with English symbols
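
These three steps can be sketched in a few lines of Python. This is not the actual script, just a minimal illustration under some assumptions: the tiny `GLOSSARY` below holds only five entries (with glosses taken from the sample output in this post) rather than the full Rockwell glossary, the function name `gloss` is made up, and punctuation is simply dropped rather than glossed. It splits the text into syllables at the tsheg (་) and shad (།), then tries the longest run of syllables first so that multi-syllable entries are found before their parts.

```python
import re

# Tiny illustrative glossary (assumption: a stand-in for the full glossary).
GLOSSARY = {
    "སངས་རྒྱས": "Buddha",
    "ཆོས": "dharma",
    "དང": "*AND*",
    "དགེ་འདུན": "Buddhist community, sangha",
    "ལ": "*TO(la)*",
}

TSHEG = "\u0f0b"  # syllable separator ་
SHAD = "\u0f0d"   # clause/sentence mark །

# Longest glossary entry, measured in syllables.
MAX_SYLLABLES = max(key.count(TSHEG) + 1 for key in GLOSSARY)


def gloss(text):
    """Replace Tibetan syllable strings with bracketed English glosses."""
    # Step 1: recognize the symbols by splitting the text into syllables.
    syllables = [s for s in re.split(f"[{TSHEG}{SHAD}\\s]+", text) if s]
    out = []
    i = 0
    while i < len(syllables):
        # Step 2: look up the longest run of syllables that is in the glossary.
        for n in range(min(MAX_SYLLABLES, len(syllables) - i), 0, -1):
            chunk = TSHEG.join(syllables[i:i + n])
            if chunk in GLOSSARY:
                # Step 3: replace the Tibetan symbols with the English gloss.
                out.append(f"[{GLOSSARY[chunk]}]")
                i += n
                break
        else:
            # Not in the glossary: leave the syllable untouched.
            out.append(syllables[i])
            i += 1
    return " ".join(out)


print(gloss("།སངས་རྒྱས་ཆོས་དང་དགེ་འདུན་ལ"))
```

Notice that nothing here requires knowing any Tibetan: the program only matches strings of symbols against a table.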

Using the Script

For example, I can feed the following symbol string to the computer:

Input: 

  1. །སངས་རྒྱས་ཆོས་དང་དགེ་འདུན་ལ
    །སྒོ་གསུམ་གུས་པས་སྐྱབས་སུ་མཆི KP 3A:6 (verse)
  2. ་་་བླ་མས་བུ་ཆེན་རྣམས་ལ་ཆོས་དང་གདམས་ངག་གི་དཀོར་མཛོད་རྣམས་ཁ་ཕྱེ་་་ MINT 88:11-2...

What the script spits out is:

Output: 

  1. ། [Buddha] [dharma] [*AND*] [Buddhist community, sangha] [*TO(la)*]
    [./,/;/:/?/!]
    [three gates = body, speech, mind] [respect, devotion] [*AGENT/INST*] [refuge] [*TO*(ladon)] [to go] KP 3A:6 (verse)
  2. ་ [lama] [*AGENT/INST*] [son] [great] [*PLURAL*] [*TO(la)*] [dharma] [*AND*] [oral instruction] [*OF*] [treasury] [*PLURAL*] [pf. to open] ་ MINT 88:11-2...

Now I'm ready to "translate"! Using the grammar clues [*GRAMMAR*], all I need to do now is re-arrange the English words into a sentence that makes sense. (This is exactly what many Tibetan students are still doing today!) 

But wait... I have to ask, couldn't I give this output to just about any native English speaker, even one who doesn't know a lick of Tibetan? With a few simple instructions (try reading it backwards; word order is SOV; etc.), they, too, could begin "learning Tibetan" without ever actually seeing even one word of Tibetan.

The question I'd like Tibetan teachers and students to think very hard about is this: If the computer can easily provide the English for us, what "Tibetan" work is really being done?

Reading

I've mentioned in a past post that "reading" is a complex process that requires: 1) decoding and 2) comprehension. What the script does is a version of the first bit of work for the student: decoding.

And, I'm arguing, that's all the Tibetan work a student can do in the Grammar-Translation classroom anyway. When we're taught in English, and we decode Tibetan into English, all of our comprehension necessarily comes in English! We never learn how to comprehend Tibetan.

If students don't learn how to comprehend the source text (to truly read it), how can we expect them to learn how to translate?! If students can't learn how to think in Tibetan, how can they begin to understand an author who thinks in Tibetan?!

Machine Translation

One final point: This is not machine translation. Even if we started asking the computer to re-arrange the words according to the grammar (which we could do), it wouldn't be anything like how modern MT (machine translation) works.

Early on, this was exactly the method computer scientists tried to use to make a working MT. But natural languages are idiomatic; they break their own rules; they are metaphorical, not literal; they are, in the end, so much more than just bare vocabulary and grammar that this approach simply doesn't work.

So much so that every working MT model, like Google Translate, is based on other methods—like multi-word-level statistical analysis. 

If even machines can't make word-by-word translation work (and they are much, much faster at looking up words in the glossary than humans are), then why do we still expect Tibetan students to?!
