Chinese words in English

There are hundreds of languages in Asia, and most of them are not written in the Roman alphabet. As is typical where transliteration is necessary, competing systems exist at any given time, and the systems themselves change over time, which leads to variety in spelling. What is more, a word may enter English not directly but via intermediate languages, and an intermediate language may reshape the word considerably.

Chinese and Hindi are discussed here, but many more Asian languages have given words to English, among them Tagalog, Sinhalese, Malayalam, Tamil, Tibetan, Javanese, and Japanese. Not only are there hundreds of Asian languages, there are also a great many Asian language families. Words from Asian languages are grouped together here because of geography rather than linguistic relatedness.

CHINESE IN ENGLISH
It is often claimed that more people speak Chinese than any other language. Such claims are interesting, because what is referred to as "Chinese" is more like a group of languages than a single language. What are often called "Chinese dialects" are different languages that are mutually incomprehensible (at least orally), languages as different as Danish and English.

For the purposes of exploring English-language words, the two most important Chinese languages are Mandarin and Cantonese. Both are written in a 2,000-year-old system of ideograms or "characters" that requires a great deal of education to master: there are approximately 40,000 of them, and just reading the daily paper requires knowing several thousand. Thus attaining literacy in Chinese takes much longer than in English. The sounds of Chinese can, however, be represented in the Roman alphabet (by a system called Pinyin), just as Japanese can (Romaji). Pinyin is the current official way to represent Chinese in Roman letters; historically, there have been other systems.

There is one aspect of Chinese that is quite foreign to English speakers: what English speakers would call the same sound can have several meanings depending on its "tone." Mandarin has four tones; Cantonese has nine. Even when the tone is the same, there may be several homophones, each written with a different character. For instance, there are 24 unrelated Mandarin Chinese words that are spelled L-i-a-n in Pinyin. See "yen" in the table below, which has three different English meanings and corresponds to several different etyma in Cantonese and Mandarin (aka Pekingese).
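As a rough illustration of how tone and character together separate words that share one Pinyin spelling, the sketch below groups a handful of Mandarin words spelled "lian" by tone. The ten entries are an illustrative sample supplied here, not the full set of 24; the tone-number notation (lian2, lian3, lian4) is a common keyboard-friendly way of writing Pinyin tones; and the names LIAN_WORDS and group_by_tone are hypothetical.

```python
from collections import defaultdict

# Illustrative sample only (an assumption, not the full list of 24):
# Mandarin words spelled "lian", with the tone written as a trailing digit
# (2 = rising, 3 = dipping, 4 = falling).
LIAN_WORDS = [
    ("lian2", "连", "to connect"),
    ("lian2", "莲", "lotus"),
    ("lian2", "联", "to unite"),
    ("lian2", "帘", "curtain"),
    ("lian2", "怜", "to pity"),
    ("lian3", "脸", "face"),
    ("lian4", "练", "to practice"),
    ("lian4", "恋", "to love"),
    ("lian4", "链", "chain"),
    ("lian4", "炼", "to refine"),
]


def group_by_tone(entries):
    """Map each tone number to the (character, gloss) pairs that carry it."""
    groups = defaultdict(list)
    for syllable, character, gloss in entries:
        tone = int(syllable[-1])  # the trailing digit is the tone number
        groups[tone].append((character, gloss))
    return dict(groups)


by_tone = group_by_tone(LIAN_WORDS)
for tone in sorted(by_tone):
    print(tone, by_tone[tone])
```

Even within a single tone, several unrelated words remain, and only the written character tells them apart.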

A factor that unites Cantonese and Mandarin is that a Cantonese speaker can read something written by a Mandarin speaker, and vice versa. That is because the characters' meanings are the same in Mandarin and Cantonese. The sounds that correspond to a given character are, however, different in the two. Thus in writing it is one language, whereas in speaking it is two. I do not know enough about the grammar of each to say anything about that aspect. I am not sure how to count languages in general, but this sure does throw a spanner into the works.
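The "one written language, two spoken languages" situation can be sketched as a lookup table in which each character has a single meaning but two readings. This is only a minimal sketch: the four characters were chosen because they underlie "bok choy" and "kumquat", the Mandarin readings are given in Pinyin with tone numbers, the Cantonese readings use the Jyutping romanization (an assumption; the text does not name a Cantonese system), and READINGS, read_aloud, and meaning are hypothetical names.

```python
# One shared meaning per character, but a different sound in each language.
# Mandarin: Pinyin with tone numbers. Cantonese: Jyutping (an assumption).
READINGS = {
    "白": {"gloss": "white", "mandarin": "bai2", "cantonese": "baak6"},
    "菜": {"gloss": "vegetable", "mandarin": "cai4", "cantonese": "coi3"},
    "金": {"gloss": "gold", "mandarin": "jin1", "cantonese": "gam1"},
    "橘": {"gloss": "orange", "mandarin": "ju2", "cantonese": "gwat1"},
}


def read_aloud(text, language):
    """Return the romanized reading of each character in the given language."""
    return " ".join(READINGS[ch][language] for ch in text)


def meaning(text):
    """Return the character-by-character gloss, which is the same for both languages."""
    return " + ".join(READINGS[ch]["gloss"] for ch in text)


for word in ("白菜", "金橘"):
    print(word, "=", meaning(word))
    print("  Mandarin: ", read_aloud(word, "mandarin"))
    print("  Cantonese:", read_aloud(word, "cantonese"))
```

The Cantonese readings are recognizably closer to the English spellings "bok choy" and "kumquat" than the Mandarin ones are.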

Most Chinese-derived English words are relatively easy to spell. Here are some of the most common Chinese-derived English words:

Word/open compound | Etymological meaning | Etymology comment | First use in OED
--- | --- | --- | ---
brainwashing | | translation of Chinese | 1950
bok choy | white + vegetable | Cantonese | 1847
cheongsam | long + gown | Cantonese | 1957
chow | meat dumpling | perhaps Pekingese | 1856
dim sum | speck/refreshment + heart/center | Cantonese | 1948
ginkgo | silver + apricot | came into English via New Latin and Japanese | 1727
ginseng | ? | Pekingese | 1654
gung ho | short for: Light Industries Cooperative Society | Pekingese | 1942
judo | gentleness + art, way | Pekingese, came into English via Japanese | 1889
jujitsu | gentleness + art | Pekingese, came into English via Japanese | 1875
kalanchoe | mustard + orchid + vegetable | perhaps Cantonese, came via New Latin | 1830
kowtow | strike/bump + head | Pekingese | 1863
kumquat | gold + orange | Cantonese | 1699
kung fu | skill | Cantonese or Pekingese | 1966
lo mein | stirred noodles | Cantonese | not in OED
oolong | black + dragon | labeled simply Chinese in Webster's Third | 1845
pekoe | white + down | Amoy (from city Xiamen in s. Fujian: ~=Taiwanese) | 1713
pinyin | arrange + sound | Pekingese | 1963
ramen | pull + noodles | Pekingese, came into English via Japanese | 1972
shar-pei | sand + fur | Cantonese | 1976
shih tzu | lion + dog | Pekingese | 1921
Shogun | | Pekingese, came into English via Japanese | 1615
souchong | small + sort | Pekingese | 1760
tai chi | "the Absolute in Chinese Cosmology" (full name is tai chi chuan: chuan = fist, boxing) | Pekingese | 1736
taipan | | Pekingese | 1834
tangram | | Pekingese + Greek | 1864
tea | | Amoy | 1598
tycoon | great + ruler | Pekingese, came into English via Japanese | 1857
typhoon | great + wind | Cantonese and Greek influence on earlier Arabic, which was from Greek | 1588
wok | | Cantonese | 1952
won ton | | Cantonese | 1948
yen | round, circle, dollar | Pekingese, came into English via Japanese | 1870's-80's
yen | craving | Cantonese or Pekingese |
yen | opium | Cantonese or Pekingese |
zen | | Sanskrit > Pali > Pekingese > Japanese > English |


There are, of course, more Chinese-derived English words than those in the above table: the above table shows only a few more or less familiar ones.

HINDI-DERIVED ENGLISH WORDS

Hindi is another major world language that has had a significant lexical influence on English. Hindi is an official language of India that is spoken mostly by Hindus. (India has 15 "major" languages.) Spoken mostly in Northern and Central India, Hindi is an Indo-European language of the Indo-Iranian branch and is thus much more closely related to English than Chinese is. Hindi has been extensively influenced by the older language Sanskrit.

Hindi is written in Devanagari script, an abugida that is usually used for Sanskrit as well, with 34 consonant+schwa signs and 9 vowel signs (though it does depend on how one counts). Devanagari is written left to right. Each consonant letter carries a default /a/ that is not written; other vowels are marked by diacritics, although initial vowels have their own independent signs. Devanagari descends from the Nagari script, which is from the Gupta script, a form of Brahmi. Brahmi dates back to the 3rd century BCE and may have developed from Aramaic, a Semitic script cognate with the script that the Greeks and Romans borrowed.

Among the most common Hindi-derived English words are:
 
Hindi-derived English word/open compound | Etymology comment (if word did not originate in Hindi or has other influences)
--- | ---
bandanna | from Sanskrit
bangle |
banyan (tree) | from Sanskrit
bungalow |
burka/burqa/burkha/bourkha | from Arabic
cheetah | from Sanskrit
chintz |
chop ("mark on goods") |
cot | from Sanskrit
cowrie | from Sanskrit, of Dravidian origin
cummerbund | from Persian
cushy | from Persian
dhurrie |
dinghy | also from Bengali
dungaree |
gunnysack | from Sanskrit, probably of Dravidian origin
gymkhana | probably modification of Hindi, from Persian: influenced by Greek
hindustani | from Persian
izzat | from Arabic
juggernaut | from Sanskrit
jungle | from Sanskrit
jute | also Bengali, probably from Sanskrit
lac | also Persian
loot | from Sanskrit
mahout | from Sanskrit
memsahib | from Hindi-ified English madam (> mem) + Hindi sahib "master"
mongoose | from Prakrit, perhaps Dravidian origin
nabob | from Arabic
pajama | from Persian
punch? | from Sanskrit
raja | from Persian
rupee | from Sanskrit
seersucker | from Persian
shampoo | (I once saw a bumper sticker that read "Boycott shampoo: use the real thing": folk etymology joke)
sitar/sittar | from Persian
tom-tom |
veranda | part Hindi, part Portuguese

Once again, this list contains just a few more or less familiar words: there are many, many more Hindi-derived English words.

For comparison's sake, and to give an idea of how many English words borrowed from Hindi were NOT listed above, here is a list of all the English words in W3 (Webster's Third) whose etymology section mentions Malayalam. That does not necessarily mean they have a Malayalam etymon in their past, because the dictionary sometimes lists Malayalam cognates. Malayalam is a language of southern India closely related to Tamil, one of the official languages of Sri Lanka. The list includes 47 words, 7 of which were familiar to me ("candy" is not one: this "candy" refers to a unit of weight). It's not scientific, but it can give a very rough idea of how many words from Hindi (or Chinese) were NOT listed in the lists above. If the same ratio of 7 familiar words out of 47 holds for Hindi-derived words (it may not), then with 37 Hindi-derived words that were familiar to me, there ought to be about 248 total Hindi-derived words (37 × 47 / 7). Someday I'll test that extrapolation.
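The extrapolation can be spelled out as a few lines of arithmetic. This is only a sketch of the back-of-the-envelope reasoning above: the three counts (7, 47, 37) come from the text, the function name extrapolate_total is hypothetical, and the whole thing rests on the untested assumption that the familiar-to-total ratio is the same for Hindi as for Malayalam.

```python
def extrapolate_total(familiar_sample, total_sample, familiar_target):
    """Scale a familiar-word count up to an estimated total.

    Assumes (without evidence) that the share of familiar words in the
    target language matches the share observed in the sample language.
    """
    return familiar_target * total_sample / familiar_sample


# Counts taken from the text: 7 of 47 Malayalam-related words were familiar,
# and 37 familiar Hindi-derived words were listed.
estimate = extrapolate_total(familiar_sample=7, total_sample=47, familiar_target=37)

print(f"Estimated Hindi-derived words: about {round(estimate)}")
print(f"Estimated words not listed: about {round(estimate) - 37}")
```

With so small a sample, the result is best read as an order of magnitude (a couple of hundred words) rather than as a count.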

ENGLISH WORDS FROM MALAYALAM
areca
avaram bark
ballam
blatti family
bola[2,noun]
cachou
candy[3,noun]
carandas
catechu
chakram
chatty[2,noun]
chay[1,noun]
chetty
choli
choultry
chuckler
copra
corge
cot[6,noun]
cowrie[1,noun]
curry[2,noun]
ilava
illipe[1]
illupi

jackfruit
jangada
kadamba
kandelia
katel
kathakali
katuka
moringa
muncheel
nayadi
nayar
niota
ola
pandal
pariti
pattamar
pettah
piney
dammar
poon[1,noun]
popadam
teak[1,noun]
urena
zamorin