Digital China Worldwide | Database of Neologisms in Modern Chinese

Database of Neologisms in Modern Chinese

Institute of Modern History (Academia Sinica)

The Database of Neologisms in Modern Chinese focuses on the etymology and development of new terms in Chinese, particularly those generally discussed as potential Japanese loanwords. “Potential” here refers to the fact that research on these words and concepts is still ongoing. The role of Japanese loanwords in modern Chinese has a complex history: they were long downplayed and later regarded as a matter of fact. Both positions are questionable. Japanese loanwords do constitute a significant part of the Chinese lexicon, but their status is not simply a matter of fact: both the individual etymologies and their collective history of transmission are far from well understood. For anyone concerned with the modernization processes that unfolded over the last century in East Asia, understanding the conceptual history is crucial, and Japanese loanwords play a major role in this context. Since their influx into the Chinese lexicon 120 years ago, the debate on Japanese loanwords has undergone significant changes and spurred a variety of research. However, these studies have not really stood up to the test of historical facts. This database aims to reconcile two perspectives: academic research and actual historical sources.

Chinese speakers long held the belief that their language was especially pure, authentic, and original, since Chinese culture had profoundly influenced its neighbors over extended periods. It was thus particularly painful to confront the reality that the Chinese language had been influenced by its smaller neighbors just as much as the other way around. This influence began with the introduction of Buddhism and its vocabulary and continued as non-native rulers and cultures left a profound imprint on the Chinese language. It was not limited to absorbed loanwords but extended to such a degree that the northern dialect underwent significant phonological changes compared to its southern variants. Then, with the arrival of the Jesuits at the end of the Ming Dynasty, Western ideas and words started to infiltrate the Chinese lexicon. However, the most substantial and dramatic impact on Chinese vocabulary came from the influx of Japanese loanwords, which began with Chinese students returning from Japan from 1900 on. Bringing back a modern Japanese dictionary to a China dormant for centuries might have seemed innocuous at first glance. However, each new word introduced a new idea, and every new idea increased the pressure to modernize, to transform society, and to explore new possibilities. A revolution became not only possible but inevitable, as the word and the idea of revolution suddenly became accessible.

Research into Chinese neologisms has been ongoing since the early 20th century, notably marked by the publication of the Xinerya (《新爾雅》) in 1903. However, due to the absence of Western linguistic methodologies and various political and societal factors, the study of Japanese loanwords in Chinese did not evolve into a legitimate research field until after the war.

Post-war, there was a renewed interest in Chinese neologisms, with efforts to gain a clear understanding of which terms were borrowed from Japanese, and which factors and aspects might facilitate a useful categorization system. Initially, the Chinese response was heavily influenced by the Western loanword framework, which primarily identifies loanwords through phonetic similarities. This approach led early modern Chinese scholars such as Wang Li to famously dismiss the idea that Japanese loanwords could be considered loanwords at all. The historical adoption of Chinese characters by the Japanese centuries earlier further contributed to minimizing or outright rejecting the significance and status of Japanese loanwords in Chinese. The role of semantic change in these loanwords was not fully acknowledged in these studies.

But Chinese linguistics did not stop developing. Particularly in historical lexicology, it became more widely accepted over time that due to the extensive use of Chinese script as a “scripta franca” in East Asia, words were often transferred via writing rather than spoken language. Consequently, the emphasis on phonetic similarity began to diminish, and factors such as new character combinations (word form), comparing words morpheme-by-morpheme, and finally semantic change came to be recognized as important criteria for identifying loanwords. Furthermore, comprehensive comparisons between the modern and classical meanings of Chinese words revealed that many so-called backloans (homographs that have acquired new meanings) could in fact be considered new words, as their modern meanings are partly or entirely distinct from their classical counterparts.

From the post-war period up to 2000, research on Japanese-made Chinese words (和製漢語) significantly developed its own terminology and methodology. A milestone in this regard was Wang Lida’s work on loanwords in 1958, and later the publication of the “Dictionary of Chinese Foreign Words” in 1984, which formally recognized the status of Japanese-made Chinese words as Japanese loanwords. Studies during that period established unique frameworks that applied methods different from their Western counterparts and adapted to problems unique to Chinese characters. During this period, the academic community reached a certain consensus on the classification of Japanese loanwords, broadly dividing them into three categories: backloans (回歸詞), transliterations (音譯詞), and calques (called 意義詞 in Chinese). Backloans are words that originally existed in ancient Chinese texts but were given new meanings by the Japanese, such as “經濟” (economy); transliterations are new terms created by the Japanese that reflect the pronunciation of foreign languages, such as “雷達” (radar); calques, also known as imitations, are words that copy the original morpheme structure of the source word, either partially or wholly, such as “蜜月” (honeymoon).

In contemporary Chinese linguistics, the status of early modern Japanese loanwords is largely uncontested. However, differences of opinion still persist on a word-by-word basis, mostly due to unequal access to historical sources. This is particularly true of certain subtypes of loanwords: mixed types, where loanwords are combined with native morphemes, and cases where genuinely new, legitimate Chinese terms have been created on the model of Japanese words. Shen Guowei recently introduced the notion that frequency of use also plays a role, and that some words, although they never truly vanished from the Chinese lexicon, have become more popular due to Japanese influence.

However, modern studies also have their limitations. The most fundamental problem of all loanword studies in the Chinese language is that they are confined within the borders of the Chinese language and only in rare cases include Japanese, as if loanwords had trouble crossing water, mountains, or the steppe. With the exception of Chen Liwei, who mostly publishes in Japanese, loanword studies do not consider deep etymologies. This stands in stark contrast to Western etymology, where words are traced back many steps, up to their Indo-European roots.

In this context, there remain many topics to explore, which is why the current database has been developed. Here, we will briefly outline areas of ongoing academic debate. Most importantly, the issue of authorship in the realm of historical lexicography has not been fully acknowledged. This oversight is somewhat understandable given that this issue is almost unique to East Asian neologisms and is far less pertinent in the Western context. At the heart of this issue is the fact that Western ideas, and the words that express them, cannot easily be written in languages whose scripts are based on some variant of Chinese characters.

Even when the most direct borrowing method, phonetic transliteration, is applied, two problems arise: the new words undergo significant phonetic adaptation, and Chinese characters are not the best tool for writing phonetically. Although they can be used to represent sound alone, it is a rather awkward choice, and as a matter of fact, Chinese typically avoids this strategy. Furthermore, in the mid-19th century, phonetic transliteration was often viewed as a low-quality translation method. It was also seen as an admission that equivalent ideas and words were lacking in, for example, China or Japan, an acknowledgment associated with inferiority, which was unthinkable and intolerable.

Since phonetic transliteration was not, and still is not, a favored borrowing technique, only two methods of linguistic transmission remained viable: calquing, which involves imitating a foreign word morpheme by morpheme, and conceptual remaking, where the foreign idea is expressed using a custom-made morpheme construction. These methods allow for a more nuanced and culturally acceptable integration of foreign terms into the East Asian linguistic landscape. More often than not, to the untrained native eye, these loanwords go undetected and feel like a normal part of the lexicon.

However, calquing can be difficult when the source word has an opaque morpheme structure. This is the case when the word was derived from a language that was not well known at the time (for example ‘kangaroo’ or ‘giraffe’), when it has a complex morpheme structure from Latin or Greek, or when its meaning has already shifted metaphorically and is no longer directly tied to its morphemic components, as in (biological) ‘cell’, which originally meant a small room (Lat. cella). Accordingly, (biological) ‘cell’ was not calqued as ‘小室’ or ‘小間’ but rendered by an entirely new construction, xibao (細胞). Furthermore, specialized vocabularies in the natural sciences, e.g. the names of chemical elements (‘hydrogen’, literally ‘water-forming’), or in modern technologies (‘satellite’, from Lat. satelles, an ‘attendant’), also presented challenges, as the literal meaning often diverged significantly from the extensional meaning.

In these cases, the most challenging and definitive method of borrowing was employed: creating a new word form based solely on, and triggered by, the foreign word’s meaning. Authors who downplay the importance of meaning and stress the notion of word form generally do not accept conceptual remakes as loanwords. Their argument is that semantic borrowing is hard to prove, not least because semantic equivalence is almost never possible between two words, let alone between two words in different languages. Although this is true, in the case of written borrowing between languages with different script systems, if the creation of a new word can be credibly associated with foreign sources and is documented as a translation, it can be assumed to be a motivated creation. Although conceptual remakes are the weakest form of borrowing, they are, in our view, loanwords.

Conceptual remaking requires, on the part of the translator, a deep understanding of the intricacies and nuanced details in the target language, alongside numerous iterations of fine-tuning. Starting shortly before and during the Meiji Restoration, there was a significant intellectual effort to create new words using Chinese characters, explicitly intended to convey Western modern meanings. The remarkable knowledge and linguistic sensitivity of these early Japanese scholars led to the creation of approximately 2,000 to 3,000 new words, meticulously crafted for use in our modern era. These efforts were not only a linguistic achievement but also a cultural bridge that helped integrate Western concepts into East Asian languages in a way that was both meaningful and accessible.

Notwithstanding the contributions of Japanese scholars, it’s crucial to acknowledge that not all new terms were solely their own creation. Apart from the integration of Chinese neologisms stemming from translations of Buddhist texts—a subject we already briefly mentioned—another significant subset of new words originated from Western missionaries during the latter half of the 19th century. These missionaries produced a range of materials, including dictionaries, textbooks, essays, and religious texts, all written in Chinese with the aim of educating and evangelizing.

While many Western contributions were mostly ignored by their Chinese counterparts at the time or lost to history, some of their works made their way to Japan, where they became essential reference materials for translating Western knowledge into Japanese, for example Wylie’s Liuhe Congtan (六合叢談). This historical circumstance led to the unique phenomenon where Western-made Chinese words were first incorporated into Japanese and then, disguised as Japanese loanwords, re-entered the Chinese lexicon. This pathway of lexical transmission underscores a complex layer of cultural and linguistic exchange, where the origin of words is intertwined with historical interactions among China, Japan, and the West. These so-called Western sources of Japanese loanwords in Chinese have been thoroughly investigated in the study “Western Origins of Japanese Loanwords in Chinese: Academic Evaluation and Lexical Resource Construction” by the author.

Given this historical context, the critical question is then how to differentiate Western-made Chinese words from legitimate Chinese neologisms. Within the current standard framework, words created by Westerners are typically labeled as “missionary words” or similar. We suggest that the primary criterion for this classification is authorship, which serves as a fundamental measure to assess the nature of loanwords. The notion of authorship helps distinguish between Japanese, Chinese, and Western-made loanwords based on historical documentation. It also eliminates the somewhat awkward criterion, often employed in earlier studies, of judging the loanword status of a word by whether it had Chinese characteristics. As we understand today, loanwords can be formed exactly like native Chinese words and nevertheless be loanwords, because they were constructed by non-native Chinese translators, be they Westerners or Japanese, with the intent of being used as legitimate Chinese words.

This approach necessitates proving, to some extent, the origin of each word by identifying in which type of source it first appeared: in Chinese texts written by Western authors, in bilingual dictionaries, in Chinese texts by Chinese authors, or in Japanese sources. By tracing a word back to its earliest record, we can infer its authorship and associate both the author and the language with the word’s creation. While this method does not relieve us of the need to understand the extent of any meaning changes that have occurred over time, using first historical records provides a more objective basis for evaluating loanword status.
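The idea of inferring authorship from the earliest record can be illustrated with a minimal sketch. It is not the database’s actual implementation: the `Record` type, the source-type labels, and the `AUTHORSHIP` mapping are all hypothetical names chosen for illustration, and treating bilingual dictionaries as needing manual review is an assumption, since the text does not say how they are attributed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    word: str
    year: int
    source_type: str  # hypothetical label; one of the keys of AUTHORSHIP

# Hypothetical mapping from the four source types named in the text
# to an inferred authorship category (an assumption, not the database's scheme).
AUTHORSHIP = {
    "chinese_text_western_author": "Western",
    "bilingual_dictionary": "needs manual review",
    "chinese_text_chinese_author": "Chinese",
    "japanese_source": "Japanese",
}

def infer_authorship(records, word):
    """Return (year, authorship) of the earliest record for `word`, or None."""
    hits = [r for r in records if r.word == word]
    if not hits:
        return None
    # If several records share the earliest year, min() keeps the first one listed.
    first = min(hits, key=lambda r: r.year)
    return first.year, AUTHORSHIP[first.source_type]
```

With invented placeholder records, a word attested in a Japanese source and, earlier, in a Chinese text by a Western author would be attributed to Western authorship. Adding a still earlier record changes the result, which is why the reliability of the first record matters.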

Given the significant burden of proof that the first historical record carries in establishing authorship, it becomes essential to assess the reliability of historical records. Specifically, we need to evaluate how likely it is that the earliest record currently in the database for a given word truly represents the word’s first occurrence, and conversely, how probable it is that an even earlier record might be discovered later in the course of research. Although absolute certainty is unachievable, by analyzing the distribution of all words in the database from 1600 to 1920, and examining the specific distribution of records for each individual word, we can calculate a specific likelihood using kernel density estimation, a method for estimating a probability distribution from observed data. This likelihood informs us, for each word, how convincing our assessment of its authorship is.
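The text does not spell out how the likelihood is computed, so the following is only a sketch of one possible ingredient, under stated assumptions. It builds a Gaussian kernel density estimate over the years of all records and measures how much of that estimated density lies before a word’s earliest record. The intuition assumed here is that if a large share of the sources predates the first record and the word appears in none of them, the first record is more convincing. The function names, the bandwidth of 10 years, and the use of the 1600 start year as a lower bound are illustrative choices.

```python
import math

def kde_cdf(x, samples, bandwidth):
    """CDF of a Gaussian KDE: the mean of normal CDFs centered on each sample."""
    if not samples:
        raise ValueError("samples must not be empty")
    scale = bandwidth * math.sqrt(2)
    return sum(0.5 * (1 + math.erf((x - s) / scale)) for s in samples) / len(samples)

def mass_before(first_year, record_years, bandwidth=10.0, start=1600):
    """Share of the estimated record density lying between `start` and `first_year`."""
    return (kde_cdf(first_year, record_years, bandwidth)
            - kde_cdf(start, record_years, bandwidth))
```

A value near 0 means that almost no source material predates the word’s first record, so an earlier attestation could easily be missing; a value near 1 means the word is absent from most of the available material before its first record.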

This method marks a departure from previous approaches, where non-explicit processes or subjective judgments formed the basis for classification. In our database, we acknowledge that the historical borrowing of words is complex and subject to ongoing discovery. As new data are added to the database, the classification of some words may change, but this also enhances the overall robustness of our classifications.

In recent years, another significant development in the field of loanword etymology research has been the systematic organization of synonyms. In principle, synonyms can be organized into so-called synsets. There are two basic types: synchronic synsets contain words that share a similar meaning according to their current reading, while diachronic synsets collect older word forms or words with older meanings that are closely related. Diachronic research on synonyms is especially interesting because it can shed light on the initial period of confusion during the early formation of new terms, when there was no consensus yet on a specific word form and multiple synonyms might appear simultaneously. Over time, a consensus gradually formed, and certain terms emerged as mainstream, while others were gradually phased out. This process of vocabulary development varies in duration among different terms, some taking generations to reach a consensus, while others establish their status within a few years. In this regard, a significant contribution comes from Huang Heqing, whose work “Modern and Contemporary Chinese Etymology” is an important study on historical synsets.

We include both synchronic and diachronic synsets, as well as our own diachronic set. Although the inclusion of synsets was an integral part of the database from the beginning, the building of our diachronic synset was not a main focus of the undertaking. It developed rather naturally from the fact that we included many bilingual dictionaries in our resources. In this respect, our synset represents a rare type of diachronic multilingual synset, capturing the evolution and interconnections of concepts across different languages and time periods.

In total, the academic research section of this database includes the following research:

  • “The Chinese Loanword Dictionary”, Cen Qixiang (1984)
  • “List of Hezhihanyu”, Sato Takeyoshi (1996)
  • “Nurturing Esthetics”, Zhou Shenglai (2016)
  • “A Study of Japanese Loanwords from the Late Qing Dynasty to the Early Republic of China and the Reform and Opening Up and Their Sinicization”, Qu Zirui (2016)
  • “Intertwined Cultural History – Manuscripts of Early Missionary Sinology Studies”, Zhang Xiping (2017)
  • “Modern Etymological Dictionary”, Huang Heqing (2020)
  • “Modern Chinese loanword research”, Gao and Liu (1958)
  • “Words Borrowed from Japanese in Modern Chinese”, Wang Lida (1958)
  • “Chronicles of Foreign Students in China”, Saneto Keishu (1981)
  • “The Chinese Loanword Dictionary”, Liu, Gao, Mai and Shi (1984)
  • “Translingual Practice”, Liu He (1995)
  • “The Etymological Dictionary of Modern Chinese Neologisms” (2001)
  • “Conceptual History Research”, Jin and Liu (2010)
  • “Research on the vocabulary of Beijing Mandarin textbooks during the Meiji period in Japan”, Chen Ming’e (2014)
  • “The study of Japanese loanwords in Xiandai hanyu cidian”, Morita Satoshi (2016)
  • “The History of Loanwords in Ming-Qing Chinese”, Zhao Ming (2016)
  • “Xinhua Loanword Dictionary”, Shi Youwei (2019)
  • “Research on Modern Chinese 2-character words”, Shen Guowei (2019a)
  • “Research on Yan Fu’s translations”, Shen Guowei (2019b)
  • “History of Sino-Japanese linguistic exchanges”, Shen Guowei (2020)
  • “Lexical Studies During the End of the Qing and Beginning of the Republican Era”, Zhang Ye (2019)
  • “From East to East — Lexical concept between China and Japan in Modern Times”, Chen Liwei (2019)
  • “The Trajectory of Vocabulary Exchange between China and Japan in Modern Times”, Zhu Jingwei (2020)

Besides modern studies, we also include 8 earlier studies:

  • “Xinerya”, Wang Rongbao and Ye Lan (1903)
  • “New explanation of words”, Liang Qichao (1904)
  • “Blind men blind horses: New terms”, Peng Wenzu (1915)
  • “Etymology of new terms”, Zhou Shangfu (1917)
  • “Etymology of Japanese words”, Liu Zihe (1919)
  • “Development of terminology translated from Japanese”, Yu Yousun (1935)
  • “Sources of new terms”, Wang Yunwu (1944)

Furthermore, the database includes a list of around 600,000 modern Chinese technical terms from a wide range of academic fields, compiled from online sources of the Ministry of Education in Taiwan.

In terms of search capabilities, the database offers a variety of search options to enable users to find and compare words according to their own interests and intuitions, thereby encouraging them to explore the borrowing history of words independently.

In terms of semantics, we provide translations for each Chinese term in different languages, mainly drawn from actual historical sources, such as a variety of English-Chinese dictionaries. Furthermore, we recognize the importance of Buddhist Chinese translation terms and therefore provide definitions from three Buddhist dictionaries:

  • “Buddhist Dictionary”
  • “A Dictionary of Chinese Buddhist Terms”
  • Seishi Karashima’s “A Glossary of Kumārajīva’s Translation of the Lotus Sutra”

These three dictionaries can be publicly accessed at https://glossaries.dila.edu.tw.

All Japanese vocabulary, translations, text sources, and dates are authorized data from the “Chunagon” database of the National Institute for Japanese Language and Linguistics.

To collect these related studies as comprehensively as possible, we have established this database, aiming to provide an integrated platform for research on neologisms in modern Chinese. Currently, we believe there is still potential for more discoveries, and a need for continuing academic work, in the following areas:

Integration of Basic Data

  • Dialogue Between Academia and Historical Sources
  • Integration of Chinese Terms from Buddhist Texts (Sanskrit Terms) Within the Realm of Japanese Loanwords
  • Comparison of Early Academic Research with Modern Academic Research
  • Analysis and Integration of Western-Translated Chinese Texts and Words
  • Better Integration of Various Academic Approaches to Loanwords
  • Integration of Objective Approaches to Classifying Loanwords

Currently, the database has three features with limited access:

  • Generating word lists: Generating word lists based on specific criteria such as scholars, time, language, etc.
  • Research data: Generating word lists for specific scholars.
  • Historical documents: Querying individual historical document word lists and related images.

Ultimately, this database serves as a dialogue between academic evaluation and historical facts. We provide a quantified judgment of whether a word is generally considered a loanword by the academic community, together with statistical data reflecting the degree of modern academic agreement on the classification of each word. We then contrast this academic evaluation with historical evidence. In some cases, historical evidence supports the academic classification, while in others, it suggests a different classification. This is all easily accessible in the summary of each word.
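The contrast between academic evaluation and historical evidence can be sketched as follows. This is a simplified illustration, not the database’s scoring method: the function names, the 0.5 threshold, the placeholder study labels, and the rule that any non-Chinese authorship counts as historical support for loanword status are all assumptions.

```python
def academic_agreement(judgments):
    """Share of studies that classify a word as a loanword.

    `judgments` maps a study label to True (loanword) or False (not a loanword).
    Returns None when no study covers the word.
    """
    if not judgments:
        return None
    return sum(judgments.values()) / len(judgments)

def contrast(share, historical_authorship, threshold=0.5):
    """Compare the academic majority view with the authorship of the first record."""
    academic_says_loanword = share >= threshold
    # Assumption: a first record by a non-Chinese author supports loanword status.
    history_says_loanword = historical_authorship != "Chinese"
    return "consistent" if academic_says_loanword == history_says_loanword else "conflict"

# Placeholder study labels, not real classifications from the database.
share = academic_agreement({"study_a": True, "study_b": True, "study_c": False, "study_d": True})
```

Here three of four placeholder studies agree, so `share` is 0.75. If the earliest record points to Japanese authorship the two views are consistent; if it points to Chinese authorship they conflict, and the word warrants a closer look.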

(From the original website.)