The three most widely used oriental languages in Asia

Asia is a land rich in beautiful and complex languages, which have maintained a mysterious aura over time.

For many years, Soget Est, as a Language Service Provider, has been looking to the East with great passion. Over that time, we have developed specific professional skills in translating from and into the three most widely used oriental languages in Asia: Chinese, Japanese and Arabic.

Thanks to this short guide you will be able to learn about their history, their main linguistic features and the aspects that make them so fascinating.

Chinese: Traditional and Simplified

A short history

Chinese is one of the earliest known forms of writing.

The first historical attempt to analyse the form and structure of written Chinese, which has undergone various transformations over thousands of years, dates back to 100 AD. In that year, the philologist Xu Shen finished compiling the first dictionary of Chinese characters, comprising more than 9,300 entries.

Towards the end of the 19th century a movement was established to simplify written Chinese.

In the 1950s and 1960s, the government of the People’s Republic of China formulated a standard to simplify the script, thus leading to two parallel systems: Simplified Chinese and Traditional Chinese.

The People’s Republic of China and Singapore currently use Simplified written Chinese, whereas Taiwan, Hong Kong, Macau and many Chinese communities around the world use Traditional written Chinese. Overall, it has been estimated that more than a billion people use Chinese.

Although we tend to think of each Chinese character as a simplified image of an object or a concept, the script is in fact a complex combination of ideograms, pictograms and semantic and phonetic elements.

Let’s discover it in detail.


All Chinese characters have been classified into 5 main categories:

1 Pictograms

There are only a few hundred pictographic characters. It is a small number when compared to the tens of thousands of characters in the Chinese language.

However, they are still fascinating. This is because they are derived from ancient drawings of concrete objects used daily, even if the modern form has been simplified and standardized over time.

Although they are almost always abstract representations, traces of the primitive pictorial element can generally still be recognised in these characters.

2 Ideograms

Unlike pictograms, which represent concrete objects, ideograms represent abstract ideas and concepts.

For example, they can denote concepts such as “above” and “below”, or natural numbers such as “three” or “four”.

3 Compound Ideograms

The meaning of these ideograms can be deduced from the combination of their constituent parts.

For example, the ideogram that denotes the word “honest” combines two components: one that means “man” and the other that means “word”. As a further example, the ideogram used to express the verb “sit” is formed by two “man” components positioned above the character that means “earth”.

4 Phonetic Loan Characters

True rebuses, phonetic loan characters borrow an existing character to write a word that sounds the same, regardless of the character’s original meaning. It is therefore almost always impossible to explain their etymology from their form.

5 Ideophonetic Compounds

These compounds, which account for 90% of all Chinese characters, combine the elements of two separate characters: the meaning of one and the phonetics of the other.

For example, the character for ‘sugar’ combines the semantic component that means ‘cereal’ with the phonetic component (tang), which, used separately, can indicate the Tang dynasty.

These ideophonetic compounds allow new symbols to be continuously created.

A curiosity: How are Chinese characters formed?

Chinese characters are built from 8 fundamental strokes, which form the basis for writing all complex characters. Strokes should not be confused with radicals, the recurring components under which characters are grouped: the traditional classification recognises 214 radicals, which are essential for looking up characters in dictionaries of the Chinese language.

Each character comprises one or more strokes arranged according to a specific sequence. Regardless of the number of strokes involved, each character must remain within an invisible box.


Japanese

A short history

The Japanese language consists of Kanji characters, based on Chinese characters, and two syllabic alphabets called Kana, which date back to the 9th century AD. However, their linguistic forms and use were only established in the 20th century.

Although the two systems can be used separately in modern written Japanese, it is customary to use a combination of Kana and Kanji. Some people believe that this is required to minimise any ambiguity in written Japanese, since the Japanese language largely comprises homophonic words.

In the syllabic nature of the Kana alphabets, the remote influence of the Indian script can be recognised, probably due to the spread of Buddhism from India to the Far East.

The 3 graphic forms

Written Japanese consists of three different graphic forms: Kanji, Hiragana and Katakana.

Hiragana and Katakana are syllabic systems. They can also be used as Furigana, i.e. short annotations placed next to a Kanji character to indicate its pronunciation.

Some people include the Latin script, Romaji, in written Japanese, as it is considered essential for a complete education.
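The graphic forms described above occupy separate ranges in Unicode, so a mixed sentence can be split into its scripts automatically. The following is only a minimal sketch: the function name `graphic_form` and the sample sentence are illustrative assumptions, and the classification relies on the character names in Python's standard `unicodedata` module.

```python
import unicodedata

def graphic_form(ch: str) -> str:
    """Return a rough label for one character, based on its Unicode name."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "Kanji"
    if name.startswith("HIRAGANA"):
        return "Hiragana"
    if name.startswith("KATAKANA"):
        return "Katakana"
    if name.startswith("LATIN"):
        return "Romaji"
    return "other"

# Illustrative sentence ("I drink coffee") mixing all three graphic forms.
sentence = "私はコーヒーを飲む"
for ch in sentence:
    print(ch, graphic_form(ch))
```

In this sample, the noun and the verb stem are labelled Kanji, the particles and the verb ending Hiragana, and the foreign loanword for "coffee" Katakana.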


Kanji

About 2,000 Kanji characters are regarded as essential for everyday use, although the number of existing characters is significantly higher. For instance, it is not unusual to find less common characters used in proper names.

Kanji symbols are used for simple elements in sentences such as word roots, stems and base forms, whereas compound words are written using more than one Kanji symbol.

Since Japanese and Chinese significantly differ in phonological terms, the Japanese use of Kanji characters entails the additional difficulty of dual reading for each symbol.

For example, a character may have an “on” reading (phonetic), based on its original Chinese pronunciation, or a “kun” reading (semantic), in which the native Japanese word with the same meaning is attached to the character. Naturally, not all Kanji characters were borrowed from Chinese. It is estimated that more than 150 Kanji symbols, called “kokuji”, were created in Japan.


Hiragana

With its more rounded forms, Hiragana is a syllabic alphabet comprising 46 characters and some diacritical symbols.

Originating from the cursive forms of Kanji, Hiragana is used for grammatical morphemes (elements) of a sentence, such as auxiliary verbs and inflectional affixes.

Except in the case of children’s books, Hiragana is not usually used alone in Japanese.


Katakana

Katakana, the more angular of the two Kana alphabets, also derives from Kanji, in this case from abbreviated parts of the characters rather than their cursive forms, and can be considered parallel to the Hiragana alphabet. In fact, the two Kana alphabets have different yet equivalent symbols for the same sounds.

Hiragana is used for grammatical elements, whereas Katakana is generally used for foreign words and names, onomatopoeic expressions and telegrams.
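The one-to-one correspondence between the two Kana alphabets is reflected in Unicode, where each basic Hiragana character sits a fixed distance (0x60) from its Katakana counterpart. The sketch below uses that offset to convert one into the other; the function name `hiragana_to_katakana` is a hypothetical helper chosen for illustration.

```python
# Distance between the Hiragana block (from U+3041) and the Katakana block (from U+30A1).
KANA_OFFSET = 0x60

def hiragana_to_katakana(text: str) -> str:
    """Replace each Hiragana character (U+3041..U+3096) with its Katakana equivalent."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )

print(hiragana_to_katakana("すし"))  # スシ
```

Characters outside the Hiragana range, such as Kanji or punctuation, are left unchanged.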


Arabic

A short history

Arabic (al-ʿarabiyya or simply ʿarabī) is of Semitic origin. First appearing in North-West Arabia in the Iron Age, it is now the lingua franca of the Arab world.

Although the changes that have taken place throughout history have made it very different in appearance from Latin, written Arabic also has its origins in the Phoenician alphabet.

Another interesting feature that brings it closer to Western languages is the influence it had on Italian during the Middle Ages.

In that period, many Arabic words entered everyday Italian, especially in fields in which the Arabs excelled: navigation and commerce, mathematics and astronomy. In many cases, the origin of a term can be documented and traced back to the exact historical period in which it was adopted.

But how many people speak Arabic today? And what do we mean by the “Arab population”?

It is currently estimated that about 422 million people speak Arabic, making it the fifth most spoken language in the world and one of the six official languages of the United Nations. Moreover, classical Arabic is the sacred language of 1.7 billion Muslims.

The “Arab population”, on the other hand, refers to the people living in the area that extends from Mesopotamia in the east across the mountains of Lebanon and the Arabian Peninsula to North Africa in the west.

Today, spoken Arabic is divided into various dialects, which are not always mutually intelligible. The standard language derived from Classical Arabic, by contrast, is understood throughout the Arab world as the language of the media, publications, education, religion and international relations.

Main features

Arabic has a very complex grammar and a large number of uvular, fricative and pharyngeal consonants, which make it difficult to pronounce, especially for Europeans. Despite this, it has proved to be a language well suited to poetry.

The main linguistic features of Arabic are listed below:

  • Arabic is written and read from right to left.
  • Consequently, the pages of a book are turned in the opposite direction to that of books written in the Latin script, while lines still run from top to bottom.
  • Numbers are the exception: both Western and traditional digits are written from left to right.
  • The 28 letters of the alphabet can take up to 4 different shapes, depending on whether they are at the beginning, in the middle or at the end of a word, or isolated.
  • There are no upper case letters, and there is no clear distinction between cursive (handwritten) and printed characters.
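Two of the features listed above, writing direction and positional letter shapes, are also recorded in the Unicode character database. The sketch below is illustrative only: it uses Python's standard `unicodedata` module, and the letter Beh is an arbitrary choice of example.

```python
import unicodedata

# Bidirectional class stored in the Unicode database:
# "AL" = Arabic letter (right to left), "AN" = Arabic number, "EN" = European number.
samples = {
    "Arabic letter Ain": "\u0639",
    "Arabic-Indic digit three": "\u0663",
    "Western digit three": "3",
}
for label, ch in samples.items():
    print(label, unicodedata.bidirectional(ch))

# The four shapes of the letter Beh also exist as separate "presentation form"
# code points; NFKC normalization folds each of them back to the base letter.
BEH = "\u0628"
BEH_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}
for position, form in BEH_FORMS.items():
    assert unicodedata.normalize("NFKC", form) == BEH
    print(position, unicodedata.name(form))
```

In everyday text only the base letter is stored, and the rendering software selects the appropriate shape from the letter's position in the word.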