Can Siteimprove spellcheck Chinese, Japanese and other languages that use syllabic writing systems?
Modified on: Tue, 10 Aug, 2021 at 7:33 AM
Unfortunately, Siteimprove cannot spellcheck "symbol" languages such as Chinese, Japanese or Korean. This article explains why.
A word in Chinese usually consists of 1-3 symbols. Words and phrases that belong together are mostly written in one line without spaces, because spaces are not needed in the way they are in English, French, Danish or German. So a line of symbols may look like one word with 8 letters, but it might actually be a sentence of anywhere from 3 to 8 words.
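To illustrate the missing word boundaries, here is a minimal Python sketch. It assumes a naive checker that finds words by splitting on whitespace; this is an illustrative assumption, not a description of how Siteimprove works internally. The function name and sample sentences are made up for the example.

```python
from typing import List


def whitespace_tokens(text: str) -> List[str]:
    """Split text on whitespace, as a naive word-based checker might."""
    return text.split()


english = "I like studying Chinese"
# Same sentence in Chinese: four words (我 / 喜欢 / 学习 / 中文), no spaces.
chinese = "我喜欢学习中文"

english_tokens = whitespace_tokens(english)
chinese_tokens = whitespace_tokens(chinese)

print(len(english_tokens))  # 4 separate words to look up
print(len(chinese_tokens))  # 1 token: the whole sentence
print(len(chinese))         # 7 symbols in that single token
```

The English sentence yields four tokens that could each be looked up in a dictionary, while the Chinese sentence comes back as a single seven-symbol token with no indication of where one word ends and the next begins.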
Chinese symbols that appear on websites are mostly typed on Roman alphabet keyboards using what is called the Pinyin input method. Pinyin spells out how the symbols are pronounced using Roman letters.
For example, "你好" is a Chinese greeting (Hello) that is spelled "nihao" in Roman letters. Several applications make it fairly easy to write symbols this way. A user types "nihao" into a Pinyin input application, is offered a few possible symbols, chooses the correct symbol(s) and enters them into a document or content editor. The potential problem arises when the user picks a symbol that does not match what they actually want to say, because the sentence may then change its meaning or lose its meaning entirely.
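The selection step can be sketched in Python as follows. The candidate table is a tiny hand-written stand-in for the dictionary a real Pinyin input application would use, and the function names are hypothetical. The example adds the syllable "ta", which can produce 他 (he), 她 (she) or 它 (it).

```python
# Illustrative, hand-written candidate table (an assumption for this sketch).
# A real input application uses a large dictionary and ranks many candidates.
CANDIDATES = {
    "nihao": ["你好"],
    "ta": ["他", "她", "它"],  # he, she, it: all typed as "ta"
}


def candidates_for(pinyin):
    """Return the symbols a user could pick for the typed Pinyin."""
    return CANDIDATES.get(pinyin, [])


def is_valid_character(ch):
    """True if ch lies in the CJK Unified Ideographs block (U+4E00 to U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"


options = candidates_for("ta")
print(options)
# Every candidate is a legitimate symbol, so a per-symbol check passes
# no matter which one the user picked.
print(all(is_valid_character(c) for c in options))
```

Because each candidate is a valid symbol on its own, checking symbols individually cannot reveal that the user chose 他 where 她 was intended; only the meaning of the surrounding sentence can.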
When it comes to proofreading Chinese, it is not the symbols themselves that would be useful to check, but rather whether the chosen symbols match the intended meaning of the sentence. Unfortunately, an automated way to proofread the meaning of symbols in sentences is not possible at this time. This type of proofreading would require artificial intelligence that actually understands what a content editor is trying to say. So far this is only possible with human proofreaders.
Siteimprove is currently not able to spellcheck Chinese, Japanese or other "symbol" languages.