Multilingual structure modeling - Databases: design

Modeling the multilingual structure

As the world globalizes, developers of information systems increasingly aim to support several languages, so that information is presented to each user in the language he or she speaks. The multilingual requirement emerged with the introduction of online technologies and the need to present data not only to the specific user or organization for which a system is developed, but also outside the country where it is mainly used. It is most critical when developing an organization's Internet presence or a transnational information system that operates over a network or the Internet.

Suppose that an e-shop under development must allow citizens of different countries to order goods, so that a user's main language may be not only the shop's home language but also English, German, Korean, and so on. In this case the catalog of goods must be presented in the user's national language, which means that the names of the goods and their descriptions must be available in each of the language versions.

The simplest solution that suggests itself for this task is to introduce a separate attribute for each language (Figure 4.66), with each attribute storing the product name in the language for which it is intended.

Fig. 4.66. Multilingual example based on language attributes


One feature of data types is important to note when data must be presented in national alphabets. As the example shows, one of the "Product name" attributes has a different data type from the other similar attributes. The reason is that the Arabic and Asian language groups are written not with the familiar Latin or Cyrillic letters but with characters that denote combinations of symbols, or with hieroglyphs. The usual character data type is not suitable for storing such data, since it is oriented to a code table of 256 symbols that includes the Latin or Cyrillic alphabet. String data for other language groups needs much more memory, and the 256 elements of the code table are not enough. Such text is therefore stored in two-byte elements, for which separate data types are designated, which are denoted by .... This data type is used for Korean, Chinese, or Japanese, and the maximum string size is set to twice what is needed for the Latin and Cyrillic alphabets.
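The size difference can be illustrated with a short Python sketch. The sample strings, and the use of cp1251 and UTF-16 as stand-ins for a 256-symbol code table and a two-byte representation, are assumptions made for illustration; the text above does not name specific encodings.

```python
# Sample product names; the strings are illustrative, not taken from the figure.
name_ru = "Товар"   # Cyrillic, 5 characters
name_ko = "상품"     # Korean (Hangul), 2 characters


def fits_single_byte(text, codepage="cp1251"):
    """Return True if the text can be stored in the given 256-entry code page."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


def two_byte_size(text):
    """Size in bytes when each character occupies one two-byte element.

    UTF-16 is used as the stand-in; this holds for characters in the
    Basic Multilingual Plane, which covers the samples above.
    """
    return len(text.encode("utf-16-le"))


cyrillic_ok = fits_single_byte(name_ru)          # Cyrillic fits the code page
korean_ok = fits_single_byte(name_ko)            # Hangul does not
single_byte_len = len(name_ru.encode("cp1251"))  # one byte per character
double_byte_len = two_byte_size(name_ru)         # two bytes per character
```

With these samples, the same Cyrillic string takes twice as many bytes in the two-byte representation, while the Korean string cannot be stored in the single-byte code page at all.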

A multilingual representation in this form is clearly not a good solution, because a separate attribute must be created for each language. Adding another language requires restructuring the database by adding an attribute and reworking the program logic that depends on the set of attributes, which is a rather complicated and expensive procedure. This option is therefore appropriate only when the set of languages is guaranteed to remain fixed.
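The cost of the attribute-per-language design can be seen in a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample data are hypothetical; the source shows the structure only as a figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One name column per language (hypothetical names for illustration).
conn.execute("""
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name_en TEXT,
        name_de TEXT,
        name_ko TEXT
    )
""")
conn.execute(
    "INSERT INTO product (product_id, name_en, name_de, name_ko) "
    "VALUES (?, ?, ?, ?)",
    (1, "Table", "Tisch", "테이블"),
)

# The program logic must know the set of language columns.
LANGUAGE_COLUMNS = {"en": "name_en", "de": "name_de", "ko": "name_ko"}


def product_name(conn, product_id, lang):
    """Look up a product name; the column is chosen from a fixed whitelist."""
    column = LANGUAGE_COLUMNS[lang]  # raises KeyError for an unknown language
    row = conn.execute(
        f"SELECT {column} FROM product WHERE product_id = ?", (product_id,)
    ).fetchone()
    return row[0] if row else None


# Adding French needs a schema change here AND a code change in LANGUAGE_COLUMNS.
conn.execute("ALTER TABLE product ADD COLUMN name_fr TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(product)")]
```

Even after the ALTER TABLE statement, product_name cannot serve French until the LANGUAGE_COLUMNS mapping is edited as well, which is the coupling between schema and program logic described above.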

Another way to solve the multilingual data problem is to create a database structure that takes the available languages into account and ensures that information is presented correctly through the way the data is structured. First of all, working with languages requires a "Languages" entity (Figure 4.67), which lets the user receive data in the desired language and separates all text data by language. This entity also makes it possible to add new languages without restructuring the database.

Fig. 4.67. Example of the "Languages" entity


This entity can also be obtained by normalizing the "Products" entity, in which it is advisable to specify a "Language" attribute. Normalization in this case leads to the creation of a linking (associative) entity between the goods and the language (Figure 4.68).

Fig. 4.68. Normalization of multilingual representation of the goods
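The normalized structure can be sketched as follows, again with Python's sqlite3 module. All table and column names, the price attribute, and the sample rows are assumptions for illustration; the figures themselves are not reproduced in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reference entity listing the supported languages.
    CREATE TABLE language (
        language_id INTEGER PRIMARY KEY,
        code TEXT NOT NULL UNIQUE,
        name TEXT NOT NULL
    );
    -- Language-independent product data (price is a hypothetical attribute).
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        price REAL NOT NULL
    );
    -- Linking entity: one row per product and language.
    CREATE TABLE product_translation (
        product_id INTEGER NOT NULL REFERENCES product(product_id),
        language_id INTEGER NOT NULL REFERENCES language(language_id),
        name TEXT NOT NULL,
        description TEXT,
        PRIMARY KEY (product_id, language_id)
    );
""")
conn.executemany(
    "INSERT INTO language (language_id, code, name) VALUES (?, ?, ?)",
    [(1, "en", "English"), (2, "de", "German"), (3, "ko", "Korean")],
)
conn.execute("INSERT INTO product (product_id, price) VALUES (1, 49.90)")
conn.executemany(
    "INSERT INTO product_translation (product_id, language_id, name) "
    "VALUES (?, ?, ?)",
    [(1, 1, "Table"), (1, 2, "Tisch"), (1, 3, "테이블")],
)


def product_name(conn, product_id, lang_code):
    """Return the product name in the requested language, or None if missing."""
    row = conn.execute(
        """
        SELECT t.name
        FROM product_translation AS t
        JOIN language AS l ON l.language_id = t.language_id
        WHERE t.product_id = ? AND l.code = ?
        """,
        (product_id, lang_code),
    ).fetchone()
    return row[0] if row else None


# Adding a language is ordinary data entry: no ALTER TABLE, no code change.
conn.execute(
    "INSERT INTO language (language_id, code, name) VALUES (4, 'fr', 'French')"
)
conn.execute(
    "INSERT INTO product_translation (product_id, language_id, name) "
    "VALUES (1, 4, 'Table')"
)
```

In this sketch the lookup function takes the language as a parameter and never names a language-specific column, so a new language needs only new rows.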