Post-relational model, Multidimensional model - Informatics for economists

The post-relational model

The classical relational model assumes that the data stored in the fields of table records is indivisible (atomic). This means that the information in a table is represented in first normal form. In a number of cases this restriction hinders the efficient implementation of applications.

The post-relational data model (PMD) is an extended relational model that removes the restriction that data stored in table records must be indivisible. The model allows multi-valued fields, that is, fields whose values consist of subvalues. The set of values of a multi-valued field is treated as an independent table embedded in the main table.

In addition to nesting, the post-relational model supports associated multi-valued fields (multiple groups). A collection of associated fields is called an association. Within a row, the first value of one column of the association corresponds to the first values of all other columns of the association; the second values of the columns are related in the same way, and so on.
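The nesting and association rules described above can be illustrated with a minimal Python sketch. The invoice record, its field names, and the `unnest` helper are hypothetical and chosen only for illustration; the sketch assumes that associated fields are matched purely by position.

```python
# Hypothetical record: "item" and "qty" are associated multi-valued fields,
# so item[0] belongs with qty[0], item[1] with qty[1], and so on.
invoice = {
    "invoice_no": "0373",
    "customer": "8723",
    "item": ["A1", "B7", "C3"],
    "qty": [2, 1, 5],
}

def unnest(record, association):
    """Expand one association into first-normal-form rows."""
    lengths = {len(record[name]) for name in association}
    if len(lengths) != 1:
        raise ValueError("associated fields must hold the same number of values")
    # Single-valued fields are repeated in every resulting row.
    scalars = {k: v for k, v in record.items() if k not in association}
    rows = []
    for values in zip(*(record[name] for name in association)):
        row = dict(scalars)
        row.update(zip(association, values))
        rows.append(row)
    return rows

rows = unnest(invoice, ["item", "qty"])
for row in rows:
    print(row)
```

One post-relational record here stands in for three rows of an equivalent normalized table, which is the flexibility the model offers.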

Neither the length of the fields nor the number of fields in table records is required to be constant. This gives the data and table structures greater flexibility.

Because the post-relational model allows non-normalized data to be stored in tables, the problem of ensuring data integrity and consistency arises. It is solved by including in the DBMS mechanisms similar to stored procedures in client-server systems.

To describe functions that check the values in fields, procedures (conversion codes and correlation codes) can be created that are called automatically before or after the data is accessed. Correlation codes are executed immediately after the data is read, before it is processed. Conversion codes, by contrast, are executed after the data is processed.
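The order in which the two kinds of codes run can be sketched as a pair of optional hooks around a processing step. This is only an illustration of the ordering described above, not the mechanism of any particular DBMS; the function names `access`, `check_quantities`, and `as_text` are hypothetical.

```python
def access(raw, process, correlation=None, conversion=None):
    """Run the hooks in the order described in the text."""
    # Correlation code: right after the read, before any processing.
    value = correlation(raw) if correlation else raw
    result = process(value)
    # Conversion code: after the data has been processed.
    return conversion(result) if conversion else result

def check_quantities(qty_values):
    """Hypothetical correlation code: reject negative sub-values."""
    if any(q < 0 for q in qty_values):
        raise ValueError("negative quantity in multi-valued field")
    return qty_values

def as_text(total):
    """Hypothetical conversion code: format the processed result."""
    return f"{total} pcs"

total = access([2, 1, 5], process=sum,
               correlation=check_quantities, conversion=as_text)
print(total)
```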

The advantage of the post-relational model is its ability to represent a set of related relational tables as a single post-relational table. This makes the information easier to survey and increases the efficiency of its processing.

The disadvantage of the post-relational model is the difficulty of ensuring the integrity and consistency of the stored data.

The multidimensional model

The multidimensional approach to representing data in a database appeared almost simultaneously with the relational one, but until the mid-1990s there were very few multidimensional DBMSs (MDBMSs) that actually worked.

The impetus for adopting the multidimensional data model (MMD) was a programmatic article published in 1993 by E. Codd, one of the founders of the relational approach. It sets out 12 basic requirements for OLAP (OnLine Analytical Processing) systems, the most important of which concern the conceptual representation and processing of multidimensional data. Multidimensional systems make it possible to process information quickly for analysis and decision making.

In the development of information systems concepts, the following two directions can be distinguished:

• operational (transactional) processing systems;

• analytical processing systems (decision support systems).

Relational DBMSs were designed for operational information processing systems and proved very effective in that area. In analytical processing systems they turned out to be somewhat unwieldy and insufficiently flexible. Multidimensional DBMSs (MDBMSs) are more effective here.

Multidimensional DBMSs are highly specialized DBMSs designed for interactive analytical processing of information. Let us explain the basic concepts used in these DBMSs: aggregability, historicity, and predictability of data.

Aggregability of data means that information is considered at various levels of generalization. In information systems, the degree of detail with which information is presented depends on the level of the user: analyst, operator, manager, or executive.

The historicity of data implies a high degree of invariance (static nature) of the data itself and of its interrelationships, as well as the mandatory binding of data to time.

Because the data is static, specialized methods of loading, storage, indexing, and retrieval can be used when processing it.

Binding data to time is necessary because queries frequently use date and time values as selection criteria. The need to order data by time during processing and presentation imposes requirements on the mechanisms for storing and accessing information. For example, to reduce query processing time, it is desirable that the data always be sorted in the order in which it is most frequently requested.

Predictability of data implies defining forecasting functions and applying them to different time intervals.

The multidimensionality of the data model does not refer to multidimensional visualization of numeric data, but to a multidimensional logical representation of the information structure used when describing and manipulating data.

Compared with the relational model, the multidimensional organization of data is easier to survey and more informative.

For a multidimensional model with more than two dimensions, the information does not have to be displayed as multidimensional objects (hypercubes of three, four, or more dimensions). In these cases, too, the user finds it more convenient to work with two-dimensional tables or graphs. The data shown are then selections (more precisely, "slices") from the multidimensional data store, made at varying levels of detail.

Let us consider the basic concepts of multidimensional data models, which include the dimension and the cell.

A dimension is a set of data of the same type that forms one of the faces of a hypercube. Examples of the most commonly used time dimensions are Days, Months, Quarters, and Years. Commonly used geographical dimensions are Cities, Districts, Regions, and Countries. In a multidimensional data model, dimensions act as indices that identify specific values in the cells of the hypercube.

A cell, or measure, is a field whose value is uniquely determined by a fixed set of dimension values. The field type is most often numeric. Depending on how its values are formed, a cell is usually either a variable (its values change and can be loaded from an external data source or generated programmatically) or a formula (its values, like those of formula cells in spreadsheets, are calculated according to predefined formulas).
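As a rough illustration of these two concepts, a small hypercube can be modeled in Python as a dictionary keyed by one member from each dimension. All dimension names, members, and numbers below are made-up sample data; `units_sold` plays the role of a variable cell and `revenue` the role of a formula cell.

```python
# Dimensions act as indices; each maps a name to its ordered members.
dimensions = {
    "month": ["Jan", "Feb"],
    "city": ["Paris", "Lyon"],
    "product": ["tea", "coffee"],
}

# Variable cell: one stored value per combination of dimension members.
units_sold = {
    ("Jan", "Paris", "tea"): 10,
    ("Jan", "Paris", "coffee"): 4,
    ("Jan", "Lyon", "tea"): 7,
    ("Jan", "Lyon", "coffee"): 3,
    ("Feb", "Paris", "tea"): 12,
    ("Feb", "Paris", "coffee"): 6,
    ("Feb", "Lyon", "tea"): 5,
    ("Feb", "Lyon", "coffee"): 8,
}

unit_price = {"tea": 3, "coffee": 5}

def revenue(month, city, product):
    """Formula cell: computed from other cells rather than stored."""
    return units_sold[(month, city, product)] * unit_price[product]

print(units_sold[("Feb", "Lyon", "coffee")])  # 8
print(revenue("Feb", "Lyon", "coffee"))       # 40
```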

In existing multidimensional DBMSs, there are two main schemes for organizing data: polycubic and hypercubic.

In the polycubic scheme, several hypercubes with different dimensionality and with different dimensions as faces can be defined in the database.

In the hypercubic scheme, all measures are defined by the same set of dimensions. This means that if the database contains several hypercubes, they all have the same dimensionality and the same dimensions. Obviously, in some cases the information in the database may be redundant (if all cells are required to be filled).

A number of special operations are applied to data in the multidimensional model, including slicing, rotation, aggregation, and detailing.

A slice is a subset of a hypercube obtained by fixing the value of one or more dimensions. Slicing is performed to limit the set of values presented to the user, since all the values of a hypercube are almost never used at the same time.
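A minimal sketch of slicing, assuming the hypercube is stored as a Python dictionary keyed by tuples of dimension members. The `slice_cube` helper and the sample data are hypothetical.

```python
DIMS = ("month", "city", "product")

# Made-up sample cube: (month, city, product) -> units sold.
cube = {
    ("Jan", "Paris", "tea"): 10,
    ("Jan", "Lyon", "tea"): 7,
    ("Feb", "Paris", "tea"): 12,
    ("Feb", "Lyon", "coffee"): 8,
}

def slice_cube(cube, dims, **fixed):
    """Fix one or more dimensions and return the remaining sub-cube."""
    fixed_pos = {dims.index(name): member for name, member in fixed.items()}
    kept = [i for i in range(len(dims)) if i not in fixed_pos]
    return {
        tuple(coords[i] for i in kept): value
        for coords, value in cube.items()
        if all(coords[i] == member for i, member in fixed_pos.items())
    }

january = slice_cube(cube, DIMS, month="Jan")
print(january)  # {('Paris', 'tea'): 10, ('Lyon', 'tea'): 7}
```

Fixing one dimension of a three-dimensional cube leaves a two-dimensional table, which is the form in which users usually view the data.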

The rotation operation is used with a two-dimensional data representation. Its essence is to change the order of the dimensions when the data is visualized. Rotation can also be generalized to the multidimensional case if it is understood as a procedure that changes the order of the dimensions. In the simplest case this can be, for example, swapping two arbitrary dimensions.
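Rotation understood as a reordering of dimensions can be sketched as follows, again assuming a dictionary keyed by tuples of dimension members. The `rotate` helper and the sample figures are hypothetical.

```python
def rotate(cube, dims, new_order):
    """Return the same cells with their coordinates in a new dimension order."""
    if sorted(new_order) != sorted(dims):
        raise ValueError("new_order must be a permutation of dims")
    positions = [dims.index(name) for name in new_order]
    return {
        tuple(coords[i] for i in positions): value
        for coords, value in cube.items()
    }

# Made-up two-dimensional table: (month, city) -> sales.
sales = {("Jan", "Paris"): 14, ("Jan", "Lyon"): 10, ("Feb", "Paris"): 18}

# Swap rows and columns: months move from the first axis to the second.
by_city = rotate(sales, ("month", "city"), ("city", "month"))
print(by_city)
```

The cell values are untouched; only the order in which the dimensions index them changes.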

The Drill Up (aggregation) and Drill Down (detailing) operations present information from the hypercube to the user in a more general or a more detailed form, respectively.
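A minimal sketch of Drill Up along a time hierarchy, assuming that cells are aggregated by summation (other aggregates such as averages are equally possible). The `drill_up` helper, the month-to-quarter mapping, and the figures are hypothetical.

```python
from collections import defaultdict

# Hypothetical hierarchy for the time dimension: month -> quarter.
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1", "Apr": "Q2"}

def drill_up(cube, axis, hierarchy):
    """Replace members on one axis by their parents and sum the cells."""
    totals = defaultdict(int)
    for coords, value in cube.items():
        coarse = coords[:axis] + (hierarchy[coords[axis]],) + coords[axis + 1:]
        totals[coarse] += value
    return dict(totals)

# Made-up detailed data: (month, city) -> sales.
monthly = {
    ("Jan", "Paris"): 14,
    ("Feb", "Paris"): 18,
    ("Mar", "Paris"): 9,
    ("Apr", "Paris"): 11,
    ("Jan", "Lyon"): 10,
}

quarterly = drill_up(monthly, 0, MONTH_TO_QUARTER)
print(quarterly)  # {('Q1', 'Paris'): 41, ('Q2', 'Paris'): 11, ('Q1', 'Lyon'): 10}
```

Drill Down is the reverse step: returning from `quarterly` to `monthly`, which is possible only because the detailed cells are still stored.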

The main advantage of the multidimensional data model is the convenience and efficiency of analytical processing of large amounts of time-related data. When similar data is processed on the basis of the relational model, the cost of operations grows nonlinearly with the size of the database, and considerably more RAM is needed for indexing.

The disadvantage of the multidimensional data model is that it is cumbersome for the simplest tasks of ordinary operational information processing.
