Database Methodology, Data Warehouse Design Methodology

Database Operations Methodology

We will focus primarily on the operation of operational databases.

Regardless of the class of database, several problems have to be solved: supporting single-user and multi-user operation, protecting data, ensuring integrity, and recovering data after a database failure.

In centralized single-user databases, operation is organized around transactions: as the result of a transaction, the data in the database is either updated (commit) or left unchanged (rollback). Operational databases are characterized by short transactions lasting microseconds to milliseconds, whereas data warehouses must handle long transactions (hours). In multi-user mode there is the additional need for several users to access the same data simultaneously, which is most often achieved by locking the data.
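The commit-or-rollback behaviour described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed implementation: the `account` table, the `transfer` function, and the "insufficient funds" rule are all hypothetical names and assumptions introduced for the example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as a single short transaction."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes from it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True   # committed: both updates are now visible
    except ValueError:
        return False  # rolled back: the data remains as it was

def balances(conn):
    """Return {account id: balance} for inspection."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

# Hypothetical sample data for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

Calling `transfer(conn, 1, 2, 30)` commits both updates together, while `transfer(conn, 1, 2, 500)` fails the balance check and leaves both rows untouched.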

Data is protected from unauthorized access either by denying access (password) or by granting access permissions, which is especially easy to do in SQL. Integrity is ensured by special programs called triggers, which implement various kinds of constraints. For example, for the Sex field a trigger can restrict input to the values "male" and "female"; the database does not accept any other value.
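A trigger of this kind might look as follows in SQLite, driven from Python's sqlite3 module. This is only a sketch: the `person` table, the trigger names, and the helper `try_insert` are hypothetical, and other DBMSs use different trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    sex  TEXT NOT NULL
);

-- Reject any new row whose sex value is outside the permitted set.
CREATE TRIGGER person_sex_insert
BEFORE INSERT ON person
WHEN NEW.sex NOT IN ('male', 'female')
BEGIN
    SELECT RAISE(ABORT, 'sex must be male or female');
END;

-- Apply the same restriction when an existing row is changed.
CREATE TRIGGER person_sex_update
BEFORE UPDATE OF sex ON person
WHEN NEW.sex NOT IN ('male', 'female')
BEGIN
    SELECT RAISE(ABORT, 'sex must be male or female');
END;
""")

def try_insert(name, sex):
    """Return True if the row was accepted, False if a trigger rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO person (name, sex) VALUES (?, ?)", (name, sex)
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

Here `try_insert("Ann", "female")` stores the row, whereas `try_insert("Bob", "unknown")` is rejected and nothing is written.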

How data is recovered after a database failure depends on the nature of the failure. After a short-term failure, the database restores itself, using the data saved at checkpoints and the records of uncompleted transactions. After a long-term failure, the database can be recovered only from a backup copy, which therefore has to be created and periodically updated.
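The idea of restoring from a checkpoint while discarding uncompleted transactions can be illustrated with a deliberately simplified model. The log record format and the `recover` function below are assumptions made for this sketch; it also assumes that the checkpoint contains only committed data, so that replaying committed writes is sufficient. Real DBMS recovery is considerably more involved.

```python
def recover(checkpoint, log):
    """Rebuild the database state from a checkpoint and the log written after it.

    Writes of committed transactions are replayed in log order; writes of
    transactions that have no commit record are dropped.
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    state = dict(checkpoint)
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _tx, key, value = rec
            state[key] = value
    return state

# State saved at the last checkpoint (hypothetical data).
checkpoint = {"a": 1, "b": 2}

# Log records written after the checkpoint.
log = [
    ("begin", "T1"),
    ("write", "T1", "a", 10),
    ("begin", "T2"),
    ("write", "T2", "b", 20),
    ("commit", "T1"),
    # The failure happens here: T2 never committed.
]

recovered = recover(checkpoint, log)
```

In this run the write made by T1 survives, while the write made by the uncompleted transaction T2 is discarded, so `b` keeps its checkpoint value.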

In distributed peer-to-peer databases, additional difficulties arise in solving the above problems. Distributed transactions can be used. The procedure for simultaneous access also becomes more complicated, and additional methods have been created for it; in essence, each of them is some scheme for centralizing database management. Data replication becomes especially important.

New difficulties arise when heterogeneous distributed databases are integrated from previously built, already operating local databases with different data models.

The solution to these problems is somewhat simplified when using the client-server mode in distributed databases.

For more information about the functioning of databases, see Chapters 4 and 5.

Data Warehouse Design Methodology

Let's move on to the methodological side of data warehouses. Here, too, we can distinguish the procedure for creating (designing) a data warehouse from the procedure for using it.

The procedure for using a data warehouse differs little from the corresponding procedure for a database, so we will not consider it in detail, but we note one feature. Because of the high performance requirements placed on a data warehouse, special attention should be paid to optimizing queries.
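One common form of query optimization is adding an index on a column used for filtering. The sketch below uses SQLite through Python's sqlite3 module to compare the query plan before and after creating an index; the `sales` table, its columns, and the index name are hypothetical, and the exact wording of the plan depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
)
# Hypothetical detailed data: odd amounts go to "north", even to "south".
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north" if i % 2 else "south", i) for i in range(1000)],
)
conn.commit()

QUERY = "SELECT SUM(amount) FROM sales WHERE region = ?"

def plan(conn):
    """Return the textual query plan that SQLite reports for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("north",)).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)

plan_before = plan(conn)  # without an index, the whole table is scanned
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
plan_after = plan(conn)   # with the index, matching rows are looked up

total = conn.execute(QUERY, ("north",)).fetchone()[0]
```

Comparing `plan_before` with `plan_after` shows whether the planner switched from a table scan to the index; the query result itself is the same in both cases.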

The following work is performed when creating a data warehouse.

1. The composition of the summary information is defined, together with the maximum allowable response time and the maximum storage period for detailed information.

2. The expected set of queries against the detailed data is determined. Here a compromise has to be found between storing precomputed summary statistics and calculating them from the detailed data on request.

3. A way of storing time data in the tables is chosen.

4. A DBMS is chosen, taking into account the needs of the OLTP system. The most suitable is an OODBMS using the multidimensional data model (MOLAP). The dimensions of the data model are determined.

Alternatively, a relational DBMS can be used with the ROLAP approach. In that case a schema (star or snowflake) should be selected. It is then useful to build a fact table and the accompanying reference tables, keeping changes to their keys to a minimum. When the star schema is used, the reference tables should be denormalized.

5. The requirements for additional data not present in the OLTP system are determined, and unnecessary columns are removed from the detailed data.
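For the relational (ROLAP) variant, the steps above can be sketched as a small star schema in SQLite, driven from Python's sqlite3 module. All table and column names (`dim_date`, `dim_product`, `fact_sales`, `agg_sales_month`) and the sample rows are hypothetical; the sketch assumes time is stored as a separate date reference table and that monthly totals are the summary information worth precomputing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reference (dimension) tables. dim_product is denormalized: the category
-- name is stored inline rather than in a separate category table.
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,   -- e.g. 20240115
    year     INTEGER NOT NULL,
    month    INTEGER NOT NULL,
    day      INTEGER NOT NULL
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category    TEXT NOT NULL
);
-- Fact table: one row of detailed data per sale, keyed by the dimensions.
CREATE TABLE fact_sales (
    date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    quantity    INTEGER NOT NULL,
    amount      INTEGER NOT NULL
);
""")
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
    [(20240115, 2024, 1, 15), (20240220, 2024, 2, 20)],
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "pen", "stationery"), (2, "lamp", "lighting")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(20240115, 1, 3, 30), (20240115, 2, 1, 50), (20240220, 1, 2, 20)],
)

# Summary information precomputed from the detailed data, so that frequent
# monthly queries do not have to aggregate the fact table each time.
conn.execute("""
CREATE TABLE agg_sales_month AS
SELECT d.year, d.month, p.category, SUM(f.amount) AS amount
FROM fact_sales f
JOIN dim_date d    ON d.date_key = f.date_key
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY d.year, d.month, p.category
""")
conn.commit()

monthly = conn.execute(
    "SELECT year, month, category, amount FROM agg_sales_month "
    "ORDER BY year, month, category"
).fetchall()
```

The precomputed table answers monthly questions quickly but has to be refreshed when new facts arrive, which is the compromise between stored summaries and on-demand calculation mentioned in step 2.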

