Principles of the relational model - Databases: design

In the mid-1980s, IBM employee E. Codd, considering how relational databases are represented and worked with, formulated a set of basic principles. These principles became the basis for all modern relational database management systems and are used when developing relational models. They are known as "Codd's rules".

Rule 0

The fundamental rule. A relational DBMS must be able to manage the database entirely through its relational capabilities.

Codd formulated this rule some time after the other basic rules, but it was given the number 0 and became the main rule because it is fundamental: it reflects the need for strict compliance with all the other rules.

The rule requires every DBMS that claims to be relational to follow the laws of relational algebra when working with relations.

Rule 1

Information rule. All information, including the names of tables and fields (columns), is represented explicitly at the logical level and in exactly one way: as values stored in tables.

It follows from this rule that the rows (records) and fields (columns) of a table have no defined order, as established when the essence of the relational model was considered. Since all data in a relational database is represented as flat tables, where each value sits in the cell at the intersection of a record and a field (column), the rule requires that everything be represented in tabular form: both the functional information from the subject domain and the information about the structure of the database, including tables, fields, constraints, defaults, keys, links, and so on.

The same rule implicitly requires that data be in first normal form, that is, that the value stored at the intersection of a record and a field be atomic. Nevertheless, some modern DBMSs that implement object-relational and other data models allow complex data structures as values in some table fields. This somewhat violates the atomicity requirement and this Codd rule, but it is sometimes useful when modern data calls for complex processing.

This rule also implies that the database structure can be represented as a model at the logical level, where tables, fields (columns) and links are shown explicitly and visualized as diagrams, which makes the presentation and storage of the data easier to understand.

Rule 2

Guaranteed access. Logical access to any data element must be provided through the use of a combination of the table name, field name (column) and primary key value.

This rule, based on its wording, assumes the existence of three basic properties of the database:

• each table in the database has a unique name;

• each field (column) has a name that is unique within its table;

• in each table, the value of the primary key is unique.

These properties follow from the principles of data organization and should be established at the logical design stage, when the developer produces the relational data model.

Modern DBMSs have extended this rule with additional conventions: access may be qualified not only by the table name, field (column) name and primary key value, but also by the database name, the user name and so on. The list of objects that determine how data is reached can differ from one DBMS to another, but the objects named in the rule are mandatory for all relational DBMSs.
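The addressing scheme in this rule can be sketched with Python's built-in sqlite3 module. The `employee` table, its columns and the `get_value` helper are hypothetical names chosen for illustration; the sketch only assumes that every table has a primary key.

```python
import sqlite3

# Hypothetical schema: an `employee` table with primary key `id`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO employee (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ann", "Oslo"), (2, "Bob", "Riga")],
)


def get_value(conn, table, column, pk_column, pk_value):
    """Fetch one data element addressed by table name, column name and key value."""
    # Identifiers cannot be bound as parameters, so they are quoted here;
    # a real application should validate them against a list of known names.
    sql = f'SELECT "{column}" FROM "{table}" WHERE "{pk_column}" = ?'
    row = conn.execute(sql, (pk_value,)).fetchone()
    return None if row is None else row[0]


city = get_value(conn, "employee", "city", "id", 2)
print(city)
```

The three arguments that identify the cell (table, column, key value) are exactly the three properties listed above; nothing about row position or storage layout is needed.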

Rule 3

Handling unknown values. A relational DBMS must be able to represent and process unknown values.

It is assumed that during data processing some values will not be supplied by the user, the DBMS, or any other method or data source. Such values are unknown, and in many data tools they lead to errors and unpredictable results.

For example, in a programming language that has no mechanism for setting default values, reading a variable whose value was never defined may return whatever data happens to be in the memory where the variable is located, which leads to unpredictable results in any calculation or other processing of that variable.

Given this peculiarity, the rule requires mechanisms for storing, presenting and processing unknown values, which a DBMS denotes as NULL. This way of representing a missing value is also standard in modern programming languages. How NULL is displayed in place of a value may differ between DBMSs, but as a rule it is shown as an empty area. This makes an empty string hard to tell apart from NULL, so a DBMS may display an empty string as a pair of double quotes or a pair of apostrophes.

Because NULL is not an ordinary value of the subject domain, DBMSs provide specialized tools for converting it to a value of a standard data type, for example a string (the empty string) or a logical value (true, false).
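The difference between NULL and an empty string, and the conversion of NULL to an ordinary value, can be illustrated with a minimal sketch using Python's sqlite3 module. The `contact` table is a hypothetical example, and `COALESCE` is used here as one common conversion tool, not necessarily the one the text has in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, phone TEXT)")
conn.executemany(
    "INSERT INTO contact (id, phone) VALUES (?, ?)",
    [(1, "555-0100"), (2, ""), (3, None)],  # Python None is stored as NULL
)

# An empty string is a known value; NULL is an unknown one.
empty = conn.execute("SELECT id FROM contact WHERE phone = ''").fetchall()
unknown = conn.execute("SELECT id FROM contact WHERE phone IS NULL").fetchall()

# Comparing with NULL using '=' evaluates to NULL (unknown), so no row matches.
eq_null = conn.execute("SELECT id FROM contact WHERE phone = NULL").fetchall()

# COALESCE replaces NULL with a substitute value of an ordinary type.
shown = conn.execute(
    "SELECT id, COALESCE(phone, 'n/a') FROM contact ORDER BY id"
).fetchall()

print(empty, unknown, eq_null, shown)
```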

Rule 4

Access to the data dictionary. The logical structure of the database must itself be relational and stored in the form of relational tables.

The data dictionary holds information about the structure of the database, the objects used in it, procedures, queries and so on. The rule strictly requires that this information be accessible, given appropriate access rights, through relational operations, using the same relational language that is applied to ordinary data.

In essence, this means that all information about tables, fields (columns), links, queries, procedures, users and so on must be represented by the same tables, fields (columns) and links as the ordinary data of the domain, in the terminology of the relational model. This gives the database administrator the opportunity to process information about the database structure with standard DBMS operations, without resorting to any specialized means of data access.

As a result, when it becomes necessary to work with the database structure, whether for administration or from a user application, there is no need to create a specialized application: the standard relational tools and query language are enough.
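As an illustration, SQLite exposes its catalog as a table named `sqlite_master`, which can be read with an ordinary SELECT. The sketch below uses Python's sqlite3 module; the `product` table and `cheap_product` view are hypothetical, and the `pragma_table_info` table-valued function is assumed to be available (it is present in reasonably recent SQLite builds). Other DBMSs use different catalog names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, title TEXT NOT NULL, price REAL)"
)
conn.execute(
    "CREATE VIEW cheap_product AS SELECT id, title FROM product WHERE price < 10"
)

# The catalog is itself a table, queried like any other data.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# Column metadata is also returned in tabular form.
columns = conn.execute(
    "SELECT name, type FROM pragma_table_info('product') ORDER BY cid"
).fetchall()

print(objects)
print(columns)
```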

Rule 5

Comprehensive data sublanguage. There must be at least one language that supports all operations on data.

This rule does not rule out multiple data-handling languages, but it requires one language in which any operation on data can be performed, including operations on the information about the data structure (the data dictionary). That language became SQL, which is the standard for working with data in relational databases. At the same time, almost all DBMSs add their own SQL extensions, which form their own database language.

A single language for working with data must have the following properties:

• a linear syntax, which allows the logic of data processing to be built from simple statements written one after another;

• the ability to be used both interactively and embedded: as stand-alone procedures that are executed as needed and carry the information flow between the user and the DBMS, and as components of an application program, where the language statements become part of the application programming language.

In addition, the language must provide operations that together cover a comprehensive set of actions on data:

• definition of the data structure, where statements create the data dictionary and let the user obtain the necessary information about database objects;

• definition of views, where a statement produces the result of a data selection based on the user's query;

• data manipulation, provided by a limited set of statements, in some cases combined with views, for modifying database information: adding, changing and deleting;

• definition of integrity constraints, which form the rules the DBMS must follow so that the structure and the data stay correct, do not distort information, and at any time return the information the user expects;

• definition of access rights, where statements grant or revoke the right to access data and database objects for any data processing statement and for answering user requests;

• definition of transaction boundaries, where the beginning, completion or cancellation of each transaction is clearly defined; since a transaction contains many statements, this allows data to be processed without leaving "garbage" in the database.

In effect, this rule requires that the language used belong to the class of relational database languages and support creating a database, manipulating the data structure, reading and modifying data, protecting data, assigning access rights, and so on.
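Several of the operation classes listed above can be seen side by side in a short sketch that sends SQL to SQLite from Python's sqlite3 module (the embedded style of use). The `account` table and `rich_account` view are hypothetical. Access rights are not shown, because SQLite has no GRANT or REVOKE statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition, with an integrity constraint declared in the same language.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT NOT NULL, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
# View definition.
conn.execute(
    "CREATE VIEW rich_account AS SELECT id, owner FROM account WHERE balance >= 100"
)

# Data manipulation.
conn.executemany(
    "INSERT INTO account (id, owner, balance) VALUES (?, ?, ?)",
    [(1, "Ann", 150), (2, "Bob", 20)],
)
conn.commit()

# Transaction boundaries: both updates take effect together or not at all.
try:
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 1")
        # This statement violates the CHECK constraint (20 - 50 < 0).
        conn.execute("UPDATE account SET balance = balance - 50 WHERE id = 2")
except sqlite3.IntegrityError:
    pass

balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
rich = conn.execute("SELECT owner FROM rich_account").fetchall()
print(balances, rich)
```

Because the second update is rejected, the first one is rolled back as well, so the balances stay as they were before the transaction started.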

Rule 6

Updating views. All views that are theoretically updatable must be updatable by the system.

Since a view is nothing more than the result of retrieving information from the database by a certain algorithm, this rule requires that adding, changing and deleting data be possible through any view, at the level of the rows (records) it selects and the fields (columns) it uses.

The task stated in the rule is not trivial, because a view can draw on several tables, and modifying it then requires reconciling data across those tables. For this reason, many DBMSs allow data to be updated through a view only when the view is built on a single table.

It is also important that the view contain every column that the data structure definition declares mandatory. Otherwise the modification cannot be carried out: some mandatory columns would receive no value, while the rules require one. This makes it impossible to modify the data through such a view.
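How far a given DBMS goes in supporting this rule varies. As one illustration, SQLite treats views as read-only unless an INSTEAD OF trigger tells it how to map the change to the base table. The sketch below uses Python's sqlite3 module; the `person` table, `person_email` view and trigger name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
)
conn.execute("INSERT INTO person (id, name, email) VALUES (1, 'Ann', 'ann@example.org')")
conn.execute("CREATE VIEW person_email AS SELECT id, email FROM person")

# Without further help, SQLite rejects a direct UPDATE on a view.
try:
    conn.execute("UPDATE person_email SET email = 'a@example.org' WHERE id = 1")
    direct_update_ok = True
except sqlite3.OperationalError:
    direct_update_ok = False

# An INSTEAD OF trigger defines how the change maps to the base table.
conn.execute(
    "CREATE TRIGGER person_email_upd INSTEAD OF UPDATE ON person_email "
    "BEGIN UPDATE person SET email = NEW.email WHERE id = OLD.id; END"
)
conn.execute("UPDATE person_email SET email = 'a@example.org' WHERE id = 1")

email = conn.execute("SELECT email FROM person WHERE id = 1").fetchone()[0]
print(direct_update_ok, email)
```

Note that the view omits the mandatory `name` column, so an insert routed through it could not supply a value for that column. This is the situation described in the paragraph above.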

Rule 7

High-level insert, update and delete. Data modification operations must be supported not only for a single row (record) but also for any set of rows (records).

Because the relational model works with sets, the operations of adding, changing and deleting data should process, uniformly and with a single statement, whatever set of table rows (records) is selected for them. A single row (record) is simply a set of one element, and the selection may also be an empty set; both cases are handled by the same statement.
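A minimal sketch of this uniformity, using Python's sqlite3 module and a hypothetical `item` table: the same UPDATE statement is applied to a set of two rows, a set of one row, and an empty set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, category TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO item (id, category, price) VALUES (?, ?, ?)",
    [(1, "book", 10), (2, "book", 20), (3, "toy", 5)],
)

update_sql = "UPDATE item SET price = price * 2 WHERE category = ?"

# rowcount reports how many rows the statement modified.
many = conn.execute(update_sql, ("book",)).rowcount  # a set of two rows
one = conn.execute(update_sql, ("toy",)).rowcount    # a set of one row
none = conn.execute(update_sql, ("food",)).rowcount  # an empty set, not an error

prices = conn.execute("SELECT id, price FROM item ORDER BY id").fetchall()
print(many, one, none, prices)
```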

Rule 8

Physical data independence. Application programs should not depend on how information is stored and placed in the database.

This rule divides an application solution into two levels: the application program and the database. Each level is conditionally independent of the other: when the physical location of the database on storage media is adjusted, or the hardware of computers and servers is changed, the way application programs access data in the database should not have to change at all.

All the parameters of the application's interaction with the data are handled at the DBMS level, and the DBMS alone determines how data and its structures are stored within the operating system, file system and hardware configuration.
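One small, hedged illustration of this separation: adding an index changes how the DBMS physically locates rows, but the application's query text and its results stay the same. The sketch uses Python's sqlite3 module; the `orders` table, the index name and the `totals_for` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)",
    [(1, "Ann", 30), (2, "Bob", 45), (3, "Ann", 15)],
)


def totals_for(conn, customer):
    """Application code: it names tables and columns, not storage structures."""
    rows = conn.execute(
        "SELECT total FROM orders WHERE customer = ? ORDER BY id", (customer,)
    )
    return [total for (total,) in rows]


before = totals_for(conn, "Ann")

# A physical change: the DBMS gets an extra access structure.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

after = totals_for(conn, "Ann")
print(before, after)
```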

Rule 9

Logical data independence. Application programs should not depend on changes to the database structure that do not alter the stored data.

Logical independence holds only under a number of conditions, so applying the rule requires taking into account how the application program interacts with the database.

For example, according to the rule, adding a new field (column) to a table should not affect the functioning of the application program. This is because the application program must access data by field (column) name, not by a column's ordinal position in the table or in a query result. Adding a field (column) therefore does not affect existing procedures for processing and retrieving data. There is, however, a subtlety that limits this operation. If the added field (column) has neither a default value nor permission to store an empty (unknown) value, then existing procedures that add a record (row) to the table will fail until they are corrected to supply a value for the new field (column).
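The column-addition case can be sketched with Python's sqlite3 module. The `customer` table and the two helper functions are hypothetical stand-ins for "existing procedures"; the new column is given a default value, which is the condition under which the old code keeps working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def add_customer(conn, name):
    # Existing procedure: the column list is explicit, so it names only what it knows.
    conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))


def customer_names(conn):
    # Existing query: columns are addressed by name, not by position.
    return [name for (name,) in conn.execute("SELECT name FROM customer ORDER BY id")]


add_customer(conn, "Ann")

# Structural change: a new mandatory column that has a default value.
conn.execute("ALTER TABLE customer ADD COLUMN status TEXT NOT NULL DEFAULT 'active'")

add_customer(conn, "Bob")     # still works: the default fills `status`
names = customer_names(conn)  # unaffected by the new column
statuses = conn.execute("SELECT status FROM customer ORDER BY id").fetchall()
print(names, statuses)
```

Code that relied on `SELECT *` and positional access, or on an INSERT without a column list, would not enjoy the same independence.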

Deleting a field (column) is different. It results in the loss of certain data and may disrupt the application program, because the deleted field (column) is presumably used when processing and selecting data, and those operations will no longer work.

At the same time, changing the structure of the database should not affect the other stored data, and DBMSs ensure this. Adding a new field (column) affects no existing data: all information stored in the database remains unchanged. Deleting a field (column) does not affect the data stored in other fields (columns) and tables; only the information stored in the deleted field (column) is lost. The same applies when the processing rules for a field (column) are changed: only the data in that field (column) is affected.

Given these features of working with the database structure, DBMSs implement specialized mechanisms that block structural changes which would violate these conditions.

Rule 10

Integrity independence. The database should provide the ability to define integrity rules using the language of relational databases.

Implementing this rule in a DBMS requires a set of objects that make certain operations on data impossible under certain conditions. Such database objects are defaults, constraints and triggers. They make it possible either to check the correctness of the values stored in the database, as defaults and constraints do, or to define the operations that must be performed to control or modify the data when a record (row) is added, changed or deleted, as triggers do.

The rule specifies that all integrity information must be stored in the data dictionary and processed with the statements of the relational database language, that is, SQL. The rule also fixes where integrity rules live: at the DBMS level rather than in the application program, which in turn supports the logical independence of the application program from the database.
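The three kinds of objects named above (defaults, constraints, triggers) can be declared in SQL and enforced by the DBMS rather than by application code. The following is a minimal sketch with Python's sqlite3 module; the `stock` and `stock_log` tables and the `stock_audit` trigger are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    quantity INTEGER NOT NULL DEFAULT 0 CHECK (quantity >= 0)
);
CREATE TABLE stock_log (stock_id INTEGER, old_qty INTEGER, new_qty INTEGER);
CREATE TRIGGER stock_audit AFTER UPDATE OF quantity ON stock
BEGIN
    INSERT INTO stock_log VALUES (OLD.id, OLD.quantity, NEW.quantity);
END;
""")

conn.execute("INSERT INTO stock (id, name) VALUES (1, 'bolt')")  # default applies
conn.execute("UPDATE stock SET quantity = 5 WHERE id = 1")       # trigger fires

try:
    conn.execute("UPDATE stock SET quantity = -1 WHERE id = 1")  # constraint rejects
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

quantity = conn.execute("SELECT quantity FROM stock WHERE id = 1").fetchone()[0]
log = conn.execute("SELECT * FROM stock_log").fetchall()

# The rules themselves are stored in the catalog and can be queried as data.
trigger_names = [
    name for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'trigger'")
]
print(rejected, quantity, log, trigger_names)
```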

Rule 11

Distribution independence. The location or distribution of the database on physical media must not affect the functioning of the application program.

Before databases took their present form, data was stored in separate files in the file system, and moving the files to another device often made the application program unusable unless it was moved to the same device. This created serious problems for networked access to data and for multi-user work. The workarounds were systems for configuring data sources at the level of the operating system where the application programs run, and network file servers.

Relational DBMSs that implement this rule made data available through communication between the application program and the DBMS, and it is the DBMS that determines how data is distributed across physical media and computer systems. As a result, a modern relational database can be physically split across several disks or computer systems while preserving the integrity of the data structure. It can also be moved to another server, after which re-establishing the connection between the application program and the DBMS restores the program's full ability to work with the data.

In particular, modern cloud storage systems built on database technologies do not depend on how they are distributed geographically or across disk devices. Applications that work with them need only know the location of the data access point, which makes it possible to work with the stored data from any place or device without additional configuration.

Rule 12

Linguistic consistency. A low-level data access language should not ignore the security and data integrity rules supported by a higher-level language.

A low-level language provides access to data that bypasses the specialized data processing statements. Where a DBMS offers such access, it is implemented by giving the user direct access to the data in the cells of the database tables. Such a language processes no more than one record (row) and one field (column) of a specific table at a time.

The rule requires the DBMS to enforce integrity and access protection even during low-level access directly to the stored information. This is achieved by DBMS mechanisms that consult the data dictionary whenever a low-level action is performed. The dictionary stores information about the structure of the stored data and the rules imposed on working with it, which prevents an illegal operation even with such low-level access.

Although these rules relate mostly to how a DBMS implements data technology, they give a broader understanding of how a data model is organized and which parameters it should describe. Besides the data structure, a relational model specifies the types and cardinalities of links and defines some of the integrity rules that will be used in the database at the physical level. Defaults and constraints specified for attributes also serve integrity and create the prerequisites for applying the rules discussed correctly. The same holds for many other model parameters, including the definition of trigger actions, the formulation of views, and the rules for keys (primary, foreign, and so on).
