Formation of the domain model - Information Technology

Forming a domain model

Modern software tools let many users work with a computer at a basic level, for example when using editors. However, in-depth work with information (collecting it, building a database, processing it, and presenting it for further use) raises considerable difficulties. The reason is that one cannot work in a computer environment in natural language. All information describing a particular subject area must therefore be abstracted and formalized in a certain way.

The main areas of formalization of information about the domain are:

• The theory of classification, based on the taxonomic and meronomic description of information. The taxonomic description is based on set theory, and the meronomic description is implemented through a strictly formalized definition of classes;

• The theory of measurements, which offers a basis for qualitative and quantitative measurements through classification and ordinal scales;

• Semiotics, which studies sign systems in terms of syntactics, semantics and pragmatics.

Before going directly to the questions of formalization and the abstracted description, let us briefly touch upon questions of terminology.

In general, the concept of information should be tied to a certain subject area, whose properties it reflects. In a narrower sense, the concept of information is associated with a specific object. Information is relatively independent of its carrier: it can be transformed and transmitted through various physical media using a variety of physical signals, regardless of its content, i.e. its semantics, which has been the central issue of many studies, including philosophical ones. Information about any material object can be obtained by observation, by a full-scale or computational experiment, or by logical inference. Accordingly, one speaks of a priori information, available before the experiment, and a posteriori information, obtained as a result of the experiment.

The subject area is the part of the real world that must be reflected in the information base.

Facts are the result of observing the state of the domain.

Data is a kind of information that is highly formatted, unlike the freer structures characteristic of speech, text and visual information.

Information base (database) - a set of data intended for joint use.

Knowledge is the result of theoretical and practical human activity, reflecting the accumulation of previous experience and characterized by a high degree of structuring.

Knowledge has three main components:

• declarative (factual) knowledge, which gives a general description of an object and cannot be used without preliminary structuring in a specific subject area;

• conceptual (system) knowledge, which contains, in addition to the first component, the relationships between concepts and the properties of concepts;

• procedural (algorithmic) knowledge, which makes it possible to obtain a solution algorithm.

Object (thing) - any material thing, an object of cognition. In logic, an object is anything our thought is directed at; anything that can somehow be perceived, named, etc. In this sense, a judgment, a concept or an inference is also an object. In mathematical logic, objects are denoted by symbols: object constants and object variables.

Property - what is inherent in objects and either distinguishes them from other objects or makes them similar to other objects. Each object has an infinite number of properties. Properties manifest themselves in the interaction of objects.

Feature - everything in which objects or phenomena are similar to or differ from one another; an indicator or aspect of an object or phenomenon by which it can be recognized, defined or described.

Attribute (Latin attributum - intended, endowed, attached) - an inalienable, essential, necessary property or feature of an object or phenomenon, without which it cannot exist or be itself, as opposed to accidental, transitory, non-essential properties, or accidents.

Thus, the current state of information technology requires a transition from an informational description of the domain to its representation at the data level, carried out through decomposition, abstraction and aggregation.

Decomposition is the partitioning of a system (program, task) into components that, when combined, solve the original task.

Abstraction makes it possible to choose the components for decomposition correctly.

Abstraction is an efficient way of carrying out decomposition, implemented by modifying the list of components.

Abstraction involves a thoughtful selection of components. The process of abstraction can be considered as a generalization. It allows you to forget about differences and treat objects and phenomena as if they were equivalent.

Identifying what processes and phenomena have in common is the basis of classification. A hierarchy of abstractions is in fact a classification scheme.

Aggregation is the process of combining objects into a group, not necessarily for classification purposes. Aggregation is always performed for some specific purpose.

Methods of abstraction:

• abstraction through parameterization;

• abstraction through a specification.

Abstraction through parameterization - the selection of formal parameters that can be replaced with actual ones in different contexts.

Selecting formal parameters lets you abstract from a specific application and relies on properties that specific applications have in common.

Abstraction through specification lets you disregard the internal structure and rely only on knowledge of the externally observable properties (the result).
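The two methods of abstraction can be illustrated with a minimal Python sketch. All names here (`lessons_per_week`, `Sorter`, `insertion_sort`, `builtin_sort`, `earliest_slot`) are hypothetical and chosen only for illustration; they do not come from the text.

```python
from typing import Protocol


def lessons_per_week(lessons_per_day: int, study_days: int) -> int:
    """Abstraction through parameterization.

    lessons_per_day and study_days are formal parameters; every call
    replaces them with actual values for a particular context.
    """
    return lessons_per_day * study_days


class Sorter(Protocol):
    """Abstraction through specification.

    Callers rely only on the stated result: a new list holding the same
    items in ascending order. How this is achieved stays hidden.
    """

    def __call__(self, items: list[int]) -> list[int]: ...


def insertion_sort(items: list[int]) -> list[int]:
    """One implementation that satisfies the Sorter specification."""
    result: list[int] = []
    for item in items:
        pos = len(result)
        # Walk left until the preceding element is not larger than item.
        while pos > 0 and result[pos - 1] > item:
            pos -= 1
        result.insert(pos, item)
    return result


def builtin_sort(items: list[int]) -> list[int]:
    """Another implementation with the same external behaviour."""
    return sorted(items)


def earliest_slot(free_slots: list[int], sort: Sorter) -> int:
    """Uses a sorter purely through its specification."""
    return sort(free_slots)[0]
```

Because `earliest_slot` depends only on the specification, either sorter can be passed in without changing the caller.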

A data model is a tool used for abstraction. A conceptual model is an abstracted description of the domain.

With the terminology in place, we can turn to the problems of designing information support. The first of these is the problem of analyzing the domain.

When analyzing a domain, it is common to distinguish three stages:

• analysis of requirements and information needs;

• definition of information objects and the relationships between them;

• construction of a conceptual model of the domain.

The stage of the analysis of requirements and information needs includes the following tasks:

• defining the list of tasks for retrieving, processing, storing, transmitting and presenting (including documenting) information;

• defining the requirements for the composition, structure and forms of presentation of information;

• predicting possible changes in information resources, both in volume and in content.

Consider an example of domain analysis. We choose a field of activity familiar to all students: the educational process. Suppose we are asked to develop an information system called "Schedule of lessons".

Each participant has their own idea of the information in this subject area. Our task is to generalize these views, which are obtained by interviewing the participants in the information processes and by analyzing documents. It is desirable to record all actions as documents, on paper or in computer memory. The form of the record can be anything: a flowchart, a block diagram, a table, etc. For example, the following document types can be used: the scheme of external information links (Figure 7.5), the detailed action scheme (Figure 7.6), the data flow scheme (Figure 7.7), etc.

Fig. 7.5. Scheme of external information links

Fig. 7.6. Detailed action scheme

Let us apply the proposed document types.

1. Scheme of external information links. Action - Work_with_Schedule.

2. Detailed action scheme. Action - Work_C_Record.

3. Data flow scheme. Action - Work_with_Schedule.

4. Classification schemes: object - users of the schedule (Figure 7.8, a); object - rooms (Figure 7.8, b); data - requests to the schedule (Figure 7.8, c).

Fig. 7.7. Data flow scheme

Fig. 7.8. Classification schemes

5. Detailing scheme. Data - help on the schedule (Figure 7.9).

6. Classification scheme. Data - help on the schedule (Figure 7.10).

The thoroughness of the analysis stage determines the future effectiveness of the information system, the possibility of further developing its information resources, and its adaptability to changing requirements.

Fig. 7.9. Detailing scheme

After analyzing the requirements and information needs, you can move on to the next stage: defining the information objects and the relationships between them.

The main task of this stage is the division of the subject area into its component parts by decomposition, carried out according to certain rules.

At present there are two main approaches to this process, which differ in their decomposition criteria: functional-modular (structural) and object-oriented.

The functional-modular approach is based on the principle of algorithmic decomposition: functional elements are identified and a strict order of actions is established. In other words, it is a hierarchical approach in which the functional actions are separated out first, then the independent components, which are then elaborated in detail.

Fig. 7.10. Classification scheme

The object-oriented approach is based on object decomposition with a description of the behavior of the system in terms of the interaction of objects.

The main disadvantage of the functional-modular approach is that information flows in one direction only and feedback is insufficient. When the system requirements change, this leads to a complete redesign, so errors made in the early stages have a profound impact on the duration and cost of development. Another important problem is the heterogeneity of the information resources used in most information systems. For these reasons, the object-oriented approach is currently the most widespread.
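The difference between the two decomposition criteria can be sketched on a toy scheduling task. This is only an illustration under stated assumptions: the function and class names (`collect_requests`, `assign_rooms`, `Room`, `Schedule`) and the room-booking rule are hypothetical and are not taken from the text.

```python
from dataclasses import dataclass, field
from typing import Optional


# Functional-modular style: the task is split into steps executed in a
# fixed order, and data flows one way from step to step.
def collect_requests(raw: list[str]) -> list[str]:
    """Step 1: clean the incoming lesson requests."""
    return [r.strip() for r in raw if r.strip()]


def assign_rooms(requests: list[str], rooms: list[str]) -> dict[str, str]:
    """Step 2: pair each request with the next room in the list."""
    return dict(zip(requests, rooms))


def build_schedule_functional(raw: list[str], rooms: list[str]) -> dict[str, str]:
    """Top-level function that fixes the order of the steps."""
    return assign_rooms(collect_requests(raw), rooms)


# Object-oriented style: the domain is split into interacting objects,
# each of which keeps its own state and behaviour.
@dataclass
class Room:
    name: str
    booked_by: Optional[str] = None

    def book(self, lesson: str) -> bool:
        """The room itself decides whether it can accept a lesson."""
        if self.booked_by is not None:
            return False
        self.booked_by = lesson
        return True


@dataclass
class Schedule:
    rooms: list[Room] = field(default_factory=list)

    def place(self, lesson: str) -> Optional[Room]:
        """Ask each room in turn; return the one that accepted, if any."""
        for room in self.rooms:
            if room.book(lesson):
                return room
        return None
```

In the second variant a new booking rule changes only `Room.book`, whereas in the first it may require reworking the chain of functions.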

The basic concepts used in decomposing a domain with the object-oriented approach are: object, class, instance, attribute, relationship between objects, and relationship between attributes.

An object is an abstraction of a set of real-world things that share the same characteristics and laws of behavior. An object is a typical, unspecified element of such a set. The main characteristic of an object is the composition of its attributes (properties). In other words, an object can be described as a fact, person, event or thing defined by a collection of data. Put simply, an object is whatever answers the question "who?" or "what?". An object can be real (for example, a person, a thing, a geographical location) or abstract (for example, an event, a customer's account, an academic course).

An attribute is the informational representation of a property of an object.

An instance of an object is a specific, defined element of the set. For example, the object may be the state registration number of a car, and an instance of this object is the number K 173 PA.

A class is a set of real-world objects connected by a common structure and behavior.

A class element is a specific element of a given set. An example of a class is the class of vehicle registration numbers.

Summarizing these definitions, we can say that an object is a typical representative of a class, and that the terms "instance of an object" and "class element" are equivalent. Fig. 7.11 shows the relationships between classes, objects and real-world objects.

A relationship between objects (attributes) is the informational representation of a functional, kinship, generic or other dependency (subordination).

Fig. 7.11. Relationships between classes, objects, and objects in the real world
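The terms object, attribute and instance can be mapped loosely onto Python constructs. In the sketch below, the text's "object" (a typical registration number) corresponds to a class definition, its attributes to fields, and an "instance of the object" to a concrete value. This mapping, and the field names `prefix`, `digits` and `suffix`, are assumptions made for illustration only.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RegistrationNumber:
    """The information object: a typical, unspecified registration number."""

    prefix: str  # attribute: leading letter(s)
    digits: str  # attribute: numeric part
    suffix: str  # attribute: trailing letter(s)

    def display(self) -> str:
        """Render the number in its usual written form."""
        return f"{self.prefix} {self.digits} {self.suffix}"


# The composition of attributes is the main characteristic of the object.
attribute_names = [f.name for f in fields(RegistrationNumber)]

# An instance: one specific, fully defined element of the set.
plate = RegistrationNumber(prefix="K", digits="173", suffix="PA")
```

Here `plate.display()` yields `K 173 PA`, the instance mentioned in the text.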

When selecting information objects, you can follow the following sequence of actions:

• forming the classes into which the data to be stored can be divided;

• assigning a unique name to each class;

• identifying information objects by analyzing information flows and documentary sources and by interviewing the participants in the information interaction;

• assigning a unique name to each information object and checking its syntax and semantics;

• defining the set of characteristics of each object and, on this basis, forming its composition of attributes;

• assigning unique names to the selected attributes;

• setting constraints on objects and their attributes: quantitative constraints (the range of values: maximum, minimum, etc.) and integrity constraints (the invariance of the object's state over the time interval under consideration).
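The last step of the list, constraints on objects and attributes, can be sketched as follows. The `Lesson` object, its attributes and the slot range 1 to 8 are hypothetical. A frozen dataclass is used here as one simple way to model invariance of state; it is not a technique prescribed by the text.

```python
from dataclasses import FrozenInstanceError, dataclass

# Hypothetical quantitative constraint: lesson slots are numbered 1..8.
MIN_SLOT, MAX_SLOT = 1, 8


@dataclass(frozen=True)  # frozen: state cannot change after creation
class Lesson:
    subject: str
    slot: int
    room: str

    def __post_init__(self) -> None:
        # Quantitative constraint: the value must lie in the allowed range.
        if not MIN_SLOT <= self.slot <= MAX_SLOT:
            raise ValueError(f"slot must be in {MIN_SLOT}..{MAX_SLOT}")
        # A simple constraint on another attribute: a name is required.
        if not self.subject:
            raise ValueError("subject must not be empty")


def is_valid_lesson(subject: str, slot: int, room: str) -> bool:
    """Report whether the given attribute values satisfy the constraints."""
    try:
        Lesson(subject, slot, room)
    except ValueError:
        return False
    return True


def try_move(lesson: Lesson, new_slot: int) -> bool:
    """Attempt an in-place change; the frozen object refuses it."""
    try:
        lesson.slot = new_slot  # type: ignore[misc]
    except FrozenInstanceError:
        return False
    return True
```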

In the process of reflection between the states of interacting objects, a certain relationship arises. Information as a result of the reflection of one object by another reveals the degree of correspondence of their states.

There are three types of relationship: one-to-one (1:1), one-to-many (1:M) and many-to-many (M:N).

Examples of these relationships:

A one-to-one relationship (1:1) reflects an unambiguous correspondence between objects (patient Ivanov occupies bed 73 - bed 73 is occupied by patient Ivanov; student Petrov has record book No. 131056 - record book No. 131056 belongs to student Petrov).

A one-to-many (1:M) or many-to-one (M:1) relationship reflects an ambiguous dependence of one object on the other (patient Ivanov is in ward No. 6 - ward No. 6 holds patients Ivanov, Petrov, Sidorov, Mikhailov; student Petrov studies in group No. 131 - group No. 131 includes students Petrov, Maximov, Korobkin, Ilyin, Kruglov and others).

A many-to-many relationship (M:N) reflects an ambiguous dependence of the objects on each other (patient Ivanov is treated by doctors Sokolov, Vorobyov, Voronov - doctor Sokolov treats Ivanov, Petrov, Sidorov; student Petrov attends the lectures of professors Yashin, Vasiliev, Volkov - professor Yashin lectures to students Petrov, Maximov, Korobkin, Ilyin, Kruglov, etc.).
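The three relationship types can be represented with plain Python data structures. This is a minimal sketch that reuses a few names from the examples above; the variable and function names are hypothetical.

```python
# 1:1 - each student has exactly one record book, and each record book
# belongs to exactly one student, so the mapping can be inverted.
record_book_of = {"Petrov": "131056"}
owner_of_book = {book: student for student, book in record_book_of.items()}

# 1:M - a student belongs to one group, but a group has many students.
group_of = {"Petrov": "131", "Maximov": "131", "Korobkin": "131"}


def students_in(group: str) -> list[str]:
    """Going from the 'one' side to the 'many' side yields a list."""
    return sorted(s for s, g in group_of.items() if g == group)


# M:N - neither side determines the other, so the relationship is
# stored as a set of (student, professor) pairs.
attends = {
    ("Petrov", "Yashin"),
    ("Petrov", "Vasiliev"),
    ("Petrov", "Volkov"),
    ("Maximov", "Yashin"),
}


def professors_of(student: str) -> list[str]:
    """All professors whose lectures the student attends."""
    return sorted(p for s, p in attends if s == student)


def students_of(professor: str) -> list[str]:
    """All students who attend the professor's lectures."""
    return sorted(s for s, p in attends if p == professor)
```

Only the 1:1 mapping gives a single answer in both directions; the other two return a list in at least one direction.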

Identifying these relationships is extremely important, since 1:M and M:N relationships carry an internal ambiguity that affects data search and modification operations. To overcome this ambiguity when the logical model is implemented, redundant information has to be introduced.
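One common way to introduce such additional information in a relational database is a separate link table that stores pairs of keys. The text does not name this technique, so the sketch below is an assumption about what it could look like, using Python's standard `sqlite3` module. The table and column names are hypothetical.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database with an M:N patient-doctor relationship."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE doctor  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Link table: each row repeats two keys that already exist elsewhere.
        CREATE TABLE treatment (
            patient_id INTEGER NOT NULL REFERENCES patient(id),
            doctor_id  INTEGER NOT NULL REFERENCES doctor(id),
            PRIMARY KEY (patient_id, doctor_id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?)",
        [(1, "Ivanov"), (2, "Petrov"), (3, "Sidorov")],
    )
    conn.executemany(
        "INSERT INTO doctor VALUES (?, ?)",
        [(1, "Sokolov"), (2, "Vorobyov"), (3, "Voronov")],
    )
    conn.executemany(
        "INSERT INTO treatment VALUES (?, ?)",
        [(1, 1), (1, 2), (1, 3), (2, 1), (3, 1)],
    )
    return conn


def doctors_of(conn: sqlite3.Connection, patient: str) -> list[str]:
    """Doctors treating the given patient, resolved through the link table."""
    rows = conn.execute(
        "SELECT d.name FROM doctor d "
        "JOIN treatment t ON t.doctor_id = d.id "
        "JOIN patient p ON p.id = t.patient_id "
        "WHERE p.name = ? ORDER BY d.name",
        (patient,),
    )
    return [row[0] for row in rows]


def patients_of(conn: sqlite3.Connection, doctor: str) -> list[str]:
    """Patients treated by the given doctor, resolved through the link table."""
    rows = conn.execute(
        "SELECT p.name FROM patient p "
        "JOIN treatment t ON t.patient_id = p.id "
        "JOIN doctor d ON d.id = t.doctor_id "
        "WHERE d.name = ? ORDER BY p.name",
        (doctor,),
    )
    return [row[0] for row in rows]
```

The link table splits the M:N relationship into two 1:M relationships, so each query can be answered in either direction.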

The final phase of the analysis of the domain is the design of a certain information structure in the form of a conceptual model. To build a conceptual model, aggregation and generalization operations are used.

Aggregation combines information objects into a single object on the basis of the semantic links between them. For example, an airplane of type X transports cargo from departure point A to destination B. Using aggregation, we create the information object FLIGHT with the attributes "aircraft type", "departure point", "destination" and "airplane flight".

Generalization combines related information objects into a generic object. For example, the objects CAR, AIRCRAFT, SHIP, BICYCLE and MOTORCYCLE are combined into the object VEHICLE. One of the attributes of this object is "vehicle type".
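Both operations can be sketched in Python. The class names `Flight` and `Vehicle`, the attribute names and the set of vehicle types (including "car") are illustrative assumptions based on the two examples above.

```python
from dataclasses import dataclass


# Aggregation: semantically linked pieces of information (the aircraft,
# where it departs from, where it goes) become one information object.
@dataclass(frozen=True)
class Flight:
    aircraft_type: str
    departure_point: str
    destination: str


# Generalization: related objects collapse into one generic object,
# and the original kind survives only as an attribute value.
VEHICLE_TYPES = frozenset({"car", "aircraft", "ship", "bicycle", "motorcycle"})


@dataclass(frozen=True)
class Vehicle:
    vehicle_type: str

    def __post_init__(self) -> None:
        if self.vehicle_type not in VEHICLE_TYPES:
            raise ValueError(f"unknown vehicle type: {self.vehicle_type}")


# Placeholder values X, A and B follow the example in the text.
flight = Flight(aircraft_type="X", departure_point="A", destination="B")
fleet = [Vehicle(kind) for kind in sorted(VEHICLE_TYPES)]
```

After generalization every element of `fleet` has the same structure and differs only in its `vehicle_type` value.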

The conceptual design stage is distinctive because it requires knowledge of both the domain and the design methodology at the same time. Different models are typically used (the entity-relationship model, binary data models, semantic networks, infological data models, etc.). A drawback is that the results may be inconsistent, both across different models and within a team of designers. A distinguishing feature of the conceptual model is that it is oriented, on the one hand, toward the information interests of the user and, on the other, toward the information needs of the domain itself. Users can choose between two models: the entity-relationship model and a simple relational model with functional dependencies between attributes.

One common model is the entity-relationship model; the literature also uses the terms "ER model" and "Chen model".

The basic structures in the ER model are entity types and relationship types (Figure 7.12). A relationship type differs from an entity type in that its instances depend on the existence of instances of other types.

Example: PERSON is an entity type, whereas IN MARRIAGE is not, since an instance of the latter cannot exist unless two persons exist. A relationship type can therefore be regarded as an aggregate of two or more entity types.
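The marriage example can be sketched as follows. The classes `Person` and `Marriage` and the helper `try_marry` are hypothetical names. The extra rule that the two persons must be distinct is an assumption added for the sketch, not something the text states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """Entity type: an instance can exist on its own."""

    name: str


@dataclass(frozen=True)
class Marriage:
    """Relationship type: an aggregate of two entity instances."""

    spouse_a: Person
    spouse_b: Person

    def __post_init__(self) -> None:
        # A relationship instance cannot exist without its participants.
        if not isinstance(self.spouse_a, Person) or not isinstance(self.spouse_b, Person):
            raise TypeError("a marriage needs two existing persons")
        # Assumption for this sketch: the participants are two different people.
        if self.spouse_a == self.spouse_b:
            raise ValueError("the two persons must be distinct")


def try_marry(a: object, b: object) -> Optional[Marriage]:
    """Return a Marriage if both participants exist, otherwise None."""
    try:
        return Marriage(a, b)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return None
```

A `Person` can be created alone, while a `Marriage` can only be built from two existing `Person` instances. This is the dependence that separates a relationship type from an entity type.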

The relational model is the most widely used in practice in modern information systems, so it is worth considering its capabilities. Most DBMSs on the market are relational or object-relational. The semantic diagram of the relational model is shown in Fig. 7.13, and an example of a relational model is shown in Fig. 7.14.

Fig. 7.12. ER diagram elements

Fig. 7.13. Semantic diagram of the relational model

Fig. 7.14. Example of a relational model
