Induction and Statistical Analysis - History, Philosophy...

1.8. Induction and statistical analysis

The experiment delivers the facts. As a rule, they do not coincide with predictions. This circumstance is taken into account already at the planning stage of the experiment. Researchers understand that after the experiment comes the time for induction, which, like deduction and adduction, is also full of surprises. The researcher prepares in advance to overcome new difficulties.

Do the measured parameters have exact values? It seems that each attribute (parameter) can have a so-called exact, or point, value, but this opinion can hardly be justified. If, for example, the current is 4.0 ± 0.2 amperes (A), there is no reason to believe that it has a point value somewhere in the interval 3.8-4.2 A. A very common mistake is to assert arbitrarily that exact values exist and then to determine the absolute and relative errors allegedly made in the measurement. Thus, for the example above, the absolute error is said to be 0.2 A and the relative error 0.05 (that is, 0.2 A / 4.0 A).

Errors can, of course, occur. But in conceptual analysis they should not receive first attention, because their significance is secondary. Even in the absence of errors, the value of an attribute must be correlated with a certain interval: the value of an attribute is always an interval. The exact value, for its part, is the result of a simplifying operation. A point value does not so much distort the interval value as oversimplify it. It is the interval value that the corresponding instrument is designed to record.

Sample mean, probability, mathematical expectation and uncertainty. So far we have considered problems associated with determining the value of a parameter that has a narrow interval. But for so-called probabilistic quantities this is clearly not enough, because here the concepts of probability and mathematical expectation come to the fore. Both concepts seem rather unusual, but their nature can be interpreted quite clearly.

The sample mean is key to understanding the concepts of probability and mathematical expectation. Suppose that a quantity Y is under consideration. Let y_i denote the value of Y measured in the i-th trial, and let n be the total number of trials in the corresponding sample. The sample mean A[Y] is then determined by the formula

$$A[Y] = \frac{1}{n} \sum_{i=1}^{n} y_i.$$

Determining the sample mean requires the experimenter to be highly competent in selecting appropriate samples and determining their characteristics, in particular their stability. Here, however, we will not be distracted by these subtleties. We note only the main point about the nature of mathematical expectation and probability from the experimenter's standpoint: both are limits of the sample mean. In the case of mathematical expectation (E), we are dealing with the value of the parameter being measured: E[Y] is the limit of A[Y], determined on the basis of not one but many samples. In the case of probability (P), we are dealing with the limit of the sample mean of the relative frequency of outcomes:

$$P = \lim_{n \to \infty} \frac{m}{n},$$

where m is the number of favorable outcomes out of the total n.
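As a rough illustration of the two limits described above, the following sketch computes a sample mean and a relative frequency for samples of growing size. The experiment (rolls of a fair six-sided die), the function names and the seed are assumptions made purely for the example; a real experimenter works with measured data and, as the text stresses, only ever with a finite number of trials.

```python
import random


def sample_mean(values):
    """Sample mean: the sum of the measured values divided by their number."""
    return sum(values) / len(values)


def relative_frequency(values, is_favorable):
    """Share of favorable outcomes among all trials in the sample."""
    favorable = sum(1 for v in values if is_favorable(v))
    return favorable / len(values)


def simulate(n, seed=0):
    """Hypothetical experiment: roll a fair six-sided die n times."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n)]


if __name__ == "__main__":
    # For a fair die the expectation is 3.5 and P(six) is 1/6; the sample
    # values only approach these numbers as the sample grows.
    for n in (10, 100, 10_000, 100_000):
        rolls = simulate(n)
        mean = sample_mean(rolls)
        freq = relative_frequency(rolls, lambda v: v == 6)
        print(f"n={n:>6}  sample mean={mean:.3f}  frequency of six={freq:.3f}")
```

The printed values depend on the seed; what matters is only the tendency of both columns to stabilize as n grows.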

Since the number of favorable outcomes is determined on the basis of many samples, it too acts as an averaged value. It seems that the experimenter, for all his efforts, cannot determine either the mathematical expectation of a quantity or the probability of its occurrence, since the limits considered above presuppose an infinite number of both trials and samples. Being short of time, the experimenter must confine himself to a finite number of trials. He seems entitled to say that he should strive to approach the exact (true) values of the mathematical expectation and the probability as closely as possible. However, this exact value is introduced a priori, which should put the experimenter on guard, since apriorism leads to metaphysics. The paradox of the unattainability of the exact values of mathematical expectation and probability can be fully overcome if one carefully takes into account, on the one hand, the status of the concepts and, on the other, the correlation of particular stages of conceptual transduction within the experimental sciences. Let us consider this apparent paradox using the analysis of probability as an example.

There are various approaches to understanding the nature of probability. Particularly puzzling is the seemingly complete impossibility of reconciling the understanding of probability as a relative frequency determined in experiment with its mathematical counterpart. In the latter case, probability is understood either, following R. von Mises, as the limit of the relative frequency or, following A. N. Kolmogorov, as a measure defined on an algebra of sets. The paradox arises insofar as mathematical entities are taken to be really existing idealized objects and their attributes. No such entities can be detected in experiment. On this view, there is only one way to avoid paradoxical judgments: to hold that the stage of experimentation is followed by a stage of idealization, and that idealization is supposedly what is most genuine in science.

The way out is to recognize mathematical objects not as idealizations but as formalizations. Attentive researchers know that mathematics cannot be carried over directly into the experimental sciences. The mathematical apparatus is invariably checked for its soundness. This aspect is extremely important for understanding mathematical modeling. For all its merits, mathematics should be treated critically. As soon as mathematical formalizations begin to be identified with realities, their approximate character is immediately revealed. Because mathematics is formal, the non-mathematical sciences have no need to comprehend its content experimentally. It is enough to determine its strengths as projected onto the experimental sciences.

Thus mathematical expectation and probability, while being most important scientific concepts, are not measured directly but are determined from the initial experimental data, which we prefer to call facts. Incidentally, the term mathematical expectation cannot be called a happy one. It was first used in the 17th century by B. Pascal and C. Huygens in connection with the theory of gambling. It should be kept in mind that not every expectation is mathematical. For instance, the expectations that economics deals with are economic, not mathematical. It should also be kept in mind that in modern science expectations are very often closely linked with forecasts, whereas the concept of mathematical expectation is also used outside forecasting. Determining mathematical expectations and probabilities involves numerous difficulties, each of which lends a particular character to the experiment as a stage of transduction. Let us mention some of them:

1. Enumerating the factors that are relevant in determining mathematical expectations and probabilities. This is possible only after a careful study of the features of the experimental situation. The factors are ranked, but some of them remain unaccounted for.

2. Subjective (expert) assessment of probabilities. It is needed when a new theory is required. The work of the experts itself needs to be critically examined.

3. Reconstruction of the statistical ensemble from a limited experimental sample. As a rule, there are not enough data, so the missing data are conjectured, but the criteria for such conjecture themselves need critical analysis.

4. Determining mathematical expectations and probabilities under conditions of nonstationarity and instability. Here any prediction runs into new difficulties.

5. Interpretation of rare phenomena. Since rare phenomena are, as a rule, not reproducible, they are difficult to study.

6. Invoking the law of large numbers. Contrary to popular belief, increasing the sample size does not necessarily reduce the scatter of the experimental data. The law of large numbers holds only when the conditions that ensure its validity are present.
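The last point in the list can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the two distributions (standard normal and standard Cauchy), the helper names, the seed and the use of the interquartile range as a measure of scatter are all choices made for this example. For normally distributed data the scatter of the sample mean shrinks as the sample grows; for Cauchy-distributed data, which have no finite mean, it does not.

```python
import math
import random
import statistics


def iqr(values):
    """Interquartile range: a scatter measure that exists even for heavy tails."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def draw_normal(rng):
    """One standard normal value (finite mean and variance)."""
    return rng.gauss(0.0, 1.0)


def draw_cauchy(rng):
    """One standard Cauchy value via the inverse CDF (no finite mean)."""
    return math.tan(math.pi * (rng.random() - 0.5))


def spread_of_sample_means(draw, n, repeats=1000, seed=1):
    """Scatter (IQR) of the sample mean over many samples of size n."""
    rng = random.Random(seed)
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(repeats)]
    return iqr(means)


if __name__ == "__main__":
    for n in (10, 1000):
        normal = spread_of_sample_means(draw_normal, n)
        cauchy = spread_of_sample_means(draw_cauchy, n)
        print(f"n={n:>5}  normal: {normal:.3f}  Cauchy: {cauchy:.3f}")
```

In the normal case the scatter falls roughly as one over the square root of n, while in the Cauchy case it stays near the same value however large the sample, so enlarging the sample alone guarantees nothing.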

The concept of uncertainty, around which there is a great deal of confusion, is closely related to the concepts of mathematical expectation and probability. Usually the uncertain is interpreted as the negation of the definite. From this point of view, a quantity that does not have an exact value must be recognized as uncertain. In epistemology, uncertainty is often associated with a lack of knowledge that can be overcome. This understanding was called into question by the discoveries of quantum mechanics. Consider, for example, one of the Heisenberg uncertainty relations: $\Delta p_x \, \Delta x \ge \hbar/2$. It indicates that when the momentum along the x-axis and the coordinate x are measured simultaneously, their uncertainties ($\Delta p_x$ and $\Delta x$) cannot be eliminated. Reflection on the Heisenberg uncertainty relation has shown that uncertainty is real and, therefore, is not a matter of a lack of knowledge. The connection of uncertainty with probabilities also became apparent. At least that is how things stand in physics.

In the technical sciences the situation is different: here a distinction is drawn between situations of risk and situations of uncertainty. It is held that in a situation of risk the probabilities of the events of interest are known, whereas in a situation of uncertainty they are unknown. As we can see, the notion of uncertainty as a lack of knowledge makes itself felt here again, but this time uncertainty is treated as the absence not of exact but of probabilistic knowledge. Note that the concept of uncertainty and a situation of uncertainty are two different things. We are interested in the concept of uncertainty, so it seems extremely important to understand the ontological status of uncertainty.

Understanding the ontological status of uncertainty.

Probability characterizes the possibility of the occurrence of certain events. But then it is necessary to recognize the presence of a certain concentration of activity that brings these events about. In our opinion, uncertainty is precisely the characteristic of such activity. It is not enough merely to emphasize the uncertainty of the values of attributes. It is extremely important to single out their origins, and these are such that they overturn our usual ideas. It is very telling in this connection that, because of the uncertainty of the characteristics of elementary particles, even ... universes arise. There are reasons to believe that human activity, too, can generate amazing possibilities. The world is filled not with exact quantities and forced movement along a narrow channel of necessity, but with uncertainty that generates a wide range of probabilistic events. We live in an amazing world, realized thanks not so much to mathematical expectations and probabilities as to uncertainty.

Above we considered the basic concepts needed to analyze the data obtained from an experiment: the sample mean, mathematical expectation and probability. To these should be added dispersion, or variance (from the Latin for "scattering"). The dispersion D(X) of a quantity X is defined as the mathematical expectation of the square of its deviation from its mathematical expectation. Dispersion is needed in two respects. First, it keeps in the researcher's field of attention the whole set of measurement results, which cannot be reduced to mathematical expectations. Second, it makes it possible to characterize various kinds of errors.
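A minimal sketch of how dispersion is estimated from a finite sample is given below. The five current readings are hypothetical, and the estimator shown (division by n - 1, the usual unbiased sample variance) is one common choice rather than something prescribed by the text.

```python
import math


def sample_mean(values):
    """Sample mean of the measured values."""
    return sum(values) / len(values)


def sample_variance(values):
    """Sample estimate of the dispersion: mean squared deviation, divided by n - 1."""
    mean = sample_mean(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    # Hypothetical current readings in amperes.
    readings = [3.9, 4.1, 4.0, 4.2, 3.8]
    variance = sample_variance(readings)
    print(f"mean = {sample_mean(readings):.2f} A")
    print(f"variance = {variance:.4f} A^2")
    print(f"standard deviation = {math.sqrt(variance):.3f} A")
```

The square root of the dispersion, the standard deviation, has the same unit as the measured quantity, which is why it is often the number reported alongside the mean.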

Analysis of experimental data. There are various methods of analyzing experimental data. A full list goes far beyond the scope of this book, so they are considered here only to the extent needed to reach certain methodological conclusions.

1. Factor analysis and the method of principal components. In 1901 the distinguished English statistician K. Pearson proposed the method of principal components. Its essence is that, for measurement results plotted on a two-dimensional plane, one looks for a straight line that satisfies two conditions: the variation along it should be maximal, and the variation in the orthogonal direction, on the contrary, minimal. The first principal component is the one measured along the line found. The analysis can be continued, yielding a set of components ranked by their degree of relevance. If desired, the components judged insignificant can be discarded, which reduces the number of variables. The method of principal components is an excellent illustration of the basic idea of factor analysis, which consists, first, in reducing the data and, second, in classifying them.
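For the two-dimensional case described above, the direction of the first principal component can be found in closed form from the sample covariance matrix. The sketch below is an illustration only: the function name and the sample points are hypothetical, and practical work would normally rely on a numerical library rather than on hand-written formulas.

```python
import math


def principal_axis(points):
    """Return (direction, variance_along, variance_across) for 2-D points.

    direction is a unit vector along the line of maximal variation; the two
    variances are the eigenvalues of the 2x2 sample covariance matrix.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Angle of the axis that maximizes the variance of the projected points.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    half_trace = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    direction = (math.cos(theta), math.sin(theta))
    return direction, half_trace + radius, half_trace - radius


if __name__ == "__main__":
    # Hypothetical measurements scattered around the line y = x.
    data = [(0.0, 0.2), (1.0, 0.9), (2.0, 2.1), (3.0, 2.8), (4.0, 4.1)]
    direction, along, across = principal_axis(data)
    print("direction:", direction)
    print("variance along:", along, "variance across:", across)
```

If the variance across the line is negligible compared with the variance along it, the second component can be dropped, which is the reduction of variables the text refers to.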

2. Correlation analysis. Its purpose is to identify a connection between statistical variables (sample means). Statistical analysis is never complete without identifying correlations, which are usually characterized by correlation coefficients. Establishing these connections is the first step in identifying empirical laws. As a rule, correlation analysis is supplemented by a regression study.
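One widely used correlation coefficient is Pearson's r. The text does not say which coefficient is meant, so the choice here is an assumption, as are the function name and the sample data in this sketch.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical paired measurements.
    xs = [1, 2, 3, 4, 5]
    ys = [2, 1, 4, 3, 5]
    print(f"r = {pearson_r(xs, ys):.2f}")
```

Values of r close to +1 or -1 indicate a nearly linear connection, and values near 0 indicate the absence of a linear one. A high coefficient by itself does not establish a law; it only points to a connection worth examining by regression.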

3. Regression analysis is performed to determine the equation that connects the dependent variable Y with the independent variables X_i. If such an equation is found, one can predict how the value of Y varies with changes in the X_i. Not always, but most often, the regression is taken to be linear:

$$Y = b_0 + b_1 X_1 + \dots + b_k X_k.$$

The coefficients b_i characterize the contribution of the independent variables to the value of Y. But in choosing a straight line, some criterion is needed that allows one function to be selected from the whole set of linear dependencies. The least squares method is often used for this purpose: it minimizes the sum of the squared deviations of the actually observed values of Y from the values predicted by the equation.
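For a single independent variable, the least squares criterion leads to simple closed-form expressions for the intercept and the slope. The sketch below assumes this one-variable case; the function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1


def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared deviations of the observed y from the line's predictions."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


if __name__ == "__main__":
    # Hypothetical paired measurements.
    xs = [1, 2, 3, 4, 5]
    ys = [2, 1, 4, 3, 5]
    b0, b1 = fit_line(xs, ys)
    print(f"y = {b0:.2f} + {b1:.2f} x")
    print("sum of squared residuals:", sum_squared_residuals(xs, ys, b0, b1))
    # Any other line gives a larger sum, for example a shifted intercept:
    print("with shifted intercept:", sum_squared_residuals(xs, ys, b0 + 0.1, b1))
```

Comparing the two printed sums shows the selection criterion at work: among all straight lines, the fitted one has the smallest sum of squared deviations.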

The least squares method was developed more than two hundred years ago by Carl Friedrich Gauss and Adrien-Marie Legendre. As they established, it is the sum of the squared deviations that must be minimized, not the sum of the deviations. It can be shown that the sum of the squared deviations of the individual measurements from the sample mean is smaller than the sum of their squared deviations from any other value. The very concept of the sample mean is such that it gives rise to the least squares method.
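The minimizing property of the sample mean mentioned above follows from a short calculation. In the sketch below, A denotes the sample mean of the measurements y_1, ..., y_n and c is an arbitrary other value; both symbols are introduced here only for the derivation.

```latex
\begin{aligned}
S(c) &= \sum_{i=1}^{n} (y_i - c)^2
      = \sum_{i=1}^{n} \bigl[(y_i - A) + (A - c)\bigr]^2 \\
     &= \sum_{i=1}^{n} (y_i - A)^2
      + 2\,(A - c) \sum_{i=1}^{n} (y_i - A)
      + n\,(A - c)^2 \\
     &= \sum_{i=1}^{n} (y_i - A)^2 + n\,(A - c)^2
      \;\ge\; \sum_{i=1}^{n} (y_i - A)^2 .
\end{aligned}
```

The middle term vanishes because the deviations from the sample mean sum to zero, and equality holds only when c = A.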

Planning an experiment. The process of cognition never starts from scratch. This circumstance makes it possible to plan an experiment. In fact, all pre-experimental knowledge is summarized in the technical model. Planning comprises a number of stages, which constitute its essence.

1. A problem is identified that makes it necessary to turn to experiment. Obviously, the experiment is conducted out of a desire to develop new knowledge, since the level of knowledge already achieved does not satisfy the researcher.

2. Optimization parameters, which characterize the goal, are chosen. In the simplest case there is a single optimization parameter. If there are several, then, as a rule, a generalized optimization parameter is defined as a function of the initial optimization parameters.

3. The factors on which the optimization parameters depend are analyzed and listed. Factors are divided into levels. Each level combines a class of factors that does not depend on another class. Thus, if the characteristics of three aircraft are compared, one has to deal with three levels of factors. Sometimes a number of factors are combined into blocks on the basis of their similar degree of influence on the optimization parameters. To simplify the experiment, it is permissible to consider only the most essential parameters of a block. The main and minor factors are identified, and the degree of correlation between them is determined. Orthogonal factors by definition do not depend on each other.

4. The number of trials to be carried out is determined. There should be enough of them to achieve the purpose of the experiment. Methods developed in statistics make it possible to determine this number.

5. A data sample is established, which is needed to find sample means and empirical laws.

6. The ways of ensuring the reliability of the data and the possibility of the experiment being repeated by the researcher himself and by other scientists are determined.

7. Ethical and legal aspects are analyzed, since the experiment should be legitimate not only from the legal but also from the ethical point of view.

8. The preliminary plan provides for the possibility of correcting the experiment, depending on the intermediate results obtained.
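Step 4 above refers to statistical methods for determining the number of trials. One textbook formula, given here only as an illustration, applies when the goal is to estimate a mean to within a given margin, the scatter of single measurements (standard deviation sigma) is assumed known, and the normal approximation is acceptable. The function name, parameter names and numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def required_trials(sigma, margin, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the margin.

    sigma      - assumed standard deviation of a single measurement
    margin     - acceptable half-width of the interval for the mean
    confidence - two-sided confidence level (normal approximation)
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # Hypothetical case: single readings scatter with sigma = 0.2 A and the
    # mean current should be known to within 0.05 A at the 95% level.
    print(required_trials(sigma=0.2, margin=0.05))   # 62
    # Halving the margin roughly quadruples the number of trials.
    print(required_trials(sigma=0.2, margin=0.025))  # 246
```

The formula presupposes stable conditions and a finite variance, so the caveats about the law of large numbers raised earlier apply here as well.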

Planning the experiment allows us to proceed to its direct preparation and conduct, but enough has been said about this earlier. We considered the planning of the experiment in the final part of the section because a number of concepts had to be introduced first, for example the method of principal components.

Philosophical discussion of induction. As a rule, the experiment is assessed in two ways: it is said to be necessary, first, for testing the existing theory and, second, for developing a new one. This is certainly relevant, but the main thrust of this section is different. We wanted to pay special attention to the line of transduction, which has now reached induction. It was important for us to understand the potential of the first stages of intra-theoretical transduction at the stage of processing experiments. We discussed not the relationship "experiment - theory" but the place of the processing of experimental results in the line of transduction.

The experiment delivers the facts. It is generally accepted that facts are the truly objective events; they are primary in relation to the theory, which is needed to comprehend them. To the question "What exactly exists?" the answer is "Facts". But do principles, laws and models exist? Factualists recognize the existence of these only if they are reduced to facts. Laws, they say, express the connection between facts. Factualists, however, absolutize the significance of facts and therefore consider them the primary link in all conceptual constructions. We propose to regard facts as an intermediate, rather than a primary or final, link in intra-theoretical transduction. The line of transduction leads the researcher on from the facts. Toward what? That is the key question among the issues under discussion.

We suggest that factual analysis, as a stage of transduction, leads to referents (from the Latin referre, to report), by which we mean the objects identified at the induction stage. A fact is a component of the experiment (adduction). A referent is a component of induction, which does not exist at the experiment stage. Referents, inductive laws and principles are links of inductive analysis, without which they can be neither derived nor understood. Only after a comprehensive analysis of reference can we say what really exists. As is known, the question of what exists is considered the subject of ontology (from the Greek ontos, existent).

Theoretical development of the problem of ontological relativity

Willard Quine, the patriarch of American analytic philosophy, came to three topical conclusions about ontological problems.

1. All objects are theoretical. This means that we learn about objects from theory: what exactly the objects are, researchers learn from theories.

2. To be is to be the value of a variable. What exactly is recognized as existing? That which figures in universal scientific laws, which, as is known, are written in terms of variables. This is exactly what follows from the thesis stated at the beginning of this item.

3. Reference is inscrutable. Reference, as the designation of objects by words and other signs, seems a quite obvious operation, but it turns out not to be so. Even in the case of an ostensive definition, i.e. pointing a finger at something, it is not clear what exactly is being indicated: the whole body or a part of it. For many words, in particular conjunctions, prepositions and interjections, ostensive definitions are altogether inapplicable. Thus reference as an independent act, not mediated by language, is in principle impossible.

W. Quine's conclusions are correct in many respects, but they are not without flaws. Theories do provide knowledge about objects, but the relative independence of objects must also be taken into account. The objects themselves are not determined by theories, and the connection between objects, mentality and language is not causal in nature. The variables that appear in laws do bear directly on the characteristics of objects. But in discussing reality it is not enough merely to emphasize the relationship of variables to the characteristics of objects. In doing so one may overlook interval values, sample means and uncertainties.

Something is revealed in the process of reference, yet Quine considered reference inscrutable. This conclusion was the result of an extremely impoverished notion of reference, understood as the designation of objects and their attributes by signs. Reference should instead be understood as a stage of induction; its content is then nontrivial, and it is entirely feasible. Moreover, without reference it is impossible to understand the meaning of a theory, including its laws. Referential analysis shows that principles and laws include special variables: sample means, including probabilities and uncertainties. The brief expression "there are objects and their attributes" gives only a first idea of reality. There are also principles, laws, and objects with their characteristics, but it is useful to single out the statistical features of the attributes. Success in comprehending reality is unattainable without developing a model, then planning the experiment, carrying out the measurement and, finally, interpreting the data, which is what reference amounts to.

Of course, reference has its own features in each science. Technological knowledge, as we know, belongs to the pragmatic sciences, and according to a widely held opinion reference applies only to the semantic sciences. Pragmatic disciplines are concerned not with what is, but with what will be. Reference is supposedly correlated only with what is, and so the concept of reference is held to be alien to pragmatic disciplines. This argument, however, overlooks a most important fact: the concept of reference was originally developed for natural science. Later, with the development of the pragmatic sciences, the concept should have been generalized; this was not done, and a certain conceptual confusion arose as a result. It seems to us that the concept of reference remains valid in the field of the pragmatic sciences. Technological reality is the reality of technical and cultural values and of everything associated with them. In particular, it includes expectations about achieving certain utilities.


1. Induction is an essential stage of conceptual transduction.

2. Induction is carried out through statistical analysis, in which correlation and regression analysis are crucial.

3. Induction consists in identifying referents and inductive laws, as well as principles.
