Distribution parameters and statistics

Parameters of the distribution of a random variable, such as the mathematical expectation or the variance, are theoretical quantities that cannot be measured directly, although they can be estimated. They are quantitative characteristics of the general population and can be defined only in theoretical modeling, as hypothetical quantities, because they describe the distribution of the random variable in the population itself. To determine them in practice, the researcher conducting the experiment estimates them from a sample. This estimation involves computing statistics.

A statistic is a quantitative characteristic of the distribution of a random variable that is computed from sample values. Statistics are used either to describe the sample itself or, which is of paramount importance in fundamental experimental research, to estimate the distribution parameters of the random variable in the population under study.

Distinguishing between a parameter and a statistic is very important, because it helps to avoid a number of errors in interpreting experimental data. When we estimate distribution parameters from statistics, we obtain values that are only to some extent close to the parameters being estimated. There is almost always a difference between a parameter and the corresponding statistic, and as a rule we cannot say how large this difference is. In theory, the larger the sample, the closer the sample statistics are to the parameters they estimate. This does not mean, however, that increasing the sample size will inevitably bring a particular computed statistic closer to the estimated parameter. In practice, things can be much more complicated.

If the expected value of a statistic coincides in theory with the parameter being estimated, the estimate is called unbiased. An estimate whose expected value differs from the parameter by some amount is called biased.

We should also distinguish between point and interval estimates of distribution parameters. A point estimate is an estimate expressed as a single number. For example, if we state that the spatial threshold of tactile sensitivity for a given subject, under given conditions and in a given area of the skin, is 21.8 mm, this is a point estimate. Similarly, a weather report that says it is 25 °C outside gives a point estimate. An interval estimate uses a set or a range of numbers. When estimating the spatial threshold of tactile sensitivity, we can say that it lies in the range from 20 to 25 mm. Similarly, weather forecasters may report that the air temperature in the next 24 hours will reach 22-24 °C. An interval estimate allows us not only to determine the desired value of a quantity but also to indicate the accuracy of the estimate.

Mathematical expectation and its evaluation

Let us return to our coin-tossing experiment.

Let us try to answer the question: how many times should heads come up if we toss a coin ten times? The answer seems obvious. If the probabilities of the two outcomes are equal, then the outcomes themselves should be equally distributed. In other words, in ten tosses of an ordinary coin we can expect one of its sides, for example heads, to come up exactly five times. Similarly, in 100 tosses heads should come up exactly 50 times, and if the coin is tossed 4236 times, the side of interest should appear 2118 times, no more and no less.

This theoretically expected value of a random variable is called the mathematical expectation. In the coin example, the mathematical expectation of the number of heads is found by multiplying the theoretical probability of heads by the number of trials. More formally, the mathematical expectation is defined as the first-order raw (initial) moment of the distribution; the first-order central moment is always zero. Thus, the mathematical expectation is the value around which the random variable varies and toward which its average theoretically tends in repeated trials.
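The rule "probability times number of trials" can be sketched in a few lines of Python. This is only an illustration for the coin example; the function name `expected_count` is an assumption introduced here, not notation from the text.

```python
def expected_count(p: float, n: int) -> float:
    """Expected number of occurrences of an event with probability p
    in n independent trials (hypothetical helper for illustration)."""
    return n * p


# The three cases mentioned in the text: 10, 100 and 4236 tosses of a fair coin.
for n in (10, 100, 4236):
    print(n, expected_count(0.5, n))
```

For a fair coin this reproduces the values 5, 50 and 2118 discussed above.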

Clearly, the theoretical value of the mathematical expectation as a distribution parameter does not always equal the empirical value of the random variable expressed in a statistic. If we actually carry out the coin-tossing experiment, it is quite likely that in ten tosses heads will come up only four or three times, or, on the contrary, eight times, or not at all. Some of these outcomes are more likely and some less. If we rely on the normal distribution law, we can conclude that the more a result deviates from the theoretically expected one, given by the mathematical expectation, the less likely it is in practice.
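To make the idea "the larger the deviation, the smaller the probability" concrete, here is a minimal sketch that computes the exact probability of each possible number of heads in ten tosses of a fair coin. It uses the binomial formula, which is the exact law for this experiment and which the normal distribution approximates; the name `binomial_pmf` is an assumption introduced for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Print the whole distribution: probabilities fall off on both sides of 5.
for k, prob in enumerate(probs):
    print(f"{k:2d} heads: {prob:.4f}")
```

Even the most likely outcome, exactly five heads, occurs in only about a quarter of the experiments, so seeing four or seven heads is not surprising in itself.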

Suppose further that we have repeated this procedure several times and have never observed the theoretically expected value. We may then begin to doubt that the coin is fair. We can assume that for our coin the probability of heads is in fact not 50%. In this case, it may be necessary to estimate the probability of this event and, accordingly, the value of the mathematical expectation. This need arises whenever an experiment investigates the distribution of a continuous random variable, such as reaction time, without a theoretical model available in advance. As a rule, this is the first mandatory step in the quantitative processing of experimental results.

The mathematical expectation can be estimated in three ways. In practice they may give slightly different results, but in theory all of them must lead us to the value of the mathematical expectation.

The logic of this estimation is illustrated in Fig. 1.2. The mathematical expectation can be regarded as the central tendency of the distribution of the random variable x, as its most probable and therefore most frequent value, and as the point that divides the distribution into two equal parts.

Fig. 1.2. Estimating the expectation for a normal distribution

Let us continue our imaginary coin experiments and perform three series of ten tosses each. Suppose that heads came up four times in the first series and four times again in the second, while in the third series it came up seven times, more than 1.5 times as often. It is reasonable to assume that the mathematical expectation of the event of interest lies somewhere between these values.

The first and simplest method of estimating the mathematical expectation is to find the arithmetic mean. The estimate of the mathematical expectation based on the three measurements above is then (4 + 4 + 7)/3 = 5. Similarly, in reaction-time experiments, the mathematical expectation can be estimated by computing the arithmetic mean of all the obtained values of x. So, if we have made n measurements of the reaction time x, we can use the following formula, which shows that to compute the arithmetic mean, all the empirically obtained values of x must be added together and the sum divided by the number of observations:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (1.2)$$

In formula (1.2), the estimate of the mathematical expectation is usually denoted $\bar{x}$ (read as "x bar"), although it is sometimes denoted M (from "mean").
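The arithmetic mean described here can be written as a short function. This is a minimal sketch; the names `arithmetic_mean` and `heads_counts` are assumptions chosen for illustration.

```python
def arithmetic_mean(values):
    """Sum of all observations divided by their number."""
    values = list(values)
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


# Number of heads in the three series of ten tosses from the text.
heads_counts = [4, 4, 7]
print(arithmetic_mean(heads_counts))  # 5.0
```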

The arithmetic mean is the most commonly used estimate of the mathematical expectation. Its use assumes that the random variable is measured on a metric scale. Clearly, the result obtained may or may not coincide with the true value of the mathematical expectation, which we never know. What matters, however, is that this method gives an unbiased estimate of the mathematical expectation. This means that the expected value of the estimate equals the mathematical expectation itself: $E(\bar{x}) = \mu$, where $\mu$ denotes the mathematical expectation of the random variable.
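Unbiasedness can be illustrated by simulation: if the same small experiment is repeated many times, the individual sample means scatter, but their average settles near the true expectation. The sketch below assumes a fair coin (p = 0.5), three series of ten tosses per experiment as in the text, and an arbitrary fixed seed; all function names are hypothetical.

```python
import random


def sample_mean(values):
    return sum(values) / len(values)


def toss_series(rng, n_tosses=10, p=0.5):
    """Number of heads in one series of n_tosses tosses."""
    return sum(rng.random() < p for _ in range(n_tosses))


def average_of_sample_means(n_repeats=5000, series_per_sample=3, seed=0):
    """Repeat the three-series experiment many times and average the means."""
    rng = random.Random(seed)  # fixed seed only to make the run repeatable
    means = [
        sample_mean([toss_series(rng) for _ in range(series_per_sample)])
        for _ in range(n_repeats)
    ]
    return sample_mean(means)


# The result should be close to the theoretical expectation of 5 heads.
print(average_of_sample_means())
```

A single experiment may well give 4.0 or 6.3, but the long-run average of such estimates stays close to 5, which is what "unbiased" means.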

The second way of estimating the mathematical expectation is to take the most frequently occurring value of the variable of interest. This value is called the mode of the distribution. For example, in the coin-tossing case just considered, we can take four as the value of the mathematical expectation, because this value appeared twice in the three series; the mode of the distribution in this case is therefore four. The mode is used mainly when the experimenter deals with variables that take discrete values measured on a non-metric scale.
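A minimal sketch of finding the mode by counting frequencies follows. The function name `sample_mode` and the grade labels are hypothetical; the point is that the same procedure works for numbers and for non-metric categories alike.

```python
from collections import Counter


def sample_mode(values):
    """Most frequent value; on a tie, the value seen first is returned."""
    counts = Counter(values)
    value, _ = counts.most_common(1)[0]
    return value


# Coin example from the text: four heads occurred in two of three series.
print(sample_mode([4, 4, 7]))  # 4

# Hypothetical non-metric data: a mean is undefined here, a mode is not.
print(sample_mode(["good", "excellent", "good", "pass"]))  # good
```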

For example, to describe the distribution of student grades in an exam, one can build a frequency distribution of the grades. Such a frequency distribution is shown graphically as a histogram. The most frequent grade is then taken as the estimate of the central tendency (the mathematical expectation). For variables that take continuous values, this measure is rarely if ever used. If a frequency distribution of the results is nevertheless constructed, it usually refers not to the individual observed values of the feature but to intervals of its values. For example, when studying people's height, one can count how many people fall into the interval up to 150 cm, how many into the interval from 150 to 155 cm, and so on. In this case, the mode refers to an interval of values of the feature in question, here height.
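For a continuous variable, the interval-based mode can be sketched as follows. The heights are invented data, and the interval width of 5 cm and starting point of 145 cm are assumptions made only for this illustration.

```python
from collections import Counter


def modal_interval(values, width, start):
    """Return (lower, upper) bounds of the most populated interval
    [lower, upper); on a tie, the interval encountered first wins."""
    counts = Counter((v - start) // width for v in values)
    index, _ = counts.most_common(1)[0]
    lower = start + index * width
    return lower, lower + width


# Hypothetical heights in centimetres.
heights_cm = [148, 152, 153, 157, 158, 158, 159, 161, 163, 166, 171]
print(modal_interval(heights_cm, width=5, start=145))  # (155, 160)
```

Note that the result depends on the chosen interval width and starting point, which is one reason the mode is seldom used for continuous data.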

Clearly the mode, like the arithmetic mean, may or may not coincide with the actual value of the mathematical expectation. For a symmetric unimodal distribution such as the normal one, however, the mode, like the arithmetic mean, is an unbiased estimate of the mathematical expectation.

We should add that if two values occur in the sample equally often, and more often than the others, the distribution is called bimodal. If three or more values occur equally often, the sample is said to have no mode. With a sufficiently large number of observations, such cases usually indicate that the data come from a general population whose distribution differs from the normal one.
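The convention just described (one mode, two modes, or no mode) can be sketched with `statistics.multimode` from the Python standard library (Python 3.8 or later). The function name `describe_modes` and the returned labels are assumptions for illustration.

```python
from statistics import multimode


def describe_modes(sample):
    """Classify a sample by the number of equally most frequent values,
    following the convention in the text."""
    modes = multimode(sample)  # all values sharing the highest frequency
    if len(modes) == 1:
        return "unimodal", modes
    if len(modes) == 2:
        return "bimodal", modes
    return "no mode", []


print(describe_modes([4, 4, 7]))        # ('unimodal', [4])
print(describe_modes([1, 1, 2, 2, 3]))  # ('bimodal', [1, 2])
print(describe_modes([1, 2, 3]))        # ('no mode', [])
```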

Finally, the third way of estimating the mathematical expectation is to find the value of the variable of interest that divides the sample exactly in half. This value is called the median of the distribution.

Suppose we are attending a ski competition and, once it is over, wish to determine which athletes performed above average and which below. If the field of participants is more or less even, it is logical to estimate the average result by the arithmetic mean. Suppose, however, that among the professional participants there are a few amateurs. They are few, but their results are far worse than those of the rest. In this case, it may turn out that, for example, 87 of the 100 participants showed a result above the average. Clearly, such an estimate of the central tendency may not always suit us. It is then logical to assume that the average result was shown by the participants who finished in 50th or 51st place. This is exactly the median of the distribution: 49 participants finished ahead of the 50th finisher, and 49 finished behind the 51st. It is unclear, however, which of the two results to take as the average. Of course, the two may have finished with the same time, in which case the problem does not arise. Nor is there a problem when the number of observations is odd. In the remaining cases, the results of the two middle participants can be averaged.
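The procedure for finding the median, including the averaging of the two middle results for an even number of observations, can be sketched as follows. The finishing times are invented: five professionals and two much slower amateurs.

```python
def median(values):
    """Middle value of the ordered sample; for an even number of
    observations, the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one observation is required")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical finishing times in minutes (smaller is better).
times = [30, 31, 32, 33, 34, 60, 75]
mean_time = sum(times) / len(times)

print(median(times))                       # 33
print(sum(t < mean_time for t in times))   # 5 of 7 finished faster than the mean
print(median([1, 2, 3, 4]))                # 2.5
```

The two slow amateurs pull the mean up to about 42 minutes, so most participants look "above average", while the median stays with the middle of the field.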

The median is a special case of a quantile of the distribution. A quantile is a value that separates off a given part of the distribution, where a part is measured formally by the integral of the probability density between two values of the variable X. Thus, X is the median of the distribution if the integral of the probability density from −∞ to X equals the integral from X to +∞. Similarly, the distribution can be divided into four, ten, or 100 equal parts. The corresponding quantiles are called quartiles, deciles, and percentiles. There are other types of quantiles as well.
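The verbal definition can be written out as a formula. In this sketch, f(x) stands for the probability density of the random variable and x_p for the quantile of order p; both symbols are notation assumed here rather than taken from the text.

```latex
\int_{-\infty}^{X} f(x)\,dx \;=\; \int_{X}^{+\infty} f(x)\,dx \;=\; \frac{1}{2},
\qquad
\int_{-\infty}^{x_p} f(x)\,dx \;=\; p, \quad 0 < p < 1 .
```

The median is thus the quantile of order p = 1/2, the quartiles correspond to p = 1/4, 1/2, 3/4, and the deciles and percentiles to multiples of 1/10 and 1/100.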

Like the two previous estimates, the median is an unbiased estimate of the mathematical expectation when the distribution is symmetric, as the normal distribution is.

In theory, if we are really dealing with a normally distributed random variable, all three estimates of the mathematical expectation should give the same result, because they are all unbiased estimates of the same distribution parameter (see Fig. 1.2). In practice, however, this rarely happens. One reason may be that the analyzed distribution differs from the normal one. But the main reason for such discrepancies, as a rule, is that any estimate of the mathematical expectation computed from a sample can differ considerably from the true value. Nevertheless, as mentioned above, mathematical statistics proves that the more independent observations of the variable are made, the closer the estimate should be to the true value.
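The convergence of the three estimates can be observed in a small simulation. The sketch below draws normally distributed samples of increasing size and computes the mean, the median, and an interval-based mode. The expectation of 100, the standard deviation of 15, the interval width of 5, and the seed are all arbitrary assumptions, and the function names are hypothetical.

```python
import random
from collections import Counter
from statistics import fmean, median


def three_estimates(sample, bin_width=5.0):
    """Return (mean, median, interval-based mode) of a sample.
    The mode is the centre of the most populated interval."""
    bins = Counter(round(x / bin_width) for x in sample)
    modal_bin, _ = bins.most_common(1)[0]
    return fmean(sample), median(sample), modal_bin * bin_width


def simulate(n, mu=100.0, sigma=15.0, seed=1):
    """Draw n values from a normal distribution and estimate its centre."""
    rng = random.Random(seed)  # fixed seed only to make the run repeatable
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return three_estimates(sample)


for n in (10, 1000, 100_000):
    m, med, mo = simulate(n)
    print(f"n={n:>6}: mean={m:.2f} median={med:.2f} mode~{mo:.1f}")
```

With small samples the three numbers can differ noticeably; as the sample grows, they should all approach the same central value, with the mode limited by the chosen interval width.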

Thus, in practice, the choice of a method for estimating the mathematical expectation is determined not by the desire to obtain a more accurate and reliable estimate of this parameter but by considerations of convenience. The measurement scale on which the observations of the random variable are recorded also plays a role in this choice.
