
Analyzing single-entry tables

From the point of view of mathematical statistics, the simplest experimental design is the between-groups (between-subjects) design.

In a between-groups design, the levels (values) of the independent variable vary across groups of subjects in such a way that within each experimental group the level (value) of the independent variable is the same for all subjects.

The results of a between-groups experiment can be presented as a single-entry table (Table 3.1), whose columns correspond to the levels of the independent variable T (from the English treatments). An arbitrary level of the independent variable is denoted here by the index j. In theory, the number of levels must be at least two (otherwise the independent variable would become a constant) and can be arbitrarily large, although in practice this number k is always limited either by the logic of the experiment itself or by the need to save the experimenter's effort and time, as well as the resources available to him. Therefore, as a rule, the number of levels of an independent variable ranges from two to five.

Table 3.1

Single-entry table

Levels of the independent variable T:

| 1    | ... | j    | ... | k    |
|------|-----|------|-----|------|
| X_11 | ... | X_1j | ... | X_1k |
| ...  | ... | ...  | ... | ...  |
| X_n1 | ... | X_nj | ... | X_nk |

The table contains the results of measuring the dependent variable for each subject. Since the complete set of data in such a table forms a matrix of k columns and n rows, the result of each subject is denoted by two indices. The first index gives the subject's number within the group, from 1 to n; the second gives the group number, from 1 to k. Thus, the result of an arbitrarily chosen subject number i in group j is denoted X_ij.
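To make the notation concrete, here is a minimal sketch in Python (using NumPy and made-up numbers, not data from this textbook) of how such a single-entry table can be stored as an n × k array:

```python
import numpy as np

# Hypothetical single-entry table: n = 4 subjects per group, k = 3 groups.
# Rows index subjects (i = 1..n), columns index groups (j = 1..k),
# so X[i - 1, j - 1] holds the result of subject i in group j.
X = np.array([
    [5, 7, 4],
    [6, 8, 3],
    [5, 6, 5],
    [7, 9, 4],
])

n, k = X.shape        # number of subjects per group, number of groups
x_23 = X[1, 2]        # result of subject i = 2 in group j = 3
print(n, k, x_23)
```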

To begin with, let us simplify the problem of analyzing such tables by assuming that the number of levels of the independent variable, and hence the number of groups of subjects, is equal to two. We know that when comparing two samples it is easiest to use the familiar Student's t-test. However, this test does not allow us to compare a larger number of groups, since it is limited to two-level experimental designs.

There is, however, a method free of this limitation: the analysis of variance. It is sometimes confused with the F-test (discussed in Chapter 2 as an example of a preliminary test evaluating the homogeneity of variances). In fact, the analysis of variance is more complex and serves other goals. Like Student's t-test, it is used to compare mean values in different samples, but its logic is somewhat different and its possibilities are much wider.

Comparison of two samples

Next, we will somewhat change the direction of our study of statistical procedures: we first consider a specific example and then try to generalize the principles of its analysis.

Quite a long time ago, during the emergence of modern cognitive psychology, the American psychologists J. Bransford and M. Johnson [18] studied how contextual prerequisites influence the processes of understanding and recalling texts. They constructed texts that were as difficult as possible to understand, because the topic of each text was deliberately concealed.

Here is an example of such a text:

"Actually, the procedure is quite simple. First of all, you have to decompose things. Of course, one pile may be sufficient, depending on how much you need to do. If you have to go somewhere else because of a lack of opportunities, then this is the natural next step, otherwise you can start work. Doing it, it's important not to overdo it. In other words, it's better to do less than try to do too much immediately. Errors can lead to unpleasant consequences. At first everything can seem complicated. Gradually, however, this will become one of your usual duties. It is difficult to foresee such a moment in the future, when the need for doing this business completely disappears. After the completion of the procedure, the material is again decomposed into groups. Then it can be placed in the appropriate place. After some time, all of this will be used again, and then we will have to repeat this cycle .

One group of subjects was told, before the text was presented, that it would be about washing laundry. The second group of subjects received no such preliminary information. Table 3.2 shows possible results of such an experiment in terms of ratings of the comprehension of this text. These data were collected by the author of this textbook during sessions of the general psychology practicum at the L. S. Vygotsky Institute of Psychology (RSUH) and have never been published.

The left part of Table 3.2 gives the comprehension ratings of this text for randomly selected subjects (nine of them) who received the preliminary instruction, and the right part gives the ratings of the other nine subjects who did not receive such an instruction. These 18 ratings are statistically independent, since they were obtained in 18 independent trials. Table 3.2 also gives the average comprehension level for each group of subjects.

As Table 3.2 shows, the subjects who received the preliminary instruction demonstrate, on average, a higher level of comprehension of the text presented to them than the subjects who did not receive such an instruction. However, if we compare the individual results within the experimental groups, we can see that the results in the two independent samples overlap to some extent. Thus, one of the subjects who received the preliminary instruction shows a very low level of comprehension (6 points), while one of the subjects who did not receive it shows a rather high level of comprehension (3 points). The question is how important the differences between the two groups are. To answer it, we must compare the difference between the mean values of the dependent variable in the two groups with the differences that occur within each group. This, in turn, requires some transformations of the original data, whose whole purpose is precisely to compare the differences between the groups with the differences within the groups.

Table 3.2

Assessing the understanding of the text by subjects who received and did not receive prior instructions about the content of the text

| With preliminary instruction | Without preliminary instruction |
|------------------------------|---------------------------------|
| 3                            | 6                               |
| 3                            | 4                               |
| 2                            | 4                               |
| 6                            | 4                               |
| 2                            | 6                               |
| 2                            | 5                               |
| 1                            | 3                               |
| 2                            | 7                               |
| 2                            | 4                               |
| Average 2.56                 | Average 4.78                    |

Note: 1 - maximum understanding; 7 - minimal understanding.
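For readers who want to check the arithmetic by computer, here is a minimal sketch in Python (assuming NumPy is available) that reproduces the two group means from Table 3.2:

```python
import numpy as np

# Comprehension ratings from Table 3.2
# (1 = maximum understanding, 7 = minimal understanding)
with_instruction    = np.array([3, 3, 2, 6, 2, 2, 1, 2, 2])
without_instruction = np.array([6, 4, 4, 4, 6, 5, 3, 7, 4])

print(with_instruction.mean())     # ≈ 2.56
print(without_instruction.mean())  # ≈ 4.78
```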

Let us designate the result obtained by the first subject in the first group (with the preliminary instruction) as X_11. Similarly, the result obtained by the seventh subject in the second group (without the preliminary instruction) can be designated as X_72. In general, we will denote the result of subject i in group j as X_ij.

The sum of the results of the subjects of one group will be denoted as T_j. Thus, \( T_j = \sum_{i=1}^{n} X_{ij} \), where n is the number of subjects in the group.

The average result for the group will be denoted as \( \bar{T}_j = T_j / n \).

Then the spread of the data within a group can be determined by the formula

\[ SS_j = \sum_{i=1}^{n} \left( X_{ij} - \bar{T}_j \right)^2 . \]

The statistic SS (from the English sum of squares) is called the sum of squares, or variation. It characterizes the degree of variability of the data. It is easy to verify that the sum of squares within the first group is 16.22, and within the second group 13.56.

The combined spread of the data within the two groups is obviously made up of the sums of squares for the first and the second group. It can be determined by the formula

\[ SS_{error} = \sum_{j=1}^{k} \sum_{i=1}^{n} \left( X_{ij} - \bar{T}_j \right)^2 \qquad (3.1) \]

For the present case, the combined spread, or variation, of the data within the two experimental groups is 29.78. Since this combined within-group spread of the data is not related to the action of the independent variable (after all, within an experimental group the value of the independent variable does not change), it is usually referred to as the experimental error (from the English error), i.e. \( SS_{error} \).
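The same quantities can be obtained with a few lines of Python (a sketch assuming NumPy; the arrays repeat the data of Table 3.2):

```python
import numpy as np

with_instruction    = np.array([3, 3, 2, 6, 2, 2, 1, 2, 2])
without_instruction = np.array([6, 4, 4, 4, 6, 5, 3, 7, 4])

# Sum of squared deviations from the group mean, separately for each group
ss1 = np.sum((with_instruction - with_instruction.mean()) ** 2)        # ≈ 16.22
ss2 = np.sum((without_instruction - without_instruction.mean()) ** 2)  # ≈ 13.56

ss_error = ss1 + ss2   # combined within-group spread, formula (3.1): ≈ 29.78
print(ss1, ss2, ss_error)
```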

Similarly, the differences between the two groups of subjects can be assessed. Let us denote the sum of all the values for the subjects of the two groups as G, and the average value over all subjects as

\[ \bar{G} = \frac{G}{kn}, \]

where k is the number of groups (there are two in our experiment).

Now the total spread between the two experimental groups can be found by the formula

\[ SS_{treatment} = n \sum_{j=1}^{k} \left( \bar{T}_j - \bar{G} \right)^2 \qquad (3.2) \]

Since the differences between the groups are related to the action of the independent variable, it is logical to designate this quantity as a measure of the experimental effect, \( SS_{treatment} \). It is not difficult to establish that in the present case \( SS_{treatment} = 22.22 \).
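A short sketch in Python (again assuming NumPy and the data of Table 3.2) reproduces this value via formula (3.2):

```python
import numpy as np

groups = [np.array([3, 3, 2, 6, 2, 2, 1, 2, 2]),    # with instruction
          np.array([6, 4, 4, 4, 6, 5, 3, 7, 4])]    # without instruction

n = len(groups[0])                          # 9 subjects per group
grand_mean = np.concatenate(groups).mean()  # average over all 18 subjects

# Formula (3.2): n times the squared deviations of the group means
# from the grand mean, summed over the groups
ss_treatment = n * sum((g.mean() - grand_mean) ** 2 for g in groups)
print(ss_treatment)   # ≈ 22.22
```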

Adding the combined spread within the experimental groups to the spread between the groups, we obtain the total spread, or variation, of the data, \( SS_{total} \). In the present case it turns out to be 52.00:

\[ SS_{total} = SS_{treatment} + SS_{error} \qquad (3.3) \]

Formula (3.3) is the basic relation of the analysis of variance (abbreviated ANOVA), the meaning of which is to decompose the total spread of the data into additive (statistically independent) components for their subsequent comparison. The general logic of the simplest version of the analysis of variance is shown in Fig. 3.1.

Fig. 3.1. One-way analysis of variance

The total spread can also be found in another way, by comparing the result of each subject in each group with the overall average:

\[ SS_{total} = \sum_{j=1}^{k} \sum_{i=1}^{n} \left( X_{ij} - \bar{G} \right)^2 \qquad (3.4) \]
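The decomposition (3.3) and the direct computation (3.4) can be checked against each other in Python (a sketch assuming NumPy and the data of Table 3.2):

```python
import numpy as np

groups = [np.array([3, 3, 2, 6, 2, 2, 1, 2, 2]),
          np.array([6, 4, 4, 4, 6, 5, 3, 7, 4])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# Formula (3.4): squared deviations of every score from the grand mean
ss_total = np.sum((all_scores - grand_mean) ** 2)                      # 52.00

# The two components computed in the previous steps
ss_error = sum(np.sum((g - g.mean()) ** 2) for g in groups)            # ≈ 29.78
ss_treatment = len(groups[0]) * sum((g.mean() - grand_mean) ** 2
                                    for g in groups)                   # ≈ 22.22

print(ss_total, ss_treatment + ss_error)   # both 52.00, as formula (3.3) requires
```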

Thus, the total spread between the groups, caused by the difference in the experimental instruction, turned out to be 22.22, while the combined spread within the groups, caused by uncontrolled side factors of the experiment (the experimental error), turned out to be 29.78. This does not mean, however, that the effect of the experimental error exceeded the effect of the experimental treatment in this experiment. To clarify this point, let us consider the concept of degrees of freedom.

As can be seen, when estimating the combined spread within one experimental group, the estimate was made with respect to one linear constraint (the group mean \( \bar{T}_j \)) and nine elements forming this spread (\( X_{1j}, \ldots, X_{9j} \)); when calculating the combined spread within the two groups, the estimate was made with respect to two linear constraints and 18 elements forming this spread; and when calculating the spread of the data between the groups, the estimate was made with respect to one linear constraint (the overall mean) and the two elements that make up this spread (\( \bar{T}_1 \) and \( \bar{T}_2 \)). Suppose we already know the overall mean, and the task is to place the unknown group means relative to it. Clearly, at the first step of solving this problem we have complete freedom in choosing the first group mean. At the second step, however, we must reconcile the second group mean with the overall mean and with the first one. Thus, we were free only once: the task has only one degree of freedom. It is another matter to place the nine values of X_ij relative to the group mean. Here we can choose any values of X eight times, and only at the ninth step must we reconcile our choice with the previous values of X and the group mean. Thus, this problem has eight degrees of freedom. Finally, as is easy to see, the problem of placing 18 values relative to the two group means has 16 degrees of freedom.

In general, the number of degrees of freedom df can be found by counting the total number of independent elements that form a random spread around certain linear constraints and subtracting the number of those linear constraints. If we denote the number of values (levels) that the independent variable takes by k, the number of degrees of freedom for the statistic SS_treatment equals k - 1, and the number of degrees of freedom for the statistic SS_error equals k(n - 1). The basic relation of the analysis of variance also holds for degrees of freedom. In other words, we can decompose the total number of degrees of freedom into additive parts, just as we did with the sums of squares (see Fig. 3.1). As is easy to see, the number of degrees of freedom for the statistic SS_total is kn - 1, which is exactly the sum of the number of degrees of freedom between the groups of subjects, k - 1, and the number of degrees of freedom within the groups, k(n - 1).
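In our example this gives the following counts (a trivial sketch in Python):

```python
k, n = 2, 9                  # number of groups, subjects per group

df_treatment = k - 1         # 1  (between groups)
df_error     = k * (n - 1)   # 16 (within groups)
df_total     = k * n - 1     # 17 (total)

# Degrees of freedom decompose additively, like the sums of squares
assert df_total == df_treatment + df_error
print(df_treatment, df_error, df_total)
```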

Clearly, the fact that the degrees of freedom of the two statistics differ must be taken into account when comparing them. That is why in the analysis of variance it is not the total spread within and between the groups that is compared, but the spread per one degree of freedom. This quantity is called the mean square (denoted MS). The mean square is obtained by dividing the sum of squares by the corresponding number of degrees of freedom; it is also called the variance. Note that the basic relation of the analysis of variance does not hold for mean squares.

Dividing the computed statistics by the corresponding numbers of degrees of freedom, we find that the variance between the groups, caused by the effect of the independent variable (the difference in instruction between the two groups), is 22.22, while the variance within the experimental groups, caused by side experimental factors (the experimental error), is only 1.86, i.e. 11.94 times smaller. The question remains, however, how significant these differences are.
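The arithmetic of this step, sketched in Python:

```python
ss_treatment, ss_error = 22.22, 29.78
df_treatment, df_error = 1, 16

ms_treatment = ss_treatment / df_treatment   # 22.22
ms_error     = ss_error / df_error           # ≈ 1.86

print(ms_treatment, ms_error, ms_treatment / ms_error)   # ratio ≈ 11.94
```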

It turns out that the significance of these differences can be evaluated using the F-distribution already familiar to us. The value of the F statistic is found by the formula

\[ F = \frac{MS_{treatment}}{MS_{error}} \qquad (3.5) \]

In our case it turns out to be 11.94. Using the tables of the F-distribution (see Appendix 4), we find the critical value of F with one degree of freedom in the numerator and 16 degrees of freedom in the denominator. For the 1% quantile of the distribution it equals 8.53. Thus, it can be concluded that the probability of obtaining a value of F with one degree of freedom in the numerator and 16 in the denominator greater than or equal to 11.94 is less than one chance in 100. As already noted, in psychological research such a small probability is usually considered sufficient to reject the statistical hypothesis of no difference between the variances and, consequently, to accept the alternative hypothesis.
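In practice the whole analysis can be done in one call; the sketch below uses SciPy's scipy.stats.f_oneway, assuming the data of Table 3.2:

```python
from scipy import stats

with_instruction    = [3, 3, 2, 6, 2, 2, 1, 2, 2]
without_instruction = [6, 4, 4, 4, 6, 5, 3, 7, 4]

# One-way analysis of variance for the two independent groups
f_value, p_value = stats.f_oneway(with_instruction, without_instruction)
print(f_value, p_value)                    # F ≈ 11.94, p ≈ 0.003

# Critical value of F(1, 16) at the 1% level, in place of the printed tables
print(stats.f.ppf(0.99, dfn=1, dfd=16))    # ≈ 8.53
```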

Thus, it can be considered statistically demonstrated that significant (reliable) differences in the level of text comprehension are observed between the two groups of subjects who took part in this experiment, which repeats, in a somewhat simplified form, the experiments of J. Bransford and M. Johnson [18].
