Arbitrary number of independent variables

The concepts introduced above generalize readily to the situation in which the researcher studies the effect of an arbitrary number of independent variables on a single dependent variable.

For k independent variables, and provided the raw data have been z-transformed, the linear regression equation (9.1) takes the following form:
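In conventional notation the standardized equation can be sketched as follows. The symbols are assumptions of this sketch: \hat{z}_Y is the predicted standardized score on the dependent variable and z_j is the standardized score on the j-th independent variable.

```latex
\hat{z}_Y = \beta_1 z_1 + \beta_2 z_2 + \dots + \beta_k z_k = \sum_{j=1}^{k} \beta_j z_j
```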

The general procedure for calculating regression coefficients in multiple regression with one dependent variable and k independent variables does not differ fundamentally from what we already know from the simple linear regression considered in Chap. 7. In effect, we need to fit a linear surface (a plane for k = 2, a hyperplane in general) in a space of dimension k + 1 to the points whose coordinates were obtained in the experiment. The number of these points equals the number of subjects participating in the experiment, and the dimension of the space equals the number of independent variables plus one more dimension, which is assigned to the dependent variable. The optimal solution is the one for which the sum of the squared differences between the observed and predicted values of the dependent variable is minimal. It is found by the method of least squares, which relies on differential calculus. We therefore seek the minimum of the following sum, which defines the regression error:
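Written out for N subjects, this sum has the usual least-squares form. The subject index i, the symbol Q for the sum, and the z-score notation z_{iY}, z_{ij} are notational assumptions of this sketch.

```latex
Q = \sum_{i=1}^{N} \Bigl( z_{iY} - \sum_{j=1}^{k} \beta_j z_{ij} \Bigr)^{2} \;\longrightarrow\; \min
```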

The partial derivative with respect to each βj is set to zero. Thus, to find all the regression coefficients, we must solve a system of k equations of the following form:
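A sketch of that system, assuming Q denotes the sum of squared errors over N subjects, i indexes subjects, and m is a second index running over the independent variables:

```latex
Q = \sum_{i=1}^{N} \Bigl( z_{iY} - \sum_{m=1}^{k} \beta_m z_{im} \Bigr)^{2},
\qquad
\frac{\partial Q}{\partial \beta_j}
  = -2 \sum_{i=1}^{N} z_{ij} \Bigl( z_{iY} - \sum_{m=1}^{k} \beta_m z_{im} \Bigr) = 0,
\qquad j = 1, \dots, k
```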

We rewrite this system of equations somewhat differently:
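One way to carry out the rewriting, sketched under the assumption that the z-scores were computed with the N - 1 form of the standard deviation, is to divide every equation by N - 1. Each sum of cross-products of z-scores then becomes a correlation coefficient. Here r_{jm} is the correlation between independent variables j and m (with r_{jj} = 1) and r_{Yj} is the correlation of the dependent variable with independent variable j; these symbols are chosen for the sketch.

```latex
\frac{1}{N-1} \sum_{i=1}^{N} z_{ij} z_{im} = r_{jm},
\qquad
\frac{1}{N-1} \sum_{i=1}^{N} z_{ij} z_{iY} = r_{Yj}

\beta_1 r_{j1} + \beta_2 r_{j2} + \dots + \beta_k r_{jk} = r_{Yj},
\qquad j = 1, \dots, k
```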

The right-hand side of this system is the vector of bivariate correlations of the dependent variable with each of the independent variables, and the left-hand side is the product of the matrix of correlations of the independent variables with one another and the vector of the sought regression coefficients. In other words, using matrix algebra, the system of equations under consideration can be rewritten in the following vector form:
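In matrix notation, with Rjj the k × k intercorrelation matrix, β the column vector of standardized coefficients, and r the vector of correlations with Y, the vector form can be sketched as follows (the element symbols r_{jm} and r_{Yj} are assumptions of this sketch):

```latex
\mathbf{R}_{jj}\,\boldsymbol{\beta} = \mathbf{r},
\qquad
\begin{pmatrix}
1      & r_{12} & \cdots & r_{1k} \\
r_{21} & 1      & \cdots & r_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
r_{k1} & r_{k2} & \cdots & 1
\end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix}
=
\begin{pmatrix} r_{Y1} \\ r_{Y2} \\ \vdots \\ r_{Yk} \end{pmatrix}
```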

Thus, to find the desired vector of regression coefficients βj, it is enough to multiply the inverse of the intercorrelation matrix Rjj by the correlation vector r. The solution for the vector of standardized regression coefficients then looks like this:
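Using the symbols named in the text, and assuming Rjj is nonsingular so that its inverse exists, the solution reads:

```latex
\boldsymbol{\beta} = \mathbf{R}_{jj}^{-1}\,\mathbf{r}
```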

Inverting the intercorrelation matrix can be quite laborious in "manual" calculations, especially when the number of independent variables is large. Modern computer programs, however, handle it easily without requiring large computational resources.
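As an illustration of how a program might obtain the standardized coefficients, here is a minimal pure-Python sketch. It does not form the inverse matrix explicitly; instead it solves the system of equations by Gaussian elimination with partial pivoting, which yields the same coefficients. The function name and all correlation values are hypothetical, and in practice a numerical library would normally be used.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the column from all rows below the pivot row.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


# Hypothetical intercorrelations of three independent variables.
R_xx = [
    [1.0, 0.3, 0.2],
    [0.3, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]
# Hypothetical correlations of each independent variable with Y.
r_xy = [0.5, 0.4, 0.3]

# Standardized regression coefficients, one per independent variable.
beta = solve_linear_system(R_xx, r_xy)
print(beta)
```

Solving the system directly is a common design choice because it avoids the extra rounding error of computing a full inverse. The singularity check matters here: when independent variables are very highly intercorrelated, the matrix becomes nearly singular and the coefficients become unstable.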

Given the standardized regression coefficients β, we can calculate the regression coefficients B of the multiple linear regression equation (9.1). This can be done using the formula
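The standard conversion rescales each standardized coefficient by the ratio of standard deviations. In this sketch s_Y and s_j denote the sample standard deviations of Y and of Xj. The second line assumes that equation (9.1) contains an intercept, written here as A; both the symbol and the presence of an intercept are assumptions.

```latex
B_j = \beta_j \, \frac{s_Y}{s_j}, \qquad j = 1, \dots, k

A = \bar{Y} - \sum_{j=1}^{k} B_j \bar{X}_j
```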

The coefficients R and R² for k independent variables are defined exactly as in the case considered earlier, when the number of independent variables was limited to two. In other words, the multiple correlation coefficient R is estimated as the ordinary bivariate correlation between the observed and predicted values of Y (formula (9.4)), and the coefficient of determination R² as the ratio of the variance of the predicted values of Y to the variance of the observed values of the dependent variable (formula (9.5)).

The same coefficients can also be expressed through the regression coefficients β and the bivariate correlations of the dependent variable with each of the independent variables:
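In the standard form of this relation, with r_{Yj} denoting the correlation of Y with Xj (a symbol assumed for this sketch), each independent variable contributes the product of its standardized coefficient and its correlation with Y:

```latex
R^2 = \sum_{j=1}^{k} \beta_j \, r_{Yj},
\qquad
R = \sqrt{\sum_{j=1}^{k} \beta_j \, r_{Yj}}
```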

It should be borne in mind, however, that the coefficient of determination R² computed from a sample is not an unbiased estimate of the population coefficient of determination ρ²: it tends to overestimate it. Therefore, to obtain a more realistic estimate of ρ², the value of R² must be adjusted. This is achieved using the formula
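The most widely used adjustment, assumed here to be the one intended, corrects for the sample size N and the number of independent variables k. The symbol \tilde{R}^2 for the adjusted value is a notational assumption.

```latex
\tilde{R}^2 = 1 - \left( 1 - R^2 \right) \frac{N - 1}{N - k - 1}
```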

The adjusted coefficient of determination is often called "shrunken", since its value is somewhat smaller than that of R².
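The size of this shrinkage can be illustrated with a short sketch. It assumes the common adjustment 1 - (1 - R²)(N - 1)/(N - k - 1); the function name and the numerical values are hypothetical.

```python
def adjusted_r_squared(r_squared, n_subjects, k_predictors):
    """Adjust R squared for sample size and number of predictors."""
    df_error = n_subjects - k_predictors - 1
    if df_error <= 0:
        raise ValueError("need more subjects than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_subjects - 1) / df_error


# Hypothetical case: R squared of 0.50 obtained with 4 independent variables
# at three different sample sizes.
for n in (21, 51, 201):
    print(n, round(adjusted_r_squared(0.50, n, 4), 3))
```

With the number of independent variables held fixed, the adjusted value approaches R² as the sample grows, so the correction matters most in small samples.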

The part (semipartial) correlation in the case of k independent variables is interpreted in the same way as in the case of two independent variables. Thus, the coefficient sr² is understood as that part of the variance of the dependent variable Y which is associated with the variance of the independent variable Xj, minus the portion of the variance of Y that is simultaneously associated with the variance of the other independent variables. This can be expressed by the following equation:
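In symbols, the squared part correlation is the drop in the coefficient of determination when Xj is removed from the equation. The notation R²_(j) for the coefficient of determination computed from all independent variables except Xj is an assumption of this sketch.

```latex
sr_j^2 = R^2 - R_{(j)}^2
```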

Here the subtracted term denotes the proportion of the variance of the dependent variable Y associated with the variance of all the independent variables except Xj. In other words, it is the coefficient of determination computed without the contribution of the variable for which the part correlation is calculated. A related quantity, the tolerance of the variable Xj, is obtained by subtracting from unity the squared multiple correlation of Xj with all the other independent variables.

The part correlation itself for the variable Xj can be found from the standardized regression coefficient βj and the corresponding tolerance:
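A sketch of this relation, with T_j standing for the tolerance of Xj and R²_{X_j · others} for the squared multiple correlation of Xj with the remaining independent variables (both symbols are assumptions):

```latex
sr_j = \beta_j \sqrt{T_j},
\qquad
T_j = 1 - R_{X_j \cdot \text{others}}^2
```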

The part correlation can also be calculated from the partial correlation coefficient for the variable Xj and the coefficient of determination R²:
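This relation follows algebraically from the definitions of the two coefficients. In squared form, with pr_j the partial correlation of Xj with Y, it can be sketched as:

```latex
sr_j^2 = \frac{pr_j^2 \left( 1 - R^2 \right)}{1 - pr_j^2}
```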

Formulas (8.7) and (8.8) can be useful if the statistical program used for the regression analysis does not directly report part correlations.
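Both routes to the part correlation can be sketched in a few lines. The example assumes the standard relations sr = β·sqrt(tolerance) and sr = pr·sqrt((1 - R²)/(1 - pr²)), and checks them on a hypothetical two-predictor case where every quantity has a closed form. The function names and correlation values are illustrative only.

```python
import math


def part_from_beta(beta_j, tolerance_j):
    """Part correlation from a standardized coefficient and its tolerance."""
    if not 0.0 < tolerance_j <= 1.0:
        raise ValueError("tolerance must lie in (0, 1]")
    return beta_j * math.sqrt(tolerance_j)


def part_from_partial(pr_j, r_squared):
    """Part correlation from a partial correlation and the overall R squared."""
    if abs(pr_j) >= 1.0:
        raise ValueError("partial correlation must lie strictly between -1 and 1")
    return pr_j * math.sqrt((1.0 - r_squared) / (1.0 - pr_j ** 2))


# Hypothetical correlations: Y with X1, Y with X2, and X1 with X2.
r_y1, r_y2, r_12 = 0.5, 0.4, 0.3

# With two predictors the tolerance of either one is 1 - r_12 squared.
tolerance = 1.0 - r_12 ** 2
beta_1 = (r_y1 - r_y2 * r_12) / tolerance
beta_2 = (r_y2 - r_y1 * r_12) / tolerance
r_squared = beta_1 * r_y1 + beta_2 * r_y2

# Closed-form partial correlation of X1 with Y, controlling for X2.
pr_1 = (r_y1 - r_y2 * r_12) / math.sqrt((1.0 - r_12 ** 2) * (1.0 - r_y2 ** 2))

sr_from_beta = part_from_beta(beta_1, tolerance)
sr_from_partial = part_from_partial(pr_1, r_squared)
print(sr_from_beta, sr_from_partial)
```

The two results agree up to rounding error. Squaring either one gives the increase in R² obtained by adding X1 to a model that already contains X2.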

Partial correlations for k independent variables are interpreted in the same way as for two independent variables. As we recall, pr² is the ratio of that part of the variance of the dependent variable which is related to the variance of a given independent variable and is not simultaneously related to the variance of the other independent variables, to the variance of the dependent variable that is not associated with any of the other independent variables. Formally, this definition of the partial correlation can be expressed by the following relationship:
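Formally, with R²_(j) denoting the coefficient of determination computed from all independent variables except Xj (a notational assumption), the definition can be sketched as:

```latex
pr_j^2 = \frac{R^2 - R_{(j)}^2}{1 - R_{(j)}^2}
```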

That part of the variance of the dependent variable which is related to the variance of the independent variable under consideration and is not related to the variance of the other independent variables is, by definition, the squared part correlation. Thus, the squared partial correlation pr² of the variable Xj with the dependent variable Y can be expressed as follows:
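Because the variance explained by the other independent variables alone equals R² minus the squared part correlation, the expression can be written entirely in terms of sr² and R². This is a sketch in the notation assumed here:

```latex
pr_j^2 = \frac{sr_j^2}{1 - R^2 + sr_j^2}
```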


Note that formula (9.8) shows, among other things, that a partial correlation can under no circumstances be smaller in absolute value than the corresponding part correlation. On the contrary, its value is almost always greater than that of the part correlation.
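A short justification, using the assumed symbol R²_(j) for the coefficient of determination computed without Xj:

```latex
pr_j^2 = \frac{sr_j^2}{1 - R_{(j)}^2},
\qquad
0 \le R_{(j)}^2 < 1
\;\Longrightarrow\;
0 < 1 - R_{(j)}^2 \le 1
\;\Longrightarrow\;
pr_j^2 \ge sr_j^2
```

Equality holds only when R²_(j) = 0, that is, when the other independent variables explain none of the variance of Y.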
