Correlation and regression analysis methods

Marketing research data are analyzed and synthesized by manual or computer processing, using both descriptive and analytical methods. The analytical methods most often used in marketing research are trend analysis, nonlinear regression and correlation methods, discriminant analysis, cluster analysis, factor analysis, and others.

When considering dependencies between attributes, two categories must be distinguished first of all: functional and correlation dependence.

A functional relationship is characterized by complete correspondence between a change in the factor attribute and a change in the resultant attribute: each value of the factor attribute corresponds to a fully determined value of the resultant attribute.

In a correlation relationship there is no complete correspondence between changes in the factor and in the resultant attribute; the effect of individual factors appears only on average, across a large number of observations of actual data. In the simplest case of a correlation dependence, the value of the resultant attribute is treated as the consequence of a change in a single factor (for example, the advertising budget as the cause of growth in sales volume).

Correlation analysis makes it possible to calculate the level of confidence in the results of the analysis. In the course of this analysis, correlation measures are calculated, which include correlation coefficients and correlation ratios.

When comparing functional and correlation dependencies, keep in mind that with a functional relationship, knowing the value of the factor attribute lets you determine the value of the resultant attribute exactly. With a correlation dependence, only the tendency of the resultant attribute to change as the factor attribute changes can be established. Unlike a rigid functional relationship, correlation relationships involve many causes and effects, and only their general tendencies can be established.

Let's consider an example of a correlation dependence. We will test the statement that the number of promotions a company holds for a new product increases its sales. To do this, we take a sample of territories (Table 6.8).

Table 6.8

Product sales data for different areas of the city

| Territory (territory code) | Sales volume, packages | Number of promotions |
|---|---|---|
| 1 | 30 | 2 |
| 2 | 60 | 5 |
| 3 | 40 | 3 |
| 4 | 60 | 7 |
| 5 | 40 | 2 |
| 6 | 80 | 6 |
| 7 | 60 | 4 |
| 8 | 90 | 9 |
| 9 | 90 | 8 |
| 10 | 50 | 4 |

The simplest way to detect a relationship is to compare two parallel series of values. Comparing the two data columns of Table 6.8 shows that sales tend to grow as the number of promotions increases.
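The comparison of parallel series can be sketched in Python. This is a minimal illustration, assuming the ten pairs from Table 6.8; the names `observations` and `mean_sales_by_promotions` are hypothetical and not from the source. It groups territories by number of promotions and averages their sales.

```python
from collections import defaultdict

# (sales volume in packages, number of promotions) for territories 1..10, Table 6.8
observations = [(30, 2), (60, 5), (40, 3), (60, 7), (40, 2),
                (80, 6), (60, 4), (90, 9), (90, 8), (50, 4)]

def mean_sales_by_promotions(pairs):
    """Average sales for each distinct number of promotions, in ascending order."""
    groups = defaultdict(list)
    for sales, promotions in pairs:
        groups[promotions].append(sales)
    return {p: sum(v) / len(v) for p, v in sorted(groups.items())}

for promotions, average in mean_sales_by_promotions(observations).items():
    print(f"{promotions} promotions -> average sales {average:.1f}")
```

The averages rise from 35 packages at two promotions to 90 at eight or nine, but not at every step (seven promotions gave 60). The relationship therefore holds only as a tendency, which is what distinguishes a correlation from a functional dependence.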

Another way is to plot the data as a scatter diagram (Fig. 6.2).

Fig. 6.2. Scatter diagram

The pattern of points in Fig. 6.2 indicates that the volume of sales increases as the number of promotions increases. Consequently, there is a positive relationship between the two variables.
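A scatter diagram of this kind can be approximated in plain text. The sketch below is an illustration only, assuming the Table 6.8 data and that all sales values fall on a grid of 10 packages; `ascii_scatter` is a hypothetical helper, not part of the source.

```python
sales = [30, 60, 40, 60, 40, 80, 60, 90, 90, 50]   # packages, Table 6.8
promotions = [2, 5, 3, 7, 2, 6, 4, 9, 8, 4]        # number of promotions

def ascii_scatter(x, y, y_step=10):
    """Return text rows (highest y first) with '*' at every observed (x, y) point.

    Assumes every y value lies on the grid max(y), max(y) - y_step, ...
    """
    points = set(zip(x, y))
    columns = range(1, max(x) + 1)
    rows = []
    for level in range(max(y), min(y) - 1, -y_step):
        marks = "".join(" *" if (col, level) in points else " ." for col in columns)
        rows.append(f"{level:>3} |{marks}")
    rows.append("    +" + "--" * len(columns))                      # x axis
    rows.append("     " + "".join(f"{col:>2}" for col in columns))  # x labels
    return rows

print("\n".join(ascii_scatter(promotions, sales)))
```

The marks run from the lower left to the upper right of the grid, which is the visual signature of a positive relationship.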

Regression analysis makes it possible to quantify the influence of various factors, for example on demand (the volume of possible sales). It consists of selecting and solving mathematical equations that describe the dependences under study. Market elements depend on many factors, and these dependences can take very different forms. Regression analysis therefore begins with plotting the dependence curve; a suitable mathematical equation is chosen on that basis, and the parameters of the equation are then found by solving the system of normal equations.

Regression analysis is used to study the relationship between a dependent variable and one or more independent variables: to determine how close the relationship is, to find its mathematical form, and to predict the value of the dependent variable.

The simplest correlation relationship is a linear relationship between two attributes, or pairwise linear correlation. The equation of a pairwise linear correlation relationship is called the pair regression equation:

$$\bar{y}_x = a + bx,$$

where $\bar{y}_x$ is the average value of the resultant attribute y for a given value of the factor attribute x; a is the free term (intercept) of the equation; b is the regression coefficient, which measures the average deviation of the resultant attribute from its mean when the factor attribute deviates from its mean by one unit of measurement, that is, the variation in y per unit variation in x.

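Let's consider an example. We establish the relationship between sales and the number of promotions using the following linear regression model:

$$Y_i = \alpha + \beta X_i,$$

where $Y_i$ is the sales volume in the i-th territory and $X_i$ is the number of promotions in the i-th territory. The parameters α and β are calculated by the following formulas:

$$\beta = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2}, \qquad \alpha = \bar{Y} - \beta\bar{X},$$

where n is the number of territories and

$$\bar{X} = \frac{1}{n}\sum X_i, \qquad \bar{Y} = \frac{1}{n}\sum Y_i.$$

Let's compile a table of intermediate calculations (Table 6.9).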

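Table 6.9

Sums for calculating the parameters of the straight-line equation

| No. | Sales volume (y) | Number of promotions (x) | xy | x² |
|---|---|---|---|---|
| 1 | 30 | 2 | 60 | 4 |
| 2 | 60 | 5 | 300 | 25 |
| 3 | 40 | 3 | 120 | 9 |
| 4 | 60 | 7 | 420 | 49 |
| 5 | 40 | 2 | 80 | 4 |
| 6 | 80 | 6 | 480 | 36 |
| 7 | 60 | 4 | 240 | 16 |
| 8 | 90 | 9 | 810 | 81 |
| 9 | 90 | 8 | 720 | 64 |
| 10 | 50 | 4 | 200 | 16 |
| Σ | 600 | 50 | 3430 | 304 |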

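Then

$$\bar{X} = \frac{50}{10} = 5, \qquad \bar{Y} = \frac{600}{10} = 60,$$

$$\beta = \frac{3430 - 10\cdot 5\cdot 60}{304 - 10\cdot 5^2} = \frac{430}{54} \approx 7.96, \qquad \alpha = 60 - 7.96\cdot 5 \approx 20.2.$$

The equation of the predicted line takes the form

$$\hat{Y} = 20.2 + 7.96X.$$

The value β = 7.96 indicates that each additional promotion increases sales by about 8 packages on average. The intercept α = 20.2 is the predicted sales volume with no promotions, so a territory with a single promotion is predicted to sell about 28 packages (20.2 + 7.96).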
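The calculation can be checked with a short Python sketch. It assumes the Table 6.8 data and uses the least-squares formulas for the intercept and slope; `fit_line` is a hypothetical helper name, not something defined in the source.

```python
sales = [30, 60, 40, 60, 40, 80, 60, 90, 90, 50]   # y, packages
promotions = [2, 5, 3, 7, 2, 6, 4, 9, 8, 4]        # x, number of promotions

def fit_line(x, y):
    """Least-squares intercept (alpha) and slope (beta) for y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    beta = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    alpha = mean_y - beta * mean_x
    return alpha, beta

alpha, beta = fit_line(promotions, sales)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")              # about 20.19 and 7.96
print(f"predicted sales with 1 promotion: {alpha + beta:.1f}")  # about 28.1 packages
```

The slope alone is the gain per additional promotion; the sum of intercept and slope is the forecast for a territory with exactly one promotion.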

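The average coefficient of elasticity is determined by the formula

$$E = \beta\,\frac{\bar{X}}{\bar{Y}},$$

where $\bar{X}$ and $\bar{Y}$ are the mean number of promotions and the mean sales volume, so that

$$E = 7.96\cdot\frac{5}{60} \approx 0.66.$$

The coefficient of elasticity, equal to 0.66, shows that a 1% increase in the number of promotions raises sales by 0.66% on average.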
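A brief Python sketch of the elasticity calculation, assuming the rounded parameters 20.2 and 7.96 quoted in the text and the means 5 and 60 from the example data; the function names are illustrative only.

```python
ALPHA, BETA = 20.2, 7.96        # rounded regression parameters from the text
MEAN_X, MEAN_Y = 5.0, 60.0      # mean promotions and mean sales in the sample

def mean_elasticity(beta, mean_x, mean_y):
    """Average elasticity of y with respect to x, evaluated at the sample means."""
    return beta * mean_x / mean_y

def predicted_sales(x):
    """Sales predicted by the fitted line for x promotions."""
    return ALPHA + BETA * x

elasticity = mean_elasticity(BETA, MEAN_X, MEAN_Y)

# Cross-check: raise promotions by 1% from the mean and measure the % change in sales.
base = predicted_sales(MEAN_X)
raised = predicted_sales(MEAN_X * 1.01)
pct_change = (raised - base) / base * 100

print(f"elasticity = {elasticity:.2f}, sales change for +1% promotions = {pct_change:.2f}%")
```

Both numbers come out at about 0.66, because for a straight line the elasticity at the means equals the percentage response to a 1% change in x measured from the means.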

Consider another example of correlation and regression analysis. The management of a trading company decided to introduce a new type of price incentive. Using a sample of 10 incentive events, we analyze how sales revenue during the incentive period depends on incentive costs (Table 6.10).

Table 6.10

Marketing data on incentive results

| Event number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Sales revenue Y, thousand rubles | 26.2 | 17.8 | 31.3 | 23.1 | 27.5 | 36.0 | 14.1 | 22.3 | 19.6 | 31.3 |
| Incentive costs X, thousand rubles | 3.4 | 1.8 | 4.6 | 2.3 | 3.1 | 5.5 | 0.7 | 3.0 | 2.6 | 4.3 |

Let's plot the correlation field of the result and the factor (Fig. 6.3).

The correlation field shows a direct (positive) relationship between the factor attribute (X) and the resultant attribute (Y).

We determine the parameters a and b of the linear regression equation y = a + bx by the least squares method. The system of normal equations has the following form:

$$\begin{cases} na + b\sum x = \sum y, \\ a\sum x + b\sum x^2 = \sum xy, \end{cases}$$

where n is the number of observations in the sample (in our case 10); a and b are the required parameters; x and y are the actual values of the factor and resultant attributes.
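For reference, the least squares conditions behind this system can be written out as follows. The symbol S for the sum of squared deviations is introduced here for illustration and does not appear in the source.

```latex
\begin{align*}
S(a, b) &= \sum_{i=1}^{n} \left( y_i - a - b x_i \right)^2 \to \min, \\
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{n} \left( y_i - a - b x_i \right) = 0, \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} x_i \left( y_i - a - b x_i \right) = 0.
\end{align*}
```

Dividing each condition by -2 and collecting the terms in a and b gives the two normal equations of the system.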

Fig. 6.3. Correlation field of the result (sales revenue) and the factor (incentive costs), with the computer-fitted linear regression equation and its reliability of approximation (R² = 0.938)

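The parameters a and b that solve this system can be found from the formulas

$$b = \frac{n\sum xy - \sum x\sum y}{n\sum x^2 - \left(\sum x\right)^2}, \qquad a = \frac{\sum y - b\sum x}{n}.$$

To simplify the calculations, we compile a five-column calculation table (Table 6.11).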

Table 6.11

Calculation table for determining the parameters of the regression equation

| No. | x | y | x² | y² | xy |
|---|---|---|---|---|---|
| 1 | 3.4 | 26.2 | 11.56 | 686.44 | 89.08 |
| 2 | 1.8 | 17.8 | 3.24 | 316.84 | 32.04 |
| 3 | 4.6 | 31.3 | 21.16 | 979.69 | 143.98 |
| 4 | 2.3 | 23.1 | 5.29 | 533.61 | 53.13 |
| 5 | 3.1 | 27.5 | 9.61 | 756.25 | 85.25 |
| 6 | 5.5 | 36 | 30.25 | 1296 | 198 |
| 7 | 0.7 | 14.1 | 0.49 | 198.81 | 9.87 |
| 8 | 3 | 22.3 | 9 | 497.29 | 66.9 |
| 9 | 2.6 | 19.6 | 6.76 | 384.16 | 50.96 |
| 10 | 4.3 | 31.3 | 18.49 | 979.69 | 134.59 |
| Σ | 31.3 | 249.2 | 115.85 | 6628.78 | 863.8 |
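The calculation table can be reproduced from the raw data with a few lines of Python. This is a sketch under the assumption that the inputs are the ten observations of Table 6.10; `calculation_table` is a hypothetical helper name.

```python
costs = [3.4, 1.8, 4.6, 2.3, 3.1, 5.5, 0.7, 3.0, 2.6, 4.3]            # x, thousand rubles
revenue = [26.2, 17.8, 31.3, 23.1, 27.5, 36.0, 14.1, 22.3, 19.6, 31.3]  # y, thousand rubles

def calculation_table(x, y):
    """Rows (x, y, x^2, y^2, xy) plus column totals rounded to two decimals."""
    rows = [(xi, yi, xi * xi, yi * yi, xi * yi) for xi, yi in zip(x, y)]
    totals = tuple(round(sum(column), 2) for column in zip(*rows))
    return rows, totals

rows, totals = calculation_table(costs, revenue)
for name, total in zip(("x", "y", "x^2", "y^2", "xy"), totals):
    print(f"sum of {name}: {total}")
```

The rounding only removes floating-point noise; the totals should agree with the Σ row of the table.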

We calculate the regression parameters from the data in Table 6.11:

$$b = \frac{10\cdot 863.8 - 31.3\cdot 249.2}{10\cdot 115.85 - 31.3^2} = \frac{838.04}{178.81} \approx 4.687, \qquad a = \frac{249.2 - 4.687\cdot 31.3}{10} \approx 10.25.$$

The regression coefficient b shows the absolute strength of the relationship between the variation of x and the variation of y. In this problem, where both quantities are measured in thousand rubles (Table 6.10), each additional 1 thousand rubles spent on sales stimulation changes sales revenue by 4.687 thousand rubles on average.

Thus, the linear regression equation takes the following form:

$$\hat{y} = 10.25 + 4.687x.$$
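As a cross-check, the system of normal equations can be solved directly from the column totals of Table 6.11. The sketch below uses Cramer's rule for the 2×2 system; this is one possible solution method chosen for illustration, and the function name is hypothetical.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_x2, sum_xy):
    """Solve  n*a + sum_x*b = sum_y  and  sum_x*a + sum_x2*b = sum_xy  by Cramer's rule."""
    det = n * sum_x2 - sum_x ** 2
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b

# Column totals from Table 6.11 (n = 10 observations)
a, b = solve_normal_equations(10, 31.3, 249.2, 115.85, 863.8)
print(f"a = {a:.2f}, b = {b:.3f}")                       # about 10.25 and 4.687

# Illustrative forecast for a hypothetical incentive budget of 4.0 thousand rubles
print(f"predicted revenue at x = 4.0: {a + b * 4.0:.1f} thousand rubles")
```

The budget of 4.0 thousand rubles is an arbitrary value inside the observed range of costs (0.7 to 5.5); the fitted line should not be relied on far outside that range.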

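The linear correlation coefficient is determined by the formula

$$r = \frac{n\sum xy - \sum x\sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{838.04}{\sqrt{178.81\cdot 4187.16}} \approx 0.969.$$

The correlation coefficient r ≈ 0.969 indicates a very close relationship between y and x. It agrees with the reliability of approximation of 0.938 shown in Fig. 6.3, since r² ≈ 0.938.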
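The correlation coefficient can also be computed from the column totals of Table 6.11. This is a minimal sketch using the standard sums form of Pearson's coefficient; `correlation_from_sums` is a hypothetical name.

```python
import math

def correlation_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's linear correlation coefficient from raw sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Column totals from Table 6.11 (n = 10 observations)
r = correlation_from_sums(10, 31.3, 249.2, 115.85, 6628.78, 863.8)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

On these totals the sketch gives r of about 0.969 and r² of about 0.938; the squared value is the reliability of approximation reported with the regression line in Fig. 6.3.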
