Validity of Statistical Inference - General Psychological Practice

Validity of Statistical Inference

Misuse of measurement and statistical methods often undermines the validity of an experiment, that is, the correspondence between the goals of the study and its conclusions. The validity of the statistical inference reflects the extent to which the conclusions of the study are supported by the collected data: whether these conclusions can be drawn from the data presented, given the methods of statistical processing used in the study. A reduction in the validity of the statistical inference means that the conclusions made by the researcher do not necessarily follow from the data obtained in the research.

The mechanism by which the validity of the statistical inference is violated is entirely different from that of internal and external validity. Here the problem is not that the influence of some condition is inseparable from the influence of the independent variable and therefore prevents definitive conclusions about the reasons for the change in the dependent variable (as happens with side and additional variables).

When the validity of the statistical inference is violated, the researcher draws wrong conclusions because the data were processed incorrectly. The main threat to this kind of validity is not unaccounted-for influences but the actions of the researcher. As a result of incorrect processing, it is impossible to tell whether the dependent variable really changed from the preliminary measurement to the final one, or whether the change is an illusion created by an inappropriate processing method. The question, then, is not whether some additional influence changed the dependent variable, but whether the dependent variable changed at all or only appears to have changed.

A violation of the validity of statistical inference can distort the findings of a study in two ways, which correspond to the statistical concepts of errors of the first and second kind (Type I and Type II errors). An error of the first kind occurs when the researcher concludes that the hypothesis was confirmed although the data are insufficient for such an inference: the researcher concludes that the dependent variable changed under the influence of the independent variable, whereas in fact it did not change or changed only insignificantly. An error of the second kind is a missed result: the researcher concludes that the hypothesis was not confirmed and that the dependent variable did not change under the influence of the independent variable, whereas in fact it did change, but the change went unnoticed because of incorrect data processing.
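The four possible outcomes of a study's conclusion can be laid out as a small lookup. The sketch below is only an illustration of the two kinds of error described above; the function name `classify_outcome` and its arguments are hypothetical and do not come from the source.

```python
def classify_outcome(effect_is_real: bool, researcher_claims_effect: bool) -> str:
    """Name the outcome of a conclusion given the true state of affairs.

    Both arguments are hypothetical labels: in a real study the true state
    (effect_is_real) is unknown, which is exactly why these errors matter.
    """
    if researcher_claims_effect and not effect_is_real:
        # Hypothesis declared confirmed although the dependent variable did not change.
        return "error of the first kind (Type I)"
    if not researcher_claims_effect and effect_is_real:
        # A real change in the dependent variable was missed.
        return "error of the second kind (Type II)"
    return "correct conclusion"


print(classify_outcome(effect_is_real=False, researcher_claims_effect=True))
print(classify_outcome(effect_is_real=True, researcher_claims_effect=False))
```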

Both errors are fatal for research hypotheses, but they can be corrected by handling the data properly. Often it is enough to reprocess the data already collected and to revise the research objectives and hypotheses in line with the possibilities and limitations of the statistical methods. In some cases, discussed below among the threats to the validity of the statistical inference, correct processing requires supplementing the data, either by going back to those who already took part in the study or by recruiting an additional group of subjects and obtaining the missing data.

Violations of the validity of the statistical inference are not difficult to prevent. At the planning stage it is sufficient to determine clearly how and on what scale the variables will be measured and which statistical methods will be used to process the data. The scales and the methods must be mutually consistent.

If the researcher realizes that the measurement methods originally chosen rest on scales that do not permit the desired statistical processing, he must choose either other measurement methods or processing methods adequate to the measurements. It should be remembered that the methods of measuring and processing data limit the conclusions that a study can reach. These conclusions must answer whether the goal of the study was achieved and whether the hypothesis was confirmed. Therefore, the goal and hypothesis of the study are ultimately formulated with the possibilities of the measurement methods and the statistical processing in mind.

Any research whose goals and hypotheses are formulated with the possibilities and limitations of measurement scales and statistical methods in mind is fairly well, although not 100%, protected from violations of the validity of the statistical inference. The main threat to this validity is therefore gaps in research planning, which depend primarily on the diligence and competence of the researcher. Campbell and colleagues discuss threats to validity other than internal and external validity in less detail. Their followers identify several of the most common threats to statistical validity, corresponding to the most frequent gaps in research planning. They can be arranged as follows:

Choosing a processing method that does not match the measurement scale. According to Stevens, there are four types of measurement scales: nominal (scale of names), ordinal (scale of order), interval (scale of equal intervals) and ratio (scale of relations). Any experimental data are the result of measurement on one of these scales.

Every method of statistical processing has certain requirements (such as normality of the distribution, equality of variances, the number of factor levels, etc.) that determine its purpose and scope of application. These requirements stem mainly from the possibilities and limitations of the measurement scales and from the conditions for combining data measured on different scales (for example, when data on a nominal scale are to be compared with data on an interval scale). A researcher who neglects these requirements violates the statistical validity of the study, because he uses the test for a purpose it was not designed for or goes beyond the scope of its permissible application.
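The link between scale type and admissible statistics can be written down as a simple check. The sketch below follows the usual reading of Stevens' classification; the table is illustrative and far from exhaustive, and the names `SCALE_ORDER`, `MINIMUM_SCALE` and `is_permissible` are assumptions introduced here.

```python
# Scales ordered from weakest to strongest (Stevens' classification).
SCALE_ORDER = ["nominal", "ordinal", "interval", "ratio"]

# Weakest scale on which each statistic is meaningful (illustrative subset).
MINIMUM_SCALE = {
    "mode": "nominal",
    "frequency count": "nominal",
    "median": "ordinal",
    "spearman correlation": "ordinal",
    "mean": "interval",
    "standard deviation": "interval",
    "pearson correlation": "interval",
    "geometric mean": "ratio",
    "coefficient of variation": "ratio",
}


def is_permissible(statistic: str, scale: str) -> bool:
    """Return True if the statistic is meaningful for data on the given scale."""
    required = MINIMUM_SCALE[statistic]
    return SCALE_ORDER.index(scale) >= SCALE_ORDER.index(required)


print(is_permissible("median", "ordinal"))  # True
print(is_permissible("mean", "ordinal"))    # False: the mean needs at least an interval scale
```

A check of this kind is most useful at the planning stage, before any data are collected.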

As a result, the researcher cannot conclude unambiguously whether the dependent variable really changed or whether the apparent change is a false conclusion produced by a measurement error (an error of the first kind). When the research hypothesis is rejected, a similar question arises: did the dependent variable really stay the same, or did the researcher fail to see its change because of incorrect statistical processing, committing an error of the second kind?

Examples of actions that bring this threat about are applying parametric statistics to non-metric scales (nominal and ordinal), using a linear correlation coefficient to evaluate nonlinear relationships, and testing a large number of hypotheses on a single sample.
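Two of these examples can be made concrete with a short sketch. The first part shows a linear correlation coefficient missing a perfect but nonlinear dependence; the second shows how the chance of at least one error of the first kind grows when many hypotheses are tested on one sample. The data are made up, the helper names are hypothetical, and the family-wise formula assumes independent tests, which real studies rarely guarantee.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient, computed from its definition."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


# 1) y depends on x perfectly, but not linearly (y = x squared).
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r = pearson_r(xs, ys)  # 0.0: the linear coefficient sees no relationship

# 2) Twenty hypotheses tested at alpha = 0.05 on the same sample.
fwer = familywise_error_rate(0.05, 20)  # about 0.64
bonferroni_alpha = 0.05 / 20            # 0.0025 per test keeps the overall rate near 0.05

print(r, round(fwer, 2), bonferroni_alpha)
```

The Bonferroni correction shown on the last lines is one common remedy, named here as an addition; the source does not prescribe a specific correction.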

To avoid this threat, the researcher should determine, before data collection begins, the scale on which the data will be presented, plan the methods of statistical processing, and only then apply data collection methods that meet the data requirements of those statistical methods.

Incorrect treatment of the measurement results by the researcher. The most common forms of such incorrect treatment are drawing conclusions that go beyond the limitations of the measurement scales and statistical techniques, and selective reporting of data.

The first violation arises because each measurement scale supports a particular set of relations, which permits only certain conclusions about the objects measured with it. The stronger the scale, the more relations hold on it and the more varied the conclusions that can be drawn from the corresponding measurements.

The nominal scale (scale of names) is the weakest: only one relation holds on it, the equivalence relation. For data measured on this scale, therefore, the only possible conclusion is whether the measured objects are equivalent (interchangeable) or not.

On the ordinal scale, the order relation holds in addition to the equivalence relation. The conclusions it supports concern not only grouping objects by similarity but also the nature of the differences: objects that differ do so in the degree to which the measured quality is expressed, while equivalent objects have this quality to the same extent.

With the interval scale, the researcher can draw both of these conclusions and can also assess by how much the measured quality is more or less expressed in different objects. However, the researcher cannot infer how many times one object exceeds another in the measured quality, nor that the quality is completely absent, because the interval scale has no absolute zero reflecting the total absence of the measured quality.

Such conclusions (how many times more or less, and the total absence of the measured quality) can be drawn only on the ratio scale. The ratio scale is the strongest and permits all of the conclusions listed above.
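Why "how many times" statements fail on an interval scale can be checked numerically. The sketch below uses temperature in Celsius and Fahrenheit, a standard textbook interval scale chosen here for illustration and not taken from the source: the ratio of two values changes when the units change, while the ratio of two differences does not.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Convert Celsius to Fahrenheit (a change of unit and of zero point)."""
    return c * 9 / 5 + 32


# Three illustrative temperatures in Celsius.
a_c, b_c, c_c = 10.0, 20.0, 40.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))  # 50, 68, 104

# "b is twice a" holds in Celsius only: the claim depends on the arbitrary zero.
ratio_c = b_c / a_c  # 2.0
ratio_f = b_f / a_f  # 1.36

# "The second gap is twice the first" holds in both units: differences are meaningful.
diff_ratio_c = (c_c - b_c) / (b_c - a_c)  # 2.0
diff_ratio_f = (c_f - b_f) / (b_f - a_f)  # 2.0

print(ratio_c, ratio_f, diff_ratio_c, diff_ratio_f)
```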

If the researcher draws conclusions that the measurement scale does not permit, the validity of the statistical inference is violated. For example, rating the degree of one's disappointment in points is a measurement on an ordinal scale. The researcher can infer only that one object causes more disappointment than another, but cannot conclude by how much or how many times the disappointment is stronger or weaker, despite the formal presence of numbers in the measurements. Likewise, the researcher has no right to conclude that disappointment is low, moderate or high, or to declare it completely absent or excessive. All these conclusions are inadmissible on the ordinal scale, because it lacks both a zero expressing the absence of the measured quality and a standard unit that could distinguish normal expression from excessive.

A researcher who draws such conclusions oversteps what the data allow and thereby violates the statistical validity of the study, because the measurement scale does not provide the information needed for them. This is a gross violation of the validity of the statistical inference in its classical definition: the conclusions are not based on the measurements used in the study.
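The point that the numbers on an ordinal scale are only formal can be shown with a small sketch. On an ordinal scale any order-preserving relabeling of the points is equally legitimate, so a conclusion that flips under such a relabeling is not supported by the data. The ratings and the relabeling below are made up for illustration.

```python
from statistics import mean, median

# Made-up disappointment ratings on a 1..5 point scale for two hypothetical groups.
group_a = [1, 1, 5]
group_b = [2, 2, 2]

# An order-preserving relabeling of the same five categories (strictly increasing).
relabel = {1: 1, 2: 2, 3: 2.2, 4: 2.5, 5: 3}
group_a_relabeled = [relabel[s] for s in group_a]  # [1, 1, 3]
group_b_relabeled = [relabel[s] for s in group_b]  # [2, 2, 2]

# Comparing means gives opposite answers before and after relabeling.
mean_says_a_higher_before = mean(group_a) > mean(group_b)                     # True
mean_says_a_higher_after = mean(group_a_relabeled) > mean(group_b_relabeled)  # False

# Comparing medians gives the same answer both times.
median_says_a_higher_before = median(group_a) > median(group_b)                     # False
median_says_a_higher_after = median(group_a_relabeled) > median(group_b_relabeled)  # False

print(mean_says_a_higher_before, mean_says_a_higher_after)
print(median_says_a_higher_before, median_says_a_higher_after)
```

The median depends only on order, which is why it stays stable here while the mean does not.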

The second form of incorrect treatment of results, selective reporting of data, means that the researcher for one reason or another does not report some of the results obtained. The obvious reason is the desire to confirm the hypothesis. The researcher excludes from processing the data that contradict the hypothesis or carry no unambiguous information about it, thereby reducing the proportion of subjects in the sample on whom the experimental treatment did not have the effect the researcher wanted.

This practice is called fraud and is punished by rejection of the work or research report. An example is the work of the British psychologist C. Burt, published in the British Journal of Psychology and later recognized as scientific falsification.

Selective reporting can also occur for a different reason, for example when the researcher screens out subjects who did not perform the task correctly (from his point of view), failed to complete it within the allotted time, behaved eccentrically during the study, asked strange questions, or provided conflicting information or information that could not be interpreted unambiguously during data processing. In all these cases the researcher is not to blame for reporting only part of the information: the undisclosed data either could not be processed, or came from subjects whose behavior he judged too deviant, perhaps because of problems with motivation or even deeper problems.

In this case, first, subjects drop out, which creates problems with internal validity associated with sample attrition. Second, external validity is violated, because the sample becomes less representative: people who share certain features, which showed up in their failing to understand the task or spoiling the form, are excluded. Finally, the distribution of the traits under study changes, because part of it is removed from the overall distribution of the data, which bears directly on statistical validity.

When the researcher changes the number of subjects or the distribution of answers by eliminating those that could not be processed or trusted, he restricts the range of the measurement scale and reduces the sample size. Further statistical processing is then not entirely correct, because its results depend on both the scale range and the number of subjects. It becomes impossible to say what determines the conclusions about the results and hypotheses of the study: the real relations between the independent and dependent variables, or the incorrect handling of the data (in other words, whether an error of the first or second kind was made, or the conclusions are correct).
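The dependence of the result on the scale range can be seen in a small sketch. The scores below are invented for twelve hypothetical subjects, and the exclusion rule (keeping only mid-range scores) is an assumption chosen for illustration: the same underlying relationship yields a much weaker correlation once the extremes are removed.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient, computed from its definition."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up scores for 12 hypothetical subjects (illustration only).
x_scores = list(range(1, 13))
y_scores = [6, 1, 2, 9, 10, 5, 6, 13, 14, 9, 10, 17]

r_full = pearson_r(x_scores, y_scores)  # about 0.75 on the full sample

# The researcher keeps only subjects with mid-range x scores (5 to 8).
kept = [(x, y) for x, y in zip(x_scores, y_scores) if 5 <= x <= 8]
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])  # about 0.35

print(round(r_full, 2), round(r_restricted, 2), len(kept))
```

With only four subjects left, the weaker coefficient also rests on a much smaller sample, so both problems named above act at once.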

This threat to the validity of the statistical inference depends directly on the researcher's skill in planning data processing. An experienced researcher uses methods that allow him to process the results of the vast majority of the subjects for whom the methods are intended. In doing so, he asks whether these scales and methods can provide the information needed for conclusions about the goals and hypotheses of the study. Depending on the answer, he either formulates the research objectives in view of what the scale allows, or chooses scales suited to his purposes that yield the necessary information.

The number of measurement errors is also a source of violations of statistical validity. Measurement errors have many sources: incorrectly given instructions, which shift the distribution of the subjects' answers and their task performance; shortcomings of the tasks themselves, such as their inconsistency with the research objectives, the weakness of the experimental treatment they create, the unreliability of the procedure for measuring the dependent variable and its low sensitivity to the level of the variable; and the dishonesty of the subjects, their inattention to instructions and tasks, and their lack of desire and readiness to carry out the assignments.

The more measurement errors are allowed, the worse the collected data reflect the reality being studied. In the worst case, the subjects complete the researcher's tasks by answering at random. No statistical processing of such data will lead to meaningful conclusions, because the data carry no information about the psychological qualities the researcher planned to measure.
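One standard way to quantify this effect comes from classical test theory and is added here as an illustration, not as part of the source: the correlation observed between two measures equals the true correlation multiplied by the square root of the product of their reliabilities. The function name and the numbers below are assumptions.

```python
import math


def attenuated_correlation(true_r: float, reliability_x: float, reliability_y: float) -> float:
    """Correlation expected between two error-laden measures (attenuation formula).

    Reliabilities range from 0 (pure noise) to 1 (no measurement error).
    """
    return true_r * math.sqrt(reliability_x * reliability_y)


true_r = 0.6  # hypothetical true relationship between two psychological qualities

careful = attenuated_correlation(true_r, 0.9, 0.9)  # about 0.54
sloppy = attenuated_correlation(true_r, 0.5, 0.5)   # about 0.30
random_answers = attenuated_correlation(true_r, 0.0, 0.9)  # 0.0: no information left

print(round(careful, 2), round(sloppy, 2), random_answers)
```

The last case corresponds to subjects answering at random: however strong the real relationship, nothing of it survives in the data.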

Only good research planning protects against this threat: selecting high-quality measurement methods, clearly defining the measurement scales, and thinking through the data processing procedures in advance.
