Validity - Psychodiagnostics. Theory and practice


After reliability, validity is the key criterion for evaluating the quality of a technique. The validity of a technique is addressed only after its reliability has been shown to be sufficient, since an unreliable technique cannot be valid. Yet even the most reliable technique is practically useless if its validity is unknown.

It should be noted that the question of validity has until recently been regarded as one of the most difficult. The most firmly established definition of the concept is the one given in A. Anastasi's book: "The validity of the test is a concept that tells us what the test measures and how well it does."

Validity is inherently a complex characteristic. It covers, on the one hand, whether a technique is suitable for measuring what it was created to measure and, on the other, its effectiveness, efficiency, and practical utility.

There is no single universal approach to determining validity. Depending on which aspect of validity the researcher wants to examine, different methods of proof are used. In other words, the concept of validity comprises several types, each with its own meaning. Checking the validity of a technique is called validation.

Validity in the first sense (whether the technique is suitable for measuring what it was created to measure) concerns the essence of the technique itself; that is, it is the internal validity of the measuring instrument. Checking it is called theoretical validation.

Validity in the second sense (the effectiveness, efficiency, and practical utility of the technique) relates not so much to the technique as to the purpose for which it is used. Checking it is called pragmatic validation.

Summarizing, we can say the following:

- in theoretical validation, the researcher is interested in the property (construct) itself that the technique measures. In effect, this means that psychological validation proper is being carried out;

- in pragmatic validation, the essence of what is measured (the psychological property) stays out of view. The main emphasis is on proving that the "something" measured by the technique is connected with certain areas of practice.

Theoretical validation of a technique is carried out by proving its construct validity. Construct validity, substantiated by L. Cronbach in 1955, is the ability of a technique to measure a trait that has been theoretically justified (as a theoretical construct). When an adequate pragmatic criterion is hard to find, the researcher can rely instead on hypotheses derived from theoretical assumptions about the property being measured. Confirmation of these hypotheses testifies to the theoretical validity of the technique. First, the construct that the technique is intended to measure must be described as fully and meaningfully as possible. This is done by formulating hypotheses about it that prescribe what the construct should correlate with and what it should not. These hypotheses are then tested. This approach is most effective for validating personality questionnaires, for which a single validity criterion is difficult to establish.

The construct may be intelligence, personality traits, motives, attitudes, and so on. Construct validity must be addressed when the results of diagnostic measurement are used not simply to predict behavior but to draw conclusions about the extent to which subjects possess a certain psychological characteristic. In such cases the measured characteristic cannot be identified with any observable behavioral feature; it is a theoretical concept. Construct validity is important in developing fundamentally new techniques for which external validity criteria have not been defined.

Thus, to carry out theoretical validation of a technique is to prove its construct validity, that is, to establish that the technique measures exactly the construct (property, quality) that the researcher intended it to measure. If a test was developed to diagnose the mental development of children, it is necessary to analyze whether it really measures this development and not some other characteristics (for example, personality or character). The cardinal problem of theoretical validation is therefore the relationship between psychological phenomena and the indicators through which those phenomena are studied. This check shows how closely the author's intent and the results of the technique coincide.

Most often, the construct validity of a technique is determined through its internal consistency and through its convergent and discriminant validity.

Internal consistency reflects the extent to which the tasks and questions that make up the technique are subordinate to the main direction of what it measures as a whole, that is, are focused on the same phenomenon. Internal consistency is analyzed by correlating the responses to each task with the overall result of the technique. If a test consists of tasks that each correlate significantly with its total score, the test is said to be internally consistent, since all of its tasks are subordinate to the construct the test represents.
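The task-by-total correlation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `pearson` and `item_total_correlations` and the response matrix are hypothetical, and the data are invented. The total score here includes the task itself, as in the description above; a common refinement, not shown, excludes the task from the total before correlating.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant column
    return sxy / sqrt(sxx * syy)


def item_total_correlations(responses):
    """Correlate each task (column) with the subjects' total scores (row sums)."""
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [pearson([row[j] for row in responses], totals) for j in range(n_items)]


# Hypothetical data: 6 subjects x 4 tasks, 1 = solved, 0 = not solved.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

item_total = item_total_correlations(responses)
for number, r in enumerate(item_total, start=1):
    print(f"task {number}: r = {r:.2f}")
```

In this invented sample the fourth task correlates only weakly with the total score, which is the kind of task that such an analysis would flag for review. With six subjects no conclusion about significance is possible; the sample is kept small only for readability.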

Another criterion of internal consistency is the correlation between the total score of the technique and the results on its separate parts. Tests in which intelligence is the construct always consist of separately administered subtests (such as awareness, analogies, classifications, and inferences), whose scores add up to the total test score. Significant correlations between each subtest result and the total score also indicate the internal consistency of the whole test.

In addition, internal consistency is demonstrated with contrasting groups, formed from the subjects with the highest and the lowest total results. The performance of the high-scoring group on the tasks is compared with that of the low-scoring group, and if the first group copes with the tasks better than the second, the technique is recognized as internally consistent.
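The contrasting-groups comparison can be sketched as follows. The function name `contrast_group_differences`, the `fraction` parameter, and the data are hypothetical; the text does not say how large the extreme groups should be, so the share of subjects placed in each group is left as an explicit assumption.

```python
def contrast_group_differences(responses, fraction=0.27):
    """For each task, share of solvers in the top group minus share in the bottom group.

    `fraction` is the assumed share of subjects placed in each extreme group.
    """
    n = len(responses)
    if n < 2 or not 0 < fraction <= 0.5:
        raise ValueError("need at least 2 subjects and 0 < fraction <= 0.5")
    ranked = sorted(responses, key=sum)            # ascending by total score
    k = min(max(1, round(n * fraction)), n // 2)   # group size, groups never overlap
    low, high = ranked[:k], ranked[-k:]
    n_items = len(responses[0])
    return [
        sum(row[j] for row in high) / k - sum(row[j] for row in low) / k
        for j in range(n_items)
    ]


# Hypothetical data: 6 subjects x 4 tasks, 1 = solved, 0 = not solved.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

group_differences = contrast_group_differences(responses, fraction=1 / 3)
print(group_differences)
```

A positive difference means the high-scoring group solved the task more often than the low-scoring group. A difference near zero, as for the last task in this invented sample, suggests the task does not separate the groups.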

As Anastasi emphasizes, the internal consistency of a technique is essentially a measure of its homogeneity. Because this indicator helps to characterize the domain of behavior or the property sampled by the technique, its degree of homogeneity bears on construct validity. Of course, internal consistency alone cannot say much about what a technique measures. However, when the technique rests on carefully developed theoretical foundations and a sound scientific basis, this procedure strengthens the theoretical understanding of its psychological essence.

Another way to determine construct validity is to evaluate the technique on two opposite indicators. The scores of the technique being validated are compared, on the one hand, with techniques that have the same theoretical construct and, on the other, with techniques that have a different theoretical basis. The procedure for assessing convergent and discriminant validity proposed by D. T. Campbell and D. W. Fiske is used for this.

Convergent validity (from the Latin for "to converge on one center") is a conclusion about the similarity (isomorphism or homomorphism) of a given method (technique, test, measure) to another method intended for the same purposes. It is expressed in the requirement that diagnostic indicators be statistically dependent if they are aimed at measuring conceptually related mental properties of the individual.

Discriminant validity (from the Latin for "distinction, difference") is a conclusion about the difference between one method (technique, test, measure) and another that is theoretically distinct from it. It is expressed in the absence of a statistical relationship between diagnostic indicators that reflect conceptually independent properties.

Convergent and discriminant validity are types of criterion validity. This category includes every type of validity that is assessed against an independent criterion, which serves as the standard of evaluation and comparison.

Thus, the procedure for assessing convergent and discriminant validity consists in establishing both the similarities and the differences between the psychological phenomena measured by the new technique and those measured by known techniques. Alongside the technique being validated, it uses a special battery of control techniques, selected to include both techniques presumed to be related to the validated one and techniques presumed to be unrelated to it. The experimenter must predict in advance which techniques will correlate highly with the validated one and which will correlate weakly. Accordingly, a distinction is made between convergent validity (checking the closeness of a direct or inverse relationship) and discriminant validity (establishing the absence of a relationship). Techniques that correlate highly with the validated one are called convergent, and those that do not correlate with it are called discriminant.

Construct validity can be considered satisfactory if the correlation coefficients of the validated technique with the group of convergent techniques are statistically significantly higher than its correlation coefficients with the group of discriminant techniques.
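A minimal sketch of this comparison is given below. All names (`mean_abs_correlation`, the scale labels) and all scores are hypothetical. Absolute values are used because a convergent relationship may be direct or inverse. The sketch only compares the average strength of the correlations descriptively; a formal test of whether the difference is statistically significant is not implemented here and would be needed in a real validation study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_abs_correlation(target, battery):
    """Average |r| between the validated technique and each technique in a battery."""
    values = [abs(pearson(target, scores)) for scores in battery.values()]
    return sum(values) / len(values)


# Hypothetical scores of 8 subjects on the technique being validated.
new_technique = [10, 12, 14, 15, 18, 20, 21, 25]

# Control battery chosen in advance (all names and data are invented).
convergent = {
    "similar_scale_A": [11, 11, 15, 14, 19, 19, 22, 24],
    "similar_scale_B_reverse_scored": [28, 27, 25, 26, 21, 20, 18, 16],
}
discriminant = {
    "unrelated_scale_C": [5, 9, 4, 8, 6, 7, 5, 8],
}

convergent_r = mean_abs_correlation(new_technique, convergent)
discriminant_r = mean_abs_correlation(new_technique, discriminant)
pattern_as_predicted = convergent_r > discriminant_r

print(f"mean |r| with convergent techniques:   {convergent_r:.2f}")
print(f"mean |r| with discriminant techniques: {discriminant_r:.2f}")
print("predicted pattern observed:", pattern_as_predicted)
```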

Confirmation of the whole set of theoretically expected relationships constitutes an important body of evidence for construct validity. In English-language psychodiagnostics, this operational definition of construct validity is referred to as assumed validity.

A correlation between a new technique and a similar one whose validity has already been established indicates that the new technique measures approximately the same psychological quality as the reference technique. If the new technique also proves more compact and economical to administer and score, psychodiagnosticians can use the new tool in place of the old one. This approach is used especially often in differential psychophysiology when methods are created for diagnosing the basic properties of the human nervous system. Factor analysis (factorial validity) has a special place in determining construct validity. It makes possible a rigorous statistical analysis of the structure of relationships among the indicators of the technique under study: determining their factor composition and factor loadings and revealing latent features and the internal patterns of their interrelation.

Thus, theoretical validation of a technique requires a variety of experimental procedures that contribute to the accumulation of information about the construct being diagnosed. If these data support the hypothesis, they confirm both the psychological concept underlying the technique and the ability of the technique to serve as an instrument for measuring that concept. The more convincing the confirmation, the more definitely one can speak of the validity of the technique with respect to the psychological concept underlying it.

Comparing the indicators of a technique with practical forms of activity plays an important role in understanding what the technique measures. Here it is especially important that the technique be carefully worked out theoretically, that is, that it have a solid, well-founded scientific basis. Then, when the technique is compared with an external criterion taken from everyday practice and corresponding to what it measures, information can be obtained that supports the theoretical understanding of its essence.

It is important to remember that once theoretical validity has been proved, the interpretation of the resulting indicators becomes clearer and less ambiguous, and the name of the technique corresponds to its sphere of application.

Pragmatic validation, in turn, means testing a technique for its practical effectiveness, relevance, and usefulness, since it makes sense to use a diagnostic technique only when it has been proved that the measured property manifests itself in certain life situations and certain types of activity. It is given great importance especially where selection is at issue.

In the history of testology one can single out a period (1920-1930) when the scientific content of tests and their theoretical "baggage" were of lesser interest. What mattered was that a test worked and helped to select the best-prepared people quickly. The empirical criterion for evaluating test tasks was considered the only correct benchmark in solving scientific and applied problems.

The use of diagnostic techniques with a purely empirical justification and no clear theoretical basis often led to pseudoscientific conclusions and unjustified practical recommendations. It was impossible to name precisely the features and qualities that the tests revealed. B. M. Teplov, analyzing the tests of that period, called them "blind tests".

This approach to the problem of validity was typical until the early 1950s, not only in the United States but in other countries as well. The theoretical weakness of empirical methods of validation could not fail to draw criticism from those scientists who urged that techniques be developed not only on the basis of "naked" empirical data and practice but also on the basis of a theoretical concept. Practice without theory is known to be blind, and theory without practice is dead. Currently, a combined theoretical and pragmatic assessment of the validity of techniques is perceived as the most productive.

To carry out pragmatic validation of a technique, that is, to evaluate its effectiveness, efficiency, and practical significance, an independent external criterion is usually used: an indicator that has direct value for a particular area of practice. Such a criterion may be academic achievement (for tests of learning ability, achievement tests, and intelligence tests), production achievements (for vocational guidance techniques), the effectiveness of real activities such as drawing or modeling (for tests of special abilities), or subjective evaluations (for personality tests).

American researchers D. Tiffin and E. McCormick, after analyzing the external criteria used to prove the validity, distinguish four types:

1) performance criteria (these may include the amount of work performed, academic performance, time spent on training, the rate of growth in qualifications, etc.);

2) subjective criteria (these include various types of responses that reflect a person's attitude toward something or someone, his or her opinions, views, and preferences; subjective criteria are usually obtained through interviews, questionnaires, and surveys);

3) physiological criteria (these are used in studying the influence of the environment and other situational variables on the human body and psyche; pulse rate, blood pressure, electrical resistance of the skin, symptoms of fatigue, etc. are measured);

4) accident criteria (used when the research concerns, for example, selecting for a job those persons who are less susceptible to accidents).

The external criterion must meet three basic requirements: it must be relevant, free of interference (contamination), and reliable.

Relevance means the semantic correspondence of the diagnostic tool to an independent, real-life criterion. In other words, there must be confidence that the criterion involves the very features of the individual psyche that the diagnostic technique measures. The external criterion and the diagnostic technique must correspond to each other in meaning and be qualitatively homogeneous in psychological essence. If, for example, a test measures individual features of thinking, the ability to perform logical operations on certain objects and concepts, then the criterion must also reflect precisely these skills. This applies equally to professional activity. Such activity has not one but several goals and tasks, each of which is specific and imposes its own conditions for fulfillment. It follows that there are several criteria for the performance of professional activity. Therefore, success on a diagnostic technique should not be compared with overall production efficiency. A criterion must be found that corresponds to the technique in the nature of the operations performed.

If it is not known whether the external criterion is relevant to the measured property, comparing the results of a psychodiagnostic technique with it becomes practically useless. Such a comparison does not permit any conclusions about the validity of the technique.

The requirement of freedom from interference (contamination) arises because educational or production success, for example, depends on two variables: on the person, whose individual characteristics are measured by the techniques, and on the situation, the conditions of study or work, which can introduce interference and "contaminate" the criterion applied. To avoid this to some extent, groups of people who are in more or less the same conditions should be selected for study. Another method can also be used: correcting for the effect of the interference. This correction is usually statistical. For instance, productivity should be taken not in absolute values but relative to the average productivity of workers who work under similar conditions.
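One simple form of such a correction can be sketched as follows. The function name `relative_performance`, the record layout, and the figures are all hypothetical. Dividing by the group mean is only one reading of "relative to the average"; subtracting the mean or standardizing within each group would be alternatives.

```python
from collections import defaultdict


def relative_performance(records):
    """Express each worker's output as a ratio to the mean output under the same conditions.

    `records` is a list of (worker, condition, output) tuples.
    """
    by_condition = defaultdict(list)
    for _, condition, output in records:
        by_condition[condition].append(output)
    means = {c: sum(v) / len(v) for c, v in by_condition.items()}
    if any(m == 0 for m in means.values()):
        raise ValueError("mean output of a group is zero; ratio is undefined")
    return {worker: output / means[condition] for worker, condition, output in records}


# Hypothetical data: workshop B has older equipment, so absolute output is lower there.
records = [
    ("a1", "workshop_A", 80), ("a2", "workshop_A", 100), ("a3", "workshop_A", 120),
    ("b1", "workshop_B", 40), ("b2", "workshop_B", 50), ("b3", "workshop_B", 60),
]

relative = relative_performance(records)
for worker, value in relative.items():
    print(f"{worker}: {value:.2f}")
```

In this invented example worker b3 produces half as much as a3 in absolute terms, yet both stand at the same level relative to colleagues working under the same conditions, so the corrected criterion no longer penalizes b3 for the equipment.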

When it is said that a criterion must have a statistically significant reliability, this means that it must reflect the constancy and stability of the function being studied.

The search for an adequate and easily identifiable criterion is one of the most important and difficult tasks of validation. In Western testology, many techniques were disqualified simply because no suitable criterion could be found for checking them. For example, the validity data for most questionnaires are questionable, since it is difficult to find an adequate external criterion that corresponds to what they measure.

Evaluation of pragmatic validity of methodologies can be quantitative and qualitative.

To calculate the quantitative indicator, the validity coefficient, the results obtained with the diagnostic technique are compared with the data obtained on an external criterion for the same persons. Different types of linear correlation are used (Spearman's, Pearson's).
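As an illustration, the sketch below computes both coefficients for the same set of persons. The function names and the data are hypothetical. Spearman's coefficient is computed here as Pearson's coefficient applied to ranks, with tied values given their average rank. The sample is far smaller than would be acceptable in practice and serves only to keep the example readable.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data for the same 8 persons: test scores and an external criterion
# (for example, an average grade).
test_scores = [12, 15, 9, 20, 17, 11, 14, 18]
criterion = [3.1, 3.6, 2.8, 4.4, 3.9, 3.2, 3.0, 4.5]

validity_pearson = pearson(test_scores, criterion)
validity_spearman = spearman(test_scores, criterion)
print(f"Pearson r = {validity_pearson:.2f}, Spearman rho = {validity_spearman:.2f}")
```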

How many subjects are needed to calculate validity? Practice has shown that there should be no fewer than 50, and preferably more than 200. The question often arises of how large the validity coefficient must be to be considered acceptable. In general, it is considered sufficient for the validity coefficient to be statistically significant. A validity coefficient of the order of 0.20-0.30 is regarded as low, 0.30-0.50 as average, and 0.60 or more as high.

However, as A. Anastasi, K. M. Gurevich, and other authors emphasize, it is not always legitimate to use linear correlation to calculate the validity coefficient. This method is justified only when it has been proved that success in some activity is directly proportional to success on the diagnostic technique. The position of foreign testologists, especially those who deal with occupational aptitude and selection, most often comes down to the unconditional assumption that whoever completes more tasks in the test is more suitable for the profession. But it may be that success in the activity requires the property only at the level of 40% of the test solved. Further success on the test then has no significance for the profession. A clear example from K. M. Gurevich's monograph: a postman must be able to read, but whether he reads at ordinary speed or very fast has no professional significance. With such a relationship between the indicators of the technique and the external criterion, the most adequate way to establish validity may be a criterion of differences.
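The threshold case can be illustrated with a small sketch that compares criterion success below and above a cut-off instead of correlating the two variables. The function name `success_rate_by_cutoff`, the cut-off of 40%, and the data are hypothetical. Only the descriptive comparison is shown; deciding whether the difference between the groups is statistically significant would require a suitable test of differences, which is not implemented here.

```python
def success_rate_by_cutoff(scores, succeeded, cutoff):
    """Share of subjects who succeed on the criterion below and at/above a test cut-off."""
    below = [s for score, s in zip(scores, succeeded) if score < cutoff]
    above = [s for score, s in zip(scores, succeeded) if score >= cutoff]
    if not below or not above:
        raise ValueError("cut-off leaves one of the groups empty")
    return sum(below) / len(below), sum(above) / len(above)


# Hypothetical data: percent of the test solved and success in the activity (1 = yes).
scores = [10, 20, 30, 35, 40, 50, 60, 70, 80, 90]
succeeded = [0, 0, 0, 1, 1, 1, 0, 1, 1, 1]

rate_below, rate_above = success_rate_by_cutoff(scores, succeeded, cutoff=40)
print(f"success below the cut-off:       {rate_below:.2f}")
print(f"success at or above the cut-off: {rate_above:.2f}")
```

In this invented sample, success is rare below the cut-off and common above it, while within the upper group a higher test score adds little. That is the pattern for which a comparison of groups describes validity better than a linear correlation.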

As the experience of foreign testologists has shown, no statistical procedure can fully reflect the diversity of individual assessments. Therefore, another model, clinical assessment, is often used to prove the validity of techniques. This is nothing other than a qualitative description of the essence of the property being studied. In this case, techniques that do not rely on statistical processing are used.

Modern psychometrics has developed dozens of methods for checking the validity of diagnostic techniques, which differ according to the features of the techniques and the temporal status of the external criterion. However, the following are named most often.

1. Content validity means that the technique is valid in the judgment of experts. It is used, for example, in achievement tests. Achievement tests usually include not all the material the students have covered but only a small part of it (3-4 questions). Can one be sure that correct answers to these few questions indicate mastery of all the material? This is the question that a content validity check should answer. To this end, success on the test is compared with teachers' expert assessments (for the same material). Content validity is also suitable for criterion-referenced tests, since they too use expert methods. What is specific here is the object of the expert review: the content of the test. Experts must evaluate whether the content of the test tasks corresponds to the mental property declared as the content of the test being validated. For this purpose the experts are given a specification for the test and a list of tasks. If a particular task fully complies with the specification, the expert marks it as corresponding to the content of the test. This approach is sometimes called logical validity or "validity by definition".

2. Concurrent validity (validity "by simultaneity"), or current validity, is determined using an external criterion on which information is collected at the same time as the technique under test is administered. In other words, data relating to the present are collected: academic performance, productivity over the same period, etc. Success on the test is then compared with these data.

3. Predictive validity (also called prognostic validity). It too is determined against an external criterion, but information on the criterion is collected some time after the test. Although this method corresponds most closely to the task of diagnostic techniques, predicting future success, it is very difficult to apply. The accuracy of a prediction is inversely related to the time span over which it is made. The more time passes after the measurement, the more factors must be taken into account in assessing the predictive value of the technique. Yet it is practically impossible to take into account all the factors that influence a prediction.

4. Retrospective validity. It is determined against a criterion that reflects events or the state of a quality in the past. It can be used to obtain information quickly about the predictive capabilities of a technique. For example, to check how far good results on an aptitude test correspond to rapid learning, one can compare past grades, past expert opinions, etc. for people whose diagnostic indicators are currently high and low.

When reporting the validity of a newly developed technique, it is important to specify exactly which kind of validity is meant (content, concurrent, etc.). It is also desirable to report the number and characteristics of the individuals on whom the validation was carried out. Such information allows a psychologist who uses the technique to decide how valid it is for the group to which he or she intends to apply it. As with reliability, it must be remembered that a technique may have high validity in one sample and low validity in another. Therefore, if the researcher plans to use the technique on a sample of subjects that differs substantially from the one on which validity was checked, the check must be repeated. The validity coefficient given in the manual applies only to groups of subjects similar to those on which it was determined.
