
2.1. content validity - Means of control of the diagnostic

Lecture



This is a continuation of the article on means of controlling the diagnostic qualities of psychological tests.

...

changed in order to determine whether certain topics were repeated, only a very slight correlation was obtained between the initial and subsequent studies. The same pattern was observed in the absence of special instructions. Most of the subjects tried to create new stories, make new drawings, and the like. Perhaps this happened because, in the majority of projective tests, the explanation given to the subject states that the test studies the characteristics of imagination and fantasy. The subject, of course, does not want his results to look worse than those of others because he gave the same answers in the initial and subsequent tests.

2. The validity of the psychological test

Turning to the indicators of test validity, we note first of all that this category of psychological testing belongs to the less developed areas of psychological diagnostics: there are only a few fundamental works on the issue.

Validity (from the English valid: sound, legitimate, suitable) is a complex characteristic of a method (test) that reflects information about the range of phenomena studied, as well as the degree to which the research procedure is representative of them.

In a simplified and generalized formulation, the validity of a test is “a concept that indicates what a test measures and how well it does so” (A. Anastasi, 1982). In the standard requirements for pedagogical and psychological tests (Standards ..., 1974), validity is defined as a body of information about which groups of psychological personality traits can be judged on the basis of test scores or any other means of evaluation, as well as information about how well founded those judgments are. In modern psychological diagnostics, validity is regarded as a mandatory and important body of information about a method, describing the degree of consistency of test data with other information about the person being examined (theoretical expectations, observation, expert assessments, data from other methods whose psychological significance is established, and others). The body of information about validity also includes judgments about how well founded predictions are concerning the development of a psychological quality, personality traits or characteristics, and spheres of behavior.

The above shows that the validity characteristics of a psychodiagnostic test are extremely complex, because they amount to a comprehensive description of the content of the test as a diagnostic tool. And yet the definition just formulated does not cover the entire set of essential features of this category of theoretical psychodiagnostics. The body of information on validity should also describe the specific orientation of the method: a list and description of the groups of subjects by age, education, professional characteristics, socio-cultural affiliation, and the like. In each of these specific cases, the orientation of the test varies somewhat, and this variation is also an element of validity. In addition, information on the validity of a test should cover the adequacy of the activity model used to reflect the psychological characteristics under study, as well as the degree of homogeneity of the tasks (subtests) included in the test and their agreement with the quantitative test score as a whole.

Probably the most important part of validity is the range of the properties under study. This aspect is dominant in determining the specific set of techniques that should be used to study any previously specified psychological properties. This part of the complex definition of validity, in our opinion, needs additional interpretation. Let us resort to an example. General information conveyed by the name of a test is often insufficient for judging its scope. It is only the name of a specific research procedure, and not every such name corresponds to the essence of the method, that is, to its specific purpose in terms of the psychological property studied. Consider the proofreading (correction) test, widely known in various fields of applied psychology. The traits it is designed to study are the stability and degree of concentration of attention (psychomotor mobility). The indicators of the proofreading test agree well with the results of other methods aimed at the same indicators (for example, the Schulte tables, the Gorbov-Platonov technique, etc.). Thus, the proofreading test has high validity for determining these indicators. At the same time, many other factors influence performance on the proofreading test. Among them are neurodynamic features (psychophysiological or temperamental properties), indicators of short-term and working memory, tolerance of monotony, development of reading skill, visual acuity, etc. Such factors can be measured by the proofreading test, but the test is not specific to them. If we use the proofreading test to measure these indicators, its validity will be either slight or questionable.

So, by outlining the scope of the method, the validity of a test reflects how well founded the measurement results are. Clearly, when relatively few side factors affect the final test result, the quantitative assessment given by the test is more accurate. An even greater degree of trustworthiness of the test data is provided by the set of properties measured and their significance for the criterion activity, and by how completely and essentially the object of measurement is reflected in the test content. Thus, to meet the requirements of validity, a method intended for professional selection has to include indicators of qualities that differ in nature. These indicators, however, should be important ones that most accurately reflect the specifics of the particular profession for which success is being diagnosed (for example, attention level, memory features, psychomotor qualities, emotional balance, inclinations, and many others).

As can be seen, the definition of validity covers a large amount of varied information about a test. The different categories of this information, and the principles for organizing it into particular areas of validity, form the types of validity (Fig.). These types are addressed specifically in the sections that follow. Before presenting this information, we note that the subdivisions of validity are only conditional: as we examine them, we will see that the various validity criteria overlap in their content and in the ways they are determined.

[Figure: types of validity (image not reproduced)]

2.1. Content validity

Content validity is one of the main types of validity. It reflects how representative the composition of the test tasks is of the mental properties or functions under study, which may themselves form a complex psychological construct. To measure such a complex psychological characteristic effectively, all of its components must be reflected as completely as possible. Thus, a psychological test can be understood as a combination of several sets of test tasks, each of which is focused on one or several essential components of a psychological construct.

In practical psychological diagnostics, information about content validity usually carries the greatest weight for achievement tests, which examine activity that is close to or coincides with real activity, most often academic or professional.

Achievement tests are a type of psychodiagnostic method focused on assessing the achieved level of development of special abilities, skills, and acquired knowledge in particular branches of human activity. These tests differ from intelligence tests in that they reflect not so much the influence of accumulated experience and general abilities on behavior and on the solution of a wide range of life tasks as the influence of special training programs and vocational training on the efficiency of mastering a given body of knowledge and forming various special skills. Thus, achievement tests are focused on assessing individual achievements after a certain stage of training has been completed.

Another feature that distinguishes achievement tests from intelligence tests is their primary focus on measuring achievements at the time of the examination, whereas the study of general abilities is aimed at forecasting achievements and predicting future development.

Achievement tests are the most numerous group of psychodiagnostic methods, both in the number of specific tests and in their variations. Among them are universal, broadly oriented tests used to assess skills and knowledge within the main long-term areas of study (tests of understanding of scientific principles, of literature perception, of understanding of technical diagrams, of computer skills, etc.). Some of them are designed to measure the impact of learning on logical thinking and on the ability to solve a wide range of tasks. In the composition of their tasks and the content of their results, these tests are the closest to intelligence tests. Examples are the comprehensive batteries of general ability tests: Multiple Aptitude Batteries and the General Aptitude Test Battery (GATB).

Another large group of achievement tests consists of methods focused on analyzing how well specific curricula, in effect individual subjects, have been mastered (achievements in reading, mathematics, computer science, etc.). There are also more specialized achievement tests used to study the assimilation of individual topics or parts of the curriculum.

Achievement tests used in school and professional psychodiagnostics have significant advantages over the existing system of assessing students' academic achievement. Their indicators are focused on the assimilation of the key concepts and elements of curricula, and not of a specific, often random body of knowledge, as is the case with the traditional school system. Thanks to the standardization of indicators, achievement tests allow a student's level of achievement to be compared with the results of the academic group, as well as of any other sample. Such an assessment of the subjects' achievements is objective and bears on predicting success in mastering a particular area of knowledge or profession. This quality, together with the small expenditure of time and effort needed to administer them to relatively large groups of applicants, makes achievement tests an extremely useful tool for conducting entrance exams. But for an achievement test to reflect objectively the essential aspects of mastering a field of knowledge, the control tasks must really concern the important elements of the educational material that ensure its understanding. For this purpose, the content validity of the test must be analyzed.

The main task in developing an adequate model of the activity to be tested is to answer the question: does the selection of test tasks cover precisely the leading aspects of the phenomenon being studied, and are they represented in proportions corresponding to the real activity?

Content validity requirements are built into a test from the very beginning of its design. The first stage of validation is to define the range of properties and activities to be investigated and to break down a complex ability (property) or activity into its components. The model of the test activity itself is developed at the second stage; the elements of the model are grouped in accordance with the most important elements of the real activity. At the last stage, the degree to which the real activity or property is represented in the developed model is analyzed, and the proportions of the elements of the complex activity in the test tasks are checked for agreement with the real ones. Thus, for achievement tests aimed at analyzing the understanding of specific subjects, a complete systematic review of textbooks and curricula is carried out first, along with consultations with specialist methodologists who know the content of the particular branch of the curriculum. On the basis of the information collected in this way, a test specification is drawn up, which indicates the topics to be tested, the final goals that the specific topics serve, and the relative weight of each topic in achieving the learning goal. The test specification is the rationale for the selection of specific tasks. These tasks are in turn evaluated by experts on the basis of their closeness to real-world requirements. The experts give the final judgment on whether the test represents the essential skills and knowledge of the area of study.
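The role of relative topic weights in a test specification can be illustrated with a minimal Python sketch. It assumes that the number of tasks per topic is made proportional to the topic's weight; the function name `allocate_items`, the topic names, and the weights are hypothetical and are not taken from the source, and the rounding rule (largest remainder) is only one possible choice.

```python
def allocate_items(topic_weights, total_items):
    """Split total_items across topics in proportion to their weights.

    Uses largest-remainder rounding so the counts always sum to total_items.
    """
    total_weight = sum(topic_weights.values())
    if total_weight <= 0:
        raise ValueError("topic weights must sum to a positive number")
    # Exact (possibly fractional) share of tasks for each topic.
    exact = {t: total_items * w / total_weight for t, w in topic_weights.items()}
    counts = {t: int(x) for t, x in exact.items()}
    leftover = total_items - sum(counts.values())
    # Give the remaining tasks to the topics with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda t: exact[t] - counts[t], reverse=True)
    for topic in by_remainder[:leftover]:
        counts[topic] += 1
    return counts


# Hypothetical specification: relative weights of three topics.
spec = {"fractions": 5, "equations": 3, "geometry": 2}
plan = allocate_items(spec, 20)
```

Comparing such a plan with the actual composition of a draft test is one simple way to check whether the proportions of the tasks match the specification.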

Expert assessments are widely used in analyzing content validity. Because of this, the procedure for determining content validity resembles the analysis of criterion validity (see 2.2). The essential difference between these types of validity is that, in examining content, expert judgments concern the test itself, while in criterion validation such assessments concern the subjects.

An expert study of the adequacy of the content of the test tasks can be supplemented by additional empirical procedures. For instance, one can check how far test scores differ between students who are just starting to study a certain field of knowledge and those who already have significant experience in the subject. In the latter case the test results will, of course, be significantly better, provided that the test really reflects the essential components of the subject. Such a procedure, in turn, brings content validation closer to the procedure for determining construct validity (see 2.3), which is carried out using the criterion of age differentiation. Note that the value of this criterion in content analysis lies not in identifying the construct that affects the answers, but only in confirming or rejecting the hypotheses that determined the choice of a particular set of tasks, their difficulty, their order in the test material, and the like.

In addition to achievement tests, content analysis is one of the leading forms of validation for criterion-oriented tests and professional selection methods.

In validating personality tests and ability tests, content validity criteria are of limited use and are applied only at the initial stages of developing the test. Methods of personality research often do not resemble the area of behavior under study to the same extent as achievement tests do. Answers to questionnaires and data from projective studies allow only indirect judgments about the real activity of the individual. The manifestation of personality traits, like the realization of abilities, is individualized. In an ability test, effective problem solving can be achieved through logical thinking, rote memory, psychomotor mobility, and the like; the result can be reached in various ways and by various means. In addition, ability tests are not directly related to the study of specific branches of knowledge or to the acquisition of specific life and professional experience.

Consequently, procedures for determining content validity are important among the other types of validity because they are mandatory in the development of achievement tests and criterion-oriented tests, which, according to A. Anastasi (1982), constitute one of the most promising branches of applied psychological diagnostics.

2.2. Empirical validity

Empirical validity combines a set of different characteristics of test validity that are determined using comparative statistical methods. It covers criterion validity and its two varieties, current and predictive, as well as construct validity.

As already noted, in determining content validity the qualities of a test are analyzed mainly through qualitative procedures: descriptive analysis of information, with experts and other sources brought in to judge how closely the test tasks match the content of the psychological property under study. Determining empirical validity, by contrast, always involves statistical tools for correlating and factoring the data being compared. Most often, a correlation analysis is made of the degree of association between scores on the test being validated and indicators reflecting some external parameter (independent of the psychological test itself) of the psychological quality under study. This procedure is typical for determining the criterion validity of a test. In establishing discriminant and convergent construct validity, scores on the test being validated are usually compared with scores on another psychological test whose validity is already known (see 2.2.3). Clearly, when statistical tools of analysis are used, the main indicators revealing the essence of validity are correlation coefficients, that is, quantitative validity coefficients (the interpretation of validity coefficients is discussed in more detail in section 2.3).
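As a minimal sketch of how a quantitative validity coefficient might be computed, the Python fragment below calculates the Pearson correlation between test scores and an external criterion. The choice of the Pearson coefficient, the function name `pearson_r`, and the sample numbers are assumptions made for illustration; the source does not prescribe a specific coefficient.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: test scores and an independent criterion
# (for example, expert ratings) for the same five subjects.
test_scores = [12, 15, 9, 20, 17]
criterion = [3.1, 3.8, 2.5, 4.6, 4.0]
validity_coefficient = pearson_r(test_scores, criterion)
```

A coefficient close to 1 would indicate close agreement between the test and the criterion in this sample; a value near 0 would indicate no linear association.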

Perhaps the most common procedure for determining the empirical validity of a psychological test is the study of criterion validity. The main problem in comparing test data with an independent indicator (criterion) of the quality under study is the rational selection of that criterion. For this reason, it is appropriate to begin the discussion of criterion validity with the independent criterion itself: the content of the concept and the means of selecting it.

2.2.1. Validation criterion

A validation criterion can be any measure that is direct, independent of the test being validated, and that directly or indirectly reflects a mental property, personal quality, manifestation of behavior, success of activity, and the like. In other words, a validation criterion is an indicator of the mental property measured by the test, but obtained in some way other than by the test being validated.

In practical psychological diagnostics, the following kinds of information are most often used as validation criteria:

  • objective socio-demographic and biographical data (specialty, length of service in the specialty, education, membership of particular social strata and groups, facts of hiring or dismissal, imprisonment, etc.);
  • indicators of academic performance, academic grades, judgments of teachers and parents about achievements in learning, test results, success in mastering educational material;
  • production indicators in particular kinds of professional activity, characteristics of attitude to work, labor discipline, etc.;
  • performance of the real criterion activity (drawing, modeling, music, composing stories, etc.);
  • medical diagnosis, as well as any other conclusions of specialists and results of consultations;
  • control tests of acquired knowledge and skills; data from other methods and tests whose validity is considered established.

Validation criteria can in general be divided into objective criteria (independent of people's subjective assessments and judgments) and subjective ones. The latter, unfortunately, are the ones most often used in the criterion validation of psychodiagnostic methods. They include academic grades and characterizations of a person's character and behavior, views, relations with other people, etc., given by experts (teachers, doctors, production managers, psychologists, and other specialists). Most often the expert rates a property using a specially developed scale. Expert validation criteria are made more objective by increasing the number of experts.

Four methods of expert validation are the most common.

1. Collective assessment.

Using this method, the experts evaluate the object together on a special scale. The rating scale is most often chosen on the principle of familiarity for the expert; teachers, for instance, find it convenient to score on a five-point scale. A condition of the collective assessment method is that a compromise decision about the object under study be worked out. In public life this way of reaching decisions is perhaps the most common (meetings, conferences, etc.), but with this method of evaluation the results depend largely on the personalities and group interaction of the experts.

2. Weighted average.

With this approach, each subject is also assessed by a group of experts, but none of them knows the judgments that the other experts have made about the object. When the assessment is complete, the experts' conclusions are averaged.
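A minimal Python sketch of this averaging step is given below. It assumes that each expert independently rates every subject on the same numeric scale and that a plain arithmetic mean is taken; the function name `average_ratings` and the sample ratings are hypothetical.

```python
from statistics import mean


def average_ratings(ratings_by_expert):
    """Average independent expert ratings for each subject.

    ratings_by_expert maps an expert's name to a dict {subject: rating}.
    Returns a dict {subject: mean rating across the experts who rated it}.
    """
    collected = {}
    for ratings in ratings_by_expert.values():
        for subject, rating in ratings.items():
            collected.setdefault(subject, []).append(rating)
    return {subject: mean(values) for subject, values in collected.items()}


# Hypothetical five-point ratings given independently by two experts.
ratings = {
    "expert_1": {"subject_A": 4, "subject_B": 2},
    "expert_2": {"subject_A": 5, "subject_B": 3},
}
averaged = average_ratings(ratings)
```

If some experts are considered more competent than others, their ratings could be given larger weights, but the source does not describe a weighting scheme, so none is assumed here.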

3. The ranking method.

With this approach, the group of experts, together or independently of one another, sorts the subjects into several groups in order of increasing or decreasing manifestation of a given quality. Instead of grouping, ranked lists can be compiled. As in the first two methods, the final judgments are reached by compromise or by averaging.

4. Paired comparisons.

In determining rank positions, experts often have difficulty identifying the manifestation of the trait in question and rating it according to the gradations of a scale. Such difficulties are especially characteristic of assessments of undifferentiated traits (whether a person is kind, intelligent, happy, etc.) and of experts who lack skill and experience in evaluating other people. In these circumstances, the paired comparison method has significant advantages. The experts' task is to compare objects in pairs according to alternative features (“active-calm”, “cheerful-sad”). Another variant is also possible: the whole group of subjects is divided into pairs (as is done in a sociometric table), and it is determined which member of each pair shows the quality under study more strongly. The indicator of a subject's rank is the number of choices he or she receives across all experts. This indicator can take into account the number of experts and objects and can be expressed as a percentage. For this calculation, J. Guilford's formula is used (%):

[Formula: image not reproduced in the source]

where B is the number of choices; n is the number of objects of comparison; N is the number of experts.
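The counting step of the paired comparison method can be sketched in Python as follows. The sketch only tallies B, the number of choices each subject receives across all experts, and ranks the subjects by that count; it does not apply Guilford's percentage formula, which is not reproduced in the text. The function name `choice_counts` and the sample judgments are hypothetical.

```python
from collections import Counter
from itertools import combinations


def choice_counts(subjects, expert_choices):
    """Count how often each subject is chosen in paired comparisons.

    expert_choices holds one dict per expert; each dict maps every pair
    (a, b) of subjects to the subject judged to show the quality more strongly.
    """
    pairs = list(combinations(subjects, 2))
    counts = Counter({s: 0 for s in subjects})
    for choices in expert_choices:
        for pair in pairs:
            chosen = choices[pair]
            if chosen not in pair:
                raise ValueError(f"{chosen!r} is not a member of pair {pair}")
            counts[chosen] += 1
    return counts


# Hypothetical judgments of two experts about three subjects.
subjects = ["A", "B", "C"]
expert_choices = [
    {("A", "B"): "A", ("A", "C"): "A", ("B", "C"): "B"},
    {("A", "B"): "A", ("A", "C"): "C", ("B", "C"): "B"},
]
counts = choice_counts(subjects, expert_choices)
# Rank subjects from the most to the least frequently chosen.
ranking = sorted(subjects, key=lambda s: counts[s], reverse=True)
```

With n subjects, each expert makes n(n - 1)/2 comparisons, so the workload grows quickly as the group of subjects gets larger.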

The choice of validation criteria is a crucial step in the design and validation of a test. One of the most difficult issues is determining how general the selected criterion should be. The more complex the psychological quality under study, the broader the criterion should be. For example, in validating a general intelligence test, the validation criterion can be the overall effectiveness of the subject's educational, industrial, or scientific activity; in validating a method for measuring “extraversion - introversion”, it can be expert assessments of such personality characteristics as intensity of communication, social responsibility, emotionality, etc. Narrower and more particular criteria are used in substantiating tests that measure the level of specific professional abilities and skills, such as manual dexterity, numerical memory, and the like. The breadth of the selected criteria is closely tied to the scope of psychological phenomena covered by the test, the homogeneity of its tasks, and the concreteness and unambiguity of the interpretation of results. The broader the validation criterion, the more heterogeneous the test tasks may be and the broader the interpretation of the test results.

In real practical activity, a person's success or failure in any area is determined not by some isolated factor but by a combination of factors that together determine ability or inability. Therefore, every validation criterion is inherently multicomponent. Thus, in validating an intelligence test, one should not be limited to general indicators of a person's future activity. It is also necessary to take into account aptitudes, the intensity of self-learning, certain features of behavior, traits of character, and the like.

Of the types of information listed above from which validation criteria are compiled, academic grades are mainly used in validating intelligence tests; performance indicators in the criterion activity, in validating tests used for professional selection and vocational guidance; clinical diagnoses and expert consultations, in substantiating some special questionnaires most often used in clinical psychodiagnostics; and expert assessments of behavioral traits and the emotional-volitional sphere, in validating personality questionnaires.

2.2.2. Criterion validity

Criterion validity is understood as a complex of characteristics of empirical validity that reveal the degree of consistency of test results with independent criteria reflecting the state of the measured psychological property. Criterion validity covers the relationships between test scores and the current state of the psychological phenomenon under study, as well as the relationship between test results and its more or less distant future state. In the latter case, the test scores are compared with validation criteria obtained later, which characterize the development of the psychological characteristic some time after testing. Thus, not only the present diagnostic value of the test is investigated, but also its ability to foresee how a psychological phenomenon will develop. Within criterion validity, therefore, it is possible to distinguish so-called diagnostic (current) validity and predictive validity. For current criterion validity, the correlation of test scores with independent criteria is studied in parallel with the testing. The correlation coefficients indicate how well the leading factors of the psychological phenomenon under study are represented in the test results. Predictive criterion validity may reflect how closely a prediction made on the basis of test data coincides with the actual state of the phenomenon some time later, and may also indicate the time interval within which the test prediction can be considered valid and scientifically justified.

We will consider the specific qualities of current and predictive validity below; for now it is appropriate, in our opinion, to dwell on typical approaches to the study of criterion validity.

Criterion validation gives the most effective results when it is carried out by the method of contrasting groups, which uses a complex hypothetical criterion reflecting the combined effect of many factors. For example, in validating an intelligence test, the test scores of children with mental retardation may be compared with the scores of normal schoolchildren of the same age. The many factors that lead to a child's transfer to a special educational institution for persons with insufficient mental development (a complex hypothetical factor) are precisely the validation criterion. Similarly, the criterion validity of a personality questionnaire for measuring the “neurotic level” is determined by comparing measurement results in patients suffering from neurosis and in practically healthy people. Here the hypothetical factor is the multitude of qualities that lead to neurosis. The complexity of characterizing criterion validity by the method of contrasting groups, selected on the basis of a hypothetical factor, brings this procedure closer to the determination of construct validity (see 2.2.3).
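A comparison of contrasting groups can be sketched in Python as follows. The sketch uses a standardized mean difference (Cohen's d with a pooled standard deviation) as the measure of how strongly the test separates the two groups; this statistic is an illustrative choice rather than a procedure named in the source, and the function name `cohens_d` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized difference between the mean scores of two groups.

    Uses the pooled sample variance; each group needs at least two scores.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("scores do not vary within the groups")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical test scores for two contrasting groups.
comparison_group = [34, 41, 38, 45, 40]
clinical_group = [22, 30, 27, 25, 31]
separation = cohens_d(comparison_group, clinical_group)
```

A large absolute value suggests that the test distinguishes the groups well in this sample; a significance test would normally accompany such a comparison.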

From the point of view of practical test validation, it is more efficient to compare test results with particular criterion measures of the essential elements of the activity or property under study. Thus, a test of the professional abilities of an office clerk may include analysis of literacy, arithmetic skills, combinatory thinking, business communication skills, and skills in handling business documentation. The validity of a test so complex in the nature of its tasks can be determined using the hypothetical criterion of “success at work” considered above. Comparing test results with criterion measures of the particular abilities, however, will be more accurate and correct in determining validity. Information about current or predictive validity that is determined using a particular criterion characterizes so-called synthetic validity. The process of synthetic validation consists of a detailed analysis of the test activity in order to identify its essential components and to determine the relative weight of each element in the complex hypothetical criterion of activity. This approach makes it possible to apply not only correlation analysis but also the more complex and informative tools of factor and regression analysis.

Criterion validation can also be based on a criterion that reflects the state of the studied quality in the past (retrospective validation). Using this technique, the scores on an intelligence test can be compared with criterion information dating back to childhood (information from parents about the child's mental development, characteristics of behavior, academic grades in elementary school, etc.). In this case, one can accept, with certain reservations, the hypothesis that if a person has developed successfully since childhood, then in subsequent years his or her mental development will be somewhat ahead of the average. We note that, because retrospective evaluations are to some degree subjective, this validation tool is auxiliary and is used in cases where the test has not been sufficiently validated against a current or predictive criterion.

So, criterion validity combines two groups of information about a test: the nature and degree of the relationship between the test's indicators and the present state of the quality under study, and the forecast of this relationship for the future. These two types of criterion validity are understood and determined in somewhat different ways. Let us dwell on them in more detail.

Current (diagnostic, competitive) validity. In a broad sense, current validity means the ability of a test to distinguish between subjects on some diagnostic feature that is the object of study in the method (for example, level of general abilities, verbal intelligence skills, level of achievement, extraversion, etc.). This is why “diagnostic validity” is among the terms synonymous with current validity.

In a narrower sense, current validity is the relationship between scores on the test being validated and an independent criterion that reflects the state of the phenomenon under investigation at the time the test was conducted.

A distinctive indicator that reveals the essence of current validity is information about how convenient and economical the test is, compared with other sources (observation, analysis of objective data, control tests, expert evaluation, etc.), as a way of obtaining information on the quality studied. The term “competitive validity” points to precisely this property of current validity. The competition lies in the fact that indicators of diagnostic validity help answer the question: which is simpler and more expedient, to survey team members with a test of professional achievement or to analyze traditional indicators such as quality of work, satisfaction, and staff turnover?

Current validity is one of the leading criteria in characterizing the validity of any psychodiagnostic method, but especially high requirements for current validity are placed on clinical tests aimed at clarifying a differential diagnosis, screening methods for the preliminary selection of subjects, achievement tests, and methods for measuring general and special abilities. The leading means of determining diagnostic validity is still the method of contrasting groups, followed by statistical analysis of the degree of agreement between the test results and the current independent criterion.
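The method of contrasting groups can be illustrated with a small calculation. The sketch below is only one possible way to do the statistical step: it assumes two hypothetical groups that differ on the independent criterion (the scores are invented for illustration) and uses the point-biserial correlation between the test score and group membership as the measure of agreement. The function name and data are assumptions, not part of the original text.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(high_group, low_group):
    """Correlation between test score and membership in the high-criterion group."""
    scores = high_group + low_group
    n1, n0 = len(high_group), len(low_group)
    n = n1 + n0
    spread = pstdev(scores)  # population standard deviation of all scores
    if spread == 0:
        raise ValueError("test scores have no variance")
    return (mean(high_group) - mean(low_group)) / spread * sqrt(n1 * n0 / n**2)


# Hypothetical test scores for two contrasting groups (assumed data).
high = [24, 27, 22, 30, 26]  # subjects rated high on the independent criterion
low = [15, 18, 20, 14, 17]   # subjects rated low on the independent criterion

r_pb = point_biserial(high, low)
print(f"point-biserial correlation: {r_pb:.2f}")
```

A coefficient close to 1 would mean that the test separates the contrasting groups well; a coefficient near 0 would mean that it does not distinguish them.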

Predictive validity. Predictive validity is the body of information about a method that reveals how accurate and well-founded a judgment about the diagnosed mental property remains some time after the measurement, when that judgment is based on the test results. Predictive validity reflects the time interval over which judgments about the state of psychological properties, made from an analysis of test scores, remain valid. Information about predictive validity is directly related to the presumed potential of the method and to the rules for extrapolating survey results into the future.

Indicators of behavior and abilities that are shifted in time relative to the testing period can serve as the validation criterion in determining predictive validity. When comparing predictive and current validity, it should be borne in mind that, although the principles of criterion comparison are similar in both cases, analyzing the predictive characteristics of a test is a much more complicated problem. The accuracy of a forecast made from test results is inversely related to the forecast period. Justifying a long-range extrapolation of test data requires taking into account far more factors than assessing the diagnostic power of a test. Moreover, over the period set for validation these factors can change intensively, and the course of their change may be subject to age-related features of the development of mental phenomena.

Determining predictive validity is very difficult for tests aimed at measuring complex properties and activities, or complex psychological constructs such as general abilities, integrative personal qualities, level of achievement in educational and professional activity, and so on. The development of these psychological properties depends largely on acquired knowledge and skills and can change radically depending on the circumstances of the individual's life and activity. A particular problem is predicting an activity whose elements change in a complex way in the course of age development. Thus, when predicting achievement in learning to read at the early stages of instruction, the leading indicator is reading speed. As reading skills improve, the main characteristic becomes the level of text comprehension, whose development becomes dominant in learning. From this point on, a technique based only on tempo indicators loses its predictive value, since it rests on levels of development of the activity that have already been passed.

When diagnosing the effectiveness of teaching younger schoolchildren, mechanical (rote) memory occupies a large place in the set of indicators on which the forecast can be based. In the upper grades the predictive value of this indicator decreases markedly, giving way to organized memory and the analytical assimilation of knowledge. These examples indicate the need for an in-depth analysis of the psychological constructs underlying the tests, an understanding of their development, and a clear idea of the significance of the properties being studied for the future criterion activity. The need for such an analysis also brings the procedure for determining predictive validity close to the analysis of construct validity.

In a remote (delayed) criterion comparison there is a risk of obtaining an inadequate picture of predictive validity if an outdated criterion is reused. An objective indicator of professional activity may turn out to be too simple by the time of the remote comparison. Suppose, for example, that in the analysis of its diagnostic validity an achievement test was compared with success in the simplest production operation, which the subjects in the validation sample performed with varying success at the beginning of their work in the profession. If, during remote validation, the test results are compared with performance after a certain period of professional training, the initial operation that served as the criterion before training will have become very simple, and almost all subjects in the validation sample will cope with it well. Clearly, a comparison with a new level of the independent criterion will be more correct. It will yield a reliable spread in the subjects' success after vocational training. The new, more complex criterion can be chosen by studying the statistical distribution of the criterion indicators at remote validation. For instance, in analyzing diagnostic validity we may take as the criterion a task that, say, 50% of individuals in the sample complete successfully. After the training we must again select a task that the same proportion of the sample completes successfully. This task will be the criterion for remote predictive validation.
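The choice of a criterion task with the same pass rate before and after training can be sketched as follows. This is a minimal illustration under assumed data: the task names, the pass/fail records, and the helper `pick_criterion_task` are all hypothetical, and "closest pass rate to the target" is just one simple selection rule.

```python
def pick_criterion_task(results, target_rate=0.5):
    """Return the task whose pass rate is closest to target_rate.

    results maps a task name to a list of 0/1 outcomes, one per subject.
    """
    def pass_rate(outcomes):
        return sum(outcomes) / len(outcomes)

    return min(results, key=lambda task: abs(pass_rate(results[task]) - target_rate))


# Hypothetical outcomes for the same validation sample (assumed data).
before_training = {
    "simple_operation": [1, 0, 1, 0, 1, 0, 0, 1],   # 50% succeed
    "complex_operation": [0, 0, 0, 1, 0, 0, 0, 0],  # 12.5% succeed
}
after_training = {
    "simple_operation": [1, 1, 1, 1, 1, 1, 1, 0],   # 87.5% succeed: too easy now
    "complex_operation": [1, 0, 1, 1, 0, 0, 1, 0],  # 50% succeed
}

criterion_before = pick_criterion_task(before_training)
criterion_after = pick_criterion_task(after_training)
print(criterion_before, criterion_after)
```

With these invented data the simple operation serves as the criterion before training and the complex operation after it, which keeps the spread of criterion results comparable at both points in time.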

In some cases current validity can, to a certain extent, substitute for predictive validity. It is sometimes impractical to delay validation excessively by waiting for long-term criterion measures of the psychological property on a single validation sample. To speed up the procedure, the test can be given to a group for which criterion data already exist. For example, student test data can be compared with their academic performance, and employee test data with their success at work. In some cases, to obtain information about the predictive quality of the test quickly, one can resort to retrospective validation.

One of the most reliable means of gathering information about the predictive capabilities of a test is the cohort method. Consider this approach by example. The test developers need to check to what extent the results of a test of certain aspects of emotional states are predictive of the likelihood of psychosomatic disease, that is, whether a certain test result allows one to infer an increased risk of such a disease. To do this, a group of individuals is selected with a certain margin, taking into account the likely size of the contrasting groups in the future. Suppose epidemiological data show that, within three years, 57 people out of 1,000 can be expected to fall ill. This means that the preliminary mass testing must cover about 2,000 subjects in order to obtain a diseased group of about 100 people. The predictive potential of the test will be confirmed if there is a statistically reliable quantitative and qualitative difference between the test results, obtained when the cohort was recruited, of those who later fell ill and those who remained healthy.
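The arithmetic behind the cohort size in this example can be checked with a short calculation. The incidence (57 per 1,000 over three years) and the target of about 100 cases come from the text; the function name `cohort_size` is a hypothetical helper, and the calculation ignores dropout, which a real study would have to allow for.

```python
from math import ceil


def cohort_size(target_cases, incidence):
    """Smallest cohort expected to yield at least target_cases cases."""
    if not 0 < incidence <= 1:
        raise ValueError("incidence must be a proportion in (0, 1]")
    return ceil(target_cases / incidence)


incidence = 57 / 1000                        # expected proportion falling ill in three years
minimum = cohort_size(100, incidence)        # strict minimum for about 100 cases
expected_cases_in_2000 = 2000 * incidence    # what a cohort of 2,000 should yield

print(minimum, round(expected_cases_in_2000))
```

The strict minimum is a little over 1,750 subjects, so a cohort of about 2,000 leaves a margin: it should yield roughly 114 cases.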

Determining predictive validity is mandatory for the proper use and interpretation of a test. Naturally, the criterion of predictive validity is most important for methods aimed directly or indirectly at predicting the development of a particular psychological construct or activity. These include, first of all, tests of general abilities, career guidance techniques, selection tests, and the like.

The importance of predictive validity indicators for test procedures intended for professional selection is underscored by the introduction of a special term, incremental validity. This indicator of predictive validity shows how much more likely a correct choice is when it is based on test results than when it is random or based on traditional means (analysis of personal records, interviews, control tests, etc.).
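One simple way to picture incremental validity is to compare success rates among candidates selected in different ways. The text gives no formula, so the sketch below is an assumption: it treats the gain as the difference between the success rate of test-based selection and that of random or interview-based selection. All data and names are hypothetical.

```python
def success_rate(selected, succeeded):
    """Share of selected candidates who later succeeded on the criterion."""
    chosen = [ok for pick, ok in zip(selected, succeeded) if pick]
    if not chosen:
        raise ValueError("no candidates were selected")
    return sum(chosen) / len(chosen)


# Hypothetical records for ten candidates (1 = yes, 0 = no); assumed data.
succeeded    = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]  # later success at work
by_test      = [1, 0, 1, 1, 0, 1, 1, 0, 0, 0]  # would be selected by the test
by_interview = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]  # would be selected by interview

base_rate = sum(succeeded) / len(succeeded)          # success under random choice
test_rate = success_rate(by_test, succeeded)
interview_rate = success_rate(by_interview, succeeded)

gain_over_random = test_rate - base_rate
gain_over_interview = test_rate - interview_rate
print(f"{gain_over_random:.2f} {gain_over_interview:.2f}")
```

A positive gain suggests that the test adds something to the selection decision; a gain near zero suggests that it adds little beyond the existing procedure.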

Consideration of criterion validity must be supplemented by a very significant aspect of criterion comparison. This is so-called criterion contamination: a set of phenomena associated with the influence that information about a subject's psychodiagnostic results has on the subjective attitude of other people toward that subject. Thus, if a teacher learns that a student has failed an achievement test in a particular academic subject (which essentially indicates insufficient mastery of certain sections of the program), this may affect the teacher's expert assessment of the student's educational activity in general. The effect of criterion contamination is greater the more authority and trust the technique enjoys among those who use psychodiagnostic information (see the next section, “Obvious validity”). A significant role in criterion contamination is also played by expectations and opinions that the experts have already formed, for one reason or another, about the general abilities and the educational and professional success of the person tested.

Criterion contamination can play a negative role in any area of psychodiagnostic research. The existence of this phenomenon underscores the need to comply with ethical norms in psychodiagnostic studies. When criterion checks rely on subjective assessments of the qualities of persons in the validation sample, a mandatory rule must be observed: the results of the psychological examination must be unknown not only to the experts but also to the psychologist who collects the expert information.

2.2.3. construct validity

One of the leading types of validity is construct validity, which reflects the degree to which a certain theoretical construct is represented in the test results. A construct is understood as a certain synthetic factor of personality, behavior, abilities, and the like. Examples of psychological constructs are practical or verbal intelligence, emotional instability, introversion, speech comprehension, and instability of attention. In other words, in characterizing construct validity one determines which region of the theoretical structure of psychological phenomena the test measures.

Since the manifestations of constructs in behavior and activity are diverse and are not necessarily tied to one particular quality, establishing construct validity is a broader task, and the means of solving it are not as concretely defined as for criterion validity and content validity. To explain the connection between the test results and the theoretical construct, it is usually necessary to accumulate information about the nature of the test results gradually. This draws on a wide range of data describing the psychological essence and dynamics of the measured property, and its similarities to and differences from other psychological phenomena.

Among the specific methods of characterizing construct validity, the first to mention is comparison of the test under examination with other methods whose construct content is known. A correlation between the new test and a similar established test is regarded as an indication that the test being developed measures roughly the same sphere of behavior, abilities, or psychological qualities as the reference method. This validation procedure resembles the determination of criterion validity, if we assume that the reference test acts as the independent criterion.

Factor analysis is directly related to the determination of construct validity. It makes it possible to identify statistically the structure of relationships between the test scores and other known or as yet unexplained factors, and to find the factors common and specific to the group of tests being compared and the degree to which they are present in the results (that is, to establish the factor composition and factor loadings of the test results). The exceptional importance of this procedure gives grounds for singling it out as a special kind of construct validity: factor validity.

An important aspect of construct validity is internal consistency, which shows how far the individual items (tasks, questions) that make up the test material are subordinate to the main focus of the test as a whole, that is, directed at diagnosing the same construct. Internal consistency is analyzed by correlating the answers to each item with the overall test result. Note that the criterion of internal consistency indicates only how closely the whole content of the method is related to some unknown construct; it gives no information about the nature of the feature being measured.
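The item-by-total correlation described here can be sketched in a few lines. The example assumes a small, invented matrix of right/wrong answers and computes the plain (uncorrected) Pearson correlation of each item with the total score; the function names and the data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a sequence has no variance")
    return cov / sqrt(var_x * var_y)


def item_total_correlations(responses):
    """Correlate every item with the total score; one row per subject."""
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [pearson([row[i] for row in responses], totals) for i in range(n_items)]


# Hypothetical answers of six subjects to four items (1 = correct); assumed data.
responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]

correlations = item_total_correlations(responses)
print([round(r, 2) for r in correlations])
```

In these invented data the first three items correlate strongly with the total score, while the fourth correlates weakly, which would mark it as a candidate for revision. As the text notes, such a result says nothing about which construct the consistent items actually measure.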

The body of information about the construct validity of a method includes data that traditionally belong to criterion validity and content validity. Thus, the criteria used in validation carry information that is significant for revealing the behavior, qualities, and abilities represented in the test as a construct. The connection with practical forms of activity and the probability of predicting real behavior are also extremely important for characterizing construct validity. However, construct validity occupies a qualitatively higher level in the evaluation of a test because it characterizes the measured area of behavior in broad psychological categories. Thanks to construct validity data, we can interpret test results and their variance and make a diagnosis from a scientifically sound position, placing the measured quality within the system of psychological categories.

The urgent need for an in-depth analysis of the psychological construct can be illustrated by two popular questionnaires: J. Taylor's Manifest Anxiety Scale (MAS) and H. Eysenck's personality questionnaire (EPI). Correlational studies show that the MAS correlates positively with the “neuroticism” scale of the EPI and negatively with its extraversion scale. From the standpoint of Eysenck's conception, these data can be interpreted as evidence of the low validity of the MAS: “anxiety” correlates not only with the relevant factor of neuroticism but also with the irrelevant factor of introversion. On this reading the MAS is simply insensitive to a particular kind of “neuroticism”, the anxiety of extraverts; statements in which the anxiety of extraverts would show itself may have been left out of the scale. However, given the theoretical meaning that K. Spence and J. Taylor attribute to MAS scores, this situation is quite natural and is by no means an artifact. According to K. Spence, who tried to extend Hull's theory of learning to human behavior, the MAS measures the overall level of drive, a nonspecific drive that reaches its maximum when neuroticism (specific activation, in Eysenck's terms) and introversion (nonspecific activation) coincide. So, as we can see, the name of a test does not always fully reflect the theoretical status of the construct it measures. This example particularly underscores the role that the psychological theory underlying a method plays in revealing the content of the indicators diagnosed with its help.

One of the typical methods of analyzing the construct validity of tests of general abilities is to characterize the age differentiation of test results, which reflects how the characteristics measured by the test change with the age of the subjects. Construct validity is analyzed by determining how well the test results correspond to the theoretically expected and practically observed age-related changes in the construct under study. If the subjects in the validation sample show a progressive increase in scores whose parameters approach the known rate of development of the intellectual property measured by the test, this confirms to a certain extent that the test is aimed at studying that property.

Determining construct validity by the method of age differentiation has become most important in validating tests designed to measure psychological properties and functions that change relatively quickly under the influence of individual experience and show a marked hierarchy of developmental stages (knowledge, skills, intellectual operations, etc.). For this reason the method of age differentiation becomes almost the main evaluative criterion in validating intelligence tests, especially those intended for children. In such methods, a regular increase in test performance in each successive age group is the basic psychometric principle of diagnosis and the basis for constructing the psychometric scale. The degree of accuracy with which the age stages of development of the constructs are determined links the method of age differentiation with the determination of diagnostic validity. Analysis of age differentiation is also essential for methods used in clinical psychodiagnostics for the differential diagnosis of age-related and pathological changes, and for the examination of elderly people.
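A basic check of age differentiation is whether the mean test score rises in each successive age group. The sketch below shows that check under assumed data: the ages, the scores, and the function `age_differentiation` are hypothetical, and a real analysis would also test whether the differences between groups are statistically reliable.

```python
from statistics import mean


def age_differentiation(scores_by_age):
    """Return (age, mean score) pairs and whether the means rise with age."""
    ages = sorted(scores_by_age)
    means = [mean(scores_by_age[age]) for age in ages]
    rising = all(later > earlier for earlier, later in zip(means, means[1:]))
    return list(zip(ages, means)), rising


# Hypothetical raw scores for three age groups (assumed data).
scores_by_age = {
    7: [10, 12, 11],
    8: [13, 15, 14],
    9: [17, 16, 18],
}

profile, rising = age_differentiation(scores_by_age)
print(profile, rising)
```

Here the group means rise from one age group to the next, which is the pattern an intelligence test for children is expected to show.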

In some cases the criterion of age differentiation reflects a connection with the predictive validity of the method. High values on the criterion of age differentiation, which indicate a stable change in the properties studied on passing from one age group to the next, increase the accuracy of individual prediction.

It should be borne in mind, however, that criterion validation generally requires a high correlation between the test scores and the validation criterion, whereas in the analysis of construct validity a strong relationship between the results of the test being checked and the reference test is not at all obligatory. If the new and reference tests are practically identical in composition and construct content, and the new test is not more compact and economical, this will indicate only duplication of the method, which is justified only by the need to create parallel forms or to modify the test. The essence of construct validation lies in establishing both the similarity and the difference of the set of psychological phenomena studied in comparison with the reference test.

In analyzing the construct validity of a method, a number of hypotheses are formulated about how the test being developed will correlate with a wide range of other methods aimed at measuring constructs that have a theoretically known or assumed correlation with the properties measured by the test being checked. Account is taken of the results not only of tests that have

To be continued...
