Research - Validity of Measurements 6

measurement validity

extent to which an instrument measures what it is intended to measure - places emphasis on objectives of a test and ability to make inferences from test scores or measurements

examples of valid instruments

goniometer measures joint angles, ruler for length

validity addresses

discrimination, evaluation and predictions

validity implies

measurement is relatively free from error - a valid test is also reliable

a test can be reliable

but not valid, e.g. measuring leg length to determine level of back pain

Measurement validity

Can inferences be made from test scores and measurements? What can be done with the test results? A valid test is also reliable, and an unreliable test cannot be valid; a reliable test, however, is not necessarily valid.

Specificity of validity.

Validity is only established within the context of a specific population. You also have to show that the measurements being taken are reliable (intra-rater and inter-rater) and that the instrument has test-retest reliability; if not, the validity of the outcome decreases.

4 different types of measurement validity

face, content, criterion related, construct

Face validity

Indicates that an instrument appears to test what it is supposed to. The weakest form of measurement validity. What you see is what you believe, i.e. the instrument is taken at face value. For face validity the concept must be well defined; it is usually established by direct observation.

Content validity

Indicates that the items that make up an instrument adequately sample the universe of content that defines the variables being measured. Most useful with questionnaires, examinations, interviews and inventories. The test should not be influenced by outside or irrelevant factors, and everything in it should reflect what you are trying to measure. This is considered subjective. It is usually established by experts in the field. There is no statistical method associated with it.

Criterion-related validity

Most practical and objective approach, based on the ability of one test to predict results obtained on an external criterion. The test to be validated (the target test) is compared with a gold standard or criterion measure that is already established and assumed to be valid. The gold standard has to be established before you can do this. Both tests must measure the same thing; you then correlate the two sets of scores.
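
A minimal sketch of the "correlate the two sets of scores" step, assuming a Pearson correlation is the chosen statistic (the notes do not name one). The function name `pearson_r` and the knee flexion scores are hypothetical, invented only for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either list has no variation
    return sxy / sqrt(sxx * syy)

# Hypothetical data: knee flexion in degrees for the same 6 patients
target_scores = [118, 125, 131, 140, 122, 135]  # target test being validated
gold_scores = [120, 124, 133, 141, 121, 137]    # gold standard (criterion)

r = pearson_r(target_scores, gold_scores)
print(f"r = {r:.3f}")
```

A correlation close to 1 would support criterion-related validity of the target test for this sample; how high is "high enough" is a judgment the notes do not specify.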

Concurrent validity

Establishes validity when two measures are taken at relatively the same time. Most often used when the target test is considered more efficient than the gold standard and, therefore, can be used instead of the gold standard. Concurrent = able to give both tests at the same time, correlate the information and see whether it agrees. E.g., a new motion analysis system replaced an older version that was used beforehand. They ran the program twice to see whether both gave the same gait characteristics.

Predictive validity

Establishes that the outcome of the target test can be used to predict a future criterion score or outcome. You give the target test, take your measurement, wait some time, then give the criterion test and see how the 2 scores compare. The target test should be able to predict the result of the criterion test.

Construct validity

Establishes the ability of an instrument to measure an abstract construct and the degree to which the instrument reflects the theoretical components of the construct. Constructs are abstract in nature; you can't observe them or measure them directly. They are multidimensional. Therefore, a test that measures a construct typically measures one dimension; capturing all the dimensions in one test would be impossible. E.g., there are different dimensions of strength because there are different types of contraction (isometric and isotonic), so a test would only measure strength in one dimension. The FIM defines functional status with respect to the amount of assistance given. Similarly, we could define pain with regard to intensity.

test of ROM, length, strength, tactile discrimination, sensation, gait and balance

face validity

separate items on a functional status scale - eating, dressing and transferring

face validity

face validity

is subjective and scientifically weak - based on the opinion of the investigator

examples of content validity

instructor preparing a final exam, deciding what range of activities represents function, McGill Pain Questionnaire

the criterion measure

is known or assumed to be a valid indicator of the variable of interest

the criterion and the target test must

measure the same thing

construct validation

is often necessary to validate the measurement of more abstract variables

two components of criterion related validity

concurrent and predictive

predictive validity example

Berg Balance Scale or TUG as a predictor of fall risk, using SAT or GPA to predict future academic success

construct validity is partly based on

content validity

methods of construct validation

known groups method, convergence and discrimination, factor analysis, hypothesis testing, criterion validation

Known groups method

A criterion is chosen that can identify the presence or absence of a particular characteristic. Validity of the test is supported if the results document known differences. E.g., the stepping in place test as an indicator of falls in the elderly: people with high scores would fall into the non-fall-risk category and people with low scores would fall into the fall-risk category.
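
A minimal sketch of a known groups comparison, assuming the two groups are compared with Welch's t statistic (the notes do not say which statistic to use). The function name `welch_t` and the stepping in place scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    # Sample variances (n - 1 denominator); each group needs at least 2 scores
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical stepping in place scores (steps completed) for two known groups
non_fallers = [42, 45, 39, 47, 44, 41]
fallers = [28, 31, 25, 33, 30, 27]

t = welch_t(non_fallers, fallers)
print(f"mean non-fallers = {mean(non_fallers):.1f}")
print(f"mean fallers = {mean(fallers):.1f}")
print(f"Welch t = {t:.2f}")
```

A large positive t here would mean the test separates the groups in the expected direction, which is the kind of evidence the known groups method looks for.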


Convergence

When 2 tests measure the same thing, they should give similar results, i.e. a high correlation. E.g., the TUG test and another test that theoretically measures the same thing should be highly correlated. It must also be shown that the construct can be differentiated from other constructs.


Discrimination

If you take the scores of 2 tests that do not measure the same thing, the correlation should be low or absent, because the tests are theoretically different.
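
A minimal sketch contrasting convergence and discrimination, assuming Pearson correlations are used. All scores are hypothetical: `tug` stands for TUG times, `similar_test` for an unnamed second test of the same construct, and `unrelated_test` for a test of a theoretically different construct.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for the same 6 people
tug = [8, 10, 12, 14, 16, 18]            # TUG time in seconds
similar_test = [9, 10, 13, 15, 16, 19]   # second test of the same construct
unrelated_test = [5, 9, 4, 8, 6, 7]      # test of a different construct

r_convergent = pearson_r(tug, similar_test)
r_discriminant = pearson_r(tug, unrelated_test)
print(f"convergent r = {r_convergent:.2f}")      # expected to be high
print(f"discriminant r = {r_discriminant:.2f}")  # expected to be low
```

Construct validity is supported when the first correlation is high and the second is low; only one of the two is not enough.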

Factor analysis

Looks at the different components of a test and groups together the items that are most highly correlated. You can take all the factors in a test and identify which are the most important.

Hypothesis testing

Validity can be assessed by using the instrument to test specific hypotheses that support the theory, e.g. the FIM.

Criterion validation

Compare the scores of one test with another test's scores, as long as they represent the same component of the construct. (This is more about comparing scores, whereas convergence is about correlating scores.)