regarding how the score of the instrument should relate to other concepts” (Krysik & Finn, 2013, p.263). In addition, construct validity has two components: convergent and discriminant validity. Convergent validity means that “measures of constructs that theoretically should be related to each other are, in fact, observed to be related to each other. That is, you should be able to show a correspondence or convergence between similar constructs.” Discriminant validity means that “measures of constructs that theoretically should not be related to each other are, in fact, observed to not be related to each other. That is, you should be able to discriminate between dissimilar constructs” (Trochim, 2006, p.1).
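As a rough numerical illustration of convergence, the short Python sketch below computes a Pearson correlation coefficient between scores from two hypothetical instruments that should measure related constructs (the scores and names are invented for illustration only):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical scores from two instruments that should measure related constructs
scale_a = [10, 12, 14, 18, 20]
scale_b = [11, 13, 15, 17, 21]
print(round(pearson_r(scale_a, scale_b), 2))  # → 0.98, near 1: evidence of convergence
```

A coefficient this close to 1 would support convergent validity, while a coefficient near 0 between the instrument and a theoretically unrelated variable would support discriminant validity.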
Convergent validity relies on correlation, a statistical test that measures the strength of the relationship between two sets of numbers. “Researchers use correlation as a means to test convergent validity and the correlation coefficient the measure of the strength of the correlation ranges from 0 to 1. The number 1 represents perfect correlation, and 0 represents no relationship. The closer the correlation coefficient is to 1, the stronger the relationship, and the greater the evidence of convergent reliability” (Krysik & Finn, 2013, p.263). Discriminant validity, in contrast, looks at how the measure should not correlate with variables that have been hypothesized to be unrelated; for example, “researchers may assume that self-esteem should be weakly correlated with variables such as age, gender, geographic location, and ethnicity” (Krysik & Finn, 2013, p.263). In short, one type of validity examines whether related constructs converge, while the other examines whether dissimilar constructs remain distinct.
Pros and Cons of Measurement Validity
“Face validity is an important consideration in evaluating how a particular culture will respond to a specific measure.
For instance, the concept of elder abuse can differ substantially from one culture to another. Items that measure elder abuse in one culture may be interpreted differently to another” (Krysik & Finn, 2013, p.262). As a result, cultural differences could affect the outcomes, such as how respondents understand the questions and their willingness to cooperate. Face validity is easy to apply, and because the assessment can be done in a simple manner, it saves time. However, some researchers do not consider face validity a true form of validity “because it is concerned only with whether the instrument appears to measure what it purports to measure. Face validity cannot determine whether the instrument is actually measuring the concept” (Krysik & Finn, 2013, p.262). Consequently, because it cannot be determined whether the instrument is measuring the concept, the results may be misleading, since outside factors might be the cause. “Construct validity defines how well a test or experiment measures up to its claims. It refers to whether the operational definition of a variable actually reflect the true theoretical meaning of a concept” (Shuttleworth, 2009, p.1). The advantage of construct validity is that if the pattern of correlations matches the hypothesized relationships, the instrument has construct validity; if the pattern deviates, the instrument is measuring something other than the intended concept. However, the disadvantages include threats such as “hypothesis guessing, evaluation apprehension, and researcher expectancies and bias” (Shuttleworth, 2009, p.1). Hypothesis guessing occurs when a person guesses what the test might be measuring, which can alter his or her behavior or responses. Evaluation apprehension occurs when a person feels under pressure, which could improve or distort a response, and researcher bias “can lower construct validity by clouding the effect of the actual research variable” (Shuttleworth, 2009, p.1). However, all research faces such threats, and using multiple methods, such as observation, questionnaires, and other types of testing, helps minimize the risks.
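The pattern-matching idea above, that an instrument shows construct validity when observed correlations line up with hypothesized relationships, can be sketched in Python as follows (the correlation values, thresholds, and measure names are invented for illustration only):

```python
# Hypothetical correlations between a new self-esteem scale and other measures.
# All values and thresholds below are invented for illustration.
convergent = {"established self-esteem scale": 0.82}        # hypothesized related
discriminant = {"age": 0.08, "geographic location": -0.05}  # hypothesized unrelated

CONVERGENT_MIN = 0.50    # related measures should correlate at least this strongly
DISCRIMINANT_MAX = 0.30  # unrelated measures should correlate no more than this

def pattern_matches(convergent, discriminant):
    """True when the observed correlations fit the hypothesized relationships."""
    strong = all(r >= CONVERGENT_MIN for r in convergent.values())
    weak = all(abs(r) <= DISCRIMINANT_MAX for r in discriminant.values())
    return strong and weak

print(pattern_matches(convergent, discriminant))  # True: evidence of construct validity
```

If the pattern deviated, say, the new scale correlated strongly with age but weakly with the established scale, the check would return False, suggesting the instrument measures something other than the intended concept.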
Two Ways to Determine Measurement Reliability
“Reliability refers to how consistent the results are each time a measure is administered. In contrast to validity, reliability is concerned with the introduction of random error into a research study. That is, the only time the outcome on a measure should fluctuate is when some real change has occurred in the target concept” (Krysik & Finn, 2013, p.264). There are five basic tests researchers can use to determine reliability: interrater reliability, test-retest reliability, parallel forms reliability, split-half reliability, and internal consistency. The two types that will be defined and evaluated here are test-retest reliability and internal consistency. Test-retest reliability can be a useful assessment for a survey instrument. “Test-retest reliability provides information on how consistent a measure is when it is administered twice in a relatively short time frame. The appropriate length for that time frame will vary with both the instrument and the target population” (Krysik & Finn, 2013, p.266). In addition, the interval needs to be long enough that subjects cannot recall their answers, yet short enough that no real change occurs in the variable being measured. If the results between the two tests are similar, the instrument has good test-retest reliability. Similar to interrater reliability, “researchers can calculate a correlation coefficient using the two set of scores to represent the degree of consistency between the two administrations” (Krysik & Finn, 2013, p.266). Furthermore, using test-retest reliability might be a useful measure for doing research for the