Due Day 7: 5/20/13: Reliability and Validity Worksheet
Use this version of the matrix because it has been updated. You will receive a .5 deduction for using the older version.
PUT YOUR NAME HERE: _____________
Do NOT quote the text; explain these ideas in your own words.
University of Phoenix Material
Validity and Reliability Matrix
For each of the tests of reliability and validity listed on the matrix, prepare a 50-100-word description of the type of reliability/validity, its purpose, and the conditions under which it would be used as well as when it would be inappropriate. Then, prepare a 50-100-word description of each test’s strengths and a 50-100-word description of each test’s weaknesses.
|TEST of |Description, Purpose, Application and |Strengths |Weaknesses |
|Reliability |Appropriateness | | |
|Inter-item | Inter-item consistency is the correlation among all | Test score calculations are more| Inter-item consistency cannot |
|Consistency |items on a scale calculated from one trial of a |accurate and clear when there is |measure intelligence or personality.|
| |test. It is used in assessing how consistently |a lot of consistency. Inter-item |If the items are not homogeneous with|
| |the items on a test measure the same |consistency is great at measuring|the same difficulty and length, it |
| |phenomenon. When asking questions to research an |if a test is reliable and |would be ineffective at determining |
| |idea this test can assess the response of the |consistent based on the length or|internal consistency. Even the |
| |test-taker against the idea. Different questions |shortness of a test. The |Spearman-Brown formula would fail. |
| |that test the same idea give consistent results. |inter-item consistency test can |Inter-item consistency works best on|
| |This is appropriate, for example, in testing |show reliability over a period of|whole tests that are long rather |
| |employee performance at different times over a |time. On the flip side, errors |than on half-tests or short |
| |period of time. Employers can use this to determine|among items can be broken down |tests. |
| |if an employee is eligible for a raise or |and new ones can be added to | |
| |promotion. |reach a reliability measurement. | |
|Split-half | Split-half reliability randomly divides all items | Split-half reliability has its | It is not wise to divide a test in |
| |that are meant to measure the same idea into two sets. |strength in being efficient and |half straight down the middle |
| |When it is difficult to measure reliability with |less tedious for test-takers than|because the content and difficulty |
| |two tests or to perform a test two times, split-half |the parallel form. It measures |of questions will not be distributed|
| |reliability is suitable. It is appropriate when |internal consistency well. It |evenly. Many intermediary variables |
| |uneven random assignment splits need to be |also can check middle variables |are created such as fatigue during |
| |measured. It also can be used to create a small |that may cause an error in the |the second half of the test. |
| |parallel form of the same test. |analysis since both portions |Deviations in difficulty and |
| | |of the test are taken at one |subjects of the items on the first |
| | |time. |part of the test compared to the |
| | | |second part. |
|Test/retest | Test-Retest reliability is about taking the same | Test-retest is strong in | Test-retest reliability is weak in |
| |test with the same people at two different times |reliability because the results |that the roots of an idea being |
| |to measure how stable an idea is over time. If an |measure an individual’s reaction |tested can alter over time. It would|
| |idea being measured is supposed to change over a |time and perceived judgment. Such|produce sensitive results that make |
| |period, then the scores would vary. It is |traits are stable and do not |the score of reliability appear |
| |inappropriate when measuring, for example, computer |change a lot over time and are |lower than the actual measurement. |
| |skills of college students. If a series of lessons |not sensitive to many intervening|For example, a college student may |
| |about computers came between the first and second |variables. |have excellent skills when assessed |
| |test, then the test would show variance because of | |on using an HP computer but when |
| |the education provided to all test-takers. | |assessed on a Mac they could fail, or|
| | | |when assessed on a computer from 15 |
| | | |years ago, they could falter. |
|Parallel and | Parallel and alternate forms that test reliability| It helps in determining what | Parallel and alternate forms are |
|alternate forms |use many versions of the same test items at two |questions are best to ask. It |very time-consuming, cost a lot of |
| |separate times with the same test-takers. It is |measures the central idea through |money and bring fatigue for the |
| |appropriate in measuring traits that are stable |different variations on the same |test-taker because of the many |
| |over a long period of time and not effective when |test item. The reliability of a |variations of the same test questions |
| |measuring limited emotions or anxiety levels. |test increases when similar |over and over. These forms are not |
| |Parallel forms can be done with another form such |scores are on the same question |dependable to measure an idea that |
| |as split-half. |on many tests. |can alter over time. The tests can |
| | | |be taken months or even years apart |
| | | |causing intervening variables to |
| | | |impact the scores creating error |
| | | |variance. |
|Test of Validity |Description, Application and Appropriateness |Strengths |Weaknesses |
|Face validity | Face validity describes the particular view of a | Face validity’s strength is that| A weakness for face validity is its|
| |test-taker on the test’s validity. The measurement |a test taker has confidence in |inability to measure validity. A |
| |is not about the quantity of the actual validity |the validity of the test and is |test may look like it’s valid but |
| |but the test-taker’s perception of the test’s |more comfortable taking the test |not possess good ideas, long enough |
| |validity. It is appropriate when measuring the |or passing out the test to be |time, or be taken in a good |
| |confidence of a test-taker. The test appears to measure what it is |taken. Otherwise, the test would |environment. |
| |supposed to measure. |be invalid. | |
|Content validity | Content validity is useful to test designers who | Strength for content validity | A pitfall for content validity is |
| |need to create test questions that match the |lies in that it can work in |that potentially new material is prey to |
| |material being tested. It is appropriate for |reverse from job responsibilities|cultural and linear changes. The |
| |college professors on a final exam. It is |to what is required for the job. |questions can have different answers|
| |ineffective for a test designer who wants new |First, the questions must cover |in different parts of the world at |
| |people to have the same strengths as current |what needs to be performed in the |different times. The items on the |
| |employees. |duties of the job, then a process|test have to be accurate all the way|
| | |evaluates what an employee |around. |
| | |contributes to a position. | |
|Criterion-related | This method, criterion-related validity, is very | A positive for criterion-related| A negative about criterion-related |
| |strong in confirming validity. It is used to verify|validity is that it can validate a |validity is that it can |
| |criteria on a test and represent what is really in |test score, using methods outside|contaminate the results. In the same|
| |the trial of test-takers who are tested. A group of|of the test to prove that the |way it can measure and diagnose a |
| |people, who have lost everything they owned in a |information on the test covers |mental disorder like |
| |natural disaster like a tornado, may all be |the subject matter that is |schizophrenia, a panel of |
| |diagnosed as depressed. If they all are tested |supposed to be covered. It is |psychiatrists would use the test |
| |using new questions and all score high for |more objective and verifiable |criterion and validity to measure. |
| |depression, then the test has proven validity. |than the previous methods and is | |
| | |a favorite. | |
|Construct | Other, smaller types of validity fall under | A strength for construct | A weakness for construct validity |
| |construct validity. This is appropriate when a test|validity is that the steps used to |arises when there is no single idea or it is |
| |needs to measure an idea like intelligence or |verify an idea follow a |too vague. The results of the test |
| |anxiety. It is ineffective when an idea is not |particular scientific method. |will not be able to be measured |
| |clear or covers too broad a spectrum. |First a hypothesis is created, |accurately. The validity of the test|
| | |then a prediction is made and |on the idea will have no substance |
| | |then the results are measured. |or definition. |
| | |The predictions are based on | |
| | |facts and the test is used to see| |
| | |if the prediction is true. If it | |
| | |is not true then the test | |
| | |questions or idea may have to be | |
| | |reviewed. | |
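
The split-half row and the Spearman-Brown formula mentioned in the matrix can be illustrated with a short calculation. What follows is only a minimal sketch in Python and is not part of the worksheet: the function names and the small score table are hypothetical, and an odd-even split is assumed as one way to avoid dividing the test straight down the middle. The half-test correlation is computed first, then adjusted with Spearman-Brown to estimate the reliability of the full-length test.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def spearman_brown(r, factor=2.0):
    """Estimate the reliability of a test `factor` times as long as the one that gave r."""
    return factor * r / (1 + (factor - 1) * r)


def split_half_reliability(scores):
    """scores: one row per test-taker, one numeric entry per item."""
    odd_totals = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical data: 5 test-takers, 4 items, 1 = correct, 0 = incorrect.
example_scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

r_half, r_full = split_half_reliability(example_scores)
print(f"half-test correlation: {r_half:.2f}")
print(f"Spearman-Brown estimate for the whole test: {r_full:.2f}")
```

The same pearson_r helper could also serve as a rough test-retest check: pass in each person's total score from the first sitting and from the second sitting, and the correlation between the two lists is the stability estimate.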