Nature of Reliability
1. Reliability refers to the results obtained with an evaluation instrument and not to the instrument itself.
2. Reliability refers to a type of consistency.
3. Reliability is a necessary but not a sufficient condition for validity.
4. Reliability merely provides the consistency that makes validity possible.

Interpreting Reliability
1. Group variability affects the size of the reliability coefficient: heterogeneous groups yield higher coefficients than homogeneous groups.
2. Scoring reliability limits test reliability.
3. All other factors being equal, the more items included in a test, the higher the test's reliability.
4. Reliability tends to decrease as tests become too easy or too difficult.

Factors Influencing Reliability
1. Length of Test
2. Spread of Scores
3. Difficulty of Test
4. Objectivity

Methods of Estimating Reliability
Test-retest Method (measure of stability): Give the same test twice to the same group, with any time interval between tests, from several minutes to several years.
Equivalent-forms Method (measure of equivalence): Give two forms of the test to the same group in close succession.
Split-half Method (measure of internal consistency): Give the test once. Score two equivalent halves of the test, then apply the Spearman-Brown formula.
Kuder-Richardson Method (measure of internal consistency): Give the test once. Score the whole test and apply the Kuder-Richardson formula.

Nature of Validity
1. Validity refers to the appropriateness of the interpretation of the results of a test or evaluation instrument for a given group of individuals, and not to the instrument itself.
2. Validity is a matter of degree; it does not exist on an all-or-none basis.
3. Validity is always specific to some particular use or interpretation.
4. Validity is a unitary concept based on various kinds of evidence.

Checking For Validity
Content Validity: Examine the test to see whether the questions correspond to what the user intended to test.
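The split-half and Kuder-Richardson procedures described under Methods of Estimating Reliability can be illustrated numerically. Below is a minimal sketch in Python; the 0/1 item-response matrix (rows are examinees, columns are items, 1 = correct) is invented purely for demonstration and is not from any real test.

```python
# Sketch: split-half reliability with the Spearman-Brown correction,
# and Kuder-Richardson formula 20 (KR-20), on invented 0/1 item data.

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

responses = [          # rows = examinees, columns = 8 items (invented data)
    [1, 1, 1, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0, 1, 1],
]

# Split-half: score the odd- and even-numbered items as two half-tests,
# then correlate the halves.
odd = [sum(row[0::2]) for row in responses]
even = [sum(row[1::2]) for row in responses]
r_half = pearson(odd, even)

# Spearman-Brown steps the half-test correlation up to full test length:
# r_full = 2r / (1 + r).
r_full = 2 * r_half / (1 + r_half)

# KR-20: r = (k / (k - 1)) * (1 - sum(p*q) / var_total),
# where p = proportion passing each item and q = 1 - p.
k = len(responses[0])
n = len(responses)
totals = [sum(row) for row in responses]
mean_t = sum(totals) / n
var_t = sum((t - mean_t) ** 2 for t in totals) / n
pq = 0.0
for j in range(k):
    p = sum(row[j] for row in responses) / n
    pq += p * (1 - p)
kr20 = (k / (k - 1)) * (1 - pq / var_t)

print(f"split-half r = {r_half:.3f}, Spearman-Brown corrected = {r_full:.3f}")
print(f"KR-20 = {kr20:.3f}")
```

Note how the Spearman-Brown correction raises the half-test correlation, consistent with the point above that longer tests tend to be more reliable.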
Criterion-Related Validity: Scores from a test are correlated with an external criterion.
Concurrent Criterion-Related Validity: Two tests are given to the same group of examinees, one being the established test (the criterion) and the other the new test.
Predictive Validity: The test predicts some future behavior of an examinee.

Content-Related Validity
Classroom Instruction: Determines which intended learning outcomes are to be achieved by pupils.
Achievement Domain: Specifies and delimits a set of instructionally relevant learning tasks to be measured by a test.
Achievement Test: Provides a set of relevant test items designed to measure a representative sample of the tasks in the achievement domain.

Content Validity Example
Instructional Objective: The student will write the capitals of all fifty states in the United States of America.
Instructional Activities:
1. The student makes "flash cards" with the state's name on one side and the state's capital on the other side. The student will study these at home.
2. Given a map of the United States, the student will write the name of the capital in the appropriate state.
Test: The student will write the state capital's name on a sheet of paper when the teacher reads the state's name.

Concurrent Validity Example
Reading Achievement: A multiple-choice test of reading achievement (designed to measure achievement at the time of testing, not to predict any future behavior) might be validated by comparing scores on the test with teacher ratings of students' reading abilities. The teachers' ratings are the criterion.

Factors Influencing Validity
1. The Test Itself
   a. Unclear directions.
   b. Reading vocabulary and sentence structure too difficult.
   c. Inappropriate level of difficulty of the test items.
   d. Poorly constructed test items.
   e. Ambiguity.
   f. Test items inappropriate for the outcomes being measured.
   g. Inadequate time limits.
   h. Test too short.
   i. Improper arrangement of items.
   j. Identifiable pattern of answers.
2. Factors in Test Administration and Scoring
3. Factors in Pupils' Responses
4. Environment

Reliability & Validity Multiple-Choice
1. The correlation between test scores and a criterion is a measure of:
   a. causation  b. objectivity  c. reliability  d. validity  e. variability
2. The biggest obstacle to determining a test's predictive validity is:
   a. devising tests with ingenious and well-constructed items  b. administering the test under uniform conditions  c. obtaining a sufficiently large sample of items  d. obtaining a really adequate criterion measure
3. For which of the following tests would one be most exclusively interested in predictive validity?
   a. a biographical data bank being used in picking airplane pilots  b. a measure of attitudes towards Communism  c. a diagnostic test of reading comprehension  d. an introversion-extroversion questionnaire
4. The type of validity that is most appropriate for aptitude tests is:
   a. content validity  b. predictive validity  c. concurrent validity  d. face validity
5. You have devised a new measure called the PITSS and correlate it with an existing procrastination inventory. This is an example of:
   a. content validity  b. predictive validity  c. concurrent validity  d. construct validity
6. On which of the following tests is content validity most appropriate?
   a. The Alpha Aptitude Battery  b. The Beta Achievement Test  c. The Gamma Personality Inventory  d. The Delta Intelligence Test  e. The Epsilon Test of Creative Ability
7. Which type of validity coefficient is most appropriately used for selection purposes?
   a. predictive  b. concurrent  c. construct  d. content
8. Decreasing the time interval between predictor and criterion measures:
   a. increases the validity coefficient  b. decreases the validity coefficient  c. has no effect on validity  d. none of the above
9. Which type of validity coefficient is most important for personality tests?
   a. predictive validity  b. concurrent validity  c. construct validity  d. content validity
10. Comparing test items with objectives refers to which type of validity?
   a. content  b. predictive  c. concurrent  d. construct
11. Requires a time interval for its determination:
   a. content validity  b. predictive validity  c. concurrent validity  d. construct validity
12. The number of items on the predictor is cut in half. This:
   a. increases the validity coefficient  b. decreases the validity coefficient  c. has no effect on the validity coefficient  d. cannot occur
13. The content validity of a teacher-constructed achievement test is:
   a. high if the teacher has matched items to objectives  b. usually unacceptable due to lack of expert input  c. generally low despite the teacher's knowledge of his class  d. about equivalent to that of similar standardized tests
14. Comparing a newly formed anxiety scale with an existing anxiety scale yields this type of validity coefficient:
   a. content validity  b. predictive validity  c. concurrent validity  d. construct validity
15. The new IQ test you have devised is administered to a gifted class. Its results are then correlated with end-of-year grades. Compared with the correlation that would be obtained if it were correlated with grades from regular-class students, this correlation would be:
   a. lower  b. higher  c. curvilinear  d. about the same
16. To build reliability into a test, it is desirable to:
   a. write items of high difficulty level  b. write items of various difficulty levels  c. offer the poorer students rewards to heighten their attention  d. write items in different areas of interest
17. For speeded tests, the split-halves procedure of determining reliability will usually yield estimates that are:
   a. impossible to interpret  b. statistically unstable  c. quite accurate  d. too high
18. Instead of giving a test to a single grade level, it is administered to the whole school. The reliability will:
   a. increase  b. decrease  c. be unaffected  d. be very unpredictable
19. This reliability coefficient is usually greater over a short time interval than over a long one:
   a. test-retest  b. alternate forms  c. split-halves  d. Kuder-Richardson
20. Which of the following is not a method of building reliability into a test?
   a. adding items of good quality  b. administering the test to a heterogeneous group  c. comparing the test with existing measures  d. controlling the conditions of test administration
21. A teacher has just computed the reliability of a test she has made after a single administration. What kind of reliability did she compute?
   a. test-retest  b. inter-rater  c. internal consistency  d. alternate forms
22. Administering a test in the morning, rather than the afternoon, will cause the reliability of the test to:
   a. increase  b. decrease  c. be questionable  d. vary unpredictably
23. Erroneously adding five points to each score on a test will cause the reliability coefficient to:
   a. increase  b. decrease  c. remain the same  d. vary unpredictably
24. You administer the Quick and Dirty Personality Test on January 1, 1984, and March 1, 1984, to the same group of subjects and correlate the results. This gives you an estimate of:
   a. test-retest reliability  b. alternate forms reliability  c. predictive validity  d. concurrent validity
25. Involves the administration of two different tests at two different times:
   a. test-retest  b. alternate forms  c. split-half  d. Kuder-Richardson