What type of reliability is obtained when the same test is given twice to the same group of examinees?

Nature of Reliability

1.     Reliability refers to the results obtained with an evaluation instrument and not to the instrument itself. 

2.     Reliability refers to a type of consistency.

3.     Reliability is a necessary but not a sufficient condition for validity. 

4.     Reliability merely provides the consistency that makes validity possible.

Interpreting Reliability

1.          Group variability affects the size of the reliability coefficient.  Heterogeneous groups yield higher coefficients than homogeneous groups.

2.          Scoring reliability limits test reliability.

3.          All other factors being equal, the more items included in a test, the higher the test's reliability (see the Spearman-Brown sketch after this list).

4.          Reliability tends to decrease as tests become too easy or too difficult.
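
As a brief sketch of why test length matters (a standard result, not reproduced in the original handout): the Spearman-Brown prophecy formula estimates the reliability of a test lengthened by a factor of $n$ from the reliability $r$ of the original test:

$$r_{\text{new}} = \frac{n\,r}{1 + (n - 1)\,r}$$

For example, doubling ($n = 2$) a test with reliability .60 gives an estimated reliability of $\frac{2(.60)}{1 + .60} = .75$, assuming the added items are of the same quality as the originals.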

Factors Influencing Reliability

1.             Length of Test

2.             Spread of Scores

3.             Difficulty of Test

4.             Objectivity


 

Methods of Estimating Reliability

Test-retest Method (Measure of Stability):  Give the same test twice to the same group, with any time interval between tests, from several minutes to several years.

Equivalent-forms Method (Measure of Equivalence):  Give two forms of the test to the same group in close succession.

Split-half Method (Measure of Internal Consistency):  Give the test once.  Score two equivalent halves of the test.  Use the Spearman-Brown formula (sketched above) to step the half-test correlation up to full-test length.

Kuder-Richardson Method (Measure of Internal Consistency):  Give the test once.  Score the test and apply a Kuder-Richardson formula (sketched below).
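
For reference, a standard form of the Kuder-Richardson formula (KR-20), not reproduced in the original handout, is

$$\mathrm{KR\text{-}20} = \frac{k}{k - 1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{\sigma^2}\right)$$

where $k$ is the number of items, $p_i$ is the proportion of examinees answering item $i$ correctly, $q_i = 1 - p_i$, and $\sigma^2$ is the variance of the total test scores.  Like the split-half estimate, it requires only a single administration of the test.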


 

Nature of Validity

1.    Validity refers to the appropriateness of the interpretation of the results of a test or evaluation instrument for a given group of individuals, and not to the instrument itself.

2.    Validity is a matter of degree; it does not exist on an all-or-none basis.

3.    Validity is always specific to some particular use or interpretation.

4.    Validity is a unitary concept based on various kinds of evidence.

Checking For Validity

Content Validity:  Examine the test to see whether the questions correspond to what the user intended to test.

Criterion-Related Validity:  Scores from the test are correlated with an external criterion.

    Concurrent Criterion-Related Validity:  Two tests are given to the same group of examinees, one being the established test (the criterion) and the other the new test.

    Predictive Validity:  The test predicts some future behavior of an examinee.

Content-Related Validity

Classroom Instruction

Determines which intended learning outcomes are to be achieved by pupils.

Achievement Domain

Specifies and delimits a set of instructionally relevant learning tasks to be measured by a test.

Achievement Test

Provides a set of relevant test items designed to measure a representative sample of the tasks in the achievement domain.


Content Validity Example

Instructional Objective:

        The student will write the capitals of all fifty states in the United States of America.

Instructional Activities:

1.        The student makes "flash cards" with the state's name on one side and the state's capital on the other side.  The student will study these at home.

2.        Given a map of the United States, the student will write the name of the capital in the appropriate state.

Test:

        The student will write the state capital's name on a sheet of paper when the teacher reads the state's name.

Concurrent Validity Example

Reading Achievement

        A multiple-choice test on reading achievement (which is designed to measure achievement at the time of testing and not designed to predict any future behavior) might be validated by comparing the scores on the test with teacher ratings of students' reading abilities.  The teachers' ratings are the criterion.
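
A minimal sketch, in Python, of how such a concurrent validity coefficient could be computed, assuming it is simply the Pearson correlation between the test scores and the criterion; the scores and ratings below are hypothetical, not taken from the handout:

import math

def pearson_r(x, y):
    # Pearson product-moment correlation between two equal-length score lists.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical reading-test scores and teacher ratings (the criterion).
test_scores = [34, 28, 41, 25, 38, 30, 45, 22]
teacher_ratings = [4, 3, 5, 2, 4, 3, 5, 2]

print(round(pearson_r(test_scores, teacher_ratings), 2))

Because the made-up scores and ratings are constructed to track each other closely, the coefficient printed here is high (about .97); real concurrent validity coefficients are typically much lower.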

Factors Influencing Validity

1.     The Test Itself.

a.    Unclear directions. 

b.    Reading vocabulary and sentence structure too difficult.

c.    Inappropriate level of difficulty of the test items.

d.    Poorly constructed test items.

e.    Ambiguity.

f.        Test items inappropriate for the outcomes being measured.

g.    Inadequate time limits.

h.     Test too short.

i.          Improper arrangement of items.

j.          Identifiable pattern of answers.

2.     Factors in Test Administration and Scoring.

3.     Factors in Pupils� Responses.

4.     Environment.

Reliability & Validity Multiple-Choice

1.     The correlation between test scores and a criterion is a measure of:

a.     causation

b.     objectivity

c.     reliability

d.     validity

e.     variability

2.     The biggest obstacle to determining a test's predictive validity is:

a.     devising tests with ingenious and well-constructed items

b.     administering the test under uniform conditions

c.      obtaining a sufficiently large sample of items

d.     obtaining a really adequate criterion measure

3.     For which of the following tests would one be most exclusively interested in predictive validity?

a.     a biographical data bank being used in picking airplane pilots

b.     a measure of attitudes towards Communism

c.      a diagnostic test of reading comprehension

d.     an introversion-extroversion questionnaire

4.     The type of validity that is most appropriate for aptitude tests is:

a.     content validity

b.     predictive validity

c.      concurrent validity

d.     face validity

5.     You have devised a new measure called the PITSS and correlate it with an existing procrastination inventory.  This is an example of:

a.     content validity

b.     predictive validity

c.      concurrent validity

d.     construct validity

6.     On which of the following tests is content validity most appropriate?

a.     The Alpha Aptitude Battery

b.     The Beta Achievement Test

c.      The Gamma Personality Inventory

d.     The Delta Intelligence Test

e.      The Epsilon Test of Creative Ability

7.     Which type of validity coefficient is most appropriately used for selection purposes?

a.     predictive

b.     concurrent

c.      construct

d.     content

8.     Decreasing the time interval between predictor and criterion measures:

a.     increases the validity coefficient

b.     decreases the validity coefficient

c.      has no effect on validity

d.     none of the above

9.     Which type of validity coefficient is most important for personality tests?

a.     predictive validity

b.     concurrent validity

c.      construct validity

d.     content validity

10.   Comparing test items with objectives refers to which type of validity?

a.     content

b.     predictive

c.      concurrent

d.     construct

11.   Requires a time interval for its determination:

a.     content validity

b.     predictive validity

c.      concurrent validity

d.     construct validity

12.   The number of items on the predictor is cut in half.  This:

a.     increases the validity coefficient

b.     decreases the validity coefficient

c.      has no effect on the validity coefficient

d.     cannot occur

13.   The content validity of a teacher-constructed achievement test is:

a.     high if the teacher has matched items to objectives

b.     usually unacceptable due to lack of expert input

c.      generally low despite the teacher's knowledge of his class

d.     about equivalent to that of similar standardized tests

14.  Comparing a newly formed anxiety scale with an existing anxiety scale yields this type of validity coefficient:

a.     content validity

b.     predictive validity

c.      concurrent validity

d.     construct validity

15.    The new IQ test you have devised is administered to a gifted class.  Its results are then correlated with end-of-year grades.  Compared with the correlation that would be obtained if it were correlated with grades from regular class students, this correlation would be:

a.     lower

b.     higher

c.      curvilinear

d.     about the same

16.   To build reliability into a test, it is desirable to:

a.     write items of high difficulty level

b.     write items of various difficulty levels

c.      offer the poorer students rewards to heighten their attention

d.     write items in different areas of interest

17.   For speeded tests, the split-halves procedure of determining reliability will usually yield estimates that are:

a.     impossible to interpret

b.     statistically unstable

c.      quite accurate

d.     too high

18.    Instead of giving a test to a single grade level, it is administered to the whole school.  The reliability will:

a.     increase

b.     decrease

c.      be unaffected

d.     be very unpredictable

19.  This reliability coefficient is usually greater over a short time interval than over a long one:

a.     test-retest

b.     alternate forms

c.      split-halves

d.     Kuder-Richardson

20.  Which of the following is not a method of building reliability into a test?

a.     adding items of good quality

b.     administering the test to a heterogeneous group

c.      comparing the test with existing measures

d.     controlling the conditions of test administration

21.  A teacher has just computed the reliability of a test she has made after a single administration.  What kind of reliability did she compute?

a.     test-retest

b.     inter-rater

c.      internal consistency

d.     alternate forms

22.  Administering a test in the morning, rather than in the afternoon, will cause the reliability of the test to:

a.     increase

b.     decrease

c.      be questionable

d.     vary unpredictably

23.   Erroneously adding five points to each score on a test will cause the reliability coefficient to:

a.     increase

b.     decrease

c.      remain the same

d.     vary unpredictably

24.  You administer the Quick and Dirty Personality Test on January 1, 1984, and March 1, 1984, to the same group of subjects and correlate the results.  This gives you an estimate of:

a.     test-retest reliability

b.     alternate forms reliability

c.      predictive validity

d.     concurrent validity

25.  Involves the administration of two different tests at two different times:

a.     test-retest

b.     alternate forms

c.      split-half

d.     Kuder-Richardson

What type of reliability is being measured when one test is given twice?

Test-retest reliability measures the consistency of results when you repeat the same test on the same sample at a different point in time.
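
Stated compactly (my notation, not the handout's): if $X_1$ and $X_2$ are the scores from the first and second administrations, the resulting coefficient of stability is their Pearson correlation, $r_{tt} = r(X_1, X_2)$; this is the same computation sketched in Python under the concurrent validity example above, with the second administration taking the place of the criterion.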

What type of reliability is estimated when the test is administered twice to the same group, with a time interval not exceeding six months?

Test-retest reliability.  The test-retest method involves giving a group of people the same test more than once over a set period of time.

What is illustrated when the same measure is tested twice and shows the same result?

Reliability refers to the consistency of a measure. A test is considered reliable if we get the same result repeatedly. For example, if a test is designed to measure a trait (such as introversion), then each time the test is administered to a subject, the results should be approximately the same.

What is test-retest (repeated measures) reliability?

Test-retest reliability (sometimes called retest reliability) measures test consistency: the reliability of a test measured over time. In other words, the same test is given twice to the same people at different times to see whether the scores are consistent.