645 Part One Test

IQ Test

Mental age divided by chronological age, multiplied by 100.
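
A minimal sketch of the ratio above, assuming both ages are given in the same unit (years here). The function name `ratio_iq` and the example ages are made up for illustration.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# Hypothetical example: an 8-year-old who performs like a typical 10-year-old.
print(ratio_iq(10, 8))  # 125.0
```

A child whose mental age equals his or her chronological age gets exactly 100.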


Eugenics

Belief in selective breeding to improve the human race. Proponents used tests such as the Army Alpha and Army Beta in an attempt to measure intelligence, but the tests were culturally biased and measured achievement rather than raw intelligence. Associated with Terman, Galton, and Yerkes.

Test Worthiness

How good a test actually is, based on validity, reliability, cross-cultural fairness, and practicality


Correlation Coefficient

Relationship between two sets of scores, ranging from -1.00 to +1.00. The closer the value is to 0, the weaker the relationship between the variables; 0 indicates no relationship.
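
The relationship between two sets of scores can be computed as a Pearson correlation. The sketch below is one way to do it by hand; the function name and the score lists are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("undefined when one set of scores has no variability")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five test takers.
test_a = [1, 2, 3, 4, 5]
same_direction = [2, 4, 6, 8, 10]
opposite_direction = [10, 8, 6, 4, 2]

print(pearson_r(test_a, same_direction))      # positive: scores rise together
print(pearson_r(test_a, opposite_direction))  # negative: one rises, one falls
```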

Positive Correlation

Correlation in which both sets of scores move in the same direction

Negative Correlation

Correlation in which the two sets of scores move in opposite directions


Scatterplot

Graph showing two or more sets of test scores


Histogram

Bar graph of class intervals and frequency of a set of scores

Frequency Polygons

Line graph of class intervals and frequency of a set of scores


Validity

Evidence supporting the use of test scores: how well does a test measure what it is supposed to measure?

Content Validity

Evidence that test items represent the proper domain: is the content of the test appropriate for the kind of test it is?

Face Validity

Superficial appearance of a test; not a true form of validity

Criterion-Related Validity

Relationship between test scores and another standard

two types of criterion related validity

concurrent validity and predictive validity

Concurrent validity

Relationship between test scores and another currently obtainable benchmark ("here and now"). Example: asking a test taker's family how much the test taker drinks.

Predictive validity

Relationship between test scores and a future standard

Construct Validity

Evidence that an idea or concept (construct) is being measured by a test. (such as intelligence)

Four types of construct validity

Experimental design validity, convergent validity, discriminant validity, and factor analysis

Experimental Design Validity

Using experimentation to show that a test measures a concept

Convergent validity

Relationship between a test and other similar tests: a positive correlation between your test and another one similar in nature. (The correlation should not be too high, or the two tests would be measuring exactly the same thing.)


Reliability

Amount of freedom from measurement error; the consistency of test scores

Test-Retest Reliability

Relationship between scores from one test given at two different administrations. (consistency after two administrations)

Alternate, Parallel, or Equivalent Forms Reliability

Relationship between scores from two different versions of the same test. The challenge is making sure the two versions are equal in difficulty and content.

Internal Consistency

Reliability measured statistically by going "within" the test.

Split-half reliability or Odd-Even reliability

Correlating one half of a test with the other half, for example the odd-numbered items with the even-numbered items
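
A rough sketch of the odd-even approach: total each person's odd-numbered and even-numbered items separately, then correlate the two half-test totals. The response matrix is invented, and the helper names are assumptions, not part of any testing package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(responses):
    """responses: one row per test taker, one item score (e.g. 0/1) per column."""
    odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Hypothetical right/wrong answers: 5 test takers, 6 items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(round(odd_even_reliability(responses), 3))
```

Splitting by odd and even items, rather than first half versus second half, keeps fatigue or increasing item difficulty from affecting one half more than the other.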

Coefficient alpha or Kuder-Richardson

Reliability based on a mathematical comparison of individual items with one another and total score. A type of internal consistency
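
One common way to write coefficient alpha compares the sum of the item variances with the variance of the total scores. The sketch below assumes that form and uses sample variances throughout; the data and function name are invented for illustration.

```python
from statistics import variance


def coefficient_alpha(responses):
    """responses: one row per test taker, one item score per column."""
    k = len(responses[0])  # number of items
    if k < 2:
        raise ValueError("need at least two items")
    item_columns = list(zip(*responses))
    item_variances = [variance(column) for column in item_columns]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("undefined when all total scores are identical")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical right/wrong answers: 5 test takers, 6 items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(round(coefficient_alpha(responses), 3))
```

With right/wrong (0/1) items like these, this calculation corresponds to the Kuder-Richardson approach; with multi-point items (such as rating scales) it is the general coefficient alpha.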


Practicality

Feasibility considerations in test selection and administration: time, cost, format, readability, ease of administration, scoring, and interpretation

Cross-cultural Fairness

Degree to which cultural background, class, disability, and gender do not affect test results

Sources for choosing a test

Buros Mental Measurements Yearbook
Tests in Print (companion volume)
What do these books help you do?

Buros Mental Measurements Yearbook

Provides reviews of many tests, including information on the construction, validity, reliability, and usefulness of each test.


Education for All Handicapped Children Act (PL 94-142)
- Passed in 1975; assured children with disabilities a free education in the least restrictive environment (LRE), an Individualized Education Plan (IEP), and the right to be tested

Individuals with Disabilities Education Act (IDEA)

Assures the right to be tested for learning disabilities in school and to receive accommodations for a disability in the least restrictive environment

FERPA (Buckley Amendment)

Affirms the right to access test records held by the school.

Americans with Disabilities Act (ADA)

Accommodations for testing must be made. Prohibits discrimination in employment, public services, public transportation, public accommodations, and telecommunications for individuals with disabilities.

Griggs v. Duke Power Company:

asserted that tests used for hiring and advancement at work must show that they can predict job performance for all groups.

Carl Perkins Act

Ensures access to vocational assessment, counseling, and placement (for individuals with disabilities, the economically disadvantaged, people in nontraditional fields, single parents, displaced homemakers, and individuals with limited English proficiency)

Freedom of Information Act

Affirms right to access federal and state records

Army Alpha/Army Beta

First modern group tests, used during WWI; criticized for bias and cultural unfairness. The Army Beta is the language-free version of the test (form boards and mazes).


z-Score

Standard score with a mean of 0 and an SD of 1
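
A small sketch of how a raw score becomes a standard score on this scale: subtract the group mean and divide by the group SD. The numbers are invented, and the population SD (`pstdev`) is used on the assumption that the list is the whole norm group.

```python
from statistics import mean, pstdev


def z_score(raw, group_mean, group_sd):
    """Number of standard deviations a raw score lies above or below the mean."""
    if group_sd <= 0:
        raise ValueError("SD must be positive")
    return (raw - group_mean) / group_sd


def standardize(scores):
    """Convert every score in a group to a z-score (result has mean 0, SD 1)."""
    group_mean, group_sd = mean(scores), pstdev(scores)
    return [z_score(score, group_mean, group_sd) for score in scores]


# Hypothetical: raw score of 65 on a test with mean 50 and SD 10.
print(z_score(65, 50, 10))  # 1.5, i.e. one and a half SDs above the mean
print(standardize([40, 50, 60]))
```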


Testing

Instruments that yield scores based on collected data; a subset of assessment

Assessment of ability

tests that measure what a person can do in the cognitive realm

Achievement Testing

tests that measure what one has learned

Types of achievement tests

What type of tests are survey battery tests, diagnostic tests, and readiness tests?

Survey Battery Tests

Tests, usually given in school settings, that measure broad content areas. Often used to assess progress in school.

Diagnostic Tests

Tests that assess problem areas of learning. Often used to assess learning disabilities.

Readiness Tests

Tests that measure one's readiness for moving ahead in school. Often used to assess readiness to enter first grade.

Aptitude Testing

Tests that measure what one is capable of learning.

Types of Aptitude Tests

Multiple aptitude test
Specialized aptitude
intellectual and cognitive functioning tests
cognitive ability tests

Intellectual and Cognitive Functioning Tests

Tests that measure a broad range of cognitive functioning in the following domains: general intelligence, intellectual disabilities, giftedness, and changes in overall cognitive functioning. Includes intelligence testing that leads to an IQ score and neur

Cognitive Ability Tests

Tests that measure a broad range of cognitive ability. These tests are usually based on what a student has learned in school and are useful in making predictions about the future. Type of aptitude test.

Special Aptitude Tests

Tests that measure one aspect of ability. Often useful in determining the likelihood of success in a vocation. Type of aptitude test

Personality Assessment

Tests in the affective realm used to assess habits, temperament, likes and dislikes, character, and similar behaviors.

Types of personality assessments

Objective personality tests, projective personality tests, interest inventories, informal assessment instruments

Objective Personality Tests

Multiple-choice and true-false tests that assess various aspects of personality. Often used to increase client insight, to identify psychopathology, and to assist in treatment planning.

Projective Personality Tests

Tests that present a stimulus to which individuals can respond. Often used to identify psychopathology and to assist in treatment planning.

Interest Inventories

Tests that measure likes and dislikes as well as one's personality orientation toward the world of work. Generally used in career counseling.

Informal Assessment Instruments

Often developed by the user, these tests tend to assess broad areas of ability or personality and tend to be specific to the testing situation.


Developed the Strong Vocational Interest Blank, the most well-known interest inventory; a derivative is still used today


Developed the famous Rorschach Inkblot Test

Strong Vocational Interest Blank

Most well-known interest inventory; a derivative is still used today


Leader in vocational counseling, aptitude tests


Developed Thematic Apperception Test, which asks a subject to view a number of standard pictures and create a story that explains the situation.


Used word associations to identify mental illnesses


Created the Woodworth Personal Data Sheet, the first modern personality inventory, used in WWI

Woodworth Personal Data Sheet

First modern personality inventory, used in WWI to screen for mental illness; ancestor of later personality tests


Developer of the Stanford Achievement Test, pioneer in modern day education and psychological testing


Brought statistics to mental testing- coined term mental test


Used language to identify intelligence, including idiocy and intellectual disabilities; forerunner of "verbal IQ"


Developed one of the first psychological laboratories


Examined the relationship of sensorimotor responses to intelligence; introduced the concept of the correlation coefficient (strength of relationship among individuals)


Created the first modern intelligence test, developed to help integrate "subnormal children" into French schools


Chairman of the committee that developed the Army Alpha


Enhanced Binet's work to create the Stanford-Binet Intelligence Test, created IQ (intelligence quotient)


Developed the SAT to equalize educational opportunities


Developed the form board to increase motor control; forerunner of "performance IQ"

Standard Error of Measurement:

The range of scores where we would expect a person's score to fall if he or she took the instrument over and over again; in other words, where a "true" score might lie. It is calculated by taking the square root of 1 minus the reliability and multiplying t
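
The card's description of the calculation is cut off. The sketch below assumes the standard textbook form, SEM = SD x square root of (1 - reliability), and the function names and example values are invented for illustration.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM under the assumed formula: SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def score_band(observed, sd, reliability, n_sem=1):
    """Range of plus or minus n_sem SEMs around an observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - n_sem * sem, observed + n_sem * sem


# Hypothetical test: SD of 15, reliability of .91, observed score of 100.
print(round(standard_error_of_measurement(15, 0.91), 2))  # 4.5
print(score_band(100, 15, 0.91))
```

A band of plus or minus 1 SEM is usually read as covering the true score about 68% of the time, and plus or minus 2 SEM about 95% of the time. The higher the reliability, the narrower the band.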


Mean

Arithmetic average of a set of scores


Median

Score where 50% of scores fall above and 50% fall below


Mode

Most frequently occurring score


Range

Difference between the highest and lowest scores, plus 1

The median

In a skewed distribution, which is the better measure of central tendency (mean, median, or mode)?

Interquartile Range

Middle 50% of scores around the median

Standard Deviation

How scores vary from the mean
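
The measures of central tendency and variability in these cards can be tried out on a small, deliberately skewed set of scores. The scores are invented, and the quartile method is whatever `statistics.quantiles` uses by default (other textbooks may compute quartiles slightly differently).

```python
from statistics import mean, median, mode, pstdev, quantiles

# Hypothetical scores; the single very high score skews the distribution.
scores = [70, 75, 80, 80, 85, 90, 150]


def inclusive_range(values):
    """Range as defined in these notes: highest minus lowest, plus 1."""
    return max(values) - min(values) + 1


def interquartile_range(values):
    """Spread of the middle 50% of scores (third quartile minus first)."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1


print("mean:", mean(scores))      # 90, pulled upward by the score of 150
print("median:", median(scores))  # 80
print("mode:", mode(scores))      # 80
print("range:", inclusive_range(scores))  # 81
print("interquartile range:", interquartile_range(scores))
print("standard deviation:", round(pstdev(scores), 2))
```

The one extreme score moves the mean to 90 while the median stays at 80, which is why the median is preferred for skewed distributions.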

Factor analysis

another method to show construct validity, demonstrates the statistical relationship among subscales or items of a test.


Percentile

A method of comparing raw scores to a norm group by calculating the percentage of people falling below an obtained score. Percentiles range from 1 to 99, with 50 being the median (which equals the mean in a normal distribution).
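
A minimal sketch of the calculation described on this card: count how many people in the norm group scored below the obtained score and express that as a percentage. The norm group here is invented, and published tests may handle tied scores differently.

```python
def percentile_rank(obtained, norm_group):
    """Percentage of norm-group scores that fall below the obtained score."""
    if not norm_group:
        raise ValueError("norm group must not be empty")
    below = sum(1 for score in norm_group if score < obtained)
    return 100 * below / len(norm_group)


# Hypothetical norm group of 100 people with raw scores 1 through 100.
norm_group = list(range(1, 101))

print(percentile_rank(51, norm_group))  # 50.0: half the group scored lower
print(percentile_rank(91, norm_group))  # 90.0
```

Note that a percentile is not the same as percent correct: it describes standing relative to other people, not how many items were answered correctly.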


Which is not an important "rule" highlighted in the text?
a) Raw scores are meaningless
b) God does not play dice with the universe
c) Normative scores are only meaningful if you compare them to raw scores *
d) Z-scores are golden
e) Don't mix apples and o

Developmental Scores

Direct comparison of an individual's score to the average scores of others at the same age or grade level. Examples include age comparison and grade equivalent scoring.

Age comparison score

Comparison of individual score to average score of others at the same age

Grade equivalent

Comparison of individual score to average score of others at the same grade level


Observation

Observing an individual in order to develop a deeper understanding of one or more specific behaviors (e.g., observing a student's acting-out behavior in class or assessing a client's ability to perform eye-hand coordination tasks as a means of determining

Rating Scales

Scales developed to assess any of a number of attributes of the examinee. Can be rated by the examinee or someone who knows the examinee well (e.g., rating a faculty member's teaching ability or a student's ability to make empathic responses)

Classification Methods

A tool whereby an individual identifies whether he or she has, or does not have, specific attributes or characteristics (e.g., from a list, checking adjectives that seem to be most like you).

Environmental Assessment

A naturalistic and systemic approach to assessment in which information about clients is collected from their home, work, school, or other places through observation, self-reports, and checklists.

Records and Personal Documents

Items such as diaries, personal journals, genograms, and school records that are examined to gain a broader understanding of an individual.

Performance-Based Assessment

The evaluation of an individual using informal assessment procedures based on real-world activities that are not highly loaded for cognitive skills. These procedures are seen as an alternative to standardized testing (e.g., a portfolio).

observation, rating scales, classification methods, environmental assessments, records & personal documents, performance-based assessment

What are some types of Informal Assessment Instruments?

Discriminant validity

Evidence of a lack of relationship between a test and other, dissimilar tests.


Assessment

A broad array of evaluative procedures that yield information about a person

T score

Mean of 50, SD of 10


Deviation IQ

Used for IQ tests; mean of 100, SD of 15


Stanines

Used in schools and on achievement tests; mean of 5, SD of 2

Sten scores

Used for personality inventories; mean of 5.5, SD of 2

Normal Curve Equivalent (NCE)

Used in educational settings; mean of 50, SD of 21.06


mean 500 SD 100


mean 21 SD 5

Publisher type scores

test developers generate their own score
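
Each of the standard scores above is a z-score moved onto a different scale: new score = scale mean + (scale SD x z). The sketch below uses the means and SDs from these cards; the dictionary keys are labels chosen for illustration, and the two scales whose names are missing from the notes are keyed by their numbers only.

```python
# (mean, SD) pairs taken from the cards above.
SCALES = {
    "T": (50, 10),
    "deviation_IQ": (100, 15),   # "used for IQ tests"
    "stanine": (5, 2),
    "sten": (5.5, 2),
    "NCE": (50, 21.06),
    "scale_500_100": (500, 100),  # unnamed in the notes
    "scale_21_5": (21, 5),        # unnamed in the notes
}


def from_z(z, scale):
    """Linear conversion of a z-score to the named scale."""
    scale_mean, scale_sd = SCALES[scale]
    return scale_mean + scale_sd * z


def stanine(z):
    """Whole-number stanine from 1 to 9 (rounding is a simplification)."""
    return max(1, min(9, round(from_z(z, "stanine"))))


# Hypothetical test taker one SD above the mean (z = +1.0).
for name in SCALES:
    print(name, from_z(1.0, name))
print("stanine (whole number):", stanine(1.0))
```

Because every scale is built from the same z-score, a T score of 60, a deviation IQ of 115, and a stanine of 7 all describe the same relative standing.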