Instrumental conditioning
learning associations between stimuli, behavior, and the behavior's effect on the environment
What is the purpose of instrumental conditioning?
To produce goal-directed behavior in everyday life
What is an example of instrumental conditioning?
-working to earn money
-disobeying your parents gets you grounded
-plagiarizing gets you a failing grade
What is the difference between classical conditioning and instrumental conditioning?
Classical conditioning is learning an association between two stimuli (S-S learning), whereas instrumental conditioning is learning an association between a response and its outcome
Explain Pavlov's dog experiment
-dog salivates when it sees food
-Pavlov rings bell every time dog is going to be fed
-Eventually Pavlov's dog salivates upon hearing bell.
Explain E. L. Thorndike's experiment
-food-restricted cats were placed in a box
-goal was to escape in order to obtain the food
-there was a rope in a box (Stimulus) + pulling rope (response) -> escape (outcome)
-cat learned the association between stimulus and response (instrumental)
Law of effect (strengthened S-R relationship)
a response to a stimulus that is followed by a satisfying event becomes more strongly tied to that stimulus
-early on, the rope means nothing, until the cat learns that pulling it leads to food
-escape latency therefore gets shorter, because the cat now knows what lets it escape
Law of effect (weakened S-R relationship)
a response to a stimulus that is followed by an annoying event becomes more weakly tied to that stimulus
-in this part of the experiment, cats that pulled the rope were shocked but were still allowed to obtain the food
-escape latency gets longer, because the cat does not want to be shocked
Discrete-Trial Approach
The response is performed ONCE and the behavior of the subject terminates the trial.
-timing determined by experimenter
ex: maze (the trial is over when the rat reaches the end of the maze and gets the reward)
-puzzle box
How is behavior measured in the Discrete-Trial Approach?
-time to find food
-latency to start the maze (move from start box)
-choice behavior (T-maze only)
-running speed
Free-Operant Approach
Subject is free to respond at any time
-timing determined by subject
-response may be repeated many times
ex: slot machines, Skinner box
Skinner box
an experimental chamber in which the number of times the subject responds affects the outcome.
ex: number of lever presses for food
How does one produce a target response?
Through magazine training and shaping
Magazine training
-step 1 in producing target response
-classical conditioning
-the sound of the magazine (CS+) followed by food (US) orients the organism
ex: packaging of dog treat makes sound that orients dog
Shaping
-step 2 in producing target response
-reinforce responses that come successively closer to the target response; stop reinforcing the earlier, cruder responses
ex: a rat won't run the entire maze the first time, so reinforce its behavior in small steps along the way.
A response can produce one of two outcomes
Appetitive stimulus or aversive stimulus
Appetitive stimulus
A pleasant outcome (getting paid, food, sunshine)
Aversive stimulus
An unpleasant outcome (shock, yelling, cold)
Positive contingency
Response turns on (causes) an outcome
-ex: rat presses lever to get food
-you do something you get something
Negative contingency
Response turns off/ inhibits an outcome
-ex: rat turns off loud noise by pressing lever
-you do something, something stops
Positive reinforcement
positive contingency between response and appetitive stimulus
outcome: increased responding
ex: working to get paid
Punishment
positive contingency between response and aversive stimulus
outcome: decreased responding
ex: ticket for speeding
Omission Training
negative contingency between response and appetitive stimulus
outcome: decreased responding
ex: swearing leads to loss of TV, getting time out for doing something bad
Negative Reinforcement
negative contingency between response and aversive stimulus
-do something to remove bad stimulus
outcome: increased responding
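The four procedures above form a 2x2 grid: type of contingency (positive or negative) crossed with type of stimulus (appetitive or aversive). As a study aid, the grid can be sketched as a small lookup. The names `PROCEDURES` and `classify_procedure` are made up for this illustration and do not come from the notes.

```python
# Illustrative lookup for the 2x2 grid of instrumental procedures.
# Key: (contingency, stimulus type). Value: (procedure, effect on responding).
PROCEDURES = {
    ("positive", "appetitive"): ("positive reinforcement", "increase"),
    ("positive", "aversive"): ("punishment", "decrease"),
    ("negative", "appetitive"): ("omission training", "decrease"),
    ("negative", "aversive"): ("negative reinforcement", "increase"),
}


def classify_procedure(contingency, stimulus):
    """Return (procedure name, expected change in responding)."""
    key = (contingency.lower(), stimulus.lower())
    if key not in PROCEDURES:
        raise ValueError(f"unknown combination: {key}")
    return PROCEDURES[key]


if __name__ == "__main__":
    # Rat presses a lever to turn off a loud noise:
    # the response removes (negative contingency) an aversive stimulus.
    print(classify_procedure("negative", "aversive"))
```

Reading the grid this way makes the pattern easy to remember: responding increases when the response produces something pleasant or removes something unpleasant, and decreases in the other two cells.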
Escape
-type of negative reinforcement (NR) in which the aversive stimulus is present at the time of the behavior and is stopped by the response
-ex: you walk outside and it begins to rain, so you run for cover or take out an umbrella
Avoidance
-type of NR in which the aversive stimulus is scheduled to happen but is prevented by your response
-ex: you know you will be fined if you don't pay your bill on time, so you pay it on time
Barton, Brulle, & Repp
ran a study on students with ASD that attempted to decrease hand flapping, using tasty snacks as reinforcers
What types of instrumental conditioning were used in Barton, Brulle, & Repp's study?
-omission training to decrease hand flapping: if the student hand flapped, the snack was withheld
-positive reinforcement for other behaviors
Key elements of instrumental conditioning
-responses
-response-reinforcer relation
-reinforcer
-bidirectional
response
behavior performed
Stereotyped response
occurs when the same specific response is reinforced every time
Variable response
occurs when varied, different responses are reinforced
Page and Neuringer (1985)
ran an experiment in which pigeons had to peck two keys a total of 8 times to obtain food
-Control group: pecking the keys in any sequence = food
-Variability group: had to peck both keys in a sequence they had not just repeated = food
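The difference between the two groups comes down to the rule that decides whether a peck sequence earns food. Below is a minimal sketch of such a rule, not the authors' actual procedure. It assumes that a sequence is written as a string of "L" and "R" pecks and that "not repeat" means "differs from the last `lag` sequences". The `lag` parameter and both function names are hypothetical, and the sketch does not check that both keys were used.

```python
from collections import deque


def make_variability_schedule(lag):
    """Build a rule that reinforces only sequences not seen recently.

    `lag` (hypothetical parameter) is how many of the most recent
    sequences the current sequence must differ from.
    """
    recent = deque(maxlen=lag)  # remembers only the last `lag` sequences

    def is_reinforced(sequence):
        novel = sequence not in recent
        recent.append(sequence)  # every trial is remembered, rewarded or not
        return novel

    return is_reinforced


def control_schedule(sequence):
    """Control group: any sequence of pecks earns food."""
    return True


if __name__ == "__main__":
    variability = make_variability_schedule(lag=2)
    for seq in ["LLRRLLRR", "LLRRLLRR", "RRRRLLLL"]:
        print(seq, variability(seq), control_schedule(seq))
```

Under this rule a bird that keeps producing the same sequence stops earning food, which is the sense in which variability itself is the reinforced response.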
If you reinforce variability
creativity and diversity become encouraged
If you encourage the same response
variability becomes discouraged
Instinctive Drift
responses are impacted by instincts
Breland and Breland (1961)
came up with instinctive drift through various examples.
-tried to train raccoons to pick up a coin and drop it in a coin slot, reinforcing them with food
-raccoons treated the coins as if they were food (instinctive drift)
-eventually they began to treat the coins as the rewards
Behavioral Systems Theory
learning a response depends on compatibility with natural behaviors
What roles does Shettleworth say that food deprivation plays on behavior?
-food deprivation = decrease in self-care, increase in food-seeking behavior
Reinforcer
outcome of the conditioning depends upon the reinforcer
Hutt (1954)
measured quality vs. quantity of reinforcers
-thirsty rats were trained to press a lever for reinforcers that varied in quantity (small, medium, large) and quality (sour, water, milk)
-showed that increases in quality and quantity produce increases in responding
Positive Contrast
-small-quantity/poor-quality reward -> larger-quantity/good-quality reward
-results showed that responding was low at first but increased as quality and quantity improved
-elevated response for a favorable reward resulting from prior experience with a less attractive reward
Negative Contrast
-larger-quantity/good-quality reward -> small-quantity/poor-quality reward
-results showed that responding was high at first, then shifted to low
-depressed response for a smaller reward than the one previously experienced
What two factors contribute to the response-reinforcer relationship?
Contiguity and Contingency
Contiguity
How soon after the response the reinforcer occurs (closeness in time)
Contingency
Whether the response actually leads to the reinforcer
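One common way to put a number on contingency, not taken from these notes, is to compare how likely the outcome is after a response with how likely it is without one: P(outcome | response) minus P(outcome | no response). The sketch below assumes observations have already been counted into four cells; the function name `delta_p` and its parameter names are chosen for this example.

```python
def delta_p(resp_outcome, resp_no_outcome, no_resp_outcome, no_resp_no_outcome):
    """Return P(outcome | response) - P(outcome | no response).

    Each argument is a count of observation periods in that cell.
    A result near +1 means the response produces the outcome,
    near -1 means the response prevents it, and near 0 means the
    outcome is unrelated to the response.
    """
    with_resp = resp_outcome + resp_no_outcome
    without_resp = no_resp_outcome + no_resp_no_outcome
    if with_resp == 0 or without_resp == 0:
        raise ValueError("need at least one period with and one without a response")
    return resp_outcome / with_resp - no_resp_outcome / without_resp


if __name__ == "__main__":
    # Food follows every lever press and never arrives otherwise.
    print(delta_p(10, 0, 0, 10))
    # Food arrives half the time whether or not the rat presses.
    print(delta_p(5, 5, 5, 5))
```

The sign lines up with the earlier cards: a positive contingency gives a positive value and a negative contingency gives a negative value. Contiguity is a separate question, since this measure says nothing about how long the outcome is delayed.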
What is the most efficient way to use a reinforcer?
deliver it immediately after the response is displayed
What is the purpose of a secondary reinforcer?
it bridges the delay by connecting the correct response with the delayed reinforcer
ex: verbal prompts in coaching, saying good monty
Marking
sets the subject's response apart so the subject registers that this response leads to an outcome; distinguishes the target response from other activities
-marking is not associated with reinforcement, but rather choice
-usually accomplished by introducing a light/noise
Lieberman et al. (1979)
study showing that marked rats perform more correct responses
-rats that performed the behavior were placed in a white box and rats that did not were placed in a black box; both received the reward after the same delay
-marked group learned the association faster
Skinner
argued that contiguity matters more than contingency, because accidental reinforcement can occur whenever a reinforcer happens to follow a response closely, even when the response did not cause it
Adventitious reinforcement
-accidental pairing of response with reinforcer
Interim response
"odd" behaviors that occur in the middle of the interval between reinforcers (e.g., a quarter turn)
ex: if the reinforcer was given every 12 s, behavior between 4 and 8 s was odd and difficult to shape
Terminal Response
food-related behaviors occur just before the reinforcer (pecking)
-pigeons would wait a while, then start pecking as time went on and the reinforcer got close
Staddon & Simmelhag
-experiment showed that pigeons develop quarter turns during the interim
-pigeons developed magazine pecking as the terminal response right before the reinforcer