Reinforcement Increases the Likelihood a Behavior Will Occur Again

Learning Objectives

By the end of this section, you will be able to:

  • Define operant conditioning
  • Explain the difference between reinforcement and punishment
  • Distinguish between reinforcement schedules

The previous section of this chapter focused on the type of associative learning known as classical conditioning. Remember that in classical conditioning, something in the environment triggers a reflex automatically, and researchers train the organism to react to a different stimulus. Now we turn to the second type of associative learning, operant conditioning. In operant conditioning, organisms learn to associate a behavior and its consequence ([link]). A pleasant consequence makes that behavior more likely to be repeated in the future. For example, Spirit, a dolphin at the National Aquarium in Baltimore, does a flip in the air when her trainer blows a whistle. The consequence is that she gets a fish.

Classical and Operant Conditioning Compared

Conditioning approach
  Classical conditioning: An unconditioned stimulus (such as food) is paired with a neutral stimulus (such as a bell). The neutral stimulus eventually becomes the conditioned stimulus, which brings about the conditioned response (salivation).
  Operant conditioning: The target behavior is followed by reinforcement or punishment to either strengthen or weaken it, so that the learner is more likely to exhibit the desired behavior in the future.

Stimulus timing
  Classical conditioning: The stimulus occurs immediately before the response.
  Operant conditioning: The stimulus (either reinforcement or punishment) occurs shortly after the response.

Psychologist B. F. Skinner saw that classical conditioning is limited to existing behaviors that are reflexively elicited, and it doesn't account for new behaviors such as riding a bike. He proposed a theory about how such behaviors come about. Skinner believed that behavior is motivated by the consequences we receive for the behavior: the reinforcements and punishments. His idea that learning is the result of consequences is based on the law of effect, which was first proposed by psychologist Edward Thorndike. According to the law of effect, behaviors that are followed by consequences that are satisfying to the organism are more likely to be repeated, and behaviors that are followed by unpleasant consequences are less likely to be repeated (Thorndike, 1911). Essentially, if an organism does something that brings about a desired result, the organism is more likely to do it again. If an organism does something that does not bring about a desired result, the organism is less likely to do it again. An example of the law of effect is in employment. One of the reasons (and often the main reason) we show up for work is that we get paid to do so. If we stop getting paid, we will likely stop showing up—even if we love our job.

Working with Thorndike's law of effect as his foundation, Skinner began conducting scientific experiments on animals (mainly rats and pigeons) to determine how organisms learn through operant conditioning (Skinner, 1938). He placed these animals inside an operant conditioning chamber, which has come to be known as a "Skinner box" ([link]). A Skinner box contains a lever (for rats) or disk (for pigeons) that the animal can press or peck for a food reward via the dispenser. Speakers and lights can be associated with certain behaviors. A recorder counts the number of responses made by the animal.

A photograph shows B.F. Skinner. An illustration shows a rat in a Skinner box: a chamber with a speaker, lights, a lever, and a food dispenser.

(a) B. F. Skinner developed operant conditioning for systematic study of how behaviors are strengthened or weakened according to their consequences. (b) In a Skinner box, a rat presses a lever in an operant conditioning chamber to receive a food reward. (credit a: modification of work by "Silly rabbit"/Wikimedia Commons)

In discussing operant conditioning, we use several everyday words—positive, negative, reinforcement, and punishment—in a specialized manner. In operant conditioning, positive and negative do not mean good and bad. Instead, positive means you are adding something, and negative means you are taking something away. Reinforcement means you are increasing a behavior, and punishment means you are decreasing a behavior. Reinforcement can be positive or negative, and punishment can also be positive or negative. All reinforcers (positive or negative) increase the likelihood of a behavioral response. All punishers (positive or negative) decrease the likelihood of a behavioral response. Now let's combine these four terms: positive reinforcement, negative reinforcement, positive punishment, and negative punishment ([link]).

Positive and Negative Reinforcement and Punishment

Positive
  Reinforcement: Something is added to increase the likelihood of a behavior.
  Punishment: Something is added to decrease the likelihood of a behavior.

Negative
  Reinforcement: Something is removed to increase the likelihood of a behavior.
  Punishment: Something is removed to decrease the likelihood of a behavior.
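Because the four terms are just the combinations of two independent choices (whether a stimulus is added or removed, and whether the behavior increases or decreases), the scheme can be captured in a small lookup table. The following Python snippet is an illustrative sketch, not part of the original text:

```python
# The four operant conditioning terms as a 2x2 lookup:
# (what happens to the stimulus, what happens to the behavior) -> term
OPERANT_TERMS = {
    ("added",   "increases"): "positive reinforcement",
    ("removed", "increases"): "negative reinforcement",
    ("added",   "decreases"): "positive punishment",
    ("removed", "decreases"): "negative punishment",
}

# Example: a reprimand is added, and texting in class decreases.
print(OPERANT_TERMS[("added", "decreases")])  # positive punishment
```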

REINFORCEMENT

The most effective way to teach a person or animal a new behavior is with positive reinforcement. In positive reinforcement, a desirable stimulus is added to increase a behavior.

For example, you tell your five-year-old son, Jerome, that if he cleans his room, he will get a toy. Jerome quickly cleans his room because he wants a new art set. Let's pause for a moment. Some people might say, "Why should I reward my child for doing what is expected?" But in fact we are constantly and consistently rewarded in our lives. Our paychecks are rewards, as are high grades and acceptance into our preferred school. Being praised for doing a good job and for passing a driver's exam is also a reward. Positive reinforcement as a learning tool is extremely effective. It has been found that one of the most effective ways to increase achievement in school districts with below-average reading scores was to pay the children to read. Specifically, second-grade students in Dallas were paid $2 each time they read a book and passed a short quiz about the book. The result was a significant increase in reading comprehension (Fryer, 2010). What do you think about this program? If Skinner were alive today, he would probably think this was a great idea. He was a strong proponent of using operant conditioning principles to influence students' behavior at school. In fact, in addition to the Skinner box, he also invented what he called a teaching machine that was designed to reward small steps in learning (Skinner, 1961)—an early forerunner of computer-assisted learning. His teaching machine tested students' knowledge as they worked through various school subjects. If students answered questions correctly, they received immediate positive reinforcement and could continue; if they answered incorrectly, they did not receive any reinforcement. The idea was that students would spend additional time studying the material to increase their chance of being reinforced the next time (Skinner, 1961).

In negative reinforcement, an undesirable stimulus is removed to increase a behavior. For example, car manufacturers use the principles of negative reinforcement in their seatbelt systems, which go "beep, beep, beep" until you fasten your seatbelt. The annoying sound stops when you exhibit the desired behavior, increasing the likelihood that you will buckle up in the future. Negative reinforcement is also used frequently in horse training. Riders apply pressure—by pulling the reins or squeezing their legs—and then remove the pressure when the horse performs the desired behavior, such as turning or speeding up. The pressure is the negative stimulus that the horse wants to remove.

PUNISHMENT

Many people confuse negative reinforcement with punishment in operant conditioning, but they are two very different mechanisms. Remember that reinforcement, even when it is negative, always increases a behavior. In contrast, punishment always decreases a behavior. In positive punishment, you add an undesirable stimulus to decrease a behavior. An example of positive punishment is scolding a student to get the student to stop texting in class. In this case, a stimulus (the reprimand) is added in order to decrease the behavior (texting in class). In negative punishment, you remove a pleasant stimulus to decrease a behavior. For example, when a child misbehaves, a parent can take away a favorite toy; in this case, a stimulus (the toy) is removed in order to decrease the behavior.

Punishment, especially when it is immediate, is one way to decrease undesirable behavior. For example, imagine your four-year-old son, Brandon, runs into the busy street to get his ball. You give him a time-out (negative punishment, since it removes him from play) and tell him never to go into the street again. Chances are he won't repeat this behavior. While strategies like time-outs are common today, in the past children were often subject to physical punishment, such as spanking. It's important to be aware of some of the drawbacks in using physical punishment on children. First, punishment may teach fear. Brandon may become fearful of the street, but he also may become fearful of the person who delivered the punishment—you, his parent. Similarly, children who are punished by teachers may come to fear the teacher and try to avoid school (Gershoff et al., 2010). Consequently, most schools in the United States have banned corporal punishment. Second, punishment may cause children to become more aggressive and prone to antisocial behavior and delinquency (Gershoff, 2002). They see their parents resort to spanking when they become angry and frustrated, so, in turn, they may act out this same behavior when they become angry and frustrated. For example, because you spank Brenda when you are angry with her for her misbehavior, she might start hitting her friends when they won't share their toys.

While positive punishment can be effective in some cases, Skinner suggested that the use of punishment should be weighed against the possible negative effects. Today's psychologists and parenting experts favor reinforcement over punishment—they recommend that you catch your child doing something good and reward her for it.

SHAPING

In his operant conditioning experiments, Skinner often used an approach called shaping. Instead of rewarding only the target behavior, in shaping, we reward successive approximations of a target behavior. Why is shaping needed? Remember that in order for reinforcement to work, the organism must first display the behavior. Shaping is needed because it is extremely unlikely that an organism will display anything but the simplest of behaviors spontaneously. In shaping, behaviors are broken down into many small, achievable steps. The specific steps used in the process are the following (a brief code sketch follows the list):

Reinforce any response that resembles the desired behavior.
Then reinforce the response that more closely resembles the desired behavior. You will no longer reinforce the previously reinforced response.
Next, begin to reinforce the response that even more closely resembles the desired behavior.
Continue to reinforce closer and closer approximations of the desired behavior.
Finally, only reinforce the desired behavior.
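These steps amount to a simple loop: reinforce responses within a tolerance of the target, then shrink the tolerance. The following Python sketch illustrates that logic; the numbers and update rule are assumptions chosen for demonstration, not details from the text:

```python
import random

def shape(target, steps=5, trials_per_step=200):
    """Reinforce successively closer approximations of a target behavior.

    The tolerance around the target shrinks at each step, so responses
    that earned reinforcement earlier no longer do, and only closer
    approximations pay off.
    """
    behavior = 0.0  # the organism's current typical response
    for step in range(steps):
        tolerance = (steps - step) / steps  # 1.0, 0.8, ... shrinking toward 0
        for _ in range(trials_per_step):
            response = behavior + random.gauss(0, 0.3)   # responses vary trial to trial
            if abs(response - target) <= tolerance:      # close enough: reinforce
                behavior += 0.1 * (response - behavior)  # reinforced responses become more typical
        print(f"step {step + 1}: tolerance {tolerance:.1f}, typical behavior {behavior:.2f}")

shape(target=1.0)  # typical behavior drifts from 0.0 toward the target
```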

Shaping is often used in teaching a complex behavior or chain of behaviors. Skinner used shaping to teach pigeons not only such relatively simple behaviors as pecking a disk in a Skinner box, but also many unusual and entertaining behaviors, such as turning in circles, walking in figure eights, and even playing ping pong; the technique is commonly used by animal trainers today. An important part of shaping is stimulus discrimination. Recall Pavlov's dogs—he trained them to respond to the tone of a bell, and not to similar tones or sounds. This discrimination is also important in operant conditioning and in shaping behavior.

It's easy to see how shaping is effective in teaching behaviors to animals, but how does shaping work with humans? Let's consider parents whose goal is to have their child learn to clean his room. They use shaping to help him master steps toward the goal. Instead of performing the entire task, they set up these steps and reinforce each step. First, he cleans up one toy. Second, he cleans up five toys. Third, he chooses whether to pick up ten toys or put his books and clothes away. Fourth, he cleans up everything except two toys. Finally, he cleans his entire room.

PRIMARY AND SECONDARY REINFORCERS

Rewards such as stickers, praise, money, toys, and more can be used to reinforce learning. Let's go back to Skinner's rats again. How did the rats learn to press the lever in the Skinner box? They were rewarded with food each time they pressed the lever. For animals, food would be an obvious reinforcer.

What would be a good reinforcer for humans? For your son Jerome, it was the promise of a toy if he cleaned his room. How about Joaquin, the soccer player? If you gave Joaquin a piece of candy every time he made a goal, you would be using a primary reinforcer. Primary reinforcers are reinforcers that have innate reinforcing qualities. These kinds of reinforcers are not learned. Water, food, sleep, shelter, sex, and touch, among others, are primary reinforcers. Pleasure is also a primary reinforcer. Organisms do not lose their drive for these things. For most people, jumping in a cool lake on a very hot day would be reinforcing, and the cool lake would be innately reinforcing—the water would cool the person off (a physical need), as well as provide pleasure.

A secondary reinforcer has no inherent value and only has reinforcing qualities when linked with a primary reinforcer. Praise, linked to affection, is one example of a secondary reinforcer, as when you called out "Great shot!" every time Joaquin made a goal. Another example, money, is only worth something when you can use it to buy other things—either things that satisfy basic needs (food, water, shelter—all primary reinforcers) or other secondary reinforcers. If you were on a remote island in the middle of the Pacific Ocean and you had stacks of money, the money would not be useful if you could not spend it. What about the stickers on the behavior chart? They also are secondary reinforcers.

Sometimes, instead of stickers on a sticker chart, a token is used. Tokens, which are also secondary reinforcers, can then be traded in for rewards and prizes. Entire behavior management systems, known as token economies, are built around the use of these kinds of token reinforcers. Token economies have been found to be very effective at modifying behavior in a variety of settings such as schools, prisons, and mental hospitals. For example, a study by Cangi and Daly (2013) found that use of a token economy increased appropriate social behaviors and reduced inappropriate behaviors in a group of autistic school children. Autistic children tend to exhibit disruptive behaviors such as pinching and hitting. When the children in the study exhibited appropriate behavior (not hitting or pinching), they received a "quiet hands" token. When they hit or pinched, they lost a token. The children could then exchange specified amounts of tokens for minutes of playtime.
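The mechanics of a token economy like the one in the Cangi and Daly study can be expressed as a small ledger. This Python sketch is purely illustrative; the event names and exchange rate are assumptions, not details from the study:

```python
class TokenEconomy:
    """Minimal token-economy ledger: tokens (secondary reinforcers) are
    earned for appropriate behavior, lost for inappropriate behavior,
    and exchanged for a backup reward such as minutes of playtime."""

    def __init__(self, minutes_per_token=2):  # exchange rate is assumed
        self.tokens = 0
        self.minutes_per_token = minutes_per_token

    def record(self, behavior):
        if behavior == "quiet hands":        # appropriate: earn a token
            self.tokens += 1
        elif behavior in ("hit", "pinch"):   # inappropriate: lose a token
            self.tokens = max(0, self.tokens - 1)

    def exchange(self):
        """Trade all accumulated tokens for playtime minutes."""
        minutes = self.tokens * self.minutes_per_token
        self.tokens = 0
        return minutes

ledger = TokenEconomy()
for event in ["quiet hands", "quiet hands", "hit", "quiet hands"]:
    ledger.record(event)
print(ledger.exchange(), "minutes of playtime")  # 2 tokens -> 4 minutes
```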

Everyday Connection: Behavior Modification in Children

Parents and teachers often use behavior modification to change a child's behavior. Behavior modification uses the principles of operant conditioning to accomplish behavior change so that undesirable behaviors are switched for more socially acceptable ones. Some teachers and parents create a sticker chart, in which several behaviors are listed ([link]). Sticker charts are a form of token economies, as described in the text. Each time children perform the behavior, they get a sticker, and after a certain number of stickers, they get a prize, or reinforcer. The goal is to increase acceptable behaviors and decrease misbehavior. Remember, it is best to reinforce desired behaviors, rather than to use punishment. In the classroom, the teacher can reinforce a wide range of behaviors, from students raising their hands, to walking quietly in the hall, to turning in their homework. At home, parents might create a behavior chart that rewards children for things such as putting away toys, brushing their teeth, and helping with dinner. In order for behavior modification to be effective, the reinforcement needs to be connected with the behavior; the reinforcement must matter to the child and be done consistently.

A photograph shows a child placing stickers on a chart hanging on the wall.

Sticker charts are a form of positive reinforcement and a tool for behavior modification. Once this little girl earns a certain number of stickers for demonstrating a desired behavior, she will be rewarded with a trip to the ice cream parlor. (credit: Abigail Batchelder)

Time-out is another popular technique used in behavior modification with children. It operates on the principle of negative punishment. When a child demonstrates an undesirable behavior, she is removed from the desirable activity at hand ([link]). For example, say that Sophia and her brother Mario are playing with building blocks. Sophia throws some blocks at her brother, so you give her a warning that she will go to time-out if she does it again. A few minutes later, she throws more blocks at Mario. You remove Sophia from the room for a few minutes. When she comes back, she doesn't throw blocks.

There are several important points that you should know if you plan to implement time-out as a behavior modification technique. First, make sure the child is being removed from a desirable activity and placed in a less desirable location. If the activity is something undesirable for the child, this technique will backfire because it is more enjoyable for the child to be removed from the activity. Second, the length of the time-out is important. The general rule of thumb is one minute for each year of the child's age. Sophia is five; therefore, she sits in a time-out for five minutes. Setting a timer helps children know how long they have to sit in time-out. Finally, as a caregiver, keep several guidelines in mind over the course of a time-out: remain calm when directing your child to time-out; ignore your child during time-out (because caregiver attention may reinforce misbehavior); and give the child a hug or a kind word when time-out is over.

Photograph A shows several children climbing on playground equipment. Photograph B shows a child sitting alone at a table looking at the playground.

Time-out is a popular form of negative punishment used by caregivers. When a child misbehaves, he or she is removed from a desirable activity in an effort to decrease the unwanted behavior. For example, (a) a child might be playing on the playground with friends and push another child; (b) the child who misbehaved would then be removed from the activity for a short period of time. (credit a: modification of work by Simone Ramella; credit b: modification of work by "JefferyTurner"/Flickr)

REINFORCEMENT SCHEDULES

Remember, the best way to teach a person or animal a behavior is to use positive reinforcement. For example, Skinner used positive reinforcement to teach rats to press a lever in a Skinner box. At first, the rat might randomly hit the lever while exploring the box, and out would come a pellet of food. After eating the pellet, what do you think the hungry rat did next? It hit the lever again, and received another pellet of food. Each time the rat hit the lever, a pellet of food came out. When an organism receives a reinforcer each time it displays a behavior, it is called continuous reinforcement. This reinforcement schedule is the quickest way to teach someone a behavior, and it is especially effective in training a new behavior. Let's look back at the dog that was learning to sit earlier in the chapter. Now, each time he sits, you give him a treat. Timing is important here: you will be most successful if you present the reinforcer immediately after he sits, so that he can make an association between the target behavior (sitting) and the consequence (getting a treat).

Once a behavior is trained, researchers and trainers often turn to another type of reinforcement schedule—partial reinforcement. In partial reinforcement, also referred to as intermittent reinforcement, the person or animal does not get reinforced every time they perform the desired behavior. There are several different types of partial reinforcement schedules ([link]). These schedules are described as either fixed or variable, and as either interval or ratio. Fixed refers to the number of responses between reinforcements, or the amount of time between reinforcements, which is set and unchanging. Variable refers to the number of responses or amount of time between reinforcements, which varies or changes. Interval means the schedule is based on the time between reinforcements, and ratio means the schedule is based on the number of responses between reinforcements.

Reinforcement Schedules

Fixed interval
  Description: Reinforcement is delivered at predictable time intervals (e.g., after 5, 10, 15, and 20 minutes).
  Result: Moderate response rate with significant pauses after reinforcement
  Example: Hospital patient uses patient-controlled, doctor-timed pain relief

Variable interval
  Description: Reinforcement is delivered at unpredictable time intervals (e.g., after 5, 7, 10, and 20 minutes).
  Result: Moderate yet steady response rate
  Example: Checking Facebook

Fixed ratio
  Description: Reinforcement is delivered after a predictable number of responses (e.g., after 2, 4, 6, and 8 responses).
  Result: High response rate with pauses after reinforcement
  Example: Piecework—factory worker getting paid for every x number of items manufactured

Variable ratio
  Description: Reinforcement is delivered after an unpredictable number of responses (e.g., after 1, 4, 5, and 9 responses).
  Result: High and steady response rate
  Example: Gambling
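The ratio schedules, in particular, are easy to state as algorithms: a fixed-ratio dispenser rewards every n-th response, while a variable-ratio dispenser rewards after an unpredictable number of responses with the same average. Here is a minimal Python sketch of that contrast (an illustration, not part of the original text; interval schedules would track elapsed time instead of response counts):

```python
import random

def fixed_ratio(n):
    """Deliver a reinforcer after every n-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # reinforcer delivered
        return False
    return respond

def variable_ratio(mean_n):
    """Deliver a reinforcer after an unpredictable number of responses
    that averages mean_n (like a slot machine)."""
    count, threshold = 0, random.randint(1, 2 * mean_n - 1)
    def respond():
        nonlocal count, threshold
        count += 1
        if count >= threshold:
            count, threshold = 0, random.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond

lever = variable_ratio(mean_n=5)
rewards = sum(lever() for _ in range(1000))
print(f"1000 presses earned {rewards} rewards")  # about 200, unpredictably spaced
```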

Now let's combine these four terms. A fixed interval reinforcement schedule is when behavior is rewarded after a set amount of time. For example, June undergoes major surgery in a hospital. During recovery, she is expected to experience pain and will require prescription medications for pain relief. June is given an IV drip with a patient-controlled painkiller. Her doctor sets a limit: one dose per hour. June pushes a button when pain becomes difficult, and she receives a dose of medication. Since the reward (pain relief) only occurs on a fixed interval, there is no point in exhibiting the behavior when it will not be rewarded.

With a variable interval reinforcement schedule, the person or animal gets the reinforcement based on varying amounts of time, which are unpredictable. Say that Manuel is the manager at a fast-food restaurant. Every once in a while someone from the quality control division comes to Manuel's restaurant. If the restaurant is clean and the service is fast, everyone on that shift earns a $20 bonus. Manuel never knows when the quality control person will show up, so he always tries to keep the restaurant clean and ensures that his employees provide prompt and courteous service. His productivity regarding prompt service and keeping a clean restaurant is steady because he wants his crew to earn the bonus.

With a fixed ratio reinforcement schedule, there are a set number of responses that must occur before the behavior is rewarded. Carla sells glasses at an eyeglass store, and she earns a commission every time she sells a pair of glasses. She always tries to sell people more pairs of glasses, including prescription sunglasses or a backup pair, so she can increase her commission. She does not care if the person really needs the prescription sunglasses; Carla just wants her bonus. The quality of what Carla sells does not matter because her commission is not based on quality; it's only based on the number of pairs sold. This distinction in the quality of performance can help determine which reinforcement method is most appropriate for a particular situation. Fixed ratios are better suited to optimize the quantity of output, whereas a fixed interval, in which the reward is not quantity based, can lead to a higher quality of output.

In a variable ratio reinforcement schedule, the number of responses needed for a reward varies. This is the most powerful partial reinforcement schedule. An example of the variable ratio reinforcement schedule is gambling. Imagine that Sarah—generally a smart, thrifty woman—visits Las Vegas for the first time. She is not a gambler, but out of curiosity she puts a quarter into the slot machine, and then another, and another. Nothing happens. Two dollars in quarters later, her curiosity is fading, and she is just about to quit. But then, the machine lights up, bells go off, and Sarah gets 50 quarters back. That's more like it! Sarah gets back to inserting quarters with renewed interest, and a few minutes later she has used up all her gains and is $10 in the hole. Now might be a sensible time to quit. And yet, she keeps putting money into the slot machine because she never knows when the next reinforcement is coming. She keeps thinking that with the next quarter she could win $50, or $100, or even more. Because the reinforcement schedule in most types of gambling is a variable ratio schedule, people keep trying and hoping that the next time they will win big. This is one of the reasons that gambling is so addictive—and so resistant to extinction.

In operant conditioning, extinction of a reinforced behavior occurs at some point after reinforcement stops, and the speed at which this happens depends on the reinforcement schedule. In a variable ratio schedule, the point of extinction comes very slowly, as described above. But in the other reinforcement schedules, extinction may come quickly. For example, if June presses the button for the pain relief medication before the allotted time her doctor has approved, no medication is administered. She is on a fixed interval reinforcement schedule (dosed hourly), so extinction occurs quickly when reinforcement doesn't come at the expected time. Among the reinforcement schedules, variable ratio is the most productive and the most resistant to extinction. Fixed interval is the least productive and the easiest to extinguish ([link]).

A graph plots time on the x-axis against cumulative number of responses on the y-axis, with a separate curve for each of the four reinforcement schedules.

The four reinforcement schedules yield different response patterns. The variable ratio schedule is unpredictable and yields high and steady response rates, with little if any pause after reinforcement (e.g., gambler). A fixed ratio schedule is predictable and produces a high response rate, with a short pause after reinforcement (e.g., eyeglass saleswoman). The variable interval schedule is unpredictable and produces a moderate, steady response rate (e.g., restaurant manager). The fixed interval schedule yields a scallop-shaped response pattern, reflecting a significant pause after reinforcement (e.g., surgery patient).

Connect the Concepts: Gambling and the Brain

Skinner (1953) stated, "If the gambling establishment cannot persuade a patron to turn over money with no return, it may achieve the same effect by returning part of the patron's money on a variable-ratio schedule" (p. 397).

Skinner uses gambling as an example of the power and effectiveness of conditioning behavior based on a variable ratio reinforcement schedule. In fact, Skinner was so confident in his knowledge of gambling addiction that he even claimed he could turn a pigeon into a pathological gambler ("Skinner's Utopia," 1971). Beyond the power of variable ratio reinforcement, gambling seems to work on the brain in the same way as some addictive drugs. The Illinois Institute for Addiction Recovery (n.d.) reports evidence suggesting that pathological gambling is an addiction similar to a chemical addiction ([link]). Specifically, gambling may activate the reward centers of the brain, much like cocaine does. Research has shown that some pathological gamblers have lower levels of the neurotransmitter (brain chemical) known as norepinephrine than do normal gamblers (Roy, et al., 1988). According to a study conducted by Alec Roy and colleagues, norepinephrine is secreted when a person feels stress, arousal, or thrill; pathological gamblers use gambling to increase their levels of this neurotransmitter. Another researcher, neuroscientist Hans Breiter, has done extensive research on gambling and its effects on the brain. Breiter (as cited in Franzen, 2001) reports that "Monetary reward in a gambling-like experiment produces brain activation very similar to that observed in a cocaine addict receiving an infusion of cocaine" (para. 1). Deficiencies in serotonin (another neurotransmitter) might also contribute to compulsive behavior, including a gambling addiction.

It may be that pathological gamblers' brains are different from those of other people, and perhaps this difference may somehow have led to their gambling addiction, as these studies seem to suggest. However, it is very difficult to ascertain the cause because it is impossible to conduct a true experiment (it would be unethical to try to turn randomly assigned participants into problem gamblers). Therefore, it may be that causation actually moves in the opposite direction—perhaps the act of gambling somehow changes neurotransmitter levels in some gamblers' brains. It also is possible that some overlooked factor, or confounding variable, played a role in both the gambling addiction and the differences in brain chemistry.

A photograph shows four digital gaming machines.

Some research suggests that pathological gamblers use gambling to compensate for abnormally low levels of the hormone norepinephrine, which is associated with stress and is secreted in moments of arousal and thrill. (credit: Ted Murphy)

COGNITION AND LATENT LEARNING

Although strict behaviorists such as Skinner and Watson refused to believe that cognition (such as thoughts and expectations) plays a role in learning, another behaviorist, Edward C. Tolman, had a different opinion. Tolman's experiments with rats demonstrated that organisms can learn even if they do not receive immediate reinforcement (Tolman & Honzik, 1930; Tolman, Ritchie, & Kalish, 1946). This finding was in conflict with the prevailing idea at the time that reinforcement must be immediate in order for learning to occur, thus suggesting a cognitive aspect to learning.

In the experiments, Tolman placed hungry rats in a maze with no reward for finding their way through it. He also studied a comparison group that was rewarded with food at the end of the maze. As the unreinforced rats explored the maze, they developed a cognitive map: a mental picture of the layout of the maze ([link]). After 10 sessions in the maze without reinforcement, food was placed in a goal box at the end of the maze. As soon as the rats became aware of the food, they were able to find their way through the maze quickly, just as quickly as the comparison group, which had been rewarded with food all along. This is known as latent learning: learning that occurs but is not observable in behavior until there is a reason to demonstrate it.

An illustration shows three rats in a maze, with a starting point and food at the end.

Psychologist Edward Tolman found that rats use cognitive maps to navigate through a maze. Have you ever worked your way through various levels on a video game? You learned when to turn left or right, move up or down. In that case you were relying on a cognitive map, just like the rats in a maze. (credit: modification of work by "FutUndBeidl"/Flickr)

Latent learning also occurs in humans. Children may learn by watching the actions of their parents but only demonstrate it at a later date, when the learned material is needed. For example, suppose that Ravi's dad drives him to school every day. In this way, Ravi learns the route from his house to his school, but he's never driven there himself, so he has not had a chance to demonstrate that he's learned the way. One morning Ravi's dad has to leave early for a meeting, so he can't drive Ravi to school. Instead, Ravi follows the same route on his bike that his dad would have taken in the car. This demonstrates latent learning. Ravi had learned the route to school, but had no need to demonstrate this knowledge earlier.

Everyday Connection: This Place Is Like a Maze

Have you ever gotten lost in a building and couldn't find your way back out? While that can be frustrating, you're not alone. At one time or another we've all gotten lost in places like a museum, hospital, or university library. Whenever we go someplace new, we build a mental representation—or cognitive map—of the location, as Tolman's rats built a cognitive map of their maze. However, some buildings are confusing because they include many areas that look alike or have short lines of sight. Because of this, it's often difficult to predict what's around a corner or decide whether to turn left or right to get out of a building. Psychologist Laura Carlson (2010) suggests that what we place in our cognitive map can affect our success in navigating through the environment. She suggests that paying attention to specific features upon entering a building, such as a picture on the wall, a fountain, a statue, or an escalator, adds information to our cognitive map that can be used later to help find our way out of the building.

Summary

Operant conditioning is based on the work of B. F. Skinner. Operant conditioning is a form of learning in which the motivation for a behavior happens after the behavior is demonstrated. An animal or a human receives a consequence after performing a specific behavior. The consequence is either a reinforcer or a punisher. All reinforcement (positive or negative) increases the likelihood of a behavioral response. All punishment (positive or negative) decreases the likelihood of a behavioral response. Several types of reinforcement schedules are used to reward behavior depending on either a set or variable period of time.

Self Check Questions

Critical Thinking Questions

1. What is a Skinner box and what is its purpose?

2. What is the difference between negative reinforcement and punishment?

3. What is shaping and how would you use shaping to teach a dog to roll over?

Personal Application Questions

4. Explain the difference between negative reinforcement and punishment, and provide several examples of each based on your own experiences.

5. Think of a behavior that you have that you would like to change. How could you use behavior modification, specifically positive reinforcement, to change your behavior? What is your positive reinforcer?

Answers

1. A Skinner box is an operant conditioning chamber used to train animals such as rats and pigeons to perform certain behaviors, like pressing a lever. When the animals perform the desired behavior, they receive a reward: food or water.

2. In negative reinforcement you are taking away an undesirable stimulus in order to increase the frequency of a certain behavior (e.g., buckling your seat belt stops the annoying beeping sound in your car and increases the likelihood that you will wear your seatbelt). Punishment is designed to reduce a behavior (e.g., you scold your child for running into the street in order to decrease the dangerous behavior).

3. Shaping is an operant conditioning method in which you reward closer and closer approximations of the desired behavior. If you want to teach your dog to roll over, you might reward him first when he sits, then when he lies down, and then when he lies down and rolls onto his back. Finally, you would reward him only when he completes the entire sequence: lying down, rolling onto his back, and then continuing to roll over to his other side.

Glossary

cognitive map mental picture of the layout of the environment

continuous reinforcement rewarding a behavior every time it occurs

fixed interval reinforcement schedule behavior is rewarded after a set amount of time

fixed ratio reinforcement schedule set number of responses must occur before a behavior is rewarded

latent learning learning that occurs, but it may not be evident until there is a reason to demonstrate it

law of effect behavior that is followed by consequences satisfying to the organism will be repeated and behaviors that are followed by unpleasant consequences will be discouraged

negative punishment taking away a pleasant stimulus to decrease or stop a behavior

negative reinforcement taking away an undesirable stimulus to increase a behavior

operant conditioning form of learning in which the stimulus/experience happens after the behavior is demonstrated

partial reinforcement rewarding behavior only some of the time

positive punishment adding an undesirable stimulus to stop or decrease a behavior

positive reinforcement adding a desirable stimulus to increase a behavior

primary reinforcer has innate reinforcing qualities (e.g., food, water, shelter, sex)

punishment implementation of a consequence in order to decrease a behavior

reinforcement implementation of a consequence in order to increase a behavior

secondary reinforcer has no inherent value unto itself and only has reinforcing qualities when linked with something else (e.g., money, gold stars, poker chips)

shaping rewarding successive approximations toward a target behavior

variable interval reinforcement schedule behavior is rewarded after unpredictable amounts of time have passed

variable ratio reinforcement schedule number of responses differ before a behavior is rewarded
