From manipulation to communication: treats, a reinforcer that scares us so much

 
 

The use of treats in the education of pets has many supporters and many opponents. This article aims to shed light on the theory behind this stimulus in order to understand its advantages, the reluctance to use it, and techniques for introducing it.

Before we can go into the subject in depth, let's start with some definitions.

What is a reinforcer?

A reinforcer is a stimulus that increases the likelihood of a behavior occurring and makes that behavior more frequent (or faster, higher, for example, depending on the type of behavior). Thus, if the individuals are "rewarded", the behaviors are "reinforced".

There are several types of reinforcers. Knowing them helps us appreciate their importance when creating or modifying a behavior.

Primary reinforcers

Primary reinforcers do not depend on an association with another reinforcer, and we do not need previous exposure to them to find them reinforcing: food when we are hungry, water when we are thirsty, warmth when we are cold, coolness when we are hot, sexual stimulation, and human contact are all primary reinforcers.

John Baldwin (2007) defined sensory stimulation as a primary reinforcer. Biederman & Vessel (2006) identified learning itself as a primary reinforcer (indeed, in French we often speak of “devouring” a book or information).

Having control over one's environment is a primary reinforcer for any individual, just like food, water, sex, ...

 

Secondary reinforcers

Secondary reinforcers depend on an association with a primary reinforcer (or with another secondary reinforcer): praise, recognition or compliments are possible secondary reinforcers. Secondary reinforcers are everywhere, all the time, and it is harder to reach satiety with them than with primary reinforcers. We rarely get tired of being praised for who we are or what we do (especially when the form changes; for example, bravo-bravo-bravo could lead to satiety more quickly than bravo-that's great-I am proud of you).

Generalized reinforcers

Secondary reinforcers that have been associated with several other reinforcers are generalized reinforcers.

Its association with nearly all primary reinforcers makes money one of the most powerful generalized reinforcers. Money buys food, air conditioning, heating, and sometimes even human contact and sex as well.

It is also associated with secondary reinforcers because money can also buy games, televisions, cars, and so on.

But for the euros or dollars in our wallets to remain generalized reinforcers, they must be paired with primary reinforcers on a regular basis. If I give you a 500 bill, it will only be a reinforcer if the currency in question still exists.

If I offer you 500 Monopoly dollars, they will immediately be much less reinforcing than a real 500-dollar note.

Natural or artificial reinforcers

Primary and secondary reinforcers can be classified into two categories: natural reinforcers or artificial (or contrived) reinforcers.

Artificial reinforcers are those that are put in place by a person to change the frequency of a behavior.

Natural reinforcers are those that have an "automatic" and natural reinforcing power after the emission of a behavior.

Primary or secondary reinforcers can therefore fall into one or the other of the categories depending on whether they were placed there to modify the frequency of a behavior or whether the reinforcer is spontaneously a consequence of a behavior.

For example, if you watch a comedy movie that makes you laugh, the fun will be a natural reinforcer. On the other hand, if your spouse tells you how pretty you look when you laugh, in order to see you smile more often, it is an artificial reinforcer.

A few years ago, a question (not really an existential one, though) bothered me: when a parrot preens the feathers of another (parrots with a strong friendship or pair bond scratch away the keratin sheaths surrounding each other's new feathers during moulting), is it a primary or a secondary reinforcer?

After reading and studying a ton of literature about reinforcers, I still couldn't decide. I thought it must be primary, since the behavior exists even in parrots that have been taken from their parents' nest and hand-reared by humans. But at the same time it is not a reinforcer related to "survival", which is how we classify primary reinforcers, so it must be secondary.

It's bound to be a natural reinforcer, since the behavior seems to be self-reinforcing, but at the same time it may also be a contrived reinforcer used to increase the other parrot's feather-preening behavior (it would seem that the parrots take turns doing it).

Completely lost, since I had ended up placing this reinforcer in every category, I wrote to Dr. Susan G. Friedman, Professor Emeritus of Psychology and a pioneer in the use of Applied Behavior Analysis with pets and captive animals, and here is part of her answer:

“I think this strict dichotomy between primary and secondary reinforcers is not helpful in contemporary thinking.

Primary reinforcers are consequences that naturally reinforce behavior—they don't need a history to acquire their behavior-reinforcing function. So I imagine parrots are born with the tendency to use their beaks this way, but there must be a lot of learning involved in the behavior too.

We know that if the behavior is repeated, there is a reinforcing effect.

But what does it matter, in the end, what label we give it?”

Regardless of the label, what interests us is whether this particular stimulus will increase the desired behavior or not.

We hope that new behaviors will quickly meet natural reinforcers, but at the beginning the best strategy is certainly to provide artificial ones.

 

How does a reinforcer work?

 

How a reinforcer is delivered greatly affects its effectiveness. Reinforcement of a behavior will be most effective when the reinforcer is delivered in a contingent and contiguous manner.

This means that reinforcer Y must depend on behavior X (contingency) and must follow it within a very short, ideally immediate, time interval (contiguity).

If I perform behavior X, I know that reinforcer Y will come right after, and that's why I'm going to repeat my behavior.

 

Example: if I put a euro in the coffee machine and press the button (behavior), I know that a cup of coffee will come out (contingent reinforcer) immediately afterwards (contiguous), so the next time I want a coffee I will repeat the behavior.

 

The longer the delay between the behavior and the consequence (the reinforcer), the less effective the reinforcer is.

We therefore all need to see our behaviors reinforced immediately and adequately in order to repeat them more and more often.

Other characteristics affect the (in)effectiveness of a reinforcer: satiety, deprivation, type of reinforcer, variety, individual preferences, ...

Even though money is considered to be a very important reinforcer for humans, its effectiveness on the behaviors of people working on a voluntary basis is diminished. Appreciation, self-esteem, the happiness of others, recognition, and the success of a mission are, on the other hand, much more effective reinforcers in this type of scenario.

 

Which reinforcer for my pet?

 

Much of the success of creating or modifying behavior relies on finding what motivates the learner the most.

 

Thorndike, a precursor of behaviorism, developed “The Law of Effect” in 1898, which states that behavior that produces a satisfactory effect in a situation will be more likely to recur in that same situation. Conversely, a behavior that produces an uncomfortable effect in a situation will be less likely to recur in that same situation.

In other words, my dog will be more likely to repeat the "sit" behavior when I give him the "sit" cue if that behavior produces a satisfying effect... for my dog!

 

Scientific studies tend to show that positive reinforcement procedures are the most effective for teaching a new behavior (Blackwell et al., 2008; Ziv, 2017; Mills et al., 2020). This means that to increase the frequency of a behavior, the most effective method is to add a stimulus the animal appreciates right after the behavior occurs.

 

Each individual is different.

Each individual has their own preferences.

Every situation is different.

 

This means that depending on the individual and the context in which we want to teach him a behavior, the reinforcer can take many different forms.

Sometimes the most suitable reinforcer is food, sometimes a caress, sometimes praise, sometimes choice, ...

 

The most important thing to remember when choosing a reinforcer for our pet is that it is not up to us to choose it!

 

If we want the behavior of our dog, our cat, our parrot or our horse to increase, the individual himself must find *this* stimulus in *this* situation reinforcing.

To determine which reinforcer is best for our learner, we can run tests and check which one has the most value, judging both by the future frequency of the behavior and by the individual's body language.

 

Treats as reinforcers

 

Legend has it that dogs obey their owners in order to please them; the reality is quite different.

No offense to our ego, but dogs often seem more attracted to primary food reinforcers than to our attention and praise (Feuerbacher & Wynne, 2013, 2014), and the same would be true for cats (Willson et al., 2017).

 

Treats therefore have the advantage of allowing more effective and faster learning than interaction with humans (praise and/or hugs).

 

Although scientifically studied, documented and acclaimed, the use of treats in the education of pets still meets reluctance today.

 

*One of the main sources of reluctance is the weight gain that food intake during training sessions could cause.

 

In order to explore this reluctance, we must first go back to what we know about the use of food in learning a behavior. If we are indeed seeking to deliver a stimulus appreciated by the animal after the emission of his behavior, it is the quality of the reinforcer that will take precedence over the quantity (Riemer et al, 2018). This means that when using a treat that is particularly enjoyed by the dog, a small amount will result in more effective and faster learning and performance than a large amount of a less enjoyed reinforcer.

Secondly, when teaching a new behavior, we want the session to flow with a high, fast reinforcement rate, which would not be possible with large treats that the animal would need more time to chew and swallow.

 

If weight gain is a reason for reluctance to use primary food reinforcers, we can suggest:

- To use low-fat food reinforcers

- To use food reinforcers cut into very small pieces

- To reduce the ration of daily meals

- To use part of the daily meal as a reinforcer

- To increase the physical activity of the dog

- To use natural food reinforcers, without by-products, without preservatives, without added sugars

 

*Another reluctance sometimes comes from the fact that the human thinks that his animal is not motivated by food.


I don't know who said this, but I've kept it in mind since I heard it years ago:

“An animal that is not motivated by food is a dead animal”

 

As long as an individual eats, he is motivated by food. But perhaps he is not motivated by the food being offered to him.

 

Several questions can then be asked to improve the situation:

- Could the animal have a health problem?

- Can we try to offer more enjoyable treats?

- Are the treats offered in contexts where the animal is uncomfortable?

- Is the treat delivered in a way that makes the dog uncomfortable?

- Are the treats varied?

 

Variety is the spice of life. Just like humans, animals can have a variety of tastes and those tastes are not fixed in time.

If I'm offered avocado for breakfast, lunch, and supper every day for a week, even though I love avocado, by day 8, I wouldn't want a single bite of it anymore.

In dogs, the variability effect suggests using several different treats during positive reinforcement training sessions rather than the same treat on every repetition (Bremhorst et al., 2018).

Also, if the animal does not know which treat he will receive after his behavior, he will be all the more motivated.

 

Neuroscientist Michael Platt has suggested that neurons fire faster for consistently high reinforcers than for lower gains. He also discovered that neurons were particularly active for “surprise” reinforcers.

Surprise creates motivation!

 

For the pickiest eaters, we can reinforce the behavior of eating itself, like any other behavior, with a reinforcer the learner appreciates (a toy, praise, petting, ...), so that the frequency of eating treats increases as the sessions progress.

 

*Another objection often heard about the use of treats in the education of a pet is that the animal should obey without our having to resort to stratagems.

 

This reluctance often comes from dominance theory (or alpha theory), according to which the human should have a higher hierarchical status than the dog. This perspective proposes that dog behavior reflects the rigid, structured hierarchy that has been described in wolf packs, and recommends that humans assert strong dominance over their dogs for a stable relationship.

Dominance in dogs is an ambiguous concept because it can have two definitions depending on the ethological knowledge of the speaker.

Although dominance, from an ethological point of view, is well documented in dogs, most of the advice wrongly given in its name actually describes punishment procedures. These punitive interventions can, however, bring many side effects that jeopardize the physical and psychological health of the animal as well as the safety of the people interacting with him (Herron, Reisner & Shofer 2009; Arhant et al 2010; Casey et al 2010; Azrin, Hutchinson & Heck 1966; Rilling & Caplan 1973).

 

No living being behaves out of unconditional kindness.

The dog, cat, parrot, horse, turtle or any other animal species is subject to the same fundamental laws and principles of behavior analysis as every human. This means that any species, any individual, will repeat behaviors that "work" and decrease behaviors that "don't."

 

Giving a treat after a behavior in order to see its frequency increase is therefore not only completely valid but also consistent with the laws of learning. It preserves our relationship of trust with our pets, limits the undesirable side effects of certain procedures, and agrees with what science tells us.

 

A better understanding of the term “dominance” and the dynamics behind it could help us improve our relationship with our dogs. Clive Wynne (2021) recently published an article bringing together a wealth of research on dominance in dogs; it concludes that dogs do experience dominance, but that it works very differently between dogs than it does between dogs and humans.

 

*Other objections to the use of treats as reinforcers can be heard, such as the idea that it is a form of manipulation.

 

There is an important difference between bribery, manipulation, blackmail and reinforcement.

The first three are conditions set up before a behavior occurs.

The reinforcer, on the other hand, comes after the behavior and rewards the learner, in order to:

- see the behavior repeated more often in similar situations

- increase the learner's positive emotional response in this situation

 

Food does not act here as a currency of exchange but as a learning tool.

 

*The last objection I would like to talk about is the fear of having to use treats for life, for every behavior we ask of our pet.


While it is correct to say that a behavior must be regularly reinforced in order for it to be maintained over time, the reinforcers can be varied.

The use of treats in the creation and modification of behavior has many benefits:

- It's easy (and ease is something really great in life, unlike complexity which is something that really sucks in life)

- It's fast

- It's instinctive

- It makes the animal happy

I start almost all my training protocols with treats, using at least 5 kinds with different values, sizes, shapes and tastes.

When my animal's behavior becomes fluid, when he responds to my cue without latency, and when his body language looks like that of the happy dogs in Disney movies, I start to vary my reinforcers and sometimes even to reduce the reinforcement rate.

For example, when I taught Marley, my Czechoslovakian wolfdog, to come home from the garden when I called him, he always got a very high value treat when he came back.
Then, once he came back as soon as I called him 100% of the time, I started to sometimes give him a very high value treat, sometimes a more "standard" treat, sometimes lavish praise in a very high-pitched singsong voice, sometimes a scratch on his ass, sometimes nothing more than closing the door behind him, and sometimes I let him go back outside again.

The use of reinforcers on a very regular basis is imperative to maintain the frequency of a behavior, but the rate at which these reinforcers are delivered may vary.

 

The different schedules of reinforcement

 

We have seen that for a behavior to be maintained or increased, it must be reinforced. The rules that determine when a behavior is reinforced are called schedules of reinforcement, and they differ in how often a reinforcer is delivered.

 

The simplest schedule of reinforcement is continuous reinforcement: each occurrence of the behavior is reinforced.

 

Every time my parrot does X, reinforcer Y is there.

For example, every time my parrot lands on my arm when I say "Hop", I give him a treat.

 

When reinforcement is no longer systematic and continuous, it is intermittent.

There are several intermittent reinforcement schedules; here are some examples:

 

· Fixed ratio

 

In a fixed ratio schedule, one reinforces only a certain number of responses instead of reinforcing all occurrences of the behavior as continuous reinforcement does.

 

The number of responses is the same between each reinforcement.

Example: I give my dog a treat after he sits down three times.
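The counting rule behind a fixed ratio schedule can be sketched in a few lines of Python. This is only an illustration of the rule described above: the names `fixed_ratio` and `sit` are hypothetical, and the ratio of 3 is taken from the example.

```python
def fixed_ratio(ratio):
    """Build a response counter for a fixed ratio schedule.

    The returned function is called once per response and returns True
    when that response earns a reinforcer (every `ratio`-th response).
    """
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == ratio:
            count = 0  # the count starts over after each reinforcer
            return True
        return False

    return respond


# Hypothetical session: the dog sits seven times on a fixed ratio of 3.
sit = fixed_ratio(3)
outcomes = [sit() for _ in range(7)]
print(outcomes)  # [False, False, True, False, False, True, False]
```

With a ratio of 1, every response is reinforced, which is simply continuous reinforcement.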

 

One effect to watch for with this type of intermittent reinforcement is the pause that follows reinforcement (the “post-reinforcement pause”) before the learner resumes the series of behaviors needed to earn the next reinforcer. This pause can sometimes look like fatigue, but its main function appears to be escape from the aversive aspect of the reinforcement rate (Perone, 2003). The higher the fixed ratio (for example, reinforcement every 50 sits) and the weaker the reinforcer (a single treat after 50 sits, then another series of 50 to earn the next one), the longer the pause will be.

 

Ultimately, the harder you have to work to get a reinforcer, the less you will want to work.

 

· Variable ratio

 

In a variable ratio schedule, the number of responses required before reinforcement varies, but around a set average.

For example, if I want my push-up behavior to be reinforced every 5 push-ups on average, I will generate several numbers that average 5, and the reinforcement will arrive, for example, after a series of 2 the first time, then a series of 3, then another series of 3, then a series of 7 and finally a series of 10.

In this type of intermittent reinforcement, the post-reinforcement pause may appear but will generally be shorter and occur less often.
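One possible way to generate such a series is sketched below in Python. It is an assumption about how the numbers could be drawn, not a method given in this article: the function name `variable_ratio_series` is hypothetical, and the approach (cutting a fixed total into random positive parts) is just one way to guarantee the average.

```python
import random


def variable_ratio_series(mean, n, rng=None):
    """Return n positive whole numbers whose average is exactly `mean`.

    The total number of responses (mean * n) is cut at n - 1 random
    points, so the parts vary but always add up to the same total.
    """
    rng = rng or random.Random()
    total = mean * n
    cuts = sorted(rng.sample(range(1, total), n - 1))
    bounds = [0] + cuts + [total]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Five series of push-ups averaging 5, as in the example above.
series = variable_ratio_series(mean=5, n=5, rng=random.Random(0))
print(series, sum(series) / len(series))
```

The series 2, 3, 3, 7, 10 from the example is one possible result: it adds up to 25, which is 5 on average over 5 reinforcements.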

 

· Fixed duration

 

Another type of schedule is the fixed duration schedule.

Here, the behavior must continue for the whole of a previously defined period of time in order to be reinforced.

For example, my pig has to stay on his station for exactly 20 seconds to get a back scratch.

 

 

· Variable duration

 

In a variable duration schedule, the amount of time required before reinforcement varies randomly, but around a set average.

For example, if you want your child's piano playing to be reinforced every 10 minutes on average, you will deliver a reinforcer after varying durations between 5 and 15 minutes. The reward will sometimes come after 11 minutes, then 8, then 5, then 11, and so on. We sometimes accept the "work" (playing the piano here) more easily when we know that it may last 15 minutes but may also last only 5.
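As a rough sketch, the durations for the piano example could be drawn as follows in Python. The article only says the time varies randomly; drawing whole minutes with equal probability between 5 and 15 is an assumption made here for illustration, and `variable_durations` is a hypothetical name.

```python
import random


def variable_durations(low, high, n, rng=None):
    """Draw n durations in whole minutes, each between low and high inclusive.

    Every value in the range is equally likely (an assumption), so the
    long-run average sits in the middle of the range.
    """
    rng = rng or random.Random()
    return [rng.randint(low, high) for _ in range(n)]


# Durations between 5 and 15 minutes; over many draws the average nears 10.
durations = variable_durations(5, 15, 1000, rng=random.Random(1))
average = sum(durations) / len(durations)
print(durations[:5], round(average, 1))
```

Over a handful of reinforcements the average may drift from 10, which is why the schedule is described as staying around an average rather than hitting it exactly each time.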

 

Giving a treat is not that complex!

 

In the end, if the action of giving a treat is perhaps not complex in itself, understanding all the theory behind it makes it possible to better consider this stimulus as a communication tool rather than as manipulation.

 