How do horses learn to respond to cues?
Often overlooked in the horse world, a solid understanding of Learning Theory is a huge asset when working with horses.
Every horse rider knows that there is a set of ‘aids’ we give to horses that tell them what to do when we are riding, and also on the ground. For example, we have leg aids that tell the horse to move forward. Horses learn what these cues mean through training. What you may not know is that there are several different ways a horse can learn to respond to cues.
When you understand the different ways a horse can learn to respond to cues, you can adapt your training to best suit the individual horse and what you’re trying to teach. It also means you can really tackle the root of any problem instead of inadvertently masking or suppressing a behaviour, which might just be a temporary quick fix.
There are four broad categories – or ‘quadrants’ – that describe how horses – and other animals – learn to respond to cues through ‘operant conditioning’ (a topic for another time). These are positive reinforcement, negative reinforcement, positive punishment, and negative punishment. Here is a simplified overview of the four quadrants that I hope will come in handy with behavioural problem-solving and assessing different training approaches.
The word ‘positive’ in ‘positive reinforcement’ has nothing to do with it being good or bad. It is a purely descriptive word that refers to the addition (+) of something. In this case we are talking about adding something that will increase the frequency of (reinforce) a behaviour.
From our perspective as trainers, we want to add something to the horse’s experience that makes them choose to repeat a behaviour that we want in response to a particular cue.
An example would be giving a horse a food reward when they come to call or giving them a treat when they put their muzzle through the noseband of a headcollar. By giving them something they like when they produce the behaviour that we want, we make them want to repeat the behaviour in future. So every time they hear us calling their name (a vocal cue) or see the headcollar (a visual cue) they come over and catch themselves!
Clicker training is probably the most familiar form of positive reinforcement-based training, though a clicker is by no means required to use positive reinforcement.
Positive reinforcement is very widely used in training all kinds of animals in professional and domestic contexts the world over. Captive wild animals like rhinos and tigers are trained to perform specific behaviours to make medical procedures possible or general husbandry easier. Sea lions are trained to perform for audiences in zoos. And police, military and service dogs are trained to perform many kinds of complicated tasks. In the scientific community – and the vast majority of the professional animal world – it is accepted as the most effective way to train new behaviours across species.
Despite this, it is still unusual to see positive reinforcement being used with horses. This isn’t because it isn’t effective – in fact, it has been shown in various studies to be both quick and effective in a variety of contexts. More likely the reason positive reinforcement isn’t more widely used is that it is quite simply very poorly understood in the horse world.
Traditional methods of training rarely involve positive reinforcement, and many myths now surround the use of food in training, largely because improper use can lead to behavioural problems. Positive reinforcement is probably the most powerful tool we have with which to train horses, but precisely because it is so incredibly effective, it is also very easy to misuse and train the wrong thing!
Negative reinforcement is the other type of reinforcement and, again, ‘negative’ doesn’t mean good or bad. It simply refers to the removal or subtraction (-) of something, where that removal encourages (reinforces) the future repetition of a desired behaviour.
An example of using this in training would be applying the leg aid and releasing as soon as the horse begins to move forward. The release of pressure is the removal of something that the horse doesn’t like. With some repetition the horse learns that if they move forward, the unpleasant sensation stops. So next time you begin to squeeze, the horse moves forward straight away in order to stop the unwanted pressure. Eventually your legs provide a tactile cue that the horse learns to respond to in a specific way.
Negative reinforcement is the most widely used of these concepts in regular horse training. Almost all of riding, traditional training, Classical training, ‘natural horsemanship’, and variations thereof rely heavily on a horse’s ability to learn through negative reinforcement. Even the most basic things we do with a horse – such as leading them with a headcollar – rely on negative reinforcement (pressure-release) to work.
Positive punishment is probably the most controversial of the four quadrants and causes the most upset! Punishment is the opposite of reinforcement: instead of increasing the frequency of a behaviour, we’re looking for a decrease in the frequency of a behaviour.
As before, the ‘positive’ part refers to the addition of something. So what positive punishment really means is adding something to the horse’s experience that reduces the frequency of a behaviour.
A straightforward example of this is smacking a horse that has just refused a jump. The hope when doing this is that the horse will make a connection between the smack and the refusal so that the next time they approach the jump they don’t refuse, in order to avoid being smacked. In other words the rider is adding pain to reduce the frequency of refusals. In practice using positive punishment this way can backfire. It’s really important to thoroughly understand the root cause of a behaviour before resorting to positive punishment.
Milder forms of positive punishment are things like making a horse back up as a punishment for barging into you or generally making a horse ‘move their feet’ if they do something you want to suppress. In practice there is an overlap between negative reinforcement and positive punishment. In certain contexts the two are indistinguishable from the horse’s perspective.
When administering any kind of punishment, it’s really important to remember that masking or suppressing behaviours does not address their root cause. If you ignore the root cause of a behaviour in favour of a quick fix, you may find that the horse reverts in future when the motivation to behave a certain way surpasses the fear of repercussion.
A typical example of this is when a horse is bucking due to discomfort and the bucking behaviour is suppressed with punishment. If the source of discomfort is never addressed and worsens, the horse is likely to eventually ‘explode’.
Unwanted behaviours frequently begin as more subtle indicators of discomfort or unhappiness such as flattened ears, tension or tail swishing. These may go unnoticed for a very long time but can develop into more problematic behaviours like bucking and rearing, which are then met with punishment.
One way of thinking about this progression is as the horse trying to communicate. At first they talk quietly but as their situation doesn’t improve and their discontent grows, they start shouting. If we then silence them with punishment, they might be quiet for a little while – but when they eventually reach the end of their tether, the protest is likely to be explosive.
Negative punishment is probably the trickiest form of punishment to understand. It is the removal of something the horse wants, resulting in a decrease in the frequency of a behaviour.
A human example of this is taking a teen’s mobile phone away as punishment for not doing their homework. Negative punishment works in this context because you can explain to a teenager why their mobile phone is being taken away. This isn’t possible with a horse!
Sometimes horse people try to use negative punishment in the same way as it would be used by parents of a disobedient child. For instance, an owner might not give their horse their dinner because the horse bucked them off during their ride. Since the horse has no way of knowing why their dinner hasn’t been brought to them or that it was a consequence for an earlier behaviour they exhibited, this is a completely ineffective thing to do.
However, it is still possible to use negative punishment in some situations – an example would be removing food the horse is eating to stop pawing behaviour and returning the food once the horse is standing still. Here you can also see the overlap with positive reinforcement – taking the food away is negative punishment, but returning it is positive reinforcement.
Negative punishment can also be used unintentionally when shaping behaviours learned through positive reinforcement. An example is withholding a food reward when your horse offers a different behaviour to the one you want to reward. Because the horse expects a food reward, not giving it to them is effectively the same as taking it away and can lead to some degree of stress and frustration.
This overlap with negative punishment is one of the reasons the rate of reward is so important when using positive reinforcement, along with the general availability of food and the perceived value of the rewards we are using. It is also one of the reasons we need to make an effort to break complex behaviours down into more manageable segments.