360 Feedback: Overcoming the positivity bias
01 Jan 2012
Clients often ask us to help them introduce more ‘edge’ into their 360s. They are frustrated that 360 feedback ratings invariably err toward the positive, leading to bland 360 reports that are hard to interpret and add little to an employee’s clarity about what to improve. Recent analysis conducted by YSC Online found that simple steps, such as changing the rating scale, can make a meaningful difference. Our main findings and practical tips are outlined below:
Ratings as observation, evaluation or opinion:
Over the past 10 years YSC Online has run over 170 variations of the classic 360 survey using many different types of items. In a new piece of research we analysed the most frequently used item types to see which worked best in overcoming positive bias. Three types of item stood out as the most popular:
- Observation or frequency ratings – how often does this person do something – e.g. Never – Always
- Evaluation or developmental ratings – how skilled is this person in doing something – e.g. Significant Development Need – Significant Strength
- Opinion or agreement ratings – to what extent does the rater agree with a description of this person doing something – e.g. Strongly Disagree – Strongly Agree
- Frequency scales are the most commonly used. They are often favoured by clients as they encourage respondents to rate based on what they have actually seen. 4 and 6 point scales display a relatively low bias toward the positive end of this type of scale. However, approximately 95% of ratings on a 5 point scale err towards the positive.
- Development scales are especially popular when clients want clear cut information on strengths and development needs. Here, a 5 point scale produces the widest distribution of ratings with only 59% being “Strength” or above. Ratings on the 6 point scale are slightly less spread out. 4 point scales are the least effective with 90% of ratings being on the positive end of the scale.
- Clients often opt for agreement scales as they are easy to understand and allow for more creative descriptions of behaviour, attitude, etc. However, they display the greatest bias for positive responses. On 4 and 6 point scales, 95% of respondents chose “Agree” or “Strongly Agree”. 5 point scales appear slightly less skewed with 13% selecting “Neither Agree nor Disagree”.
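To make the three item types concrete, they can be sketched as fully labelled scales together with a simple measure of positive skew. This is an illustrative sketch only: the label wordings and the helper function are assumptions based on the scale endpoints named above, not YSC Online's actual survey items.

```python
# Hypothetical sketch: three common 360 item types as fully labelled
# 5-point scales, plus a helper that measures how skewed a set of
# responses is toward the positive end. Labels are illustrative.

FREQUENCY_5 = ["Never", "Rarely", "Sometimes", "Often", "Always"]
DEVELOPMENT_5 = [
    "Significant Development Need", "Development Need",
    "Neither Strength nor Development Need", "Strength",
    "Significant Strength",
]
AGREEMENT_5 = [
    "Strongly Disagree", "Disagree", "Neither Agree nor Disagree",
    "Agree", "Strongly Agree",
]

def positive_share(responses, scale):
    """Fraction of responses falling above the scale's conceptual midpoint."""
    midpoint = (len(scale) - 1) / 2  # e.g. index 2.0 on a 5-point scale
    indices = [scale.index(r) for r in responses]
    return sum(i > midpoint for i in indices) / len(indices)

ratings = ["Agree", "Strongly Agree", "Agree", "Neither Agree nor Disagree"]
print(positive_share(ratings, AGREEMENT_5))  # 0.75 - three of four are positive
```

The same helper works for even-numbered scales: with no physical midpoint, any rating in the upper half counts as positive, which is exactly the forced choice discussed below.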
We explored these findings further using classic cognitive psychology as our guide:
People will agree with just about anything: Agreement scales are widely used. However, our research highlighted a clear psychological hazard when using them – the ‘positive response bias’. In 5 and 6 point scales, over half the respondents plumped for “Agree”. In 4 point scales, half opted for “Strongly Agree”. This is in line with a 2005 review of over 100 studies that concluded “respondents are inclined to agree with just about any assertion, regardless of its content”.
Neutral feedback is a valuable anchor: Scales with an even number of ratings force respondents to make a choice. This may seem like a good idea as it prevents respondents from ‘sitting on the fence’. But in 360 feedback, people rarely have strong opinions on all aspects of an individual’s behaviour. Forcing the respondent to make a positive or negative rating creates a false dichotomy. When forced to make a choice, respondents usually prefer to be optimistic and give positive ratings – leading to a positive tendency overall across the feedback.
This is easily avoided when rating scales have a natural neutral mid-point (e.g. ‘Neither Agree nor Disagree’). This acts as a psychological anchor between positive and constructive ratings. It must be noted that neutral feedback is qualitatively different from a “can’t say” response. The former expresses an absence of opinion, the latter an absence of opportunity for observation.
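The distinction matters when scores are aggregated: a neutral rating is a genuine data point that belongs in the average, while a “can’t say” response should simply be excluded. A minimal scoring sketch, with an assumed numeric mapping for a 5-point agreement scale (the function name and values are illustrative, not from the analysis above):

```python
# Hypothetical scoring sketch: neutral ratings count toward the average,
# while "Can't Say" responses are excluded, because they reflect a lack
# of opportunity to observe rather than an opinion.

CANT_SAY = "Can't Say"

# Assumed numeric mapping; 3 is the neutral midpoint.
SCORES = {
    "Strongly Disagree": 1, "Disagree": 2,
    "Neither Agree nor Disagree": 3, "Agree": 4, "Strongly Agree": 5,
}

def average_rating(responses):
    """Mean score over rateable responses; neutral kept, Can't Say dropped."""
    scored = [SCORES[r] for r in responses if r != CANT_SAY]
    return sum(scored) / len(scored) if scored else None

responses = ["Agree", "Neither Agree nor Disagree", CANT_SAY, "Strongly Agree"]
print(average_rating(responses))  # 4.0 - averaged over the three rateable answers
```

Treating “can’t say” as a zero or a midpoint instead would silently drag the average toward one end of the scale, conflating missing observation with a real opinion.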
Bringing order has its problems: A common route to avoiding inflated ratings is to force respondents to choose strengths and development needs from a list and rank them in order of importance or frequency. This creates a clear set of relativities and can also reduce the time taken to complete the survey. However, it creates other issues in the interpretation and relevance of the feedback: it doesn’t support comparisons, e.g. over time or with other people, and in our experience recipients struggle to convert rank-ordered areas into clear development actions. The list of behaviours or competencies presented for ranking also needs to be of a manageable size, otherwise respondents favour the items appearing early in the list, which can further limit the utility of the feedback exercise.
Clarifying for consistency: Rating scales that invite personal interpretation, e.g. those with descriptive labels only at each end of the scale, are often believed to provoke more thought and thus more astute evaluation of performance, on the grounds that verbal labels are ambiguous. However, we found the opposite was true: having descriptive labels only at the ends of the scale increased the bias for positive ratings. Choosing a rating scale with a verbal description at each point appears to prompt ratings grounded in actual perceptions of behaviour. In addition, numbers have no inherent meaning, so including a verbal label at each point helps respondents latch onto nuances of difference and makes completing the survey easier and quicker.
Tips for overcoming the ‘positive response bias’ in your 360s
- Avoid agreement scales – Using frequency or development scales may mean slightly altering the survey items but this extra thought upfront will pay off when respondents receive more balanced feedback.
- Use rating scales with a mid-point – The 5 point development scale is the best to use. In our experience it is the least skewed, with only 57% of responses sitting at the positive end of the scale and a more even spread of ratings across the full spectrum.
- Label all ratings – Fully labelled scales display less of a bias for positive ratings and also ensure that respondents understand the meaning of each point on the scale. People prefer using rating scales with verbal descriptors.
- Mix and Match – Include some rank ordering items along with more traditional scaled quantitative items and open-ended qualitative questions to get the richest form of feedback.