OKRs Scoring: A Brief History
Your approach to scoring Key Results is one of your most important deployment parameters. My recommended scoring system is the coolest thing I’ve created and applied with my OKR clients. You can find an updated approach to scoring in the book I wrote with Paul Niven on OKRs in 2016 that removes the “0.5” as it often does not add value when defining scoring upfront. This post includes a brief history of scoring in OKRs. I hope it gives you context for thinking about how best to score your OKRs! Note: I do not recommend scoring Objectives. If you wonder why, please post a comment and let’s get a conversation going!
Part 1: Binary Scoring
When I first got going with OKRs about 5 years ago, we did not apply a scoring scale. Each Key Result was either achieved or not. Things were simple. It was binary. If your Key Result was “10 new customers by end of quarter” and you ended the quarter with 9 new customers, the Key Result was not met.
In fact, it was assumed that you’d hit 10 customers midway through the quarter, cross out the 10 and raise the bar to 15 and then end the quarter with 20. This approach is sometimes referred to as “set the bar high and overachieve.”
It was an unwritten rule that teams that achieved their Objectives were celebrated and their members more likely to be promoted. A team was successful to the extent that its OKRs were achieved. To be clear, individuals on a team that achieved its Objectives were more likely to get a bonus. After all, shouldn’t a bonus be tied to success?
This system didn’t always work well, nor did it claim to be perfect. Suppose a team had the Key Result “10 new customers” but ended up with 9. Under a binary scoring system, there would be a sense of failure: 9 was interpreted as “falling short.” Not by much, but still, the feeling was one of losing. In summary, this approach to scoring was diametrically opposed to the culture of OKRs at Google, where fully achieving all of your OKRs is itself a warning sign that you didn’t aim high enough.
Part 2: Google-style grading on a 0-1 scale
I later learned how Google grades OKRs, around the time the Google Ventures video came out in 2013. The idea is to standardize how all OKRs are scored across the organization: a score of “1” reflects complete achievement; a score of “0” reflects no progress. Google’s culture values stretch goals, so much so that scoring all 1s on your OKRs means you didn’t set your goals high enough. I recently heard a story about a Googler who set goals very high and then went on to achieve all of them. Apparently, everyone assumed he had sandbagged. I’m not 100% sure this story is true, but given the number of people working at Google, this scenario has likely occurred more than once.
Google’s normalized scoring model can be very effective. It gives everyone a shared way of measuring success. While it may not be perfect, it probably solves far more problems than it creates. If nothing else, it standardizes conversations and streamlines communication about performance against Objectives.
Part 2a: Adding “pre-scoring” to Google-style grading
As an OKRs coach, I find that most organizations that implement a scoring system score their Key Results either at the end of the quarter only or at several intervals during the quarter. However, they often do not define scoring criteria as part of defining the Key Result itself. If you want to use a standardized scoring system, scoring criteria for each Key Result SHOULD be defined when the Key Result is created. The conversation about what makes a “0.3” or a “0.7” is not very interesting unless we translate those numbers into English. After discussing this with Vincent Drucker — and yes, Vincent is Peter Drucker’s son — I arrived at the following guidelines, which my clients are finding very useful:
Key insight from OKRs coaching: clearly define OKRs with a consistent scoring system for every Key Result.
Grading Key Results
- When – At the beginning of the Quarter as part of defining a Key Result
- Make the “rules of the game” clear
- How – Apply a 0-1 scale as follows:
Here’s an example showing the power of defining scoring criteria upfront for a Key Result.
Key Result: Launch new product ABC with 10 active users by end of Q3
- 0.3 = Prototype tested by 3 internal users
- 0.7 = Prototype tested and approved with launch date in Q4
- 1.0 = Product launched with 10 active users
Notice that the “1.0” is identical to the Key Result itself, so it need not be listed separately. Also, the 0.5 level is optional and not typically used. Defining these levels forces a conversation about what is aspirational versus realistic. The Engineering team may come back and say that even the 0.3 level is going to be difficult. Having these conversations before finalizing the Key Result ensures everyone is on the same page from the start.
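To make the mechanics concrete, here is a minimal sketch of how a pre-scored Key Result could be represented and graded. The data structure, function name, and grading-by-lookup logic are my own illustration, not part of any official OKRs tool:

```python
# Illustrative sketch of "pre-scoring": the Key Result carries its
# scoring criteria, agreed upfront, so end-of-quarter grading becomes
# a simple lookup rather than a debate.

key_result = {
    "statement": "Launch new product ABC with 10 active users by end of Q3",
    "criteria": {
        0.3: "Prototype tested by 3 internal users",
        0.7: "Prototype tested and approved with launch date in Q4",
        1.0: "Product launched with 10 active users",
    },
}

def grade(key_result, achieved_criteria):
    """Return the highest score whose criterion was met, else 0.0."""
    met = [score for score, criterion in key_result["criteria"].items()
           if criterion in achieved_criteria]
    return max(met, default=0.0)

# End of quarter: the prototype was tested and approved, but not launched.
score = grade(key_result, {
    "Prototype tested by 3 internal users",
    "Prototype tested and approved with launch date in Q4",
})
print(score)  # 0.7
```

The point is not the code itself but that the “rules of the game” are frozen when the Key Result is written, so the grade is unambiguous later.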
*An alternative approach to scoring Key Results, which I first heard about from a super-cool colleague, emphasizes the future rather than the past. For more on this, please refer to my white paper.
Part 3: Predictive scoring
Most organizations that approach me for help with OKRs do have some form of scoring. However, their scores focus exclusively on “progress to date”: you wind up with a data point for each Key Result in the form of “X% complete.” That has some value; however, more and more OKR users add a predictive element to their scoring. Let’s go back to the “10 new customers” Key Result to see why predictive scoring is gaining so much traction.
Say you signed 6 customers in the first month of the quarter. Great, you’re 60% complete! However, say you do not believe your team will sign additional customers because the pipeline is dry or a key sales rep just left for a better gig. If you had a way of communicating that you’ve lost confidence and feel this Key Result will stay at 60% and will not be met, you could alert your colleagues. Predictive scoring serves as an early-warning system that helps manage expectations and keep leadership informed.
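The difference between the two models can be sketched in a few lines. The function names and the forecast input are hypothetical simplifications of the idea, under the assumption that the team can estimate what it still expects to land:

```python
# Historical scoring reports progress to date; predictive scoring adds
# a forecast of where the Key Result will actually land.

def historical_score(achieved, target):
    """Percent complete so far."""
    return achieved / target

def predictive_score(achieved, target, expected_additional):
    """Forecast of final completion, capped at the target."""
    return min(achieved + expected_additional, target) / target

# "10 new customers": 6 signed in month one, but the pipeline is dry.
print(historical_score(6, 10))        # 0.6 -- looks healthy
print(predictive_score(6, 10, 0))     # 0.6 -- early warning: it will STAY at 0.6
```

Both numbers are 0.6 here, but they say very different things: the first describes the past, while the second tells colleagues the Key Result will not be met.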
I predict scoring systems will continue to evolve. While some organizations first getting started with OKRs may not grasp the importance of scoring, my prediction is that scoring will continue to be one of the most critical variables to get right for your organization to ensure a successful OKRs deployment for the long term. Some organizations may wind up taking a hybrid approach that combines predictive and historical scoring for OKRs.
However you score OKRs, remember the intent is to communicate targets, manage expectations, and enable continuous learning. Please share your approach to scoring here or contact me, Ben@OKRs.com, if you’d like to discuss privately.
*My colleague is Christina Wodtke. She writes: “Status toward OKRs: If you set a confidence of 5 out of ten, has that moved up or down? Have a discussion about why.” Source: https://eleganthack.com/monday-commitments-and-friday-wins/
Loved the read, really good information about scoring OKRs!
I also just read your “2×2 Matrix for Deploying OKRs,” and I think it would be a good idea to show people who are getting started with OKRs how to begin scoring, so if you haven’t written about that, please consider it 🙂
I see OKRs as a way to stretch what we think we are capable of doing, so the score in itself is not that important to me; the progress that is made is. I like to track confidence in achieving a KR so that discussions can emerge and work can be managed around the confidence level (try to bring it up!).
Thanks for sharing!
Thanks Carlos! I like the idea of creating a template for helping define the scoring criteria for Key Results. I’ve written about it but will create a more practical worksheet with examples. In fact, I’m just now creating the OKRs Library, which will have examples of OKRs along with a summary of how the coaching conversation led to the Key Result scores. You have to pay to access it, but since you posted a comment on my blog, send me an email at Ben@OKRs.com and I’ll send you a code for free or nearly free access for a year.
And yes, more and more organizations are using “predictive scoring” since that keeps everyone focused on the future rather than trying to measure the past. This makes sense to me given that most teams already have systems in place to measure historical progress.
Hi Brooks, Thanks for the comment!
As you know, OKRs are really part of the DNA at Google and exist on a massive scale, so my recommended approach may or may not be relevant there. To my knowledge, no one is using a “pre-scoring” approach at Google, so if you do try out pre-scoring, please let me know!
Key Results that are “a prerequisite for some other project” are often treated as “tasks” that need not be measured, but rather are important things we need to do in order to drive a larger outcome. In such cases, we must ask, “What are the Key Result(s) associated with the ‘other project’?”
We can then jointly own that Key Result (with the team that owns the ‘other project’) and score the single Key Result together. Taking the time to jointly define the Key Result can increase cross-team alignment: we can agree on what amazing looks like (1.0), on what we expect to achieve with business-as-usual effort and no luck (0.3), and on a target, even though the Key Result is highly dependent. Having these conversations is often very valuable for my clients. Happy to elaborate!
However, in many cases the Key Result is not a “task.” In such cases, your “partial answer” turns out to be the solution many teams wind up using. Draft key results take the form “Deliver X to team A so that they can achieve KR Z.”
Sample Key Result with scoring: X is delivered to team A by day 20 of Q2 (0.3 = day 60, 0.7 = day 45)
In this case, you’re communicating that X will almost certainly be ready by day 60, so the other team can plan on that. You’re targeting day 45, and you feel it would be amazing, yet possible, to have it ready by day 20.
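A schedule-driven Key Result like this can be graded the same way as any pre-scored one. The following sketch encodes the day thresholds from the sample above; the function name is hypothetical and the mapping is just one reasonable reading of the pre-scores:

```python
# Illustrative sketch: map the actual delivery day onto the
# pre-agreed thresholds (1.0 = day 20, 0.7 = day 45, 0.3 = day 60).

def schedule_score(delivery_day):
    """Earlier delivery earns the higher pre-agreed score."""
    if delivery_day <= 20:
        return 1.0
    if delivery_day <= 45:
        return 0.7
    if delivery_day <= 60:
        return 0.3
    return 0.0

print(schedule_score(45))  # 0.7 -- hit the target
print(schedule_score(62))  # 0.0 -- missed even the "take to the bank" date
```

Because the thresholds were agreed upfront, the dependent team knows exactly what each date means without a follow-up conversation.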
Taking the time to define and specify what an amazing outcome looks like, versus what we can “take to the bank” upfront is often time very well spent.
This is particularly interesting to me, since we’re in the midst of setting our 2017 OKRs on my team at Google.
One thing that occurred to me in thinking about how to apply this to our team’s general practice is that we seem to have two types of KRs, down at the individual-team level. There are the ones where we are aiming for something that’s a goal in itself, and there are some where it’s a prerequisite for some other project. For the former, it makes sense to set the goal as something that’s a stretch, and there’s value in overachieving. For the latter, though, overachievement is often wasted effort; once the other work is unblocked, further work isn’t helping the overall objective. So how does this scoring system fit into the “prerequisite” sorts of goals, and how do you define the pre-scores there?
Thinking about a partial answer, I suppose that one stretch that may make sense for some of these is schedule-driven. If our hope is to have the other work unblocked by the end of the quarter, then a stretch would be to have it unblocked in two months, or even one month. On the other hand, sometimes even that doesn’t matter — I have one that’s “have this thing in place by the time it’s needed at end-of-quarter,” and getting it in a month early probably just means there were other more-urgent things I should have worked on. Maybe for those, the right thing is just to be clear about that in the scoring: the metric for 0.5 is the same as for 1.0, because “what we know we can get done” is the same as “the most we could hope to get done.” If it doesn’t get done, we score a 0.3 at best; otherwise it’s a 1.0.
Thoughts? How does this work for other teams?