Performance indicators are among the most powerful management tools a leader has. Set the right work goals, track progress against them accurately, and do these two things well, and a company can accomplish almost anything. But to be effective, work objectives and assessment metrics must be simple and clear, and the fewer of them the better.
As our company, Agoda, has grown, we have found that simplifying our assessment metrics makes it far easier to achieve our goals. Previously we set too many goals, which made it hard to align opinions even within a department, let alone coordinate across departments; employees struggled to decide what to do, and the expected results never materialized. So we changed our approach and adopted a single KPI (key performance indicator) for the main customer-facing function, the department responsible for designing, building, and maintaining the company's online stores. A single key metric could be applied uniformly across teams, supporting better-informed investment decisions. At the same time, we realized that a single KPI with no constraints can easily produce unexpected and serious consequences. Any basic goal must come with a matching constraint, which can be summed up in one sentence: "Maximize X without reducing Y."
Agoda is the Asian subsidiary of the online travel group Booking. In this article we share the company's experience optimizing business operations with a "single KPI + constraints" approach. The process involved plenty of trial and error: after testing the metrics we established, we found mixed results. Some had a positive motivating effect on employee behavior and performance; others did not. Gradually we distilled a set of guiding principles and put them into practice, improving the company's performance and fostering a culture of learning and cooperation.
We believe these lessons can help other companies, e-commerce and non-e-commerce alike, which is why we share them in this article.
Exploring the metric system through testing
From its founding, Agoda has made continuous optimization of the front end, that is, the interface of the company's online stores, its core work. Consumers log in to the website or mobile app to search for, select, and purchase travel products, and our goal is to convert more of those online visits into actual product sales. Raising this conversion rate positions a digital business more effectively, because retaining existing customers is easier than acquiring new ones. And once visits are converted into purchases, the resulting revenue can be used to improve marketing effectiveness, driving still more conversions and higher returns in the future.
Given all this, conversion rate seemed the perfect basic KPI, with return on investment (ROI) as the obvious constraint. (Why waste energy on a conversion-rate goal that is hard to hit and generates no future revenue?) The common practice for calculating conversion rate is to divide sales by visits. Unfortunately, measuring traffic is not as simple as it sounds: many factors distort it, including unruly bots, marketing campaigns, and indirect customer visits to the website.
We were clear that we wanted to drive revenue through higher conversion rates, and to do so before our competitors. But if traffic numbers were unreliable, how could we measure progress accurately and reach the desired goal? To get our heads around it, we started with a series of small experiments: what platform tweaks could improve conversion rates? We ran A/B tests, and for each customer segment we kept a control group and tested one small element at a time (a color, a button, a picture, or a short message such as "Limited number of rooms!" or "You made the right choice!"). Whichever variant produced the higher conversion rate was considered the winner. Winning options were then integrated into the code and pushed to more customers.
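The winner-picking logic described here can be sketched as a standard two-proportion z-test. This is a minimal illustration in Python with made-up numbers, not Agoda's actual testing system:

```python
from math import sqrt
from statistics import NormalDist

def ab_test(conversions_a, visits_a, conversions_b, visits_b, alpha=0.05):
    """Two-proportion z-test: does variant B convert better than control A?"""
    p_a = conversions_a / visits_a
    p_b = conversions_b / visits_b
    # Pooled conversion rate under the null hypothesis of no difference.
    p = (conversions_a + conversions_b) / (visits_a + visits_b)
    se = sqrt(p * (1 - p) * (1 / visits_a + 1 / visits_b))
    z = (p_b - p_a) / se
    # One-sided p-value for "B is better than A".
    p_value = 1 - NormalDist().cdf(z)
    return p_b > p_a and p_value < alpha

# Control: 10,000 visits, 420 bookings. Variant: 10,000 visits, 495 bookings.
print(ab_test(420, 10_000, 495, 10_000))  # True: the lift is significant
```

In practice each variant would be evaluated per customer segment, and borderline results re-run rather than shipped.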
Discovering winning options is not easy. In many cases, solutions we were optimistic would increase sales simply did not work. Roughly 80% to 90% of the ideas we tested came from very smart employees, yet they failed to deliver the expected results, and sometimes even backfired. For example, telling a customer to "book now or the room will be gone" may prompt a booking, but it may also annoy the customer into abandoning the booking process. Modifying the code can also introduce unexpected bugs. Such missteps are unavoidable along the way.
Experiments let us judge which solutions were feasible and which were problematic. At first our experiments were one-offs. Then, to learn from them more effectively, we built a central control system through which we could log in to the relevant applications, analyze test results in detail, modify the website, and evaluate the effect of each modification. With this experimentation "engine," we had a more unified and controllable basis for making decisions and evaluating conversion effects. From it we derived a fundamental evaluation metric for our early work: the speed of experimentation.
Speed is a double-edged sword
Having understood how important experimentation was to front-end decisions, we quickly realized the company needed to speed up the experimentation engine. To convert more online visits into actual sales, we set velocity, the number of experiments we could run per quarter, as the basic KPI. If too few experiments are run, no matter how good each one's results, the company's progress is not assured.
By continually subdividing experiments and testing one option at a time, we found and fixed problems faster and faster. We also had managers specializing in engineering and process management optimize the workflow. At first we could run only a few dozen front-end experiments per quarter; within a few quarters the number had grown past 1,000. Although this surfaced many ways to improve conversion rates, the frequent revisions steadily degraded the website's codebase. To put it bluntly, the program accumulated so many bugs that it was ready to blow up. So we added a constraint to the experiment-speed KPI: code quality, measured by the number and severity of bugs.
Pursuing the speed metric while meeting the quality requirement transformed the way we operate. We re-examined the system architecture, looking for ways to make changes faster without breaking the system. To that end, we built software tools on thousands of servers in the company's data centers around the world to consolidate, automate, accelerate, and monitor coding tasks.
Many companies take a week or more to deploy new code; we update four to five times a day. To support that rate, we rethought our systems, staffing, and organizational structure. As the risk of defects grows, quality assurance must improve too. We invested heavily in a network operations center that tracks platform performance; whenever website behavior deviates seriously, the center raises an alarm in real time.
Deviations in website behavior are common, whether from internal changes or external factors. Only with a firm grasp of the data can management determine which deviations are normal and which signal real problems. To that end, the company developed a new system for finding statistically significant changes, even small ones. We also built monitoring tools that record which code modifications altered the system's behavior, letting us trace root causes and fix problems faster, and we invested in data systems that track traffic in each market. When an application shows a clear anomaly, the system quickly identifies it and alerts the network operations center. With these systems we can focus on problems that might previously have been overlooked, improving the responsiveness of our applications.
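A minimal sketch of this kind of statistical deviation check, using a simple z-score rule; the rule, threshold, and numbers are our illustrative assumptions, not the actual monitoring system:

```python
from statistics import mean, stdev

def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it deviates from the historical mean by more than
    `threshold` standard deviations (a simple z-score rule)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Hypothetical hourly bookings for one market.
hourly_bookings = [102, 98, 110, 95, 105, 99, 101, 104]
print(is_anomalous(hourly_bookings, 100))  # ordinary fluctuation: False
print(is_anomalous(hourly_bookings, 40))   # likely a real problem: True
```

A production system would use seasonality-aware baselines rather than a flat mean, but the principle of separating normal variation from significant deviation is the same.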
Revise basic KPIs
The increased speed of focused experiments helped us achieve a key goal: more experiments lifted online room bookings to a degree. But the metric sometimes created the wrong incentive. Some teams ran many tiny experiments just to earn speed points (and the corresponding rewards), without caring whether the experiments actually improved the platform's conversion rate.
What changes would maximize conversion rates? To answer that question, we developed more powerful data-analysis tools. We came to realize that to create greater opportunities, we needed to adjust the basic KPI itself.
We switched the basic KPI to incremental bookings per day (IBPD), our daily new bookings metric, keeping the same constraints: the number and severity of system bugs. The new KPI is simple. In a test with versions A and B, version A (the control group, keeping the previous design unchanged) generates n1 bookings, while version B (with a specific adjustment applied) generates n2 bookings. From these figures, the new version's incremental daily bookings are, in essence, the difference n2 - n1 averaged over the days the test ran.
Both versions A and B bring in room bookings; we choose whichever has the better effect.
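A plausible sketch of the IBPD calculation from n1 and n2, assuming a 50/50 traffic split over the test period; the scaling to full traffic is our assumption:

```python
def ibpd(bookings_a, bookings_b, test_days, traffic_share_b=0.5):
    """Estimate extra bookings per day if variant B replaced control A.

    Assumes variant B saw `traffic_share_b` of traffic during the test,
    so its lift is scaled up to estimate the full-traffic impact.
    """
    lift = bookings_b - bookings_a            # extra bookings during the test
    return lift / test_days / traffic_share_b  # per day, at 100% of traffic

# Over a 14-day test: control produced 4,200 bookings, variant 4,340.
print(ibpd(4_200, 4_340, 14))  # 20.0 incremental bookings per day
```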
This KPI was created to reward teams that use experimentation to create value for the company. We set the same daily new bookings target for several teams at once and let them proceed in parallel and compete with one another. We hoped this would push teams toward larger, bolder experiments rather than sheer experiment counts; we were looking for the experiments that maximize conversions. Teams afraid of failure stick to low-risk, low-value experiments, so to dispel that fear we actively encouraged more ambitious goals, even at some risk. When a team took that path, we adjusted its targets appropriately, giving it room for trial and error without fear of being punished for failure.
The daily new bookings metric delivered real benefits. As we expected, teams shifted their focus from the quantity of experiments to their quality. They began developing better solutions and sharing them with one another. As experiments grew more precise, new discoveries came more easily. And because everyone wanted to find good ideas and build on them, employees actively exchanged success stories. For example, teams learned that when giving customers choices, there should be neither too many options nor too few; the number has to fit, or customers will not book a room. Teams first drew this finding from the central data system's analysis of customer login and transaction behavior, then applied the principle across website design, such as how many hotel search results to show customers or how many photos to display in a gallery.
The new KPI also mattered to management. First, it let us compare conversion performance across teams and decide where to invest more and where to pull back or change strategy. The new assessment method set teams chasing one another, but it also made them more cooperative: employees found that sharing new learnings with other teams ultimately helped everyone reach their goals.
Having improved our online stores' performance by measuring daily new bookings, we realized the right KPI could inform all of the company's business decisions. The marketing department's original basic KPI, for example, was the number of visitors its campaigns brought in over a given period, with ROI (visitor value relative to marketing spend) as the constraint. That metric is limited, because increased traffic does not necessarily mean increased conversions. When the marketing team also began experimenting against daily new bookings targets, conversion rates improved significantly. For instance, we run two versions, A and B, of a campaign in the same period and market, and compare which generates more room reservations. Although external factors can affect a promotion's results and make conversion rates fluctuate, comparing A against B lets us judge far more accurately whether the promotion works. Successful campaign ideas can then be tested in other markets.
Through the daily new bookings metric, management can also better balance different business units (say, adding engineering headcount versus increasing marketing spend), further optimizing the company's investments. The metric shows not only how many conversions we added each quarter and how the conversion rate changed quarter over quarter, but also which factors drove the success.
Evaluating different internal departments with the same KPI gives management an at-a-glance overview of corporate performance, and with that overall view, internal investment decisions are better informed. The daily new bookings metric lets us not only compare team contributions and allocate resources accordingly, but also evaluate the performance of product managers. If they miss their goals, our first reaction now is to move them to a more productive area and see whether their performance improves. In effect, this is an A/B test of the product manager.
Continuously fill gaps and correct deviations
Although adopting daily new bookings as the basic KPI was a big step forward in management, we kept debating: were we heading in the right direction? Our earlier experiment-speed metric had led some people astray; would the new metric invite its own kind of gaming? Were there better evaluation metrics? Were we overlooking anything?
Either way, we understood that the daily new bookings metric would not solve every management issue once and for all. Take programming bugs. We handled the big, obvious ones fairly well, but teams cared little about small ones: they are hard to find, and fixing them does not noticeably raise room bookings. Yet when small bugs pile up past a certain point, their effects compound and the website degrades as if every user were hitting a bug. We learned that even small bugs require an explicit cap: only a limited number can be tolerated at any one time. Above that number, a team must fix the bugs before receiving its full performance bonus, even though the bugs individually barely affect daily new bookings.
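The cap on minor bugs amounts to a simple threshold rule; a sketch with a hypothetical cap value:

```python
def eligible_for_full_bonus(open_minor_bugs, cap=25):
    """Above the cap, the team must burn down minor bugs before
    receiving the full performance bonus. The cap of 25 is illustrative."""
    return open_minor_bugs <= cap

print(eligible_for_full_bonus(12))  # True
print(eligible_for_full_bonus(60))  # False
```

The point of the rule is that the constraint binds regardless of the basic KPI: a team can lead on daily new bookings and still fail the quality gate.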
One of the more divisive (and more difficult) debates was how to use statistics to promote behavior that genuinely benefits the company. On a highly optimized platform, the vast majority of successful experiments lift daily new bookings by less than 1% (constant optimization means each further improvement is harder to find). With such small changes, it is difficult to distinguish experiments that produce real improvement from mere statistical noise, and product managers can make the right decisions only once that problem is solved.
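To see why sub-1% lifts are hard to verify, a standard two-proportion sample-size calculation shows how much traffic such a test needs; the base rate and lifts below are illustrative:

```python
from math import ceil, sqrt
from statistics import NormalDist

def visitors_needed(base_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate per-variant sample size for a two-proportion z-test."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    pbar = (p1 + p2) / 2
    n = ((z_alpha * sqrt(2 * pbar * (1 - pbar))
          + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2) / (p2 - p1) ** 2
    return ceil(n)

# On a 4% base conversion rate, a 10% relative lift needs tens of
# thousands of visitors per variant; a 1% lift needs millions.
print(visitors_needed(0.04, 0.10))
print(visitors_needed(0.04, 0.01))
```

Required sample size grows roughly with the inverse square of the effect size, which is why a heavily optimized platform finds its remaining wins so expensive to confirm.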
Once again we found incentives misaligning effort. We asked teams to show evidence that so-called successes actually added business value, but with bonuses tied to the daily new bookings metric, teams became results-oriented in the narrow sense: as long as an experiment earned performance points, they counted it a valuable contribution, whether it brought real improvement or was just meaningless noise. That is a natural behavioral response, but not one that benefits the company as a whole.
To fix this, we refined the KPI further into what we call unbiased daily new bookings (UBI). It works as follows: each time an experiment is marked a success, we extend the same experiment for an additional period (usually a week) before rolling it out more widely, and the results of this follow-up evaluation are factored into team performance. If the first run looked positive only because of statistical noise, the follow-up is equally likely to come out negative; with a large enough sample, the bias cancels itself out. In that case the team's UBI score is zero, meaning its efforts made no substantive progress, and it receives no performance reward. Once teams recognized that their experiments risked scoring zero or even negative, they gravitated toward experiments that actually improve performance. Rather than pushing an experiment forward because it seems to work, they study more carefully whether their adjustments will genuinely change customer behavior.
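The core of UBI, crediting a win by its independent validation run rather than by the data that declared it a winner, can be sketched as follows; all names and numbers are illustrative:

```python
def ubi_quarter_score(experiments):
    """Sum validation-period lifts for experiments initially flagged as wins.

    Because the validation data played no part in declaring the win,
    noise-driven 'wins' average out to roughly zero across many
    experiments instead of inflating the team's score.
    """
    return sum(e["validation_lift"] for e in experiments
               if e["initial_lift"] > 0)

quarter = [
    {"initial_lift": 12.0, "validation_lift": 10.5},  # real improvement
    {"initial_lift": 3.0, "validation_lift": -2.8},   # statistical noise
    {"initial_lift": -1.0, "validation_lift": 0.0},   # never flagged a win
]
print(ubi_quarter_score(quarter))  # 10.5 + (-2.8) = 7.7
```

Note that the noisy "win" subtracts from the score, which is exactly the incentive that steers teams away from shipping lucky results.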
This assessment metric is not perfect either. UBI works when experiments run across multiple teams at sufficient scale, but a team that runs only 10 to 20 experiments a quarter is prone to noise, because it lacks enough tests to cancel out bias. (Our experience is that at least 50 experiments are needed.) And if results are ambiguous, a team may refuse to accept them and repeat the experiment over and over hoping for better statistics, which slows everything down and returns us to the original predicament. It is like customer choice again: too many options or too few are both bad. Too few experiments yield unreliable conclusions; too many slow the pace and hinder development.
The point is that setting a basic KPI for teams is a process of constant adjustment. Despite its limitations, we believe the UBI system is on the whole very effective. It aligns team behavior with the fundamentals of improving conversion rates and creating business value without seriously sacrificing speed. Because each additional room booking converts into corresponding revenue, UBI lets us evaluate the revenue contribution of teams and individuals each quarter, and managers can use the data to decide who deserves rewards for outstanding performance and which areas of the business merit more investment.
Changing the corporate culture
We began trying the "basic KPI + constraints" management method in the online stores and the marketing department, and it has now been rolled out across the company. Today, in every functional department of Agoda, even those that traditionally do not focus on experiments, it has become an important basis for decision-making. In our property supply department, we assign tasks to employees in 35 countries and use margin points (a metric we apply, as an alternative to UBI, in departments that cannot directly increase room bookings) to evaluate each task's impact on performance. The corresponding constraint is limiting negative effects on partners: we want the property procurement team to push hard for available properties, helping the company cut costs and lift margins, but if they squeeze partner hotels' prices too far, the partners may stop working with us. We run experiments to evaluate strategies that maximize profitability while maintaining mutually beneficial relationships with our partners.
Focusing on KPIs has changed the company culture in unexpected ways. We spend a lot of time explaining the assessment metrics at the start of each project and reinforcing them throughout, because we want team members to take ownership of the standards. But one change surprised us: hierarchy matters less in decision-making. People debate how to measure success on the merits of each case rather than out of self-interest, and in constant testing even the most senior, experienced people are regularly proven wrong. Senior leaders cannot pull rank when a subordinate's idea tests better, and since everything rests on experimental results, persuasion skills matter less than they once did.
The biggest cultural benefit of unifying KPIs, setting constraints, and continuously optimizing is that teams share a common goal. Previously, hundreds of assessment metrics bred internal confusion and sent departments in separate directions; once the metrics were simplified, everyone's efforts pulled in the same direction. Because employees can easily see their work's contribution to company performance, an atmosphere of cooperation, mutual assistance, and learning forms more readily. Employees recognize that they benefit from colleagues' work and that knowledge generated and shared within the company is a common asset.
For these reasons, we believe that setting the right KPIs and constraints, and continuously tracking and correcting them, is an effective weapon against severe market challenges. It tells us what we should do now and points the way forward, and we believe it can help other companies too.