Scrum@Scale Case Study
“Nail It Before You Scale It”: Set Up a Solid Reference Model
– Hiren Doshi
CASE STUDY SNAPSHOT
Industry: Information Technology and Security
Topic: Reference Model
Website: Hiren’s website
According to Scrum at Scale trainer Hiren Doshi, an organization that lacks well-functioning Scrum teams before scaling will see its dysfunctions multiply as it scales. For this reason, setting up a robust reference model before scaling is essential. In this case, a major data storage organization's goal was to create a flagship product for managing complex multi-vendor IT infrastructure. The product can be installed on one machine, which then manages the entire storage area network and provides services such as reporting, planning, and provisioning.
A Dying Legacy
With annual revenue of over $600m, the organization had more than 250 employees spread across three continents: North America (Boston), Europe (Ireland), and Asia (India). Though the organization's flagship product reliably served over 4,000 active customers daily, the legacy software had been written 15 to 20 years earlier, and upgrading its monolithic codebase had become a problem. The software predated the virtualization revolution, and the organization had a massive problem with technical debt. The product it relied on was not scalable and had no automation in place. If, for instance, a new piece of Cisco firmware was rolled out for the supported machines, the organization needed many months before it could support the update. The challenge, therefore, was to change or to perish.
A Few Test Sprint Iterations
Because of its age and its size (around 27 million lines of code), the codebase could not be rewritten all at once. Hiren faced the challenge of rewriting it one section at a time. He divided the work into modules and chose one module from the suite to update first, while maintaining the system's functionality. Another concern was that if changes took place at only one site, the other sites would lag behind, so the organization also formed three geographically distributed, cross-cutting Scrum teams to set up an initial reference model.
Hiren put in place a Scrum of Scrums (SoS) Team that met at the Scaled Daily Scrum, and established an Executive Action Team (EAT) and a Metascrum. Once these were introduced, the organization tested the reference model by running a few sprint iterations and collecting feedback along the way. Inspection uncovered several findings: teams had a poor understanding of Agile and Scrum, they were working at an unsustainable pace with ad hoc work piling up, the monolithic codebase was a liability, and decision latency was high, with many layers of approval before a decision could be made. The adaptive response was: an Agile process with formal training for Scrum Masters and Developers, agreement to aim for a minimum viable product (MVP), a single ordered product backlog to help prioritize work, a scalable Representational State Transfer (RESTful) architecture to begin paying down technical debt, and co-located Scrum Teams at each site to ensure everything was visible across the sites.
Some Demands, Accountability, Before & After
After implementing the response plan, the teams were able to target the areas that needed to be addressed first. They recognized the need for ruthless elimination of waste and impediments, which would allow them to embrace continuous integration and continuous delivery. They asked for tools, environments, infrastructure, and an understanding of dependencies, and they empowered the Scrum of Scrums to address the hidden dysfunction that had been revealed. They needed budget approvals, training, recruitment, and skill enhancement, and they asked the EAT and the Agile Center of Excellence (ACOE) to fund and implement the programs and training needed for growth. Finally, the EAT and Metascrum embraced the need for refinement, maintained a single ordered product backlog with clear accountability, and applied MVP thinking (the 20/80 rule).
These changes produced strong results. After one year, the organization had gone from a 12-to-18-month release cycle to a quarterly cycle and had replaced the monolithic system with a RESTful architecture in all modules. Happiness survey results from team members also improved dramatically over the same period, from 20% initially unhappy to 80% very happy, showing a significant increase in job engagement and satisfaction. Ultimately, two metrics indicated the scale of the improvement brought by implementing Scrum at Scale: the defect count, which fell from an initial 20,000 defects to fewer than 200, and financial performance, which went from an annual loss of $100m to an annual gain of $200m.