We also support our readers through the Code by Refactoring Slack channel. Please join us there to discuss any part of the technique along your way to gathering your own scattered component.
My Code is Everywhere!
Let’s continue with the last newsletter’s example of an organization with more than 100 teams editing a single codebase. The codebase was well-structured. It had to be in order to support that much change.
It had a well-designed, cleanly-implemented 4-tier architecture. The key innovations that drove the business were factored into their own mega-modules. Those mega-modules connected to the rest of the code through well-designed, generic, extensible interfaces. Each mega-module encapsulated person-decades of innovation and complexity without bleeding much into the rest of the code.
Yet even so, most features required changes in several mega-modules at once. For example, adding a new element to an analytical display required small changes in each of 3 modules: data collection, computation, and UI. Because the modules and tiers were architected to be so independent, it was hard to share data between different parts of the same story. Every change was costly, verification was hard, and bugs were common.
The clear solution was to extract our existing code into components, but the code for each component was scattered around each module and isolated by the module boundaries. We needed a way to iteratively bring the code together and reduce isolation between parts of the component, while maintaining the good isolation in our architecture.
Our component isolation strategy needed to be incremental. With a hundred teams each making changes, any team that branches away from main for long will cascade integration problems across many adjacent teams. This is especially true with refactoring. Our strategy needed to answer two key questions:
- What common design constraints should we apply to each component? What design patterns should they follow?
- How should each team perform their component extraction to minimize impacts on other teams and the release schedule?
What: Expose a Chunky API to Each Module
Our technique draws from functional programming, tell-don’t-ask architectures, microservice architectures, pub/sub systems, language design, and distributed dependency management. This will be the most concept-rich newsletter in the entire sequence.
With over 100 teams, this application will have thousands of components. We cannot afford for each component to expose a fine-grained interface. That would create too many dependencies and interactions, so we would keep most of our current problems. Therefore we want to remodel each component to expose a narrow API where each invocation does a lot.
We also want to keep the current module structure in the main application. Each method in the component’s API needs to target one module. The component needs to internally manage the cross-module sharing. Because there will be so many components, we cannot afford for the main application to hold a reference to each component. We also can’t afford for each component to hold a reference to each of the components it depends on. The only references will be from the component to the main application.
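A minimal sketch of that reference direction, in Python. Every name here (`Application`, `register_handler`, `LatencyComponent`, the three module names) is hypothetical and not taken from the codebase described; the handler registry stands in for whatever extension point the main application actually offers. The application depends on no component type and only holds anonymous callbacks, pub/sub style. The component exposes one coarse-grained method per module and keeps the cross-module sharing in its own private state.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Application:
    """Stand-in for the main application. It knows its modules, not the components."""

    def __init__(self) -> None:
        # module name -> callbacks registered by components (hypothetical extension point)
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def register_handler(self, module: str, handler: Callable[[dict], None]) -> None:
        self._handlers[module].append(handler)

    def run_module(self, module: str, context: dict) -> None:
        # The module tells each registered handler to do its work; it never
        # asks a component for fine-grained details.
        for handler in self._handlers[module]:
            handler(context)


class LatencyComponent:
    """Hypothetical component: one chunky method per module it participates in."""

    def __init__(self, app: Application) -> None:
        self._samples: List[float] = []  # shared across modules, private to the component
        self._average = 0.0
        app.register_handler("data_collection", self.collect)
        app.register_handler("computation", self.compute)
        app.register_handler("ui", self.render)

    def collect(self, context: dict) -> None:
        self._samples.extend(context["raw_latencies"])

    def compute(self, context: dict) -> None:
        self._average = sum(self._samples) / len(self._samples)

    def render(self, context: dict) -> None:
        context["display"].append(f"avg latency: {self._average:.1f} ms")


app = Application()
LatencyComponent(app)
context = {"raw_latencies": [10.0, 20.0, 30.0], "display": []}
for module in ("data_collection", "computation", "ui"):
    app.run_module(module, context)
```

In this sketch, adding a new display element touches only `LatencyComponent`; the three modules and the `Application` class stay unchanged.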
However, the code that will go into the component is currently fragmented across the entire codebase and each chunk both references and is referenced by foreign code. We need a strategy to fix that.
How: Gather, then De-Isolate
First we are going to solve the scattering, then the isolation. We will proceed in 4 steps:
- Find: Find the code that belongs in the component, even if it isn’t being changed right now.
- Gather: Separate the code from its dependencies on the rest of the application and move it into the component. This is the same thing we did last month.
- Merge: Refactor the code used by each module into a coarse-grained API.
- De-Isolate: Refactor the isolated parts of the component into a cohesive whole, independent of the application’s structure.
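To make the Merge step concrete, here is a small before-and-after sketch in Python. The function and class names are invented for illustration, and how the module reaches the component is left out of scope. Only the shape matters: several fine-grained calls made by one module become a single coarse-grained call.

```python
# After Gather: the blocks live in the component, but the UI module still
# calls each of them separately and threads the data through by hand.
def format_value(value: float) -> str:
    return f"{value:.2f}"


def pick_color(value: float, threshold: float) -> str:
    return "red" if value > threshold else "green"


def build_label(name: str, text: str, color: str) -> str:
    return f"[{color}] {name}: {text}"


def ui_module_before(name: str, value: float) -> str:
    text = format_value(value)
    color = pick_color(value, threshold=100.0)
    return build_label(name, text, color)


# After Merge: the UI module makes one chunky call. The three blocks above
# become private details that the component can redesign freely.
class ErrorRateComponent:
    THRESHOLD = 100.0

    def render_for_ui(self, name: str, value: float) -> str:
        text = format_value(value)
        color = pick_color(value, self.THRESHOLD)
        return build_label(name, text, color)


def ui_module_after(component: ErrorRateComponent, name: str, value: float) -> str:
    return component.render_for_ui(name, value)


# Behavior is unchanged; only the width of the interface shrank.
before = ui_module_before("errors", 120.0)
after = ui_module_after(ErrorRateComponent(), "errors", 120.0)
```

Once the module depends only on `render_for_ui`, the helpers behind it can be renamed, inlined, or restructured without any edit in the main application.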
Extract the Pirate Ship From the Lego Castle
Extracting a component is like sorting a bunch of Lego and building a pirate ship. We have a large Lego castle with tons of pieces. We are going to break apart all the ship-related Lego, pull the pieces out, remodel the castle, and connect the ship Lego back up to make a ship.
The challenge is the same with both the code and the Lego. We are creating a new construct — the ship / component — which was previously scattered across the entire castle / application. The new concept is its own entity and all hangs together, rather than having each block interlock with foreign blocks. Creating the new concept requires us to change which blocks link to each other.
In the code, code blocks link together when they share data or when they call each other. Thus, changing the linkages means changing the data flow and order of program execution. Even though we do the same operations, changing their execution order can change program behavior.
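A tiny Python illustration of why order matters. The two blocks are invented for the example: they share one dictionary, and neither block changes between the two runs. Only the order in which they execute differs. Amounts are integer cents to keep the arithmetic exact.

```python
def apply_discount(order: dict) -> None:
    # Block A: reads and writes the shared total (10% off).
    order["total_cents"] = order["total_cents"] * 90 // 100


def add_shipping(order: dict) -> None:
    # Block B: also reads and writes the shared total (flat fee).
    order["total_cents"] = order["total_cents"] + 500


def run(blocks) -> int:
    order = {"total_cents": 10000}
    for block in blocks:
        block(order)
    return order["total_cents"]


discount_first = run([apply_discount, add_shipping])  # (10000 * 0.9) + 500
shipping_first = run([add_shipping, apply_discount])  # (10000 + 500) * 0.9
```

The first ordering yields 9500 cents and the second 9450, even though the same two operations ran both times.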
This is a problem that cannot be solved by refactoring alone. Every refactoring maintains program behavior, including execution order. Instead, we need to combine lots of refactoring with judicious remodeling.
Gathering a scattered component is complex. My first several passes at this newsletter were way too long and got lost in entire bogs of weeds. So this month we have split the strategy discussion into a page of its own. Click to read the concept page of gathering scattered code, which covers the strategies below.
- Remodeling Code, Not Refactoring: The difference between them and how to use them together.
- Overall Sequence: What execution order we will create, why, and the strategy we’ll use to get there.
- Where to Start: Why we gather before de-isolating, even though it seems like the cross-module scattering is more valuable to fix first.
- The Process: What the steps find, gather, merge, and de-isolate really mean.
Solve scattering and isolation by Finding, Gathering, Merging, and De-Isolating the code.
Your Component Can Evolve Independently From the Application
Now you are able to finish off the component that you started in the last newsletter. Previously we captured all new code related to the task; now we also gather all the fragments of existing code. You can now create a simple design to accomplish the exact purpose of this component, independent of the rest of the application. Future story edits will be collected in one place and tend to be semantic design changes, rather than shotgun surgery. It will be easier to test your new component independently from the rest of the application.
Everything is incremental. Gathering your component should not block your own feature work or impact other teams. You can ship while you are in the middle of your extraction. While there is a lot to learn, you can learn it as you need it during your first component extraction. And a component extraction is pretty quick once you know how. Expect to extract your first component with a total of about two person-weeks of effort spread across two to four people and two to three months. Expect your second and later components to take two person-days spread across a single sprint.
Benefits
- Incremental path to isolation.
- After extraction, > 90% of the stories related to this component should require 0 edits in the main codebase.
- Simpler design for this component’s task.
- Reduce bugs created while writing stories related to this component’s task by > 50%.
- Eliminate cross-team dependencies for stories related to this component’s task.
Downsides
- There is a lot to learn. This process introduces many new techniques.
- This is not a provable refactoring. You can introduce bugs while extracting. You will need to be careful, which means the economics of this step aren’t as favorable. You will be investing now to gain later. Make sure that everyone is aligned and ready to make this investment.
Demo the value to team and management…
Show three things at your sprint demo:
- Example: one simplified component interaction.
- Progress: block movement chart + percent of independent changes chart.
- Impact: decreased waiting.
Example: one simplified component interaction
Your goal is to show that you can now simplify your component. You should be able to change its design without impacting the main application.
Show one place where you have merged several blocks into a chunkier public API. Then show a small amount of refactoring within that API and how it makes your code simpler, while having 0 impact on the application or other teams.
Re-affirm the message from prior demos that this is just code organization. Product Owners and Managers still don’t need to know about component boundaries and it does not impact story definition. It just reduces your costs and error rate when you implement stories.
Progress: block movement chart
Each week, count the number of code blocks (inline method chunks + methods + classes) that belong to your component and are in each of the following states:
- Intertangled with foreign code.
- Public, in your component’s namespace, but not yet clean.
- Exposed as a designated public API and called accordingly.
Create a stacked line chart showing how these counts change over time. Progress is visible as the total number of blocks first increases (as you find them) and then drops, until eventually only a small number remain, all of which are public API.
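One possible way to produce the data behind that chart, sketched in Python. The state labels and the weekly snapshot format are assumptions made for this example, and the numbers are made up; the counting itself would come from however you track your blocks. The output is one series per state, ready to hand to any charting tool as a stacked chart.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical labels for the three states listed above.
STATES = ["intertangled", "public_not_clean", "public_api"]


def weekly_series(snapshots: List[List[str]]) -> Dict[str, List[int]]:
    """Turn weekly lists of block states into one count series per state."""
    series: Dict[str, List[int]] = {state: [] for state in STATES}
    for week in snapshots:
        counts = Counter(week)
        for state in STATES:
            series[state].append(counts[state])  # Counter gives 0 for missing states
    return series


# Made-up data: each inner list holds the state of every known block that week.
snapshots = [
    ["intertangled"] * 12,
    ["intertangled"] * 15 + ["public_not_clean"] * 5,
    ["intertangled"] * 4 + ["public_not_clean"] * 9 + ["public_api"] * 2,
    ["public_api"] * 3,
]
series = weekly_series(snapshots)

# Height of the stack each week: rises while blocks are found, then falls.
totals = [sum(week) for week in zip(*series.values())]
```

With this sample data the weekly totals are 12, 20, 15, and 3, which is the rise-then-fall shape the chart is meant to make visible.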
Additionally, keep presenting the progress chart from the previous newsletter that shows the percentage of changes that are independent from foreign code and teams.
Impact: decreased waiting
This is the same measure as the previous newsletter. Continue presenting the same data.