You are a responsible product manager. You carefully hone multiple ideas. You split test (A/B test) to refine them and pick the best ones to invest in. You build out the winner. You make your big release … and it flopped.
What happened!? You put in all that effort to make sure this wouldn’t happen, and yet it did.
Getting the right data.
What you didn’t know was that several of your tests had a small cosmetic bug on one treatment. A critical link was too difficult to spot. A button went to the wrong place. The product images were off-screen in one browser.
But those are minor! So minor, in fact, that your developers will tell you not to even bother fixing cosmetic bugs. The problem, however, is that even a bug-free split test may show only a 2-3% difference in audience engagement between treatments. So if a bug alone causes a 4% difference in engagement, your results no longer represent the actual product differences.
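To see how little it takes, consider this back-of-the-envelope sketch. All the numbers are hypothetical, but the arithmetic shows how a small cosmetic bug can flip the apparent winner of a split test:

```python
# Hypothetical numbers: a "cosmetic" bug swamping a genuine split-test signal.

control_rate = 0.100                              # baseline engagement rate
treatment_bug_free = control_rate * (1 + 0.03)    # treatment is genuinely 3% better
treatment_observed = treatment_bug_free * (1 - 0.04)  # 4% of users bounce off the broken link

print(f"bug-free:  treatment {treatment_bug_free:.4f} vs control {control_rate:.4f}")
print(f"observed:  treatment {treatment_observed:.4f} vs control {control_rate:.4f}")
# The observed comparison now favors control, so the test picks the loser.
```

The treatment that is genuinely better now measures worse than control, and no amount of statistical rigor on top of the corrupted data will recover the true answer.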
Garbage out of your DevOps pipeline has now created garbage data in your product testing.
It’s OK … we use telemetry.
A well-put-together DevOps pipeline relies on telemetry to find defects quickly and then deploy the fix. The fatal flaw in that reliance is that telemetry detects problems by observing changes in user behavior, while split testing detects product preferences by observing exactly the same thing: changes in user behavior. As a result, every time behavior changes, you have to guess whether it indicates a bug or meaningful split-test information. Getting that guess wrong even a few times will invalidate your data. Worse, it improves your confidence in the wrong data.
Two goals … one signal flag.
On paper, the DevOps pipeline is supposed to detect any defect quickly, allowing you to remove it before it has business impact. Also on paper, the Lean Startup product testing is supposed to detect poor product direction choices quickly, allowing you to fix them before significant business impact.
The reality is that telemetry and split testing use the same signal, so they corrupt each other’s data. The only way to ensure the split test data is accurate is to ensure that both treatments have the same bugs. The problem is that you don’t know what bugs you don’t know about, so ensuring identical bugs is harder than ensuring zero bugs.
Focus your pipeline on product signals.
The only way we can ensure clean data for product decisions is if the pipeline is not corrupting it. Whether you are maintaining an old codebase or writing new code, no garbage (bugs) can enter your pipeline if you want to pick up the product signals and have accurate data.
Bugs come from dangerous code: code that does the right thing now but is difficult to change without introducing a bug. Old codebases grow dangerous over time, while new codebases rush through frequent changes that create dangerous code rapidly. To prevent bugs in the pipeline, we need to eliminate dangerous code. That means moving away from finding and fixing every bug toward a mindset of bug prevention.
Adopt bug prevention practices.
While TDD will point out where the dangerous code lies, it will do nothing to make it less dangerous. The missing skill that truly empowers bug prevention is Disciplined Refactoring. This isn’t just any interpretation of refactoring; it’s a prescriptive method for improving the code’s usability for developers while proving the refactoring doesn’t change the code’s behavior, even without tests.
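As a rough illustration of what a behavior-preserving micro-step looks like, here is a hypothetical Extract Method refactoring. This is a sketch of the general idea, not the prescribed Disciplined Refactoring recipe itself, and all names are invented:

```python
# Before: discount logic tangled inline with the price calculation.
def price_before(quantity, unit_cost):
    total = quantity * unit_cost
    if quantity > 10:
        total = total * 0.9
    return total

# The extracted helper: the conditional is moved verbatim, unchanged.
def _apply_bulk_discount(total, quantity):
    if quantity > 10:
        total = total * 0.9
    return total

# After: the call site changes, the logic does not.
def price_after(quantity, unit_cost):
    total = quantity * unit_cost
    total = _apply_bulk_discount(total, quantity)
    return total

# Because the extraction is purely mechanical, both versions must agree
# for every input, which can be argued without running a single test:
assert price_before(20, 5.0) == price_after(20, 5.0)
assert price_before(3, 5.0) == price_after(3, 5.0)
```

The point is that each step is small and mechanical enough that its safety can be reasoned about directly, which is what makes this kind of refactoring viable even in code that has no tests yet.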
There are two soft starts to get into Disciplined Refactoring so that you can start unclogging your pipeline:
Ending Untestable Code
Adopt Arlo Belshee’s Insight Loop Change Series
These techniques allow the developer to extract one untestable behavior at a time, figure out what it does, and then test it. This lets your pipeline detect garbage that was previously undetectable.
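One common flavor of untestable behavior is a hidden dependency on the system clock. The sketch below (hypothetical names and code, not a prescribed recipe from either technique) shows how extracting that one behavior into an injectable seam makes it testable:

```python
import datetime

# Hard to test: the result changes with the wall clock.
def greeting_untestable():
    hour = datetime.datetime.now().hour
    return "Good morning" if hour < 12 else "Good afternoon"

# Seam added: the clock can be injected, so the behavior can be pinned down.
def greeting(now=None):
    now = now or datetime.datetime.now()
    return "Good morning" if now.hour < 12 else "Good afternoon"

# The previously invisible behavior is now visible to the pipeline:
assert greeting(datetime.datetime(2024, 1, 1, 9)) == "Good morning"
assert greeting(datetime.datetime(2024, 1, 1, 15)) == "Good afternoon"
```

Each extraction like this converts one patch of untestable code into code the pipeline can actually watch, which is exactly the "detect garbage that was previously undetectable" payoff.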
There are clear techniques that developers of any level can apply in their everyday coding work, and that will naturally clean the code as they work. Learning these takes time when taught through workshops and technical coaching, but that time is drastically reduced when teams adopt very specific behaviors. Just like the fix, the learning itself has no magic shortcut. But if you improve one technique at a time, every developer will create clean code.