Get to Domain Abstraction (Article 7)

Friday, October 11, 2019

8 minute read

Finally we are ready for the most important step in naming. This step is why all of evolutionary design really comes down to “name things well. Continuously.” It is time to correct the domain abstractions we are using and adjust the names.

We currently have a set of names used together. Each one expresses Intent individually. Responsibilities are factored into the right places (each Does the Right Thing). But taken together, the names are a mishmash of ideas. Each expresses itself and its context well, but they don’t create a shared context.

A domain abstraction is just a shared context for some set of code.

At this point, the insights we seek are Whole Values. We want to find generative domain patterns. Concepts that, once they are in place, make the missing code as obvious as the code that is already written.

We won’t necessarily write the missing code yet—we could be wrong about the domain abstraction we see—but we will leave the code in a state where it is obvious where and how to add this code if we need it.

The fundamental process we will do at this step is to find Primitive Obsession, have an insight about a missing Whole Value, and write it down. We will do this mostly mechanically as we want to be able to have insights in code where we don’t yet see the unifying concept.

Finding Primitive Obsession mechanically

We are looking for Primitive Obsession: anywhere that we are trying to describe actions in a high-level domain entirely using concepts in a low-level domain. For example, we might pass around a social security number as a String, rather than as a SocialSecurityNumber. Similarly, we might talk about Set<Cards>, rather than a PlayingCardDeck.

It is entirely possible to spot these based on your own intuition. Use that! However, Primitive Obsession also causes common antipatterns in code. You can look for those in order to learn to spot new kinds of Primitive Obsession.

Here are some of the common patterns I look for. Each identifies some primitive that I am obsessing over.

In sets of classes:

Tight interdependencies with few connections outside the set.
Several methods in one class that use public getters from another.
Classes with similar names:
- Thing / ThingIsValid,
- Repeated prefixes or suffixes: -Transform, -Traversal, Tree-, Tail-.
Classes with CSy or grab-bag names:
- -Manager,
- -Transform.
Names that end with -er, indicating an action rather than a result / object.
Feature envy: obsessing over the methods and properties of another object and calling it all the time.

In sets of methods:

Sets of methods that take the same set of parameters.
Sets of methods that each use the same subset of an object’s fields.
Methods with similar names:
- begin / end / see progress / poll for data,
- create / read / update / destroy,
- source / sink / transform,
- Names which contain near-synonyms.
Chunks of methods which obsess over some variable or small set of variables for several lines (many operations on the same data).
Methods which are often called together (especially common in error handling or creation logic).
Common phrases in the name without a matching noun in the system.

Fields & parameters:

Names that narrow what the thing is, because the variable’s type name cannot (firstName, lastName, SocialSecurityNumber—all on parameters of type String).
Names that narrow or interpret allowed values (rangeInMeters, hardwareModeFlags—both on fields of type int).

Each of these indicates that something is missing in your set of domain abstractions. Other code is coddling that missing abstraction.

Each of these also indicates code that is using an overly-general concept where something more explicit and single-purpose would do.

Part 2: Look for a primitive and the domain concept that would replace it

Every time we see one of the above, we know something is missing. Our goal is to figure out what is missing and create it. This comes in two steps. First we name the primitive, then we name the thing we should use instead.

We don’t need to rename the primitive in the code. We’re about to replace it. Instead we write it down on a notecard or whiteboard. State precisely which primitive we see and what obsessive behaviors we see the code doing over it.

Part 3: Write down by adding new Whole Value that matters

Now we want to write our insight down into code. We need to do the following:

Break the specific concept out of the general concept.
Examine each piece of coddling code.
Break the part ‘../../post/get-to-domain-abstraction’that is related to the new concept out of its context.
Move the related part ‘../../post/get-to-domain-abstraction’to the new concept.

The first 3 steps use refactorings that we already used in earlier phases. Break a concept up by moving from missing to nonsense, then climb the naming levels as desired.

Step 4 uses:

Move Method / Class
Convert to Instance Method
Convert to Static
Convert to Extension Method (if in C#)
Rename

The most common form of rename is to intentionally combine code chunks.

When a primitive has been running around the system for a while, often multiple pieced of code wanted to do the same operation to that primitive. The operation was small (often a 1-line loop with an accumulator variable or a simple conditional). So they just duplicated code inline.

In step 3 we extracted these duplicate chunks into tiny methods. We gave each a good name based on its context (an intention-level name).

In 4 we:

Move them all to the new class.
Use non-overlapping names so that we ensure zero behavior change.
Sort together those with identical implementation.
Examine pairs with identical implementation to see if they also share intent (not all duplicates have the same intent, especially when we are adding a missing domain abstraction).
Any time we find a pair with same implementation and intent, we rename whichever has the less-revealing name to match the name for the other.
Then delete one, leaving the callers sharing code.
Repeat until each pair that shares implementation doesn’t share intent.

It is also common to find pairs that share intent but not implementation. These could be bugs (someone updated one use of the primitive but missed another). They could also be features (the intent is nuanced).

When you find two chunks with similar intent but different implementation:

Rename them to be very similar.
- Use a common prefix followed by something that distinguishes the difference.
- The difference will be implementation, not intent / domain-relevant. That is a smell but better than we had before.
- We can later address that smell by going through the naming process again. Finish this pass first.
Write a pair of tests, one for each chunk. Describe both the common part (with a shared assertion method) and the difference (with two clear, separate asserts).
Optional: combine the two methods into one method with a discriminator variable.

The last step can make things either more or less clear. It can hide the code smell and reduce the probability of fixing, but it can also create a new concept (the discriminator variable itself) which can grow into a valid domain abstraction.

For details on how, see my gist Combine 2 methods into one with a discriminator variable (I’m combining two slightly different implementations of Car.Drive()).

The history shows a complete set of refactorings (17 steps) to implement the re-design without ever violating safety.
I have tests only to show what happens to call sites. I never relied on tests to verify any step.
I would be equally comfortable (and safe) performing this on entirely untested code with thousands of call sites.

And that’s it!

We identified our domain abstraction and wrote it down. At this point our naming is done.

In the real world it probably took me a week or so to get through this whole process (at a total time cost of 30 minutes spread across that week). I find missing names one at a time, create them, and get them up to the honest level. After doing that 4-6 times in one area area I upgrade each of those names up to does the right thing (through complete on the way). After I do that 2-3 times the intent becomes clear and I upgrade the names.

And every so often I see primitives lying around or duplicate sequences of statements that operate on the same variables. When all the names around are intent-revealing, the missing Whole Value becomes obvious. It often requires 2-5 flurries of creating intent-based names before a whole area is intent-revealing.

I introduce a new domain abstraction. Usually, I introduce 3-5 of them in a single quick flurry. The code rapidly changes across the project to use the new abstraction in about 60% of the applicable cases. Then slow change happens over another week or two as the abstraction proves to be valuable. It attracts new operations and cleans up the other applicable cases.