Writing Builders for Complex Data Structures

1 Introduction

New objects are usually created with the “new” operator, possibly passing a number of arguments to the constructor.  But for classes with many properties or a more complex structure, the constructor’s parameter list becomes unwieldy or intractable.  Allocating an uninitialized object and then initializing its properties with setters gets very repetitive as well.  The builder design pattern offers a concise alternative.

A builder is a class that has a method (or methods) for each property to be initialized.  Each of these methods sets its property and returns the builder so calls can be chained together using the fluent syntax pattern.  At the end of the chain, a build() method is called to actually build the object.  Typically, a builder’s build() method can be called multiple times to construct multiple instances of an object.

This tutorial will examine what I have found to be the best way to design and implement the builder’s API.

2 Where to Define the Builder

Ideally, a class’s builder should be defined as a public part of the class itself and simply called Builder (though later on, a specialized builder with a variation on that name will be explored).

public class Resource {
    public Long id;
    public String resourceType;

    public static class Builder {
        ………
    }
}

However, there will be times when one doesn’t have control over the definition of the class being built, requiring the builder to be coded as a standalone class.  In this case, I recommend concatenating the class’s name with the word “Builder” to name the class’s builder.

public class ResourceBuilder {
   ………
}

3 Cloneable

In order to return a unique instance of an object for each call to build(), it is necessary to be able to make a deep copy of the object being built.  A copy constructor could be used, but I find implementing the Cloneable interface more convenient as it requires less maintenance work when the class changes.

public class Resource implements Cloneable {
    // declare properties
    ………

    @Override
    public Resource clone() {
        try {
            Resource clone = (Resource)super.clone();

            // do any deep copying needed here
            ………

            return clone;
        } catch (CloneNotSupportedException e) {
            // this should never happen
            throw new RuntimeException(e);
        }
    }
}
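
As a minimal sketch of the deep-copy step, assume Resource carries a mutable list of tags and a PlaceDetails sub-object.  The tags field is hypothetical and exists only for this illustration.  Since super.clone() copies references rather than the objects they point to, each mutable member is copied by hand:

```java
import java.util.ArrayList;
import java.util.List;

class PlaceDetails implements Cloneable {
    public String city;
    public String state;

    @Override
    public PlaceDetails clone() {
        try {
            // only immutable Strings here, so the shallow copy is sufficient
            return (PlaceDetails) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new RuntimeException(e);
        }
    }
}

class Resource implements Cloneable {
    public Long id;
    public List<String> tags = new ArrayList<>();   // hypothetical field
    public PlaceDetails placeDetails;

    @Override
    public Resource clone() {
        try {
            Resource clone = (Resource) super.clone();

            // super.clone() shared these references; replace them with copies
            clone.tags = new ArrayList<>(tags);
            if (placeDetails != null) {
                clone.placeDetails = placeDetails.clone();
            }

            return clone;
        } catch (CloneNotSupportedException e) {
            // this should never happen
            throw new RuntimeException(e);
        }
    }
}
```

Changing the clone’s tags or placeDetails afterwards leaves the original untouched, which is exactly the property build() relies on.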

In cases where one doesn’t have control over the implementation of the class being built, it may be necessary to have a utility method to perform the deep copy in the builder itself.

public class ResourceBuilder {
    ………

    private Resource clone(Resource resource) {
        Resource clone = new Resource();

        // deep copy from resource to clone
        ………

        return clone;
    }
}
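
To make this concrete, here is a small sketch of a standalone builder for a class that cannot be changed.  The Resource class below is only a stand-in for such a class, and the field-by-field copy is one possible way to write the helper, not the only one:

```java
// Stand-in for a class we cannot modify: no clone() and no builder of its own.
class Resource {
    public Long id;
    public String resourceType;
}

class ResourceBuilder {
    private final Resource resource = new Resource();

    public ResourceBuilder id(Long id) {
        resource.id = id;
        return this;
    }

    public ResourceBuilder resourceType(String resourceType) {
        resource.resourceType = resourceType;
        return this;
    }

    public Resource build() {
        // hand out a copy so the prototype can keep changing
        return clone(resource);
    }

    // field-by-field copy; must be extended whenever Resource gains a field
    private Resource clone(Resource resource) {
        Resource clone = new Resource();
        clone.id = resource.id;
        clone.resourceType = resource.resourceType;
        return clone;
    }
}
```

The maintenance cost mentioned above shows up in the helper: every new field in Resource needs a matching line in clone(Resource).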

4 The Builder’s Constructors and build() Method

4.1 build()

The builder needs to have an object of the type being built to serve as the prototype for the object returned by the build() method.  The build() method will return a clone of the prototype object, allowing the prototype object to be further modified to build other similar object(s).

Though build() is the most common name for this method, something like toResource() where Resource is the type of the object being built is acceptable as well.  The important thing is for the method’s name to not resemble the name of a field being initialized.

4.2 Constructors

For building an object from scratch, a parameterless constructor is defined which allocates a new uninitialized prototype object.

For building an object similar to another object, a constructor is defined which takes an object as a parameter which the constructor will clone to use as the builder’s prototype object. The user might then modify just a few properties before calling the build() method to get an object that’s the same as the original object, except for the handful of properties that were modified.

For building an object similar to what would be built by another builder, a constructor is defined which takes another builder as a parameter. The constructor will clone that other builder’s prototype object to use as the new builder’s prototype object.

class Builder {
    Resource resource;

    public Builder() {
        resource = new Resource();
    }

    public Builder(Resource resource) {
        this.resource = resource.clone();
    }

    public Builder(Builder builder) {
        this.resource = builder.resource.clone();
    }

    public Resource build() {
        return resource.clone();
    }

    ………
}

4.3 Prototype Object

The “resource” field serves as the prototype or template for the object to be built.  It is common to name this variable “prototype” or “template” as well, but I prefer a name to describe what it is rather than how it’s used.  This is a matter of personal taste though.

In many cases, it can be private, but for complex objects it’s more convenient to leave it package scoped so that it is visible to the sub-objects’ builders.  These cases will be explored in a later section.

4.4 Abstract Base Class

If a program has lots of builders, this boilerplate could be refactored into a parameterized abstract class for all builders to extend.  But given the simplicity of the code (all the methods are one-liners), this may be more obfuscating than it’s worth.  The derived classes would still need to define constructors that call super(…).

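abstract class AbstractBuilder<T> {
    T prototype;

    // Java cannot evaluate "new T()" or call the protected Object.clone()
    // on a type parameter, so derived classes supply both operations.
    // (The names create() and copy() are illustrative.)
    protected abstract T create();
    protected abstract T copy(T original);

    public AbstractBuilder() {
        prototype = create();
    }

    public AbstractBuilder(T resource) {
        this.prototype = copy(resource);
    }

    public AbstractBuilder(AbstractBuilder<T> builder) {
        this.prototype = copy(builder.prototype);
    }

    public T build() {
        return copy(prototype);
    }
}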

5 Simple Builders

Consider the following simple class (a real class would define setters and getters, but for the purposes of illustration, I omit these details along with the Cloneable implementation):

public class Resource implements Cloneable {
    public Long id;
    public String resourceType;
    ………

    public static class Builder {
        ………
    }
}

A builder for this class might be used to create an object in this manner:

Resource resource = new Resource.Builder()
        .id(1234L)
        .resourceType("PLACE")
        .build();

Each property has its own method, with the same name as the property, for initializing that property.  Each method returns the Resource.Builder object so calls can be chained together using fluent syntax.  Also, there’s a build() method that returns a unique instance of the built Resource.

Some people like to name the initialization methods withId(), withResourceType(), etc.  But the addition of “with” to the name is syntactical sugar that just makes the syntax look cluttered.  I don’t recommend it.

public Builder id(Long id) {
    resource.id = id;
    return this;
}

public Builder resourceType(String resourceType) {
    resource.resourceType = resourceType;
    return this;
}
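
Assembled into one file, the pieces from the previous sections might look like the following sketch.  It assumes Resource has only the two fields shown, so a shallow clone() is enough.  The SimpleBuilderDemo class is just a hypothetical driver:

```java
class Resource implements Cloneable {
    public Long id;
    public String resourceType;

    @Override
    public Resource clone() {
        try {
            // both fields are immutable, so no deep copying is needed
            return (Resource) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new RuntimeException(e);
        }
    }

    public static class Builder {
        Resource resource;

        public Builder() {
            resource = new Resource();
        }

        public Builder(Resource resource) {
            this.resource = resource.clone();
        }

        public Builder(Builder builder) {
            this.resource = builder.resource.clone();
        }

        public Builder id(Long id) {
            resource.id = id;
            return this;
        }

        public Builder resourceType(String resourceType) {
            resource.resourceType = resourceType;
            return this;
        }

        public Resource build() {
            return resource.clone();
        }
    }
}

class SimpleBuilderDemo {
    public static void main(String[] args) {
        Resource place = new Resource.Builder()
                .id(1234L)
                .resourceType("PLACE")
                .build();

        // start from an existing object and change a single property
        Resource person = new Resource.Builder(place)
                .resourceType("PERSON")
                .build();

        // prints "PLACE PERSON 1234"
        System.out.println(place.resourceType + " " + person.resourceType + " " + person.id);
    }
}
```

Because every constructor and build() clones, a builder created from another builder can be modified without affecting the one it was copied from.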

6 Objects with Sub-Objects

Objects can have other objects as properties (in order to group related properties, for example).

public class Resource {
    public Long id;
    public String resourceType;
    public PersonDetails personDetails;
    public PlaceDetails placeDetails;
    public ThingDetails thingDetails;
}

public class PlaceDetails {
    public String city;
    public String state;
}

………

A builder could be used to build each of the sub-objects.  Then the sub-object properties could be initialized in the object like any other property.

Resource resource = new Resource.Builder()
        .id(1234L)
        .resourceType("PLACE")
        .placeDetails(new PlaceDetails.Builder()
            .city("Phoenix")
            .state("Arizona")
            .build())
        .build();

But this is really ugly and messy, and it gets worse when sub-sub-objects and arrays of sub-objects get involved.  Better would be to access a PlaceDetails.Builder directly in the fluent syntax, like this:

Resource resource = new Resource.Builder()
        .id(1234L)
        .resourceType("PLACE")
        .startPlaceDetails()
            .city("Phoenix")
            .state("Arizona")
        .endPlaceDetails()
        .build();

6.1 The Start Method

Here’s how this can be coded.  In Resource.Builder, define startPlaceDetails() to return a builder for the PlaceDetails sub-object.

If the placeDetails property in the builder’s prototype is not yet initialized, it is initialized here with an empty PlaceDetails object; otherwise it is left alone.  This allows multiple calls to be made to startPlaceDetails() if the Resource.Builder is being used to build multiple similar objects.

public PlaceDetails.NestedBuilder startPlaceDetails() {
    if (resource.getPlaceDetails()==null) {
        resource.setPlaceDetails(new PlaceDetails());
    }
    return new PlaceDetails.NestedBuilder(this,
                                          resource.getPlaceDetails());
}

6.2 NestedBuilder

The sub-object’s builder here is called NestedBuilder to distinguish it from a regular standalone builder that terminates in a build() method.  In a later section, I’ll discuss having both a Builder and a NestedBuilder class for the same type.

The nested builder’s constructor is package scoped so that it can only be used from within the package defining resources and sub-resources.

The nested builder takes a pointer to the parent builder as an argument. When it terminates with endPlaceDetails(), that parent builder must be returned for the fluent syntax to continue with initializing the parent object.

The nested builder also takes a pointer to the sub-object it will use as its prototype object. For nested builders, it is the concern of the parent builder to manage that property and the concern of the sub-object builder to manage its initialization.

A parent object could conceivably have multiple properties of the same type.  Resource could have PlaceDetails properties specifying birth location, marriage location, and burial location, for example.  The same nested builder can initialize all three.

Recall above that the prototype object was declared with package scope.  This was so that these nested builders can access it.

public static class NestedBuilder {
    private Resource.Builder parent;
    private PlaceDetails placeDetails;

    NestedBuilder(Resource.Builder parent,
                  PlaceDetails placeDetails) {
        this.parent = parent;
        this.placeDetails = placeDetails;
    }

    public Resource.Builder endPlaceDetails() {
        return parent;
    }

    public NestedBuilder city(String city) {
        placeDetails.city = city;
        return this;
    }

    public NestedBuilder state(String state) {
        placeDetails.state = state;
        return this;
    }
}

6.3 Clear Method

Ordinarily, I omit having a placeDetails(…) method when there’s a startPlaceDetails() method.  But this leaves no way to clear the placeDetails property.  If this functionality is needed, a clearPlaceDetails() method may be defined.  This sort of functionality should be used sparingly however (as discussed below).

public Builder clearPlaceDetails() {
    resource.placeDetails = null;
    return this;
}

6.4 Building Multiple Similar Objects

Several similar objects can be built with a shared builder with this setup.  The example below is contrived, and separate builders would serve better here, but for larger, more complex objects with only minor differences between them, this capability can be very convenient.  Methods like clearPlaceDetails() should be used only very sparingly so that interdependence between the building of multiple objects doesn’t become fragile.

// create builder to reuse
Resource.Builder builder = new Resource.Builder().resourceType("PLACE");

// Phoenix
Resource phoenix = builder
        .id(1L)
        .startPlaceDetails()
            .city("Phoenix")
            .state("Arizona")
        .endPlaceDetails()
        .build();

// Tucson (the state "Arizona" carries over from the builder)
Resource tucson = builder
        .id(2L)
        .startPlaceDetails()
            .city("Tucson")
        .endPlaceDetails()
        .build();

Resource abrahamLincoln = builder
        .id(3L)
        .resourceType("PERSON")
        .clearPlaceDetails()
        .startPersonDetails()
            .name("Abraham Lincoln")
        .endPersonDetails()
        .build();
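
The following sketch pulls the start, end, and clear methods together into something that compiles.  It is trimmed to the id and placeDetails properties, and it accesses the public fields directly rather than through getters and setters.  Note that Resource.clone() has to deep copy placeDetails; otherwise every built object would share the builder’s PlaceDetails instance and later edits would leak into objects that were already built.

```java
class PlaceDetails implements Cloneable {
    public String city;
    public String state;

    @Override
    public PlaceDetails clone() {
        try {
            return (PlaceDetails) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new RuntimeException(e);
        }
    }

    public static class NestedBuilder {
        private final Resource.Builder parent;
        private final PlaceDetails placeDetails;

        // package scoped: only parent builders in this package create it
        NestedBuilder(Resource.Builder parent, PlaceDetails placeDetails) {
            this.parent = parent;
            this.placeDetails = placeDetails;
        }

        public NestedBuilder city(String city) {
            placeDetails.city = city;
            return this;
        }

        public NestedBuilder state(String state) {
            placeDetails.state = state;
            return this;
        }

        public Resource.Builder endPlaceDetails() {
            return parent;
        }
    }
}

class Resource implements Cloneable {
    public Long id;
    public PlaceDetails placeDetails;

    @Override
    public Resource clone() {
        try {
            Resource clone = (Resource) super.clone();
            // deep copy so built objects never share the prototype's sub-object
            if (placeDetails != null) {
                clone.placeDetails = placeDetails.clone();
            }
            return clone;
        } catch (CloneNotSupportedException e) {
            throw new RuntimeException(e);
        }
    }

    public static class Builder {
        Resource resource = new Resource();

        public Builder id(Long id) {
            resource.id = id;
            return this;
        }

        public PlaceDetails.NestedBuilder startPlaceDetails() {
            if (resource.placeDetails == null) {
                resource.placeDetails = new PlaceDetails();
            }
            return new PlaceDetails.NestedBuilder(this, resource.placeDetails);
        }

        public Builder clearPlaceDetails() {
            resource.placeDetails = null;
            return this;
        }

        public Resource build() {
            return resource.clone();
        }
    }
}
```

With these classes, a second object built from the same builder that only sets city("Tucson") still has state "Arizona", while the first object keeps city "Phoenix".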

7 Objects with Arrays of Sub-Objects

What if Resource has, say, an inventory list?  This can be handled by Resource.Builder having a startInventoryList() method that returns an InventoryListBuilder, which in turn has a startInventoryItem() method that returns an InventoryItem.NestedBuilder.

InventoryListBuilder is coded as a nested class in Resource.Builder, so its constructor doesn’t need to be passed the Resource.Builder object as an explicit parameter.  Its startInventoryItem() method adds a new item to the list.

The InventoryItem.NestedBuilder’s constructor gets passed the InventoryItem to use as its prototype object.  It is the responsibility of the InventoryListBuilder to do all manipulation of the inventoryList.

public InventoryListBuilder startInventoryList() {
    return new InventoryListBuilder();
}

public class InventoryListBuilder {
    List<InventoryItem> inventoryList;

    InventoryListBuilder() {
        if (Builder.this.resource.getInventoryList() == null) {
            Builder.this.resource.setInventoryList(new ArrayList<>());
        }
        inventoryList = Builder.this.resource.getInventoryList();
    }

    public InventoryItem.NestedBuilder startInventoryItem() {
        InventoryItem inventoryItem = new InventoryItem();
        inventoryList.add(inventoryItem);
        return new InventoryItem.NestedBuilder(this, inventoryItem);
    }

    // return to the enclosing Resource.Builder to continue the chain
    public Builder endInventoryList() {
        return Builder.this;
    }
}

An object containing an array property can then be initialized like this:

Resource resource = new Resource.Builder()
        .id(100L)
        .startInventoryList()
            .startInventoryItem()
                .item("Sword")
                .count(2)
            .endInventoryItem()
            .startInventoryItem()
                .item("Knife")
                .count(4)
            .endInventoryItem()
        .endInventoryList()
        .build();

When using builders to create several similar objects, it is convenient to add additional methods to the InventoryListBuilder to manipulate the list.  I find this particularly useful when constructing test values and expected results in tests.

// modify an existing element
public InventoryItem.NestedBuilder startInventoryItem(int index) {
    return new InventoryItem.NestedBuilder(this, inventoryList.get(index));
}

// insert before an existing element
public InventoryItem.NestedBuilder insertInventoryItem(int index) {
    InventoryItem inventoryItem = new InventoryItem();
    inventoryList.add(index, inventoryItem);
    return new InventoryItem.NestedBuilder(this, inventoryItem);
}

// remove an existing element
public InventoryListBuilder removeInventoryItem(int index) {
    inventoryList.remove(index);
    return this;
}

InventoryItem.NestedBuilder needs a constructor that takes both the parent builder for being returned by endInventoryItem() and a prototype object to use.  It doesn’t clone the prototype object like the public Builder constructors do because we are explicitly wanting this builder to edit a particular prototype object in the parent builder.

NestedBuilder(Resource.Builder.InventoryListBuilder parent,
              InventoryItem inventoryItem) {
    this.parent = parent;
    this.inventoryItem = inventoryItem;
}
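
Here is one way the list builder and the item builder could fit together in compilable form.  This is a sketch under a few assumptions: Resource is reduced to just the inventory list, fields are accessed directly, and Resource.clone() deep copies the list so that objects already built are not affected by later edits.

```java
import java.util.ArrayList;
import java.util.List;

class InventoryItem implements Cloneable {
    public String item;
    public int count;

    @Override
    public InventoryItem clone() {
        try {
            return (InventoryItem) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new RuntimeException(e);
        }
    }

    public static class NestedBuilder {
        private final Resource.Builder.InventoryListBuilder parent;
        private final InventoryItem inventoryItem;

        // edits the given item in place; deliberately does not clone it
        NestedBuilder(Resource.Builder.InventoryListBuilder parent,
                      InventoryItem inventoryItem) {
            this.parent = parent;
            this.inventoryItem = inventoryItem;
        }

        public NestedBuilder item(String item) {
            inventoryItem.item = item;
            return this;
        }

        public NestedBuilder count(int count) {
            inventoryItem.count = count;
            return this;
        }

        public Resource.Builder.InventoryListBuilder endInventoryItem() {
            return parent;
        }
    }
}

class Resource implements Cloneable {
    public List<InventoryItem> inventoryList;

    @Override
    public Resource clone() {
        try {
            Resource clone = (Resource) super.clone();
            if (inventoryList != null) {
                // deep copy: built objects must not share items with the builder
                clone.inventoryList = new ArrayList<>();
                for (InventoryItem inventoryItem : inventoryList) {
                    clone.inventoryList.add(inventoryItem.clone());
                }
            }
            return clone;
        } catch (CloneNotSupportedException e) {
            throw new RuntimeException(e);
        }
    }

    public static class Builder {
        Resource resource = new Resource();

        public InventoryListBuilder startInventoryList() {
            return new InventoryListBuilder();
        }

        public Resource build() {
            return resource.clone();
        }

        public class InventoryListBuilder {
            private final List<InventoryItem> inventoryList;

            InventoryListBuilder() {
                if (resource.inventoryList == null) {
                    resource.inventoryList = new ArrayList<>();
                }
                inventoryList = resource.inventoryList;
            }

            public InventoryItem.NestedBuilder startInventoryItem() {
                InventoryItem inventoryItem = new InventoryItem();
                inventoryList.add(inventoryItem);
                return new InventoryItem.NestedBuilder(this, inventoryItem);
            }

            public InventoryItem.NestedBuilder startInventoryItem(int index) {
                return new InventoryItem.NestedBuilder(this, inventoryList.get(index));
            }

            public InventoryListBuilder removeInventoryItem(int index) {
                inventoryList.remove(index);
                return this;
            }

            public Builder endInventoryList() {
                return Builder.this;
            }
        }
    }
}
```

A test can build a first object with two items, then reuse the builder with startInventoryItem(0) and removeInventoryItem(1) to produce a variation, and the first object keeps both of its original items.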

8 Multiple Builders for a Class

8.1 AbstractBuilder

Sometimes, both a Builder for building standalone objects and one or more NestedBuilders for building objects that are sub-objects of another object might be needed.  It is desirable for these builders to share as much code as possible (the DRY principle: Don’t Repeat Yourself).  For encapsulating the common code between the builders, a private AbstractBuilder can be defined which Builder and NestedBuilder extend.  This AbstractBuilder defines the property initializer methods while the derived builders define the constructors and the build() and end…() methods.

Writing an AbstractBuilder is a little tricky because it must be parameterized so that the property initializer methods return the derived builder object, not the AbstractBuilder object.  So, AbstractBuilder gets parameterized with a type that’s an extension of AbstractBuilder.  Amazingly, this works.

The builders derived from AbstractBuilder will provide a subThis() method that returns the subclasses’ “this” pointer.  It is used to provide the value that is returned by the initializer methods.

private static abstract class AbstractBuilder<T extends AbstractBuilder<T>> {
    protected InventoryItem inventoryItem;

    // sub classes provide a this pointer of the appropriate type
    protected abstract T subThis();
    private T subThis = subThis();

    public T item(String item) {
        inventoryItem.setItem(item);
        return subThis;
    }

    public T count(int count) {
        inventoryItem.setCount(count);
        return subThis;
    }
}

8.2 Extensions of AbstractBuilder

Extensions of AbstractBuilder should parameterize AbstractBuilder with their own type so that the initialization methods can return the subtype’s object instead of the AbstractBuilder object. They provide a constructor, a subThis() method, and an appropriate build() or end…() method. The constructors and build() or end…() methods are coded just like they are for regular Builder and NestedBuilder classes as described above.

The subThis() method just returns the extension’s “this” pointer and is called by the AbstractBuilder’s initialization methods to return the appropriate builder object.

public static class Builder extends AbstractBuilder<Builder> {
    public Builder() {
        this.inventoryItem = new InventoryItem();
    }

    // more constructors can go here as needed
    ………

    public InventoryItem build() {
        return inventoryItem.clone();
    }

    @Override
    protected Builder subThis() {
        return this;
    }
}

public static class NestedBuilder extends AbstractBuilder<NestedBuilder> {
    private Resource.Builder.InventoryListBuilder parent;

    NestedBuilder(Resource.Builder.InventoryListBuilder parent,
                  InventoryItem inventoryItem) {
        this.parent = parent;
        this.inventoryItem = inventoryItem;
    }

    public Resource.Builder.InventoryListBuilder endInventoryItem() {
        return parent;
    }

    @Override
    protected NestedBuilder subThis() {
        return this;
    }
}
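
The self-referencing type parameter is easier to follow in a complete file.  The sketch below assumes a pared-down, top-level InventoryListBuilder as a hypothetical parent (standing in for the one nested in Resource.Builder), and it calls subThis() directly in each initializer instead of caching the result in a field.

```java
import java.util.ArrayList;
import java.util.List;

class InventoryItem implements Cloneable {
    public String item;
    public int count;

    @Override
    public InventoryItem clone() {
        try {
            return (InventoryItem) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new RuntimeException(e);
        }
    }

    // T is the concrete builder type, so the initializers can return it
    private abstract static class AbstractBuilder<T extends AbstractBuilder<T>> {
        protected InventoryItem inventoryItem;

        protected abstract T subThis();

        public T item(String item) {
            inventoryItem.item = item;
            return subThis();
        }

        public T count(int count) {
            inventoryItem.count = count;
            return subThis();
        }
    }

    public static class Builder extends AbstractBuilder<Builder> {
        public Builder() {
            this.inventoryItem = new InventoryItem();
        }

        public InventoryItem build() {
            return inventoryItem.clone();
        }

        @Override
        protected Builder subThis() {
            return this;
        }
    }

    public static class NestedBuilder extends AbstractBuilder<NestedBuilder> {
        private final InventoryListBuilder parent;

        NestedBuilder(InventoryListBuilder parent, InventoryItem inventoryItem) {
            this.parent = parent;
            this.inventoryItem = inventoryItem;
        }

        public InventoryListBuilder endInventoryItem() {
            return parent;
        }

        @Override
        protected NestedBuilder subThis() {
            return this;
        }
    }
}

// Hypothetical minimal parent, standing in for Resource.Builder.InventoryListBuilder
class InventoryListBuilder {
    final List<InventoryItem> inventoryList = new ArrayList<>();

    public InventoryItem.NestedBuilder startInventoryItem() {
        InventoryItem inventoryItem = new InventoryItem();
        inventoryList.add(inventoryItem);
        return new InventoryItem.NestedBuilder(this, inventoryItem);
    }
}
```

Because item() and count() return T rather than AbstractBuilder, a chain such as startInventoryItem().count(4).item("Knife") is still typed as NestedBuilder, so endInventoryItem() remains available at the end of it.  Likewise, build() remains available after any initializer on Builder.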

8.3 Multiple NestedBuilders

Conceivably, a class may be used in several different places in multiple complex data structures.  For this case, parameterize the NestedBuilder class with the type of the parent builder.

public static class NestedBuilder<T> extends AbstractBuilder<NestedBuilder<T>> {
    private T parent;

    NestedBuilder(T parent, InventoryItem inventoryItem) {
        this.parent = parent;
        this.inventoryItem = inventoryItem;
    }

    public T endInventoryItem() {
        return parent;
    }

    @Override
    protected NestedBuilder<T> subThis() {
        return this;
    }
}

Then, in the definition of the start…() method in the parent builder, parameterize the NestedBuilder return type with the parent builder’s type.

InventoryItem.NestedBuilder<InventoryListBuilder> startInventoryItem() {
    InventoryItem inventoryItem = new InventoryItem();
    inventoryList.add(inventoryItem);
    return new InventoryItem.NestedBuilder<InventoryListBuilder>
            (this, inventoryItem);
}
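
To illustrate one nested builder serving two unrelated parents, here is a sketch with two hypothetical parent builders, BackpackBuilder and ShopBuilder, that are invented for this example.  The parent type parameter is named P here to keep it visually distinct from the self type T used by AbstractBuilder.

```java
class InventoryItem {
    public String item;
    public int count;

    private abstract static class AbstractBuilder<T extends AbstractBuilder<T>> {
        protected InventoryItem inventoryItem;

        protected abstract T subThis();

        public T item(String item) {
            inventoryItem.item = item;
            return subThis();
        }

        public T count(int count) {
            inventoryItem.count = count;
            return subThis();
        }
    }

    // P is the type of whichever parent builder started this nested builder
    public static class NestedBuilder<P> extends AbstractBuilder<NestedBuilder<P>> {
        private final P parent;

        NestedBuilder(P parent, InventoryItem inventoryItem) {
            this.parent = parent;
            this.inventoryItem = inventoryItem;
        }

        public P endInventoryItem() {
            return parent;
        }

        @Override
        protected NestedBuilder<P> subThis() {
            return this;
        }
    }
}

// Two hypothetical parents that each embed an InventoryItem
class BackpackBuilder {
    final InventoryItem contents = new InventoryItem();

    public InventoryItem.NestedBuilder<BackpackBuilder> startInventoryItem() {
        return new InventoryItem.NestedBuilder<>(this, contents);
    }
}

class ShopBuilder {
    final InventoryItem stock = new InventoryItem();

    public InventoryItem.NestedBuilder<ShopBuilder> startInventoryItem() {
        return new InventoryItem.NestedBuilder<>(this, stock);
    }
}
```

In each chain, endInventoryItem() returns the parent’s own type, so the fluent syntax continues with that parent’s methods without any casting.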

9 Cookbook

This essay presents a broad cookbook of builder features and how to implement them. While the implementation of some of these features is complex, and some of the features can easily be abused to create fragile code, I have found these ideas to be very powerful and useful in building complex data structures in a succinct and easy-to-read manner.

They are especially useful in writing tests because lots of similar objects with minor variations are often needed.  For example in a test for a REST service, a number of similar resources might be built for POSTing.  The result that is returned from the REST service may then consist of what was posted, plus some additional bits and pieces like a generated primary key and HATEOAS links.  The expected results to compare against can be built from the POSTed resources with some additional initialization.
