The Ideal Domain-Driven Design Aggregate Store?

At the 2014 DDD eXchange in NYC, a park bench discussion developed around storing Aggregates. The consensus among the DDD leadership was against Object-Relational Mapping (ORM) and the desire to come up with a better way to store Aggregates. There were comments about ORM in general being an antiquated approach. While some developers are still new to ORMs, the technology of shoehorning objects into relational databases is more than 20 years old. In 20+ years, why haven’t we found a better way to store Aggregates?

During the park bench discussion I promoted the idea of serializing Aggregates as JSON and storing them in that object notation in a document store. A JSON-based store would enable you to query the object’s fields. Central to the discussion, there would be no need to use an ORM. This would help to keep the Domain Model pure and save days or weeks of time generally spent fiddling with mapping details. Even more, your objects could be designed in just the way your Ubiquitous Language is developed, and without any object-relational impedance mismatch whatsoever. Anyone who has used ORM with DDD knows that the limitations of mapping options regularly impede your modeling efforts.

When thinking of a JSON-based store, no doubt your mind is immediately drawn to MongoDB, because that’s just how MongoDB works. While true, MongoDB still falls short of filling the needs of DDD Aggregates in one very important way. In our park bench discussion I noted how MongoDB was close to what I wanted, but that you could not use MongoDB to both update an Aggregate’s state in one collection in the store and append one or more new Domain Events to a different collection in the same operation. In short, MongoDB doesn’t support ACID transactions that span multiple documents or collections. This is a big problem when you want to use Domain Events along with your Aggregates, but you don’t want to use Event Sourcing. That is, your Domain Events are an adjunct to your Aggregate state, not its left fold. Hopefully I don’t have to explain the problems that would occur if we successfully saved an Aggregate’s state to MongoDB, but failed to append a new Domain Event to the same storage. That would simply make the state of the application completely wrong, and no doubt would lead to inconsistencies in dependent parts of our own Domain Model and/or those in one or more other Bounded Contexts.

Rumor has it that MongoDB will at some future time support ACID transactions. In fact there is now a branch of MongoDB that supports ACID transactions. It’s the TokuMX project. Although you may personally feel comfortable using this product, it didn’t excite me. Frankly, it could be a huge challenge to get a given enterprise to support MongoDB in the first place, let alone trying to convince every stakeholder to support a branch of MongoDB that is delivered by a lesser known third party. It seems to me that the best chance to use MongoDB with ACID transactions in your project is when you can finally download it from MongoDB.org.

For me this meant looking elsewhere, and boy, I am glad I did. I believe that I have found the truly ideal DDD Aggregate store in PostgreSQL 9.4. Here are the main reasons why I think this is so:

  • PostgreSQL 9.4 supports both text-based JSON (json datatype) and binary JSON (jsonb datatype). The binary JSON type is a higher performing datatype than the text-based datatype.
  • You can query directly against the JSON, and create indexes on specific JSON object fields/attributes (see the example just after this list).
  • PostgreSQL is, of course, a relational database and supports ACID transactions.
  • PostgreSQL is a very mature open source product and comes with support tools such as the Postgres Enterprise Manager and the like.
  • You can get both community and commercial support for PostgreSQL, and you have a choice among multiple support vendors.
  • PostgreSQL is fast. I mean, PostgreSQL is seriously fast. In benchmarks around version 9.4, PostgreSQL can perform database writes at or near 14,000 transactions per second. You will be hard pressed to find many projects that need to perform anywhere near that fast or faster. I don’t have the comparison benchmarks handy, but I believe that is significantly faster than MongoDB (without ACID transactions). In my experience, PostgreSQL 9.4 (and later versions) could address the performance needs of something like 97% of all enterprise projects globally. Of course your mileage may vary, but I regularly poll developers for performance numbers. The majority need (far) less than 1,000 transactions per second, and only a few require anywhere near 10,000 transactions per second.
  • Using PostgreSQL’s JSON support is just plain easy.
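
For example, here is a minimal sketch of an expression index on a single JSON field, assuming the tbl_products table and data column shown later in this post (the index name is hypothetical):

create index idx_products_tenant_id
    on tbl_products ((data->'tenantId'->>'id'));

With such an index in place, queries that filter on the tenant identity inside the JSON can avoid a full table scan.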

What I will do next is step through how easy it is to use PostgreSQL to create DDD Aggregate storage.

Developing a PostgreSQL JSON Repository

If you are familiar with my book, Implementing Domain-Driven Design, you recall the Core Domain named the Agile Project Management Context. In that Bounded Context we model a project management application for Scrum-based Products. A Product is an Entity that serves as the Root of the Aggregate:

public class Product extends Entity {

    private Set<ProductBacklogItem> backlogItems;
    private String description;
    private ProductDiscussion discussion;
    private String discussionInitiationId;
    private String name;
    private ProductId productId;
    private ProductOwnerId productOwnerId;
    private TenantId tenantId;
    ...
}

I am going to create a Repository to persist Product instances and find them again. Let’s first take a look at the basic means for persisting Product instances, and then we will look at querying for them. Here is the Repository declaration and the methods used to save and remove Product instances:

public class PostgreSQLJSONProductRepository
   extends AbstractPostgreSQLJSONRepository
   implements ProductRepository {
   ...
   @Override
   public ProductId nextIdentity() {
      return new ProductId(UUID.randomUUID().toString().toUpperCase());
   }
   ...
   @Override
   public void remove(Product aProduct) {
      this.deleteJSON(aProduct);
   }

   @Override
   public void removeAll(Collection<Product> aProductCollection) {
      this.deleteJSON(aProductCollection);
   }

   @Override
   public void save(Product aProduct) {
      this.saveAsJSON(aProduct);
   }

   @Override
   public void saveAll(Collection<Product> aProductCollection) {
      this.saveAsJSON(aProductCollection);
   }
   ...
}

That’s pretty simple. The bulk of the work is in the abstract base class, AbstractPostgreSQLJSONRepository. The only method that must be overridden and implemented by the concrete sub-class is tableName(), which allows the abstract base class to know the name of the table in which the concrete type is stored:

public class PostgreSQLJSONProductRepository
      extends AbstractPostgreSQLJSONRepository
      implements ProductRepository {
   ...
   @Override
   protected String tableName() {
      return "tbl_products";
   }
   ...
}

Let’s take a look inside that base class:

public abstract class AbstractPostgreSQLJSONRepository {

   private ObjectSerializer serializer;
   ...
   protected AbstractPostgreSQLJSONRepository() {
      super();

      this.serializer = ObjectSerializer.instance();
   }

   protected void close(ResultSet aResultSet) {
      if (aResultSet != null) {
         try {
            aResultSet.close();
         } catch (Exception e) {
            // ignore
         }
      }
   }
	
   protected void close(Statement aStatement) {
      if (aStatement != null) {
         try {
            aStatement.close();
         } catch (Exception e) {
            // ignore
         }
      }
   }
	
   protected Connection connection() throws SQLException {
      Connection connection =
            PostgreSQLPooledConnectionProvider
                  .instance()
                  .connection();

      return connection;
   }

   protected void deleteJSON(Identifiable<Long> anAggregateRoot) {
      try {
         Connection connection = this.connection();

         this.deleteJSON(connection, anAggregateRoot);

      } catch (Exception e) {
         throw new RuntimeException("Cannot delete: " + anAggregateRoot + " because: " + e.getMessage());
      }
   }

   protected void deleteJSON(
         Collection<? extends Identifiable<Long>> anAggregateRoots) {
		
         try {
            Connection connection = this.connection();

            for (Identifiable<Long> root : anAggregateRoots) {
               this.deleteJSON(connection, root);
            }

         } catch (Exception e) {
            throw new RuntimeException("Cannot delete: " + anAggregateRoots + " because: " + e.getMessage());
         }
   }
	
   protected <T extends Object> T deserialize(String aSerialization, final Class<T> aType) {
      return this.serializer.deserialize(aSerialization, aType);
   }

   ...

   protected String serialize(Object anAggregate) {
      return this.serializer.serialize(anAggregate);
   }

   protected abstract String tableName();

   protected void saveAsJSON(Identifiable<Long> anAggregateRoot) {
      if (anAggregateRoot.isUnidentified()) {
         this.insertAsJSON(anAggregateRoot);
      } else {
         this.updateAsJSON(anAggregateRoot);
      }
   }
	
   protected void saveAsJSON(Collection<? extends Identifiable<Long>> anAggregateRoots) {
      try {
         Connection connection = this.connection();

         for (Identifiable<Long> aggregateRoot : anAggregateRoots) {
            if (aggregateRoot.isUnidentified()) {
               this.insertAsJSON(connection, aggregateRoot);
            } else {
               this.updateAsJSON(connection, aggregateRoot);
            }
         }
	        
      } catch (Exception e) {
         throw new RuntimeException("Cannot save: " + anAggregateRoots + " because: " + e.getMessage());
      }
   }

   private void deleteJSON(
         Connection aConnection,
         Identifiable<Long> anAggregateRoot)
   throws SQLException {
		
      PreparedStatement statement = null;
		
      try {
         statement = aConnection.prepareStatement(
               "delete from "
               + this.tableName()
               + " where id = ?");

         statement.setLong(1, anAggregateRoot.identity());
         statement.executeUpdate(); 

      } finally {
         this.close(statement);
      }
   }

   private void insertAsJSON(Identifiable<Long> anAggregateRoot) {
      try {
         Connection connection = this.connection();

         this.insertAsJSON(connection, anAggregateRoot);

      } catch (Exception e) {
         throw new RuntimeException("Cannot save: " + anAggregateRoot + " because: " + e.getMessage());
      }
   }

   private void insertAsJSON(
         Connection aConnection,
         Identifiable<Long> anAggregateRoot)
   throws Exception {

      PreparedStatement statement = null;

      try {
         String json = this.serialize(anAggregateRoot);
			
         PGobject jsonObject = new PGobject();
         jsonObject.setType("json");
         jsonObject.setValue(json);

         statement = aConnection.prepareStatement(
               "insert into "
               + this.tableName()
               + " (data) values (?)");

         statement.setObject(1, jsonObject); 
         statement.executeUpdate(); 

      } finally {
         this.close(statement);
      }
   }

   private void updateAsJSON(Identifiable<Long> anAggregateRoot) {
      try {
         Connection connection = this.connection();

         this.updateAsJSON(connection, anAggregateRoot);

      } catch (Exception e) {
         throw new RuntimeException("Cannot update: " + anAggregateRoot + " because: " + e.getMessage());
      }
   }
	
   private void updateAsJSON(
         Connection aConnection,
         Identifiable<Long> anAggregateRoot)
   throws SQLException {

      PreparedStatement statement = null;

      try {
         String json = this.serialize(anAggregateRoot);

         PGobject jsonObject = new PGobject();
         jsonObject.setType("json");
         jsonObject.setValue(json);

         statement = aConnection.prepareStatement(
               "update "
               + this.tableName()
               + " set data = ?"
               + " where id = ?");

         statement.setObject(1, jsonObject);
         statement.setLong(2, anAggregateRoot.identity());
         statement.executeUpdate();

      } finally {
         this.close(statement);
      }
   }
}

Here are the highlights from the abstract base class with regard to saving and removing Aggregates to and from the store:

  • We use an ObjectSerializer to serialize Aggregate instances to JSON, and to deserialize them from JSON back to their Aggregate instance state. This ObjectSerializer is the same one I used in my book, which is based on the Google Gson parser. The biggest reason I use this JSON parser is that it works by introspection and reflection on object fields rather than requiring objects to support the JavaBean specification (yuk!).
  • There are special methods that help close ResultSet and PreparedStatement instances.
  • Each Repository gets a JDBC Connection to the database using PostgreSQLPooledConnectionProvider. All of the operations are simple, lightweight JDBC operations. As indicated by its name, the PostgreSQLPooledConnectionProvider provides pooled Connections that are thread bound using ThreadStatic.
  • You can delete and insert one or many Aggregate instances in one operation. This supports remove(), removeAll(), save(), and saveAll() in the concrete sub-classes.
  • All communication via JDBC uses the PGobject type to carry the JSON payload to and from the database. The PGobject type in this code is “json” and the value is a JSON String object. You can easily switch the code to the more efficient “jsonb” type, as shown in the sketch just after this list.
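
Switching to binary JSON is a small, two-part change. Here is a sketch, under the assumption that the data column is also redeclared as jsonb in the table definition:

   // in insertAsJSON() and updateAsJSON(); assumes the data column
   // is declared jsonb rather than json in the DDL
   PGobject jsonObject = new PGobject();
   jsonObject.setType("jsonb");
   jsonObject.setValue(json);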

Note another detail. All Aggregate Root Entities are passed into the abstract base class as Identifiable instances. This enables the base class Repository to determine whether the instances have already been saved to the data store on prior operations, or if this is the first time. For first time persistence the Repository uses an INSERT operation. For subsequent saves after having read the Aggregate instances from the store the operation will be an UPDATE. The Entity type in the Agile Project Management code base implements the Identifiable interface:

public interface Identifiable<T> {
   public T identity();
   public void identity(T aValue);
   public boolean isIdentified();
   public boolean isUnidentified();
}

public abstract class Entity implements Identifiable<Long> {
    ...
    private Long surrogateIdentity;

    public Entity() {
        super();

        this.identity(0L);
    }
    ...
    @Override
    public Long identity() {
       return this.surrogateIdentity == null ? 0L : this.surrogateIdentity;
    }

    @Override
    public void identity(Long aValue) {
       this.surrogateIdentity = aValue;
    }

    @Override
    public boolean isIdentified() {
       return identity() > 0;
    }

    @Override
    public boolean isUnidentified() {
       return identity() <= 0;
    }
    ...
}

Supporting this interface enables the various saveAsJSON() methods to interrogate each Aggregate instance for its surrogate identity. If the surrogate identity is not yet set, it knows that the Aggregate instance is new and must be inserted. If the surrogate identity is set, the Repository knows that it is a preexisting instance that must be updated to the data store. The surrogate identity is stored as the row’s primary key in the table.


Follow the Aggregate Rule of Thumb: Reference Other Aggregates By Identity Only

Following this rule is very important as it makes your Aggregate instance simple to serialize. If instead you use a graph of Aggregate instances, don’t expect fabulous things from the JSON serializer.


Speaking of the database, here is the simple SQL script used to create the database and tables used by the solution:

drop database if exists agilepm;
create database agilepm owner postgres;

create table tbl_events
(
    id             bigserial primary key,
    data           json not null
);

create table tbl_publishednotificationtracker
(
    id             bigserial primary key,
    data           json not null
);

create table tbl_timeconstrainedprocesstrackers
(
    id             bigserial primary key,
    data           json not null
);

create table tbl_backlogitems
(
    id             bigserial primary key,
    data           json not null
);

create table tbl_productowners
(
    id             bigserial primary key,
    data           json not null
);

create table tbl_products
(
    id             bigserial primary key,
    data           json not null
);

create table tbl_releases
(
    id             bigserial primary key,
    data           json not null
);

create table tbl_sprints
(
    id             bigserial primary key,
    data           json not null
);

create table tbl_teammembers
(
    id             bigserial primary key,
    data           json not null
);

create table tbl_teams
(
    id             bigserial primary key,
    data           json not null
);

As you can see, these are all very simple tables. The JSON is stored in the column named data. The bigserial column type is a bigint (8 bytes) backed by a sequence. As you insert new rows into one of the tables, its sequence is used to auto-increment the primary key. Note that tbl_events, which holds each Domain Event published by the Bounded Context (see Chapter 8 of my book), has a primary key also. This serial bigint primary key serves as the unique notification identity for messaging notifications that are published inside and outside the Bounded Context.
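
To see the sequence at work, here is a hypothetical insert; the returning clause hands back the auto-generated surrogate key, which the finder methods shown later read and assign as the Aggregate’s Identifiable identity:

insert into tbl_products (data)
    values ('{"name": "Test Product"}')
    returning id;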

Finally let’s take a look at how Aggregate instances stored as JSON inside the database are found. Note that we will be querying inside the data column of each database table. We use simple -> and ->> notation to navigate from data down into each JSON object. For example, here are the three finder methods found in the Repository for Products, the PostgreSQLJSONProductRepository:

public class PostgreSQLJSONProductRepository
      extends AbstractPostgreSQLJSONRepository
      implements ProductRepository {
   ...
   @Override
   public Collection<Product> allProductsOfTenant(TenantId aTenantId) {
      String filter = "data->'tenantId'->>'id' = ?";

      return this.findAll(Product.class, filter, "", aTenantId.id());
   }

   @Override
   public Product productOfDiscussionInitiationId(
         TenantId aTenantId,
         String aDiscussionInitiationId) {

      String filter = "data->'tenantId'->>'id' = ? and data->>'discussionInitiationId' = ?";

      return this.findExact(Product.class, filter, aTenantId.id(), aDiscussionInitiationId);
   }

   @Override
   public Product productOfId(TenantId aTenantId, ProductId aProductId) {
      String filter = "data->'tenantId'->>'id' = ? and data->'productId'->>'id' = ?";

      return this.findExact(Product.class, filter, aTenantId.id(), aProductId.id());
   }
   ...
}

From the data column we filter using a WHERE clause. The full SELECT statement is found in the abstract base class, which we will examine in a moment. To keep the finder interfaces very simple I only require the client Repository to provide the actual matching parts, as seen in the code snippet above. There are several tokens in each filter. The data token refers to the data column in the given row. The other tokens such as ‘tenantId’, ‘id’, and ‘productId’ are the JSON field names. So, to match on the tenant identity in the JSON you use data->’tenantId’->>’id’ = ? as part of the WHERE clause. Note that -> navigates to a nested JSON object, while ->> extracts the final target field as text.
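
For example, the filter used by productOfId() ends up in a complete statement shaped like this (the identity values here are hypothetical; the actual code binds them as JDBC parameters):

select id, data
  from tbl_products
 where data->'tenantId'->>'id' = 'T-12345'
   and data->'productId'->>'id' = 'P-67890'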

You can findAll() or findExact(), which find a Collection of a specific type or find a single instance of a specific type, respectively:

public abstract class AbstractPostgreSQLJSONRepository {
   ...
   protected <T extends Identifiable<Long>> List<T> findAll(
         Class<T> aType,
         String aFilterExpression,
         String anOrderBy,
         Object ... anArguments) {

      List<T> aggregates = new ArrayList<T>();
      PreparedStatement statement = null;
      ResultSet result = null;

      String query =
            "select id, data from "
            + this.tableName()
            + " where "
            + aFilterExpression
            + " "
            + anOrderBy;

      try {
         Connection connection = this.connection();

         statement = connection.prepareStatement(query);

         this.setStatementArguments(statement, anArguments);

         result = statement.executeQuery();

         while (result.next()) {
            Long identity = result.getLong(1);

            String serialized = result.getObject(2).toString();
            	
            T aggregate = this.deserialize(serialized, aType);
            	
            aggregate.identity(identity);

            aggregates.add(aggregate);
         }

      } catch (Exception e) {
         throw new RuntimeException("Cannot find: " + query + " because: " + e.getMessage());
      } finally {
         this.close(statement);
         this.close(result);
      }

      return aggregates;
   }
	
   protected <T extends Identifiable<Long>> T findExact(
         Class<T> aType,
         String aFilterExpression,
         Object ... anArguments) {

      T aggregate = null;

      List<T> aggregates = this.findAll(aType, aFilterExpression, "", anArguments);

      if (!aggregates.isEmpty()) {
         aggregate = aggregates.get(0);
      }

      return aggregate;
   }
   ...
   private void setStatementArguments(
         PreparedStatement aStatement,
         Object[] anArguments)
   throws SQLException {

      for (int idx = 0; idx < anArguments.length; ++idx) {
         Object argument = anArguments[idx];
         Class<?> argumentType = argument.getClass();

         if (argumentType == String.class) {
            aStatement.setString(idx+1, (String) argument);
         } else if (argumentType == Integer.class) {
            aStatement.setInt(idx+1, (Integer) argument);
         } else if (argumentType == Long.class) {
            aStatement.setLong(idx+1, (Long) argument);
         } else if (argumentType == Boolean.class) {
            aStatement.setBoolean(idx+1, (Boolean) argument);
         } else if (argumentType == Date.class) {
            java.sql.Date sqlDate = new java.sql.Date(((Date) argument).getTime());
            aStatement.setDate(idx+1, sqlDate);
         } else if (argumentType == Double.class) {
            aStatement.setDouble(idx+1, (Double) argument);
         } else if (argumentType == Float.class) {
            aStatement.setFloat(idx+1, (Float) argument);
         }
      }
   }
   ...
}

The backbone of the finders is implemented in findAll(), which findExact() reuses. Note that when the ResultSet is obtained we iterate over each entry. Using findAll() you can both filter and order the outcome by a specific column or JSON field.

We obtain both the surrogate identity and the JSON serialization payload. Once the JSON is used to deserialize to the Aggregate instance, we set the surrogate identity as the identity of the Identifiable. This prepares the Aggregate instance for updating should the client decide to modify the instance and call save() on the Product Repository.
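
As a hypothetical variation, ordering all of a tenant’s Products by name requires nothing more than passing an order-by expression as the third argument of findAll():

@Override
public Collection<Product> allProductsOfTenant(TenantId aTenantId) {
   String filter = "data->'tenantId'->>'id' = ?";

   String orderBy = "order by data->>'name'";

   return this.findAll(Product.class, filter, orderBy, aTenantId.id());
}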

Well, that’s pretty much it. Every concrete Repository implemented using the AbstractPostgreSQLJSONRepository is very simple and straightforward. I intend to push the implementation to its GitHub repository as soon as possible. That should give you everything you need to implement this in your own project.

An Approach to Composing Aggregate Boundaries

For complete coverage of this topic, you should see my book: Domain-Driven Design Distilled

Modeling Aggregates with DDD and Entity Framework

For everyone who has read my book and/or Effective Aggregate Design, but has been left wondering how to implement Aggregates with Domain-Driven Design (DDD) on the .NET platform using C# and Entity Framework, this post is for you.

[NOTE: As expected, this article has within hours of posting received some criticism for the approach used to O-R mapping with Entity Framework. Actually the article received much more praise than criticism, but… I want to just point out that I am purposely not attempting to win any guru award in Entity Framework mapping. If you browse through this post too quickly some of the key words of wisdom and my intent may be lost on your speed reading. I am purposely avoiding some of the expert guidance that is typically given with a view to deep understanding of Entity Framework mappings. In fact, you may not realize the purpose of the article unless you begin reading with the assumed attitude that “I hate O-R mapping.” The O-R mapping tooling is actually something like 20+ years old, and it is time that we come up with more practical solutions to storing objects as objects. In the meantime we should just do as little O-R mapping as we can get away with. So, thanks for your words of advice, but I have done everything below with precise intent.]

Definition of Aggregate

To start off, let’s recap the basic definition of DDD Aggregate. First and foremost the Aggregate pattern is about transactional consistency. At the end of a committed database transaction, a single Aggregate should be completely up to date. That means that any business rules regarding data consistency must be met and the persistence store should hold that consistent state, leaving the Aggregate correct and ready to use by the next use case. Figure 1 illustrates two such consistency boundaries, with two different Aggregates.

Figure 1. Two Aggregates, which represent two transactional consistency boundaries.

The problem that many have with designing Aggregates is that they don’t consider the true business constraints that require data to be transactionally consistent and instead design Aggregates in large clusters as shown in Figure 2. Designing Aggregates in this way is a big mistake if you expect them (1) to be used by many thousands of users, (2) to perform well, and (3) to scale to the demands of the Internet.

Figure 2. A poorly designed Aggregate that is not conceived according to true business consistency constraints.

Using an example from my book, a set of well-designed Aggregates are shown in Figure 3. These are based on true business rules that require specific data to be up-to-date at the end of a successful database transaction. These follow the rules of Aggregate, including designing small Aggregates.

Figure 3. Some well-designed Aggregates that adhere to true consistency rules.

Still, the question arises: if BacklogItem and Product have some data dependencies, how do we update both of them? This points to another rule of Aggregate design: use eventual consistency, as shown in Figure 4. Of course, there’s a bit more involved when you consider the overall architecture, but the foregoing points out the high-level composition guidance of Aggregate design.

Figure 4. When two or more Aggregates have at least some dependencies on updates, use eventual consistency.

Now with this brief refresher on the basics of Aggregate design, let’s see how we might map the Product to a database using Entity Framework.

KISS with Entity Framework

So, we have four prominent Aggregates in our Scrum project management application: Product, BacklogItem, Release, and Sprint. We need to persist the state of these four small Aggregates and we want to use Entity Framework to do so. Here’s a possible surprise for you. I am not going to recommend that you need to become an Entity Framework guru. Nope, just the opposite in fact. I am going to suggest that you allow the Entity Framework development team to be the gurus, and you just focus on your specific application. After all, your Core Domain is where you want to put your creative energies, not in becoming an expert in Entity Framework.

What I am recommending is that you allow Entity Framework to take control of doing what it does best and we just stay out of its way. Entity Framework has a certain way of mapping entities into the database, and that’s just how it works. As soon as you try to step outside the basics and go to some extremes of esoteric mapping techniques in ways that Entity Framework was not meant to be used, you are going to experience a lot of pain. Still, we can get quite a bit of mileage out of Entity Framework in the midst of DDD and be quite happy with the way it all works out. To do so we are going to use just a few basic mapping techniques. If you follow my KISS guidance you can mostly ignore your Entity Framework documentation and how-to books. Just allow Entity Framework to map entities and get back to what will make a difference in this competitive world: your market-distinguishing application.

We are going to implement the Product Aggregate using two approaches. One approach uses a Separated Interface with an implementation class, and the other uses a domain object backed by a state object. The whole point of these examples is to stay as far out of Entity Framework’s way as possible.

Using a Separated Interface and Implementation Class

For the first example I create a Separated Interface that is implemented by a concrete domain object. Figure 5 shows you the basic intention of this approach.

Figure 5. The Separated Interface named IProduct is implemented by a concrete domain object. Clients directly use only IProduct.

It is pretty typical when programming with C# and .NET to name your interfaces with an “I” prefix, so we will use IProduct:

interface IProduct
{
  ICollection<IBacklogItem> AllBacklogItems();
  IProductBacklogItem BacklogItem(BacklogItemId backlogItemId);
  string Description { get; }
  string Name { get; }
  IBacklogItem PlanBacklogItem(BacklogItemId newBacklogItemId, string summary,
      string story, string category, BacklogItemType type, StoryPoints storyPoints);
  void PlannedProductBacklogItem(IBacklogItem backlogItem);
  ...
  ProductId ProductId { get; }
  ProductOwnerId ProductOwnerId { get; }
  void ReorderFrom(BacklogItemId id, int ordering);
  TenantId TenantId { get; }
}

With this interface we can create a concrete implementation class. Let’s call it Product:

public class Product : IProduct
{
  [Key]
  public string ProductKey { get; set; }
  ...
}

The point of the concrete class Product is to implement the business interface declared by IProduct and to also provide the accessors that are needed by Entity Framework to map the object into and out of the database. Note the ProductKey property. This is technically the kind of primary key that Entity Framework wants to work with. However, it is different from the ProductId, which when combined with the TenantId is the business identity. Therefore, internally the ProductKey must be set to a composite of TenantId as a string and ProductId as a string:

ProductKey = TenantId.Id + ":" + ProductId.Id;

I think you get the idea. We create an interface that we want our client to see and we hide the implementation details inside the implementing class. We make the implementation match up to really basic Entity Framework mappings. We purposely try to keep our special mappings, as with ProductKey, to a minimum. This helps keep the DbContext very simple by registering the implementation classes:

public class AgilePMContext : DbContext
{
  public DbSet<Product> Products { get; set; }
  public DbSet<ProductBacklogItem> ProductBacklogItems { get; set; }
  public DbSet<BacklogItem> BacklogItems { get; set; }
  public DbSet<Task> Tasks { get; set; }
  ...
}

Rather than fully fleshing out the details of this approach, I will stop here, because there is already enough detail to make some judgments. I’d like to discuss the fundamental flaws that I see in it:

  1. The Ubiquitous Language is not really reinforced by using interfaces such as IProduct, IBacklogItem, etc. IProduct and IBacklogItem are not in our Ubiquitous Language, but Product and BacklogItem are. Thus, the client facing names should be Product, BacklogItem, and the like. We could accomplish this simply by naming the interfaces Product, BacklogItem, Release, and Sprint, but that would mean we would have to come up with sensible names for the implementation classes. Let’s just pause there and move on to the second and related issue.
  2. There is really no good reason to create a Separated Interface. It would be very unlikely that we would ever create two or more implementations of IProduct or any of the other interfaces. The best reason we have for creating a Separated Interface is when there could be or are multiple implementations, and that is just not going to happen in this Core Domain.

Based on these two points alone I would personally choose to abandon this approach before going any further with it. When using Domain-Driven Design the most important and overarching principle is to adhere to the Ubiquitous Language, and from the get-go this approach is driving us away from business terminology rather than toward it.

Domain Object Backed By a State Object

The second approach uses a domain object backed by state objects. As shown in Figure 6, the domain object defines and implements the domain-driven model using the Ubiquitous Language, and the state objects hold the state of the Aggregate.

Figure 6. The domain object that models the Aggregate behavior is backed by a state object that holds the model’s state.

By keeping state objects separate from the domain-driven implementation objects, we enable very simple mappings. We let Entity Framework do what it knows how to do by default to map entities to and from the database. Consider Product, which is backed by the ProductState object. We have two Product constructors: a public business constructor for normal clients and a second internal constructor that is used only by internal implementation components:

public class Product
{
  public Product(
      TenantId tenantId,
      ProductId productId,
      ProductOwnerId productOwnerId,
      string name,
      string description)
  {
    State = new ProductState();
    State.ProductKey = tenantId.Id + ":" + productId.Id;
    State.ProductOwnerId = productOwnerId;
    State.Name = name;
    State.Description = description;
    State.BacklogItems = new List<ProductBacklogItem>();
  }

  internal Product(ProductState state)
  {
    State = state;
  }
  ...
}

When the business constructor is invoked we create a new ProductState object and initialize it. The state object has a simple string-based identity:

public class ProductState
{
  [Key]
  public string ProductKey { get; set; }

  public ProductOwnerId ProductOwnerId { get; set; }

  public string Name { get; set; }

  public string Description { get; set; }

  public List<ProductBacklogItemState> BacklogItems { get; set; }
  ...
}

The ProductKey is actually encoded with two properties, the TenantId as a string and the ProductId as a string, with the two separated by a ‘:’ character. Including the TenantId in the ProductKey ensures that all data stored in the database is segregated by tenant. We must still support client requests for TenantId and ProductId from the Product:

public class Product
{
  ...
  public ProductId ProductId { get { return new ProductId(State.DecodeProductId()); } }
  ...
  public TenantId TenantId { get { return new TenantId(State.DecodeTenantId()); } }
  ...
}
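
To support those getters, the ProductState object must implement DecodeTenantId() and DecodeProductId(). Here is a minimal sketch, assuming only the ':' separator used when the ProductKey was composed:

public class ProductState
{
  ...
  public string DecodeTenantId()
  {
    // the ProductKey was composed as TenantId.Id + ":" + ProductId.Id
    return ProductKey.Substring(0, ProductKey.IndexOf(':'));
  }

  public string DecodeProductId()
  {
    return ProductKey.Substring(ProductKey.IndexOf(':') + 1);
  }
  ...
}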

We could also choose to design the state object to redundantly hold the whole identities separate from the ProductKey:

public class ProductState
{
  [Key]
  public string ProductKey { get; set; }

  public ProductId ProductId { get; set; }

  public ProductOwnerId ProductOwnerId { get; set; }

  public string Name { get; set; }

  public string Description { get; set; }

  public List<ProductBacklogItemState> BacklogItems { get; set; }

  public TenantId TenantId { get; set; }
  ...
}

This could be well worth the slight memory overhead if converting to identities had a heavy performance footprint. All of the identity types, including ProductOwnerId, are Value Objects and are flattened and mapped into the same database row that ProductState occupies:

[ComplexType]
public class ProductOwnerId : Identity
{
  public ProductOwnerId()
      : base()
  {
  }

  public ProductOwnerId(string id)
      : base(id)
  {
  }
}

The [ComplexType] attribute marks the Value Object as a complex type, which is different from an entity. Complex types are non-scalar values that do not have keys and cannot be managed apart from their containing entity, or the complex type within which they are nested. Marking a Value Object with the Entity Framework [ComplexType] causes the data of the Value Object to be saved to the same database row as the entity. In this case, ProductOwnerId would be saved to the same database row as the ProductState entity.

Here are the base types for all Identity types of Value Objects:

public abstract class Identity : IEquatable<Identity>, IIdentity
{
  public Identity()
  {
    this.Id = Guid.NewGuid().ToString();
  }

  public Identity(string id)
  {
    this.Id = id;
  }

  public string Id { get; set; }

  public bool Equals(Identity id)
  {
    if (object.ReferenceEquals(this, id)) return true;
    if (object.ReferenceEquals(null, id)) return false;
    return this.Id.Equals(id.Id);
  }

  public override bool Equals(object anotherObject)
  {
    return Equals(anotherObject as Identity);
  }

  public override int GetHashCode()
  {
    return (this.GetType().GetHashCode() * 907) + this.Id.GetHashCode();
  }

  public override string ToString()
  {
    return this.GetType().Name + " [Id=" + Id + "]";
  }
}

public interface IIdentity
{
  string Id { get; set; }
}

So, the ProductState object stands on its own when it comes to persisting the state of the Product. However, the ProductState also holds another collection of entities; that is, the List of ProductBacklogItemState:

public class ProductState
{
  [Key]
  public string ProductKey { get; set; }
  ...
  public List<ProductBacklogItemState> BacklogItems { get; set; }
  ...
}

This is all well and good because we keep the database mappings really simple. Yet, how do we get a ProductBacklogItemState object, or the entire List collection for that matter, into a format that we can allow clients to consume? The ProductBacklogItemState is an internal implementation detail, just a data holder. This points to the need for a few simple converters, which are used by the Product Aggregate root:

public class Product
{
  ...
  public ICollection<ProductBacklogItem> AllBacklogItems()
  {
    List<ProductBacklogItem> all =
        State.BacklogItems.ConvertAll(
            new Converter<ProductBacklogItemState, ProductBacklogItem>(
                ProductBacklogItemState.ToProductBacklogItem));

    return new ReadOnlyCollection<ProductBacklogItem>(all);
  }

  public ProductBacklogItem BacklogItem(BacklogItemId backlogItemId)
  {
    ProductBacklogItemState state =
        State.BacklogItems.FirstOrDefault(
            x => x.BacklogItemKey.Equals(backlogItemId.Id));

    return new ProductBacklogItem(state);
  }
  ...
}

Here we convert a collection of ProductBacklogItemState instances to a collection of ProductBacklogItem instances. And when the client requests just one ProductBacklogItem, we convert to one from a single ProductBacklogItemState with the matching identity. The ProductBacklogItemState object must only support a few simple conversion methods:

public class ProductBacklogItemState
{
  [Key]
  public string BacklogItemKey { get; set; }
  ...
  public ProductBacklogItem ToProductBacklogItem()
  {
    return new ProductBacklogItem(this);
  }

  public static ProductBacklogItem ToProductBacklogItem(
        ProductBacklogItemState state)
  {
    return new ProductBacklogItem(state);
  }
  ...
}

Should the client ask repeatedly for a collection of ProductBacklogItem instances the Product could cache the collection after the first time it is generated.
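
Here is a minimal sketch of that caching, assuming a hypothetical cachedBacklogItems field:

public class Product
{
  ...
  private ICollection<ProductBacklogItem> cachedBacklogItems;

  public ICollection<ProductBacklogItem> AllBacklogItems()
  {
    if (cachedBacklogItems == null)
    {
      List<ProductBacklogItem> all =
          State.BacklogItems.ConvertAll(
              ProductBacklogItemState.ToProductBacklogItem);

      cachedBacklogItems = new ReadOnlyCollection<ProductBacklogItem>(all);
    }

    return cachedBacklogItems;
  }
  ...
}

Any behavior that adds or removes backlog items would also need to clear cachedBacklogItems so that the next request rebuilds it.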

In the end our goal is to stay out of the way of Entity Framework and make it super simple to map state objects in and out of the database. I think when you consider the DbContext for this solution you will conclude that we have a really simple approach:

public class AgilePMContext : DbContext
{
  public DbSet<ProductState> Products { get; set; }
  public DbSet<ProductBacklogItemState> ProductBacklogItems { get; set; }
  public DbSet<BacklogItemState> BacklogItems { get; set; }
  public DbSet<TaskState> Tasks { get; set; }
  public DbSet<ReleaseState> Releases { get; set; }
  public DbSet<ScheduledBacklogItemState> ScheduledBacklogItems { get; set; }
  public DbSet<SprintState> Sprints { get; set; }
  public DbSet<CommittedBacklogItemState> CommittedBacklogItems { get; set; }
  ...
}

Creating and using a ProductRepository is easy as well:

public interface ProductRepository
{
  void Add(Product product);

  Product ProductOfId(TenantId tenantId, ProductId productId);
}

public class EFProductRepository : ProductRepository
{
  private AgilePMContext context;

  public EFProductRepository(AgilePMContext context)
  {
    this.context = context;
  }

  public void Add(Product product)
  {
    try
    {
      context.Products.Add(product.State);
    }
    catch (Exception e)
    {
      Console.WriteLine("Add() Unexpected: " + e);
    }
  }

  public Product ProductOfId(TenantId tenantId, ProductId productId)
  {
    string key = tenantId.Id + ":" + productId.Id;
    var state = default(ProductState);

    try
    {
      state = (from p in context.Products
               where p.ProductKey == key
               select p).FirstOrDefault();
    }
    catch (Exception e)
    {
      Console.WriteLine("ProductOfId() Unexpected: " + e);
    }

    if (EqualityComparer<ProductState>.Default.Equals(state, default(ProductState)))
    {
      return null;
    }
    else
    {
      return new Product(state);
    }
  }
}

// Using the repository
using (var context = new AgilePMContext())
{
  ProductRepository productRepository = new EFProductRepository(context);

  var product =
        new Product(
              new TenantId(),
              new ProductId(),
              new ProductOwnerId(),
              "Test",
              "A test product.");

  productRepository.Add(product);

  context.SaveChanges();
  ...
  var foundProduct = productRepository.ProductOfId(product.TenantId, product.ProductId);
}

Taking this approach will help us to stay focused on what really counts the most, our Core Domain and its Ubiquitous Language.

Effective Aggregate Design

This is a three-part series about using Domain-Driven Design (DDD) to implement Aggregates. Clustering Entities and Value Objects into an Aggregate with a carefully crafted consistency boundary may at first seem like quick work, but among all DDD tactical guidance, this pattern is one of the least well understood. This essay is the basis for Chapter 10 of my book, Implementing Domain-Driven Design.

The documents are available for download as three PDFs and are licensed under the Creative Commons Attribution-NoDerivs 3.0 Unported License.

Original English Edition

Effective Aggregate Design: Part 1
Effective Aggregate Design: Part 2
Effective Aggregate Design: Part 3

French Translation

Conception Efficace des Aggregates 1 ere Partie

Naming “Shadow” Concepts Across Bounded Contexts

I was asked why the concept named Tenant is used in all three Bounded Contexts for the example code for my book, Implementing Domain-Driven Design. Is it because a Tenant is the exact same thing in all three Contexts? Actually this is a good example of when concepts in two or more Bounded Contexts have the same name but have different meanings and uses. Let’s consider how they differ.

In the Identity and Access Context a Tenant is an Aggregate that has a life cycle. It can be disabled and can be used to invite new registrations, and even carry out new registrations. If that Tenant is disabled, no users registered under it will be able to authenticate. (See the discussion in Chapter 5.) Here a Tenant also has a globally (SaaS wide) unique identity.

In the Collaboration Context there is a concept named Tenant, while in the Agile Project Management Context there is a TenantId. In both cases—Tenant and TenantId—these are modeled as Value Objects and used only for identity. One important reason for these identities is to ensure that only users within a specific tenancy (an organization that rents/hires use of the SaaS services and stores its related data) can use objects within that tenant.

Why do you think the two teams that respectively work on the Collaboration Context and the Agile Project Management Context didn’t use different names? After all, the teams need not be tied to the original name, Tenant, just because they hold the identity of the Tenant originating in the Identity and Access Context.

True, the teams could have chosen to use the name Subscriber or Company in the two consuming Contexts. They certainly could have done so since the Ubiquitous Language of each team is formed by the team. But what would be the merit in that when Tenant is clearly understood throughout the entire SaaS organization (SaaSOvation)? If new team members are added over the life of each project, it’s just one more detail to have to explain why the Tenant over there is represented as a Subscriber over here, and why our team thinks it is important to distinguish the concept as a Subscriber in our own Context.

In practice, using Subscriber or Company or some other name seems to have little or no justification. In Collaboration, for example, the team doesn’t care anything at all about subscriptions. Further, Company isn’t fitting either, because perhaps the Collaboration subscriber is a school and not a commercial organization. In the end, Tenant works just fine everywhere, and actually in this case everyone who works at SaaSOvation will know exactly what it means.

Still, this specific example in no way indicates that concepts in any given consuming Bounded Context that “shadow” a concept in another Bounded Context must reuse the name of the originating concept. The concept name, the identity, and the limited number of attributes consumed by the foreign Context must be carefully chosen by the team that is responsible for how the local concept is modeled.

It’s Not Just About Authorization, It’s About the Ubiquitous Language

Recently someone asked questions about the IDDD sample code and authorization. Basically the question was, why can’t I authorize the user in the model to start a new Forum Discussion? The developer would pass in a “session” object to the Forum startDiscussion() and have the Forum double dispatch to the “session” to check for authorization:

public class Forum ... {
    ...
    public Discussion startDiscussion(Session aSession, String anAuthorId, ...) {
        aSession.assertAuthorizedDiscussionAuthor(this.tenant(), anAuthorId);
        ...
    }
    ...
}

Take a look at IDDD pages 75-79, and specifically at the refactored code on pages 78-79. It’s not that it is absolutely wrong to in essence authorize the user in the model. Rather, it’s that the authorization is done for free when obtaining the necessary Author instance by way of the CollaboratorService.

Further, it’s a matter that the Author is an essential part of the Ubiquitous Language, while the “session” is not. If anything, rather than pass in a “session” object (or worse yet, SecuritySession), pass in CollaboratorService instead, and double dispatch on the CollaboratorService to get the Author from the Forum’s own Tenant and parameter String anAuthorId:

public class Forum ... {
    ...
     public Discussion startDiscussion(
        CollaboratorService aCollaboratorService,
        String anAuthorId,
        ...) {
        Author author = aCollaboratorService.authorFrom(this.tenant(), anAuthorId);
        if (author == null) {
            throw new IllegalArgumentException("Not a valid author.");
        }
        ...
    }
    ...
}

In my sample code I show the Forum Application Service obtaining the Author via CollaboratorService. This is not wrong because normally the API would serve as the natural client of the model. When using the Hexagonal (Ports and Adapters) architecture, requests from disparate user agent types would all be adapted to use a single API call. This is based on use case, not on user agent type. But you could pass in CollaboratorService to the Forum if you prefer.
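
For readers who want to see the shape of that Application Service, here is a rough sketch. This is not the actual IDDD sample code; the repository accessors and method names are assumptions:

public class ForumApplicationService {
    ...
    public Discussion startDiscussion(
            String aTenantId,
            String aForumId,
            String anAuthorId,
            String aSubject) {

        Tenant tenant = new Tenant(aTenantId);

        Forum forum =
            this.forumRepository().forumOfId(tenant, new ForumId(aForumId));

        // obtaining the Author authorizes the discussion author for free
        Author author =
            this.collaboratorService().authorFrom(tenant, anAuthorId);

        Discussion discussion = forum.startDiscussionFor(author, aSubject);

        this.discussionRepository().save(discussion);

        return discussion;
    }
    ...
}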

Implementing Domain-Driven Design Workshop In Colombia

Here I am on a Friday evening in Bogotá, Colombia. This week I taught my Implementing Domain-Driven Design Workshop here. We had a nice group of students, and as usual the class had their eyes opened to what it means to implement with DDD. This was a slightly condensed Workshop, being only two days rather than the normal three. Still, we were able to cover all the material as well as many of the Workshop problems and coding exercises. It is a lot of material to cover in two days, and most student brains were filled to the brim by the time we finished. All of the students were quite pleased after experiencing several strategic and tactical “ah ha!” moments along the way.

Naturally I got some good feedback that I am putting into the slides and code exercises. It was suggested I provide a basic set of components for both Java and .NET developers on which they can work their exercises. I think I will do this, but I also don’t want the next classes to be too heavily focused on architectural mechanisms, but to be more concerned with the resulting domain model and accompanying tests.

Colombian Startup Incubator and Col 3.0

After the class was completed I also participated in Colombian startup incubator initiatives. I performed assessments in both product and technical architecture, advised and mentored. For three of the top projects, we stepped through questions to identify their Core Domain (yes, DDD). I only used the term DDD with one of the teams, purposely avoiding it with the other two. I wanted the teams to focus only on the competitive advantages of their product offerings and not get distracted by the details of Domain-Driven Something-Or-Another.

It worked very well. One of the three teams was already quite advanced in their product vision, so there wasn’t a lot of Core model design still lacking. Yet, for the other two teams we dug out some missing Core Domain features. One of the incubator leads later told me that the third team was “ecstatic” with the results. They were missing a really important part of the Core that will influence subscribers to invest their intellectual capital in the service and remain subscribers.

Colombia is on a mission to innovate en masse, which has opened a path for many startups. Some of the incubator programs will soon be leading their teams to the US to obtain first-round funding. Apparently much of the initial seed money comes from industries involved in natural resources (oil, minerals, etc.), and represents around 6% of all such revenues. I am rather certain that at “only 6%” it is no small change. The funding, administered through the government, is not limited to startup seed capital. Actually, the Colombian government is setting up open office space along with high-end computing resources all over the country. All you have to do is show up and use the resources, which makes it very easy for Colombians to try their hand at innovation.

It is quite interesting to see this initiative, called Col 3.0, unfolding. As they say, “watch this space.”

This is not an entirely new approach to growth by innovation. For another perspective, see Singapore’s A*Star (Agency for Science, Technology, and Research).

Guaranteed Anemia with Dozer

Today I was inspired by Scott Hanselman to get my blogging act together. It’s been a while, maybe nine months or more. It’s way overdue.

I’ve been helping a colleague on a (currently confidential) project for the NYSE. The goal is to introduce Domain-Driven Design in some incremental steps. He posed a few questions today. The discussion starts with this background: “PROJECT_NAME has both service ‘domain’ objects and data ‘domain’ objects (mostly called the same names and they give it the domain namespace, not I). The transport mechanism delivers “service” objects from the UI and other services. When they are ready to be Hibernate persisted they get transformed via Dozer to a data ‘domain’ object. For instance, while some service is dealing with an order object, it is a service ‘order’ object. When the service throws it over the wall to the Repository, the service Dozers it to a data ‘order’ object and shovels it into the queue channel for delivery to the DAL.”

He continues: “If I understand you properly, as a rule of thumb for a rich domain model given this architectural world-view, we would generally replace setters on the ‘domain’ objects with business-type methods (what we’ve been doing for ages). But the Hibernated data ‘domain’ objects, well, I don’t see much need for them to have getters and setters either given that Hibernate no longer needs them. That is, given Dozer.”

In a nutshell, this is his question: “What is your world-view without Dozer? Just one domain model with Hibernate annotations also (given that the client is using annotations and not xml)? Or do you split the model as they do and just get-set? Or other?”

This is a pretty typical question among those unfamiliar with the DDD “world view.” So don’t feel shy about having similar quandaries. If you are familiar with DDD, what might your response be? Here’s what I had to say…

[Begin reply.]

You gave the right answer. It’s just dozing data around. So in the end, what really is the difference between the “service object” and the “domain object”? Probably not much difference. It’s just mapping attributes around. And Dozer covers over any business intent whatsoever. Everything is reduced to anemia. Like my Chapter 1 [of Implementing Domain-Driven Design] says, it’s “Anemia Everywhere.”

So two months from now, bring in someone who knows nothing about PROJECT_NAME and tell them to go explain to you what the business value is of a given use case. It will probably take them hours to explain it because they will have to start at the source of the data, in some other system, and trace it all around. In fact they may never be able to tell you exactly because there may be deep insight hidden in the data mappings that only a conversation with a specific group of desk traders could reveal.

However, if using DDD you would have one simple place in the model where you’d go and look at a single line of code and say: “Oh, that’s what’s going on here.” It’s because the model would capture exactly the intent — not of the developer — but of the specific group of desk traders who originally spec’d the system that they wanted.

That’s the difference. In the end you may choose not to use getters and/or setters of any kind, because as you state, Hibernate doesn’t need them.

If you really need to cover Chapter 1 again to understand that clearly, I’d highly recommend it; that’s where the real answer to your question is found. It’s not really Chapter 7 that reveals that. Domain Services are just one tactical tool to help you model in a specific situation. But it’s the Ubiquitous Language in an explicitly Bounded Context (an application or business service with a domain model) that addresses this.

[End reply.]

If my answer highlights the need to refresh your resolve to stick with the DDD Ubiquitous Language rather than technical solutions that usually lead to Anemia Everywhere, take a refresher here: Implementing Domain-Driven Design; Chapter 1, Getting Started with DDD.