What's an Entity?

An Entity is a fundamental concept for modeling a domain. It has a lifecycle, changes over time, and stays continuous through time thanks to its fixed identity.
In DDD, Entity is the second fundamental concept, alongside Value, used to define a domain model. An Entity represents a concept of the domain that has a lifecycle and whose controlled changes are a domain matter. An entity has an immutable identity and a mutable state represented by a set of values.
Definition

An Entity is a thing of significance about which the organization wishes to hold information. By "hold" information we mean controlling its lifecycle (from creation, through changes and consumption, until retirement). This may be a tangible thing like a product or a customer, or an intangible thing like a transaction or a role. More formally, an entity is a thing of significance for the organization that has a lifecycle: an identifier is therefore needed to provide a thread of continuity across its successive states, each composed of values.
An entity type is the definition of a set of entities. That is, an entity gets its attributes from the definition of its entity type. We are more concerned with entity types than with specific occurrences of entities, so in the rest of this article "entity" means "entity type". We will discuss the definitions of entities as kinds of things, and use the word "instance" to describe the individual examples of an entity.
An Entity is a thing defined by a thread of continuity thanks to an Identity, and whose state is described with a set of values that can change over time.
An entity instance has an identifier: to ensure that an entity instance can be reliably referred to during its whole lifetime, and thus has a thread of continuity, every Entity has an invariable Identity (or Identifier).
An entity instance has a state, represented by a set of Values, that changes over time. Those state changes are triggered by commands, reflecting the reality of the domain the entity represents. Every change to the entity means a change to the values that represent its state.
An entity instance can have relationships to other entity instances. Every relationship should be considered with care, though: is the related thing really an entity, or can it be modeled as a value? The question to ask ourselves is: do we care about the whole lifecycle of that thing, or just a snapshot in time? For instance, the Address of a shipment is only useful during the life of that shipment and is therefore just a value.
Example: a bank account is an entity. Its identity is the account number (e.g. IE12 BOFI 9000 0112 3456 78) and its state is the set of transactions, some metadata (opening date, etc.) and the reference to the customer, which is itself an entity. A bank account's transactions are a perfect fit for values as they are immutable by design (in double-entry accounting, an error in a transaction implies a compensation with a new transaction, not a change to the faulty transaction).
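As a rough illustration of the bank account example, here is a minimal Python sketch. All class and field names (`BankAccount`, `Transaction`, `customer_id`, `record`, `balance`) are assumptions made for this illustration. It shows an entity compared by its identity only, holding immutable values, and referring to another entity by its identifier.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    """Value: immutable and compared by its attributes."""
    booked_on: date
    amount: Decimal  # positive = credit, negative = debit
    label: str


@dataclass(eq=False)
class BankAccount:
    """Entity: fixed identity (the account number) and a state that changes."""
    account_number: str  # identity, never reassigned
    customer_id: str     # reference to another entity, by its identifier
    opened_on: date
    transactions: list[Transaction] = field(default_factory=list)

    def record(self, transaction: Transaction) -> None:
        # The state only grows: an error is compensated by a new transaction.
        self.transactions.append(transaction)

    def balance(self) -> Decimal:
        return sum((t.amount for t in self.transactions), Decimal("0"))

    def __eq__(self, other: object) -> bool:
        # Two instances are the same entity when their identities match.
        return isinstance(other, BankAccount) and other.account_number == self.account_number

    def __hash__(self) -> int:
        return hash(self.account_number)


account = BankAccount("IE12 BOFI 9000 0112 3456 78", "customer-42", date(2023, 1, 5))
account.record(Transaction(date(2023, 1, 6), Decimal("100.00"), "salary"))
account.record(Transaction(date(2023, 1, 7), Decimal("-30.00"), "debit booked in error"))
account.record(Transaction(date(2023, 1, 8), Decimal("30.00"), "compensation of the debit"))
```

The faulty debit is never edited: a compensating transaction is appended, and the account remains the same entity whatever its list of transactions contains.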

Modeling a domain with entities

When something is distinguished and referred to through its identity by the domain expert or users, rather than by its attributes, make this concept an entity in the domain model. Keep the definition simple and focus on the lifecycle and identity.
An entity has an identity and a state, so the following questions are worth asking ourselves:
  • Categories: what is it? In what categories do we perceive the thing to be? What categories do we acknowledge? How well defined are they?

  • Sameness: what does it mean to be the same thing? This question is particularly useful to detect duplicates when an entity is created.

  • Oneness: what is one thing? How do we distinguish between two different occurrences of the thing?

  • Lifecycle: what is the expected lifecycle of the entity? How does the entity change over time? Which past participles are used to describe its state?

Example with the training course domain

Identity and identifier

Identity is a means of distinguishing each entity instance regardless of its state or history.
An identity is a unique value that serves to define the uniqueness of an entity. An identity must be unique within a Bounded Context for a given entity type. An identity is not something intrinsic to a thing in the world; it’s a meaning superimposed on something because it’s useful to distinguish entities.
Identity is often significant outside the software system (e.g. bank account number, social welfare number, etc.).
An entity is assigned an identity at its creation, so it’s important to answer the question "what does it mean to be the same thing?" given the initial values used for creation, in order to avoid duplicates. The key point is to assess the real-world impact of duplicate entity instances. For example, in the context of an e-commerce system, the customer identifier can be the email address: whether a real person creates several accounts with different email addresses has no impact on the business afterwards.
Entity creation then involves several pieces of logic:
  • Does an entity instance already exist?

  • If not, what's the identifier generation logic? Is there a natural key, or can the entity be assigned a UUID? Concerning natural keys, I invite you to read this article about the reality of identifying things in real life. The conclusion is excellent:

Nature doesn’t come with identifiers. Humans give labels to things. We’re the ones who give things names and codes and symbols. They’re all arbitrary, unnatural, human inventions. When doing schema design, the question isn’t “should I use a natural key or a surrogate key?”. There are only your surrogate keys or someone else’s surrogate keys. The question you should be asking yourself is “do I really want to build this table around an unenforced foreign key?”.
Another quote I like very much is from Rich Hickey in the article about State on Clojure's website:
Identities are mental tools we use to superimpose continuity on a world which is constantly, functionally, creating new values of itself. By identity we mean a stable logical entity associated with a series of different values over time
Here are some best practices to consider when dealing with identifiers:
  • Use unique and immutable identifiers: ensure that each entity or aggregate instance has a unique identifier that remains constant throughout its lifecycle, particularly if the entity is exposed outside of the system. This allows for easy tracking and referencing of domain objects. Typically, GUIDs (Globally Unique Identifiers) or UUIDs (Universally Unique Identifiers) are used for this purpose. Better still, URNs can be used to identify entity instances, but this technique is not very widespread, apart from similar schemes such as AWS resource names (ARNs).

  • Be careful with meaningful identifiers: always question their origin and who's controlling their creation and assignment to an entity.

  • Encapsulate identifier generation: particularly if the generation relies on values given at the creation. This ensures consistency in identifier generation and keeps the domain model clean and cohesive.

  • Be careful when exposing internal identifiers: whenever you expose an identifier to the outside world, you introduce a coupling between your system and the systems that consume your data. Whenever possible, avoid exposing the internal identifiers of your entities or aggregates to external systems.

  • Favor values over primitive types for identifiers: consider using value objects to represent identifiers, as they help enforce identifier invariants and provide a more expressive domain model.

  • Enforce identifier uniqueness at the infrastructure level: While the domain model should ensure identifier uniqueness, it's also important to enforce this constraint at the infrastructure level (e.g., in your database or persistence layer). This provides an additional level of protection against duplicate identifiers and data integrity issues.

  • Keep identity separate from other attributes: Don't mix the identity of an entity or aggregate with its other attributes. The identity should be independent of the other attributes, which can change over time.

Example: a credit card and a bank account are perfect examples of entities, as both have an identifier that doesn't change (it's an invariant) during their whole lifetime.
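Several of these practices can be sketched together in a few lines of Python. The names `CustomerId` and `CustomerDirectory` are hypothetical, and the rule "same normalized email means same customer" is only an assumption borrowed from the e-commerce example. The sketch wraps the identifier in a value, encapsulates its generation, and checks for an existing instance before assigning an identity.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerId:
    """Identifier modeled as a value: immutable and validated on construction."""
    value: str

    def __post_init__(self) -> None:
        # Invariant: only the canonical lowercase UUID text form is accepted.
        # uuid.UUID raises ValueError on malformed input.
        if str(uuid.UUID(self.value)) != self.value:
            raise ValueError("identifier must be a canonical lowercase UUID string")

    @classmethod
    def generate(cls) -> "CustomerId":
        # Generation logic lives in one place instead of being scattered in callers.
        return cls(str(uuid.uuid4()))

    def as_urn(self) -> str:
        # "urn:uuid:" is the standard URN namespace for UUIDs.
        return f"urn:uuid:{self.value}"


class CustomerDirectory:
    """Answers "does an instance already exist?" before assigning an identity."""

    def __init__(self) -> None:
        self._ids_by_email: dict[str, CustomerId] = {}

    def register(self, email: str) -> CustomerId:
        key = email.strip().lower()  # assumed rule for "the same customer"
        if key in self._ids_by_email:
            raise ValueError(f"a customer already exists for {key}")
        customer_id = CustomerId.generate()
        self._ids_by_email[key] = customer_id
        return customer_id


directory = CustomerDirectory()
ada_id = directory.register("ada@example.com")
```

In a real system the uniqueness check would also be backed by a constraint in the persistence layer; the in-memory dictionary only stands in for it here.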

State and its transitions

An entity is just an identity associated with a state and with operations to change that state. In software engineering, "state" refers to the information a computer program maintains to track its condition or situation at a given moment, usually in memory through variables.
The state of an entity is all the values of its attributes apart from its identity at a given moment.
Each attribute of the entity is a value, and the state of the entity is the set of those values at a particular instant. Figure 1 illustrates an entity over time.
Figure 1. "State representation of a person over time"
Entity state transitions are all about time! Time is measured through cyclic and periodic change, and time is involved whenever a state transition occurs. This involvement of time brings complexity to entities.
State transitions of an entity involve the most important business rules. In a nutshell, values absorb the surface complexity (validation of values) so that entities only have to manage the deep complexity. To bring a functional programming point of view to DDD's entity: each state transition is just a function that takes the current version of the entity's state plus the command and returns the "N+1" version. Figure 2 represents the mental model of entity state management:
Figure 2. "Entity, State and Function"
  • A Command triggers a state transition and can be considered as a function with the signature: function(entity, command) -> entity-N+1

  • An event is published whenever an entity lands on a new state

  • Every State is the sum of the entity's values at a given moment

  • Identity never changes
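The signature function(entity, command) -> entity-N+1 can be sketched in Python as a pure function over immutable versions. The `Person` entity, the `Relocate` command and the `Relocated` event are hypothetical names chosen for this illustration, and the explicit `version` counter is an assumption added to make "N+1" visible.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Person:
    """One version of the entity: the identity plus the values of its state."""
    person_id: str  # identity: copied unchanged into every version
    name: str
    city: str
    version: int = 0


@dataclass(frozen=True)
class Relocate:
    """Command: the intention to change the person's city."""
    new_city: str


@dataclass(frozen=True)
class Relocated:
    """Event: published once the entity has landed on its new state."""
    person_id: str
    city: str
    version: int


def apply_command(entity: Person, command: Relocate) -> tuple[Person, Relocated]:
    """function(entity, command) -> entity-N+1, plus the resulting event."""
    if not command.new_city.strip():
        raise ValueError("new_city must not be empty")
    # replace() builds a new version; the previous one is left untouched.
    next_version = replace(entity, city=command.new_city, version=entity.version + 1)
    event = Relocated(next_version.person_id, next_version.city, next_version.version)
    return next_version, event


v0 = Person("p-1", "Ada", "Dublin")
v1, event = apply_command(v0, Relocate("Cork"))
```

Because each version is an immutable value, `v0` still describes the person before the move, while `v1` carries the same `person_id` with the new state.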

Figure 3 is an example of an Order entity and its different state transitions, from its creation through payment:
Figure 3. "Order entity and its different states and transitions"
The attributes of the Order entity are values, chosen with the order use case in mind. For instance, we could decide that product lines are just product ids associated with a quantity, but such a decision would introduce a strong coupling between the "Order Management System" and the "Product master data system". The problem is that whenever a product changes, the change would impact the order (the product could handle versions and a history, but the runtime dependency would still be there). A better design decision, one that increases the autonomy and decoupling of the "Order Management System", is to model product lines as values: the product data is copied when the Order is submitted and stored as-is.
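The design choice of copying product data into the order can be sketched as follows. `Product`, `OrderLine`, `snapshot_of` and the `catalog` dictionary are hypothetical names standing in for the two systems. The point illustrated is that the order line keeps the values it was created with, even when the product changes afterwards.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Product:
    """Product as owned by the (hypothetical) product master data system."""
    product_id: str
    name: str
    unit_price: Decimal


@dataclass(frozen=True)
class OrderLine:
    """Value stored inside the Order: a copy of the product data at submission."""
    product_id: str  # kept for traceability, not dereferenced at runtime
    product_name: str
    unit_price: Decimal
    quantity: int

    @classmethod
    def snapshot_of(cls, product: Product, quantity: int) -> "OrderLine":
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        return cls(product.product_id, product.name, product.unit_price, quantity)

    def total(self) -> Decimal:
        return self.unit_price * self.quantity


catalog = {"sku-1": Product("sku-1", "Clojure book", Decimal("29.90"))}
line = OrderLine.snapshot_of(catalog["sku-1"], quantity=2)

# The product later changes in its own system; the order line is unaffected.
catalog["sku-1"] = Product("sku-1", "Clojure book, 2nd edition", Decimal("34.90"))
```

The trade-off is some duplication of data in exchange for an order that can be read and processed without calling the product system.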

Operations and state transitions: The Command / Query / Event triptych

State transitions are performed by operations, which are in fact functions. Some concepts are still missing from our mental model to paint the whole picture of entity design: Command, Query and Event.
An entity's state transition is triggered by a Command, the entity's current state is retrieved by a Query, and whenever the entity lands on a new state it publishes an Event.
Figure 4. "The Command / Query / Event triptych and its relation to Time"

Command

A command is a request to do something.
  • A command represents an intention

  • The result of a command is either a success or a failure, and that result is an event

  • In case of success, one or more state changes must have occurred somewhere (otherwise nothing happened)

  • Commands are named with a verb, in the present tense or the infinitive, and a nominal group coming from the domain (entity or aggregate type)
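A small Python sketch can make these properties concrete. The `SubmitOrder` command, the two outcome events and the `handle_submit_order` function are hypothetical names, and the two rejection rules are assumptions chosen for the example. Success returns a new state together with an event, while failure returns the unchanged state together with an event describing the rejection.

```python
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class Order:
    order_id: str
    status: str  # "draft" or "submitted" in this sketch
    line_count: int


@dataclass(frozen=True)
class SubmitOrder:
    """Command: a verb in the infinitive plus a noun from the domain."""
    order_id: str


@dataclass(frozen=True)
class OrderSubmitted:
    """Event for the success case."""
    order_id: str


@dataclass(frozen=True)
class OrderSubmissionRejected:
    """Event for the failure case: no state change happened."""
    order_id: str
    reason: str


Outcome = Union[OrderSubmitted, OrderSubmissionRejected]


def handle_submit_order(order: Order, command: SubmitOrder) -> tuple[Order, Outcome]:
    if command.order_id != order.order_id:
        raise ValueError("command addressed to another order")
    if order.status != "draft":
        return order, OrderSubmissionRejected(order.order_id, "order is not a draft")
    if order.line_count == 0:
        return order, OrderSubmissionRejected(order.order_id, "order has no lines")
    return replace(order, status="submitted"), OrderSubmitted(order.order_id)


draft = Order("o-1", "draft", line_count=2)
submitted, success = handle_submit_order(draft, SubmitOrder("o-1"))
unchanged, failure = handle_submit_order(submitted, SubmitOrder("o-1"))
```

Submitting the same order twice yields a rejection event the second time, and the state returned is exactly the one passed in.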

Query

« A Query is a request asking to retrieve some data about the current state of a system »
  • A query asks a system for data in a specific model

  • Queries never change the state of the system (they are free of side effects, and therefore idempotent)

  • Query processing is often synchronous

  • A query contains either an identifier or fields with values to match

  • A query can result in success or failure (not found), and long results can be paginated

  • Queries are named with “get” something (with an identifier as argument) or “find” something (with values to match as arguments)
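The get/find naming convention and the absence of side effects can be illustrated with a minimal in-memory sketch. `CourseView`, `CourseQueries` and their methods are hypothetical names, and returning `None` for "not found" is just one possible way to model the failure case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CourseView:
    """Read model returned by queries (field names are assumptions)."""
    course_id: int
    title: str
    instructor: str


class CourseQueries:
    """Read-only access: no method here changes the stored data."""

    def __init__(self, courses: list[CourseView]) -> None:
        self._by_id = {course.course_id: course for course in courses}

    def get_course(self, course_id: int) -> Optional[CourseView]:
        # "get" takes an identifier; None models the "not found" outcome.
        return self._by_id.get(course_id)

    def find_courses(self, instructor: str, page: int = 1, page_size: int = 10) -> list[CourseView]:
        # "find" takes values to match; results are sorted, then paginated.
        matches = sorted(
            (c for c in self._by_id.values() if c.instructor == instructor),
            key=lambda c: c.course_id,
        )
        start = (page - 1) * page_size
        return matches[start:start + page_size]


queries = CourseQueries([
    CourseView(1001, "Introduction to Clojure Programming", "John Doe"),
    CourseView(1002, "Clojure application programming", "John Doe"),
    CourseView(1003, "Domain modeling", "Jane Roe"),
])
```

Calling `get_course` or `find_courses` any number of times returns the same answers and leaves the underlying data untouched.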

Event

« Something happened that domain experts care about » Eric Evans
  • An Event represents a fact about the domain from the past

  • Events are raised on every state transition, acknowledging the new fact as data in our system

  • An Event can reference the identifier of the command or query that triggered it

  • Events can be ignored, but can’t be retracted or deleted; only a new event can invalidate a previous one

  • Events are named with a past participle

  • There are internal and external events:

    • Internal Events: the ones we raise and control in our Bounded Context

    • External Events: the ones from upstream Bounded Contexts that we subscribe to
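The rule that events are never retracted, only invalidated by later events, can be sketched with an append-only log. `SessionScheduled`, `SessionCancelled`, `EventLog` and `scheduled_sessions` are hypothetical names for this illustration; a real event store would add persistence, ordering guarantees and identifiers for the triggering commands.

```python
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass(frozen=True)
class SessionScheduled:
    session_id: int


@dataclass(frozen=True)
class SessionCancelled:
    """A new fact that invalidates an earlier SessionScheduled."""
    session_id: int


Event = Union[SessionScheduled, SessionCancelled]


class EventLog:
    """Append-only: there is deliberately no update or delete method."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def history(self) -> tuple[Event, ...]:
        return tuple(self._events)  # read-only copy of the facts so far


def scheduled_sessions(events: Iterable[Event]) -> set[int]:
    """Fold the facts, in order, into the current set of scheduled sessions."""
    sessions: set[int] = set()
    for event in events:
        if isinstance(event, SessionScheduled):
            sessions.add(event.session_id)
        elif isinstance(event, SessionCancelled):
            sessions.discard(event.session_id)
    return sessions


log = EventLog()
log.append(SessionScheduled(5001))
log.append(SessionScheduled(5002))
log.append(SessionCancelled(5001))
```

All three facts remain in the history; the cancellation does not erase the scheduling, it only changes what the current state folds down to.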

Entity attributes

One or more attributes describe an entity, and the values of those attributes describe occurrences of the entity. If an entity is a thing of significance about which an enterprise wishes to hold information, then an attribute defines one of the pieces of information held. For example, when designing a training management application we could have two Entities: "Training Course" and "Session" with the following attributes:
Entity: Training Course
| Attribute | Example Value |
| --- | --- |
| CourseID | 1001 |
| CourseTitle | "Introduction to Clojure Programming" |
| CourseDescription | "This course covers the basics of Clojure programming language" |
| Instructor | "John Doe" |
| CourseDuration | 30 (hours) |
| StartDate | 2023-07-01 (YYYY-MM-DD) |
| EndDate | 2023-08-15 (YYYY-MM-DD) |
| CourseFee | 200.00 (USD) |
A Training Course can contain multiple Sessions.
Entity: Session
| Attribute | Example Value |
| --- | --- |
| SessionID | 5001 |
| CourseID | 1001 |
| SessionTitle | "Clojure application programming" |
| SessionDescription | "This session covers Clojure's libraries for application concerns: REST API, Persistence, Messaging, Security, Hexagonal Architecture" |
| SessionInstructor | "John Doe" |
| SessionStartTime | 09:00 |
| SessionEndTime | 17:00 |
| SessionDate | 2023-07-02 (YYYY-MM-DD) |

Relationships

Some concepts can be represented by attributes or relationships. Deciding whether a concept is best represented as an entity class or as a relationship is subjective. Is a marriage better described as a relationship between two people, or as “something we need to keep information about”? In the latter case, we "reify" an abstract concept into something more concrete.
There is almost always an element of choice in how data is classified into entity classes. Should a single entity class represent all employees or should we define separate entity classes for part-time and full-time employees? Should we use separate entity classes for insurance policies and cover notes, or is it better to combine them into a single Policy entity class?

Entity naming

Each entity name is a noun written in the singular, to show that references to an entity are to a representative occurrence of that entity. The name should be a nominal group that refers to a thing or a role.
A bank account is composed of values: the account's transactions (and there can be a lot of them after some time). Moreover, a bank account is an example of an event-driven system. Transactions are immutable: if a transaction is in error, the bank issues a new transaction that compensates the previous one. A transaction can be a credit (money moves in) or a debit (money moves out), and the sum of all the transactions from the opening of the account until now gives the account's balance. The monthly bank account statement is in fact a snapshot of the transactions, as values, that occurred during the month.

Is this domain concept an Entity or a Value? Tips for selecting the right building block

Of course, it depends on the intended usage. Recall the banknote example used when discussing Value Types: the banknote is a value for the customer and the retailer, as they are only interested in its face value and the identifier printed on it doesn't matter at all. For a central bank, however, it is an entity, as the bank is interested in the lifecycle of that particular banknote (date of introduction on the market, retirement, whether it is a duplicate and therefore one copy is counterfeit, etc.).
Another example: a person's ticket on a plane is an Entity because each seat has an identifier and, on that particular trip, that person is associated with that seat. A person on a bus is a Value because her identity doesn’t matter; all that matters is that her ticket is valid and stamped during the trip.
The following questions are worth asking:
  • Do we need to track the lifecycle over time?

    • Ex: a banknote is an entity for the central bank, as the id is needed (fraud, etc.), and a value for the end user, as the usage is the same whatever the note id is

  • Is the identity needed in some business logic?

    • Ex: the seat id is needed when issuing a plane ticket, but not a metro or bus ticket

  • Does someone do the tracking for us?

    • Ex: postal address management is done by postal services, which track the changes to an address (thus it's an entity)

  • Do we have to distinguish between things that are the same (same attributes) but were created at different times?

  • What does it mean to be equal? To be different?
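The banknote example shows how the answer to "what does it mean to be equal?" depends on the context. The Python sketch below is only an illustration under assumed names (`Money` for the retailer's view, `Banknote` for the central bank's view, made-up serial numbers): the value compares by attributes, the entity by identity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Retailer's view: a value, equal when amount and currency are equal."""
    amount: int
    currency: str


@dataclass(eq=False)
class Banknote:
    """Central bank's view: an entity tracked by its serial number."""
    serial_number: str  # identity
    face_value: Money
    in_circulation: bool = True

    def retire(self) -> None:
        # A lifecycle change: the note is the same entity, in a new state.
        self.in_circulation = False

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Banknote) and other.serial_number == self.serial_number

    def __hash__(self) -> int:
        return hash(self.serial_number)


note_a = Banknote("X123", Money(20, "EUR"))
note_b = Banknote("Y456", Money(20, "EUR"))
suspicious = Banknote("X123", Money(20, "EUR"))  # same serial seen twice
note_a.retire()
```

For the retailer the two notes are interchangeable because their face values are equal. For the central bank, `note_a` and `note_b` are different entities, and a second note carrying the serial number of `note_a` is a duplicate worth investigating, regardless of the state each instance is in.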

Comparing the characteristics of Entity and Value

| Characteristic | Value | Entity |
| --- | --- | --- |
| Mutability | Should be immutable | Usually mutable with a properly defined lifecycle |
| Equality | Compare the value of all attributes | Compare with identity only (attribute comparison is a smell) |
| Scale | Typically small in scale | Of any size |
| Serializability | Should be serializable; may be parseable | Need not be serializable |
| Defaults | Often have a natural default: the number 0, or “today” | No natural default |
| Comparability | Often have a natural order: 1, 2, 3... | Typically no natural sort order |
| Closure | Often have a closed set of operations: 2 + 2 = 4 | Unlikely to have any closed set of operations |
| Identity | May represent an identifier, such as an SSN | Something that is identified, such as a Person |
| User Interface | Have dedicated widgets: checkbox, calendar... | Coded by hand |

Conclusion

The concept of Entity is one of the most important ones when designing an application, as it manages the application state. Managing this state effectively is a critical aspect of software design, as poor state management leads to many issues (bugs, poor maintainability, etc.). The distinction between Entity and Value is crucial: it keeps the core logic of state transitions apart from the surface rules of the value types.