Snapshot is a caching pattern used together with the Event Sourcing pattern. It caches an aggregate’s state as of a certain moment (either an event or a point in time) so that the current state can be read faster. Use this pattern only when there is a concrete reason to do so; most often the reason is that rebuilding the state has become slow because the aggregate has a very large number of events (for example, 10,000 or more) that must be read to get the current state.

Typical flow

A typical flow of a snapshot is as follows:

  • command is received
  • the latest snapshot of the aggregate is read from the snapshot store
  • if a snapshot is found, the aggregate state is restored from it and the aggregate version is set to the snapshot version
  • the remaining events are read from the event store, starting right after the aggregate version
  • the state is updated with the remaining events
  • command is handled as usual
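
The steps above can be sketched as follows. This is a minimal in-memory sketch in Python, not a definitive implementation: Account, Snapshot, load_account and the dict-based stores are hypothetical names chosen for illustration, and events are assumed to be plain (version, amount) pairs.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    version: int  # aggregate version at the moment the snapshot was taken
    balance: int  # cached aggregate state


@dataclass
class Account:
    balance: int = 0
    version: int = 0

    def apply(self, amount: int) -> None:
        # Every event changes the state and advances the version by one.
        self.balance += amount
        self.version += 1


def load_account(account_id, snapshot_store, event_store):
    """Restore from the latest snapshot (if any), then replay newer events."""
    account = Account()
    snapshot = snapshot_store.get(account_id)
    if snapshot is not None:
        account.balance = snapshot.balance
        account.version = snapshot.version
    # Only events newer than the restored version still need to be applied.
    for version, amount in event_store.get(account_id, []):
        if version > account.version:
            account.apply(amount)
    return account


# Hypothetical data: five deposits, with a snapshot taken after the third.
event_store = {"acc-1": [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]}
snapshot_store = {"acc-1": Snapshot(version=3, balance=60)}

account = load_account("acc-1", snapshot_store, event_store)
print(account.balance, account.version)  # 150 5
```

Loading without a snapshot gives the same result; it only has to apply all five events instead of two. After this step the command is handled against the restored aggregate as usual.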

When to take Snapshots

Snapshots can be taken at different times; which option fits best depends on the use case.

Snapshot after each event

You can take a snapshot after each event. With this, your business logic can always rely on the latest snapshot. When creating a snapshot, you have to decide whether it is written at the same time as the event is stored, or asynchronously in the background.
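
The difference between the two options can be sketched as follows. This is an illustrative sketch under simplifying assumptions: the state is a single integer, the stores are in-memory collections, and store_event_sync, store_event_async and drain are hypothetical names. In a real system drain would run in a background worker.

```python
from collections import deque


def store_event_sync(state, amount, events, snapshots):
    """Store the event and write the snapshot in the same step."""
    state += amount
    events.append(amount)
    snapshots[len(events)] = state  # snapshot is current once this returns
    return state


def store_event_async(state, amount, events, pending):
    """Store the event and only queue the snapshot for later."""
    state += amount
    events.append(amount)
    pending.append((len(events), state))  # snapshot lags until drained
    return state


def drain(pending, snapshots):
    """Body of a background worker: write all queued snapshots."""
    while pending:
        version, state = pending.popleft()
        snapshots[version] = state


events, snapshots, pending = [], {}, deque()
state = store_event_async(0, 10, events, pending)
state = store_event_async(state, 20, events, pending)
print(snapshots)  # {} (nothing written yet)
drain(pending, snapshots)
print(snapshots)  # {1: 10, 2: 30}
```

The synchronous variant keeps the snapshot always up to date at the cost of a slower write. The asynchronous variant keeps writes fast, but readers must be prepared to find a snapshot that is a few events behind and replay the difference.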

Every N events

You may choose to store a snapshot after a certain number of events. When loading the aggregate, you then also have to read all the events that happened after the snapshot was created.
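
A small sketch of the arithmetic involved, assuming aggregate versions start at 1 and increase by one per event. SNAPSHOT_INTERVAL and the function names are hypothetical, and the interval of 100 is only an example value.

```python
SNAPSHOT_INTERVAL = 100  # hypothetical value; tune it per aggregate


def should_snapshot(version, interval=SNAPSHOT_INTERVAL):
    """True when the version just reached lands on an interval boundary."""
    return version > 0 and version % interval == 0


def latest_snapshot_version(version, interval=SNAPSHOT_INTERVAL):
    """Version of the most recent snapshot taken at or before `version`."""
    return (version // interval) * interval


def events_to_replay(version, interval=SNAPSHOT_INTERVAL):
    """How many events must still be read after loading that snapshot."""
    return version - latest_snapshot_version(version, interval)


print(should_snapshot(200))           # True
print(latest_snapshot_version(250))   # 200
print(events_to_replay(250))          # 50
```

With this scheme the number of events read on load never exceeds N - 1, so N is a direct trade-off between snapshot write frequency and replay cost.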

Snapshot when a specified event type is stored

You may choose to create a snapshot when an event of a certain type is stored. For example, when the cashier closes the shift, you may store a snapshot.
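
The cashier example could look roughly like this. The event type names (ShiftOpened, SaleRecorded, ShiftClosed), the Register class and store_event are all assumptions made for this sketch; the stores are plain lists.

```python
from dataclasses import dataclass

# Hypothetical set of event types that trigger a snapshot.
SNAPSHOT_TRIGGERS = {"ShiftClosed"}


@dataclass
class Register:
    total: int = 0
    version: int = 0

    def apply(self, event_type: str, amount: int = 0) -> None:
        if event_type == "SaleRecorded":
            self.total += amount
        self.version += 1


def store_event(register, event_type, amount, event_log, snapshots):
    """Append the event, update state, snapshot only on trigger types."""
    event_log.append((event_type, amount))
    register.apply(event_type, amount)
    if event_type in SNAPSHOT_TRIGGERS:
        snapshots.append({"version": register.version, "total": register.total})


register = Register()
event_log, snapshots = [], []
shift = [("ShiftOpened", 0), ("SaleRecorded", 15),
         ("SaleRecorded", 25), ("ShiftClosed", 0)]
for event_type, amount in shift:
    store_event(register, event_type, amount, event_log, snapshots)

print(snapshots)  # [{'version': 4, 'total': 40}]
```

Tying the snapshot to a business event like this means snapshots line up with natural boundaries in the domain, so the next shift starts from a clean, known state.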

Every selected period

You may schedule snapshots at a fixed interval (every hour, once a day, etc.). However, this comes with a risk of spikes in event processing when many snapshots are created at the same time.

Disadvantages of Snapshots

Schema changes

When you want to make changes to a model, the risk is greater, especially when you read data from snapshots. If your system depends heavily on snapshots, they add a lot of complexity and coupling that may be hard, if not impossible, to untangle.
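
One possible way to limit this risk, assuming snapshots are treated as a disposable cache and the events remain the source of truth, is to tag each snapshot with a schema version and ignore snapshots written with an older one. The sketch below is illustrative only; CURRENT_SCHEMA_VERSION, usable_snapshot, rebuild and the dict-shaped snapshot are hypothetical.

```python
CURRENT_SCHEMA_VERSION = 2  # hypothetical; bump when the snapshot shape changes


def usable_snapshot(snapshot):
    """Return the snapshot only if it matches the current schema."""
    if snapshot is None:
        return None
    if snapshot.get("schema_version") != CURRENT_SCHEMA_VERSION:
        return None  # outdated shape: caller falls back to a full replay
    return snapshot


def rebuild(snapshot, events):
    """Rebuild the balance from a compatible snapshot plus newer events."""
    snapshot = usable_snapshot(snapshot)
    balance = snapshot["balance"] if snapshot else 0
    start = snapshot["version"] if snapshot else 0
    for version, amount in events:
        if version > start:
            balance += amount
    return balance


events = [(1, 10), (2, 20), (3, 30)]
old = {"schema_version": 1, "version": 2, "amount_total": 30}
new = {"schema_version": 2, "version": 2, "balance": 30}

print(rebuild(old, events))  # 60 (old snapshot ignored, full replay)
print(rebuild(new, events))  # 60 (snapshot used, one event replayed)
```

This only works if nothing reads the snapshots directly as a data source. Once other parts of the system query them, discarding a snapshot is no longer safe, which is the coupling described above.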