Snapshot is a caching pattern used with the Event Sourcing pattern. It captures an aggregate’s state at a certain moment (after a particular event or at a point in time) to allow faster reads of that state. This pattern should only be used when there’s a reason to do so; most of the time the reason is that rebuilding the current state takes too long because the aggregate has 10,000+ events to read.
Typical flow
A typical flow of a snapshot is as follows:
- command is received
- latest snapshot of an aggregate is read from snapshot store
- if snapshot found, aggregate is set from the snapshot, the aggregate version is set to the snapshot version
- remaining events are read from event store, starting from the aggregate version
- state is updated with remaining events
- command is handled as usual
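The loading steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the `Account` aggregate, the `Snapshot` record, the event shapes and the `load_account` helper are all hypothetical, and the event store is modelled as a plain in-memory list where the snapshot version equals the number of events already applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    state: dict
    version: int  # assumption: number of events already folded into `state`


@dataclass
class Account:
    balance: int = 0
    version: int = 0

    def apply(self, event: dict) -> None:
        # Fold one event into the state and advance the version.
        if event["type"] == "Deposited":
            self.balance += event["amount"]
        elif event["type"] == "Withdrawn":
            self.balance -= event["amount"]
        self.version += 1


def load_account(snapshot: Optional[Snapshot], events: list[dict]) -> Account:
    account = Account()
    if snapshot is not None:
        # Restore state and version from the snapshot.
        account.balance = snapshot.state["balance"]
        account.version = snapshot.version
    # Replay only the events recorded after the snapshot was taken.
    for event in events[account.version:]:
        account.apply(event)
    return account


events = [
    {"type": "Deposited", "amount": 100},
    {"type": "Withdrawn", "amount": 30},
    {"type": "Deposited", "amount": 5},
]
snapshot = Snapshot(state={"balance": 70}, version=2)

restored = load_account(snapshot, events)  # replays one event
replayed = load_account(None, events)      # replays all three events
```

Both paths should end in the same state; the snapshot only shortens the replay.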
When to take Snapshots
Snapshots can be taken at different times, depending on the use case.
Snapshot after each event
You can take snapshots after each event. With this, you can always base your business logic on the latest snapshot. When creating a snapshot, it is important to decide whether the snapshot is stored at the same time as the event or asynchronously in the background.
Every N number of events
You may choose to store a snapshot after a certain number of events. When doing so, you also have to read all the events that happened after the snapshot was created.
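A minimal sketch of the every-N rule, assuming the aggregate version counts stored events and a snapshot is taken whenever the version is a multiple of N. The threshold `SNAPSHOT_EVERY` and both helper names are hypothetical.

```python
SNAPSHOT_EVERY = 100  # hypothetical threshold; tune per aggregate


def should_snapshot(version: int, every: int = SNAPSHOT_EVERY) -> bool:
    # Take a snapshot each time the version reaches a multiple of `every`.
    return version > 0 and version % every == 0


def events_to_replay(version: int, every: int = SNAPSHOT_EVERY) -> int:
    # Events stored since the most recent snapshot, which still have to be
    # read and applied on top of it.
    return version % every
```

With this rule, the number of events replayed on load is always smaller than N, which is the trade-off the interval controls.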
Snapshot when a specified event type is stored
You may choose to create a snapshot when a certain event is stored. For example, when the cashier closes the shift, you may store a snapshot.
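The cashier example could look roughly like this. The event type names (`SaleRegistered`, `ShiftClosed`), the `record` helper and the in-memory lists standing in for the event store and the snapshot store are assumptions made for illustration.

```python
SNAPSHOT_TRIGGERS = {"ShiftClosed"}  # hypothetical event type name


def record(events: list, snapshots: list, state: dict, event: dict) -> None:
    # Store the event first, then update the in-memory state.
    events.append(event)
    if event["type"] == "SaleRegistered":
        state["total"] += event["amount"]
    # Snapshot only when a trigger event type has just been stored.
    if event["type"] in SNAPSHOT_TRIGGERS:
        snapshots.append({"version": len(events), "state": dict(state)})


events, snapshots, state = [], [], {"total": 0}
record(events, snapshots, state, {"type": "SaleRegistered", "amount": 12})
record(events, snapshots, state, {"type": "SaleRegistered", "amount": 8})
record(events, snapshots, state, {"type": "ShiftClosed"})
```

Copying the state with `dict(state)` keeps the stored snapshot from changing when later events mutate the live state.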
Every selected period
You may schedule storing snapshots at a certain time (every hour, once a day, etc.). However, this comes with a risk of load spikes in event processing when many aggregates try to create snapshots at the same time.
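One possible way to soften such spikes, offered here as an assumption and not as part of the pattern itself, is to give each aggregate a stable offset within the period so that snapshots are spread out instead of all starting at once. The function name and the one-hour default are hypothetical.

```python
import hashlib


def snapshot_offset_seconds(aggregate_id: str, period_seconds: int = 3600) -> int:
    # Derive a stable offset from the aggregate id so the same aggregate
    # always snapshots at the same point within each period.
    digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % period_seconds


offset_a = snapshot_offset_seconds("order-1")
offset_b = snapshot_offset_seconds("order-1")
```

A scheduler would then run the snapshot for an aggregate at the start of the period plus its offset.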
Disadvantages of Snapshots
Schema changes
Changing a model is riskier when snapshots are involved, especially when you read data from them. If your system depends heavily on snapshots, this adds a lot of complexity and coupling that may be hard, if not impossible, to untangle.
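One common way to limit this risk, sketched here under assumptions, is to tag each snapshot with a schema version and ignore snapshots written under an older one, falling back to a full replay from the event store. The `schema_version` field, the constant and the helper name are hypothetical.

```python
from typing import Optional

CURRENT_SCHEMA_VERSION = 2  # hypothetical: bumped whenever the model shape changes


def usable_snapshot(snapshot: Optional[dict]) -> Optional[dict]:
    # Return the snapshot only if it matches the current schema.
    # Returning None tells the caller to rebuild from the full event stream.
    if snapshot is None:
        return None
    if snapshot.get("schema_version") != CURRENT_SCHEMA_VERSION:
        return None
    return snapshot


old = {"schema_version": 1, "version": 40, "state": {"total": 20}}
new = {"schema_version": 2, "version": 40, "state": {"total_cents": 2000}}
```

This only works if the events themselves remain the source of truth; a system that reads exclusively from snapshots cannot fall back this way.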