Wednesday, March 20, 2013

Eventsourced for Akka - A high-level technical overview

Eventsourced is an Akka extension that adds scalable actor state persistence and at-least-once message delivery guarantees to Akka. With Eventsourced, stateful actors
  • persist received messages by appending them to a log (journal)
  • project received messages to derive current state
  • usually hold current state in memory (memory image)
  • recover current (or past) state by replaying received messages (during normal application start or after crashes)
  • never persist current state directly (except optional state snapshots for recovery time optimization)
In other words, Eventsourced implements a write-ahead log (WAL) that is used to keep track of messages an actor receives and to recover its state by replaying logged messages. Appending messages to a log instead of persisting actor state directly allows for actor state persistence at very high transaction rates and supports efficient replication. In contrast to other WAL-based systems, Eventsourced usually keeps the whole message history in the log and makes usage of state snapshots optional.

Logged messages represent intended changes to an actor's state. Logging changes instead of updating current state is one of the core concept of event sourcing. Eventsourced can be used to implement event sourcing concepts but it is not limited to that. More details about Eventsourced and its relation to event sourcing can be found here.

Eventsourced can also be used to make message exchanges between actors reliable so that they can be resumed after crashes, for example. For that purpose, channels with at-least-once message delivery guarantees are provided. Channels also prevent that output messages, sent by persistent actors, are redundantly delivered during replays which is relevant for message exchanges between these actors and other services.

Building blocks

The core building blocks provided by Eventsourced are processors, channels and journals. These are managed by an Akka extension, the EventsourcingExtension.

Processor

A processor is a stateful actor that logs (persists) messages it receives. A stateful actor is turned into a processor by modifying it with the stackable Eventsourced trait during construction. A processor can be used like any other actor.

Messages wrapped inside Message are logged by a processor, unwrapped messages are not logged. Logging behavior is implemented by the Eventsourced trait, a processor's receive method doesn't need to care about that. Acknowledging a successful write to a sender can be done by sending a reply. A processor can also hot-swap its behavior by still keeping its logging functionality.

Processors are registered at an EventsourcingExtension. This extension provides methods to recover processor state by replaying logged messages. Processors can be registered and recovered at any time during an application run.

Eventsourced doesn't impose any restrictions how processors maintain state. A processor can use vars, mutable data structures or STM references, for example.

Channel

Channels are used by processors for sending messages to other actors (channel destinations) and receiving replies from them. Channels
  • require their destinations to confirm the receipt of messages for providing at-least-once delivery guarantees (explicit ack-retry protocol). Receipt confirmations are written to a log.
  • prevent redundant delivery of messages to destinations during processor recovery (replay of messages). Replayed messages with matching receipt confirmations are dropped by the corresponding channels.
A channel itself is an actor that decorates a destination with the aforementioned functionality. Processors usually create channels as child actors for decorating destination actor references.

A processor may also sent messages directly to another actor without using a channel. In this case that actor will redundantly receive messages during processor recovery.

Eventsourced provides three different channel types (more are planned).
  • Default channel
    • Does not store received messages.
    • Re-delivers uncomfirmed messages only during recovery of the sending processor.
    • Order of messages as sent by a processor is not preserved in failure cases.
  • Reliable channel
    • Stores received messages.
    • Re-delivers unconfirmed messages based on a configurable re-delivery policy.
    • Order of messages as sent by a processor is preserved, even in failure cases.
    • Often used to deal with unreliable remote destinations.
  • Reliable request-reply channel
    • Same as reliable channel but additionally guarantees at-least-once delivery of replies.
    • Order of replies not guaranteed to correspond to the order of sent request messages.
Eventsourced channels are not meant to replace any existing messaging system but can be used, for example, to reliably connect processors to such a system, if needed. More generally, they are useful to integrate processors with other services, as described in another blog post.

Journal

A journal is an actor that is used by processors and channels to log messages and receipt confirmations. The quality of service (availability, scalability, ...) provided by a journal depends on the used storage technology. The Journals section in the user guide gives an overview of existing journal implementations and their development status.

References