SEDA: An architecture for well-conditioned scalable internet services

A service is well-conditioned if it behaves like a simple pipeline: as the offered load increases, the delivered throughput increases proportionally until the pipeline is full and the throughput saturates; additional load should not degrade throughput. Moreover, after saturation, the response time should not increase faster than linearly with the number of clients. Hence, the key property of a well-conditioned service is graceful degradation: as offered load exceeds capacity, the service maintains high throughput with a linear response-time penalty. Unfortunately, that is not the typical web experience; as load increases beyond the saturation point, throughput decreases and response time increases dramatically, creating the impression that the service has ground to a halt.

SEDA enables services to be well-conditioned to load, preventing resources from being overcommitted when demand exceeds service capacity. In other words, SEDA makes it easy to do load-shedding and to sacrifice the Q in the DQ principle when saturation is reached.

SEDA draws from (1) the thread-based concurrency approach for ease of programming and (2) the event-based approach for extensive concurrency. We briefly discuss the two before presenting the SEDA architecture.

Thread-based approach
Threads are too blackbox. Transparent resource virtualization prevents applications from making informed decisions, which are vital to managing excessive load. With the thread-based approach, the application is unable to inspect the internal request stream to implement a load-shedding policy that identifies and drops expensive requests. As a result, throughput increases until the number of threads grows large, after which it degrades substantially. Response time becomes unbounded as task queue lengths increase. The thread-based approach has no mechanism to drop Q easily, and beyond saturation the entire system comes to a crashing halt.
Event-driven approach
Event-driven systems are robust to load, with little degradation in throughput as offered load increases beyond saturation. The event-driven approach performs inherent admission control: excess tasks are absorbed in the server's event queue, throughput remains constant across a huge range of load, and the latency of each task increases linearly.
The event-driven approach implements the processing of each task as a finite state machine, where transitions between states in the FSM are triggered by events. This way, the server maintains its own continuation state for each task rather than relying upon a thread context. The key problem is the complexity of the event scheduler, which must control the execution of each finite state machine. The choice of an event scheduling algorithm is often tailored to the specific application, and introduction of new functionality may require the algorithm to be redesigned. This also makes modularity difficult to achieve.
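The FSM-per-task idea can be made concrete with a minimal sketch (the states and FIFO scheduling policy below are illustrative, not taken from the paper): each task's continuation lives as data in the event queue rather than in a thread's stack, and a single loop drives all tasks.

```python
# Minimal sketch: each task is a finite state machine driven by events.
# The server keeps each task's continuation state in the queue itself,
# so no per-task thread context is needed. States are hypothetical.
from collections import deque

# Transitions of a (hypothetical) request-processing FSM.
TRANSITIONS = {"read": "parse", "parse": "respond", "respond": "done"}

def event_loop(tasks):
    """Drive every task's FSM by dispatching events until all complete."""
    queue = deque((t, "read") for t in tasks)   # (task id, current state)
    trace = []
    while queue:
        task, state = queue.popleft()           # scheduler: simple FIFO here
        trace.append((task, state))
        nxt = TRANSITIONS[state]
        if nxt != "done":
            queue.append((task, nxt))           # continuation stored as data
        # finished tasks simply drop out of the queue
    return trace

print(event_loop([1, 2]))
```

Note how the scheduling policy is baked into the loop (here, FIFO interleaving of the two tasks); this is exactly the application-specific scheduler that the paper argues is hard to redesign and modularize.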

SEDA architecture
In SEDA, applications/services are constructed as a network of stages, each with an associated incoming event queue. Each stage consists of a thread pool and an application-supplied event handler. The stage's operation is managed by the controller, which adjusts resource allocations and scheduling dynamically. Each stage can be individually conditioned to load by thresholding or filtering its event queue.
In addition, making event queues explicit allows applications to make informed scheduling and resource-management decisions, such as reordering, filtering, or aggregation of requests. SEDA makes use of dynamic resource throttling at the stage level (including thread pool sizing and dynamic event batching/scheduling) to control the resource allocation and scheduling of application components, allowing the system to adapt to overload conditions.

The introduction of a queue between stages decouples their execution by introducing an explicit control boundary. Introducing a queue between two modules provides isolation, modularity, and independent load management (the tradeoff is increased latency compared to a function call). The decomposition of application into stages and explicit event delivery mechanism facilitates inspection, debugging, and performance analysis of applications.
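The stage structure described above can be sketched in a few lines. This is a toy rendition under stated assumptions, not the paper's Java implementation: each stage owns a bounded event queue (the explicit control boundary), a small thread pool running an application-supplied handler, and a queue threshold that sheds load by rejecting new events instead of blocking. The stage names and handlers are invented for illustration.

```python
# Hypothetical sketch of a SEDA stage: bounded event queue + thread pool
# + application-supplied event handler, with queue thresholding for
# per-stage admission control. All names are illustrative.
import queue
import threading

class Stage:
    def __init__(self, handler, threshold=100, workers=2):
        # The bounded queue is the explicit control boundary between stages.
        self.events = queue.Queue(maxsize=threshold)
        self.handler = handler
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def enqueue(self, event):
        """Admission control: reject rather than block when the queue is full."""
        try:
            self.events.put_nowait(event)
            return True
        except queue.Full:
            return False          # shed load: sacrifice Q, preserve throughput

    def _run(self):
        while True:
            event = self.events.get()
            self.handler(event)   # handler may enqueue into downstream stages
            self.events.task_done()

# Wire two stages into a pipeline: parse -> respond.
responses = []
respond = Stage(lambda e: responses.append("ok:" + e))
parse = Stage(lambda e: respond.enqueue(e.strip()))

for req in [" a ", " b "]:
    parse.enqueue(req)
parse.events.join()
respond.events.join()
print(sorted(responses))
```

Each stage can be conditioned to load independently by tuning its `threshold` and `workers`; in real SEDA the per-stage controller adjusts these dynamically rather than leaving them fixed.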

Evaluation
SEDA is evaluated through two applications: a high-performance HTTP server and a packet router for the Gnutella peer-to-peer file-sharing network. The performance and scalability results for these applications show that SEDA achieves flexibility of control and robustness over huge variations in load, and outperforms other service designs.

Discussion
We have seen that SEDA makes it easy to do load-shedding and to sacrifice the Q in the DQ principle when saturation is reached. Can SEDA also be made to reduce D as well? I guess this is doable. But, in contrast to dropping Q by load-shedding, the dropping of D is inherently a whitebox operation and involves dealing with the application logic.

Does it make sense to partition/distribute stages across more than one server? SEDA makes this easy to do by decoupling stages with event queues. I guess this could be useful for horizontally scaling-out a large sophisticated service. (What is an example of that?) The first stage of admitting requests can run on one machine and pass its output to the next stages running on other machines through the event queues.

With a very large number of servers at our disposal in a large-scale datacenter, do we still have a need/use for SEDA? A large number of servers is useful only in the face of embarrassingly parallel applications, which the map-reduce framework forges applications into. SEDA can still be useful for horizontally scaling-out large sophisticated applications, as we mentioned above.

Comments

Regu said…
Murat,

SEDA is a fascinating architecture pattern and we have used it quite successfully to do what you mention here: "I guess this could be useful for horizontally scaling-out a large sophisticated service. (What is an example of that?) The first stage of admitting requests can run on one machine and pass its output to the next stages running on other machines through the event queues."

I am referring to the Enrolment Server that we built for Aadhaar (http://uidai.gov.in/). More on this architecture is available here : http://www.slideshare.net/regunathbalasubramanian/aadhaar-at-5thelephantv3

Thanks for this article. It nicely explains the motivation and design principles behind SEDA.

Cheers,
Regu
