One of the goals of the Liberty platform is to scale well in modern multi-core environments. In support of this, work management, scheduling, and dispatching will be centralized in the threading and event services to allow for greater control over how work is distributed and executed within a server.
In the classic application server, thread pools were everywhere. There were thread pools for the web container, thread pools for the ORB, thread pools for asynch beans, thread pools for JMX notification, thread pools for messaging, thread pools for DRS, thread pools for DCS, etc. This proliferation of thread pools led to a proliferation of threads which, in turn, led to significant resource consumption above what was needed.
In addition to the increased resource consumption, each of these pools attempted to manage its own work without any coordination. This generally resulted in sub-optimal dispatching policies.
To help alleviate the issues (real or imagined) with having so many thread pools, components within Liberty are discouraged from (read that as "don't do it") explicitly creating threads or thread pools. The goal is to move components from a model where they "own" threads to a model where they submit tasks for execution and rely on the runtime to handle the mechanics.
The current implementation of the scheduler is a basic thread pool that employs work stealing. As with a standard thread pool, a global queue is available to hold work that is submitted for execution. In addition to that global queue, each thread in the pool maintains its own stack of work. The scheduler is non-preemptive.
When work is submitted to the scheduler, the new work can either be added to the global work queue or pushed to the bottom of a double-ended queue owned by the submitting thread. Threads within the pool will first look at their own work pile. If work is found, it will be executed. If no work is found on the local work pile, the thread will then look for work on the global work queue. If work is found on neither, the thread will look at the other active threads in the pool and attempt to steal work from a victim thread.
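That dispatch order can be sketched as follows. This is a minimal illustration of the described policy, not the Liberty implementation; the WorkDeque interface and all class names here are assumptions for the example.

```java
import java.util.List;
import java.util.Queue;

// Hypothetical interface for a per-thread work pile; the owner uses one
// end, thieves use the other (see the deque discussion below).
interface WorkDeque {
    void push(Runnable task);  // owner (or submitter) pushes work
    Runnable pop();            // owner takes its own work
    Runnable steal();          // other threads steal from the far end
}

// Illustrative worker loop showing the dispatch order described above:
// local work pile first, then the global queue, then stealing from a victim.
final class WorkerThread extends Thread {
    private final WorkDeque local;          // this thread's own work pile
    private final Queue<Runnable> global;   // shared global work queue
    private final List<WorkerThread> pool;  // peers that can be victims

    WorkerThread(WorkDeque local, Queue<Runnable> global, List<WorkerThread> pool) {
        this.local = local;
        this.global = global;
        this.pool = pool;
    }

    @Override
    public void run() {
        while (!isInterrupted()) {
            Runnable task = local.pop();      // 1. own work pile
            if (task == null) {
                task = global.poll();         // 2. global work queue
            }
            if (task == null) {
                task = steal();               // 3. steal from another thread
            }
            if (task != null) {
                task.run();                   // non-preemptive: runs to completion
            }
        }
    }

    private Runnable steal() {
        for (WorkerThread victim : pool) {
            if (victim != this) {
                Runnable stolen = victim.local.steal();
                if (stolen != null) {
                    return stolen;
                }
            }
        }
        return null;  // nothing available; a real pool would park the thread here
    }
}
```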
The data structure used to maintain the local work piles is a double-ended queue, or deque. The thread that owns the deque adds and removes work from one end while thieves take work from the other end.
Since a deque is owned by a single thread, the push and pop operations can be done without any synchronization until the top and bottom of the list are within some threshold. When synchronization is required, it is done via an atomic compare-and-swap operation that is non-blocking.
There are three types of deques implemented in the threading code. Each implementation consists of a circular buffer to hold work, an index to the bottom of the list, and a composite object that holds the top index and the largest steal size.
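A simplified Chase-Lev style deque shows the shape just described: a circular buffer, a bottom index the owner manipulates without locks, and a top index that thieves advance with a compare-and-swap. This is a sketch only; the real implementations differ, grow their buffers, use carefully tuned memory ordering, and track the largest steal size, all of which is omitted here.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Simplified Chase-Lev style work-stealing deque, for illustration only.
final class SimpleWorkDeque implements WorkDeque {
    private final AtomicReferenceArray<Runnable> buffer =
            new AtomicReferenceArray<>(1024);         // circular buffer of work
    private final AtomicLong top = new AtomicLong(0); // thieves' end
    private volatile long bottom = 0;                 // owner's end

    // Owner only: push to the bottom with no synchronization.
    public void push(Runnable task) {
        long b = bottom;
        buffer.set((int) (b % buffer.length()), task);
        bottom = b + 1;
    }

    // Owner only: pop from the bottom; a compare-and-swap is needed only
    // when the owner and a thief race for the last remaining element.
    public Runnable pop() {
        long b = bottom - 1;
        bottom = b;
        long t = top.get();
        if (b < t) {                 // deque was already empty
            bottom = t;
            return null;
        }
        Runnable task = buffer.get((int) (b % buffer.length()));
        if (b > t) {
            return task;             // more than one element: no race possible
        }
        // Exactly one element left: race any thieves with an atomic CAS.
        if (!top.compareAndSet(t, t + 1)) {
            task = null;             // a thief took it first
        }
        bottom = t + 1;
        return task;
    }

    // Thieves: take from the top with a non-blocking compare-and-swap.
    public Runnable steal() {
        long t = top.get();
        long b = bottom;
        if (t >= b) {
            return null;             // nothing to steal
        }
        Runnable task = buffer.get((int) (t % buffer.length()));
        return top.compareAndSet(t, t + 1) ? task : null;
    }
}
```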
There are several benefits to a work stealing scheduler: contention on the global queue is reduced, work tends to execute on the thread that created it (which improves cache locality), and idle threads automatically balance the load by taking work from busy ones.
Work submitted to the executor from a thread that is not part of the pool can either be pushed to a foreign deque or added to the global work queue. If work is pushed to a foreign deque, the scheduling policy is strict work stealing.
In this mode local threads will have to move work generated by threads outside of the pool to their own deques during steal operations. While this policy avoids locking and hot cache lines, it takes more instructions to manage the work as it will always end up on at least two queues before execution.
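A submission path along these lines might look like the following sketch; every name in it is hypothetical, and WorkDeque is the interface sketched earlier.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical submit path for threads that are not part of the pool.
final class ForeignSubmitPolicy {
    private final Queue<Runnable> globalQueue = new ConcurrentLinkedQueue<>();
    // One deque per foreign submitter; pool threads treat these deques
    // as additional steal victims.
    private final ThreadLocal<WorkDeque> foreignDeques;
    private final boolean strictWorkStealing;  // assumed policy switch

    ForeignSubmitPolicy(ThreadLocal<WorkDeque> foreignDeques, boolean strictWorkStealing) {
        this.foreignDeques = foreignDeques;
        this.strictWorkStealing = strictWorkStealing;
    }

    void submit(Runnable task) {
        if (strictWorkStealing) {
            // Strict work stealing: the task lands on the foreign deque and
            // is moved to a pool thread's deque during a steal, so it crosses
            // at least two queues before it runs.
            foreignDeques.get().push(task);
        } else {
            // Global queue: fewer hops, but a shared and contended structure.
            globalQueue.offer(task);
        }
    }
}
```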
The Liberty Event Engine is built on top of the threading services and can be used to easily implement most pipeline or continuation work flows. If the basic services provided by the event engine are insufficient, components may directly submit tasks for execution to an ExecutorService instance made available by the threading code.
Outside of the event engine, there are two ways to acquire a reference to an executor. The first is to resolve a reference to the default scheduler. This scheduler is bound in the service registry under the java.util.concurrent.ExecutorService interface. The second option is for a component to request its own named instance of the service. This is accomplished by calling the getStage method on the com.ibm.ws.threading.WorkStageManager service bound in the service registry.
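As a sketch, a declarative services component might acquire the two flavors like this. The @Reference injection of an ExecutorService follows standard OSGi declarative services usage; the getStage signature shown for WorkStageManager is an assumption based on the description above.

```java
import java.util.concurrent.ExecutorService;

import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;

import com.ibm.ws.threading.WorkStageManager;

@Component
public class MyComponent {

    // Option 1: the default scheduler, registered in the service registry
    // under java.util.concurrent.ExecutorService.
    @Reference
    private ExecutorService executor;

    // Option 2: a named stage from the WorkStageManager service. The
    // getStage signature used below is an assumption for this sketch.
    @Reference
    private WorkStageManager stageManager;

    public void dispatch(Runnable task) {
        executor.execute(task);
    }

    public ExecutorService namedStage() {
        return stageManager.getStage("MyComponentStage");
    }
}
```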
In both cases, the object that is returned implements the Java java.util.concurrent.ExecutorService contract. This will allow various styles of Callable and Runnable scheduling. Please see the ExecutorService javadoc for details.
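Once a reference is in hand, usage is standard java.util.concurrent fare, for example:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class ExecutorUsage {
    static void demonstrate(ExecutorService executor)
            throws InterruptedException, ExecutionException {
        // Fire-and-forget Runnable.
        executor.execute(() -> System.out.println("running"));

        // Runnable submitted for completion tracking.
        Future<?> done = executor.submit(() -> System.out.println("tracked"));
        done.get();  // waits for completion; the result is null

        // Callable that produces a value.
        Callable<Integer> answer = () -> 6 * 7;
        Future<Integer> result = executor.submit(answer);
        System.out.println(result.get());  // blocks until the result is ready
    }
}
```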