Sunday, December 7, 2008

Java 5 Concurrency: The Executor framework

The Java platform has always provided support for multi-threaded/concurrent programming. However, prior to Java 5, that support took the form of primitive constructs in the programming language itself. Java 5 steps up and provides concurrency utility frameworks and data structures in the java.util.concurrent package. One of these utilities is the task scheduling framework, better known as the Executor framework.

The JVM runs as a process, and our application code runs on threads inside that process. Various other "system" threads run alongside it to do tasks like garbage collection and memory management. From the application's perspective, though, there is a single "main" thread to begin with. The application can then spawn additional threads to perform helper tasks, for example to improve performance. Prior to Java 5, spawning a new thread to perform a task was most commonly done as follows (extending the Thread class was also possible, but is generally not recommended):

public void mainMethod() {
    HelperTask task = new HelperTask(); // Step 1: Create an object representing the task
    Thread t = new Thread(task);        // Step 2: Create a new thread for executing the task
    t.start();                          // Step 3: Start the new thread
}

public class HelperTask implements Runnable {
    public void run() {
        doHelperTask();
    }

    private void doHelperTask() {
        // Application-specific work goes here (placeholder).
    }
}
This approach has the following issues:
  1. Most of the code related to thread creation and task delegation to the thread is a part of the application itself. We need a way to abstract the above steps away from the application.
  2. What if we have multiple helper tasks, or a scenario where every single user action requires a new thread to be spawned to process it? Creating threads with no upper bound on their number can cause our application to run out of memory.
  3. Although threads are lightweight (but only compared to processes), creating them consumes significant resources. In such a situation, a thread pool is a better solution: only a fixed number of threads are created, and they are reused for later tasks.
  4. Another shortcoming of the Runnable interface's void run() method is that the task executed within run() has no way of returning a result to the main thread. Workarounds therefore have the asynchronous task update database table(s), file(s), or some other shared data structure to communicate its result to the main thread.
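To make the fourth point concrete, here is a minimal sketch of one such workaround, assuming the shared data structure is simply a field on the task object. The names RunnableResultSketch, SumTask and sumInHelperThread are hypothetical and chosen only for illustration.

```java
public class RunnableResultSketch {

    // Hypothetical task: run() cannot return a value,
    // so the result is parked in a field instead.
    static class SumTask implements Runnable {
        private final int n;
        private long result; // written by the helper thread, read after join()

        SumTask(int n) {
            this.n = n;
        }

        public void run() {
            long sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            result = sum;
        }

        long getResult() {
            return result;
        }
    }

    // Runs the task on a new thread and waits for it to finish.
    static long sumInHelperThread(int n) throws InterruptedException {
        SumTask task = new SumTask(n);
        Thread t = new Thread(task);
        t.start();
        t.join(); // wait for completion; also makes the helper's write visible here
        return task.getResult();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumInHelperThread(100)); // prints 5050
    }
}
```

Note how the caller has to manage the thread, wait for it, and know where the task stored its result. This is the bookkeeping that the application ends up carrying itself.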
The Executor framework addresses all of the above issues and also provides life-cycle management features for the threads. It consists of the following important interfaces:

  1. Callable: This interface is similar in concept to the Runnable interface, i.e. it represents the asynchronous task to be executed. The main difference is that its call() method returns a value (and may throw a checked exception), i.e. the asynchronous task is able to return a value once it is done executing.
  2. Executor, ExecutorService and ScheduledExecutorService: Each of these interfaces extends the previous one, adding functionality for managing threads and their life cycle. The Executor abstracts thread creation and start-up (Steps 2 and 3 above) and executes Runnable tasks. The ExecutorService extends the Executor and is able to execute Callable tasks in addition to Runnable tasks. It also contains life-cycle management methods. The ScheduledExecutorService allows us to schedule asynchronous tasks, thereby adding support for delayed and periodic task execution.
  3. Future: This interface represents the result of an asynchronous task, which may itself be represented as a Callable. An ExecutorService that executes a Callable task returns a Future object, through which the result of the task can be retrieved once it completes.
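The interfaces above can be tied together in a short sketch. It assumes a fixed-size thread pool obtained from the java.util.concurrent.Executors factory class, which is one of several ways to get an ExecutorService. The names ExecutorSketch, SumTask and runAll are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorSketch {

    // Hypothetical task: unlike Runnable.run(), call() returns its result.
    static class SumTask implements Callable<Long> {
        private final int n;

        SumTask(int n) {
            this.n = n;
        }

        public Long call() {
            long sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            return sum;
        }
    }

    // Submits one task per input to a bounded pool and collects the results in order.
    static List<Long> runAll(int[] inputs, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Long>> futures = new ArrayList<Future<Long>>();
            for (int n : inputs) {
                futures.add(pool.submit(new SumTask(n))); // returns immediately
            }
            List<Long> results = new ArrayList<Long>();
            for (Future<Long> f : futures) {
                results.add(f.get()); // blocks until that task has finished
            }
            return results;
        } finally {
            pool.shutdown(); // life-cycle management: accept no new tasks, let threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(new int[] {10, 100, 1000}, 2)); // prints [55, 5050, 500500]
    }
}
```

Here the application never creates or starts a Thread directly. The pool size caps the number of threads, the threads are reused across tasks, and each result comes back through a Future rather than through shared state.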