Saturday, April 11, 2015

ThreadPoolTaskExecutor and HTTPConnectionManager

I was working on a project that required reading tasks from a queue and processing them. Let's say the queue is named fileSendQueue. Multiple threads take tasks from the queue and process them, and this is where Spring's ThreadPoolTaskExecutor comes into the picture. I have 10 threads in the thread pool. Each thread takes a task from the queue and processes it, and once the task completes, the thread is returned to the pool. My thread pool configuration looks like this:

    <bean id="fileSenderTaskExecutor"
          class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
        <property name="corePoolSize" value="10" />
        <property name="maxPoolSize" value="10" />
        <property name="threadGroup" value="fileSenderThreadGroup" />
    </bean>

Here the core pool size is 10 and the max pool size is 10.
The job of each thread is to send a file to a remote server with HttpClient, as a multipart POST request. Soon after deploying the application, I noticed that the task queue (fileSendQueue) was empty.
It looked like all the tasks in the queue had completed successfully. So far so good; the thread pool seemed to be emptying the queue and processing the tasks as expected. But soon after I packed my bag for a short leave, I got a call: the remote servers on the other side were still waiting for files from this server. What had happened? The size of fileSendQueue was 0, which suggested that every task had been processed. After adding more logging and analyzing the output, I found that the tasks had been removed from the queue but not processed; they were stuck somewhere in between. Where? The logs showed that the active thread count was equal to the core pool size, 10, and that the tasks were actually sitting in ThreadPoolTaskExecutor's own queue.
They had simply been moved out of one queue and into another. But why were the tasks in ThreadPoolTaskExecutor's queue? Because the number of active threads was equal to the core pool size, ThreadPoolTaskExecutor put every new task into its internal queue, which is unbounded when no queueCapacity is configured. This is the behavior its documentation describes. At first it seemed that ThreadPoolTaskExecutor was not working properly and was not releasing threads back to the pool after tasks completed, which would explain the active thread count of 10. But that assumption was wrong.
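
The queueing behavior described above can be reproduced without Spring. ThreadPoolTaskExecutor delegates to the JDK's java.util.concurrent.ThreadPoolExecutor, so the sketch below uses that class directly with core size equal to max size and an unbounded queue, which is my assumption about what the configuration above amounts to when queueCapacity is left unset. The class name ExecutorQueueDemo and the blocking stand-in task are made up for illustration; they are not the project's code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ExecutorQueueDemo {

    /**
     * Submits `tasks` jobs that block until released and returns
     * {active thread count, tasks waiting in the executor's own queue}.
     */
    static int[] snapshot(int poolSize, int tasks) {
        // core == max with an unbounded queue (assumed equivalent of the XML config)
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        CountDownLatch started = new CountDownLatch(Math.min(poolSize, tasks));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    started.countDown();
                    try {
                        release.await(); // stands in for a very slow upload
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await(); // every pool thread is now busy
            return new int[] { pool.getActiveCount(), pool.getQueue().size() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the stuck tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] s = snapshot(10, 100);
        System.out.println("active=" + s[0] + " queued=" + s[1]); // active=10 queued=90
    }
}
```

With 10 threads and 100 submitted tasks, 10 tasks are running and 90 are parked in the executor's queue, so the source queue looks empty even though almost nothing has been processed yet.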

The bottleneck was not the thread pool. The threads were busy because HttpClient was taking too long to send each file to the remote server, so my next suspect was HTTPConnectionManager. Perhaps it was not timing out on bad connections as configured in the manager. The SO_TIMEOUT was set to 30 seconds; that is the maximum period of inactivity allowed between two consecutive data packets while waiting to receive data. Perhaps the manager was not obeying that configuration and there was some issue with it.
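
To make the SO_TIMEOUT semantics concrete, here is a small sketch using plain java.net sockets rather than HttpClient (as far as I know, HttpClient ultimately applies the same setting to the underlying socket). The class SoTimeoutDemo, the loopback peer, and the timing values are all invented for the demonstration. The peer sends one byte at regular intervals and then goes silent; the reader only fails once the silence lasts longer than the timeout, no matter how long the whole exchange has taken.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class SoTimeoutDemo {

    /**
     * Returns how many bytes were read before the read timed out,
     * or -1 if the peer closed the connection without a timeout.
     */
    static int readUntilTimeout(int bytesToSend, long gapMillis, int soTimeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread peer = new Thread(() -> {
                try (Socket s = server.accept(); OutputStream out = s.getOutputStream()) {
                    for (int i = 0; i < bytesToSend; i++) {
                        out.write(1);
                        out.flush();
                        Thread.sleep(gapMillis); // slow, but never silent for long
                    }
                    Thread.sleep(soTimeoutMillis * 3L); // now go silent, connection stays open
                } catch (IOException | InterruptedException ignored) {
                    // demo peer: nothing to clean up
                }
            });
            peer.setDaemon(true);
            peer.start();

            int received = 0;
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(soTimeoutMillis); // limit on the gap between reads
                InputStream in = client.getInputStream();
                while (in.read() != -1) {
                    received++;
                }
                return -1;
            } catch (SocketTimeoutException e) {
                return received; // the silence exceeded SO_TIMEOUT
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // 5 bytes, 100 ms apart: about 500 ms of traffic against a 400 ms timeout
        System.out.println("bytes read before timeout: " + readUntilTimeout(5, 100, 400));
    }
}
```

The transfer in main lasts longer than the timeout value, yet every byte arrives, because the timer restarts with each packet. The same mechanism lets a 30-second timeout coexist with a transfer that runs for an hour.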

After analyzing a little more, I found that the HTTP manager was working as expected. So what was happening here? Puzzled, confused? The actual issue was the file size combined with the bidirectional connectivity. Assume there are 100 tasks in the queue initially and more tasks keep arriving at run time. Each file to send to the remote server is 150 MB or more, and the transfer takes far too long. Because SO_TIMEOUT only limits the gap between packets and not the total transfer time, a slow but active transfer never trips it. Sometimes a thread was blocked for an hour or more sending a single file, and then, because of a connectivity problem, the transfer sometimes ended with a socket timeout exception:

 java.net.SocketTimeoutException: Read timed out

In that case the thread was blocked for more than an hour and in the end accomplished nothing useful, because the exception occurred at the last moment due to the connectivity issue. The same thing happened with all the tasks: the files were huge and the connection was not stable.

Solution to this problem: make the files small enough that they can be sent even over a bad connection, and make the task asynchronous so that a thread from ThreadPoolTaskExecutor does not block until the task completes. This does not put extra load on the system, because sooner or later the asynchronous task completes and the threads are released.
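
One way to make the files "small enough" is to send each file as a series of fixed-size slices, so that a failure costs one slice instead of the whole transfer. The sketch below only computes the slice boundaries. The FileChunker class and the 10 MB chunk size are hypothetical choices of mine, and the approach assumes the remote server can accept and reassemble partial uploads, which is not something the setup described here is known to support.

```java
import java.util.ArrayList;
import java.util.List;

final class FileChunker {

    /** One slice of a file: starting offset and length, both in bytes. */
    static final class Chunk {
        final long offset;
        final long length;

        Chunk(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Splits a file of fileSize bytes into consecutive chunks of at most chunkSize bytes. */
    static List<Chunk> split(long fileSize, long chunkSize) {
        if (fileSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and chunkSize > 0");
        }
        List<Chunk> chunks = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += chunkSize) {
            // the last chunk may be shorter than chunkSize
            chunks.add(new Chunk(offset, Math.min(chunkSize, fileSize - offset)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<Chunk> chunks = split(150 * mb, 10 * mb);
        System.out.println(chunks.size() + " chunks"); // 15 chunks
    }
}
```

Each chunk could then be submitted to the executor as its own short task, so that a read timeout only forces a retry of that one chunk.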