Saturday, June 27, 2015

Modify an unmodifiable Collection

The Java Collections Framework provides an elegant way to create an unmodifiable collection (List, Set, Map) from an existing one. It sounds almost too good to be true. Will it serve our purpose? One day, while debugging an issue involving mutable objects and an unmodifiable list, I ran into some peculiar behavior.

Let's consider the code snippet below.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class UnmodifibleList {
    public static void main(String[] args) {
        String s1 = "Good";
        String s2 = "Morning";
        final List<String> modifiableList = new ArrayList<>();
        modifiableList.add(s1);
        modifiableList.add(s2);
        // Wraps the same backing list; no copy is made.
        final List<String> unmodifiableList = Collections.unmodifiableList(modifiableList);
        System.out.println("Before modifying: " + unmodifiableList);
        modifiableList.add("nice");
        modifiableList.add("day");
        System.out.println("After modifying: " + unmodifiableList);
    }
}


The output is as follows:

Before modifying: [Good, Morning]
After modifying: [Good, Morning, nice, day]

Does it seem strange? The contents of the unmodifiable list have changed after the call to Collections.unmodifiableList().

What the Javadoc says:

Returns an unmodifiable view of the specified list. This method allows modules to provide users with "read-only" access to internal lists. Query operations on the returned list "read through" to the specified list, and attempts to modify the returned list, whether direct or via its iterator, result in an UnsupportedOperationException.
The returned list will be serializable if the specified list is serializable. Similarly, the returned list will implement RandomAccess if the specified list does.
Parameters:
list the list for which an unmodifiable view is to be returned.
Returns:
an unmodifiable view of the specified list.
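But it does not spell out that if we modify the underlying collection, the returned unmodifiable list will change as well. That consequence is only implied by the phrase "read through": the returned list is a view over the original list, not a copy of it.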
Steps to avoid this:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class UnmodifibleList {
    public static void main(String[] args) {
        String s1 = "Good";
        String s2 = "Morning";
        final List<String> modifiableList = new ArrayList<>();
        modifiableList.add(s1);
        modifiableList.add(s2);
        // Copy the list first, then wrap the copy.
        final List<String> unmodifiableList = Collections.unmodifiableList(new ArrayList<>(modifiableList));
        System.out.println("Before modifying: " + unmodifiableList);
        modifiableList.add("nice");
        modifiableList.add("day");
        System.out.println("After modifying: " + unmodifiableList);
    }
}


Look at the line in the code above where we call Collections.unmodifiableList(). Here we create a brand-new ArrayList from the original list and wrap that copy, so later changes to the original list no longer reach the unmodifiable one.


The output is as follows:

Before modifying: [Good, Morning]
After modifying: [Good, Morning]
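
The two halves of the contract can be put side by side in a small sketch. The class and method names here (UnmodifiableSnapshotDemo, snapshotOf, rejectsAdd) are made up for illustration; only Collections.unmodifiableList and ArrayList come from the JDK. It shows that the copied list stays unchanged when the source grows, and that a direct write to it is rejected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableSnapshotDemo {

    // Returns a read-only snapshot: the copy detaches it from later changes to 'source'.
    static List<String> snapshotOf(List<String> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    // Returns true if adding to 'list' is rejected with UnsupportedOperationException.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> modifiableList = new ArrayList<>();
        modifiableList.add("Good");
        modifiableList.add("Morning");

        List<String> snapshot = snapshotOf(modifiableList);
        modifiableList.add("nice"); // changes the source only

        System.out.println("Source:   " + modifiableList);
        System.out.println("Snapshot: " + snapshot);
        System.out.println("Snapshot rejects add: " + rejectsAdd(snapshot));
    }
}
```

The copy costs one extra list allocation, which is usually a fair price when the read-only list is handed to code that should not see later changes.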


Mutable Object and unmodifiable Collection:
  
Suppose our collection contains mutable objects. The code snippet below shows what happens in that case.


import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class UnmodifibleMutableList {
    public static void main(String[] args) {
        StringBuffer s1 = new StringBuffer("Good");
        StringBuffer s2 = new StringBuffer("Morning");
        final List<StringBuffer> modifiableList = new ArrayList<>();
        modifiableList.add(s1);
        modifiableList.add(s2);
        final List<StringBuffer> unmodifiableList = Collections.unmodifiableList(modifiableList);
        System.out.println("Before modification: " + unmodifiableList);
        // Mutates the element itself, not the list: "Goo" is replaced by "ba".
        s1.replace(0, 3, "ba");
        System.out.println("After modification: " + unmodifiableList);
    }
}

 

Notice that StringBuffer is a mutable class. After calling Collections.unmodifiableList(), we modify the StringBuffer s1, and that change is reflected in the unmodifiable list.

The output is as follows:

Before modification: [Good, Morning]
After modification: [bad, Morning]

We have to be aware that if we modify an object held in any of the lists, every list containing that same object will observe the modification. This is not the case for String: a String is immutable and cannot be changed once created.
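
One way to protect a list from changes to its elements is to copy the elements into an immutable type before wrapping the list. The sketch below is only an illustration under that assumption; the class name ImmutableElementSnapshotDemo and the helper freeze are hypothetical. It converts each StringBuffer to a String, so a later change to the buffer does not reach the frozen copy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ImmutableElementSnapshotDemo {

    // Copies each StringBuffer's current content into an immutable String.
    static List<String> freeze(List<StringBuffer> buffers) {
        List<String> copy = new ArrayList<>(buffers.size());
        for (StringBuffer buffer : buffers) {
            copy.add(buffer.toString());
        }
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        StringBuffer s1 = new StringBuffer("Good");
        List<StringBuffer> modifiableList = new ArrayList<>();
        modifiableList.add(s1);
        modifiableList.add(new StringBuffer("Morning"));

        List<String> frozen = freeze(modifiableList);
        s1.replace(0, 3, "ba"); // mutates the buffer after the copy was taken

        System.out.println("Mutable list: " + modifiableList);
        System.out.println("Frozen copy:  " + frozen);
    }
}
```

For element types that have no immutable counterpart, the same idea applies with a copy constructor or a clone of each element instead of toString().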

Why do we double check for null instance in Singleton lazy initialization

The usual way to obtain a lazily initialized singleton checks twice whether the instance is null, as follows:

   1.  public class SigletonTest {
   2.      private static volatile SigletonTest instance = null;
   3.      // private constructor
   4.      private SigletonTest() {
   5.      }
   6.      public static SigletonTest getInstance() {
   7.          if (instance == null) {
   8.              synchronized (SigletonTest.class) {
   9.                  // Double check
  10.                  if (instance == null) {
  11.                      instance = new SigletonTest();
  12.                  }
  13.              }
  14.          }
  15.          return instance;
  16.      }
  17.  }

The question is why the instance is checked for null twice, on line 7 and again on line 10. It seems sufficient to check it once, inside the synchronized block. That code would look as follows:

public class SigletonTest {
    private static volatile SigletonTest instance = null;

    private SigletonTest() {
    }

    public static SigletonTest getInstance() {
        synchronized (SigletonTest.class) {
            // single check
            if (instance == null) {
                instance = new SigletonTest();
            }
        }
        return instance;
    }
}

With the code above, however, every call to getInstance() has to enter the synchronized block. Only the first call creates the object; every later call just returns a reference to the member variable, yet it still acquires and releases the lock. Synchronization can hurt performance, in extreme cases noticeably, and once initialization is complete that locking is unnecessary. In the double-checked version, the first check on line 7 lets callers skip the lock entirely once the instance exists, while the second check on line 10 prevents a second object from being created when several threads pass the first check at the same time.

So the first version is more efficient than the second one.
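
To see the second check doing its job, here is a small sketch that starts several threads at once and counts how many distinct instances they observe. The class name LazySingleton and the helper countDistinctInstances are invented for this illustration; the getInstance() body follows the double-checked version shown earlier. A passing run does not prove thread safety, it only illustrates the intended behavior.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class LazySingleton {
    private static volatile LazySingleton instance;

    private LazySingleton() {
    }

    static LazySingleton getInstance() {
        if (instance == null) {                      // first check, no lock taken
            synchronized (LazySingleton.class) {
                if (instance == null) {              // second check, under the lock
                    instance = new LazySingleton();
                }
            }
        }
        return instance;
    }

    // Releases 'threadCount' threads together and counts the distinct instances they see.
    static int countDistinctInstances(int threadCount) {
        Set<LazySingleton> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await();                   // wait so all threads race together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(getInstance());
            });
            threads[i].start();
        }
        start.countDown();
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for threads", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("Distinct instances: " + countDistinctInstances(16));
    }
}
```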

Saturday, April 11, 2015

ThreadPoolTaskExecutor and HTTPConnectionManager

I was working on a project with a requirement to read tasks from a queue and process them. Let's say the queue is named fileSendQueue. Multiple threads take tasks from the queue and process them, and this is where Spring's ThreadPoolTaskExecutor comes into the picture. I have 10 threads in the thread pool. Each thread takes a task from the queue and processes it; after the task completes, the thread is returned to the pool. My thread pool configuration is shown below.

<bean id="fileSenderTaskExecutor"
      class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="10" />
    <property name="maxPoolSize" value="10" />
    <property name="threadGroup" value="fileSenderThreadGroup" />
</bean>

Here the core pool size and the max pool size are both 10.
The job of each thread is to send a file to a remote server with HttpClient, as a multipart POST request. Soon after deploying the application, I noticed that the task queue (fileSendQueue) was empty.
It looked like all the tasks in the queue had completed successfully. So far so good: the thread pool seemed to be working properly, emptying the queue and completing the tasks. But soon after I had packed my bag for a short leave, I got a call saying that the remote servers on the other side were still waiting for files from this server. What had happened? The size of fileSendQueue was 0, which suggested that every task had been processed. After adding more logging and analyzing the output, I found that the tasks had been removed from the queue but not processed; they were stuck somewhere in between. Where were they stuck? The logs showed that the active thread count was equal to the core pool size, that is, 10. The tasks were actually sitting in the ThreadPoolTaskExecutor's own queue.
They had simply left one queue and entered another. But why were the tasks in the ThreadPoolTaskExecutor's queue? Because the number of active threads was equal to the core pool size, the ThreadPoolTaskExecutor queued the new tasks internally, exactly as its documentation describes. At first it seemed that ThreadPoolTaskExecutor was not working properly and was not returning threads to the pool after tasks completed, which would explain the active thread count of 10. But my assumption was wrong.
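
The queuing behavior can be reproduced without Spring. ThreadPoolTaskExecutor delegates to a java.util.concurrent.ThreadPoolExecutor, so the sketch below uses that class directly with an unbounded queue, which is assumed here to match the configuration above since no queue capacity was set. The class name SaturatedPoolDemo and the latch-based "slow upload" are stand-ins invented for the illustration. Once every pool thread is busy, further tasks are accepted and silently parked in the executor's queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SaturatedPoolDemo {

    // Submits 'taskCount' blocking tasks to a fixed-size pool and returns
    // { active thread count, number of tasks waiting in the executor's queue }.
    static int[] saturate(int poolSize, int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(Math.min(poolSize, taskCount));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    started.countDown();
                    awaitQuietly(release); // stands in for a slow file upload
                });
            }
            awaitQuietly(started);         // every pool thread is now busy
            return new int[] { pool.getActiveCount(), pool.getQueue().size() };
        } finally {
            release.countDown();           // let the blocked tasks finish
            pool.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] state = saturate(2, 5);
        System.out.println("Active threads: " + state[0]);
        System.out.println("Tasks waiting in executor queue: " + state[1]);
    }
}
```

From the outside, the source queue looks drained in this situation, which is why monitoring the executor's own queue size alongside the active thread count would have pointed to the problem sooner.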

The bottleneck was not the thread pool. HttpClient was taking too long to send a file to the remote server, so my next suspect was the HTTPConnectionManager. Perhaps it was not timing out on bad connections as configured. The SO timeout was set to 30 seconds, which is the maximum period of inactivity allowed between two consecutive packets received. Perhaps the manager was not obeying that configuration and had a problem of its own.

After analyzing a little more, I found that the connection manager was working as expected. So what was happening? The real issue was the file size together with the bidirectional connectivity. Assume there are 100 tasks in the queue initially and more keep arriving at run time. Each file to be sent to the remote server is 150 MB or more, so each transfer takes a very long time. Sometimes a thread was blocked for an hour or more sending a single file, and then, because of connectivity problems, the transfer ended with a socket timeout exception:

 java.net.SocketTimeoutException: Read timed out

So in this case the thread was blocked for more than an hour and in the end achieved nothing useful, because the exception occurred at the last moment due to the connectivity issue. The same thing happened with all the tasks: the files were huge and the connectivity was not smooth.

Solution to this problem: make the files small enough to be sent even over a bad connection, and make the task asynchronous so that the thread from the ThreadPoolTaskExecutor does not block until the task completes. This does not put extra load on the system, because sooner or later the asynchronous task completes and the threads are released.

Thursday, October 23, 2014

Spring Batch- A case study

In our company, we had a requirement to process a set of mobile numbers: send each one an SMS and, after a configurable delay, place a call to give information about a product. Here I would like to describe how this use case was implemented with Spring Batch. Before describing the use case, I would like to briefly explain what Spring Batch is.

 Usage of Spring Batch:

A batch application reads a huge number of records from a source (generally a database or file system), processes them according to some required pattern, and writes the results to a destination (which might be a different network or store).

 Use cases of Spring Batch:
  •  We have a large amount of data and want to read, process, and write it in batches or chunks, committing once per batch or chunk.
  •  We have a job in which some tasks should run in parallel within the batch environment.
  •  We want to restart the job manually or with the help of a scheduler. This might be a fresh restart from the beginning or a resume from where we left off.
  •  We need to execute step1 and choose the next action depending on its result: on success of step1, step2 is executed; on failure of step1, step3 is executed.
  •  We need to skip records deliberately during processing, based on some condition.
  •  A combination of all the above.
  
 Spring Batch Architecture:

      The Spring Batch framework consists of three layers.
  • The application layer represents the business logic we write using Spring Batch.
  • The core layer represents the components necessary to control a batch job. It consists of classes such as JobLauncher, Job, and Step.
  • The infrastructure layer represents the item readers, item writers, and classes that handle things like job recovery and job restart.
   
 Spring Batch Terminology:

  • Job: A batch job is a combination of steps executed in a predefined order as part of a task. It sits at the top of the batch hierarchy.
  • JobParameters: A set of parameters used to start a batch job. Suppose we have a job that interacts with our customers by sending SMS or email. If the job is scheduled with the parameter sms, it sends an SMS to the targeted customers in the base. If it is scheduled with the parameter email, it sends an email to them instead. Here "sms" and "email" are different job parameters.
  • JobInstance: The execution of a job with a unique set of parameters is called a JobInstance of that job. If the same job is running with the parameters sms and email at the same time, we say that two instances of the same job are running.
  • Step: A Step is an entity that encapsulates an independent, sequential phase of a batch job.
  • ExecutionContext: A store for persisting key/value pairs (analogous to a one-to-one mapping) that a step or job can use at execution time.
  • JobRepository: The storage mechanism for all the details of job and step executions. When a Job is first launched, a JobExecution is obtained from the repository.
  • JobLauncher: The mechanism used to launch a job with a given set of job parameters.
  • Item Reader: ItemReader is a mechanism that retrieves the input for a Step, one record at a time.
  • Item Processor: Item Processor is a mechanism that processes one record at a time and determines whether the record is valid. If it is invalid, that record is skipped.
  • Item Writer: Item Writer is a mechanism that writes the processed records one batch or chunk at a time.
  • Listener: A listener waits for an event to occur and intercepts it to apply some custom behavior. A batch job likewise allows listeners to hook into its events and do additional work. Listeners can be used at two levels in a batch job: job level and step level.
  • Job level listeners: If we want to send an email or SMS at the start or end of the job, a job level listener is the right candidate. Job level listeners are
    1.JobExecutionListener
  • Step level listeners: If we want to do some customized task inside a step, we can do it with step level listeners. Step level listeners are
    1.StepExecutionListener
    2.ChunkListener
    3.ItemReadListener
    4.ItemWriteListener
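
The reader, processor, and writer roles described above can be pictured as a simple loop. The sketch below is plain Java and does not use the Spring Batch API; the class name ChunkLoopSketch, the 10-digit validity rule, and the "SMS to" output are assumptions made up for the illustration. It reads one record at a time, lets the processor drop invalid records, and hands the surviving records to the "writer" one chunk at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ChunkLoopSketch {

    // Hypothetical processor: returns null to signal that the record is skipped.
    static String process(String mobileNumber) {
        String trimmed = mobileNumber.trim();
        return trimmed.matches("\\d{10}") ? "SMS to " + trimmed : null;
    }

    // Reads up to 'chunkSize' records, processes each, then writes the chunk as a whole.
    static List<List<String>> run(List<String> input, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        List<List<String>> writtenChunks = new ArrayList<>();
        Iterator<String> reader = input.iterator();      // plays the item reader
        while (reader.hasNext()) {
            List<String> chunk = new ArrayList<>();
            for (int read = 0; read < chunkSize && reader.hasNext(); read++) {
                String processed = process(reader.next());
                if (processed != null) {
                    chunk.add(processed);
                }
            }
            if (!chunk.isEmpty()) {
                writtenChunks.add(chunk);                // plays the item writer
            }
        }
        return writtenChunks;
    }

    public static void main(String[] args) {
        List<String> numbers = Arrays.asList(
                "9000000001", "bad", "9000000002", "9000000003", "9000000004");
        for (List<String> chunk : run(numbers, 2)) {
            System.out.println("Writing chunk: " + chunk);
        }
    }
}
```

In a real job the framework drives this loop and wraps each chunk in a transaction; the sketch only shows how the three roles divide the work.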
 Configuration for the job:

We can configure the job in different ways, for example programmatically or in XML. Here we describe the XML configuration. In our current scenario we will use the following tags to define our job in XML.
  1.  Job: The parent element of the job configuration. The sub-elements we will use are step and split.
  2.  Step: A stage in a batch job. A batch job may have many stages. A step requires either a chunk definition or a tasklet reference. The sub-elements we will use are tasklet and next.
  3.  Split: Declares that the job should split into two or more subflows.
  4.  Tasklet: The Tasklet strategy can be implemented either by configuring a reference to an implementation of the Tasklet interface or by configuring a chunk. The sub-element we will use is chunk.
  5.  Chunk: Declares that the owner of the chunk, that is, the step containing it, will perform chunk-oriented processing. The sub-elements we will use are reader, processor, writer, and listeners.

To be continued...