Thursday, October 23, 2014

Spring Batch - A case study

In our company, we had a requirement to process a set of mobile numbers: send each number an SMS and then, after a configurable delay, place a call to give information about a product. Here I would like to describe how we implemented this use case with Spring Batch. Before describing the use case, I would like to briefly explain what Spring Batch is.

 Usage of Spring Batch:

A batch application reads a large number of records from a source (generally a database or a file system), processes them according to the required logic, and writes the results to a destination (which might be on a different network or system).

 Use cases of Spring Batch:
  •  We have a large amount of data and want to read, process and write it in batches (chunks), committing once per chunk.
  •  We have a job in which some tasks should run in parallel within the batch environment.
  •  We want to restart a job manually or with the help of a scheduler. This might be a fresh start from the beginning or a resume from where the job left off.
  •  We need to execute step1 and choose the next action based on its result: step2 is executed if step1 succeeds, and step3 is executed if step1 fails.
  •  We need to skip records deliberately during processing, based on some condition.
  •  A combination of all the above.
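
To make the conditional use case (step2 on success of step1, step3 on failure) concrete, here is a minimal sketch in plain Java. It is only a model of the idea and does not use the Spring Batch API; the class name ConditionalFlowSketch, the method run, and the rule that an exception counts as a failure are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Plain-Java model of a conditional flow. This is NOT the Spring Batch API;
// all names here are hypothetical.
class ConditionalFlowSketch {

    // Runs step1, then step2 if step1 succeeded, otherwise step3.
    // Returns the names of the steps that were executed, in order.
    static List<String> run(BooleanSupplier step1, Runnable step2, Runnable step3) {
        List<String> executed = new ArrayList<>();
        executed.add("step1");

        boolean succeeded;
        try {
            succeeded = step1.getAsBoolean();
        } catch (RuntimeException e) {
            succeeded = false; // assumption: an exception counts as a failed step
        }

        if (succeeded) {
            executed.add("step2");
            step2.run();
        } else {
            executed.add("step3");
            step3.run();
        }
        return executed;
    }

    public static void main(String[] args) {
        System.out.println(run(() -> true, () -> {}, () -> {}));  // success path
        System.out.println(run(() -> false, () -> {}, () -> {})); // failure path
    }
}
```

In a real job the branching is declared in the job configuration rather than written by hand; the sketch only shows the decision that the framework makes on our behalf.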
  
 Spring Batch Architecture:

The Spring Batch framework consists of three layers.
  •  The Application layer represents the business logic we write using Spring Batch.
  •  The Core layer represents the components that are necessary to control a batch job. It consists of classes such as JobLauncher, Job and Step.
  •  The Infrastructure layer represents ItemReader, ItemWriter and the classes that handle things like job recovery and job restart.
   
 Spring Batch Terminology:

  •  Job: A batch job is a combination of steps that are executed in a predefined order as part of a task. It sits at the top of the batch hierarchy.
  •  JobParameters: A set of parameters used to start a batch job. Suppose we have a job that interacts with our customers by sending SMS or email. If the job is scheduled with the parameter sms, it sends an SMS to the selected customers in our base. If it is scheduled with the parameter email, it sends an email to the selected customers in our base. Here "sms" and "email" are different job parameters.
  •  JobInstance: A logical run of a job, identified by its unique set of job parameters. If the same job is running with the parameter sms and with the parameter email at the same time, we say that two instances of the same job are running.
  •  Step: A Step is an entity that encapsulates an independent, sequential phase of a batch job.
  •  ExecutionContext: A store for key/value pairs (analogous to a map) that a step or a job can persist and use during execution.
  •  JobRepository: The storage mechanism for all the details of job and step executions. When a Job is first launched, a JobExecution is obtained from the repository.
  •  JobLauncher: The mechanism used to launch a job with a given set of job parameters.
  •  ItemReader: The mechanism that retrieves the input for a Step, one record at a time.
  •  ItemProcessor: The mechanism that processes one record at a time and determines whether the record is valid or not. If a record is invalid, the processor can filter it out (by returning null) so that it is not passed to the writer.
  •  ItemWriter: The mechanism that writes the processed records, one batch or chunk at a time.
  •  Listener: A listener waits for an event to occur and intercepts it to run some custom logic. A batch job likewise allows listeners to do additional work when an event fires. We can use listeners in a batch job at two levels, i.e. job level and step level.
  •  Job level listeners: If we want to send an email/SMS at the start or at the end of the job, a job level listener is the right candidate. Job level listeners are
    1. JobExecutionListener
  •  Step level listeners: If we want to do some customized task inside a step, we can do it with step level listeners. Step level listeners are
    1. StepExecutionListener
    2. ChunkListener
    3. ItemReadListener
    4. ItemWriteListener
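
The way the reader, processor and writer described above cooperate inside a chunk-oriented step can be sketched in plain Java as follows. This is a simplified model and not the Spring Batch API: the class ChunkLoopSketch, the method runStep, the 10-digit validity rule and the sample numbers are all assumptions made for illustration, and transactions, listeners and restart handling are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java model of chunk-oriented processing. This is NOT the Spring Batch API;
// all names here are hypothetical.
class ChunkLoopSketch {

    // Reads up to chunkSize items one at a time, processes each of them, and hands
    // the surviving items to the writer once per chunk. A processor result of null
    // means "filter this record out". Returns the number of chunks written.
    static <I, O> int runStep(Iterator<I> reader, Function<I, O> processor,
                              Consumer<List<O>> writer, int chunkSize) {
        int chunksWritten = 0;
        while (reader.hasNext()) {
            List<O> chunk = new ArrayList<>();
            for (int read = 0; read < chunkSize && reader.hasNext(); read++) {
                O processed = processor.apply(reader.next()); // one record at a time
                if (processed != null) {
                    chunk.add(processed);
                }
            }
            if (!chunk.isEmpty()) {
                writer.accept(chunk); // one write (and, in a real job, one commit) per chunk
                chunksWritten++;
            }
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        List<String> numbers = List.of("9000000001", "123", "9000000002",
                                       "9000000003", "9000000004");
        List<List<String>> written = new ArrayList<>();

        // Assumed rule for the example: only 10-digit numbers are valid.
        Function<String, String> processor =
                n -> n.length() == 10 ? "SMS to " + n : null;
        Consumer<List<String>> writer = written::add;

        int chunks = runStep(numbers.iterator(), processor, writer, 2);
        System.out.println(chunks + " chunks written: " + written);
    }
}
```

With the five sample numbers and a chunk size of 2, three chunks are written and the invalid entry "123" never reaches the writer.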
 Configuration for the job:

We can configure the job in different ways, for example programmatically or in XML. Here we describe the XML configuration for the job. In our current scenario we will use the following tags to define our job in XML.
  1. job: The parent element of the job configuration. The sub-elements we will use are step and split.
  2. step: A stage in a batch job. A batch job may have many stages. A step requires either a chunk definition or a tasklet reference. The sub-elements we will use are tasklet and next.
  3. split: Declares that the job should split into two or more subflows.
  4. tasklet: The Tasklet strategy can be implemented either by configuring a reference to a bean that implements the Tasklet interface or by configuring a chunk. The sub-element we will use is chunk.
  5. chunk: Declares that its owner, i.e. the step that contains the chunk, will perform chunk-oriented processing. The sub-elements we will use are reader, processor, writer and listeners.

To be continued...