Saturday, October 10, 2015

Volatile keyword in Java

We often discuss how volatile works in Java, yet the scope of the volatile keyword is still not always clear. Recently one of my colleagues asked me about the scope of volatile and its usefulness, and I would like to use this space to make it as clear as possible. I described the volatile keyword in one of my earlier posts, Compiler Reordering, but there is more to discuss.

Some points to know about the JMM (Java Memory Model):
  1. Each thread may work on its own copy of shared variables (for example in registers or CPU caches), which behaves like a separate memory space.
  2. We need an explicit mechanism to enforce communication between different threads.
  3. Sometimes memory writes leak, so that other threads happen to read the updated value. But this is not a guaranteed means of communication between two threads.

Role of volatile in thread communication:

  • The volatile modifier is a mechanism by which communication between different threads is guaranteed.
  • When a second thread sees the value written to a volatile variable by the first thread, it is guaranteed that the second thread will also see everything the first thread wrote to memory before it wrote to the volatile variable.
  • We call this the happens-before principle in the JMM.
 Let's try to understand it with the help of an example.

Consider the following scenario. We have an int variable named result, which is not volatile, and a boolean variable named flag, which is volatile. We also have two threads, Thread1 and Thread2. Suppose Thread1 starts, sets result to 30, and then sets flag to true.

Thread1
_______

result = 30;
flag = true;


Thread2
_______

if (flag)
    System.out.println(result);


Then Thread2 comes along, reads flag, and sees the value written to it by Thread1. Because this communication happens, everything Thread1 wrote to memory before it wrote to flag must be visible to Thread2 after it reads the value true for flag.

So if Thread2 sees flag as true, it will print the value of result as 30. This is guaranteed by the volatile modifier on flag.
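The scenario above can be tried out with a small program. This is only a sketch: the class name VolatileFlagDemo and the helper runOnce() are made up for illustration, and the reader thread spins until it sees flag as true so that the outcome does not depend on thread timing.

```java
class VolatileFlagDemo {
    // Plain field: becomes visible to the reader because it is written before the volatile write.
    static int result = 0;
    // Volatile field: the write of true happens-before any read that observes true.
    static volatile boolean flag = false;

    // Hypothetical helper: runs one writer and one reader, returns what the reader saw.
    static int runOnce() {
        result = 0;
        flag = false;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!flag) {
                Thread.yield(); // wait until the volatile write is observed
            }
            seen[0] = result;   // guaranteed to read 30 here
        });
        Thread writer = new Thread(() -> {
            result = 30;
            flag = true;
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 30
    }
}
```

If flag were not volatile, the reader would have no guarantee of ever seeing the update, and even if it did, it would have no guarantee of seeing result as 30.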

If you have followed my post on the double check for a null instance, you will remember that we used the volatile modifier in line 2.
For convenience, here is the snippet again:

public class SingletonTest {
    private static volatile SingletonTest instance = null;
    // private constructor
    private SingletonTest() {
    }
    public static SingletonTest getInstance() {
        if (instance == null) {
            synchronized (SingletonTest.class) {
                // Double check
                if (instance == null) {
                    instance = new SingletonTest();
                }
            }
        }
        return instance;
    }
}
If one thread creates an object, it has to communicate the contents of its memory to the other threads. Otherwise the newly created object may remain visible only in that thread's own memory space. We need to communicate it to the other threads as well, so that our goal of creating a single object is achieved. That's why we used the volatile modifier in line 2.

Some people ask: since the lock, in the form of the synchronized block, also establishes a happens-before relationship, is the volatile modifier in line 2 really necessary?

The answer is yes, because here only the writing thread acquires the lock, not the reader thread. In line 7, the null check of instance is performed outside the synchronized block, and that is the check done by the reader thread.

Synchronization by itself would be enough in this case if the first check were inside the synchronized block. But we keep it outside the synchronized block to save the synchronization cost once the object has already been created, as discussed in my previous post, double check null instance.

Without explicit communication through the volatile variable, the reader thread is not guaranteed to see the fully constructed object created by the writer thread.
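For comparison, here is a rough sketch of the synchronization-only variant mentioned above, where the null check sits inside the lock. The class name SynchronizedSingleton is made up for illustration. Every caller pays the locking cost, but no volatile is needed because all reads and writes of instance happen under the same lock.

```java
class SynchronizedSingleton {
    // Not volatile: the field is only ever accessed while holding the class lock.
    private static SynchronizedSingleton instance = null;

    private SynchronizedSingleton() {
    }

    // The only null check is inside the synchronized region.
    public static synchronized SynchronizedSingleton getInstance() {
        if (instance == null) {
            instance = new SynchronizedSingleton();
        }
        return instance;
    }
}
```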

Immutability in Java

We are often asked questions about immutability in Java. When is a class said to be immutable? Can you tell whether a given class is immutable? There is a lot of discussion. Here I want to summarize what I know about immutability with the help of some examples.

Immutable:

Definition: Informally, we say an object is immutable if its state cannot be modified after construction. That is, the invariants established by the constructor always hold once construction has completed.

Now let's see some examples of immutable classes.

1. String:

As we all know, the String class is immutable, so let's start with it. In older JDK versions (the implementation shown below), String is final and has 4 instance variables that form the object state: the char array value, offset, count, and hash. Three of them are final, and hash is not. Now let's check the invariant. The invariant is that at any time after the String is constructed, the contents of the character array remain the same.
This invariant holds because the character array field is private and final, and String never modifies the array or exposes it after construction. Once we assign it in the constructor, we can't assign it again.

Why hash is not final:

Let's look at the hashCode method defined in String below.

public final class String {
    private final char value[];
    private final int offset;
    private final int count;
    private int hash;

    public int hashCode() {
        int h = hash;
        if (h == 0) {
            int off = offset;
            char val[] = value;
            int len = count;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }
}
Now let's think about hash. Think of it as a cache that stores the value of the hash code. If we never call hashCode(), it is never set. It could have been computed when the String was created, but that would make creation slower for a feature we might not need at all. On the other hand, it would be wasteful to recompute the hash every time it is required. So it is stored in a non-final field. Look inside the hashCode method: hash is first assigned to h, and if h is not 0, that value is returned immediately. So there is no need to recalculate the hash again and again.

The non-final field may give the impression that the invariants might not hold. But it is an internal implementation detail, which has no effect on the invariant we defined above.

 Since hash is non-final, we might suspect that it can change in the future in some cases. Let's be specific about it. hash holds the hash code of the String. In most cases the hash code is never needed during the lifetime of a String. It is needed only when the String is used in a hash-based collection such as a HashMap or HashSet. So String computes the hash code lazily rather than in the constructor. Also, if we look at the hashCode method, we see that it depends only on offset, count, and value, which are final and cannot change, so every calculation of the hash gives the same result. The hashCode method is not synchronized, so two threads may call it simultaneously and both write hash, but they write the same value, so the race is harmless. Because hash is assigned after construction, it cannot be final.
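The same lazy-caching idea can be sketched in a small class of our own. The name CachedHashText is made up for illustration, and the class is only an approximation of the pattern, not the JDK's code. The observable state lives in a final field, while the cache lives in a non-final one.

```java
import java.util.Arrays;

final class CachedHashText {
    private final char[] value; // never modified or exposed after construction
    private int hash;           // lazily computed cache, 0 means "not computed yet"

    CachedHashText(String text) {
        this.value = text.toCharArray(); // private copy of the characters
    }

    @Override
    public int hashCode() {
        int h = hash;            // read the field once into a local
        if (h == 0) {
            for (char c : value) {
                h = 31 * h + c;
            }
            hash = h;            // racing threads would store the same value
        }
        return h;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CachedHashText
                && Arrays.equals(value, ((CachedHashText) other).value);
    }
}
```

Because the computation depends only on the final value array, every thread that computes the hash arrives at the same number, which is why the unsynchronized write is safe.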

Why the String class is final:

 Let's assume the String class is not final and see what mishap can occur.

   Let's create a class MutableString which extends String.
           
// Hypothetical: this compiles only if String were not final.
public class MutableString extends String {
    private String text;

    public MutableString(String value) {
        super(value);
        text = value;
    }

    public String getText() {
        return text;
    }

    public void setText(String newValue) {
        text = newValue;
    }
}

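Now MutableString can be passed anywhere a String is expected, because it is a String. A subclass like this could also override methods such as contains() or toString() to use its mutable text field, so the value seen by callers could change over time. Consider the case below. We have a method that verifies the password: if the password fails certain criteria, the method rejects it; otherwise it forwards it to the next step.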

public String verifyPassword(String password) {
    if (!password.contains("some character"))
        throw new SecurityException("The password is not a valid one");
    // In between, another thread can come along and change the password value.
    // The new value has not been verified and may contain invalid characters,
    // but it is still returned by the method.
    return password;
}

Thread1
_______
MutableString password = new MutableString("secret");
verifyPassword(password);
password.setText("secret1");

Here, after the verification completed and before the password was returned, a thread came along and changed the password, as described in the code snippet above.
If the String class is final, this scenario cannot happen, because MutableString cannot extend String.

 So from this discussion we conclude that the String class should be final, and that the non-final hash field has no impact on the invariant of the String class.
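Since String really is final, the problem can only be reproduced with a stand-in. The sketch below uses a made-up non-final class Text in place of String, plus a made-up subclass MutableText, to show how a subclass can break the immutability that callers rely on. All names here are hypothetical.

```java
// Hypothetical non-final "immutable" class standing in for a non-final String.
class Text {
    private final String value;

    Text(String value) {
        this.value = value;
    }

    public String get() {
        return value;
    }
}

// The subclass overrides the accessor and makes the observable state mutable.
class MutableText extends Text {
    private volatile String text;

    MutableText(String value) {
        super(value);
        this.text = value;
    }

    @Override
    public String get() {
        return text;
    }

    public void setText(String newValue) {
        text = newValue;
    }
}

class SubclassDemo {
    // Returns true if the value that was checked differs from the value seen later.
    static boolean valueChangedAfterCheck() {
        MutableText password = new MutableText("secret");
        Text seenAsText = password;          // callers only know the base type
        String checked = seenAsText.get();   // the "verification" reads the value
        password.setText("secret1");         // later mutation through the subclass
        return !checked.equals(seenAsText.get());
    }
}
```

Declaring Text as final would make MutableText a compile error, which is exactly the protection the final modifier gives String.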

2. ThreeStooges:

 Consider the famous ThreeStooges.java example from Brian Goetz, page no 32.

import java.util.HashSet;
import java.util.Set;

public final class ThreeStooges {
    private final Set<String> stooges = new HashSet<String>();

    public ThreeStooges() {
        stooges.add("Moe");
        stooges.add("Larry");
        stooges.add("Curly");
    }

    public boolean isStooge(String name) {
        return stooges.contains(name);
    }
}

Notice that the Set that stores the names is itself mutable, although the reference to it is final. Still, follow the argument below that the ThreeStooges class is immutable.

Let's consider the invariant for this class.
The invariant is that the set held in the instance variable must not change after construction of the object has finished.

Now let's argue that we can change it after construction.
But the stooges reference is private and final, so once it is assigned and initialized in the constructor it cannot be reassigned. If there were a method that modified the stooges set after construction, or if a reference to this object escaped to some other thread before construction completed, then our invariant might not hold after object construction. Neither is the case here.

So we conclude that our assumption, that we can change it after object construction, is wrong.

But we may still ask whether the line
 ThreeStooges ts = new ThreeStooges()
 is thread safe. Is it possible for one thread to see a partially initialized ThreeStooges object while another thread is still initializing it?
 It is thread safe, and no such thing will happen: the JMM guarantees this through the final keyword. See my blog on compiler reordering: final and volatile

 Hence the ThreeStooges class is immutable.

For more details, please refer to Brian Goetz, Java Concurrency in Practice, page 31.

3. Unmodifiable HashMap:

public class UnmodifiableHashMap<K, V> implements Map<K, V> {
    private final Map<K, V> map;
    public UnmodifiableHashMap(Map<K, V> map) {
        this.map = new HashMap<K, V>(map);
    }

    @Override
    public V get(Object key) {
        return map.get(key);
    }

    @Override
    public V put(K key, V value) {
        throw new UnsupportedOperationException();
    }

    // remaining Map methods omitted
}

Similarly, we can override all the other accessor and mutator (modifier) methods: accessors delegate to the internal map, and mutators throw UnsupportedOperationException. In this way we can turn a HashMap into an immutable map. This immutable map can be shared safely among multiple threads. The guarantee comes from the final keyword used in line 2 of the code snippet above, together with the private copy made in the constructor.
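In practice, the same effect can be had without writing every Map method by hand, by combining a defensive copy with the JDK's Collections.unmodifiableMap. The sketch below is one possible arrangement. The class ImmutableMapHolder and its helper rejectsPut are made-up names for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class ImmutableMapHolder {
    private final Map<String, Integer> map;

    ImmutableMapHolder(Map<String, Integer> source) {
        // Defensive copy first, then a read-only view over the private copy.
        this.map = Collections.unmodifiableMap(new HashMap<>(source));
    }

    Map<String, Integer> asMap() {
        return map; // safe to hand out: every mutator throws
    }

    // Hypothetical helper: returns true if a put on the given map is rejected.
    static boolean rejectsPut(Map<String, Integer> m) {
        try {
            m.put("x", 1);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```

The copy matters as much as the wrapper: without it, whoever still holds the original map could keep changing the contents behind the read-only view.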

Conclusion:

 From the above three examples and discussions we conclude that
  •  An object is immutable if its state can't be modified after construction.
  •  An immutable object is thread safe.

 Ways to achieve it:

 1. All fields should be final.
 2. The this reference should not escape during construction to any outside thread or client.
 3. There should not be any mutator method that can change the value of an instance variable after object construction.
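As a small illustration of the three rules, here is a sketch of a made-up class, ImmutableRange. It is not taken from any library. It simply applies each rule in turn.

```java
final class ImmutableRange {
    private final int low;   // rule 1: every field is final
    private final int high;

    ImmutableRange(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        // rule 2: the constructor only assigns fields; this is not passed anywhere
        this.low = low;
        this.high = high;
    }

    int low() {
        return low;
    }

    int high() {
        return high;
    }

    // rule 3: no setters; a "change" returns a new object instead
    ImmutableRange withHigh(int newHigh) {
        return new ImmutableRange(low, newHigh);
    }
}
```

The invariant low <= high is checked once in the constructor and can never be broken afterwards, because nothing can change either field.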

Thursday, October 1, 2015

Compiler Reordering: final and volatile

Usually, when we write a statement like Object o = new Object(); it becomes roughly a three-step sequence of CPU instructions:
  1. Allocate space for the new object
  2. Store a reference to the unconstructed object in the variable
  3. Initialise the object
The steps above are not exact, but something similar happens when an object is created.
Let's see an example.

class MyClass {
  int i;
  MyClass () {
    i = 1;
  }
}

When we write something like MyClass clazz = new MyClass();

We would ideally expect the following steps:

  1. var = Allocate space for MyClass
  2. var.i = 1;
  3. clazz = var;
 But the compiler might use a different ordering. For optimization purposes, it may rewrite the line above as in the snippet below.

  1. clazz = Allocate space for MyClass
  2. clazz.i = 1;
 This ordering differs from what we assumed. We call this compiler reordering of the statements.

But such reordering affects thread safety. Assume one thread is creating the MyClass object and has just completed step 1. Now another thread comes along and sees that clazz is not null, because thread 1 completed step 1. It then reads clazz.i and may get the wrong value (the default 0), since thread 1 has not completed step 2 yet.

Thread 1:
MyClass clazz = new MyClass();

Thread 2:
if (clazz != null) {
  System.out.println(clazz.i);
}

So there is no guarantee that thread 2 will print 1.

This is a thread safety concern.

Prevent Compiler Reordering:

1. final
 If we redesign our class like this:

class MyClass {
  final int i;
  MyClass () {
    i = 1;
  }
}

Note that we made the variable i final. Now we can say this class is thread safe.
Without the final modifier, the compiler and JVM are allowed to move the write to i so that it occurs after the reference to the new object is written to clazz. The final modifier forbids this kind of reordering.

 2. volatile
If you refer to my post on double-checked locking for a singleton, you will see that in line 2 we used the volatile keyword for our SingletonTest instance. Without the volatile keyword, this code is not guaranteed to work in Java. The basic rule is that compiler reordering can change the code so that the code in the SingletonTest constructor runs after the write to the instance variable in line 11. If this happens, there is a thread safety issue.

Assume we have two threads, Thread1 and Thread2. Thread1 comes along, sees that instance is null in the getInstance method, and proceeds to execute line 11. But line 11 is not an atomic operation, so just after the reference is assigned to instance and before the SingletonTest object is completely constructed, Thread2 can come along and, at the null check in line 7 of getInstance, read instance before Thread1 has finished the construction.

If we make the instance field volatile in line 2, the actions that should happen before the write to instance in the code must actually happen before the write to instance. No compiler reordering is allowed.