Source: https://dzone.com/articles/happens-before-in-java-or-how-to-write-thread-safe
Happens-Before In Java Or How To Write a Thread-Safe Application
This article explains the notion of happens-before in Java, including ways to install it, what guarantee it gives, advantages it brings, and how to use it.
Multithreading is the most complex part of Java. Happens-before is a relation that provides the guarantees needed to write predictable code in a multithreaded reality. Such code is also known as thread-safe code. Unfortunately, the Oracle Java documentation about this notion is hard to read. So in this article, I'll mainly explain what happens-before is in human language and provide detailed examples.
Happens-Before Solves the Main Multithreading Problem
Before we start learning the happens-before notion, we have to understand the reason for creating it. Once multithreading comes to the scene, your code might become inconsistent because shared objects between threads might have different and unpredictable values. Let's review a simple example. In that example, we will update values in one thread and read and print them in another.
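The original listing is not reproduced here, so the following is only a minimal sketch of such a program. The names UnsafeDemo, State, and runOnce are assumptions made for illustration. One thread writes x and then y, and another reads and reports both, with no volatile and no lock between them.

```java
// Illustrative sketch only: UnsafeDemo, State, and runOnce are assumed names.
class UnsafeDemo {
    static class State {
        int x; // plain field: no volatile, no lock
        int y;
    }

    // Starts one writer and one reader with no synchronization between them
    // and returns the pair {x, y} that the reader observed.
    static int[] runOnce() throws InterruptedException {
        State state = new State();
        int[] seen = new int[2];

        Thread writer = new Thread(() -> {
            state.x = 1;
            state.y = 1;
        });
        Thread reader = new Thread(() -> {
            seen[0] = state.x;
            seen[1] = state.y;
        });

        writer.start();
        reader.start();
        // join() only makes it safe for this method to read `seen` afterwards;
        // it does not order the writer against the reader.
        writer.join();
        reader.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] seen = runOnce();
        System.out.println(seen[0] + "," + seen[1]);
    }
}
```

Which pair a given run prints is not guaranteed, so the sketch deliberately carries no expected-output comment.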
Expected Behavior (False Expectations)
Now let's think about what values we might see in the second thread while printing. Depending on progress in the first thread, we might expect 3 situations:
- A: x and y are not initialized, so 0 and 0 will be printed.
- B: x is set and y is not set yet; then, 1 and 0 will be printed.
- C: x and y are set; then, 1 and 1 will be printed.
Now let's run the program a million times and see what values have been printed.
Among a million prints, only three different outputs appeared (with my hardware, Windows OS, and my JDK). For some reason, case B (1 and 0) never appeared, but that is less confusing than the 0,1 pair, which did appear and looks impossible. The only thing we have to learn here is that the results are unpredictable.
Why Shared Memory Might Be Inconsistent
This is a pretty big topic and it's not the main subject of this article, so I will just briefly introduce the most known reasons.
Reordering or JVM Optimizations
The JVM can change the order of instructions when doing so does not change the result of the program as seen from a single thread. For example, it might change the order of variable initialization. Another thread, however, can observe the changed order.
CPU Memory Cache
Each thread can be executed on a separate CPU core. Each core has its own cache holding copies of variables. When one thread updates a value in its core's cache, another thread may still see an outdated value. We can illustrate it like this:
In this simplified picture, both threads read data from shared RAM and then work on their own copies, which only agree again after resynchronization. By default, nothing forces the CPU to write an updated value back to shared RAM.
What Is Actually Important: Thread-Safe Code
Everything explained earlier is only valuable for passing interviews; in practice, all developers need is a way to write code that behaves predictably and consistently in a multithreaded reality, also known as thread-safe code. Fortunately, Java offers ways to write such code.
Happens-Before Gives a Guarantee of Visibility of Same Fields in Different Threads
The happens-before relationship is described in the Java Language Specification, section 17.4.5, available in the Oracle documentation. If we install the happens-before relation between an action in one thread and an action in another, then the second thread will see all changes the first thread made before that point. Therefore, the second thread's execution will be similar to a single-threaded application.
Install Happens-Before Relationship: Making Fields Visible (Using a Volatile Keyword)
The first way to install the happens-before relation is to mark a shared variable with the volatile keyword. If a variable is volatile, then the happens-before relation is installed between every write to it and each subsequent read of it. The specification says:
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
This means that:
- Once y is updated in the first thread, the new value will be visible to subsequent reads in the second thread.
- Once the new value of y is visible in the second thread, every write made before the write to y is visible there as well.
Considering this, we have to make the following changes in order to keep x and y consistent:
- Make y volatile.
- Read x after y in the second thread.
- Check that y is set to 1 before using x.
With all these changes, we rewrite our code in this way:
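The original listing is not reproduced here. As a minimal sketch of the three changes, assuming the illustrative names VolatileDemo, State, and runOnce, the code could look like this:

```java
// Illustrative sketch only: VolatileDemo, State, and runOnce are assumed names.
class VolatileDemo {
    static class State {
        int x;          // plain field
        volatile int y; // a write to y happens-before every subsequent read of y
    }

    // Returns {x, y} as seen by the reader, or {-1, -1} if the reader ran
    // before the writer had published y.
    static int[] runOnce() throws InterruptedException {
        State state = new State();
        int[] seen = {-1, -1};

        Thread writer = new Thread(() -> {
            state.x = 1; // ordinary write, placed before the volatile write
            state.y = 1; // volatile write: publishes x as well
        });
        Thread reader = new Thread(() -> {
            int y = state.y;       // volatile read comes first
            if (y == 1) {          // the writer's volatile write has been observed
                seen[0] = state.x; // so this read must see x == 1
                seen[1] = y;
            }
        });

        writer.start();
        reader.start();
        writer.join();
        reader.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] seen = runOnce();
        System.out.println(seen[0] + "," + seen[1]);
    }
}
```

The reader may still run before the writer, in which case the check on y fails and nothing is reported. What the sketch rules out is seeing y == 1 together with a stale x.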
This might be a bit complicated. One thing you have to pay attention to in the illustration is that the blue portal marks where the happens-before relation is installed (the starting point), and the yellow portal marks where it is completed: all data that "passed" through is received in its latest, up-to-date state.
Install Happens-Before Relationship: Making Ordered Access to Fields (Using Synchronized Monitor)
Java provides a way to organize ordered access to instructions, known as a Java monitor. This subject is pretty complex. I have described what it is and how it works in a set of three articles, which I recommend reading:
- Multithreading Java and Interviews Part 1: An Introduction
- Multithreading Java and Interviews Part 2: Mutex, the Java Monitor Model
- Multithreading Java and Interviews Part 3: Wait and Notify All
The happens-before relationship is installed between a lock release and every subsequent acquire of the same lock. So, if we rewrite our code using locks, we also have a happens-before guarantee:
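The original listing is not reproduced here. A minimal sketch, assuming the illustrative names MonitorDemo, State, update, and snapshot, guards both fields with the same monitor through synchronized methods:

```java
// Illustrative sketch only: MonitorDemo, State, update, and snapshot are assumed names.
class MonitorDemo {
    static class State {
        private int x; // not volatile: every access goes through the monitor
        private int y;

        // Releasing the monitor at the end of this method happens-before
        // any later acquire of the same monitor.
        synchronized void update() {
            x = 1;
            y = 1;
        }

        // Acquiring the same monitor makes the latest writes visible.
        synchronized int[] snapshot() {
            return new int[] {x, y};
        }
    }

    // Returns the pair {x, y} that the reader observed under the lock.
    static int[] runOnce() throws InterruptedException {
        State state = new State();
        int[][] seen = new int[1][];

        Thread writer = new Thread(state::update);
        Thread reader = new Thread(() -> seen[0] = state.snapshot());

        writer.start();
        reader.start();
        writer.join();
        reader.join();
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        int[] seen = runOnce();
        System.out.println(seen[0] + "," + seen[1]);
    }
}
```

Because both methods lock the same object, the reader sees the state either before the update or after it, never a half-updated pair.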
As you can see, neither field is volatile; but when the first thread updates the values and releases the lock, the second thread sees the updated values once it acquires the same lock, due to the happens-before relation. The same guarantee applies to the Lock implementations from java.util.concurrent, which are documented to provide the same memory synchronization semantics as the built-in Java monitor.
Install Happens-Before Relationship: Thread Start and Join
This might be the easiest case compared to the previous two. The Java specification says:
- A call to start() on a thread happens-before any actions in the started thread.
- All actions in a thread happen-before any other thread successfully returns from a join() on that thread.
When a thread starts, it sees the state of shared variables as the parent thread left it before calling start(). The same works in the opposite direction when the parent thread returns from join(): it receives the latest state written by the child thread.
Let's rewrite our example and illustrate how bean data is passed to the child thread when we call the start() method:
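The original listing is not reproduced here. A minimal sketch, assuming the illustrative names StartDemo, Bean, and runOnce, sets the bean in the main thread and reads it in the child:

```java
// Illustrative sketch only: StartDemo, Bean, and runOnce are assumed names.
class StartDemo {
    static class Bean {
        int x; // plain fields: no volatile, no lock
        int y;
    }

    // Returns the pair {x, y} that the child thread observed.
    static int[] runOnce() throws InterruptedException {
        Bean bean = new Bean();
        int[] seen = new int[2];

        // Writes made by the parent before start() ...
        bean.x = 1;
        bean.y = 1;

        Thread child = new Thread(() -> {
            // ... are visible to every action in the started thread.
            seen[0] = bean.x;
            seen[1] = bean.y;
        });
        child.start();
        child.join(); // used here only so the result can be read back safely
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] seen = runOnce();
        System.out.println(seen[0] + "," + seen[1]); // prints 1,1
    }
}
```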
The example shows that values set in the main thread are successfully passed to the child thread, thanks to the happens-before relation installed by the start() method.
Now let's illustrate the same logic, but with data passed from the child to the parent on the join() method:
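The original listing is not reproduced here. A minimal sketch, assuming the illustrative names JoinDemo, Bean, and runOnce, lets the child write the bean and the main thread read it after join():

```java
// Illustrative sketch only: JoinDemo, Bean, and runOnce are assumed names.
class JoinDemo {
    static class Bean {
        int x; // plain fields: no volatile, no lock
        int y;
    }

    // Returns the pair {x, y} that the main thread observed after join().
    static int[] runOnce() throws InterruptedException {
        Bean bean = new Bean();

        Thread child = new Thread(() -> {
            bean.x = 1;
            bean.y = 1;
        });
        child.start();

        // All actions in the child happen-before join() returns here,
        // so the reads below must see both writes.
        child.join();
        return new int[] {bean.x, bean.y};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] seen = runOnce();
        System.out.println(seen[0] + "," + seen[1]); // prints 1,1
    }
}
```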
In this example, we pass data from the child thread to the awaiting main thread. Once the child thread finishes its run and join() returns in the main thread, the values are passed correctly due to the happens-before relationship.
Are Examples Production-Ready and Recommended To Use?
No, no, no! The examples that I provided served only one purpose: to explain what happens-before is and how it works. In practice, you can't be too careful in a multithreading environment. Here is what to avoid from what I have shown:
- In the first case with visibility (volatile keyword), I marked only one field as volatile. The code will work as expected; but in practice, someone can later change the order of the reads and break the happens-before relationship.
- In the case of the synchronized monitor, I didn't mark the fields as volatile because it's not necessary there. Don't copy that in practice: keep fields volatile even if they are only changed inside the lock.
Also, it is worth mentioning that in the first example (with volatile), I didn't cover the cases where the happens-before relation alone is not enough to keep data consistent. The volatile keyword gives you safe reads, but it does not protect simultaneous writes, which leads to race conditions. This subject is outside the scope of this article.