There are two main reasons why threads have data visibility issues. The first is related to compiler optimization. The second is caused by CPU and cache optimizations.
Let's start with JIT optimization.
Java code is compiled to bytecode and executed on the JVM. To improve performance, the JVM monitors execution, and once a method has been executed a specific number of times, it is optimized.
We can modify the compilation threshold with:
-XX:CompileThreshold=10000 (the server default; the client default is 1,500)
You can use
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version
to list all options with the values in effect on your JVM.
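The same values can also be read from inside a running program. The following is a minimal sketch, assuming a HotSpot-based JVM that exposes HotSpotDiagnosticMXBean; the class name VmFlags and its helper method are hypothetical and only serve the illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the original project.
class VmFlags {
    // Returns the value currently in effect for a HotSpot flag, as a string.
    // Throws IllegalArgumentException if the JVM does not know the flag.
    static String flag(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // The two flags discussed in this article.
        System.out.println("CompileThreshold = " + flag("CompileThreshold"));
        System.out.println("Inline           = " + flag("Inline"));
    }
}
```

Running it with and without -XX:-Inline should show the Inline value change, which is a quick way to confirm that a flag was actually picked up.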
As soon as a method becomes hot, it is compiled to native code, which replaces the interpreted version. There are two compilers, C1 and C2, and a number of optimizations. In our case we will focus on inlining.
Inlining is a manual or compiler optimization that replaces a function call site with the body of the called function.
We can observe the effect of inlining in a simple multithreaded program: one thread spins in a busy loop, and another thread tries to stop it by changing a boolean flag.
The lack of proper synchronization is intentional. The JIT will compile our loop method and replace it with an optimized version in which the call that reads the flag is inlined and the read is hoisted out of the loop, so the flag is effectively checked only once.
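A minimal sketch of such a program follows. It is a reconstruction under assumptions, not the original source: the getter isFlag(), the method loopAsIfOptimized(), the daemon thread and the timeouts are all hypothetical additions that let the demo return instead of hanging. Only the class name and the printed messages come from the article.

```java
// Sketch of the busy-loop experiment; intentionally NOT correctly synchronized.
class JITOptimization {
    // Plain field: no volatile, no lock. This is the bug being demonstrated.
    private static boolean flag = true;

    // Small getter (an assumption); a typical candidate for inlining.
    static boolean isFlag() {
        return flag;
    }

    // The loop as written in the source code.
    static void loop() {
        System.out.println("I'm in the loop");
        while (isFlag()) {
            // busy wait
        }
        System.out.println("I'm outside of the loop");
    }

    // Roughly how the compiled loop may behave once isFlag() is inlined and
    // the field read is hoisted: the flag is read once and never again.
    static void loopAsIfOptimized() {
        boolean hoisted = flag;
        while (hoisted) {
            // busy wait, no further reads of the field
        }
    }

    // Starts the loop thread, flips the flag after sleepMillis, then waits up
    // to joinMillis. Returns true if the loop thread noticed the change.
    static boolean runDemo(long sleepMillis, long joinMillis) {
        flag = true;
        Thread t = new Thread(JITOptimization::loop, "loop-thread");
        t.setDaemon(true); // assumption: lets the JVM exit even if the loop spins forever
        t.start();
        try {
            Thread.sleep(sleepMillis);
            flag = false;
            System.out.println("Flag has been changed");
            t.join(joinMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        boolean stopped = runDemo(1000, 2000);
        System.out.println(stopped ? "Loop thread stopped" : "Loop thread is still spinning");
    }
}
```

Whether the loop thread stops depends on the JVM, its flags and timing, so the sketch reports the outcome rather than assuming one.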
We can use JITWatch to inspect the optimized code.
This optimization prevents updates to the flag from becoming visible to the loop thread.
If we execute the program, it never terminates:
I'm in the loop
Flag has been changed
To fix it, we can run the code with -XX:-Inline, which disables inlining:
java -XX:-Inline jitoptimization.JITOptimization
I'm in the loop
Flag has been changed
I'm outside of the loop
This is one example of how JIT optimization can produce behavior that looks like out-of-order execution, and why we have to care about data visibility in multithreaded applications.
If the program is correctly synchronized, e.g. the flag is volatile, the compiler must not apply any optimization that affects data visibility. You can check the impact of volatile on the optimized code in the JITWatch volatile log.
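The volatile variant can be sketched as below. This is an illustration under assumptions rather than the article's original code: the class name VolatileFlag, the daemon thread and the timeouts are hypothetical.

```java
// Sketch of the correctly synchronized variant: the flag is volatile.
class VolatileFlag {
    // volatile forces every loop iteration to perform a real read of the field.
    private static volatile boolean flag = true;

    // Starts the loop thread, clears the flag after sleepMillis, then waits up
    // to joinMillis. Returns true if the loop thread terminated.
    static boolean runDemo(long sleepMillis, long joinMillis) {
        flag = true;
        Thread t = new Thread(() -> {
            while (flag) {
                // busy wait; the volatile read cannot be hoisted out of the loop
            }
        }, "loop-thread");
        t.setDaemon(true);
        t.start();
        try {
            Thread.sleep(sleepMillis);
            flag = false; // volatile write, visible to the subsequent volatile read
            t.join(joinMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Loop thread stopped: " + runDemo(1000, 5000));
    }
}
```

Here no JVM flag is needed: the Java Memory Model guarantees that the write to a volatile field becomes visible to later reads of that field in other threads.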
JCStress is one of the OpenJDK test frameworks. It is designed to validate Java Memory Model (JMM) requirements on the JVM.
The latest release supports only Java 9, but older revisions also work with Java 8. The API is simple and annotation-based.
To get a Java 8 version, clone the Mercurial repository and update it to an older revision:
hg clone http://hg.openjdk.java.net/code-tools/jcstress/
cd jcstress
hg up 223:bda9fbee58c8
mvn clean install
The main annotations are:
@JCStressTest - marks a class as a test
@State - marks the class that holds the test state (in most cases the same class as @JCStressTest); a state class requires a no-argument constructor
@Outcome - describes how to interpret a test result. The following arguments are allowed:
  id - the expected result, e.g. "[1, 0]" for IntResult2
  expect - an Expect enum value, most commonly Expect.ACCEPTABLE or Expect.FORBIDDEN
  desc - a description of the result
@Description - a text description of the test
Inside the test class, you mark the methods to be executed by separate threads with @Actor.
If you need to inspect the state after all threads have finished, annotate a method with @Arbiter. It is executed after all @Actor methods have completed.
The results are collected by classes annotated with @Result, which are passed as arguments to @Actor or @Arbiter methods.
There are a number of predefined @Result classes, e.g. BooleanResult1, IntResult3, DoubleResult1, StringResult1.
Now it is time to write the first test. It is an example of StoreLoad reordering on modern CPUs (the lack of proper synchronization is intentional).
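To make the shape of the StoreLoad experiment concrete, here is a plain-Java approximation that uses only the standard library. It is not the JCStress test and does not use the JCStress API; the class StoreLoadSketch and its fields are hypothetical names. Each actor stores to one variable and then loads the other, and the pair [r1, r2] is tallied per iteration.

```java
// Plain-Java approximation of the StoreLoad litmus test (not JCStress).
class StoreLoadSketch {
    // Shared state: plain fields, intentionally without synchronization.
    static int x, y;
    // Values observed by the two actors.
    static int r1, r2;

    // Runs the experiment and returns counts indexed by r1 * 2 + r2:
    // [0] = [0, 0], [1] = [0, 1], [2] = [1, 0], [3] = [1, 1].
    static int[] run(int iterations) {
        int[] counts = new int[4];
        for (int i = 0; i < iterations; i++) {
            x = 0; y = 0; r1 = 0; r2 = 0;
            Thread a = new Thread(() -> { x = 1; r1 = y; }); // actor 1: store x, load y
            Thread b = new Thread(() -> { y = 1; r2 = x; }); // actor 2: store y, load x
            a.start();
            b.start();
            try {
                a.join(); // join() makes the actors' writes to r1 and r2 visible here
                b.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return counts;
            }
            counts[r1 * 2 + r2]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = run(10_000);
        System.out.println("[0, 0] " + c[0]);
        System.out.println("[0, 1] " + c[1]);
        System.out.println("[1, 0] " + c[2]);
        System.out.println("[1, 1] " + c[3]);
    }
}
```

Because a fresh pair of threads is started for every iteration, the two actors rarely overlap, so this sketch may never show [0, 0]. Making that overlap likely is exactly the part JCStress takes care of.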
You can download the JMMPresentation git project and build it (you have to build JCStress first so that its jar is in your local .m2 repository).
To start the tests, execute:
java -jar target/jmm_jcstress.jar
There are a number of command line options. The most useful are:
-l list all tests
-v verbose output (use it to have test result on console)
-t <test name> execute the tests whose names match the provided regex pattern
After execution, the framework generates an HTML report inside the results folder. For command line use, it is better to add -v for verbose output.
To run the StoreLoad test:
java -jar target/jmm_jcstress.jar -v -t StoreLoad
Reading the results back...
Generating the report...
(ETA: n/a) (R: 0.00E+00) (T: 1/0) (F: 1/1) (I: 1/5) [OK] jcstress.StoreLoad

Observed state   Occurrences   Expectation       Interpretation
[1, 1]           41,346        ACCEPTABLE        Both actors have finished in the same time
[0, 1]           34,507        ACCEPTABLE        First Actor have finished before second
[1, 0]           39,358,089    ACCEPTABLE        Second Actors have finished before first
[0, 0]           5,978         ACCEPTABLE_SPEC   Intel can reorder Stores with Load
Now we can run the same test with the CPU affinity set to a single core (on Linux, for example, by prefixing the command with taskset -c 0).
The StoreLoad reordering [0, 0] should not be visible on a single core, because both threads then share the same caches and store buffer.
Reading the results back...
Generating the report...
(ETA: n/a) (R: 0.00E+00) (T: 1/0) (F: 1/1) (I: 1/5) [OK] jcstress.StoreLoad

Observed state   Occurrences   Expectation       Interpretation
[1, 1]           14            ACCEPTABLE        Both actors have finished in the same time
[0, 1]           0             ACCEPTABLE        First Actor have finished before second
[1, 0]           146           ACCEPTABLE        Second Actors have finished before first
[0, 0]           0             ACCEPTABLE_SPEC   Intel can reorder Stores with Load