13.04.2023 - Markus Seidl - 7 min read New Random Generators in Java 17

Java 17 introduced new random number generators. These provide longer cycles, improved statistical uniform distribution, improved threading support, and Streams API integration. This article provides a short introduction to random number generators and an overview of the improvements made in Java 17.

How do Pseudo Random Number Generators work?

All pseudo random number generators, abbreviated RNGs, have an internal state that is initialized upon creation with a user provided seed. When a new random number is requested, the internal state is then updated by an RNG-type specific function to produce the next state and the actual random number output.

$$ \text{state} := RNG_\text{init}(\text{seed}) $$ $$ (\text{state}_\text{new}, \text{output}) := RNG(\text{state}) $$

All RNGs included in Java need a “good” seed to produce a random state cycle that is different from previous instances. A “good” seed has to be different between every instance of an RNG, different from zero and provide enough not zero bits for the RNG function to work.

One very simple RNG implementation function, called Linear Congruential Generator (LCG), is shown below. Both the old and new Java Random use LCGs, but in different ways. The simplified formula is: $$ \text{state}_\text{new} := \text{state} \cdot \text{multiplier} + \text{addend} $$ The multiplier and addend are fixed for an RNG and are static final constants in Java. These values are integral to the RNG and are carefully chosen by the RNG designer. The LCG uses very large multipliers that overflow the underlying data type (usually long) to produce a pseudo random sequence.
Depending on the initial state, generated by the provided seed, the new state is deterministically determined by updating to the new state with the multiplication (and addition). The cycle ends, when the last generated state equals the initial state. The length of the cycle is the period of RNG. Starting from the same initial state an RNG will produce the same number sequence in its state cycle.
RNGs with higher periods can be necessary, depending on the application. The length of the period does not determine the quality of the RNG. The quality is how uniformly distributed the generated numbers are. Uniform distribution, loosly speaking, can be explained as: No number should be generated as output more often than any other number for all numbers possible. To determine the quality of the uniform distribution of an RNG there exists a large number of tests (PractRand, TestU01 among others). These are beyond the scope of this article.

Internal State and Multithreading

The internal state is updated with each generation of a new random number. The previous Java Random implementation is thread safe by protecting the internal state with typical thread synchronisation methods (AtomicLong). This provides ease of use, but trades performance with thread safety and synchronization.
In contrast, the new RNG implementation in Java 17 will use a new seed derived from the initial seed to generate non-overlapping state cycles for each thread. This method is called splitting. This method has been more thoroughly tested and tuned with the actual RNG algorithm to provide non-overlapping sequences than the previous SplittableRandom implementation. Another method is called jumping. A jumping algorithm can advance the state of an RNG usually to the equivalent of $2^{64}$ calls to generate a new random number. This is garanteed to produce non-overlapping state cycles. These two methods are faster for multi-threading than the classical Random class and provide independent random numbers dynamically for as many threads as needed.

Java 17s new LXM RNGs

Java 17 added new RNGs which are called LXM. These RNGs are developed by Sebastiano Vigna et al.1 Sebastiano Vigna does a lot of work on Xoshiro, Xoroshiro, and related RNGs included in almost all standard libraries in many programming languages (Rust, .NET, JavaScript, Julia and others).
LXM is an abbreviation for:

  • L: Linear Congruential Generators (a variant from the example above)
  • X: Xor-Based Generator (XBG, like Xoshiro and Xoroshiro)
  • (a simple combining operation for L and X, usually addition or xor)
  • M: Mixing Function (A special hash function)

The new LXM is therefore a combination of two RNGs (LCG and XBG) combined with a specially crafted hash function to produce random numbers that do not have the weakness of either LCGs or XBGs. This allows the use of smaller data types to preserve performance and quality. Additionally, the two RNGs can be executed out-of-order on modern CPUs. LXMs pass all statistical tests in specific test suites with high probability, whereas the old Java Random is known to fail many.
The following types are available in Java 17:

  • L32X64MixRandom (default new RNG)
  • L64X128MixRandom, L64X128StarStarRandom, L64X256MixRandom and L64X1024MixRandom
  • L128X128MixRandom, L128X256MixRandom and L128X1024MixRandom
  • Xoroshiro128PlusPlus and Xoshiro256PlusPlus (alternative algorithms)

The name LXM consists of L, X, and M interweaved with

  • The output bit and LCG size (L…)
  • The size of the XBG (X…)
  • and the Mixing function.

The RNG L32X64MixRandom does have a 32 bit output size, 64 bits for the XBG, and is using the “Mix” hash function to improve the output. The output size of an LXM has to be chosen depending on which random numbers are generated the most in the application. The default RNG L32X64MixRandom is good enough as the default for nextInt() and nextFloat(), but if the main use case is to generate nextDouble() or nextLong() a L64X... or higher class should be used.
The period of these RNGs depends on the size of the LCG and XBG. The L32X64 variant has a period of $\approx 2^{96}$ and the next larger variant L64X128 provides $\approx 2^{192}$. The period can be calculated, but it’s easier to look at the JavaDoc of each variant as it’s documented there.2 The period is the total amount of different states an RNG can generate. Ideally, the total amount of combinations needed for an application is calculated or estimated. The RNG variant should be chosen in a way, that it has a period at least as large as the amount of combinations needed. For example, if the application should generate all possible shuffles of a deck of 52 cards, a period of at least $2^{225}$ ($52 \cdot 51 \cdot 50 \cdot \ldots = 52!$) should be used. For this application, the default RNG is too small ( L32X64MixRandom $\approx 2^{190}$) and not all card shuffle combinations can be reached. Theoretically, at least an L64X256MixRandom (or higher, like X1024) is needed.

RNG Usage in Java 17

Java is an object-oriented language with deep roots in good object-oriented design. If Java improves something, there will also be a new and easier way to use class design that goes along with it. This also holds true for the new RandomGenerator interface introduced.

Starting with Java 17 RNGs can be created by

var rng = RandomGeneratorFactory.getDefault().create();

The rng can then be used in the same way as the old Random class in single-threaded applications.
Every Java-compatible runtime has to provide these interfaces, but it does not specify which specific algorithms have to be implemented and which algorithm is the default. To instantiate specific variants of an RNG algorithm specify them with a string like this:

var rngf = RandomGeneratorFactory.of("L64X128StarStarRandom").create();
assert !rng.isDeprecated();

The method of("L64X128StarStarRandom") will produce an IllegalArgumentException if the type is not implemented in the JVM. The RandomGeneratorFactory can be queried for algorithms and provides methods for introspection, for example period(), isJumpable() and isSplittable(). Algorithms can also be deprecated, so an additional check for deprecation can be added for safety.
The new RandomGenerator instances are not thread-safe and can’t be used in multi-threaded scenarios correctly (the state-update will fail to update correctly). The LXM has to be splitted for each thread, which provides each thread an independent RNG.

var srng = (RandomGenerator.SplittableGenerator)rng;
var independentRng = srng.split();

The independentRng implements the RandomGenerator interface and can be used independently from another thread. The SplittableGenerator interface also supports the Stream API:

// Stream API
var someRandomNumbers = srng.splits(100)
	.map(RandomGenerator::nextLong)
	.collect(Collectors.toSet()).parallelStream();

A minor, but noteable, improvement has been implemented for nextGaussian, which uses a recent optimization to generate normal distributed random numbers. In detail it switched from the classic Marsigla-Polar (which is additionally synchronized) to an optimized Ziggurat algorithm. This method is not only faster by multiple factors, but also doesn’t use complex mathematic functions (in most cases).

Conclusion

The new RandomGenerator provides a drop-in replacement supported by an interface abstraction and improved performance not only in single-thread use but also in multi-threading use. The new default algorithm is faster and provides a better statistical uniform quality than the old implementation. There is technically no reason to switch to the new algorithm after Java 17 even more so, as the RandomGenerator interface implements all the methods of the old Random class with the same method signatures.

Credits

Title image by Kseniia Belka on Shutterstock

References


  1. Original LXM publication paper Link ↩︎

  2. Java Package Documentation for the new Random Link ↩︎

Markus Seidl

Managing Technical Consultant