Java: towards JVM integrity by default

This article first appeared in Programmez! Hors série #16 (in French only).

The Java Virtual Machine (JVM) is an execution environment that enables programs written in Java (or other languages compiled into Java bytecode) to run on different operating systems and hardware architectures.

From the beginning, the JVM was designed to be dynamic: it can load and execute code that was not present at compile time. It can also call native libraries, and it supports numerous monitoring features.

Java code can also be called dynamically using the Reflection API, and the Unsafe class can even be used for memory access, bypassing Java’s memory allocation mechanisms.

All these features have made the JVM one of the platforms of choice for enterprise application development.

But security principles have evolved since its creation, and the risks and impact of application security flaws keep growing. The JVM therefore had to evolve to reduce its attack surface while retaining the features that have made it such a success. These changes have been underway for a long time, at least since Java 9 and the modularization of the JDK, but only recently has the overall approach been spelled out in a JDK Enhancement Proposal (JEP), which is still a draft: JEP draft: Integrity by Default. This JEP defines integrity by default, explains the reasons for it, and lists the JEPs involved. It is an umbrella JEP that brings together many other JEPs.

What is integrity by default?

Here’s what the JEP has to say about it:

In the context of a computer program, integrity means that the constructs from which we build the program — and ultimately the program itself — are both whole and sound.

Behind this lies a very simple principle: the JVM specification must describe precisely what is required for a program to be valid, and its implementation must obey it. For example: the array specification defines that an array can only be accessed within the limits defined when it was created; this constraint is guaranteed by the JVM, which throws an exception if it is violated.
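
As a minimal sketch of this guarantee (the class and method names are illustrative and not taken from the JEP), the following program shows the JVM rejecting an out-of-bounds read instead of returning whatever happens to be in neighboring memory:

```java
// Minimal sketch: the JVM enforces array bounds at run time.
// BoundsCheckDemo and isRejected are hypothetical names used for illustration.
final class BoundsCheckDemo {

    // Returns true if reading values[index] is rejected by the JVM.
    static boolean isRejected(int[] values, int index) {
        try {
            int ignored = values[index];   // bounds-checked by the JVM
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] values = new int[3];                   // valid indices: 0, 1, 2
        System.out.println(isRejected(values, 2));   // prints "false"
        System.out.println(isRejected(values, 3));   // prints "true"
    }
}
```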

The benefits of integrity are that:

  • Code is predictable: Variables always have a defined value before they are used, and operations on data are always valid.
  • Memory is securely managed: The risk of crashes due to poor memory management is minimized.
  • Multi-threaded programs are stable: Objects maintain a consistent state, even in multi-tasking environments.

Encapsulation is one of the fundamental principles of JVM integrity.

Encapsulation consists of grouping data and the methods that manipulate it within a single entity, usually a class. This protects data from external access and unauthorized modification by ensuring that it can only be manipulated via well-defined interfaces.

Encapsulation brings many advantages: program accuracy, maintainability, scalability, security and performance.

Let’s focus on those directly related to JVM integrity:

  • Accuracy: Encapsulation ensures that data is accessed and modified in a controlled way, preventing side effects and unexpected errors.
  • Security: By restricting access to data, encapsulation helps protect sensitive information from unauthorized access.

However, there are APIs in the Java Development Kit (JDK) that can bypass encapsulation:

  • AccessibleObject::setAccessible(boolean): This method enables deep reflection, allowing access to private fields and methods, even if normally inaccessible.
  • sun.misc.Unsafe: This class provides methods for accessing private methods and fields, as well as final fields.
  • Java Native Interface (JNI): JNI allows native code to interact with Java objects without respecting encapsulation boundaries.
  • Instrumentation API: This API allows agents to modify the bytecode of methods, which can bypass encapsulation.
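
To make the first item concrete, here is a small sketch of deep reflection within a single application. The Account class and its balance field are hypothetical. Because both classes live in the same module (here, the unnamed module of the class path), setAccessible succeeds and the private field is overwritten without going through the class's own methods:

```java
import java.lang.reflect.Field;

// Hypothetical class whose state is meant to be encapsulated.
final class Account {
    private long balance = 100;

    long balance() {
        return balance;
    }
}

final class DeepReflectionDemo {

    // Overwrites the private field Account.balance, bypassing the class's API.
    static long overwriteBalance(Account account, long newValue) {
        try {
            Field field = Account.class.getDeclaredField("balance");
            field.setAccessible(true);          // deep reflection: suppress access checks
            field.setLong(account, newValue);
            return account.balance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(overwriteBalance(new Account(), -1));   // prints "-1"
    }
}
```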

Integrity by default therefore requires restricting these APIs by default. The idea is not to eliminate them, but to reduce their scope or to require the developer to explicitly authorize them, so that their use is controlled.

Restricting deep reflection

Since Java 9 and the introduction of modularity (JEP 261: Module System), it has been possible to restrict deep reflection using Java modules, as the AccessibleObject::setAccessible(boolean) method respects module boundaries: a class in one module cannot modify the accessibility of a field in a class in another module unless that module opens the corresponding package.

This change was implemented progressively. From Java 9 to Java 15, illegal reflective access to JDK internals was merely discouraged by a warning issued on first use. Java 16 (JEP 396) denied it by default, and Java 17 (JEP 403) removed the --illegal-access option that could still re-enable it globally. Today, deep reflection across modules can only be authorized on a case-by-case basis by opening a package (--add-opens on the command line, or an opens directive in the module declaration).
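
The sketch below illustrates the module boundary, assuming it is run from the class path. It tries to open the private value field of java.lang.String, which lives in the java.base module. On Java 17 and later, this is expected to fail with InaccessibleObjectException unless the JVM is started with --add-opens java.base/java.lang=ALL-UNNAMED. The class name ModuleBoundaryDemo is illustrative:

```java
import java.lang.reflect.Field;
import java.lang.reflect.InaccessibleObjectException;

final class ModuleBoundaryDemo {

    // Tries to make String's private "value" field accessible.
    // Returns "opened" if deep reflection was allowed, "blocked" otherwise.
    static String tryOpenStringInternals() {
        try {
            Field value = String.class.getDeclaredField("value");
            value.setAccessible(true);   // checked against module boundaries
            return "opened";
        } catch (InaccessibleObjectException e) {
            // java.base does not open java.lang to this module
            return "blocked";
        } catch (NoSuchFieldException e) {
            // Would only happen if the JDK's String implementation changed
            return "no such field";
        }
    }

    public static void main(String[] args) {
        // The result depends on the JDK version and on the --add-opens options in effect.
        System.out.println(tryOpenStringInternals());
    }
}
```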

Restricting the use of Unsafe

The sun.misc.Unsafe class includes methods that perform a variety of low-level operations without any security checks.

A component that uses Unsafe compromises the integrity of the JVM.

Unsafe is rarely used by a Java application, but many frameworks rely on it, as do many Java agents.

Over the years, numerous replacements via supported APIs have emerged, making the use of Unsafe less and less necessary.

Low-level manipulation of objects in the heap can now be carried out more safely via the VarHandle API, and manipulation of data in memory outside the heap can now be carried out more safely via the MemorySegment API.
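
As an illustration of the on-heap case, here is a minimal sketch of an atomic counter built on VarHandle, a typical use for which Unsafe.compareAndSwapInt was used in the past. The class name and its field are hypothetical; the point is that the VarHandle is obtained through a Lookup, so normal access checks apply:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical lock-free counter using a supported API instead of sun.misc.Unsafe.
final class VarHandleCounter {

    private static final VarHandle COUNT;

    static {
        try {
            // The lookup only grants access to fields this class may legally access.
            COUNT = MethodHandles.lookup()
                    .findVarHandle(VarHandleCounter.class, "count", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private volatile int count;

    // Atomically increments the counter with a compare-and-set loop.
    int increment() {
        int current;
        do {
            current = (int) COUNT.getVolatile(this);
        } while (!COUNT.compareAndSet(this, current, current + 1));
        return current + 1;
    }

    int get() {
        return (int) COUNT.getVolatile(this);
    }

    public static void main(String[] args) {
        VarHandleCounter counter = new VarHandleCounter();
        counter.increment();
        counter.increment();
        System.out.println(counter.get());   // prints "2"
    }
}
```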

Since Java 23 and JEP 471: Deprecate the Memory-Access Methods in sun.misc.Unsafe for Removal, the memory-access methods of Unsafe have been deprecated, but their use is still permitted. Their use can be controlled with the command-line option --sun-misc-unsafe-memory-access, which accepts the values allow, warn, debug, and deny.

The use of these methods will be progressively restricted in future versions of Java. From Java 24 onwards, they emit a warning in the JVM logs the first time they are used.

Restricting the use of JNI

With Java 24 and JEP 472: Prepare to Restrict the Use of JNI, access to native libraries is restricted in the same way for JNI and for the newer Foreign Function and Memory (FFM) API.

This has been the case for the FFM API since Java 22.

A native library does not respect the integrity of the JVM, because it can:

  • Exhibit undefined behavior, which can cause the JVM to crash.
  • Exchange data via direct byte buffers.
  • Access fields and methods without access control.
  • Call JVM functions incorrectly.

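Authorizing access to native libraries requires the --enable-native-access command-line option, whose value is either a comma-separated list of module names or ALL-UNNAMED to authorize all code on the class path. For an executable JAR, the Enable-Native-Access: ALL-UNNAMED manifest attribute has the same effect. ALL-UNNAMED is the only value this attribute accepts.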

For the time being, using a JNI library from code that has not been granted native access only issues a warning at run time, but such use will be progressively restricted in future versions of Java.

Restricting the use of the instrumentation API

An agent is a component that can modify the code of an application while it is running.

It can therefore compromise the integrity of the JVM in various ways.

Since Java 21 and JEP 451: Prepare to Disallow the Dynamic Loading of Agents, dynamic loading of Java agents has been restricted.

The command-line option -XX:+EnableDynamicAgentLoading allows dynamic agent loading without a warning, while -XX:-EnableDynamicAgentLoading disallows it. By default, dynamic loading is still allowed but triggers a warning.

For the time being, dynamically loading a Java agent into a running JVM issues a warning when the agent is loaded, but this will be progressively restricted in future versions of Java. It will then be necessary either to enable dynamic loading explicitly or to declare Java agents when launching the JVM via the -javaagent command-line option.
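
For readers who have never written one, here is a minimal sketch of what a Java agent looks like. The class names, the com/example/ package filter, and the JAR name mentioned below are hypothetical. The premain method is the entry point used when the agent is declared at startup, and agentmain is the entry point used for dynamic loading, which is the case JEP 451 prepares to restrict:

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical transformer: it only observes class loading and never rewrites bytecode.
final class LoggingTransformer implements ClassFileTransformer {

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // Class names use the internal form, with '/' as separator.
        if (className != null && className.startsWith("com/example/")) {
            System.out.println("Loading " + className);
        }
        return null;   // null tells the JVM to keep the original bytecode
    }
}

final class DemoAgent {

    // Called by the JVM before main() when the agent is declared with -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer());
    }

    // Called when the agent is loaded dynamically into a running JVM.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        premain(agentArgs, inst);
    }
}
```

Packaged in a JAR whose manifest declares Premain-Class: DemoAgent (and Agent-Class: DemoAgent for dynamic loading), this agent would be declared at startup with -javaagent:demo-agent.jar. A real agent would return modified bytecode from transform, which is precisely why agents can bypass encapsulation.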

Conclusion

Application security is becoming increasingly important, and it’s a good thing that Java is evolving to provide greater security by default. It also puts more power in the hands of developers, who will be able to better control which library or module can perform which action (reflection, native library loading, etc.).
