What is the JVM? Introducing the Java virtual machine

The Java virtual machine is a program whose purpose is to execute other programs. It’s a simple idea that also stands as one of our greatest examples of coding kung fu. The JVM upset the status quo for its time and continues to support programming innovation today.

What does the JVM do?

The JVM has two primary functions: to allow Java programs to run on any device or operating system (known as the “write once, run anywhere” principle), and to manage and optimize program memory. When Java was released in 1995, most computer programs were written for a specific operating system, and program memory was typically managed by the software developer. The JVM was a revelation.

Figure 1. A high-level view of the JVM.

Having a technical definition for the JVM is useful, and there’s also an everyday way that software developers think about it. Let’s break that down:

  • Technical definition: The JVM is the specification for a software program that executes code and provides the runtime environment for that code.
  • Everyday definition: The JVM is how we run our Java programs. We configure the settings and then rely on the JVM to manage program resources during execution.

When developers talk about the JVM, we usually mean the process running on a machine, especially a server, that represents and controls resource usage for a Java application. Contrast this with the JVM specification, which describes the requirements for building a program that performs those tasks.

JVM languages

While it was once only for Java, the JVM is flexible and powerful enough to support many other languages today. Among the most popular are Scala, used for real-time, concurrent applications, and Groovy, a dynamically typed scripting language. Another prominent example is Kotlin, which delivers a blend of object-oriented and functional styles. All of these are considered JVM languages, meaning that, even though they are not coding in Java, the programmer retains access to the vast ecosystem of Java libraries.

Garbage collection

The most common interaction with a running JVM is to check the memory usage in the heap and stack. The most common adjustment is performance-tuning the JVM’s memory settings.
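As a rough illustration of checking heap usage, the sketch below asks the running JVM for its own figures through the standard java.lang.management API. The class name HeapReport and the helper toMiB are made up for this example; the numbers printed will differ on every machine and run.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical example class: prints the heap figures of the JVM it runs in.
class HeapReport {
    // Converts a byte count to whole mebibytes for readable output.
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    // Reads a snapshot of current heap usage from the JVM's memory bean.
    static MemoryUsage heapUsage() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        System.out.println("Heap used:      " + toMiB(heap.getUsed()) + " MiB");
        System.out.println("Heap committed: " + toMiB(heap.getCommitted()) + " MiB");
        // getMax() returns -1 when no maximum is defined.
        String max = heap.getMax() < 0 ? "undefined" : toMiB(heap.getMax()) + " MiB";
        System.out.println("Heap max:       " + max);
    }
}
```

On HotSpot, the maximum heap size is commonly set with the -Xmx option, for example java -Xmx256m HeapReport, which should change the reported maximum.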

Before Java, programmers working in mainstream languages such as C and C++ managed program memory themselves. In Java, program memory is managed by the JVM. The JVM manages memory through a process called garbage collection, which continuously identifies objects the program can no longer reach and reclaims the memory they occupy. Garbage collection happens inside a running JVM.

In the early days, Java came under a lot of criticism for not being as “close to the metal” as C++, and therefore not as fast. The garbage collection process was especially controversial. Since then, a variety of algorithms and approaches have been proposed and used for garbage collection. With consistent development and optimization, garbage collection has vastly improved. (Automatic memory management also caught on and is a common feature of other modern languages like JavaScript and Python.)
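To see which garbage collection algorithms a particular JVM is using, a program can list the collectors through the standard management API. This is only a sketch: the class name GcReport is invented here, and the collector names and counts depend on the JVM implementation and its settings, so no particular output should be expected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: summarizes the collectors active in this JVM.
class GcReport {
    // Builds one summary line per garbage collector the JVM exposes.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate short-lived arrays that become unreachable right away,
        // giving the collector something it may reclaim.
        for (int i = 0; i < 100_000; i++) {
            byte[] scratch = new byte[1024];
        }
        describeCollectors().forEach(System.out::println);
    }
}
```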

The three parts of the JVM

It could be said there are three aspects to the JVM: specification, implementation and instance. Let’s consider each of these.

The JVM specification

First, the JVM is a software specification. In a somewhat circular fashion, the JVM spec highlights that its implementation details are not defined within the spec, in order to allow for maximum creativity in its realization:

To implement the Java virtual machine correctly, you need only be able to read the class file format and correctly perform the operations specified therein.

J.S. Bach once described creating music similarly:

All you have to do is touch the right key at the right time.

So, all the JVM has to do is run Java programs correctly. Sounds simple, and might even look simple from the outside, but it’s a massive undertaking, especially given the power and flexibility of the Java language.

JVM implementations

Implementing the JVM specification results in an actual software program, which is a JVM implementation. In fact, there are many JVM implementations, both open source and proprietary. OpenJDK’s HotSpot is the JVM reference implementation. It remains one of the most thoroughly tried-and-tested codebases in the world.

HotSpot may be the most commonly used JVM, but it is by no means the only one. Another interesting and popular implementation is GraalVM, which features high performance and support for other, traditionally non-JVM languages like C++ and Rust by running their LLVM bitcode. There are also domain-specific JVMs like leJOS, a JVM for embedded robotics.

Typically, you download and install the JVM as a bundled part of a Java Runtime Environment (JRE). The JRE is the on-disk part of Java that spawns a running JVM.
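One way to tell which implementation and specification version you are actually running is to read the standard system properties that every JVM sets. The sketch below assumes nothing beyond those properties; the class name JvmInfo and the helper property are invented for this example.

```java
// Hypothetical example class: reports which JVM implementation is running.
class JvmInfo {
    // Looks up a system property, substituting a marker if it is not set.
    static String property(String key) {
        return System.getProperty(key, "(not set)");
    }

    public static void main(String[] args) {
        // The implementation: name, vendor, and build of this JVM.
        System.out.println("VM name:      " + property("java.vm.name"));
        System.out.println("VM vendor:    " + property("java.vm.vendor"));
        System.out.println("VM version:   " + property("java.vm.version"));
        // The specification version this implementation conforms to.
        System.out.println("Spec version: " + property("java.vm.specification.version"));
        // The platform underneath; the same class file runs unchanged on others.
        System.out.println("OS:           " + property("os.name") + " " + property("os.arch"));
    }
}
```

Running the same compiled class on a different JVM or operating system should change only the printed values, not the program.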

A JVM instance

After the JVM spec has been implemented and released as a software product, you may download and run it as a program. Each running copy of that program is an instance (or instantiated version) of the JVM.

Most of the time, when developers talk about “the JVM,” we are referring to a JVM instance running in a software development or production environment. You might say, “Hey Anand, how much memory is the JVM on that server using?” or, “I can’t believe I created a circular call and a stack overflow error crashed my JVM. What a newbie mistake!”
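The stack overflow mentioned above is easy to reproduce. In this minimal sketch, a method calls itself with no exit condition until the thread's stack is exhausted and the JVM throws StackOverflowError. The class name StackDepth is invented, and the error is caught only so the program can report how deep it got; left uncaught in the main thread, it would end the program. The depth reached varies by JVM, settings, and platform.

```java
// Hypothetical example class: triggers and survives a stack overflow.
class StackDepth {
    private static int depth = 0;

    // Calls itself with no exit condition, so every call adds a stack frame.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many nested calls ran before the JVM threw StackOverflowError.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack has unwound back to this frame, so execution continues.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed after " + measure() + " nested calls");
    }
}
```

On HotSpot, the per-thread stack size can be adjusted with the -Xss option, which should change the depth reported.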

How the JVM loads and executes class files

We’ve talked about the JVM’s role in running Java applications, but how does it perform its function? In order to run Java applications, the JVM depends on the Java class loader and a Java execution engine.

The Java class loader

Java code is organized into classes, and all Java applications are built from them. An application could consist of one class or thousands. In order to run a Java application, a JVM must load compiled .class files into memory, where the running program can access them. A JVM depends on its class loader to perform this function.

When you type java followed by a class name, you are saying: start a JVM and load the named class into it.

The Java class loader is the part of the JVM that loads classes into memory and makes them available for execution. Class loaders use techniques like lazy-loading and caching to make class loading as efficient as it can be. That said, class loading isn’t the epic brain-teaser that (say) portable runtime memory management is, so the techniques are comparatively simple.
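The lazy behavior can be observed from ordinary code. A class's static initializer runs only when the class is first actively used, so the sketch below records events in order to show that the nested class Heavy is not initialized until its method is called. Strictly, this demonstrates lazy initialization, which the language guarantees; deferring the loading step itself is up to the JVM implementation. All names here are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: shows that a class is initialized on first use.
class LazyLoadingDemo {
    // Records events in the order they happen.
    static final List<String> events = new ArrayList<>();

    static class Heavy {
        // Runs once, the first time Heavy is actively used.
        static {
            events.add("Heavy initialized");
        }

        static String use() {
            return "Heavy used";
        }
    }

    static List<String> run() {
        events.add("run started");
        // First use of Heavy: its static initializer runs before use() does.
        events.add(Heavy.use());
        return events;
    }

    public static void main(String[] args) {
        // Prints "run started", then "Heavy initialized", then "Heavy used".
        run().forEach(System.out::println);
    }
}
```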

Every Java virtual machine includes a class loader. The JVM spec describes standard methods for querying and manipulating the class loader at runtime, but JVM implementations are responsible for fulfilling these capabilities. From the developer’s perspective, the underlying class loader mechanism is a black box.
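As a small sketch of querying class loaders at runtime, the code below uses the standard Class.getClassLoader, ClassLoader.getParent, and Class.forName methods. The class name LoaderInspector and its helpers are invented. It assumes an OpenJDK-style JVM, where core classes such as String report a null loader, meaning the built-in bootstrap loader; the loader class names printed are implementation details and will vary.

```java
// Hypothetical example class: inspects which loaders defined which classes.
class LoaderInspector {
    // Returns a readable name for the loader that defined the given class.
    static String loaderName(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // A null loader conventionally stands for the bootstrap class loader.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    // Asks the JVM to find a class by name at runtime.
    static Class<?> loadByName(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("No such class: " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("String loaded by:          " + loaderName(String.class));
        System.out.println("LoaderInspector loaded by: " + loaderName(LoaderInspector.class));

        // Walk up the parent chain from this class's loader.
        ClassLoader loader = LoaderInspector.class.getClassLoader();
        while (loader != null) {
            System.out.println("  loader in chain: " + loader.getClass().getName());
            loader = loader.getParent();
        }

        System.out.println("Found by name: " + loadByName("java.util.ArrayList").getName());
    }
}
```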

The execution engine

Once the class loader has done its work of loading classes, the JVM begins executing the code in each class. The execution engine is the JVM component that handles this function. The execution engine is essential to the running JVM. In fact, for all practical purposes, it is the JVM instance.

Executing code involves managing access to system resources. The JVM execution engine stands between the running program—with its demands for file, network, and memory resources—and the operating system, which supplies those resources.

System resources can be divided into two broad categories: memory and everything else. Recall that the JVM is responsible for disposing of unused memory, and that garbage collection is the mechanism that does that disposal. The JVM is also responsible for allocating and maintaining the referential structure that the developer takes for granted. As an example, the JVM’s execution engine is responsible for taking something like the new keyword in Java, and turning it into an operating system-specific request for memory allocation.

Beyond memory, the execution engine manages resources for file system access and network I/O. Since the JVM is interoperable across operating systems, this is no mean task. In addition to each application’s resource needs, the execution engine must be responsive to each operating system environment. That is how the JVM is able to handle in-the-wild demands.

JVM evolution: Past, present, future

Because the JVM is a well-known runtime with standardized configuration, monitoring, and management, it is a natural fit for containerized development using technologies such as Docker and Kubernetes. It also works well for platform-as-a-service (PaaS), and it can be deployed in a variety of serverless approaches. Because of all of these factors, the JVM is well-suited to microservices architectures.

Another important feature on the horizon is Project Loom, which looks to introduce virtual threads to the JVM. Virtual threads are lightweight threads that the JVM schedules on top of operating system threads, which lets programs express concurrency at a higher level of abstraction. Like ordinary threads, virtual threads share memory within the process, which could bring major improvements to coding idioms and performance.
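For a sense of what this looks like in code, here is a minimal sketch that assumes JDK 21 or later, where virtual threads are a standard feature; it will not compile on older releases. The class name VirtualThreadDemo and the method runTasks are invented. Each task runs on its own virtual thread, and all of them update one shared counter.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; assumes JDK 21+ for the virtual thread API.
class VirtualThreadDemo {
    // Runs taskCount small tasks, one virtual thread each, and returns
    // how many of them completed.
    static int runTasks(int taskCount) {
        // Shared memory: every virtual thread increments the same counter.
        AtomicInteger completed = new AtomicInteger();
        // Closing the executor waits for all submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(() -> {
                    // Blocking here parks the virtual thread, not an OS thread.
                    Thread.sleep(10);
                    return completed.incrementAndGet();
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed tasks: " + runTasks(10_000));
    }
}
```

Creating ten thousand platform threads this way would be costly; with virtual threads, a thread per task is the intended idiom.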

Conclusion

In 1995, the JVM introduced two revolutionary concepts that have since become standard fare for modern software development: “Write once, run anywhere” and automatic memory management. Software interoperability was a bold concept at the time, but few developers today would think twice about it. Likewise, whereas our engineering forebears had to manage program memory themselves, my generation grew up with garbage collection.

We could say that James Gosling and Brendan Eich invented modern programming, but thousands of others have refined and built on their ideas over the following decades. Whereas the Java virtual machine was originally just for Java, today it has evolved to support many scripting and programming languages, including Scala, Groovy, and Kotlin. Looking forward, it’s hard to see a future where the JVM isn’t a prominent part of the development landscape.

Copyright © 2022 IDG Communications, Inc.
