Hi there,
You may think that this is completely insane. We have C, right? You know, that high-level language which might still be popular come the end of the year?
Yes, but then that’s hardly the point. I want to know how to do the same thing in assembly. So, with that in mind, here goes. First, the Java class we’re going to use to do this:
HelloWorld.java
public class HelloWorld {
    static {
        System.loadLibrary("hello");
    }
    public static void main(String[] args) throws Exception {
        HelloWorld hw = new HelloWorld();
        hw.requestGreeting();
    }
    native void requestGreeting();
    void sayHello() {
        System.out.println("Hello, World!");
    }
}
This version differs from the previous ones in that it includes the method sayHello(), which we’ll get the library to call.
Makefile
-------------------------------8<-------------------------------
all: HelloWorld.class libhello.so opt3_libhello.so
HelloWorld.class: HelloWorld.java
	javac -cp . HelloWorld.java
libhello.so: HelloWorld.o
	ld -fPIC -shared -o libhello.so HelloWorld.o
HelloWorld.o: HelloWorld.s
	as -gstabs -o HelloWorld.o HelloWorld.s
opt3_libhello.so: HelloWorld.c
	gcc -gstabs -O3 -shared -fPIC -I${JAVA_HOME}/include -I${JAVA_HOME}/include/linux HelloWorld.c -o opt3_libhello.so
clean:
	rm *.o *.class *.so
And here’s what we want the assembler to do:
HelloWorld.c
#include "HelloWorld.h"
JNIEXPORT void JNICALL Java_HelloWorld_requestGreeting(JNIEnv* env, jobject hw_obj) {
    jclass clazz = (*env)->FindClass(env, "HelloWorld");
    jmethodID mid = (*env)->GetMethodID(env, clazz, "sayHello", "()V");
    (*env)->CallVoidMethod(env, hw_obj, mid);
}
I’ve omitted the creation of the HelloWorld.h header file for now, since it’s not the focus of the post.
So - here’s the assembler which does the same thing (with quite a few comments along the way). I’m sure that there are more efficient ways of doing this, but I was reasonably heartened to see that the output of gcc -O3 ... HelloWorld.c appeared to do something similar.
HelloWorld.s
.section .data
clazz_name:
    .asciz "HelloWorld"
void_sig:
    .asciz "()V"
method_name:
    .asciz "sayHello"
.section .text
.globl Java_HelloWorld_requestGreeting
.type Java_HelloWorld_requestGreeting, @function
Java_HelloWorld_requestGreeting:
#
# Prologue
#
    pushq %rbp
    movq %rsp, %rbp
# Store the two arguments to the function
#
# JNIEXPORT void JNICALL Java_HelloWorld_requestGreeting(JNIEnv*, jobject);
# Parameters:
# RDI: JNIEnv*
# RSI: jobject
#
# The stack frame we inherit and subsequently intend to set-up will look like this:
# 8 -> return address
# -0(RBP) -> previous RBP
# -8 -> JNIEnv parameter
# -16 -> jobject parameter
# -24 -> The address of the JNI function-table (to be calculated)
# -32 -> The result of the call to FindClass (to be retrieved)
# -40 -> The result of the call to GetMethodID (to be retrieved)
#
    subq $48, %rsp # Reserve the stack frame: 40 bytes of slots, rounded up to 48 so RSP stays 16-byte aligned at each call
    movq %rdi, -8(%rbp) # Store the JNIEnv parameter on the stack
    movq %rsi, -16(%rbp) # Store the jobject parameter on the stack
    movq (%rdi), %rax # RAX now contains the starting address of the function-table.
    movq %rax, -24(%rbp) # Store function-table address on the stack
#
# Invoke the (*JNIEnv)->FindClass function to look-up the address of the function-table
# for the class "HelloWorld"
# Parameters:
# RDI: JNIEnv*
# RSI: address-of "HelloWorld"
# Returns:
# JClass*
#
# RDI still contains pointer to JNIEnv
    leaq clazz_name(%rip), %rsi # Calculate and store the address of "HelloWorld" in RSI
    movq 48(%rax), %rax # Load the function-table entry at index 6 (FindClass) into RAX
    call *%rax # Call resulting function-pointer; it returns the jclass in RAX
    movq %rax, -32(%rbp) # Store result on the stack
#
# Invoke (*JNIEnv)->GetMethodID function
# Parameters:
# RDI: JNIEnv*
# RSI: JClass*
# RDX: address-of "sayHello"
# RCX: address-of "()V", the "void" parameter-list descriptor
# Returns:
# jmethodID*
#
    movq -8(%rbp), %rdi # Retrieve pointer to JNIEnv from the stack and store in RDI
    movq %rax, %rsi # Store the jclass returned by FindClass in RSI
    leaq void_sig(%rip), %rcx # Store address of "()V" in RCX
    leaq method_name(%rip), %rdx # Store address of "sayHello" in RDX
    movq -24(%rbp), %rax # Look-up the pointer to the function-table from the stack and store in RAX
    movq 264(%rax), %rax # Load the function-table entry at index 33 (GetMethodID) into RAX
    call *%rax # Call function-pointer; it returns the jmethodID in RAX
    movq %rax, -40(%rbp) # Store the jmethodID on the stack
#
# Invoke (*env)->CallVoidMethod
# Parameters:
# RDI: JNIEnv*
# RSI: jobject*
# RDX: jmethodID*
# AL: upper bound on the number of vector registers used by the varargs call (zero here)
# Returns:
# void
    movq -8(%rbp), %rdi # Retrieve pointer to JNIEnv from the stack and store in RDI
    movq -16(%rbp), %rsi # Retrieve the jobject from the stack and store in RSI
    movq %rax, %rdx # Store the jmethodID in RDX
    movq -24(%rbp), %rax # Look-up the pointer to the function-table from the stack and store in RAX
    movq 488(%rax), %rcx # Load the function-table entry at index 61 (CallVoidMethod) into _RCX_
    xorq %rax, %rax # Set RAX (and hence AL) to zero. See page 20 of the SysV ABI and notes below
    call *%rcx
#
# Epilogue
#
    movq %rbp, %rsp # forget stack frame
    popq %rbp # restore caller's base-pointer
    retq
The last call is interesting because it takes a “hidden argument” in the %AL register. The following appears on page 20 of the x86_64 System V ABI:
For calls that may call functions that use varargs or stdargs (prototype-less calls or calls to functions containing ellipsis (…) in the declaration) %al is used as hidden argument to specify the number of vector registers used. The contents of %al do not need to match exactly the number of registers, but must be an upper bound on the number of vector registers used and is in the range 0-8 inclusive.
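Note that this is not a parameter count. Just as on a 32-bit architecture, the varargs function still has to figure out for itself how many parameters to read. What the 64-bit ABI adds is that %AL, the lowest 8 bits of the %RAX register, carries an upper bound on the number of vector registers used to pass arguments. The call to CallVoidMethod above passes no floating-point arguments, so zero is the right value.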
I found the following resources of great help while researching how this works:
The JNIEnv Interface pointer
Invoking assembly language programs from Java
JNIEnv is a pointer that, in turn, points to another pointer. This second pointer points to a function table that is an array of pointers. Each pointer in the function table points to a JNI interface function. The virtual machine is guaranteed to pass the same interface pointer to native method implementation functions called from the same thread. However, a native method can be called from different threads, and therefore may be passed different JNIEnv interface pointers. Although the interface pointer is thread-local, the doubly indirected JNI function table is shared among multiple threads.
In order to call an interface function, we have to determine the value of the corresponding entry in the function table.
[…]
To retrieve the contents of the entry in the function table that corresponds to the function we want to call, we have to multiply the zero based index of the function (see Sheng Liang’s book) by eight, since each pointer is eight bytes long, and add the result to the starting address of the function table which we have formed in RAX earlier.