Harry's Log

Running Java Jersey with JDO 3.0 on Google App Engine

Why such a horrendous title?

Ha! The title I was gonna use originally was sort of a “spoiler” that gave away endings. :) So, there you go.

I’ve been struggling with the JDO 3.0 upgrade for some time. I use Google App Engine as the main platform for most of my (Web application) development. It is a PaaS, and it significantly reduces the overall development and operations cost. (Just to be clear, the “hardware” cost of PaaS is generally comparable to that of IaaS such as Amazon Web Services or RackSpace, etc. The real cost saving comes from the “ops” part, in my view.) This cost reduction comes at a price, however. Google App Engine (GAE) is a platform still under active development (with new features being added constantly), and on the flip side, it currently has a lot of limitations. If GAE does not provide certain functionality (which you could have accomplished relatively easily if you had full control over the platform), then in many cases you have to live with it. There may be a workaround. Then again, there may not be. For example, you cannot create a thread on GAE, and if you are a Java programmer, you’ll realize that this is really a fundamental limitation. You cannot access the file system, which will be another big surprise if you are using GAE for the first time.

Google App Engine now supports an SQL-based database. But when I started using GAE a couple of years ago, the only choice I had was Google’s proprietary data storage called, aptly, Data Store (or Big Table, which is the underlying technology of Data Store). Google Data Store is a NoSQL database. There is a bit of a learning curve. I use the JDO interface on top of Data Store, which provides at least two benefits for me. The first is portability. If we have to move/migrate the data into an SQL-based database in the future, using an abstraction layer such as the JDO API instead of using the low-level Data Store API directly is a big benefit. The second is that JDO maps nicely to the object layer (whereas JPA, for instance, is closer to an SQL-based RDBMS), which is a big plus for me. Google Data Store is an object-oriented database (unlike a table-based RDBMS). In theory, at least conceptually, you can map a Java object to a database entity very easily. There is no need for ORM, etc. (BTW, in case you are not familiar with Google Data Store, it does not allow joins. What?, you might ask. Yes. No joins! This, which some might consider a deal breaker, is more than compensated for by this object-orientedness, at least for me.)

In any case, I had been using the JDO API, version 2.0, on Google App Engine for some time. As I mentioned, these are all “under-construction” type of technologies. GAE supports certain features of JDO, and it does not, and cannot, support certain others. Some are due to the limitations of the underlying technology (e.g., no joins!). The newest version of JDO is 3.0. One feature I was looking forward to in JDO 3.0 was the "unowned one-to-many relationship". Although I said GAE Data Store was OO-based, with so many limitations the practical benefit was very limited. This “unowned one-to-many relationship” could potentially open up so many possibilities for me.

The GAE DataNucleus enhancer/plugin for JDO 3.0 has been around for a few months as “beta”. I tried it a few times, and I just couldn’t make it work on my development environment. I thought, maybe because it was beta? I was getting so many different kinds of errors, which I wasn’t willing to spend time debugging.

A few days ago, the new version of GAE SDK, version 1.7.1, was released, which included JDO 3.0 support, among other things. The first thing I did when I installed the new SDK was trying out the JDO 3.0 features. Here’s the doc: https://developers.google.com/appengine/docs/java/datastore/jdo/overview-dn2

The doc is somewhat outdated (as of this writing). I spent a lot of time trying to debug the ant task and the DataNucleus enhancer tools using various docs, but I just couldn’t make it work. As it turned out, the GAE Eclipse plugin already had JDO3 support (as of today, at least). So, this was cool. After resolving some issues, I was finally getting ready to try out JDO 3.0, only to get stuck on one last problem. This one appeared to be “un-solvable”.

The DataNucleus library (which implements JDO, etc.) uses an external library called ASM. In case you are not familiar with this, it is a Java bytecode manipulation tool. It is often used, among other things, to support “annotations” in Java code to do post-compilation tasks. The new DataNucleus plugin relies on asm version 4.0. The problem was, I was using the Java Jersey library for JAX-RS, and the current version of Jersey relies on asm version 3.1. ASM does not provide backward compatibility. These two versions of asm were in conflict, and there appeared to be no way to resolve this. Note that asm is a very “low level” library, which is needed both at compile time and at run time. I cannot deploy only one of them, because neither JDO 3.0 nor Jersey can be used with a version of asm other than the one it was originally compiled against. Clearly, building these libraries from source (with a lot of dependencies of their own) was out of the question, even utilizing Maven, etc. Deploying both versions of asm at the same time would not work either, unless you could implement some custom class loader which intelligently picks the “proper” version of asm depending on context. Again, I couldn’t come up with a way to resolve this jar version conflict.
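One way to see which copy of asm actually wins on a given classpath is a small diagnostic like the one below. It is only a sketch: the class name ClassOrigin and its describe helper are made up for this illustration and are not part of Jersey, DataNucleus, or ASM. It reports whether a class was loaded as an interface or as a class, and which jar it came from. As far as I can tell, org.objectweb.asm.ClassVisitor is an interface in asm 3.x and an abstract class in asm 4.0, which is one reason code compiled against one version cannot simply run against the other.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of any library mentioned in this post.
class ClassOrigin {

    /** Returns the kind of the named class (interface or class) and where it was loaded from. */
    static String describe(String className) throws ClassNotFoundException {
        // initialize=false: we only want to inspect the class, not run its static initializers.
        Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
        String kind = c.isInterface() ? "interface" : "class";

        // Classes loaded by the bootstrap loader (the JDK itself) have no code source.
        CodeSource src = c.getProtectionDomain().getCodeSource();
        String origin = (src == null || src.getLocation() == null)
                ? "<JDK/bootstrap>"
                : src.getLocation().toString();

        return kind + " " + c.getName() + " from " + origin;
    }

    public static void main(String[] args) {
        // Defaults to the asm type that Jersey's annotation scanner depends on.
        String name = args.length > 0 ? args[0] : "org.objectweb.asm.ClassVisitor";
        try {
            System.out.println(describe(name));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Run it with the same classpath as the application. If both asm jars are present, only one of them shows up as the origin, and whichever library expected the other one is the one that will break.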

I was totally stuck!

I spent many hours googling. Often it is hard to figure out what you are looking for, until you know what you are looking for. It seemed hopeless. But, I inched ahead, and finally found a solution! (Well, you call it a workaround. I call it a solution. :)) BTW, version conflict is a fact of life in software development (until we have a “real” solution). There was an expression like “DLL Hell” in old times. Android now insists that we cannot use shared libraries at all (which I think is like throwing the baby out with the bathwater, however), as a “solution” to this problem. Unfortunately, this is the way it is (at least, for now). In many cases, (good) developers tend to be conscious of backward compatibility (and, even “forward compatibility”, if you know what I mean), and minor version differences do not cause major problems in general. The problem with ASM is that its incompatibility appears to be intentional, for whatever reason. The Java bytecode format changes from time to time (ever so slightly). A particular version of ASM works with a particular format and no others. That is it. Libraries compiled against a particular version of asm are not compatible with libraries built against a different version of asm, due to this artificial dichotomy, although in practice the bytecode format difference is so minor that the chances of ending up with different bytecodes for any particular class or library are very small. (BTW, all newer versions of “java” can read and run all Java bytecodes in any then-current or older formats. That’s how we are supposed to write software. BACKWARD-COMPATIBLE! Or, you are supposed to use some internal versioning scheme, etc. Merely using different file names, such as asm-3.1 vs asm-4.0, without real versioning/compatibility considerations is really an anti-pattern.)

Anyways, did I say I found a solution?

Yes!

It’s called “JarJar”. It lets you rebuild an existing jar file with different package or class names. This is an amazing tool, which I wasn’t aware of until today. (Search for “Java repackaging”, etc.) So, here’s the solution. Rename the asm package in one library (along with its version of asm) so that it no longer conflicts with the other library, which uses a different version of asm. I chose to repackage Jersey. As it turns out, only one jar file in the Jersey distribution has a dependency on asm.

java -jar jarjar-1.3.jar find class jersey-server-1.12.jar asm-3.1.jar

.../AnnotationScannerListener$AnnotatedClassVisitor -> org/objectweb/asm/ClassVisitor
.../AnnotationScannerListener$AnnotatedClassVisitor -> org/objectweb/asm/AnnotationVisitor
.../AnnotationScannerListener$AnnotatedClassVisitor -> org/objectweb/asm/FieldVisitor
.../AnnotationScannerListener$AnnotatedClassVisitor -> org/objectweb/asm/Attribute
.../AnnotationScannerListener$AnnotatedClassVisitor -> org/objectweb/asm/MethodVisitor
.

So, here’s what I did:

java -jar jarjar-1.3.jar process testrules.txt asm-3.1.jar asm-3.1r.jar

java -jar jarjar-1.3.jar process testrules.txt jersey-server-1.12.jar jersey-server-1.12r.jar

where the testrules.txt file includes this one line:

rule org.objectweb.asm.**  org.objectweb.asm3.@1

That’s it.
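To make the rule's wildcard concrete, here is a rough sketch of the name mapping it describes. This is not JarJar's code (the real tool rewrites the references inside the class files, and RelocationSketch is a name I made up for illustration). It only mimics the rule syntax as I understand it: “**” matches any run of package components, “*” matches within a single component, and @1 stands for whatever the first wildcard matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of a JarJar-style rename rule applied to class names.
class RelocationSketch {

    /** Turns a rule pattern into a regex: "**" may span packages, "*" stays inside one component. */
    static Pattern toRegex(String rulePattern) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < rulePattern.length()) {
            if (rulePattern.startsWith("**", i)) {
                sb.append("(.*)");
                i += 2;
            } else if (rulePattern.charAt(i) == '*') {
                sb.append("([^.]*)");
                i += 1;
            } else {
                // Everything else (including the dots) is matched literally.
                sb.append(Pattern.quote(String.valueOf(rulePattern.charAt(i))));
                i += 1;
            }
        }
        return Pattern.compile(sb.toString());
    }

    /** Returns the renamed class, or the original name if the rule does not apply. */
    static String relocate(String className, String rulePattern, String result) {
        Matcher m = toRegex(rulePattern).matcher(className);
        if (!m.matches()) {
            return className;
        }
        String out = result;
        // Substitute the highest-numbered placeholder first so "@1" never clobbers "@10".
        for (int g = m.groupCount(); g >= 1; g--) {
            out = out.replace("@" + g, m.group(g));
        }
        return out;
    }

    public static void main(String[] args) {
        String pattern = "org.objectweb.asm.**";
        String result = "org.objectweb.asm3.@1";
        System.out.println(relocate("org.objectweb.asm.ClassVisitor", pattern, result));
        System.out.println(relocate("com.sun.jersey.api.core.ResourceConfig", pattern, result));
    }
}
```

The point is that the same rule file is applied to both jars: asm-3.1 itself moves to the new package, and the references inside jersey-server move with it, so the two stay consistent with each other.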

I deployed the repackaged Jersey jar file and the modified asm-3.1, along with asm-4.0 and the JDO DataNucleus library, to Google App Engine, and voilà! Now I have an app that can support “unowned one-to-many relationships”! Tested and verified!
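If you want a quick sanity check before deploying, something along these lines can confirm that a repackaged jar no longer mentions the old package. It is a crude sketch under my own assumptions: AsmReferenceScan is a hypothetical name, and it simply looks for the text org/objectweb/asm/ inside each class file's bytes rather than parsing the class format properly. (Re-running the jarjar “find” command from above against the new jars is another way to check.)

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical checker: lists the class files in a jar whose bytes contain a given text.
class AsmReferenceScan {

    static List<String> entriesMentioning(String jarPath, String needle) throws IOException {
        List<String> hits = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().endsWith(".class")) {
                    continue;
                }
                try (InputStream in = jar.getInputStream(entry)) {
                    // ISO-8859-1 maps every byte to one char, so ASCII names survive intact.
                    String text = new String(readAll(in), StandardCharsets.ISO_8859_1);
                    if (text.contains(needle)) {
                        hits.add(entry.getName());
                    }
                }
            }
        }
        return hits;
    }

    /** Reads a stream fully into a byte array. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // The jar name is the one produced by the jarjar command in this post.
        String jarPath = args.length > 0 ? args[0] : "jersey-server-1.12r.jar";
        List<String> hits = entriesMentioning(jarPath, "org/objectweb/asm/");
        System.out.println(hits.isEmpty() ? "no references to the old asm package" : hits.toString());
    }
}
```

Note that org/objectweb/asm3/ does not contain the search text org/objectweb/asm/ (the trailing slash matters), so renamed references are not reported.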

Y-E-S!!!

So, the original title I was thinking of using for this post was

"Jar Jar (Binks) to the Rescue!"

(Corny? Yes. :) BTW, JarJar itself uses/depends on ASM. Ironic? Yes. :))