JBoss 6.1.0.Final + HornetQ Timing Issue

I had one of those ‘one line fix’ problems on Monday that took most of my day to figure out. It was one of those problems that makes perfect sense in retrospect, but while I was in the trenches the solution was anything but obvious.

A big chunk of my time these days goes towards maintaining and extending what would now be considered a ‘legacy’ application. It’s an EJB3-based application that runs on a JBoss application server. The latest version of JBoss AS is 7.1, but we were still running on 5.1.0.GA, so I decided it was time to upgrade. For those who know anything about JBoss – 7.1 is a different beast altogether. Its architecture is different from that of all previous versions. After spending about a day digging in, I decided to punt on upgrading to 7.1 and settle for 6.1.0 as an interim step.

For the most part, the upgrade from JBoss AS 5.1 to 6.1 is pretty straightforward. Lots of applications are directly portable: just drop them in and go. The biggest ‘gotcha’ is porting any message-driven beans. JBoss 5.1.0 used JBoss Messaging as the underlying JMS provider, but 6.1.0 uses HornetQ. Still, how hard could it be?

I was fooled. I don’t have too much in the way of message driven beans in this application. For the most part they just send emails. I have some integration tests that queue up messages, which should result in a few emails landing in my inbox. After making a few minor tweaks, everything seemed to work. Done! So, I tagged the code and pushed a release into production. Then my problems began. No emails.

The first thing that popped into my head was that ‘mail-service.xml’ on the production JBoss server must be misconfigured, and that the emails were probably being rejected by the mail server. That wasn’t it. Then I dug around looking for any discrepancy at all between the configuration of my development server and that of the production server. Did I miss a step? I had really gone out of my way to minimize the amount of configuration needed, because I want my application to run on a plain vanilla setup. It turns out that, no, I didn’t miss a step. So what gives?

After scratching my head for an hour or so I realized that the problem was intermittent. Sometimes it worked, sometimes it didn’t. Then, after a little while longer, I realized that the problem only seemed to affect a few of the eight applications deployed to that server. Huh? I continued to experiment, and eventually realized the problem had nothing to do with the application at all. If I deployed any one of those eight applications on its own, it would work every time. But with all eight deployed, a few were affected. My thoughts turned to resource contention. Perhaps the first few applications were consuming some resource that was in “limited supply,” thereby “starving out” the last few applications? But what? I combed through log files. Finally, I found these lines repeated several times throughout the log:

10:29:28,754 INFO [HornetQActivation] Attempting to reconnect org.hornetq.ra.inflow.HornetQActivationSpec(ra=org.hornetq.ra.HornetQResourceAdapter@64c826 destination=queue/EmailQueue/MEDPLUS destinationType=javax.jms.Queue ack=Auto-acknowledge durable=false clientID=null user=null maxSession=15)
10:29:28,757 INFO [HornetQActivation] awaiting topic/queue creation queue/EmailQueue/MEDPLUS

… and sometime later:

10:29:29,095 INFO [HornetQServerImpl] trying to deploy queue jms.queue.EmailQueue
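
Since the telltale sign is that ‘awaiting topic/queue creation’ message, one way to spot the pattern faster is to scan the server log for it. What follows is only a minimal sketch: the class name AwaitingQueueScan and the method awaitedDestinations are made up for illustration, and it assumes the message text always looks like the lines quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of JBoss or HornetQ): collects the destinations
// that an MDB activation reported it was still waiting for.
class AwaitingQueueScan {

    // Assumes the message format seen in the log excerpt above.
    private static final Pattern AWAITING =
            Pattern.compile("awaiting topic/queue creation\\s+(\\S+)");

    // Returns each awaited destination once, in order of first appearance.
    static List<String> awaitedDestinations(List<String> logLines) {
        List<String> found = new ArrayList<String>();
        for (String line : logLines) {
            Matcher m = AWAITING.matcher(line);
            if (m.find() && !found.contains(m.group(1))) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample lines copied from the log excerpt above.
        List<String> sample = Arrays.asList(
                "10:29:28,757 INFO [HornetQActivation] awaiting topic/queue creation queue/EmailQueue/MEDPLUS",
                "10:29:29,095 INFO [HornetQServerImpl] trying to deploy queue jms.queue.EmailQueue");
        System.out.println(awaitedDestinations(sample));
    }
}
```

Any destination this reports belongs to a bean that started before its queue existed, which makes it a candidate for the problem described here.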

I also poked around the JMX Console a bit and discovered that the ‘ConsumerCount’ was set to 0 for the applications that weren’t working correctly. The message driven beans were not consuming messages from the JMS queue at all!

It finally dawned on me: this wasn’t a problem with the code, or a configuration issue, or resource contention — this was a timing issue. For applications that deployed BEFORE HornetQ deployed the message queues, the message driven beans would never even receive a message. The applications that were the last to deploy (those that came AFTER HornetQ deployed the message queues) were the ones that were working. So then the question became – since the message driven beans are dependent on these queues, how do I delay their creation until after the queues are deployed? It came down to putting this annotation at the top of the MDB:
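
A dependency annotation of this kind looks roughly like the sketch below. Treat it as an illustration rather than the exact line from my application. It assumes the JBoss EJB3 extension annotation org.jboss.ejb3.annotation.Depends and HornetQ’s JMX object-name format for JMS queues. The queue name EmailQueue comes from the log lines above, while the class name EmailSenderBean, the destination value, and the exact object-name string are assumptions. It compiles only against the JBoss and Java EE libraries.

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;

import org.jboss.ejb3.annotation.Depends;

// Hypothetical MDB; names and property values are illustrative assumptions.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/EmailQueue")
})
// Assumed object name of the HornetQ queue MBean that this bean should wait for.
@Depends("org.hornetq:module=JMS,name=\"EmailQueue\",type=Queue")
public class EmailSenderBean implements MessageListener {

    @Override
    public void onMessage(Message message) {
        // Build and send the email here (omitted).
    }
}
```

The idea is that the container holds back the bean until the named MBean (the queue) has started, so the deployment order no longer matters.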


Suddenly, everything just worked.

This seems to be a common issue, so I wonder why it isn’t mentioned in the JBoss migration guide. I also have to wonder why it’s even necessary in the first place. Of course a message driven bean is dependent on the queue it receives messages from, so why force the developer to take this step?

Finally, I have to ask myself how my test setup could have been better so that this problem didn’t make it into production. But honestly, I don’t have an answer for that. I wish I did.
