  1. RiftSaw
  2. RIFTSAW-463

Unexpected errors if the service mex.timeout is greater than the JBoss transaction timeout

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.3.0.Final
    • Fix Version/s: 2.3.8.Final, 2.4.0, 3.2
    • Component/s: Integration, ODE
    • Labels:
      None
    • Environment:

      JBoss 5.1.0, JBoss WS CXF 3.4.0

    • Steps to Reproduce:

      Here are the steps used:

      1. Assume that the web service returns a response in 7 minutes, the transaction timeout is set to 5 minutes, and the mex.timeout is set to, say, 15 minutes.
      2. The BPEL process invokes the web service.

      3. After 5 minutes there's a transaction rollback warning in the JBoss logs: [com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check - atomic action a282a58:a20:4ec5f30d:3db aborting with 1 threads active!

      4. After 7 minutes (when the invoked web service returns) there's an error from RiftSaw (I assume because the corresponding transaction has been aborted):

      org.hibernate.LazyInitializationException: could not initialize proxy - no Session
      at org.hibernate.proxy.AbstractLazyInitializer.initialize(AbstractLazyInitializer.java:86)
      at org.hibernate.proxy.AbstractLazyInitializer.getImplementation(AbstractLazyInitializer.java:140)
      at org.hibernate.proxy.pojo.javassist.JavassistLazyInitializer.invoke(JavassistLazyInitializer.java:190)
      at org.apache.ode.dao.jpa.bpel.ProcessInstanceDAOImpl_$$javassist_22.getInstanceId(ProcessInstanceDAOImpl$$_javassist_22.java)
      at org.apache.ode.bpel.engine.PartnerRoleMessageExchangeImpl.continueAsync(PartnerRoleMessageExchangeImpl.java:136)
      at org.apache.ode.bpel.engine.PartnerRoleMessageExchangeImpl.reply(PartnerRoleMessageExchangeImpl.java:88)
      at org.jboss.soa.bpel.runtime.ws.WebServiceClient$TwoWayCallable$1.call(WebServiceClient.java:298)
      at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(SimpleScheduler.java:294)

      5. After 15 minutes (at the end of mex.timeout) the process is marked as failed, with the following message in the JBoss logs:

      [org.apache.ode.bpel.runtime.INVOKE] (ODEServer-3) Failure during invoke: No response received for invoke (mexId=hqejbhcnphr6rgm1w5uw09), forcing it into a failed state.

      6. At this point I intermittently find that the web service is re-invoked. I haven't found the exact scenario in which this happens, but I think it's probably because the ode_job table is somehow left in an inconsistent state (see next point).

      7. At the end of this process it's not possible to shut down the JBoss server normally using the shutdown.sh command. The last message logged is:

      [org.jboss.soa.bpel.runtime.engine.service.BPELEngineService] (JBoss Shutdown Hook) Stopping JBoss BPEL Engine

      and it keeps waiting for the BPEL Engine to stop (I think there's some lock that's not released correctly), so I have to terminate the JBoss process using kill -9. At this point I think the ode_job table is sometimes left inconsistent, and in the next test run I see the web service being re-invoked even though it shouldn't be, because I've set faultOnFailure to true.

      I think there should be some check in RiftSaw to detect that the mex.timeout value is set to a value greater than the JBoss transaction timeout and report it as an error. Also, there definitely seems to be a bug with some locks not being released properly, which prevents a clean JBoss shutdown.


      Description

      If the mex.timeout value is set to a value greater than the JBoss transaction timeout value, any web service invocation that takes longer than the JBoss transaction timeout will fail. RiftSaw should be able to detect that the mex.timeout is longer than the JBoss transaction timeout and report an error.
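
      The requested check could look roughly like the sketch below. This is only an illustration, not RiftSaw or ODE code: the class TimeoutConfigCheck and the method findProblem are hypothetical names, and the sketch assumes that mex.timeout is supplied in milliseconds and the JBoss transaction timeout in seconds. How the two values are actually read from the deployment is left out.

```java
import java.util.Optional;

/**
 * Hypothetical helper, not part of RiftSaw or ODE.
 * Assumption: mex.timeout is in milliseconds, the transaction timeout in seconds.
 */
public final class TimeoutConfigCheck {

    private TimeoutConfigCheck() {
    }

    /**
     * Returns an error message if mex.timeout exceeds the transaction timeout,
     * or an empty Optional if the two settings are compatible.
     */
    public static Optional<String> findProblem(long mexTimeoutMillis, long txTimeoutSeconds) {
        if (mexTimeoutMillis < 0 || txTimeoutSeconds < 0) {
            throw new IllegalArgumentException("timeouts must be non-negative");
        }
        // Convert to a common unit before comparing; fails loudly on overflow.
        long txTimeoutMillis = Math.multiplyExact(txTimeoutSeconds, 1000L);
        if (mexTimeoutMillis > txTimeoutMillis) {
            return Optional.of(String.format(
                    "mex.timeout (%d ms) is greater than the transaction timeout (%d ms); "
                            + "invocations that outlive the transaction will fail",
                    mexTimeoutMillis, txTimeoutMillis));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Values from this report: mex.timeout = 15 minutes, transaction timeout = 5 minutes.
        findProblem(15 * 60 * 1000L, 5 * 60L).ifPresent(System.err::println);
    }
}
```

      With the values from this report the sketch reports a problem; it reports nothing when mex.timeout is less than or equal to the transaction timeout.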

      Also, after the web service invocation fails, it's not possible to shut down the JBoss server cleanly using shutdown.sh; the process needs to be terminated using kill -9.

      Also, intermittently (I don't have exact steps to recreate this) the web service is re-invoked after the mex.timeout even though the ext:faultOnFailure is set to true.

              People

              • Assignee:
                objectiser Gary Brown
                Reporter:
                anujbhatia anuj bhatia
              • Votes:
                0
                Watchers:
                1
