  1. Transactional Activation
  2. MGNLXAA-100

Locks are never released in some cases and timeout is not configurable

Details

    • Bug
    • Resolution: Unresolved
    • Neutral
    • None
    • 2.3.2
    • None

    Description

      In rare cases we get the following error:
      Message received from subscriber: Operation not permitted, /xyz is locked
      This happens when activation fails for whatever reason and the lock is not released properly.
      The code shows that cleanup is not guaranteed to run even if the content got locked. Both catch blocks of the first try...catch return without calling cleanUp:

      try {
          final String utf8AuthorStatus = getHeader(request, BaseSyndicatorImpl.UTF8_STATUS);
          // null check first to make sure we do not break activation from older versions w/o this flag
          if (utf8AuthorStatus != null && Boolean.parseBoolean(utf8AuthorStatus) != SystemProperty.getBooleanProperty(SystemProperty.MAGNOLIA_UTF8_ENABLED)) {
              throw new UnsupportedOperationException("Activation between instances with different UTF-8 setting is not supported.");
          }
          final String action = getHeader(request, BaseSyndicatorImpl.ACTION);
          if (action == null) {
              throw new InvalidParameterException("Activation action must be set for each activation request.");
          }

          // verify the author ... if not trusted yet, but no exception thrown, then we attempt to establish trust
          if (!isAuthorAuthenticated(request, response)) {
              status = BaseSyndicatorImpl.ACTIVATION_HANDSHAKE;
              setResponseHeaders(response, statusMessage, status, result);
              return;
          }
          // we do not lock the content on handshake requests
          applyLock(request);
      } catch (ExchangeException e) {
          // can't obtain a lock ... this is (should be) a normal threading situation that we are
          // just reporting back to user.
          log.debug(e.getMessage(), e);
          // we can only rely on the exception's actual message to give something back to the user
          // here.
          statusMessage = StringUtils.defaultIfEmpty(e.getMessage(), e.getClass().getSimpleName());
          status = BaseSyndicatorImpl.ACTIVATION_FAILED;
          setResponseHeaders(response, statusMessage, status, result);
          return;
      } catch (Throwable e) {
          log.error(e.getMessage(), e);
          // we can only rely on the exception's actual message to give something back to the user here.
          statusMessage = StringUtils.defaultIfEmpty(e.getMessage(), e.getClass().getSimpleName());
          status = BaseSyndicatorImpl.ACTIVATION_FAILED;
          setResponseHeaders(response, statusMessage, status, result);
          return;
      }
      

      Cleanup only happens in the finally block of the second try...catch:

      try {
          result = receive(request);
          status = BaseSyndicatorImpl.ACTIVATION_SUCCESSFUL;
      } catch (OutOfMemoryError e) {
          Runtime rt = Runtime.getRuntime();
          log.error("---------\nOutOfMemoryError caught during activation. Total memory = "
                  + rt.totalMemory()
                  + ", free memory = "
                  + rt.freeMemory()
                  + "\n---------");
          statusMessage = e.getMessage();
          status = BaseSyndicatorImpl.ACTIVATION_FAILED;
      } catch (PathNotFoundException e) {
          // this should not happen. PNFE should be already caught and wrapped in ExchangeEx
          log.error(e.getMessage(), e);
          statusMessage = "Parent not found (not yet activated): " + e.getMessage();
          status = BaseSyndicatorImpl.ACTIVATION_FAILED;
      } catch (ExchangeException e) {
          log.debug(e.getMessage(), e);
          statusMessage = e.getMessage();
          status = BaseSyndicatorImpl.ACTIVATION_FAILED;
      } catch (Throwable e) {
          log.error(e.getMessage(), e);
          // we can only rely on the exception's actual message to give something back to the user here.
          statusMessage = StringUtils.defaultIfEmpty(e.getMessage(), e.getClass().getSimpleName());
          status = BaseSyndicatorImpl.ACTIVATION_FAILED;
      } finally {
          cleanUp(request, status);
          setResponseHeaders(response, statusMessage, status, result);
      }
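
One possible shape of a fix, as a minimal sketch and not the actual Magnolia code: run the locking step and the receive step inside a single try whose finally always attempts cleanup. Everything here is a hypothetical stand-in: the class LockCleanupSketch, the in-memory locks map, the owner ids and the failAfterLock flag. Only the names applyLock and cleanUp are borrowed from the excerpts above. Cleanup removes a lock only if the current request owns it, so a request that failed because another request holds the lock does not release that other lock.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the receiving side; not Magnolia code. */
public class LockCleanupSketch {

    /** path -> owner id; stands in for the JCR node locks. */
    public static final Map<String, String> locks = new HashMap<>();

    /** Takes the lock, then optionally fails to mimic an error raised after locking. */
    public static void applyLock(String path, String owner, boolean failAfterLock) {
        String holder = locks.putIfAbsent(path, owner);
        if (holder != null) {
            throw new IllegalStateException("Operation not permitted, " + path + " is locked");
        }
        if (failAfterLock) {
            throw new IllegalStateException("simulated failure after the lock was taken");
        }
    }

    /** Releases the lock only if this owner holds it. */
    public static void cleanUp(String path, String owner) {
        locks.remove(path, owner);
    }

    /** Lock phase and receive phase share one finally, so cleanup always runs. */
    public static String handle(String path, String owner, boolean failAfterLock) {
        String status;
        try {
            applyLock(path, owner, failAfterLock);
            // receive(request) would run here in the real filter
            status = "successful";
        } catch (RuntimeException e) {
            status = "failed: " + e.getMessage();
        } finally {
            cleanUp(path, owner);
        }
        return status;
    }

    public static void main(String[] args) {
        System.out.println(handle("/xyz", "request-1", true));
        System.out.println(handle("/xyz", "request-2", false));
        System.out.println("locks left: " + locks);
    }
}
```

In this sketch the second request succeeds even though the first one failed after taking the lock, because the finally block released it.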
      

      There is also no configurable lock timeout, and the default is Long.MAX_VALUE, so a stale lock never expires: we need to restart the machine or unlock the nodes from Groovy.
      As far as I can see, the lock timeout value comes from:
      org.apache.jackrabbit.core.lock.XAEnvironment#lock(org.apache.jackrabbit.core.NodeImpl, boolean, boolean)
      which calls:
      lock(node, isDeep, isSessionScoped, Long.MAX_VALUE, null);
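
To illustrate what a configurable timeout would change, here is a minimal in-memory sketch, not Jackrabbit or Magnolia code. The class ExpiringLockSketch and the property name sketch.activation.lockTimeoutSeconds are hypothetical. The JCR 2.0 LockManager#lock method does accept a timeoutHint in seconds, where Long.MAX_VALUE means no timeout, but whether that hint is honored in this setup has not been verified here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical in-memory lock table with an expiry per lock; not Jackrabbit code. */
public class ExpiringLockSketch {

    /** Hypothetical property name, not an existing Magnolia or Jackrabbit setting. */
    public static final String TIMEOUT_PROPERTY = "sketch.activation.lockTimeoutSeconds";

    /** path -> expiry time in milliseconds; Long.MAX_VALUE means the lock never expires. */
    private final Map<String, Long> expiryByPath = new HashMap<>();
    private final long timeoutSeconds;

    public ExpiringLockSketch(long timeoutSeconds) {
        this.timeoutSeconds = timeoutSeconds;
    }

    /** Reads the timeout from a system property, defaulting to "no timeout". */
    public static long configuredTimeoutSeconds() {
        return Long.getLong(TIMEOUT_PROPERTY, Long.MAX_VALUE);
    }

    /** Returns true if the lock was taken, false if an unexpired lock is still held. */
    public boolean tryLock(String path, long nowMillis) {
        Long expiry = expiryByPath.get(path);
        if (expiry != null && nowMillis < expiry) {
            return false;
        }
        expiryByPath.put(path, expiryFor(nowMillis));
        return true;
    }

    /** Computes the expiry time, saturating at Long.MAX_VALUE instead of overflowing. */
    private long expiryFor(long nowMillis) {
        long timeoutMillis = TimeUnit.SECONDS.toMillis(timeoutSeconds);
        return timeoutMillis > Long.MAX_VALUE - nowMillis ? Long.MAX_VALUE : nowMillis + timeoutMillis;
    }

    public static void main(String[] args) {
        ExpiringLockSketch bounded = new ExpiringLockSketch(30);
        System.out.println(bounded.tryLock("/xyz", 0L));      // lock taken
        System.out.println(bounded.tryLock("/xyz", 29_999L)); // still held
        System.out.println(bounded.tryLock("/xyz", 30_000L)); // stale lock has expired
    }
}
```

With a timeout of Long.MAX_VALUE the second caller is rejected forever, which matches the behavior described above. With a bounded timeout a stale lock frees itself without a restart.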

            People

              Unassigned Unassigned
              dwartel David Wartel