  1. Magnolia
  2. MAGNOLIA-3390

Prevent OOME and GC load during activation of large data sets

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.3.9, 4.4
    • Component/s: activation
    • Labels: None
    • Magnolia Release: 4.4

      Description

      During activation, the activated data is currently held completely in memory for each activation request sent from author to public.

      The problem

      For example, when 250 MB are activated on an author instance with 4 subscribers, those 250 MB are allocated in RAM 4 times in a row and garbage-collected afterwards. Even if no OutOfMemoryError occurs, this puts a high load on the garbage collector and is likely to force the VM into "stop the world" full collections, making the author instance unresponsive for editors. Given large enough binary data, or simultaneous attempts at activating it, any maximum heap size can be exceeded.

      Current implementation

      This seems to be due to the default behaviour of java.net.URLConnection.getOutputStream() as used by info.magnolia.module.exchangesimple.Transporter: it returns a subclass of ByteArrayOutputStream that buffers the whole request body in memory. This probably happens in order to determine the Content-Length before the request is actually sent.

      Proposed solution

      The solution is to use "chunked transfer coding", as defined in RFC 2616. This needs to be explicitly enabled by calling java.net.HttpURLConnection.setChunkedStreamingMode(int) prior to getOutputStream(). I verified in a debugger that getOutputStream() then returns a sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream (which extends FilterOutputStream) instead of a sun.net.www.http.PosterOutputStream (which extends ByteArrayOutputStream).
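
      A minimal sketch of the proposed change, not the actual Transporter code: the class name, method names and chunk size below are hypothetical and chosen only for illustration. It shows setChunkedStreamingMode(int) being called before getOutputStream(), and the payload being copied through a small fixed buffer instead of being built up in memory first.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of Magnolia.
class ChunkedActivationSketch {

    // Assumed chunk/buffer size; the issue does not prescribe a value.
    static final int CHUNK_SIZE = 8 * 1024;

    // Prepares a connection that streams its request body in chunks.
    // openConnection() does not touch the network yet.
    static HttpURLConnection openChunked(URL url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setDoOutput(true);
        connection.setRequestMethod("POST");
        // Must be called before getOutputStream() / connect().
        connection.setChunkedStreamingMode(CHUNK_SIZE);
        return connection;
    }

    // Copies the stream through a fixed-size buffer, so heap usage
    // does not grow with the size of the activated data.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[CHUNK_SIZE];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Sends the data to one subscriber and returns the HTTP status code.
    static int send(URL url, InputStream data) throws IOException {
        HttpURLConnection connection = openChunked(url);
        try (OutputStream out = connection.getOutputStream()) {
            copy(data, out);
        }
        return connection.getResponseCode();
    }
}
```

      With this approach the data source should itself be a stream (e.g. a temporary file) rather than a byte array, otherwise the payload is still held in memory once on the author side.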

      Chunking requires the public server to be HTTP/1.1 compliant. In case HTTP/1.1 compliance poses a problem, e.g. with proxied public servers or non-compliant HTTP servers, chunking of activation requests should be configurable. There could e.g. be a configuration NodeData "server/activation/subscribers/<subscribername>/useRequestChunking" with default value "true".
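
      The configurable variant could look roughly like the sketch below. It deliberately avoids the Magnolia content API: a plain Map stands in for the subscriber's configuration node, and the class and method names are hypothetical. Only the property name "useRequestChunking" and its default of "true" are taken from the proposal above.

```java
import java.net.HttpURLConnection;
import java.util.Map;

// Hypothetical helper; a Map stands in for the subscriber config node.
class RequestChunkingConfig {

    static final String KEY = "useRequestChunking";

    // Returns true when the property is missing (the proposed default)
    // or set to "true"; any other value disables chunking.
    static boolean useRequestChunking(Map<String, String> subscriberConfig) {
        String value = subscriberConfig.get(KEY);
        return value == null || Boolean.parseBoolean(value.trim());
    }

    // Enables chunked streaming only if the subscriber allows it.
    // Must be called before the connection is opened for output.
    static void apply(HttpURLConnection connection, Map<String, String> subscriberConfig) {
        if (useRequestChunking(subscriberConfig)) {
            // A value <= 0 tells HttpURLConnection to use its default chunk length.
            connection.setChunkedStreamingMode(0);
        }
        // Otherwise the default buffering behaviour stays in place.
    }
}
```

      Subscribers behind a proxy that cannot handle chunked request bodies would then set the property to "false" and keep the current buffering behaviour, at the cost of the memory usage described above.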

            People

            • Assignee:
              pbaerfuss Philipp Bärfuss
            • Reporter:
              jfrantzius Joerg von Frantzius