[MGNLDMS-159] Caching big documents from dms repository causes OutOfMemoryError Created: 17/Mar/09  Updated: 03/Jul/14  Resolved: 01/Dec/10

Status: Closed
Project: Document Management System (closed)
Component/s: None
Affects Version/s: 1.3
Fix Version/s: 1.5

Type: Bug Priority: Major
Reporter: Henryk Paluch Assignee: Jan Haderka
Resolution: Fixed Votes: 0
Labels: cache, outofmemoryerror
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Magnolia 3.6.3 CE
Environment1: Sun JDK 1.6.0_11, 32bit, RHEL 5.3Beta
Environment2: Sun JDK 1.6.0_03, 32bit, Windows 2000


Attachments: XML File config.modules.cache.config.configurations.default.cachePolicy.voters.urls.excludes.dmsNoCache.xml     Text File oom_stacktrace.txt    
Issue Links:
Cloners
is cloned by MAGNOLIA-2677 Caching big content may cause OutOfMe... Closed
relation
is related to MGNLDMS-166 cache: deny url doesn't work because ... Closed

 Description   

On the public instance, magnolia-module-cache caches all data into a byte[] array, which easily causes an OutOfMemoryError on large repositories.

NOTE: this error occurs only on public instances that have caching enabled!

How to reproduce:
1) Start a PUBLIC Magnolia instance with a heap smaller than the DMS repository size, for example -Xmx256m
2) Upload a few large files to the PUBLIC instance (for example three 100MB PDF files)
3) Launch a new anonymous browser session (to ensure that the cache is used)
4) Download (without interrupting) the three large 100MB files from the public instance
5) Usually the 2nd download will cause:

java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:71)
at info.magnolia.module.cache.filter.SimpleServletOutputStream.write(SimpleServletOutputStream.java:53)
(Full stack trace will be attached later)
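
The pattern behind this trace, sketched as hypothetical Java (not the actual SimpleServletOutputStream code): the filter copies the whole response body into a heap buffer, so a 100MB download needs roughly twice that much contiguous heap while Arrays.copyOf grows the backing array.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public class InMemoryCapture {
        // Buffers the entire response body in memory -- the same pattern the
        // stack trace above shows failing inside ByteArrayOutputStream.write().
        public static byte[] capture(InputStream body) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int read;
            while ((read = body.read(chunk)) != -1) {
                buffer.write(chunk, 0, read); // grows the backing array via Arrays.copyOf
            }
            return buffer.toByteArray();      // yet another full-size copy
        }
    }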

Workaround: disable caching for large repositories (attachment pending)

Possible solution: the cache module should stream large files to disk rather than into memory (as it did in the 3.0.x version). This should prevent OutOfMemoryErrors on large Document and/or Website repositories.
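
One possible shape of such a threshold stream, using Apache Commons IO's DeferredFileOutputStream (the library choice, threshold and class name are assumptions for illustration, not the shipped fix):

    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;

    import org.apache.commons.io.output.DeferredFileOutputStream;

    public class ThresholdCapture {

        // Arbitrary cut-off: anything larger than this spills into a temp file.
        private static final int MEMORY_THRESHOLD = 500 * 1024;

        public static DeferredFileOutputStream capture(InputStream body) throws IOException {
            DeferredFileOutputStream out = new DeferredFileOutputStream(
                    MEMORY_THRESHOLD, "cache-", ".tmp",
                    new File(System.getProperty("java.io.tmpdir")));
            byte[] chunk = new byte[8192];
            int read;
            while ((read = body.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
            out.close();
            // Small entries stay on the heap (out.getData()), big ones end up
            // on disk (out.getFile()) and can be streamed to later clients.
            return out;
        }
    }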



 Comments   
Comment by Henryk Paluch [ 17/Mar/09 ]

oom_stacktrace.txt - stack trace of the OutOfMemoryError that occurs when downloading documents from the public instance

Comment by Henryk Paluch [ 17/Mar/09 ]

config.modules.cache.config.configurations.default.cachePolicy.voters.urls.excludes.dmsNoCache.xml
Workaround example:
Do not cache items from "dms" repository.

We are using this workaround on our production system: it holds about 4GB of documents, the typical document is 5MB, and some are 100MB. The heap size is 1.5GB.
Without this workaround the system stops working with an OutOfMemoryError after a while.
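
In Java terms the exclude amounts to a simple prefix check on the request URI; a hypothetical equivalent (class name and path prefix are illustrative, the actual rule lives in the attached XML voter configuration):

    public class DmsNoCacheCheck {
        // Any request below the DMS servlet path bypasses the cache entirely.
        public static boolean shouldBypassCache(String requestUri) {
            return requestUri.startsWith("/dms/");
        }
    }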

Comment by Jan Haderka [ 02/Apr/09 ]

Done as of r24144.

Comment by Philipp Bärfuss [ 08/Jun/09 ]

I have reverted the current fix. The reason is that this voter creates a JCR session per request to the DMS. This can be very critical if you, as we do in the STK, store background images in the DMS. This results in opening 20 sessions per page, even with the cache switched on.

Here are some notes for a potential future solution:

a) use a shared session for the read access. Since the voter only reads file sizes, we won't run into security issues.
--> we should not use shared sessions, as we have experienced memory issues with that in the past

b) cache a special entry for big files

  • instead of caching the file we just cache a reference (bypass entry) so the test has to be executed only once (see the sketch after these notes)

c) use a special threshold stream which streams into a temp file after a certain amount of data has been streamed
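
A minimal illustration of note (b), assuming a plain map-backed cache and an arbitrary size limit (none of these names come from the cache module):

    import java.io.Serializable;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class BypassAwareCache {

        // Marker cached in place of the body of documents that are too big,
        // so the size test runs only on the first request for a given URI.
        static final Serializable BYPASS = "BYPASS";

        private static final long MAX_CACHEABLE_SIZE = 1024 * 1024; // 1MB, illustrative

        private final Map<String, Serializable> cache = new ConcurrentHashMap<String, Serializable>();

        // Returns cached content, or null if the URI is unknown or must be
        // streamed directly from the repository.
        public byte[] get(String uri) {
            Serializable entry = cache.get(uri);
            return entry == BYPASS ? null : (byte[]) entry;
        }

        public void put(String uri, byte[] content) {
            cache.put(uri, content.length > MAX_CACHEABLE_SIZE ? BYPASS : (Serializable) content);
        }
    }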

Comment by Philipp Bärfuss [ 08/Jun/09 ]

If we re-add the voter we definitely have to consider MGNLDMS-166, which was just fixed by reverting the current fix.

Comment by Jan Haderka [ 09/Jun/09 ]

c) use a special threshold stream which streams into a temp file after a certain amount of data has been streamed

The stream is not the only problem with this solution. Binary items have to be passed to ehCache in CachedPage as a byte[].
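
So a disk-backed solution would also mean changing what gets stored in the cache, roughly along these lines (the class name is illustrative, not the actual CachedPage API):

    import java.io.File;
    import java.io.Serializable;

    public class FileBackedCacheEntry implements Serializable {

        private final byte[] inMemoryContent; // null when the body was spilled to disk
        private final File contentFile;       // null when the body is kept in memory

        public FileBackedCacheEntry(byte[] inMemoryContent, File contentFile) {
            this.inMemoryContent = inMemoryContent;
            this.contentFile = contentFile;
        }

        public boolean isInMemory() {
            return inMemoryContent != null;
        }

        public byte[] getInMemoryContent() {
            return inMemoryContent;
        }

        public File getContentFile() {
            return contentFile;
        }
    }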

Comment by Philipp Bärfuss [ 01/Dec/10 ]

Fixed by MAGNOLIA-2677
