[MAGNOLIA-1998] Potential memory leak: investigation. Created: 11/Jan/08  Updated: 23/Jan/13  Resolved: 09/May/08

Status: Closed
Project: Magnolia
Component/s: None
Affects Version/s: None
Fix Version/s: 3.5.5, 3.6

Type: Task Priority: Critical
Reporter: Vivian Steller Assignee: Philipp Bärfuss
Resolution: Fixed Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File 3.5.4.png     PNG File 3.6.png     JPEG File Mag3.0.5Jackrabbit1.3.3_memGraph.JPG     JPEG File Mag3.5Jackrabbit1.3.3_memGraph.JPG     JPEG File Mag3.5Jackrabbit1.3.3_memGraph_128M.JPG     Microsoft Word magnoliaMemLeak_toMag.doc     PNG File memory-leak_magnolia-CE-3.0.5_jackrabbit-1.3.1.png    
Issue Links:
dependency
depends upon MAGNOLIA-2099 context: system jcr sessions are not ... Closed
Template:
Acceptance criteria:
Empty
Task DoR:
Empty
Date of First Response:

 Description   

Several users have reported memory issues. We're creating this issue to collect information, reports and other evidence. Please attach relevant files and leave comments on your experiences. Thanks!

Current status:

  • We're investigating this issue, but at the moment we're lacking evidence of a real leak in Magnolia 3.5 when used with an external database. (see comments below)
  • Magnolia 3.0 must not be used with Jackrabbit 1.3 but with the delivered 1.0 version
  • There have been too many architectural changes in 3.5 to backport the fixes to the 3.0 branch
  • The following definitely helps:
    • Using an external database such as MySQL
    • Using Magnolia 3.5.x (and Jackrabbit 1.3.x)
  • If you're experiencing OutOfMemoryError: PermGen space, you need to increase the -XX:MaxPermSize JVM setting (see the example after this list).
  • If you're still having memory issues after applying the above advice, please report them here with the following information:
    • Operating system:
    • Java version:
    • Container (tomcat, jetty, ...) and version:
    • Precise JVM settings (JAVA_OPTS, CATALINA_OPTS, ...)
    • How is your container started:
    • Is there any specific operation that triggers your memory issues?
    • Any stacktraces or relevant log files (in attachment, with your name in the filename, please)
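
As an illustration of the two settings mentioned above (the heap via -Xms/-Xmx and the permanent generation via -XX:MaxPermSize), a plausible starting point for a Tomcat-based setup could look like the following; the exact values are only an assumption and depend on your available memory:

JAVA_OPTS=-server -Xms512m -Xmx512m -XX:MaxPermSize=128m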


 Comments   
Comment by Yuanhua Qu [ 11/Jan/08 ]

I recorded the memory leak behaviour for 3.0.5 and 3.5. The heap still has a growing trend and will eventually cause problems, and the recording was only made in a small testing environment with a pretty small load. The memory leak crashed our production server, which hosts hundreds of sites, has a high load and was already given a big heap size for the JVM. A fast fix for this issue will surely save us from a catastrophe. Looking forward to your great news on this.

Comment by Magnolia International [ 11/Jan/08 ]

Could you please give us more details about your setup? Java version, JAVA_OPTS and CATALINA_OPTS, and the PersistenceManager you use in Jackrabbit are three pieces of information we need to be able to put your document in perspective.
Thanks for the detailed report!

Comment by Magnolia International [ 11/Jan/08 ]

What I can probably already tell you from this is that the graphs for 3.5 don't show really flagrant evidence of a leak. At first sight, there are only minor GC hits (we can see a major one on page 2 at about 35 minutes, but that's Magnolia 3.0).
Reducing the Xmx parameter would help the test by forcing more major GC hits. After a handful of such hits, the graph will hopefully (or rather, hopefully not) show evidence of a leak. Until then, it seems like your VM just uses what it's been given.

Comment by Mike Jones [ 11/Jan/08 ]

We've had the exact same issue:

RHEL, JDK 1.6, JAVA_OPTS: -Xmx2048m -Xms512m

Happens with both Magnolia 3.0.5 > BDB & Magnolia 3.0 > Derby

Comment by Yuanhua Qu [ 11/Jan/08 ]

My testing env setup:

tomcat 5.0.28
JAVA_OPTS=-server -Xms512m -Xmx512m
<PersistenceManager class="info.magnolia.state.berkeley.BerkeleyDBPersistenceManager" />
Here are jars we updated and added for 3.0.5:
derby-10.2.2.0.jar
jackrabbit-api-1.3.3.jar
jackrabbit-core-1.3.3.jar
jackrabbit-jcr-commons-1.3.3.jar
jackrabbit-text-extractors-1.3.3.jar
lucene-core-2.0.0.jar
magnolia-bdb-1.2.jar
PooledJNDIDatabasePersistenceManager.jar

For 3.5, I also use magnolia-bdb-1.2.jar, same as above.
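
For context, the PersistenceManager line above is the kind of element that, in a stock Jackrabbit setup, lives in each workspace's workspace.xml (and, for the versioning store, in repository.xml). A minimal sketch of such a workspace.xml, assuming a default file-system layout (the actual file shipped with Magnolia may differ):

<Workspace name="website">
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${wsp.home}"/>
  </FileSystem>
  <PersistenceManager class="info.magnolia.state.berkeley.BerkeleyDBPersistenceManager"/>
  <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
  </SearchIndex>
</Workspace>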

As for the graphs in the doc, the first graph for each Magnolia version just records the memory from startup until about 15 minutes later, with no hits at all: page 1 for 3.0.5 and page 5 for 3.5. You can see that memory usage was about 100M for 3.0.5 and 67M for 3.5.

Page 2 shows the memory leak trend for version 3.0.5, where consistent hits start 15 minutes after startup and continue until about 50 minutes after startup.
Page 6 shows the memory leak trend for version 3.5, where consistent hits start about 15 minutes after startup and continue until about 90 minutes after startup; the heap climbs up to 210M. The leak certainly grows much more slowly than in version 3.0.5, but it still grows. It reaches 260M after 100 minutes.

I'm attaching a screenshot, Mag3.5Jackrabbit1.3.3_memGraph.JPG, of the heap memory usage when I tested v3.5.

Our production is using 3.0.5 so the situation is really bad.

Comment by Yuanhua Qu [ 11/Jan/08 ]

My test uses JDK 1.6.0_03.

Comment by Sean McMains [ 11/Jan/08 ]

Qu should be providing the salient details on our environment.

A few particularly interesting details that our experiments have turned up:

  • Due to our caching architecture, visitors' cookies were getting stripped out, which caused Tomcat to generate a new session for each page load. This wasn't a problem with the old Jackrabbit. Modifying our caching architecture so that sessions are preserved seemed to slow the rate of memory use considerably.
  • Qu's experiments this morning showed that with Magnolia 3.5 the issue was far less pronounced. As Gregory pointed out, there may not even be a leak there. MAGNOLIA-623 seems to indicate that session use on public was dropped in v3.5, so upgrading to v3.5 may be just another way to address the same sessions issue.

Comment by Magnolia International [ 11/Jan/08 ]

HTTP session handling and usage was definitely changed in 3.5, yes.

Qu, it would be really helpful if you could decrease Xms and Xmx (even as low as 128 and 256m); hopefully that would show more "major GC" hits in the graphs.
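
For the record, the kind of lowered settings meant here would be something along these lines (values as suggested above, the remaining flags unchanged):

JAVA_OPTS=-server -Xms128m -Xmx256m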

Comment by Yuanhua Qu [ 11/Jan/08 ]

No idea how my env setup msg got another 3 copies 30 min after I first submitted. Weird.

I'll take Gregory's advice, set the heap low and test for a longer period. I'll update the result when it's done.

Comment by Magnolia International [ 11/Jan/08 ]

Hmm, this probably happened as I've been restarting the server a couple of times just now, fixing a couple of quirks since the migration.
I'll delete the extra comments. Sorry for that.

Comment by Sean McMains [ 14/Jan/08 ]

I saw your latest updates on the status, Gregory. I wanted to point out that, among your recommendations, it was moving to Jackrabbit 1.3.3 that caused the problem in our case.

I understand that you guys don't have the resources to backport the fixes to the 3.0 branch. However, if you're not going to do that, I would recommend warning people that upgrading to Jackrabbit 1.3.3 may not be a good idea until they're ready to move to Magnolia 3.5.

Comment by Magnolia International [ 14/Jan/08 ]

Hmm, I didn't realize this. So you're saying that with Magnolia 3.0.x and Jackrabbit 1.0.x, memory usage was as stable as with Magnolia 3.5.x and JR 1.3.x?

Comment by Yuanhua Qu [ 14/Jan/08 ]

I set the JVM max size to 128M and memory usage was indeed stable after each full GC. The graph I got is in the attachment Mag3.5Jackrabbit1.3.3_memGraph_128M.JPG. This is much better than Magnolia 3.0.5 with Jackrabbit 1.3.3, which shows memory climbing after each full GC. My concern for Magnolia 3.5 with Jackrabbit 1.3.3 is the influence on the application's performance if memory only becomes stable after a full GC happens. From the graph we can see that the heap still grows fast after each minor collection. It looks like lots of objects get a longer lifetime and are promoted to the old space, which triggers a full GC when the old space is full.
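
If it helps others reproduce this observation, the minor/full GC pattern described above can be made visible with the standard Sun JDK GC logging flags (the log file name is just an example):

JAVA_OPTS=-server -Xms128m -Xmx128m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log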

Comment by Sean McMains [ 14/Jan/08 ]

We haven't profiled the memory usage as closely with Magnolia 3.0.x and Jackrabbit 1.0.x, but that certainly seemed to be the case, yes.

Comment by Magnolia International [ 15/Jan/08 ]

Sean: I edited the status to reflect your comment; it's still unclear why only upgrading Jackrabbit would make the issue worse, though.

Qu: the purpose of this last test was to see how the GC behaved with less memory, i.e. trying to detect a potential leak. Resetting your Xmx to a more viable limit will produce less scary graphs.

Another question now: could you provide some details as to how this was tested? Is this real or generated traffic? Author or public instance? It seems to match our internal tests on a public instance, browsing with the anonymous user, so that's a rather good sign on that side of things.

Comment by Sean McMains [ 15/Jan/08 ]

Gregory, all of our testing was done on public – the edit stage doesn't appear to display the problem, or does so at a much less dramatic rate. (Given that it seems to be tied to not tracking sessions, this makes some sense.)

For our stats, I think Qu used JMeter to generate artificial traffic. But the problem first appeared in our production instance with real traffic. Qu, feel free to jump in with any details I've missed.

Comment by Yuanhua Qu [ 15/Jan/08 ]

Yes. As Sean described, the test was done with JMeter (without enabling cookies for the HTTP requests) on a public instance, browsing as the anonymous user, as you guessed. Testing 3.0.5 with JR 1.3.3 shows similar memory usage when persistent cookies are enabled for all HTTP requests.

Comment by Ryan Gardner [ 08/Feb/08 ]

Jackrabbit 1.4 was released recently, and perhaps has an influence on this issue? I was able to update the version in my parent-pom and rebuild the project with no problems... (Although I suspect dropping in the jar files would work just as well)

Jackrabbit 1.4's release notes indicate 220 bugfixes, and at a quick glance a few of them seemed like they might fix memory issues.

In any case, it would give another graph to help complete the set

Comment by Todd Farrell [ 10/Mar/08 ]

See attached image:
memory-leak_magnolia-CE-3.0.5_jackrabbit-1.3.1.png

We have consistently reproduced an OutOfMemory condition (memory usage pattern similar to attached image) using the Magnolia 3.0.x series.

  • Memory leak ONLY occurred with the CACHE DISABLED.
    • Cache is bypassed for any request that has request parameters, including the search feature provided with the samples.
  • We tried all possible combinations of the officially distributed Magnolia CE 3.0.2 / 3.0.3 / 3.0.5 with Jackrabbit 1.0.1 / 1.3.1.
  • We tried both the Derby persistence manager and the MsSqlPersistenceManager --> SQL Server 2005
  • We also tried Magnolia EE 3.0.5 with Derby (only)

Unfortunately, we haven't had the opportunity to test against the Magnolia 3.5.x series yet.

Our production setup:

  • Windows Server 2003 R2
  • Sun JDK 1.5.0 update 14
  • JBoss 4.2.2
  • Magnolia CE 3.0.5 with Jackrabbit 1.3.1
  • SQL Server 2005 (and Derby for local development environments - memory leak reproducible either way)
  • We have some custom filters etc. to integrate our application with Magnolia, but we used a standard Magnolia install as a control when load testing.

Comment by sebastian.frick [ 14/Mar/08 ]

Has anyone drawn comparisons ee3.5.x/jr1.3.3 <> ee3.5.x/jr1.4?

Comment by Philipp Bracher [ 08/May/08 ]

OK, I wrote a JMeter test plan which does heavy authoring (three threads):

  • create pages
  • add 10 paragraphs
  • activate every now and then
  • activation uses versioning (but no workflow)

The setup I used was:

  • tomcat 5.5 (default magnolia bundle)
  • external db (h2)
  • -Xmx256M

I'm attaching the graph of the tenured gen space (see below for one way to record such a graph), which shows:

  • it increases (slowly but steadily)
  • after stopping JMeter, the memory is not freed

As a next step I will test 3.6 (to see if the latest changes have an impact on that).
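
For reference, one way to record this kind of tenured-generation graph (an assumption about tooling, not necessarily what was used here) is the jstat utility shipped with the Sun JDK, pointed at the Tomcat process:

jstat -gcold <tomcat-pid> 10s     (old/tenured generation sizes plus GC counters, sampled every 10 seconds)
jstat -gcutil <tomcat-pid> 10s    (utilisation percentages for all generations plus GC counts)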

Comment by Philipp Bracher [ 08/May/08 ]

The same test on 3.6 looks quite nice.

Note that the maximum memory usage was about 80MB (author & public in the same VM!). After test execution the memory was reduced to 36MB.

Note that the throughput was much better (up to 8 times faster).

Comment by Philipp Bracher [ 09/May/08 ]

MAGNOLIA-2099 did the trick; a backport to 3.5 was possible.

Comment by Mike Jones [ 14/May/08 ]

Hi,

Is this patch available for public use? We can download the patch file, but honestly we aren't quite sure how to apply it.
Do you have a compiled version/update for 3.5 (3.5.5?) that we can install to test?

Thx

Comment by Magnolia International [ 24/Jul/08 ]

Mike - sorry for the late reply, but 3.5.8 was released a while ago now, and 3.6 is on the verge of being released, too.
