
Key: MAGNOLIA-1998
Type: Task
Status: Resolved
Resolution: Fixed
Priority: Critical
Assignee: Philipp Bärfuss
Reporter: Vivian Steller
Votes: 2
Watchers: 7
Magnolia

Potential memory leak: investigation.

Created: 11/Jan/08 06:05 PM   Updated: 17/Mar/09 07:15 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 3.5.5, 3.6

Time Tracking:
Not Specified

File Attachments: 1. magnoliaMemLeak_toMag.doc (1.04 MB)

Image Attachments:

1. 3.5.4.png
(36 kB)

2. 3.6.png
(40 kB)

3. Mag3.0.5Jackrabbit1.3.3_memGraph.JPG
(163 kB)

4. Mag3.5Jackrabbit1.3.3_memGraph.JPG
(141 kB)

5. Mag3.5Jackrabbit1.3.3_memGraph_128M.JPG
(169 kB)

6. memory-leak_magnolia-CE-3.0.5_jackrabbit-1.3.1.png
(47 kB)
Issue Links:
dependency
 

Labels:
Resolution Date: 09/May/08 01:42 PM
Date of First Response: 11/Jan/08 06:44 PM


Description
Several users have reported memory issues. We're creating this issue to collect information, reports and other evidence. Please attach relevant files and leave comments on your experiences. Thanks!

Current status:

  • We're investigating this issue, but at the moment we lack evidence of a real leak in Magnolia 3.5 when it is used with an external database (see comments below).
  • Magnolia 3.0 must not be used with Jackrabbit 1.3; use the bundled 1.0 version instead.
  • There have been too many architectural changes in 3.5 to backport the fixes to the 3.0 branch
  • The following definitely helps:
    • Using an external database such as MySQL
    • Using Magnolia 3.5.x (and Jackrabbit 1.3.x)
  • If you're experiencing OutOfMemoryError: PermGen space, you need to increase the -XX:MaxPermSize JVM setting.
  • If you're still having memory issues after applying the advice above, please report them here with the following information:
    • Operating system:
    • Java version:
    • Container (tomcat, jetty, ...) and version:
    • Precise JVM settings (JAVA_OPTS, CATALINA_OPTS, ...)
    • How is your container started:
    • Is there any specific operation that triggers your memory issues?
    • Any stacktraces or relevant log files (in attachment, with your name in the filename, please)
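
For reporters who want to collect the JVM-level details in one go, here is a minimal sketch using only the standard `java.lang.management` API. The class name `EnvReport` is made up for illustration; it is not part of Magnolia. It only reports what the JVM it runs in can see, so to capture the container's real settings it would have to be executed inside the container (for example from a JSP), which is an assumption about your setup.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;

// Hypothetical helper: gathers OS, Java version and JVM arguments of the current JVM.
class EnvReport {

    static String describe() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        // Arguments passed to the JVM itself, e.g. -Xmx... or -XX:MaxPermSize=...
        List<String> jvmArgs = runtime.getInputArguments();
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);

        return "Operating system: " + System.getProperty("os.name")
                + " " + System.getProperty("os.version") + "\n"
                + "Java version: " + System.getProperty("java.version")
                + " (" + System.getProperty("java.vendor") + ")\n"
                + "JVM arguments: " + jvmArgs + "\n"
                + "Max heap (MB): " + maxHeapMb + "\n";
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

The container name and version, the way it is started, and any stack traces still need to be added by hand.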


Comments
Yuanhua Qu added a comment - 11/Jan/08 06:44 PM
I recorded the memory leak behaviour for 3.0.5 and 3.5. The heap still shows a growing trend and will eventually cause problems. These recordings come from a small test environment under a fairly small load. The memory leak crashed our production server, which hosts hundreds of sites under high load and has already been given a big JVM heap size. A fast fix for this issue would surely save us from a catastrophe. Looking forward to good news on this.

Grégory Joseph added a comment - 11/Jan/08 07:53 PM
Could you please give us more details about your setup? The Java version, JAVA_OPTS and CATALINA_OPTS, and the PersistenceManager you use in Jackrabbit are the three pieces of information we need to put your document into perspective.
Thanks for the detailed report!

Grégory Joseph added a comment - 11/Jan/08 08:05 PM
What I can probably already tell you from this is that the graphs for 3.5 don't show flagrant evidence of a leak. At first sight, there are only minor GC hits. (We can see a major one on page 2 at about 35 minutes, but that's Magnolia 3.0.)
Reducing the Xmx parameter would help the test by forcing more major GC hits. After a handful of such hits, the graph will hopefully (or rather not) show evidence of a leak. Until then, it seems like your VM just uses what it's been given.

Mike Jones added a comment - 11/Jan/08 08:07 PM
We've had the exact same issue:

RHEL, JDK 1.6, JAVA_OPTS: -Xmx2048m -Xms512m

Happens with both Magnolia 3.0.5 > BDB & Magnolia 3.0 > Derby


Yuanhua Qu added a comment - 11/Jan/08 08:37 PM
My testing env setup:

tomcat 5.0.28
JAVA_OPTS=-server -Xms512m -Xmx512m
<PersistenceManager class="info.magnolia.state.berkeley.BerkeleyDBPersistenceManager" />
Here are jars we updated and added for 3.0.5:
derby-10.2.2.0.jar
jackrabbit-api-1.3.3.jar
jackrabbit-core-1.3.3.jar
jackrabbit-jcr-commons-1.3.3.jar
jackrabbit-text-extractors-1.3.3.jar
lucene-core-2.0.0.jar
magnolia-bdb-1.2.jar
PooledJNDIDatabasePersistenceManager.jar

For 3.5, I also use magnolia-bdb-1.2.jar same as above.

As for the graphs in the doc, the first graph for each Magnolia version just records memory from startup until about 15 minutes later, with no hits at all: page 1 for 3.0.5 and page 5 for 3.5. You can see that memory usage was about 100M for 3.0.5 and 67M for 3.5.

Page 2 shows the memory leak trend for version 3.0.5 when consistent hits start 15 minutes after startup and continue until about 50 minutes after startup.
Page 6 shows the memory leak trend for version 3.5 when consistent hits start about 15 minutes after startup and continue until about 90 minutes after startup; the heap climbs to 210M. The leak certainly grows much more slowly than in version 3.0.5, but it still grows. It reaches 260M after 100 minutes.

I hereby attach a screenshot, Mag3.5Jackrabbit1.3.3_memGraph.JPG, of the heap usage when I tested v3.5.

Our production is using 3.0.5 so the situation is really bad.


Yuanhua Qu added a comment - 11/Jan/08 08:39 PM
My test uses JDK 1.6.0_03.

Sean McMains added a comment - 11/Jan/08 08:43 PM
Qu should be providing the salient details on our environment.

A few particularly interesting details that our experiments have turned up:

  • Due to our caching architecture, visitors' cookies were getting stripped out, which caused Tomcat to generate a new session for each page load. This wasn't a problem with the old Jackrabbit. Modifying our caching architecture so that sessions are preserved seemed to slow the rate of memory use considerably.
  • Qu's experiments this morning showed that with Magnolia 3.5, the issue was far less pronounced. As Gregory pointed out, there may not even be a leak there. MAGNOLIA-623 seems to indicate that session use on public was dropped in v3.5, so upgrading to v3.5 may be just another way to address the same sessions issue.
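
The session effect described in the first point can be illustrated with a toy model. This is a sketch only: `SessionStoreModel` is a made-up class and does not reproduce Tomcat's or Magnolia's real session handling. It just shows why a request that arrives without its session cookie leaves one more session object in memory until that session expires.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Toy model of a container's session store (hypothetical, not Tomcat code).
class SessionStoreModel {
    private final Map<String, Object> sessions = new HashMap<>();

    // Returns the session id the response would carry.
    // A missing or unknown cookie makes the store create a new session.
    String handleRequest(String sessionCookie) {
        if (sessionCookie != null && sessions.containsKey(sessionCookie)) {
            return sessionCookie;
        }
        String id = UUID.randomUUID().toString();
        sessions.put(id, new Object()); // stands in for per-session state
        return id;
    }

    int liveSessions() {
        return sessions.size();
    }

    public static void main(String[] args) {
        // Cookies stripped by a cache in front of the container: every page load is "new".
        SessionStoreModel stripped = new SessionStoreModel();
        for (int i = 0; i < 1000; i++) {
            stripped.handleRequest(null);
        }

        // Cookies preserved: the same visitor reuses one session.
        SessionStoreModel preserved = new SessionStoreModel();
        String cookie = null;
        for (int i = 0; i < 1000; i++) {
            cookie = preserved.handleRequest(cookie);
        }

        System.out.println("cookies stripped:  " + stripped.liveSessions() + " live sessions");
        System.out.println("cookies preserved: " + preserved.liveSessions() + " live sessions");
    }
}
```

In a real container the sessions are eventually removed when they time out, so this pattern looks like fast heap growth under load rather than an unbounded leak.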

Grégory Joseph added a comment - 11/Jan/08 08:59 PM - edited
Http session handling and usage was definitely changed in 3.5, yes.

Qu, it would be really helpful if you could decrease Xms and Xmx (even as low as 128m and 256m); hopefully that would show more "major GC" hits in the graphs.


Yuanhua Qu added a comment - 11/Jan/08 09:26 PM
No idea how my env setup msg got another 3 copies 30 min after I first submitted. Weird.

I'll take Gregory's advice and set the heap low to test and for a longer period. Will update the result when it's done.


Grégory Joseph added a comment - 11/Jan/08 09:33 PM
Hmm, this probably happened because I've just restarted the server a couple of times, fixing a few quirks since the migration.
I'll delete the extra comments. Sorry about that.

Sean McMains added a comment - 14/Jan/08 04:49 PM
I saw your latest updates on the status, Gregory. I wanted to point out that, among your recommendations, it was moving to Jackrabbit 1.3.3 that caused the problem in our case.

I understand that you guys don't have the resources to backport the fixes to the 3.0 branch. However, if you're not going to do that, I would recommend warning people that upgrading to Jackrabbit 1.3.3 may not be a good idea until they're ready to move to Magnolia 3.5.


Grégory Joseph added a comment - 14/Jan/08 05:43 PM
Hmm, I didn't realize this. So you're saying that with Magnolia 3.0.x and Jackrabbit 1.0.x, memory usage was as stable as with Magnolia 3.5.x and JR1.3.x ?

Yuanhua Qu added a comment - 14/Jan/08 09:19 PM
I set the JVM max heap size to 128M, and memory usage was stable after each full GC (fgc). Here is the graph I got, in the attachment Mag3.5Jackrabbit1.3.3_memGraph_128M.JPG. This is much better than Magnolia 3.0.5 with Jackrabbit 1.3.3, which shows memory climbing after each fgc. My concern for Magnolia 3.5 with Jackrabbit 1.3.3 is how application performance is affected if memory only stabilizes after each fgc. From the graph, we can see that the heap still grows fast after each minor collection. It looks like lots of objects live longer and get pushed to the old space, which triggers an fgc when the old space is full.
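
The criterion used in this comparison (heap used right after each full GC either stays flat or keeps climbing) can be written down as a small calculation. The sketch below is an illustration, not a tool used in this investigation: `PostGcTrend`, the threshold, and the sample numbers are all made up. It fits a least-squares line through the post-GC readings and flags a positive slope above a chosen threshold.

```java
import java.util.Arrays;

// Hypothetical helper: trend of "heap used after each full GC" readings.
class PostGcTrend {

    // Least-squares slope of the samples against their index (MB per full GC).
    static double slope(double[] usedAfterFullGcMb) {
        int n = usedAfterFullGcMb.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double meanX = (n - 1) / 2.0;
        double meanY = Arrays.stream(usedAfterFullGcMb).average().orElse(0.0);
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (usedAfterFullGcMb[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    // A floor that rises faster than the threshold is suspicious; a flat one is not.
    static boolean looksLikeLeak(double[] usedAfterFullGcMb, double thresholdMbPerGc) {
        return slope(usedAfterFullGcMb) > thresholdMbPerGc;
    }

    public static void main(String[] args) {
        // Made-up readings, NOT taken from the attached graphs.
        double[] flatFloor = {60, 62, 61, 63, 61, 62};
        double[] risingFloor = {60, 75, 91, 104, 120, 136};

        System.out.println("flat floor suspicious:   " + looksLikeLeak(flatFloor, 1.0));
        System.out.println("rising floor suspicious: " + looksLikeLeak(risingFloor, 1.0));
    }
}
```

A handful of samples is weak evidence either way, which is why a smaller Xmx (more full GCs in the same test time) makes the picture clearer.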

Sean McMains added a comment - 14/Jan/08 09:24 PM
We haven't profiled the memory usage as closely with Magnolia 3.0.x and Jackrabbit 1.0.x, but that certainly seemed to be the case, yes.

Grégory Joseph added a comment - 15/Jan/08 02:08 PM
Sean : I edited the status to reflect your comment; it's still unclear as to why only upgrading Jackrabbit would make the issue worse, though.

Qu: the purpose of this last test was to see how the GC behaved with less memory, i.e. to try to detect a potential leak. Resetting your Xmx to a more viable limit will produce less scary graphs.

Another question now: could you provide some details as to how this was tested? Is this real or generated traffic? Author or public instance? It seems to match our internal tests on a public instance, browsing with the anonymous user, so that's a rather good sign on that side of things.


Sean McMains added a comment - 15/Jan/08 03:23 PM
Gregory, all of our testing was done on public – the edit stage doesn't appear to display the problem, or does so at a much less dramatic rate. (Given that it seems to be tied to not tracking sessions, this makes some sense.)

For our stats, I think Qu used JMeter to generate artificial traffic. But the problem first appeared in our production instance with real traffic. Qu, feel free to jump in with any details I've missed.


Yuanhua Qu added a comment - 15/Jan/08 04:35 PM
Yes. As Sean described, the test was done with JMeter (without cookies enabled for any HTTP request) on a public instance, browsing as the anonymous user, as you guessed. Testing 3.0.5 with JR 1.3.3 shows similar memory usage when persistent cookies are enabled for all HTTP requests.

Ryan Gardner added a comment - 08/Feb/08 07:33 AM
Jackrabbit 1.4 was released recently, and perhaps has an influence on this issue? I was able to update the version in my parent-pom and rebuild the project with no problems... (Although I suspect dropping in the jar files would work just as well)

Jackrabbit 1.4's release notes indicate 220 bugfixes, and at a quick glance a few of them seemed like they might fix memory issues.

In any case, it would give another graph to help complete the set


Todd Farrell added a comment - 10/Mar/08 03:54 AM - edited
See attached image:
memory-leak_magnolia-CE-3.0.5_jackrabbit-1.3.1.png

We have consistently reproduced an OutOfMemory condition (memory usage pattern similar to attached image) using the Magnolia 3.0.x series.

  • Memory leak ONLY occurred with the CACHE DISABLED.
    • Cache is bypassed for any request that has request parameters, including the search feature provided with the samples.
  • We tried all possible combinations of the officially distributed Magnolia CE 3.0.2 / 3.0.3 / 3.0.5 with Jackrabbit 1.0.1 / 1.3.1.
  • We tried both the Derby persistence manager and the MsSqlPersistenceManager --> SQL Server 2005
  • We also tried Magnolia EE 3.0.5 with Derby (only)

Unfortunately, we haven't had the opportunity to test against the Magnolia 3.5.x series yet.

Our production setup:

  • Windows Server 2003 R2
  • Sun JDK 1.5.0 update 14
  • JBoss 4.2.2
  • Magnolia CE 3.0.5 with Jackrabbit 1.3.1
  • SQL Server 2005 (and Derby for local development environments - memory leak reproducible either way)
  • We have some custom filters etc. to integrate our application with Magnolia, but we used a standard Magnolia install as a control when load testing.


Philipp Bracher [old account - now Philipp Bärfuss] added a comment - 08/May/08 02:31 PM - edited
OK, I wrote a JMeter test plan that does heavy authoring (three threads):
  • create pages
  • add 10 paragraphs
  • activate every now and then
  • activation uses versioning (but no workflow)

The setup I used was:

  • tomcat 5.5 (default magnolia bundle)
  • external db (h2)
  • -Xmx256M

I've attached a graph of the tenured gen space, which shows:

  • it increases (slowly but steadily)
  • after stopping JMeter, the memory is not freed

As a next step I will test 3.6 (to see if the latest changes have an impact on that)
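
For anyone who wants to watch the same figure without a graphing tool, here is a minimal sketch that reads the heap memory pools through the standard `java.lang.management` API. `HeapPools` is a made-up name, and this is not the tool used to produce the attached graphs. Pool names depend on the JVM and the collector in use (the tenured space may be reported as "Tenured Gen", "PS Old Gen", "G1 Old Gen" and so on), so the sketch lists every heap pool instead of matching a name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper: prints current and post-GC usage of each heap pool.
class HeapPools {

    static String snapshot() {
        StringBuilder out = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage now = pool.getUsage();
            if (now == null) {
                continue; // pool is no longer valid
            }
            // Usage measured right after the most recent collection of this pool;
            // null when the JVM does not support it for the pool.
            MemoryUsage afterGc = pool.getCollectionUsage();

            out.append(pool.getName())
               .append(": used=").append(now.getUsed() / 1024).append(" KB");
            if (afterGc != null) {
                out.append(", after last GC=").append(afterGc.getUsed() / 1024).append(" KB");
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

Calling `snapshot()` periodically from inside the running webapp and logging the "after last GC" value of the old generation pool gives the same kind of curve as the attached graph.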


Philipp Bracher [old account - now Philipp Bärfuss] added a comment - 08/May/08 05:22 PM
The same test on 3.6 looks quite nice.

Note that the maximum memory usage was about 80MB (author & public in the same VM!). After test execution, memory usage dropped to 36MB.

Note also that the throughput was much better (up to 8 times faster).


Philipp Bracher [old account - now Philipp Bärfuss] added a comment - 09/May/08 01:42 PM
MAGNOLIA-2099 did the trick; a backport to 3.5 was possible.

Mike Jones added a comment - 14/May/08 04:12 PM
Hi,

Is this patch available for public use? We can download the patch file, but honestly we aren't quite sure how to apply it.
Do you have a compiled version/update for 3.5 (3.5.5?) that we can install to test?

Thx


Grégory Joseph added a comment - 24/Jul/08 02:20 PM
Mike - sorry for the late reply, but 3.5.8 was released a while ago now, and 3.6 is on the verge of being released, too.