[MAGNOLIA-1998] Potential memory leak: investigation. Created: 11/Jan/08 Updated: 23/Jan/13 Resolved: 09/May/08 |
|
| Status: | Closed |
| Project: | Magnolia |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 3.5.5, 3.6 |
| Type: | Task | Priority: | Critical |
| Reporter: | Vivian Steller | Assignee: | Philipp Bärfuss |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Description |
|
Several users have reported memory issues. We're creating this issue to collect information, reports and other evidence. Please attach relevant files and leave comments on your experiences. Thanks! Current status:
|
| Comments |
| Comment by Yuanhua Qu [ 11/Jan/08 ] |
|
I recorded the memory-leak behaviour for 3.0.5 and 3.5. The heap still shows a growing trend and will eventually cause problems, and this was recorded in a small test environment under fairly light load. The memory leak crashed our production server, which hosts hundreds of sites under high load and already has a large heap allocated to the JVM. A fast fix for this issue would save us from a catastrophe. Looking forward to your news on this. |
| Comment by Magnolia International [ 11/Jan/08 ] |
|
Could you please give us more details about your setup? Your Java version, JAVA_OPTS and CATALINA_OPTS, and the PersistenceManager you use in Jackrabbit are three pieces of information we need to put your document in perspective. |
| Comment by Magnolia International [ 11/Jan/08 ] |
|
What I can probably already tell you from this is that the graphs for 3.5 don't show any flagrant evidence of a leak. At first sight there are only minor GC hits (we can see a major one on page 2 at about 35 minutes, but that's Magnolia 3.0). |
| Comment by Mike Jones [ 11/Jan/08 ] |
|
We've had the exact same issue: RHEL, JDK 1.6, JAVA_OPTS: -Xmx2048m -Xms512m. It happens with both Magnolia 3.0.5 > BDB and Magnolia 3.0 > Derby. |
| Comment by Yuanhua Qu [ 11/Jan/08 ] |
|
My testing environment setup: Tomcat 5.0.28. For 3.5 I also use magnolia-bdb-1.2.jar, same as above. As for the graphs in the doc, the first graph for each Magnolia version just records memory from startup until about 15 minutes in, with no traffic at all (page 1 for 3.0.5 and page 5 for 3.5); you can see that memory usage was about 100 MB for 3.0.5 and 67 MB for 3.5. Page 2 shows the memory-leak trend for version 3.0.5 once consistent traffic starts, from 15 minutes after startup until about 50 minutes after startup. I hereby attach a screenshot, Mag3.5Jackrabbit1.3.3_memGraph.JPG, of heap usage from my test of v3.5. Our production is running 3.0.5, so the situation is really bad. |
| Comment by Yuanhua Qu [ 11/Jan/08 ] |
|
My tests use JDK 1.6.0_03. |
| Comment by Sean McMains [ 11/Jan/08 ] |
|
Qu should be providing the salient details on our environment. A few particularly interesting details that our experiments have turned up:
|
| Comment by Magnolia International [ 11/Jan/08 ] |
|
HTTP session handling and usage definitely changed in 3.5, yes. Qu, it would be really helpful if you could decrease Xms and Xmx (even as low as 128m and 256m); hopefully that would show more "major GC" hits in the graphs. |
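For reference, a low-heap test setup on Tomcat might look roughly like this. The exact values, and whether your installation reads CATALINA_OPTS or JAVA_OPTS, are assumptions; adjust to your environment:

```shell
# Hypothetical low-heap settings to make major GC cycles show up sooner;
# verbose GC logging lets the heap trend be read from a log as well as a graph.
export CATALINA_OPTS="-Xms128m -Xmx256m -verbose:gc -XX:+PrintGCDetails"
$CATALINA_HOME/bin/catalina.sh run
```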
| Comment by Yuanhua Qu [ 11/Jan/08 ] |
|
No idea how my environment-setup message got another three copies 30 minutes after I first submitted it. Weird. I'll take Gregory's advice and set the heap low, and test for a longer period. I'll update with the results when it's done. |
| Comment by Magnolia International [ 11/Jan/08 ] |
|
Hmm, this probably happened because I've just been restarting the server a couple of times, fixing a few quirks since the migration. |
| Comment by Sean McMains [ 14/Jan/08 ] |
|
I saw your latest updates on the status, Gregory. I wanted to point out that, among your recommendations, it was moving to Jackrabbit 1.3.3 that caused the problem in our case. I understand that you guys don't have the resources to backport the fixes to the 3.0 branch. However, if you're not going to do that, I would recommend warning people that upgrading to Jackrabbit 1.3.3 may not be a good idea until they're ready to move to Magnolia 3.5. |
| Comment by Magnolia International [ 14/Jan/08 ] |
|
Hmm, I didn't realize this. So you're saying that with Magnolia 3.0.x and Jackrabbit 1.0.x, memory usage was as stable as with Magnolia 3.5.x and JR 1.3.x? |
| Comment by Yuanhua Qu [ 14/Jan/08 ] |
|
I set the JVM max heap size to 128 MB, and memory usage did stabilize after each full GC. The graph is in the attachment Mag3.5Jackrabbit1.3.3_memGraph_128M.JPG. This is much better than Magnolia 3.0.5 with Jackrabbit 1.3.3, which shows memory climbing after each full GC. My concern with Magnolia 3.5 and Jackrabbit 1.3.3 is the impact on application performance if memory only stabilizes after each full GC. From the graph we can see that the heap still grows fast after each minor collection. It looks like many objects get longer lifetimes and are promoted to the old generation, which triggers a full GC once the old generation fills up. |
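One way to watch promotion into the old generation directly, rather than inferring it from heap graphs, is the JDK's `jstat` tool against the running Tomcat process (the pid below is a placeholder):

```shell
# Sample GC statistics every 1000 ms: S0/S1 = survivor spaces, E = eden,
# O = old generation, FGC = full GC count. An O column that climbs steadily
# between full GCs is consistent with objects being promoted early.
jstat -gcutil <tomcat-pid> 1000
```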
| Comment by Sean McMains [ 14/Jan/08 ] |
|
We haven't profiled the memory usage as closely with Magnolia 3.0.x and Jackrabbit 1.0.x, but that certainly seemed to be the case, yes. |
| Comment by Magnolia International [ 15/Jan/08 ] |
|
Sean: I edited the status to reflect your comment; it's still unclear why only upgrading Jackrabbit would make the issue worse, though. Qu: the purpose of this last test was to see how the GC behaved with less memory, i.e. to try to detect a potential leak. Resetting your Xmx to a more viable limit will produce less scary graphs. Another question now: could you provide some details on how this was tested? Is this real or generated traffic? Author or public instance? It seems to match our internal tests on a public instance, browsing as the anonymous user, so that's a rather good sign on that side of things. |
| Comment by Sean McMains [ 15/Jan/08 ] |
|
Gregory, all of our testing was done on public – the edit stage doesn't appear to display the problem, or does so at a much less dramatic rate. (Given that it seems to be tied to not tracking sessions, this makes some sense.) For our stats, I think Qu used JMeter to generate artificial traffic. But the problem first appeared in our production instance with real traffic. Qu, feel free to jump in with any details I've missed. |
| Comment by Yuanhua Qu [ 15/Jan/08 ] |
|
Yes, as Sean described, the test was done with JMeter (without enabling cookies for the HTTP requests) on a public instance, browsing as the anonymous user, as you guessed. It shows similar memory usage to testing 3.0.5 with JR 1.3.3 with persistent cookies enabled for all HTTP requests. |
| Comment by Ryan Gardner [ 08/Feb/08 ] |
|
Jackrabbit 1.4 was released recently; perhaps it has an influence on this issue? I was able to update the version in my parent POM and rebuild the project with no problems (although I suspect dropping in the jar files would work just as well). Jackrabbit 1.4's release notes list 220 bugfixes, and at a quick glance a few of them looked like they might fix memory issues. In any case, it would give another graph to help complete the set. |
| Comment by Todd Farrell [ 10/Mar/08 ] |
|
See attached image: We have consistently reproduced an OutOfMemory condition (memory usage pattern similar to attached image) using the Magnolia 3.0.x series.
Unfortunately, we haven't had the opportunity to test against the Magnolia 3.5.x series yet. Our production setup:
|
| Comment by sebastian.frick [ 14/Mar/08 ] |
|
Has anyone drawn comparisons between EE 3.5.x/JR 1.3.3 and EE 3.5.x/JR 1.4? |
| Comment by Philipp Bracher [ 08/May/08 ] |
|
OK, I wrote a JMeter test plan which does heavy authoring (three threads):
The setup I used was:
I attached a graph of the tenured generation space, which shows:
As a next step I will test 3.6 (to see whether the latest changes have an impact on this). |
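A heavy-authoring JMeter plan like the one described above can be run headless so it doesn't skew the memory measurements with GUI overhead. A minimal sketch; the plan and result file names, and the `threads` property, are assumptions about how such a plan might be parameterized:

```shell
# Run the test plan in non-GUI mode and write results to a JTL file.
# heavy-authoring.jmx and the -Jthreads property are hypothetical names.
jmeter -n -t heavy-authoring.jmx -l results.jtl -Jthreads=3
```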
| Comment by Philipp Bracher [ 08/May/08 ] |
|
The same test on 3.6 looks quite nice. Note that the maximum memory usage was about 80 MB (author & public in the same VM!). After test execution, memory dropped back to 36 MB. Note also that the throughput was much better (up to 8 times faster). |
| Comment by Philipp Bracher [ 09/May/08 ] |
|
|
| Comment by Mike Jones [ 14/May/08 ] |
|
Hi, is this patch available for public use? We can download the patch file, but honestly we aren't quite sure how to apply it. Thanks. |
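For anyone in the same situation, applying a downloaded source patch to a Maven-based checkout typically looks something like the following. The directory and patch filename are hypothetical, and whether you need `-p0` or `-p1` depends on the paths in the diff header:

```shell
# Apply the downloaded patch from the source-tree root, then rebuild.
cd magnolia-src                    # hypothetical checkout directory
patch -p0 < MAGNOLIA-1998.patch    # hypothetical filename; check -p0 vs -p1 against the diff paths
mvn clean install
```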
| Comment by Magnolia International [ 24/Jul/08 ] |
|
Mike, sorry for the late reply, but 3.5.8 was released a while ago now, and 3.6 is on the verge of being released, too. |