Yuanhua Qu added a comment - 11/Jan/08 06:44 PM
I recorded the memory leak behaviour for 3.0.5 and 3.5. The heap still shows a growing trend and will eventually cause problems. These recordings come from a small test environment under a fairly light load. The memory leak crashed our production server, which hosts hundreds of sites under high load and already has a large JVM heap size. A fast fix for this issue would save us from a catastrophe. Looking forward to good news on this.
Could you please give us more details about your setup? The Java version, JAVA_OPTS and CATALINA_OPTS, and the PersistenceManager you use in Jackrabbit are the three pieces of information we need to put your document in perspective.
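One way to collect the JVM-side details requested here is to ask the running JVM itself. The following is a minimal sketch, not part of Magnolia or Jackrabbit; the class name `JvmSetupReport` is hypothetical. It uses only the standard `java.lang.management` API (Java 5+). Options passed through JAVA_OPTS or CATALINA_OPTS normally show up as JVM input arguments, but only if the code runs inside the Tomcat JVM (for example from a JSP); run standalone, it reports its own JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

// Hypothetical helper: summarizes the JVM settings relevant to a heap report.
class JvmSetupReport {

    // Builds a plain-text summary of the current JVM's version and options.
    static String describe() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        StringBuilder sb = new StringBuilder();
        sb.append("java.version=").append(System.getProperty("java.version")).append('\n');
        sb.append("vm=").append(runtime.getVmName()).append(' ')
          .append(runtime.getVmVersion()).append('\n');
        // Effective heap ceiling, which reflects -Xmx if it was set.
        sb.append("maxHeapBytes=").append(Runtime.getRuntime().maxMemory()).append('\n');
        // Raw JVM options such as -Xmx and -Xms, as passed at startup.
        for (String arg : runtime.getInputArguments()) {
            sb.append("jvmArg=").append(arg).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

The PersistenceManager is not visible this way; it has to be read from the Jackrabbit repository configuration.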
Thanks for the detailed report! What I can probably already tell you is that the graphs for 3.5 don't show flagrant evidence of a leak. At first sight, there are only minor GC hits. (We can see a major one on page 2 at about 35 minutes, but that's Magnolia 3.0.)
Reducing the Xmx parameter would help the test, by forcing more major GC hits. After a handful of such hits, the graph will hopefully (rather not
We've had the exact same issue:
RHEL, JDK 1.6, JAVA_OPTS: -Xmx2048m -Xms512m
Happens with both Magnolia 3.0.5 > BDB & Magnolia 3.0 > Derby
My testing env setup:
tomcat 5.0.28
For 3.5, I also use magnolia-bdb-1.2.jar, same as above. As for the graphs in the doc, the first graph for each Magnolia version just records memory from startup until about 15 minutes later, with no hits at all: page 1 for 3.0.5 and page 5 for 3.5. You can see that memory usage was about 100M for 3.0.5 and 67M for 3.5. Page 2 shows the memory leak trend for version 3.0.5 when consistent hits start 15 minutes after startup and continue until about 50 minutes after startup. I hereby attach a screenshot, Mag3.5Jackrabbit1.3.3_memGraph.JPG, of heap usage when I tested v3.5.
Our production is using 3.0.5 so the situation is really bad. Qu should be providing the salient details on our environment.
A few particularly interesting details that our experiments have turned up:
Http session handling and usage was definitely changed in 3.5, yes.
Qu, it would be really helpful if you could decrease Xms and Xmx (even as low as 128m and 256m); hopefully that would show more "major GC" hits in the graphs.
No idea how my env setup message got another 3 copies 30 minutes after I first submitted it. Weird.
I'll take Gregory's advice and set the heap low, and test for a longer period. I will post the results when it's done.
Hmm, this probably happened because I've been restarting the server a couple of times just now, fixing a couple of quirks since the migration.
I'll delete the extra comments. Sorry for that.
I saw your latest updates on the status, Gregory. I wanted to point out that, among your recommendations, it was moving to Jackrabbit 1.3.3 that caused the problem in our case.
I understand that you guys don't have the resources to backport the fixes to the 3.0 branch. However, if you're not going to do that, I would recommend warning people that upgrading to Jackrabbit 1.3.3 may not be a good idea until they're ready to move to Magnolia 3.5.
Hmm, I didn't realize this. So you're saying that with Magnolia 3.0.x and Jackrabbit 1.0.x, memory usage was as stable as with Magnolia 3.5.x and JR 1.3.x?
I set the JVM max heap size to 128M, and memory usage did stay stable after each full GC (fgc). The graph is in the attachment Mag3.5Jackrabbit1.3.3_memGraph_128M.JPG. This is much better than Magnolia 3.0.5 with Jackrabbit 1.3.3, where memory climbs after each fgc. My concern for Magnolia 3.5 with Jackrabbit 1.3.3 is the effect on application performance if memory only stabilizes after each fgc. From the graph, we can see that the heap still grows fast after each minor collection. It looks like many objects live long enough to be promoted to the old space, which triggers an fgc when the old space is full.
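The distinction drawn in this comment (heap usage returning to a stable floor after each full GC versus climbing after each full GC) can be expressed as a small check over post-GC samples. This is only an illustrative sketch: the class name `PostGcTrend`, the `slackBytes` tolerance, and the sample values in `main` are all hypothetical and are not taken from the attached graphs.

```java
// Hypothetical helper: classifies a series of "heap used right after a full GC" samples.
class PostGcTrend {

    // Returns true when each post-full-GC sample exceeds the previous one
    // by more than slackBytes, i.e. the floor keeps rising (leak-like pattern).
    static boolean isClimbing(long[] usedAfterFullGc, long slackBytes) {
        if (usedAfterFullGc.length < 3) {
            return false; // too few full GCs to judge a trend
        }
        for (int i = 1; i < usedAfterFullGc.length; i++) {
            if (usedAfterFullGc[i] <= usedAfterFullGc[i - 1] + slackBytes) {
                return false; // the floor held steady or dropped at least once
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Made-up sample series, for illustration only.
        long[] rising = {40 * mb, 55 * mb, 71 * mb, 90 * mb};
        long[] steady = {40 * mb, 41 * mb, 40 * mb, 41 * mb};
        System.out.println("rising series climbing? " + isClimbing(rising, 2 * mb));
        System.out.println("steady series climbing? " + isClimbing(steady, 2 * mb));
    }
}
```

Requiring every step to rise is a deliberately strict criterion; a real analysis would likely tolerate some noise. Running with a small Xmx, as suggested earlier in the thread, produces more full GCs and therefore more samples to compare.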
We haven't profiled the memory usage as closely with Magnolia 3.0.x and Jackrabbit 1.0.x, but that certainly seemed to be the case, yes.
Sean: I edited the status to reflect your comment; it's still unclear why upgrading only Jackrabbit would make the issue worse, though.
Qu: the purpose of this last test was to see how the GC behaved with less memory, i.e. to try to detect a potential leak. Resetting your Xmx to a more viable limit will produce less scary graphs.
Another question now: could you provide some details as to how this was tested? Is this real or generated traffic? Author or public instance? It seems to match our internal tests on a public instance, browsing with the anonymous user, so that's a rather good sign on that side of things.
Gregory, all of our testing was done on public – the edit stage doesn't appear to display the problem, or does so at a much less dramatic rate. (Given that it seems to be tied to not tracking sessions, this makes some sense.)
For our stats, I think Qu used JMeter to generate artificial traffic. But the problem first appeared in our production instance with real traffic. Qu, feel free to jump in with any details I've missed.
Yes. As Sean described, the test was done with JMeter (without cookies enabled for any HTTP requests) on a public instance, browsing as the anonymous user, as you guessed. Testing 3.0.5 with JR 1.3.3 shows similar memory usage when persistent cookies are enabled for all HTTP requests.
Jackrabbit 1.4 was released recently, and perhaps has an influence on this issue? I was able to update the version in my parent-pom and rebuild the project with no problems... (Although I suspect dropping in the jar files would work just as well)
Jackrabbit 1.4's release notes indicate 220 bugfixes, and at a quick glance a few of them seemed like they might fix memory issues. In any case, it would give another graph to help complete the set.
See attached image:
memory-leak_magnolia-CE-3.0.5_jackrabbit-1.3.1.png
We have consistently reproduced an OutOfMemory condition (memory usage pattern similar to attached image) using the Magnolia 3.0.x series.
Unfortunately, we haven't had the opportunity to test against the Magnolia 3.5.x series yet.
Our production setup:
OK, I wrote a JMeter test plan that does heavy authoring (three threads):
The setup I used was:
I add the graph of the tenured gen space which shows:
As a next step I will test 3.6 (to see if the latest changes have an impact on that).
The same test on 3.6 looks quite nice.
Note that the maximum memory usage was about 80MB (author & public in the same VM!). After test execution, memory usage dropped to 36MB. Note that the throughput was much better (up to 8 times faster).
Hi,
Is this patch available for public use? We can download the patch file, but honestly we aren't quite sure how to apply it. Thx
Mike - sorry for the late reply, but 3.5.8 was released a while ago now, and 3.6 is on the verge of being released, too.