[MAGNOLIA-2172] workflow: unexpected modified item state exceptions on heavy load Created: 05/Jun/08 Updated: 23/Jan/13 Resolved: 11/Jul/08 |
|
| Status: | Closed |
| Project: | Magnolia |
| Component/s: | workflow |
| Affects Version/s: | None |
| Fix Version/s: | 3.6 |
| Type: | Bug | Priority: | Major |
| Reporter: | Oscar Poves Hernanez | Assignee: | Philipp Bärfuss |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
VMware Infrastructure, 2 CPUs assigned at 3.0 GHz, 3.5 GB of RAM, Linux RHEL 5; 2.6 GB is assigned to the Magnolia process |
||
| Description |
|
The load test uses robots acting as publishers (editors) and activators. The 25 publisher robots create pages and activate them. The seven activator robots have only one task: activating the items in the inbox. After running under load for less than one hour, the inbox stops accepting items. |
| Comments |
| Comment by Magnolia International [ 05/Jun/08 ] |
|
Could you please
Thanks. |
| Comment by Oscar Poves Hernanez [ 05/Jun/08 ] |
|
The process details are '-Xms2600m -Xmx2600m -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy'. When the inbox stops accepting items, this is the exception. I don't know of any earlier errors. |
| Comment by Philipp Bracher [ 05/Jun/08 ] |
|
I was able to investigate the system using the shell. The node /expressions/activation/xxxx can't be saved because its parent was modified externally. As soon as I refresh the session (using the new LifeTimeJCRSessionUtil), the system starts working again. I don't know how the content could have been modified externally, since we use a single session. The problem may lie in the fact that Jackrabbit sessions are not thread-safe for writes (at least that is my current understanding). So I will patch the workflow module so that it uses a unique session per atomic operation (store, fetch, ...). You can then try it out in your instance by rerunning the load test. I also noticed that the connection to the database was re-established now and then (this should not have an impact on the issue, but you never know). |
| Comment by Philipp Bracher [ 12/Jun/08 ] |
|
The attached patch seems to fix the issue (by using a separate session per operation). However, the customer reported a dramatic performance loss, so I will hold off on applying it until we know more. |
| Comment by Philipp Bracher [ 20/Jun/08 ] |
|
I committed the changes to 3.6, but modified them so that the cleanup and use of lifetime JCR sessions are configurable. The default configuration behaves exactly as before. This configuration should not be used and is for testing purposes only. Based on that, I will measure the performance loss with other configurations. |
| Comment by Philipp Bracher [ 11/Jul/08 ] |
|
The methods are now synchronized to avoid concurrent modifications and saves. The module is configured to use a single session (otherwise the performance loss is too big). If the expression store finds that the session has pending changes, a warning is logged and the session is refreshed.
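
For readers who want to picture the final approach, here is a minimal sketch. It is not the code committed to the workflow module. SessionLike is a hypothetical stand-in for the three javax.jcr.Session methods involved (hasPendingChanges, refresh, save), and SynchronizedExpressionStore, StoreOperation and FakeSession are illustrative names. Discarding the pending changes with refresh(false) is an assumption; the comment above only says that the session is refreshed.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for the three javax.jcr.Session methods the sketch needs.
interface SessionLike {
    boolean hasPendingChanges() throws Exception;
    void refresh(boolean keepChanges) throws Exception;
    void save() throws Exception;
}

// Hypothetical callback that modifies nodes through the shared session.
interface StoreOperation {
    void apply(SessionLike session) throws Exception;
}

// Illustrative wrapper around a single shared session; not the actual module code.
class SynchronizedExpressionStore {
    private static final Logger LOG = Logger.getLogger("SynchronizedExpressionStore");
    private final SessionLike session;
    private int refreshCount = 0;

    SynchronizedExpressionStore(SessionLike session) {
        this.session = session;
    }

    // synchronized: only one thread at a time may modify and save via the shared session.
    synchronized void store(StoreOperation op) throws Exception {
        if (session.hasPendingChanges()) {
            // Changes left over from an earlier operation: log a warning and refresh.
            // Assumption: the leftovers are discarded (keepChanges = false).
            LOG.warning("Session has pending changes; refreshing before store");
            session.refresh(false);
            refreshCount++;
        }
        op.apply(session);
        session.save();
    }

    synchronized int getRefreshCount() {
        return refreshCount;
    }
}

// Minimal in-memory fake, used only to exercise the sketch without a repository.
class FakeSession implements SessionLike {
    boolean pending = false;
    int saves = 0;

    public boolean hasPendingChanges() { return pending; }
    public void refresh(boolean keepChanges) { if (!keepChanges) pending = false; }
    public void save() { pending = false; saves++; }
}
```

Because every write goes through one synchronized method, the single shared session is never written to by two threads at once, which is the trade-off described above: correctness without the cost of opening a session per operation.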