[MAGNOLIA-9051] Improve cache performance on public instances Created: 10/Aug/23  Updated: 19/Jan/24

Status: Open
Project: Magnolia
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Epic Priority: Neutral
Reporter: Michael Duerig Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: dx-core-6.3
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: NavigationAwareCacheCleaner.java (Java source), PublicationFilter.java (Java source), PublishFilterEvent.java (Java source), exclude-workspace.png (PNG), threads_report-3.txt (text), threads_report.txt (text)
Issue Links:
Relates
relates to MAGNOLIA-9176 Apply central threadpool to DelayedEx... Closed
relates to MGNLCACHE-240 Optimise EhCache3Factory cache wrappi... Open
relates to MGNLCACHE-241 Consider replacing ReentrantRWLocks w... Open
relates to MAGNOLIA-9056 Memory leak when re-registering module Selected
relates to MGNLADVCACHE-124 Introduce new Flush policy and rely o... Resolved
Template:
Epic Name: cache performance
Acceptance criteria:
Empty
Date of First Response:

 Description   

Context

See the notes from UZH for pain points and initial findings.

Questions for discovery

  • (How) can we implement a cache flush policy that does not rely on observation?
  • Can we rely on publication instead?
  • UZH does this via the NavigationAwareCacheCleaner.java, which uses a filter for intercepting publish requests instead of relying on observation like we do. Can we leverage this solution?
  • How does that affect eager re-caching?
  • Can we do it in the background? Consistency guarantees?
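To make the first two questions concrete, here is a minimal sketch of a flush policy driven by publish events rather than by observation listeners. It is not the code in NavigationAwareCacheCleaner.java, and it uses no Magnolia API: PublishEvent, PageCache and PublicationFlushPolicy are hypothetical names introduced only for illustration, and the cache is assumed to be keyed by content path.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical description of one publish request seen by an intercepting filter. */
final class PublishEvent {
    final String workspace;
    final String path;

    PublishEvent(String workspace, String path) {
        this.workspace = workspace;
        this.path = path;
    }
}

/** Hypothetical minimal page cache keyed by content path (assumption). */
final class PageCache {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    void put(String path, String content) { entries.put(path, content); }

    boolean contains(String path) { return entries.containsKey(path); }

    int size() { return entries.size(); }

    /** Removes the entry at the given path and everything below it. */
    void flushSubtree(String path) {
        String prefix = path.endsWith("/") ? path : path + "/";
        entries.keySet().removeIf(key -> key.equals(path) || key.startsWith(prefix));
    }
}

/** Flushes in response to publish events; registers no observation listeners. */
final class PublicationFlushPolicy {
    private final PageCache cache;
    private final String cachedWorkspace;

    PublicationFlushPolicy(PageCache cache, String cachedWorkspace) {
        this.cache = cache;
        this.cachedWorkspace = cachedWorkspace;
    }

    /** Returns true if the event was relevant and a flush was performed. */
    boolean onPublish(PublishEvent event) {
        if (!cachedWorkspace.equals(event.workspace)) {
            return false; // publications to other workspaces leave this cache alone
        }
        cache.flushSubtree(event.path);
        return true;
    }
}
```

In this sketch a filter would call onPublish once per intercepted publish request, so the cost scales with publications rather than with the number of registered listeners. Eager re-caching, background flushing and consistency guarantees are deliberately left out.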

Discovery notes

 



 Comments   
Comment by Oanh Thai Hoang [ 29/Nov/23 ]

This is just an update on the discovery; I haven't completed it yet.

When /modules/advanced-cache/config@createseparatecachesforeachsite is set to true, the number of WorkspaceEventListenerRegistration instances grows to (number of current workspaces * number of sites).
See the log below; this happens for each site:

info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [dam] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [keystore] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [contacts] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [usergroups] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [scripts] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [tours] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [website] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [userroles] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [campaigns] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [stories] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [workflow] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [tags] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [userranking] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [visitors] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [tasks] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [category] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [segments] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [rss] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [personas] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [resources] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [default] at path [/] 
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [pendingContacts] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [marketing-tags] at path [/]
info.magnolia.module.cache.AbstractListeningFlushPolicy 29.11.2023 09:41:40 -- Registering event listener for cache [test33] in workspace [config] at path [/]

There are indeed 24 observation listeners per site cache, one on the root path of each workspace. Each listens for change events and then applies the flush policy.
 
One odd thing I see is that the userranking workspace shows a one-minute interval between the writes that persist data to JCR.
 
See the log below:
 

2023-11-29 12:53:17,674 DEBUG magnolia.rank.persistence.jcr.JcrUserRankerStorage: Storing user ratings to user with id: 51ae3379-67cf-4994-9e05-f97cb8bc3e4a
2023-11-29 12:54:17,674 DEBUG magnolia.rank.persistence.jcr.JcrUserRankerStorage: Storing user ratings to user with id: 51ae3379-67cf-4994-9e05-f97cb8bc3e4a
2023-11-29 12:55:17,675 DEBUG magnolia.rank.persistence.jcr.JcrUserRankerStorage: Storing user ratings to user with id: 51ae3379-67cf-4994-9e05-f97cb8bc3e4a
2023-11-29 12:56:17,672 DEBUG magnolia.rank.persistence.jcr.JcrUserRankerStorage: Storing user ratings to user with id: 51ae3379-67cf-4994-9e05-f97cb8bc3e4a

See user ranking code:
https://git.magnolia-cms.com/projects/ENTERPRISE/repos/ranker/browse/user-result-ranker/src/main/java/info/magnolia/rank/rating/UserRatingsManager.java#44,
https://git.magnolia-cms.com/projects/ENTERPRISE/repos/ranker/browse/user-result-ranker-jcr/src/main/java/info/magnolia/rank/persistence/jcr/JcrUserRankerStorage.java#100
 
 
So in my case, with 100 sites, there are 100 listeners on 'userranking', which is really wasteful.
 
A first quick and easy suggestion is to add userranking to the excludedWorkspaces property.
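As a rough illustration of the numbers involved, the sketch below counts registrations as (observed workspaces * sites), using the 24 workspace names from the log above and the 100 sites from my example. ListenerEstimate and its methods are hypothetical helpers, not Magnolia code, and the sketch assumes that an excluded workspace simply gets no listener.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ListenerEstimate {
    /** The 24 workspaces that appear in the registration log. */
    static final List<String> LOGGED_WORKSPACES = List.of(
            "dam", "keystore", "contacts", "usergroups", "scripts", "tours",
            "website", "userroles", "campaigns", "stories", "workflow", "tags",
            "userranking", "visitors", "tasks", "category", "segments", "rss",
            "personas", "resources", "default", "pendingContacts",
            "marketing-tags", "config");

    /** One listener per (site cache, observed workspace) pair. */
    static int registrations(Collection<String> workspaces, Set<String> excluded, int sites) {
        Set<String> observed = new LinkedHashSet<>(workspaces);
        observed.removeAll(excluded); // assumption: excluded workspaces get no listener
        return observed.size() * sites;
    }

    public static void main(String[] args) {
        int sites = 100;
        // All 24 workspaces observed for every site: prints 2400
        System.out.println(registrations(LOGGED_WORKSPACES, Set.of(), sites));
        // userranking excluded: prints 2300
        System.out.println(registrations(LOGGED_WORKSPACES, Set.of("userranking"), sites));
    }
}
```

Excluding a single workspace removes only one listener per site, so it helps with the noisy userranking writes but does not change the multiplicative growth.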
 
 

Comment by Michael Duerig [ 30/Nov/23 ]

> The number of WorkspaceEventListenerRegistration is increased by number current workspace * number of sites.

This is exactly what UZH reported, great that we can reproduce this!

 

>  One weird thing is I see in userranking workspace, it has duration of 1 minute to persist data to JCR

Do we know why this is so slow? Getting a better understanding of this might provide some general clues on performance bottlenecks. E.g. is that particular code slow and should be optimized or is it slow because the system as a whole is under much load and/or contended?

 

> First quick and easy suggest is add to userranking to excludedWorkspaces property. 

Ack, and I'd be interested to know by how much this improves performance. Overall we should aim for a more general solution so we can get rid of the explosion of observation listeners (number of workspaces * number of sites).
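One possible shape for such a general solution, offered only as a sketch and not as something decided in this ticket: register a single upstream listener per workspace and let it fan events out to the per-site caches. WorkspaceEventDispatcher is a hypothetical class; the wiring to JCR observation is left out, and the flush callbacks are assumed to take the changed path.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical dispatcher: one upstream listener per workspace, many site caches behind it. */
final class WorkspaceEventDispatcher {
    private final Map<String, List<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    /** Adds a site cache's flush callback; the map entry stands in for the workspace's single listener. */
    void subscribe(String workspace, Consumer<String> flushCallback) {
        subscribers.computeIfAbsent(workspace, w -> new CopyOnWriteArrayList<>()).add(flushCallback);
    }

    /** Called once per change event by the workspace's single listener. */
    void onChange(String workspace, String path) {
        subscribers.getOrDefault(workspace, List.of()).forEach(callback -> callback.accept(path));
    }

    /** Number of upstream listeners needed: one per subscribed workspace, independent of site count. */
    int upstreamListenerCount() {
        return subscribers.size();
    }
}
```

With this layout the number of upstream registrations grows with the number of workspaces only. Each event is still delivered to every subscribed site cache, so the flush work itself is not reduced.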

 

 

Comment by Oanh Thai Hoang [ 01/Dec/23 ]

threads_report.txt and the bootstrap snapshot (open it with IntelliJ) can be found here

Comment by Oanh Thai Hoang [ 01/Dec/23 ]

Here is another one, threads_report-3.txt, with around 950 threads; those threads are never released.

Comment by Oanh Thai Hoang [ 11/Dec/23 ]

The relevant report is available at https://jira.magnolia-cms.com/browse/MAGNOLIA-9056?focusedId=388532&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-388532

 

Generated at Mon Feb 12 04:38:09 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.