[MAGNOLIA-8347] JcrQueryBuilder performance issues. Created: 14/Mar/22  Updated: 26/Jan/23  Resolved: 26/Jan/23

Status: Closed
Project: Magnolia
Component/s: None
Affects Version/s: 6.2.17
Fix Version/s: None

Type: Bug Priority: High
Reporter: Rico Jansen Assignee: Dominik Maslanka
Resolution: Obsolete Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Issue split
split to MGNLUI-7739 System properties and non string valu... Closed
split to MGNLUI-7740 Null check for non-existing property ... Closed
split to MGNLUI-7755 Ability to improve JCR filtering perf... Closed
Problem/Incident
causality
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Bug DoR:
[ ]* Steps to reproduce, expected, and actual results filled
[ ]* Affected version filled
Date of First Response:
Epic Link: [JCR] Search Performance issues
Team: Nucleus
Work Started:

 Description   

Performance of the column search in the pages and assets app when we tested it on our large repository (100.000+ nodes) was not really usable. Analysis showed that the queries generated by JcrQueryBuilder did not really perform.

By experimenting the biggest improvement was to reduce the use of likes, and if used only with a postfix wildcard (ie 'test%' instead of '%test%') as this reduces the number of hits we included an contains query as well with little to no performance hit (The resulting query is quite similar to what Magnolia 5 used in the search view).

Another thing that increased performance to remove the use of wildcards on columns that have limited values (ie mgnl:template and such).

A curious one (which we fully blame on JackRabbit) was the the jcrName wildcard query (ie '%nodename%). A comparison with the queries generated by Magnolia 5 revealed that adding a property existence query increased performance significantly. So the difference between one without and one without on our repository is really something:
SELECT * FROM [nt:base] AS t WHERE ISDESCENDANTNODE(t, [/]) AND (t.[jcr:primaryType] = 'mgnl:page' OR t.[jcr:primaryType] = 'mgnl:variants' OR t.[jcr:primaryType] = 'mgnl:variant') AND (LOWER(LOCALNAME(t)) LIKE '%test%' )
546 nodes returned in 19210ms
SELECT * FROM [nt:base] AS t WHERE ISDESCENDANTNODE(t, [/]) AND (t.[jcr:primaryType] = 'mgnl:page' OR t.[jcr:primaryType] = 'mgnl:variants' OR t.[jcr:primaryType] = 'mgnl:variant') AND (LOWER(LOCALNAME(t)) LIKE '%test%' OR t.['xxxx'] is not null)
546 nodes returned in 426ms



 Comments   
Comment by Dominik Maslanka [ 26/Jan/23 ]

Closing because of split in smaller parts

Generated at Mon Feb 12 04:31:55 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.