[MAGNOLIA-4055] Add Node/JCR API to QueryUtil Created: 14/Jul/11  Updated: 19/Jul/12  Resolved: 10/Jul/12

Status: Closed
Project: Magnolia
Component/s: core
Affects Version/s: None
Fix Version/s: 4.5.4, 5.0

Type: Improvement Priority: Major
Reporter: Christian Ringele Assignee: Ondrej Chytil
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Date of First Response:

 Comments   
Comment by Jan Haderka [ 30/Mar/12 ]

Too big of a task. Should be re-started early in next iteration. Create concept page first and then split issue in subtasks. It's not just 1:1 migration of old API.

Comment by Fabrizio Giustina [ 20/Jun/12 ]

looking at the recent commits looks like that new APIs are as bad as the old ones, what about fixing old mistakes during this rewrite?

I see that new APIs are really implemented like old ones, allowing the user to specify the node type AFTER the query has been executed, so that all the results have to be processed and filtered. New APIs returns an iterator simply because all the resultset is loaded in a collection and than an iterator is created from that, that's meaningless...

Having this post processing:

  • makes impossibile to use this APIs for large resultset, since everything will be eager (pre-processing of the whole resultset, instead of a streaming result set) and memory hungry
  • breaks pagination at all. Since the resultset is post-precessed andd the number of results changes you cannot use pagination or query limit at the jcr level
  • it's simply "not the way a jcr search for a page should work": the main topic here is to aggregate results from paragraphs in pages, but this was the workaround found years ago when there was no way better to do it properly. Nowadays the "right way" is simply to configure jackrabbit indexing in order to aggregate paragraph content into the parent page fulltext node.
    (old topic http://www.openmindlab.com/lab/products/mgnlcriteria.html )

So, the suggestion is simply to drop the itemtype parameter from methods and the filtering, and just return the iterator you get back from jackrabbit. That's the only way to make those API really usable and keep magnolia fast

A few additional nice features could be pagination, spell check, returning the total number of results (so a more complex object than the simple iterator) and so on... this is something we implemented years ago in mgnlcriteria but it's a pity to see that magnolia native APIs are so poor. I would be happy to help if you agree to make them better

Comment by Jan Haderka [ 20/Jun/12 ]

Hey Fabrizio,

glad to see you finally chimed in.

As for the possibility to specify type for aggregation of results, yes we kept it in. The reason being that not everyone can or want to change indexes and we felt it necessary to still provide them the functionality. But this parameter is in no way enforced. Also if you look at the way things are processed filtering is applied only when iterating the results which significantly lowers the performance penalty. Just because we provide a gun, doesn't mean that everyone has to go around and start shooting.

As for your offer to add support for all the goodies - yeah, this would be most welcome addition. I would not do it as part in the minor release, but master branch is now 5.0 and that is where this should happen imho. It would be great to see those additions from you. I would suggest you also summarize what you want to add on wiki in a concept page so we can discuss the amount of functionality and impact before too much of the code is committed and make sure that it is all done and ready by the release time of 5.0, just like it was done for the current change. Judging from the fact that you react only now, it might be good if you subscribe to a RSS feed from wiki to see what is being discussed and/or written in there and can take part in the discussion. Your feedback is most welcome as always.

Cheers,
Jan

Comment by Fabrizio Giustina [ 20/Jun/12 ]

Hi Jan,
thanks for your answer, I just found the wiki page you were referring to http://wiki.magnolia-cms.com/display/DEV/Concept+QueryUtil , I'll try contributing there.

As for the possibility to specify type for aggregation of results, yes we kept it in. The reason being that not everyone can or want to change indexes and we felt it necessary to still provide them the functionality.

well, maybe we could look if there is a way (especially now that we have new nodetypes for pages and areas) to add a standard indexing configuration in the default magnolia distribution which already aggregates paragraph content in the fulltext page... we usually use the area/column name, but if it's doable using nodetypes we could make it generic. It would be nice if we can avoid putting a gun in our user's hand, even if we don't force them to use it

Generated at Mon Feb 12 03:52:14 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.