[MGNLDAM-667] Fix full text dam search (Assets app) Created: 11/Aug/16  Updated: 17/Feb/21  Resolved: 29/Mar/18

Status: Closed
Project: Magnolia DAM Module
Component/s: DAM JCR Provider, DAM Templating
Affects Version/s: 2.1.6
Fix Version/s: 2.3.1

Type: Bug Priority: Major
Reporter: Richard Gange Assignee: Federico Grilli
Resolution: Fixed Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File dam-workspace.png    
Issue Links:
duplicate
is duplicated by MGNLDAM-737 Fix Assets app search to handle searc... Closed
relation
is related to MAGNOLIA-7123 Full text search in documents (pdf, d... Closed
is related to MAGNOLIA-7398 Full text search in documents (pdf, ... Closed
is related to MGNLDAM-939 JCR DAM has fixed nodetypes for node ... Closed
is related to MAGNOLIA-7847 Targeted indexing configurations Open
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Bug DoR:
[ ]* Steps to reproduce, expected, and actual results filled
[ ]* Affected version filled
Release notes required:
Yes
Date of First Response:
Sprint: Basel 139
Story Points: 3

 Description   

Currently you can use SearchTemplatingFunctions (searchfn) to search the DAM. The problem is that the binary data itself is stored on a resource subnode. So if I have a PDF and search mgnl:asset nodes, only the metadata of the asset node is searched, and I cannot find terms located in the binary data. On the other hand, if I search mgnl:resource nodes, I will not find terms located in the asset metadata.

Two solutions to the problem could be:

  1. We add a searchAssets() function to the DamTemplatingFunctions (damfn) that considers both levels of data and wraps everything into a single call.
  2. We add another aggregation entry to the indexing_configuration.xml:
    <aggregate primaryType="mgnl:asset">
        <include primaryType="mgnl:resource">*</include>
    </aggregate>
    

With the second method we can keep using the SearchTemplatingFunctions, since all the data is aggregated into a single document. This is the preferred approach from an efficiency standpoint.

 

As a result of the fix, asset search in the Assets app now works properly.



 Comments   
Comment by Richard Gange [ 11/Aug/16 ]

For a robust asset search you will need to aggregate the content of both the asset node and the resource node into a single document, as we do in the website workspace. See https://wiki.magnolia-cms.com/display/WIKI/Search+Index+Configuration+File.

The indexing_configuration.xml file will look like this:

<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://jackrabbit.apache.org/dtd/indexing-configuration-1.2.dtd">
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0" xmlns:mgnl="http://www.magnolia.info/jcr/mgnl" xmlns:jcr="http://www.jcp.org/jcr/1.0">
  <!--
      A global, generic indexing configuration used for all workspaces in Magnolia.
      It excludes some well known properties from the node scope fulltext index.
  -->
  <index-rule nodeType="nt:base">
    <property isRegexp="true" nodeScopeIndex="false">mgnl:.*</property>
    <property isRegexp="true" nodeScopeIndex="false">jcr:.*</property>
    <property isRegexp="true">.*:.*</property>
  </index-rule>

  <aggregate primaryType="mgnl:asset">
    <include primaryType="mgnl:resource">*</include>
  </aggregate>
</configuration>

You will need to do the following:

  1. Shut down Magnolia.
  2. Create an indexing_configuration.xml file and move it into the dam workspace folder.
  3. Configure your dam workspace.xml to use the new indexing configuration. See https://wiki.magnolia-cms.com/display/WIKI/Jackrabbit+Workspace+Configuration+File#JackrabbitWorkspaceConfigurationFile-IndexingConfiguration.
  4. Rename the current index folder in the dam workspace (we need to re-index with the new configuration).
  5. Start Magnolia back up. That will trigger the re-indexing.
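
For step 3, the relevant part of the dam workspace.xml might look roughly like the sketch below. This is an illustration, not the shipped configuration: the ${wsp.home}/indexing_configuration.xml location assumes the file was placed directly in the dam workspace folder as in step 2, and the path value and any other SearchIndex parameters already in your file should be left as they are.

```xml
<!-- Sketch of the SearchIndex section in the dam workspace.xml.
     Existing parameters not shown here stay unchanged. -->
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <!-- Location of the Lucene index; this is the folder renamed in step 4 -->
  <param name="path" value="${wsp.home}/index" />
  <!-- Assumed location: indexing_configuration.xml sits in the dam workspace folder -->
  <param name="indexingConfiguration" value="${wsp.home}/indexing_configuration.xml" />
</SearchIndex>
```

If the index folder is not renamed or removed before startup, the existing index is typically reused and the new aggregate rule will not apply to assets that are already indexed.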
Comment by Richard Gange [ 18/Dec/19 ]

It should be noted that users need to update their workspace.xml for the dam. See the 5.6.6 release notes.

<param name="indexingConfiguration" value="/info/magnolia/jackrabbit/indexing_configuration_dam.xml" />
Generated at Mon Feb 12 05:02:05 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.