-
Task
-
Resolution: Done
CM & OC 23
-
2
https://documentation.magnolia-cms.com/display/DOCS57/Indexing+and+crawling+a+website+with+Solr
By default, the crawler mechanism is chained with info.magnolia.search.solrsearchprovider.logic.commands.CleanSolrIndexCommand, which runs before the crawler command and removes outdated documents (pages) from the index.
Configuration options:
- max - maximum number of documents that will be checked - by default set to 500 - since 5.0.1
- onlyHead - only the head is requested instead of fetching the whole document - by default set to false - ignored if the deleteNoIndex property is set to true, because the robots meta tag cannot be resolved from a head request - since 5.5.1
- followRedirects - if set to true, redirects are followed and the status code of the final page is evaluated - by default set to false - since 5.5.2
- statusCodes - list of status codes; if a page returns any of the configured status codes, it is removed from the index - by default empty, but a page returning 404 is always removed - since 5.0.1
- deleteNoIndex - if set to true, pages with the robots meta tag set to noindex are also removed from the index - by default set to true - since 5.5.4
- skipIfAlreadyRunning - by default, if the clean command is already running for the crawler, it is stopped and a new one is started. If set to true, the previously running clean command is allowed to finish and the new one is skipped - since 5.5.1
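The way these options interact can be easier to follow as code. The following Python sketch is only an illustration of the rules described in the list above; it is not the Magnolia implementation, and the names CleanConfig, effective_only_head and should_remove are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CleanConfig:
    """Hypothetical holder for the options above, with the documented defaults."""
    max_documents: int = 500          # "max"
    only_head: bool = False           # "onlyHead"
    follow_redirects: bool = False    # "followRedirects"
    status_codes: set = field(default_factory=set)  # "statusCodes", empty by default
    delete_no_index: bool = True      # "deleteNoIndex"


def effective_only_head(cfg: CleanConfig) -> bool:
    # onlyHead is ignored when deleteNoIndex is true: a head request
    # cannot expose the robots meta tag.
    return cfg.only_head and not cfg.delete_no_index


def should_remove(cfg: CleanConfig, status_code: int, robots_noindex: bool = False) -> bool:
    # 404 always leads to removal, regardless of the configured statusCodes.
    if status_code == 404 or status_code in cfg.status_codes:
        return True
    # Pages marked noindex are removed only when deleteNoIndex is enabled.
    return cfg.delete_no_index and robots_noindex
```

With the defaults, should_remove(CleanConfig(), 200, robots_noindex=True) evaluates to True, while the same call with delete_no_index=False evaluates to False.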
Acceptance criteria
- documents
-
MGNLEESOLR-152 Clean command should delete also pages with robots meta-tag set to "noindex"
- Closed
- is cloned by
-
MGNLEESOLR-157 DOC: Port 5.7 doc update for Solr module clean index command to 6.2 docu repo
- Closed