[MGNLEESOLR-152] Clean command should also delete pages with robots meta-tag set to "noindex" Created: 02/Feb/21  Updated: 22/Sep/21  Resolved: 09/Feb/21

Status: Closed
Project: Solr Search Provider
Component/s: None
Affects Version/s: None
Fix Version/s: 5.5.4

Type: Improvement Priority: Major
Reporter: Milan Divilek Assignee: Milan Divilek
Resolution: Done Votes: 1
Labels: cs-bk, maintenance, quickwin
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screenshot 2021-02-09 at 09.32.42.png    
Issue Links:
documentation
to be documented by MGNLEESOLR-154 DOC: Solr module clean index command Closed
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Release notes required:
Yes
Documentation update required:
Yes
Date of First Response:
Sprint: Maintenance 43
Story Points: 2

 Description   

The problem arises when a page is published as usual and is therefore indexed in Solr. If editors later decide that the page should no longer appear in search results, they can set a checkbox for this. When the page is rendered again, it gets the robots:noindex attribute and is no longer rendered in the sitemap (which is used as the entry point for the Solr crawler). As a result, the page stays indexed in Solr: the crawler never reaches it again, so its entry is neither updated nor deleted.
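
The check that the clean command needs could look roughly like the sketch below. It is illustrative only and is not the module's actual implementation. The names `RobotsMetaParser`, `has_noindex` and `urls_to_delete` are hypothetical, and the sketch assumes the command can obtain the currently rendered HTML for every indexed URL (or `None` when the page no longer exists). No Solr calls are made; the result is just the list of URLs that a clean run would then remove from the index.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots" content="..."> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None (e.g. <meta name>), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def urls_to_delete(indexed: Dict[str, Optional[str]]) -> List[str]:
    """Pick the indexed URLs a clean run should remove.

    `indexed` maps each indexed URL to its current HTML, or to None if the
    page is gone (assumed input shape, not taken from the module).
    """
    return [
        url
        for url, html in indexed.items()
        if html is None or has_noindex(html)
    ]
```

Checking the rendered page rather than the sitemap matters here, because a page marked noindex has already dropped out of the sitemap and would otherwise never be revisited.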


Generated at Mon Feb 12 11:00:37 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.