[MGNLSSO-322] Flat structure of profiles can lead to indexing issues Created: 20/Nov/23  Updated: 26/Jan/24

Status: In Progress
Project: Single Sign On
Component/s: None
Affects Version/s: 3.1.8
Fix Version/s: None

Type: Bug Priority: Neutral
Reporter: Richard Gange Assignee: Nguyen Phung Chi
Resolution: Unresolved Votes: 0
Labels: SSO_and_Security_Initiative, cs-bk
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to MAGNOLIA-9057 Repository hygiene Open
relates to MGNLSSO-328 Enable SSO users to edit their locale... Open
relates to ADMINCTR-523 Re-structure profiles workspace and d... In Progress
dependency
depends upon MAGNOLIA-7082 Provide framework functionality to su... Accepted
depends upon MAGNOLIA-9264 Provide a foundation to manage user p... In Progress
relation
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Bug DoR:
[ ]* Steps to reproduce, expected, and actual results filled
[ ]* Affected version filled
Date of First Response:
Work Started:

Description

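We have seen that once the number of profiles reaches the thousands, the flat structure of the profiles workspace can lead to corruption of the Lucene search index. The error below was logged while SsoUserManager.updateLastAccessTimestamp was running during login: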

Caused by: java.io.EOFException: read past EOF: SimpleFSIndexInput(path="/magnolia/repositories/magnolia/workspaces/profiles/index/_vh/_0.cfs")
	at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(SimpleFSDirectory.java:125) ~[lucene-core-3.6.0.jar:3.6.0 1310449 - rmuir - 2012-04-06 11:31:16]
	... 206 more
ERROR org.apache.jackrabbit.core.SearchManager 16.11.2023 12:58:56 -- Error indexing node.
java.io.IOException: read past EOF: SimpleFSIndexInput(path="/magnolia/repositories/magnolia/workspaces/profiles/index/_vh/_0.cfs"): SimpleFSIndexInput(path="/magnolia/repositories/magnolia/workspaces/profiles/index/_vh/_0.cfs")
	at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(SimpleFSDirectory.java:140) ~[lucene-core-3.6.0.jar:3.6.0 1310449 - rmuir - 2012-04-06 11:31:16]
	at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:270) ~[lucene-core-3.6.0.jar:3.6.0 1310449 - rmuir - 2012-04-06 11:31:16]
	at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:136) ~[lucene-core-3.6.0.jar:3.6.0 1310449 - rmuir - 2012-04-06 11:31:16]
	at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:111) ~[lucene-core-3.6.0.jar:3.6.0 1310449 - rmuir - 2012-04-06 11:31:16]
...
...
	at info.magnolia.sso.SsoUserManager.updateLastAccessTimestamp(SsoUserManager.java:130) ~[magnolia-sso-3.1.5.jar:?]
	at info.magnolia.sso.jaas.SsoAuthenticationModule.validateUser(SsoAuthenticationModule.java:120) ~[magnolia-sso-3.1.5.jar:?]
	at info.magnolia.sso.jaas.SsoAuthenticationModule.login(SsoAuthenticationModule.java:90) ~[magnolia-sso-3.1.5.jar:?]

DEV NOTE: let's use the same storage mechanism as PUR to start with.
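
To illustrate the direction of the dev note, here is a minimal sketch of a non-flat layout. It assumes that "the same storage mechanism as PUR" means bucketing nodes under folders derived from the leading characters of the name (for example /e/er/eric); that reading is an assumption, not something this ticket confirms. The class ProfilePaths and the method hierarchicalPath are hypothetical and not part of the magnolia-sso module.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of magnolia-sso. */
public final class ProfilePaths {

    private ProfilePaths() {
    }

    /**
     * Builds a bucketed path such as /e/er/eric so that no single parent
     * node has to hold thousands of direct children.
     * Simplification: the name is used as-is and is not escaped for
     * characters that are illegal in JCR node names.
     */
    public static String hierarchicalPath(String profileName) {
        if (profileName == null || profileName.isEmpty()) {
            throw new IllegalArgumentException("profileName must not be empty");
        }
        // Bucket names are lower-cased so "Eric" and "eric" share a folder.
        String lower = profileName.toLowerCase(Locale.ROOT);
        String first = lower.substring(0, 1);
        // One-character names reuse the first bucket as the second level.
        String firstTwo = lower.length() > 1 ? lower.substring(0, 2) : first;
        return "/" + first + "/" + firstTwo + "/" + profileName;
    }

    public static void main(String[] args) {
        System.out.println(hierarchicalPath("eric"));
    }
}
```

With such a layout the number of children per folder depends on how names are distributed across prefixes, so the bucketing depth may need tuning for real data.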

Workaround
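Comment out the SearchIndex configuration in the workspace.xml file of the profiles workspace. Note that with the SearchIndex element removed, the workspace is no longer indexed, so queries against it will not work: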

    <!--SearchIndex class="info.magnolia.jackrabbit.lucene.SearchIndex">
      <param name="path" value="${wsp.home}/index"/>
      <param name="indexingConfiguration" value="/info/magnolia/jackrabbit/indexing_configuration_${wsp.name}.xml"/>
      <param name="useCompoundFile" value="true"/>
      <param name="minMergeDocs" value="100"/>
      <param name="volatileIdleTime" value="3"/>
      <param name="maxMergeDocs" value="100000"/>
      <param name="mergeFactor" value="10"/>
      <param name="maxFieldLength" value="10000"/>
      <param name="bufferSize" value="10"/>
      <param name="cacheSize" value="1000"/>
      <param name="forceConsistencyCheck" value="false"/>
      <param name="autoRepair" value="true"/>
      <param name="queryClass" value="org.apache.jackrabbit.core.query.QueryImpl"/>
      <param name="respectDocumentOrder" value="true"/>
      <param name="resultFetchSize" value="100"/>
      <param name="extractorPoolSize" value="3"/>
      <param name="extractorTimeout" value="100"/>
      <param name="extractorBackLogSize" value="100"/>
      <param name="supportHighlighting" value="true"/>
      <param name="excerptProviderClass" value="info.magnolia.jackrabbit.lucene.SearchHTMLExcerpt"/>
    </SearchIndex -->

Generated at Mon Feb 12 10:53:10 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.