[EEPUBLISH-28] Recurring Problem With Node Locking After Publishing is Causing Closed Sessions Created: 21/Aug/20  Updated: 29/Mar/22  Resolved: 30/Apr/21

Status: Closed
Project: Publishing Transactional
Component/s: None
Affects Version/s: None
Fix Version/s: 1.1

Type: Bug Priority: Neutral
Reporter: Julian Nodarse Assignee: Jaroslav Simak
Resolution: Fixed Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: BS_author.log, BS_public.log
Issue Links:
Problem/Incident
Relates
relates to PUBLISHING-90 Set default timeout of 5 minutes for ... Closed
causality
dependency
depends upon PUBLISHING-99 Move away from JCR node locking Closed
relation
is related to EEPUBLISH-31 Unpublished parent reports node locked Closed
is related to PUBLISHING-91 Unpublished parent reports node locked Closed
is related to PUBLISHING-86 Provide an app to clear activation locks Closed
is related to PUBLISHING-88 Keep track of lock owners with sessio... Closed
Template:
Acceptance criteria:
Empty
Task DoD:
[X]* Doc/release notes changes? Comment present?
[X]* Downstream builds green?
[X]* Solution information and context easily available?
[X]* Tests
[X]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Bug DoR:
[X]* Steps to reproduce, expected, and actual results filled
[X]* Affected version filled
Date of First Response:
Epic Link: Support
Sprint: Maintenance 22, Maintenance 23, Maintenance 24, Maintenance 25, Maintenance 26, Maintenance 33, Maintenance 34, HL & LD 17, HL & LD 18, HL & LD 19, HL & LD 20, HL & LD 22, HL & LD 23, HL & LD 24, HL & LD 25, HL & LD 26, HL & LD 27, HL & LD 28
Story Points: 8
Team: Nucleus

 Description   

Steps to reproduce

  1. No known steps to reproduce; the problem occurs intermittently on certain environments.

Expected results

  1. Content is published and unlocked on the public instance.

Actual results

  1. Content publication fails and the node remains locked on the public instance.

Workaround

  1. Log in to the public instance(s), remove the affected node(s), and republish the content.
  2. Use the publishing tools to clean mgnlSystem on the public instance(s)
    OR
  3. Restart the affected public instances.

Development notes

This is an investigation ticket; see also the tickets linked to PUBLISHING-86.
The investigation is timeboxed to 5 story points.

Nodes repeatedly remain locked after publishing. Something appears to break the locks on these nodes, and sessions end up closed as a result.

An example of the error:

2020-08-18 16:50:44,524 WARN  org.apache.jackrabbit.core.lock.LockManagerImpl   : Unable to remove session-scoped lock on node ‘2aaf8f48-bff6-4cb7-a468-f8a95be0c68f-5’: This session has been closed. See the chained exception for a trace of where the session was closed
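
When triaging the attached logs, it can help to collect the identifiers of the nodes named in warnings like the one above. The following is a minimal sketch, not part of any fix in this ticket: the class name `LockWarningParser` and the method `extractNodeId` are hypothetical, and the pattern assumes the message wording shown in the example line (it accepts both straight and typographic quotes around the identifier).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for pulling node identifiers out of lock warnings. */
public class LockWarningParser {

    // Assumes the message wording from the example log line. The identifier
    // is whatever sits between the quotes (straight or typographic).
    private static final Pattern NODE_ID = Pattern.compile(
        "Unable to remove session-scoped lock on node\\s+"
            + "['\u2018\u2019]([^'\u2018\u2019]+)['\u2018\u2019]");

    /** Returns the quoted node identifier, or empty if the line does not match. */
    public static Optional<String> extractNodeId(String logLine) {
        Matcher m = NODE_ID.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "2020-08-18 16:50:44,524 WARN  "
            + "org.apache.jackrabbit.core.lock.LockManagerImpl   : "
            + "Unable to remove session-scoped lock on node "
            + "'2aaf8f48-bff6-4cb7-a468-f8a95be0c68f-5': "
            + "This session has been closed.";
        System.out.println(extractNodeId(line).orElse("no match"));
    }
}
```

Running this over BS_author.log and BS_public.log line by line would give a list of affected identifiers to compare against the nodes that stay locked on the public instance.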

The usual fixes for locked-node issues have been tried, but the problem persists in some environments.
 



 Comments   
Comment by Thomas Duffey [ 28/Aug/20 ]

In case this helps, we're mostly experiencing this problem within the assets manager, usually after we've moved a bunch of assets from one folder to another. Sometimes we let them sit for a while after moving before we publish the move. When we publish, more often than not a few of the assets work OK (sometimes just one) and then the next one fails due to a locked node. Our solution, which can be extremely painful when moving hundreds or thousands of assets, is to then go delete the parent folder on the public instances and try again. Usually another asset or two work and then the folder locks again. Rinse and repeat, which can take forever.

One thing we noticed is that unpublishing does not seem to have this issue. We can move assets around and then click "unpublish", and they get removed from their old location just fine without locking. We're still experimenting to see if we can then republish without getting the node locked.

Comment by Richard Gange [ 28/Aug/20 ]

I have added some extra session tracking to the locks. See this commit.

My thinking is that we give the locks IDs. Then, if we hit a case where we cannot acquire the lock, the ID (or owner) is printed in the TRACE log.
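
The idea of tagging each lock with an ID and owner can be sketched in plain Java as follows. This is an illustration only, not the code from the referenced commit: `TrackedLockRegistry`, `LockInfo`, and all method names are hypothetical, an in-memory map stands in for the real lock store, and `java.util.logging` level `FINEST` is used as a stand-in for TRACE.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical in-memory registry that records who holds each lock. */
public class TrackedLockRegistry {

    private static final Logger LOG =
        Logger.getLogger(TrackedLockRegistry.class.getName());

    /** Lock ID plus owner (for example, a session identifier). */
    public static final class LockInfo {
        final String lockId;
        final String owner;

        LockInfo(String lockId, String owner) {
            this.lockId = lockId;
            this.owner = owner;
        }
    }

    private final Map<String, LockInfo> locks = new ConcurrentHashMap<>();

    /** Returns the new lock ID, or null if the path is already locked. */
    public String tryLock(String path, String owner) {
        LockInfo candidate = new LockInfo(UUID.randomUUID().toString(), owner);
        LockInfo existing = locks.putIfAbsent(path, candidate);
        if (existing != null) {
            // FINEST stands in for TRACE: report who is blocking us.
            LOG.log(Level.FINEST, "Cannot lock {0}: held by lock {1} (owner {2})",
                new Object[] {path, existing.lockId, existing.owner});
            return null;
        }
        return candidate.lockId;
    }

    /** Releases the lock only if the caller presents the matching lock ID. */
    public boolean unlock(String path, String lockId) {
        LockInfo existing = locks.get(path);
        return existing != null
            && existing.lockId.equals(lockId)
            && locks.remove(path, existing);
    }

    /** Returns the current owner of the lock on the path, or null if unlocked. */
    public String ownerOf(String path) {
        LockInfo info = locks.get(path);
        return info == null ? null : info.owner;
    }
}
```

With this shape, a failed `tryLock` leaves a trace entry naming the blocking lock and its owner, which is the information needed to tell a stale lock from one that is legitimately held.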

Generated at Mon Feb 12 10:38:02 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.