[MGNLCI-25] Automatic bootstrapping and configurable import-task strategy Created: 29/Jun/20  Updated: 08/Sep/20  Resolved: 24/Jul/20

Status: Closed
Project: Content Importer
Component/s: None
Affects Version/s: None
Fix Version/s: 1.0.4

Type: Bug Priority: Major
Reporter: Christopher Zimmermann Assignee: Robert Šiška
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File flow.png    
Issue Links:
causality
caused by MGNLCI-22 Bootstrapping/importing content with ... Closed
is causing MGNLCI-23 DOC: Bootstrapping content from light... Closed
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Bug DoR:
[ ]* Steps to reproduce, expected, and actual results filled
[ ]* Affected version filled
Release notes required:
Yes
Documentation update required:
Yes
Date of First Response:
Sprint: HL & LD 6, HL & LD 7, HL & LD 8
Story Points: 8

 Description   

MGNLCI-22 adds the capability to provision bootstrap files while the system is not running. However, the current implementation has a flaw: additional import tasks are created at every startup, which is irritating and confusing to developers and administrators.

Two possible approaches to address this (maybe there are others):

1. The new 'magnolia.content.bootstrap.initial=true' property should cause detected files to be imported only when the Magnolia instance is installed, not at every startup. This would be most consistent with the known behaviour of the bootstrap files that modules supply or that are placed in the 'WEB-INF/bootstrap' directory.

2. The system could keep track of which files have been imported, so that at every startup it would only import files that it had not already imported. This is related to what has been reported in MGNLCI-14.
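A minimal sketch of approach 2, assuming the record of imported files is a simple mapping from file name to content digest. The function names and the in-memory dict are hypothetical and only illustrate the idea; they are not part of the content-importer module. As an additional assumption, a file whose content has changed since it was recorded is treated as not yet imported.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_import(directory: Path, already_imported: dict[str, str]) -> list[Path]:
    """Return files that are new, or whose content changed since they were recorded.

    `already_imported` maps a file name to the digest recorded at import time
    (hypothetical bookkeeping, kept in memory here for illustration).
    """
    pending = []
    for path in sorted(directory.iterdir()):
        if not path.is_file():
            continue
        if already_imported.get(path.name) != file_digest(path):
            pending.append(path)
    return pending


def record_import(path: Path, already_imported: dict[str, str]) -> None:
    """Remember that `path` was imported with its current content."""
    already_imported[path.name] = file_digest(path)
```

At startup the system would call `files_to_import`, import each returned file, and then call `record_import` for it, so an unchanged file is never picked up twice.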



 Comments   
Comment by Robert Šiška [ 08/Jul/20 ]

Current progress:

Changes implemented in MGNLCI-22 still apply with one difference: the initial bootstrap is only triggered if the instance (or rather the content-importer module) was freshly installed.

Bootstrap is not triggered when the module was only updated.

To bootstrap again, one can set the /modules/content-importer/config/initialBootstrap property to true.

Comment by Christopher Zimmermann [ 09/Jul/20 ]

Sounds good.

Comment by Christopher Zimmermann [ 09/Jul/20 ]

Can I achieve this? As a developer, I want

  • on install - everything in the content-importer directory is bootstrapped with no tasks, on both author and public.
  • after install - everything added to the content-importer directory is detected, and tasks are created only on author.

Context:

As a developer, I typically never log into the public instance. The expected practice is that I drop files in the directory, see the tasks in AdminCentral, approve the tasks, and then go to that content and publish it to the public instance. If tasks are always created on public instances, it would be confusing when a developer logs into a public instance and finds many import tasks there (and they might not know that the content was already published from the author, etc.). Not a giant problem.

Comment by Christopher Zimmermann [ 10/Jul/20 ]

I'd like to rename magnolia.content.bootstrap.initial, as I don't think it's clear from the name what it does.

I propose
magnolia.content.bootstrap.noTasksAtInstall
or
magnolia.content.bootstrap.noTasksOnInstall

 

Comment by Christopher Zimmermann [ 10/Jul/20 ]

It would also be useful if there were a way for developers to opt to always import without tasks.

Maybe another configuration property like:

magnolia.content.bootstrap.noTasks=true (default false)

Comment by Robert Šiška [ 13/Jul/20 ]

czimmermann
It seems to me that what people generally want from this module is behavior similar to light-modules: just drop content into a folder to import it. Also, there's really no reason to have more than one import task per node. So this is the process I propose. It's not that difficult, but I made it into a picture so that it's clear.

Question remains what should be configurable, what should be the defaults, etc.

About bootstrapping into content-type defined workspaces - when both the contentType definition and the content are present at instance startup, it works. But when you just drop a light-module with content, the content can get picked up first (although it usually doesn't, from my experience). We had this exact problem with apps and contentTypes and we found some workaround, but I'd rather discourage users from mixing light-modules and content. In that scenario, nobody would expect it to work when you import content first.

About public/author issues - I guess there could be some checks in code whether you're on author, but wouldn't it be easier to just disable content bootstrapping on public through magnolia properties? That would also be much clearer if we encouraged keeping content and light-modules in separate locations.

Comment by Christopher Zimmermann [ 13/Jul/20 ]

Thanks for the diagram. Process looks good.

Re issues with dropping a light-module with content on a running instance: we can certainly live without it and evaluate later. Personally I think light-modules should be able to include content, but there is a lot of disagreement there, so for now I recommend the pattern of not including bootstraps in light-modules.

Re disabling content bootstrapping on public: Sadly this just becomes way too difficult for a developer. You could easily have tens of bootstrap files. Processing each task takes a minimum of 3 clicks. And then you need to know where all of these imports are and go to those apps to publish them all.

A 'magnolia.content.bootstrap.noTasksOnInstall' property mitigates this.

We should in the future look into:

  • Content Importer can create a 'ParentTask' which processes all the other ones if you accept it.
  • Content importer tasks give option to "import and push to public".
  • A Simplified task UI.
  • Checkboxes for bulk processing of tasks.

Comment by Robert Šiška [ 15/Jul/20 ]

What's been implemented:
 
There are two new properties, magnolia.content.bootstrap.onlyImportAtInstall and magnolia.content.bootstrap.createTasks, which allow for several import strategies, like:

  • most restrictive (onlyImportAtInstall=true and createTasks=always) - Content is not imported at every startup, but only at install. No content gets bootstrapped automatically.
  • createTasks=onchange - Content gets bootstrapped automatically if the path doesn't exist in JCR, otherwise a task is created.
  • most permissive (createTasks=never) - Content is bootstrapped at every startup and after every change.

Defaults are onlyImportAtInstall=false and createTasks=always. Valid values for createTasks are always, onchange & never.
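
The combinations above can be read as a small decision function. This is only a sketch of one reading of the comment, not the module's actual code: the function name, its parameters and the returned labels ("skip", "import", "task") are hypothetical. It assumes that onlyImportAtInstall=true means detected files are ignored on any startup that is not an install, and that createTasks is then applied to whatever is still processed.

```python
VALID_CREATE_TASKS = ("always", "onchange", "never")


def decide(create_tasks: str, only_import_at_install: bool,
           is_install: bool, path_exists_in_jcr: bool) -> str:
    """Return "skip", "import" or "task" for one detected bootstrap file.

    Assumed reading of the two properties; all names here are illustrative.
    """
    if create_tasks not in VALID_CREATE_TASKS:
        raise ValueError(f"createTasks must be one of {VALID_CREATE_TASKS}")
    # onlyImportAtInstall=true: nothing is processed outside of an install.
    if only_import_at_install and not is_install:
        return "skip"
    # createTasks=never: content is bootstrapped directly, no task.
    if create_tasks == "never":
        return "import"
    # createTasks=onchange: bootstrap directly only if the path is new to JCR.
    if create_tasks == "onchange" and not path_exists_in_jcr:
        return "import"
    # createTasks=always, or onchange with an existing path: create a task.
    return "task"
```

With the stated defaults (onlyImportAtInstall=false, createTasks=always), every detected file results in a task under this reading.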

Generated at Mon Feb 12 00:22:41 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.