[TXTREC-87] Provide a way to disable text classification Created: 29/Dec/20  Updated: 02/Feb/24  Resolved: 30/Nov/22

Status: Closed
Project: Text Classification
Component/s: None
Affects Version/s: 1.1.2, 2.0
Fix Version/s: 1.1.6, 2.0.0

Type: Improvement Priority: Neutral
Reporter: Richard Gange Assignee: Miguel Martinez
Resolution: Done Votes: 0
Labels: VN-Testing, aws
Σ Remaining Estimate: 0d Remaining Estimate: 0d
Σ Time Spent: 4.75h Time Spent: 4.75h
Σ Original Estimate: Not Specified Original Estimate: Not Specified

Issue Links:
Relates
causality
relation
is related to TXTREC-88 Don't pin modules to a specific version Open
Sub-Tasks:
Key         Summary                                    Type                Status     Assignee
TXTREC-100  DOC: Describe how to disable the text...   Documentation Task  Completed  Adrian Brooks
TXTREC-101  Port changes to 2.0                        Sub-task            Completed
TXTREC-102  QA                                         Sub-task            Completed  Lam Nguyen Bao
Template:
Acceptance criteria:
Empty
Task DoD:
[X]* Doc/release notes changes? Comment present?
[X]* Downstream builds green?
[X]* Solution information and context easily available?
[X]* Tests
[X]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Release notes required:
Yes
Documentation update required:
Yes
Date of First Response:
Epic Link: AuthorX Support
Sprint: AuthX 22
Story Points: 3
Team: AuthorX

 Description   

Issue 1
Provide a way to disable this feature. A customer might want to use other Amazon connector modules. However, if AWS credentials are found in the Passwords app, text classification is enabled. For some customers this slows down startup, since the system pauses until the classification process completes:

2020-12-29 10:59:40,242 INFO  info.magnolia.ai.text.TextClassificationModule    : Text classification might take some time, please do not shut down your instance.
2020-12-29 10:59:40,302 INFO  info.magnolia.ai.text.TextClassificationModule    : Number of untagged nodes: 42

I tried changing the config files here:

  • /text-classification/config.yaml
    aggregateDefinition:
      fieldTypes: []
    termFilteringDefinition:
      excludedTerms: []
    
  • /pages-content-tags-integration/decorations/text-classification/config/config.yaml

Neither change seems to stop the feature from running. You have to uninstall the module.

Issue 2
Also, some users may not want to use AWS at all but still get a confusing error message in the log, which they might mistake for the cause of other issues:

ERROR info.magnolia.aws.foundation.AwsCredentialsProvider 26.01.2021 13:35:55 - AWS credentials are expected to be set in Password manager module.
ERROR info.magnolia.ai.text.TextClassificationModule 26.01.2021 13:35:55 - Submission of text classification request has been failed at page path '/bdcwebsite/bdcchat/bdcchat2' with error code: 'null'.

Proposed solution
I would prefer to be able to hotfix /text-classification/config.yaml. The PR is attached to this ticket:

# turn off the module with this property
enabled: false
aggregateDefinition:
  fieldTypes: [text, textField, richText, richTextField, composite, compositeField, switchable, switchableField]
termFilteringDefinition:
  excludedTerms: []

This would result in the following log entry:

2020-12-29 13:11:11,177 INFO  info.magnolia.ai.text.TextClassificationModule    : Text classification module is disabled.
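
To illustrate the intent of the `enabled` property, here is a minimal Java sketch of a startup guard. It is not the code from the attached PR: `EnabledFlagSketch`, its nested `Config` class, and the `start` method are hypothetical names, and the default of `true` is an assumption made so that existing installations would keep their current behaviour. Only the two log messages are taken from this ticket.

```java
final class EnabledFlagSketch {

    /** Hypothetical stand-in for the values read from /text-classification/config.yaml. */
    static final class Config {
        // Assumed default: enabled, so existing installations keep their behaviour.
        boolean enabled = true;
    }

    static final String DISABLED_MESSAGE = "Text classification module is disabled.";
    static final String RUNNING_MESSAGE =
            "Text classification might take some time, please do not shut down your instance.";

    /**
     * Decides what happens at startup and returns the line that would be logged.
     * When the flag is off, the method returns before any credential lookup or tagging.
     */
    static String start(Config config) {
        if (!config.enabled) {
            return DISABLED_MESSAGE;
        }
        // The real module would classify untagged nodes here.
        return RUNNING_MESSAGE;
    }
}
```

Checking the flag first means a disabled module would neither pause startup nor touch AWS credentials, which should also avoid the error messages described in Issue 2.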


 Comments   
Comment by Roman Kovařík [ 29/Dec/20 ]

Any reason not to override workspaceClassificationConfigurations?

Looking at the code, this one could make the list of workspaces to tag empty.

Comment by Richard Gange [ 29/Dec/20 ]

So changing this config file /pages-content-tags-integration/decorations/text-classification/config/config.yaml:

workspaceClassificationConfigurations: [ ]

Ends up with this exception:

2020-12-29 12:25:57,066 INFO  info.magnolia.ai.text.TextClassificationModule    : Text classification might take some time, please do not shut down your instance.
2020-12-29 12:25:57,069 ERROR info.magnolia.event.SimpleEventBus                : Exception caught when dispatching info.magnolia.module.ModulesStartedEvent with info.magnolia.ai.text.TextClassificationModule$$Lambda$281/1773156898 eventHandler.
java.lang.NullPointerException: null
	at java.util.Hashtable.get(Hashtable.java:364) ~[?:1.8.0_261]
	at info.magnolia.repository.WorkspaceMapping.getWorkspaceMapping(WorkspaceMapping.java:124) ~[magnolia-core-6.2.5.jar:?]
	at info.magnolia.repository.DefaultRepositoryManager.getSystemSession(DefaultRepositoryManager.java:319) ~[magnolia-core-6.2.5.jar:?]
	at info.magnolia.context.SystemRepositoryStrategy.internalGetSession(SystemRepositoryStrategy.java:54) ~[magnolia-core-6.2.5.jar:?]
	at info.magnolia.context.AbstractRepositoryStrategy.getSession(AbstractRepositoryStrategy.java:75) ~[magnolia-core-6.2.5.jar:?]
	at info.magnolia.context.AbstractContext.getJCRSession(AbstractContext.java:124) ~[magnolia-core-6.2.5.jar:?]
	at info.magnolia.ai.text.TaggingHelper.getAllUntaggedNodePaths(TaggingHelper.java:61) ~[magnolia-text-classification-1.1.2.jar:?]
	at info.magnolia.ai.text.TextClassificationModule.runClassification(TextClassificationModule.java:128) ~[magnolia-text-classification-1.1.2.jar:?]
	at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_261]
	at info.magnolia.ai.text.TextClassificationModule.start(TextClassificationModule.java:121) ~[magnolia-text-classification-1.1.2.jar:?]
	at info.magnolia.ai.text.TextClassificationModule.lambda$new$0(TextClassificationModule.java:106) ~[magnolia-text-classification-1.1.2.jar:?]
	at info.magnolia.module.ModulesStartedEvent.dispatch(ModulesStartedEvent.java:46) ~[magnolia-core-6.2.5.jar:?]
	at info.magnolia.module.ModulesStartedEvent.dispatch(ModulesStartedEvent.java:42) ~[magnolia-core-6.2.5.jar:?]
	at info.magnolia.event.SimpleEventBus.fireEvent(SimpleEventBus.java:75) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.module.ModuleManagerImpl.startModules(ModuleManagerImpl.java:352) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.module.ui.ModuleManagerWebUI.onStartup(ModuleManagerWebUI.java:78) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.cms.beans.config.ConfigLoader.load(ConfigLoader.java:146) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.init.MagnoliaServletContextListener$1.doExec(MagnoliaServletContextListener.java:259) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.context.MgnlContext$VoidOp.exec(MgnlContext.java:407) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.context.MgnlContext$VoidOp.exec(MgnlContext.java:404) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.context.MgnlContext.doInSystemContext(MgnlContext.java:378) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.init.MagnoliaServletContextListener.startServer(MagnoliaServletContextListener.java:256) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.init.MagnoliaServletContextListener.contextInitialized(MagnoliaServletContextListener.java:182) [magnolia-core-6.2.5.jar:?]
	at info.magnolia.init.MagnoliaServletContextListener.contextInitialized(MagnoliaServletContextListener.java:128) [magnolia-core-6.2.5.jar:?]
	at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4678) [catalina.jar:9.0.37]
	at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5139) [catalina.jar:9.0.37]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) [catalina.jar:9.0.37]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1384) [catalina.jar:9.0.37]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1374) [catalina.jar:9.0.37]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_261]
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) [tomcat-util.jar:9.0.37]
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134) [?:1.8.0_261]
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:909) [catalina.jar:9.0.37]
	at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:841) [catalina.jar:9.0.37]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) [catalina.jar:9.0.37]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1384) [catalina.jar:9.0.37]
	at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1374) [catalina.jar:9.0.37]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_261]
	at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75) [tomcat-util.jar:9.0.37]
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134) [?:1.8.0_261]
	at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:909) [catalina.jar:9.0.37]
	at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:262) [catalina.jar:9.0.37]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) [catalina.jar:9.0.37]
	at org.apache.catalina.core.StandardService.startInternal(StandardService.java:421) [catalina.jar:9.0.37]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) [catalina.jar:9.0.37]
	at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:930) [catalina.jar:9.0.37]
	at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) [catalina.jar:9.0.37]
	at org.apache.catalina.startup.Catalina.start(Catalina.java:738) [catalina.jar:9.0.37]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_261]
	at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:342) [bootstrap.jar:9.0.37]
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:473) [bootstrap.jar:9.0.37]

It seems the only "clean" way to disable it is to remove the module entirely.
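
The top frames of the trace (`Hashtable.get` called from `WorkspaceMapping.getWorkspaceMapping`) are consistent with a null workspace name reaching the lookup, since `java.util.Hashtable.get` throws a `NullPointerException` for a null key. The ticket does not confirm that this is the cause. Under that assumption, a defensive filter could look like the sketch below. `WorkspaceGuardSketch`, `MAPPINGS`, and `usableWorkspaces` are hypothetical names and are not part of Magnolia.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;

final class WorkspaceGuardSketch {

    // Hypothetical stand-in for a workspace lookup table backed by java.util.Hashtable.
    static final Hashtable<String, String> MAPPINGS = new Hashtable<>();

    static {
        MAPPINGS.put("website", "magnolia");
    }

    /** Returns the configured workspace names that are safe to look up. */
    static List<String> usableWorkspaces(List<String> configured) {
        List<String> usable = new ArrayList<>();
        if (configured == null) {
            return usable;
        }
        for (String name : configured) {
            // Hashtable.get(null) throws NullPointerException, so test for null first.
            if (name != null && MAPPINGS.get(name) != null) {
                usable.add(name);
            }
        }
        return usable;
    }
}
```

With a filter of this kind, an empty or partly null configuration would yield an empty list of workspaces instead of an exception during startup.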

Generated at Mon Feb 12 11:05:19 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.