[MLEARN-17] Concurrency issues in NN ranker Created: 19/Jun/19 Updated: 14/Mar/22 Resolved: 21/Jul/20 |
|
| Status: | Closed |
| Project: | Machine Learning |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 1.2.1 |
| Type: | Bug | Priority: | Neutral |
| Reporter: | Aleksandr Pchelintcev | Assignee: | Chuong Doan Huy |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | maintenance | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | 1d 1.5h | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Template: |
|
| Acceptance criteria: |
Empty
|
| Task DoD: |
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ] Architecture Decision Record (ADR)
|
| Bug DoR: |
[ ]* Steps to reproduce, expected, and actual results filled
[ ]* Affected version filled
|
| Release notes required: |
Yes
|
||||||||||||||||||||||||||||
| Epic Link: | Periscope improvements |
| Sprint: | Maintenance 15, Maintenance 16 |
| Story Points: | 5 |
| Description |
|
The NN result ranker appears to suffer from an intermittent concurrency issue. The problem seems to be caused by simultaneous unsynchronised read/write access to NeuralNetworkResultRanker#resultTexts (see the stack trace below). Code review shows that when two different browser sessions are authenticated with the same account (e.g. superuser), they share a single NeuralNetworkResultRanker instance, because the corresponding ranker factory caches instances by user. The issue is hard to detect in manual testing and simulations, but it does show up in load tests and occasionally during development. All of the API that allows interaction with the NN result ranker needs to be synchronised.
2019-06-10 13:21:48,899 ERROR gnolia.admincentral.findbar.search.ResultCollector: An error occurred during the search process, therefore an empty collection will be returned.
java.util.concurrent.CompletionException: java.util.ConcurrentModificationException
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) ~[?:1.8.0_144]
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) ~[?:1.8.0_144]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1592) ~[?:1.8.0_144]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_144]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_144]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_144]
Caused by: java.util.ConcurrentModificationException
at org.apache.commons.collections4.map.AbstractLinkedMap$LinkIterator.nextEntry(AbstractLinkedMap.java:552) ~[commons-collections4-4.1.jar:4.1]
at org.apache.commons.collections4.map.AbstractLinkedMap$KeySetIterator.next(AbstractLinkedMap.java:450) ~[commons-collections4-4.1.jar:4.1]
at java.util.AbstractCollection.toArray(AbstractCollection.java:141) ~[?:1.8.0_144]
at java.util.ArrayList.<init>(ArrayList.java:177) ~[?:1.8.0_144]
at info.magnolia.periscope.rank.ml.IndexedBuffer.asList(IndexedBuffer.java:97) ~[magnolia-periscope-result-ranker-1.1-SNAPSHOT.jar:?]
at info.magnolia.periscope.rank.ml.NeuralNetworkResultRanker.outputArrayToResults(NeuralNetworkResultRanker.java:192) ~[magnolia-periscope-result-ranker-1.1-SNAPSHOT.jar:?]
at info.magnolia.periscope.rank.ml.NeuralNetworkResultRanker.rank(NeuralNetworkResultRanker.java:142) ~[magnolia-periscope-result-ranker-1.1-SNAPSHOT.jar:?]
at info.magnolia.periscope.Periscope.fetchSupplierAwareSearchResults(Periscope.java:135) ~[magnolia-periscope-core-1.1-SNAPSHOT.jar:?]
at info.magnolia.periscope.Periscope.lambda$search$0(Periscope.java:113) ~[magnolia-periscope-core-1.1-SNAPSHOT.jar:?]
at info.magnolia.periscope.search.SearchRunner.lambda$execute$0(SearchRunner.java:85) ~[magnolia-periscope-core-1.1-SNAPSHOT.jar:?]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) ~[?:1.8.0_144]
... 3 more
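
The trace above fails while an ArrayList is being built from the key set of a linked map inside IndexedBuffer.asList, which is the typical symptom of one thread iterating while another thread modifies the same map. The following is only a minimal sketch of one possible way to guard such a buffer. It is not the actual IndexedBuffer or the fix shipped in 1.2.1: the class name SynchronizedBufferSketch, its fields, the capacity-based eviction, and the use of java.util.LinkedHashMap in place of the commons-collections4 map are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a buffer shared by several sessions of the same user.
// Every method that reads or writes the map takes the same monitor (this),
// so an iteration can never overlap with a modification.
final class SynchronizedBufferSketch<T> {

    private final int capacity;
    // Insertion-ordered map from item to its assigned index (assumed layout).
    private final Map<T, Integer> indexByItem = new LinkedHashMap<>();
    private int nextIndex = 0;

    SynchronizedBufferSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Write path: registers an item, evicting the oldest entry when full.
    synchronized int add(T item) {
        Integer existing = indexByItem.get(item);
        if (existing != null) {
            return existing;
        }
        if (indexByItem.size() >= capacity) {
            T eldest = indexByItem.keySet().iterator().next();
            indexByItem.remove(eldest);
        }
        int index = nextIndex++;
        indexByItem.put(item, index);
        return index;
    }

    // Read path: copies the keys while holding the lock and returns the copy,
    // so callers iterate over a private snapshot rather than the live key set.
    synchronized List<T> asList() {
        return new ArrayList<>(indexByItem.keySet());
    }
}
```

Returning a snapshot keeps the lock scope small: callers such as a ranking routine can work with the list for as long as they need without blocking writers. The same idea would have to be applied to every public entry point of the ranker that touches shared state, not only to the buffer.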
|