[MLEARN-6] Neural network storage debouncer is global, breaking multiple-ranker setups Created: 01/Feb/19  Updated: 21/Mar/19  Resolved: 11/Feb/19

Status: Closed
Project: Machine Learning
Component/s: None
Affects Version/s: None
Fix Version/s: 1.1

Type: Bug Priority: Neutral
Reporter: Cedric Reichenbach Assignee: Federico Grilli
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: 0d
Time Spent: 5h
Original Estimate: Not Specified

Attachments: Zip Archive findbar-masses-1550041741427-before-mlearn6.zip     Zip Archive findbar-masses-1550041932235-after-mlearn6.zip     PNG File rankingsAfterMlearn6.png     PNG File rankingsBeforeMlearn6.png    
Issue Links:
Relates
relates to MLEARN-5 Make neural network storage and Resul... Closed
relation
is related to MLEARN-9 Training neural network sometimes fai... Closed
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Bug DoR:
[ ]* Steps to reproduce, expected, and actual results filled
[ ]* Affected version filled
Date of First Response:
Epic Link: Periscope improvements
Sprint: Foundation 4
Story Points: 5

 Description   

The issue was uncovered during load and performance tests on Periscope with one RankingNetworkStorageStrategy per user (currently the default).
In a concurrent scenario with 50 different users, we should end up with 50 nodes in the rankings workspace, one for each user and their neural network.
At the end of the tests, fewer than 50 nodes had been stored, meaning that for some users the neural network was not saved after the training step as it should have been.

Currently, a single global debouncer limits how often ranking neural networks are stored (see RankingNetworkStorage).
As a result, when several rankers request storage at around the same time, all of those requests except the last one may be dropped.
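
The failure mode can be illustrated with a deliberately simplified model. This is not the actual RankingNetworkStorage code: the class GlobalDebouncerModel, its methods, and the user names are hypothetical, and flush() merely stands in for the debounce delay elapsing.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a single shared debouncer: it remembers only the most recent request. */
class GlobalDebouncerModel {
    // One pending slot shared by every ranker.
    private Runnable pending;

    /** A new request silently replaces whatever another ranker queued before. */
    synchronized void submit(Runnable storeRequest) {
        pending = storeRequest;
    }

    /** Stands in for the debounce delay elapsing: runs the one remaining request, if any. */
    synchronized void flush() {
        if (pending != null) {
            Runnable request = pending;
            pending = null;
            request.run();
        }
    }

    /** Three rankers request storage within the same debounce window. */
    static List<String> demo() {
        List<String> stored = new ArrayList<>();
        GlobalDebouncerModel debouncer = new GlobalDebouncerModel();
        for (String user : new String[] {"alice", "bob", "carol"}) {
            debouncer.submit(() -> stored.add(user));
        }
        debouncer.flush();
        return stored; // contains only "carol"; the other two requests were dropped
    }
}
```

In this model only the last submitter's network is stored, which matches the symptom of fewer than 50 nodes in the rankings workspace.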

Instead, each separate ranker should be handled by a dedicated debouncer, meaning that we limit how often ranking information is persisted per individual ranker.

Implementation ideas

  • RankingNetworkStorage could be non-singleton, i.e. one instance for each ranker.
  • RankingNetworkStorage could keep track of multiple debouncers (might be complicated though).
  • Debouncing could be somehow handled by the strategy.
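
As a rough sketch of the second idea (tracking one debouncer per ranker), the class below keeps one pending task per key. It is an illustration under assumptions, not the fix that was shipped: PerKeyDebouncer, submit, and shutdownAndAwait are hypothetical names, and the key could be whatever identifies a ranker (for example a user name).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: debounces storage requests per ranker key rather than globally. */
class PerKeyDebouncer<K> {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    // Latest scheduled task per key. Entries are never evicted in this sketch.
    private final Map<K, ScheduledFuture<?>> pending = new ConcurrentHashMap<>();
    private final long delayMillis;

    PerKeyDebouncer(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    /** Schedules the task, cancelling only an earlier pending task for the same key. */
    void submit(K key, Runnable task) {
        pending.compute(key, (k, previous) -> {
            if (previous != null) {
                previous.cancel(false); // no effect if the earlier task already ran
            }
            return scheduler.schedule(task, delayMillis, TimeUnit.MILLISECONDS);
        });
    }

    /** Stops accepting new tasks and waits for already scheduled ones to finish. */
    boolean shutdownAndAwait(long timeoutMillis) {
        scheduler.shutdown(); // by default, delayed tasks that are already scheduled still run
        try {
            return scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, repeated requests from the same ranker still collapse into one write, while requests from different rankers no longer cancel each other. A real implementation would also need to evict idle entries from the map.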


 Comments   
Comment by Hieu Nguyen Duc [ 13/Feb/19 ]

QAed on bundle "magnolia-enterprise-pro-demo-bundle-6.0.1-20190211.172442-328-tomcat-bundle.zip" by running load tests before and after the fix. The response times before and after the fix look nearly the same. Exactly 50 nodes were created after the fix, which is correct.

Before

After

 

 
