[MLEARN-11] Configuration to reduce the size of networks Created: 16/Jan/19  Updated: 25/Mar/19  Resolved: 20/Mar/19

Status: Closed
Project: Machine Learning
Component/s: None
Affects Version/s: None
Fix Version/s: 1.1

Type: Story Priority: Neutral
Reporter: Federico Grilli Assignee: Federico Grilli
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: 0d
Time Spent: 2d
Original Estimate: Not Specified

Issue Links:
Relates
relates to MGNLPER-90 LRU Policy instead of FIFO Closed
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Date of First Response:
Account:
Epic Link: Periscope improvements
Sprint: Foundation 6, Foundation 7
Story Points: 2

 Description   

Make the number of output labels configurable - currently it is hardcoded to 10'000 (https://git.magnolia-cms.com/projects/ENTERPRISE/repos/machine-learning/browse/periscope-result-ranker/src/main/java/info/magnolia/periscope/rank/ml/NeuralNetworkResultRanker.java#65).

Default value: 10'000
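A minimal sketch of what the configurable property could look like as a Magnolia-style bean. The class and property names here are assumptions for illustration, not the actual `NeuralNetworkResultRanker` code; the point is simply that the default stays at 10'000 so existing setups keep the current behaviour.

```java
// Hypothetical configuration bean; names are placeholders, not the real module config.
public class ResultRankerConfig {

    // Currently hardcoded in NeuralNetworkResultRanker; exposing it as a
    // property with the same default keeps backwards-compatible behaviour.
    private int maxOutputLabels = 10_000;

    public int getMaxOutputLabels() {
        return maxOutputLabels;
    }

    public void setMaxOutputLabels(int maxOutputLabels) {
        if (maxOutputLabels <= 0) {
            throw new IllegalArgumentException("maxOutputLabels must be positive");
        }
        this.maxOutputLabels = maxOutputLabels;
    }
}
```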

Original Ticket:

As a further measure to mitigate possible memory issues, we could reduce the size of networks. For example, reducing the max number of output units (labels) from 10k to 1k would likely shave off ~70% of a network's size. Here are some possible approaches and their downsides:

  • Shrink the output layer from 10k to 1k -> we would "forget" rankings more quickly, that is, once we've seen 1001 different results, the first one would be forgotten. It's a bit hard to estimate how soon that would happen in a typical setup, but 10k certainly feels safer.
  • Ignore non-printable ASCII characters and perhaps uppercase letters -> ideally no effect on accuracy, since those are useless anyway. But we'd only save a small percentage (maybe 10%?) and it would take quite a bit of implementation work.
  • Shrink hidden layers -> that's a gamble; the effect on accuracy is hard to predict.
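The ~70% estimate above can be sanity-checked with back-of-the-envelope arithmetic: in a feed-forward network, the weights feeding the output layer scale linearly with the number of labels. The layer sizes below are placeholders, not the actual ranker topology; with small hidden layers like these, the output layer dominates the parameter count and the saving can even exceed the ticket's estimate.

```java
// Back-of-the-envelope parameter count for a fully-connected feed-forward net.
// Layer sizes are illustrative placeholders, not the real ranker configuration.
public class ParamCountSketch {

    static long paramCount(int... layerSizes) {
        long total = 0;
        for (int i = 0; i < layerSizes.length - 1; i++) {
            // weights between consecutive layers, plus one bias per target unit
            total += (long) layerSizes[i] * layerSizes[i + 1] + layerSizes[i + 1];
        }
        return total;
    }

    public static void main(String[] args) {
        long with10k = paramCount(128, 256, 10_000);
        long with1k = paramCount(128, 256, 1_000);
        System.out.printf("10k labels: %d params, 1k labels: %d params (%.0f%% smaller)%n",
                with10k, with1k, 100.0 * (with10k - with1k) / with10k);
    }
}
```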


 Comments   
Comment by Antti Hietala [ 23/Jan/19 ]

fgrilli / creichenbach, can you explain what shrinking the output layer (from 10k to 1k labels) would mean to a user? What does forgetting rankings mean?

Comment by Cedric Reichenbach [ 23/Jan/19 ]

ahietala The ranking system works in a way that it gets a list of results and applies a relevance score to each one of them based on the search query. Now, in order to improve that scoring over time, it memorizes results and learns from past selections, and the memorized result set is limited in size. For example if you often search for "pa" and select "Pages", it will give a higher and higher score to that same result for similar searches in the future.

If the set of memorized/trained-on results is full and new ones arrive, we drop the oldest result with its scoring to make some space. For example, if our size limit is 3 and we already have ["Pages", "Assets", "Tours"], and a new result "Contacts" arrives, we'll drop the oldest one and forget any past-learned ranking, so our new memorized list is ["Contacts", "Assets", "Tours"]. Of course, if "Pages" turns up again in the future, the carousel keeps on turning and another result is dropped in favor of "Pages", but its previous rankings will have been lost, meaning it's again ranked low (or randomly, to some degree).
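The eviction described above can be sketched as a fixed-capacity FIFO set. This is an illustrative toy, not the actual ranker code; it just replays the ["Pages", "Assets", "Tours"] example with a limit of 3 (labels are kept oldest-first here).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy FIFO label set mirroring the eviction example; not the real implementation.
class FifoLabelSet {
    private final int capacity;
    private final Deque<String> labels = new ArrayDeque<>();

    FifoLabelSet(int capacity) {
        this.capacity = capacity;
    }

    void see(String label) {
        if (labels.contains(label)) {
            return; // already memorized, nothing to drop
        }
        if (labels.size() == capacity) {
            labels.removeFirst(); // drop the oldest label (and its learned ranking)
        }
        labels.addLast(label);
    }

    Deque<String> labels() {
        return labels;
    }
}
```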

Now in our case with a limit of 10k, the first time we'll drop a result ranking will be as soon as we see the 10001st different result, if ever. "Seeing" in this context means anything that's ever listed in Find Bar results, which is a maximum of 100 at a time, distinguished by result title.

So to give you an idea of magnitude, this would mean triggering 100 searches with completely pairwise different/unique results until the quota is reached. Or in other words, Find Bar must have listed 10k different results/things (apps, JCR nodes) to get to this point. With a 1k limit, that would mean 10 of those completely unique result sets (1k different results).

To check how many labels your local setup currently has memorized, count the number of properties on /modules/periscope-result-ranker/persistence/labels in the configuration workspace - all the labels are listed there (not in order, though). The best simulation would be to have a long-living instance and count them there. I just checked my local ee-pro-demo instance, which I've been using for development over the past few days, and it has 599 entries so far.

I hope my explanations make sense, otherwise feel free to ask more.

Comment by Federico Grilli [ 24/Jan/19 ]

creichenbach Would reducing the labels also reduce the size of the file (I guess that's the "training dataset"?) that we store in JCR and that gets loaded into memory by DL4J? Because that file seems to be the heavyweight there (~13 MB in our demo), not so much the labels, which occupy just a few KB.

Comment by Cedric Reichenbach [ 24/Jan/19 ]

fgrilli I think so. If I'm not mistaken, that file consists mostly of (trained) "parameters" (it's not the training dataset though), which are for the most part input weights of the network (one per "edge" of the graph). So the number of input weights usually dominates the in-memory as well as the persisted file size.

FWIW, this illustration outlines the configuration of our neural network; n is the number of units/nodes we'd change, in the last layer: Result Ranker Neural Network (Illustration)

Comment by Simon Lutz [ 24/Jan/19 ]

After discussing with creichenbach, an option would be to remove results using an LRU (least recently used) policy rather than FIFO. That way we would still keep the most recently used results even if we reduced the number of labels.
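A compact way to sketch the LRU alternative is `LinkedHashMap` in access-order mode with `removeEldestEntry`. This is only an illustration of the policy difference, not the MGNLPER-90 implementation: with LRU, re-seeing "Pages" before "Contacts" arrives means "Assets" is evicted instead of "Pages".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU label set; illustrative only, not the actual MGNLPER-90 implementation.
class LruLabelSet<V> extends LinkedHashMap<String, V> {
    private final int capacity;

    LruLabelSet(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() refreshes recency
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        // Evict the least recently *used* entry, not the oldest *inserted* one.
        return size() > capacity;
    }
}
```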

Comment by Hieu Nguyen Duc [ 25/Mar/19 ]

MLEARN-13 found while doing QA.

Generated at Mon Feb 12 02:29:02 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.