
    • Type: Sub-task
    • Resolution: Done
    • Priority: Neutral
    • 1.1
    • None
    • None
    • Foundation 2, Foundation 3

      ResultRankerProvider will keep a mapping of users to NeuralNetworkResultRanker instances, since the neural network and labels are now per user.

      This mapping will be created lazily upon a user's first search. The problem is how and when to clean up this mapping when a user logs out. Cleaning up is essential because NeuralNetworkResultRanker internally uses deeplearning4j, which may allocate significant amounts of "off-heap" memory that is not managed by the JVM (in our demo, a neural network takes ~13 MB per user).
      Removing a ranker from the mapping makes it eligible for garbage collection, which in turn should cause deeplearning4j's memory management to deallocate the unused memory.
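      The lazy per-user mapping and the logout cleanup could look roughly like the following sketch. The class and method names other than ResultRankerProvider and NeuralNetworkResultRanker are assumptions for illustration, not the actual module code:

      ```java
      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;

      public class ResultRankerProvider {

          // Stand-ins for the real interfaces; the dl4j-backed ranker is assumed.
          interface ResultRanker { }

          static class NeuralNetworkResultRanker implements ResultRanker { }

          private final Map<String, NeuralNetworkResultRanker> rankersByUser = new ConcurrentHashMap<>();

          // Lazily create the per-user ranker upon the user's first search.
          public ResultRanker rankerFor(String userName) {
              return rankersByUser.computeIfAbsent(userName, name -> new NeuralNetworkResultRanker());
          }

          // Hypothetical logout hook: dropping the reference makes the ranker
          // eligible for GC, letting dl4j release its off-heap memory.
          public void onLogout(String userName) {
              rankersByUser.remove(userName);
          }
      }
      ```

      The open question in this ticket is precisely who calls the logout hook, since the machine-learning module has no view of the UI session lifecycle.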

      See also https://deeplearning4j.org/docs/latest/deeplearning4j-config-memory

      Ideally, NeuralNetworkResultRanker should be an admincentral-scoped component whose lifecycle ends with the user session, but the machine-learning module has no dependency on the UI.

      One solution could be keeping the mapping in a cache with an LRU eviction policy.
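      A minimal LRU sketch, using LinkedHashMap's access order; the capacity bound is an assumption, and Object stands in for NeuralNetworkResultRanker:

      ```java
      import java.util.LinkedHashMap;
      import java.util.Map;

      public class RankerCache extends LinkedHashMap<String, Object> {

          private static final int MAX_RANKERS = 100; // assumed capacity

          public RankerCache() {
              // accessOrder=true keeps entries in least-recently-used order
              super(16, 0.75f, true);
          }

          @Override
          protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
              // Evicting the eldest ranker drops the last reference, so dl4j's
              // off-heap allocations can eventually be released by GC.
              return size() > MAX_RANKERS;
          }
      }
      ```

      The downside is that eviction is driven by cache pressure rather than logout, so an idle logged-out user's ranker may linger until the bound is hit.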

      An alternative solution would be to have one Periscope instance per user (instead of being a singleton):
      PRO

      • no need to mess with keeping a map of users/rankers and cleaning it up
      • one instance of Periscope + NeuralNetworkResultRanker per logged-in user; after logout the instances can be garbage collected and the memory used by deeplearning4j freed as a consequence.

      CONTRA

      • Periscope is the entry point to the underlying search engine, so most properties of and services provided by this class are globally unique. With multiple instances, we end up with several duplicates of almost the same object with just a different ranker. It looks like we're escalating the per-user differentiation too high up.

      -------

      UPDATE

      ResultRankerProvider is replaced with a factory explicitly bound to the user.
      The memory consumption part is tackled by MLEARN-5 and combines a cache approach in the factory with a pluggable storage strategy, in order to mitigate possible performance issues.
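      The factory approach could be sketched as follows. All names and the storage-strategy interface are hypothetical; the actual MLEARN-5 implementation may differ:

      ```java
      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;

      public class RankerFactory {

          // Pluggable storage strategy: where a user's trained model bytes live
          // (in-memory, filesystem, repository, ...).
          public interface RankerStorageStrategy {
              byte[] load(String userName);
              void save(String userName, byte[] model);
          }

          // Stand-in for the dl4j-backed ranker.
          static class Ranker {
              final String userName;
              Ranker(String userName) { this.userName = userName; }
          }

          private final Map<String, Ranker> cache = new ConcurrentHashMap<>();
          private final RankerStorageStrategy storage;

          public RankerFactory(RankerStorageStrategy storage) {
              this.storage = storage;
          }

          // Explicitly bound to a user: callers always pass the user name,
          // so the factory never depends on a UI session scope.
          public Ranker forUser(String userName) {
              return cache.computeIfAbsent(userName, name -> {
                  byte[] model = storage.load(name); // may be null for a new user
                  // In the real module the model bytes would be restored into dl4j here.
                  return new Ranker(name);
              });
          }
      }
      ```

      Binding by user name keeps Periscope itself a singleton while pushing the per-user differentiation down into the factory, addressing the CONTRA point above.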

            fgrilli Federico Grilli