[MGNLSCH-15] Allow tasks to specify cluster node on which to run in clustered configurations Created: 17/May/10  Updated: 23/Jul/12  Resolved: 20/Feb/12

Status: Closed
Project: Scheduler
Component/s: None
Affects Version/s: 1.3
Fix Version/s: 1.4.3

Type: Improvement Priority: Major
Reporter: Jan Haderka Assignee: Ondrej Chytil
Resolution: Fixed Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Java Source File AbstractClusterLockCommand.java    
Issue Links:
causality
is causing MGNLSCH-24 Cluster-aware job functionality can b... Closed
duplicate
is duplicated by MGNLSCH-23 Support cluster ID in configuration o... Closed
relation
is related to DOCU-299 Running scheduled jobs on a particula... Closed
is related to MGNLSCH-25 It should be possible to run a job on... Closed
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Date of First Response:

Description

Right now, when running Magnolia in a fully clustered environment (i.e. with a clustered config workspace), scheduled tasks are launched on all nodes. This leads to problems when such tasks include content modification and locking. It should be possible to specify on which node such tasks should run, or at least whether they should run on all nodes or only on one.



Comments
Comment by Danilo Ghirardelli [ 04/Jan/12 ]

If you are interested, I created the attached command, which guarantees the execution of a command (or rather, a job) on a single cluster node, using Jackrabbit's cluster-wide locks. The node on which the job runs is not fixed: the first one that gets the lock is the one that runs the job, and the others quietly skip it.
There is a slight chance that the lock is not released if the node dies abruptly during job execution, but forcing a timeout should solve the problem (unlike session-scoped locks, cluster-wide locks are not released automatically). The point is that we should have a dedicated workspace for job locks, maybe with auto-created nodes named after the jobs, serving as the lock paths. At the moment I used the job definition nodes in the config repository as lock paths.
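For reference, a minimal sketch of the first-one-wins idea described above, using the standard JCR 2.0 locking API (the class name, job path handling, and timeout value are illustrative assumptions, not taken from the attached AbstractClusterLockCommand.java):

    import javax.jcr.Node;
    import javax.jcr.Session;
    import javax.jcr.lock.LockException;
    import javax.jcr.lock.LockManager;

    public class ClusterLockGuard {

        // Runs the task only on the cluster node that wins the open-scoped
        // lock on the job's definition node; all other nodes skip quietly.
        public void runExclusively(Session session, String jobPath, Runnable task) throws Exception {
            Node jobNode = session.getNode(jobPath);
            if (!jobNode.isNodeType("mix:lockable")) {
                jobNode.addMixin("mix:lockable");
                session.save();
            }
            LockManager lockManager = session.getWorkspace().getLockManager();
            try {
                // Open-scoped locks (isSessionScoped = false) are persisted and
                // visible cluster-wide, but they are NOT released automatically,
                // hence the timeout/cleanup concern mentioned above.
                lockManager.lock(jobPath, false, false, Long.MAX_VALUE, null);
            } catch (LockException alreadyLocked) {
                return; // another cluster node got there first
            }
            try {
                task.run();
            } finally {
                lockManager.unlock(jobPath);
            }
        }
    }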

Comment by Ondrej Chytil [ 20/Feb/12 ]

The change is active only when the magnolia.clusterid property is set in the magnolia.properties file. This value is then compared with the clusterId parameter, which should be placed under the params node in the job configuration. The clusterId parameter has to be set whenever magnolia.clusterid is not null; otherwise the job will never be launched.
Non-clustered environments will work as before.
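
To illustrate, a hypothetical configuration matching an instance to a job (the job path and names are assumptions for illustration):

    # magnolia.properties on the instance that should run the job
    magnolia.clusterid=node1

    # job definition in the config workspace
    /modules/scheduler/config/jobs/myJob
        /params
            clusterId=node1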

Comment by Danilo Ghirardelli [ 20/Feb/12 ]

Does this mean that if I have a clustered environment, I now have to set the clusterId parameter or none of the jobs will run? And if I set it, will every job run on only one node of the cluster rather than on all nodes as before? If so, this is quite a breaking change...

Comment by Jan Haderka [ 20/Feb/12 ]

Two things:

  • The warning should be just an info message, and it should say that the command will be executed only on cluster node XYZ.
  • The mgnl context should not be set at all outside of the if() clause, so that you don't need to reset it twice but only when it's actually used (see the sketch below).
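
A minimal sketch of the second point, assuming the usual MgnlContext handling (the surrounding names are illustrative, as the patch itself is not shown here):

    // Set and reset the Magnolia context only in the branch that runs the job.
    if (shouldRunOnThisClusterNode(jobParams)) {
        MgnlContext.setInstance(systemContext); // assumed setup
        try {
            runJob();
        } finally {
            MgnlContext.setInstance(null); // reset only where it was actually set
        }
    }
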
Comment by Ondrej Chytil [ 20/Feb/12 ]

Hello Danilo,

this cluster-awareness depends on the magnolia.clusterid property, so if you have your cluster ID set only in the Jackrabbit config file, everything will work as before (partly because of a different ticket, MAGNOLIA-3971).
Anyway, once you set this property, you really have to configure the clusterId parameter in all scheduled jobs.
To your second question: this just prevents running the same job on multiple clustered instances. In a clustered environment you can run one instance of the same scheduled job, and in a non-clustered environment you have to run independent jobs anyway. At least I hope I understood your question correctly.
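
In outline, the gating logic amounts to something like the following sketch (a simplified stand-in for the module code; the class, method, and parameter names are assumptions):

    import java.util.Map;

    public class ClusterIdGate {

        // Decides whether this instance should execute a given job.
        public static boolean shouldRun(String instanceClusterId, Map<String, String> jobParams) {
            if (instanceClusterId == null) {
                return true; // no magnolia.clusterid set: non-clustered, behave as before
            }
            // In a clustered setup the job must name its node; a missing
            // clusterId parameter means the job is never launched.
            return instanceClusterId.equals(jobParams.get("clusterId"));
        }
    }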

Comment by Danilo Ghirardelli [ 20/Feb/12 ]

Explaining the use case might be simpler: I have (like most people, I think) a clustered environment based on the magnolia.clusterid property, usually magnolia.clusterid=${magnolia.servername}, to make things quicker and easier for all cluster nodes. And I have jobs that are currently meant to run on every single node of the cluster, not knowing other cluster details, just like I think a cluster is meant to be.
With your change, to keep everything running as it is now, I have to create n copies of each job, configure one for every node of the cluster, then add copies when a cluster node is added, or remember to edit all the copies when a parameter changes. If I understood correctly, there is now no way to have a job simply run on all cluster nodes.
So my question is: why is the clusterId forced? Wouldn't it be simpler to have the job run on all nodes, just like now, if no clusterId parameter is set (thus covering both the single-node case and the current behaviour), and make it run on the designated node only when the clusterId parameter is set?

Comment by Danilo Ghirardelli [ 20/Feb/12 ]

Just another consideration, slightly related. In my experience, when I set up a Magnolia cluster, it is usually done with a bunch of cloned virtual (or physical) machines that are identical. So, when coding a scheduled job, I usually have only two use cases:

  • a job that should run on all the (active) cluster nodes. Usually this is something like a temporary-file cleaner or something related to each individual machine.
  • a job that must run on a single active node of the cluster, whichever it is. This usually operates somehow on the clustered Jackrabbit data: any node that runs the job will access the same data and change it for all other nodes, so the node that runs the job is not really important; it just needs to be one of the active nodes when the trigger fires. Specifying a single node for the job would in this case skip the job if the designated node is offline for any reason when the trigger fires, and in my opinion a well-configured cluster should not behave differently depending on which nodes are online...

If I understood correctly, your change kills my first use case but does not cover the second, which is why I said the change is "breaking".

Comment by Ondrej Chytil [ 21/Feb/12 ]

Since this can really cause issues in certain environments, I logged MGNLSCH-24: cluster-aware jobs will be active only if the parameter is set in the job definition, independent of the cluster definition.
A new Scheduler version will be available later today.
Thanks for pointing this use case out to us, Danilo.
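
Continuing the hypothetical gate sketched earlier, the MGNLSCH-24 behaviour would look roughly like this (again an illustrative assumption, not the shipped code):

    // A job with no clusterId parameter runs on every node, as before;
    // only an explicitly pinned job is restricted to one cluster node.
    public static boolean shouldRunRevised(String instanceClusterId, Map<String, String> jobParams) {
        String jobClusterId = jobParams.get("clusterId");
        if (jobClusterId == null) {
            return true; // parameter absent: run everywhere
        }
        return jobClusterId.equals(instanceClusterId);
    }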
