How to use JobCluster setup function with Cmd class extension

Posted: Tue May 18, 2021 8:42 pm
by bsync
I have a subclass of Python's cmd.Cmd command-shell class that works roughly as follows:

import cmd
from dispy import JobCluster

class MyClass(cmd.Cmd):

    def do_submit_job(...):
        # dispy client logic happens here
        cluster.submit(...)

    def do_analysis(...):
        # dispy node work happens here
        ...

if __name__ == '__main__':
    cluster = JobCluster(MyClass.do_analysis, depends=[MyClass])
    myCmd = MyClass()
    myCmd.do_submit_job(...)
    cluster.wait()

When I run this on the client machine, the analysis job does get submitted to the dispynode machine, but the node immediately reports an error to the effect of: NameError: 'cmd' undefined.

I assume dispynode is having trouble constructing a MyClass instance to call do_analysis on?

I tried providing a 'setup' function that looked like:

def setmeup():
    global cmd
    import cmd
    return 0

but to no avail; I still get NameError: 'cmd' undefined.

I would appreciate any help.

Thanks,
Travis

Re: How to use JobCluster setup function with Cmd class extension

Posted: Tue May 18, 2021 11:40 pm
by Giri
I am guessing 'cmd' is a module. If so, it needs to be included in 'depends' as well. Essentially, everything the client needs to create the environment for the objects it sends must also be sent to the nodes.

If you tried 'setmeup', note that it must be passed to JobCluster as 'setup=setmeup'. If you did that, it should work fine unless the nodes run Windows. As mentioned in the 'node_setup' example, modules can't be made global under Windows because they can't be serialized.
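
As a minimal sketch of the point above, the setup function can be passed by keyword when the cluster is created. The names 'analyze' and 'build_cluster' are hypothetical and only for illustration, the dispy import is deferred so the setup function can be checked locally without a cluster, and the sketch assumes nodes that do not run Windows.

```python
def setmeup():
    # Runs on a node before jobs; binding 'cmd' as a global makes the
    # module visible to code executed afterwards in that process.
    global cmd
    import cmd
    return 0  # 0 signals success, as in the original post


def analyze(line):
    # Hypothetical computation: relies on the global 'cmd' bound by setmeup().
    return cmd.Cmd().parseline(line)[0]


def build_cluster():
    # Deferred import so this sketch loads even where dispy is not installed.
    import dispy
    return dispy.JobCluster(analyze, setup=setmeup)


if __name__ == "__main__":
    # Local check only; no cluster is created here.
    print(setmeup())
    print(analyze("submit job1"))
```

Calling build_cluster() is what would hand 'setmeup' to the nodes; the local check merely confirms that the global import makes 'cmd' usable by the job function.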

It may be easier to start with a simplified version, get that working, and then add features.

Re: How to use JobCluster setup function with Cmd class extension

Posted: Thu Jun 03, 2021 5:03 pm
by bsync
I appreciate your efforts with dispy; I am finding it to be very capable.

The issue I ran into with the Cmd class extension seems like a small oversight that could be remedied easily enough. To be clear, the 'cmd' module ships with Python 3, so I assume it does NOT itself need to be serialized to the dispy node. The problem seems to occur when a class instance has to be deserialized on the dispy node and its class extends a standard-library class whose module has not yet been imported there. When the import can happen inside a method of the class, there is no problem. But when the class statement itself needs a base class from another module, that import has to happen before the class is defined, or the deserialized code will fail on the node.
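
The distinction described above can be reproduced with the standard library alone. This is a simulation rather than dispy's actual code path: it assumes the node rebuilds a class by executing its source in a namespace where 'cmd' has not been imported. The names 'Lazy' and 'define_in_fresh_namespace' are hypothetical.

```python
import textwrap

# Class whose only use of 'cmd' is inside a method: the import is deferred
# until the method is called, so defining the class needs nothing extra.
LAZY_SOURCE = textwrap.dedent('''
    class Lazy:
        def do_analysis(self, line):
            import cmd
            return cmd.Cmd().parseline(line)[0]
''')

# Class that names cmd.Cmd as its base: 'cmd' must already be bound at the
# moment the class statement runs.
SUBCLASS_SOURCE = textwrap.dedent('''
    class MyClass(cmd.Cmd):
        def do_analysis(self, line):
            return self.parseline(line)[0]
''')


def define_in_fresh_namespace(source):
    # Stand-in for a node process that has not imported anything yet.
    namespace = {}
    exec(source, namespace)
    return namespace


lazy = define_in_fresh_namespace(LAZY_SOURCE)["Lazy"]()
lazy_result = lazy.do_analysis("analysis run1")

try:
    define_in_fresh_namespace(SUBCLASS_SOURCE)
    failure = None
except NameError as exc:
    failure = exc

print(lazy_result)
print(type(failure).__name__)
```

The first class defines and runs without trouble, while the second fails as soon as its class statement executes, which matches the NameError seen on the node.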

I created a pull request on GitHub for you to examine my rudimentary approach to solving this issue. You may have a better solution in mind but the changes I made to your obj_instances.py example will at least help illuminate the problem:

https://github.com/pgiri/dispy/pull/220

Re: How to use JobCluster setup function with Cmd class extension

Posted: Thu Jun 03, 2021 11:13 pm
by Giri
Thanks for your pull request. I think I understand the problem you are trying to solve. Let me take a look at your pull request and think about solving this in general (so it can be used for other purposes). It may take a couple of days, say by early next week. If you need this addressed sooner, let me know or email me.

Re: How to use JobCluster setup function with Cmd class extension

Posted: Fri Jun 04, 2021 12:03 am
by Giri
I am wondering if you can use the 'setup' feature to create 'MyClass' (instead of using 'depends').

Re: How to use JobCluster setup function with Cmd class extension

Posted: Fri Jun 04, 2021 1:55 pm
by bsync
Giri wrote: Thu Jun 03, 2021 11:13 pm Thanks for your pull request. I think I understand the problem you are trying to solve. Let me take a look at your pull request and think about solving this in general (so it can be used for other purposes). It may take a couple of days, say by early next week. If you need this addressed sooner, let me know or email me.
You are welcome, and there is no hurry here. I am sure the pull request could be improved on; it was just the first thing that came to mind for my immediate problem.
Giri wrote: Fri Jun 04, 2021 12:03 am I am wondering if you can use the 'setup' feature to create 'MyClass' (instead of using 'depends').
Now that you mention it, yes. I suspect that if you marked it as 'global', the MyClass definition itself could be established from inside the setup function. That might get unwieldy with a very large class or several classes, but it would probably work. In that case you might only need to show folks how to do it in the obj_instances.py example.

Another possible solution might be to provide special handling for import strings in the depends list. If a string looks like an 'import <module>' statement, it could be prefixed unmodified to compute.code so that all 'import <module>' statements get executed first, just as in a normal Python script. I have created another pull request employing that logic in case you like it better:

https://github.com/pgiri/dispy/pull/221
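
The import-string idea can be illustrated with a small standalone sketch. This is not the code from the pull request; 'split_depends', 'build_code' and the 'data.txt' entry are hypothetical, and the class source is written out as a string to stand in for whatever source the client would send.

```python
def split_depends(depends):
    # Separate 'import <module>' strings from every other dependency.
    imports, others = [], []
    for dep in depends:
        if isinstance(dep, str) and dep.strip().startswith("import "):
            imports.append(dep.strip())
        else:
            others.append(dep)
    return imports, others


def build_code(depends, class_source):
    # Place the import statements ahead of the class source, as in a script.
    imports, _ = split_depends(depends)
    return "\n".join(imports + [class_source])


CLASS_SOURCE = (
    "class MyClass(cmd.Cmd):\n"
    "    def do_analysis(self, line):\n"
    "        return self.parseline(line)[0]\n"
)

code = build_code(["import cmd", "data.txt"], CLASS_SOURCE)
namespace = {}
exec(code, namespace)  # succeeds because 'import cmd' runs first
result = namespace["MyClass"]().do_analysis("analysis run1")
print(result)
```

Because the import lines run before the class statement, the base class resolves and the class definition succeeds in an otherwise empty namespace.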

Re: How to use JobCluster setup function with Cmd class extension

Posted: Mon Jun 07, 2021 4:59 am
by Giri
I committed a fix that allows the client to specify an 'init_depends' function in 'depends'. I will update the documentation for this feature with the 4.15.1 release (probably in a week or so, as I have some pending changes to commit for the next release).

Re: How to use JobCluster setup function with Cmd class extension

Posted: Mon Jun 07, 2021 6:33 pm
by bsync
Giri wrote: Mon Jun 07, 2021 4:59 am I committed a fix that allows the client to specify an 'init_depends' function in 'depends'. I will update the documentation for this feature with the 4.15.1 release (probably in a week or so, as I have some pending changes to commit for the next release).
Looks great. It is a small change, consistent with the existing code, and it opens up the full potential of using imported modules. Looking forward to using dispy on our project. Thanks for all the hard work.