dispy Releases since version 3.0
Posted: Wed Jan 27, 2021 5:21 am
Below is a list of release announcements / short summaries of changes to dispy since version 3.0.
- Version 4.15.2 released (2022-10-06)
- Added support for Java programs as computations.
- Fixed a race condition with 'resetup_node'.
- Version 4.15.1 released (2022-01-26)
- Added '--init=filename' option to dispynode. The given file may contain 'init'
and/or 'close' function(s) that are executed when dispynode is starting and
shutting down. Any variables declared global in 'init' are made available to all
clients.
- Added 'init_depends' feature to JobCluster. If clients specify an
'init_depends' function in 'depends', then this function is
executed at nodes before other 'depends'. This allows clients to
load any modules etc. at the node that may be needed for dependencies
in 'depends'.
- Fixed in-memory processing in macOS.
- If loading 'depends' fails on a node, the exception trace is sent
back to the client so the user can understand / fix the issue.
- Fixed dispyscheduler to work with httpd.
- Fixed dispyscheduler so all CPUs are used when running multiple clients.
- Version 4.15.0 released (2021-05-18)
dispy version 4.15.0 has been released. Major changes since previous release are:
- Default ports used by dispy now are 9700-9703 (instead of
61590-61593 in previous versions). For firewall / port forwarding,
9700 to 9703 must be allowed / forwarded appropriately.
- Added support for KeyboardInterrupt in job computations with
Windows. However, the interrupt doesn't interrupt the currently
executing statement; it takes effect when control returns to the
interpreter for the next statement.
- Job cancel/termination now works even with suid/sgid (earlier
versions didn't support this).
- Fixed terminating jobs under Windows. With earlier versions, a
job termination may terminate dispynode itself.
- If 'netifaces' module is not available and no 'host' parameter
is given, default 'hostname' is used to get IP address to use (by
client, scheduler, node).
- Fixed IP address parsing with OS X.
- Fixed process termination when 'clean' option is used with dispynode.
- Even if 'clean' option is not used when starting dispynode,
files saved by previous run of dispynode are now removed except
for any job results files (that can be retrieved by clients with
'recover_jobs' function).
- If dispynode is started with '--serve=0', dispynode quits
right after processing '--clean' option. This combination of
'--clean --serve=0' can be used to terminate any currently running
dispynode, run cleanup and quit.
- Fixed installation with new pip version.
- Version 4.14.0 released (2021-03-17)
dispy version 4.14.0 has been released. Major changes since previous release are:
- 'callback' parameter to JobCluster and SharedJobCluster has
been changed to 'job_status' to avoid confusion with
'cluster_status' which has similar purpose. Moreover, these are
simply notification mechanisms, not really callbacks executed from
scheduler (e.g., scheduler does not wait for such notifications to
finish before 'cluster.wait()' finishes).
- When a running job is canceled, dispynode now raises
KeyboardInterrupt in the computation first. If computation doesn't
handle it, usually the process is terminated. However, computation
can process this exception, for example, to send appropriate
result to client, or save state of computation and send that to
client etc. A computation has about 5 seconds after receiving
KeyboardInterrupt before it is terminated.
- Added 'longrun.py' example to show simple use case that
handles KeyboardInterrupt as described above.
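The cancellation behaviour described for 4.14.0 can be sketched in plain Python. This is not the 'longrun.py' example shipped with dispy; the names `long_computation` and `cancel_at` are made up for illustration, and the cancel is simulated by raising KeyboardInterrupt from inside the work step rather than by dispynode.

```python
def long_computation(n, step=lambda i: i * i):
    """Sum step(i) for i in range(n); on cancellation, return the partial state.

    Hypothetical computation: 'step' stands in for one unit of real work.
    """
    total = 0
    done = 0
    try:
        for i in range(n):
            total += step(i)
            done += 1
    except KeyboardInterrupt:
        # The notes above give a computation only about 5 seconds after the
        # interrupt, so just package what has been done so far and return.
        return {'finished': False, 'done': done, 'partial': total}
    return {'finished': True, 'done': done, 'partial': total}


def cancel_at(k):
    """Build a step function that simulates a cancel arriving at iteration k."""
    def step(i):
        if i == k:
            raise KeyboardInterrupt
        return i * i
    return step


full = long_computation(4)
cancelled = long_computation(10, cancel_at(3))
```

The point of returning a dictionary in both cases is that the client can tell a finished job from a cancelled one and still make use of the partial state.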
- Version 4.13.0 released (2021-02-25)
dispy version 4.13.0 has been released. Major changes since previous release are:
- Renamed 'ip_addr' and 'ext_ip_addr' parameters to JobCluster,
SharedJobCluster, dispynode etc. to more appropriate names 'host'
and 'ext_host' to reflect that their values can be host names as
well as IP addresses.
- Fixed 'save_config' option with Python 3.7+.
- dispy client / scheduler, dispynode and dispynetrelay can now
work with multiple remote networks (by listing all such networks
with appropriate 'ext_host' parameters).
- Fixed examples 'replace_inmem.py' and 'node_setup.py' to use
basename of files transferred.
- Version 4.12.4 released (2021-01-27)
dispy version 4.12.4 has been released. Major changes since previous release are:
- Fixed exception when Cluster is closing when used with recent pycos versions.
- Fixed occasional KeyboardInterrupt exception traceback when dispynode is terminating.
- Until now dispynode under Python 2.7 sometimes didn't terminate with 'quit' command; a second
'quit' command was required to terminate. This is now fixed.
- Version 4.12.3 released (2020-09-09)
dispy version 4.12.3 has been released. Major changes since previous release are:
- dispynode under Windows doesn't force daemon mode. Command line now can be used to see the
status of computations, change CPUs, quit dispynode etc.
- It is possible now to use client and nodes with different versions of Python 3 even if pickle
protocol versions are different. See pycos configuration variable PickleProtocolVersion; dispy
uses pycos for communication so setting this variable appropriately will allow serialization /
deserialization even if different Python versions use different default protocol versions (e.g.,
Python 3.7 and 3.8 use different protocol versions). Note that this will not work between Python
3 and Python 2.
- This version requires pycos 4.9.0 due to changes in pycos's Task exit value semantics.
If dispy is installed manually, please upgrade pycos to 4.9.0 as well.
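The pickle protocol point in the 4.12.3 notes can be illustrated with the standard library alone. The sketch below does not touch pycos or its PickleProtocolVersion variable; it only shows the underlying idea of choosing a protocol that every participating interpreter can read. The helper name `common_protocol` is an assumption for illustration.

```python
import pickle


def common_protocol(*highest_protocols):
    """Return the highest pickle protocol supported by every participant."""
    return min(highest_protocols)


# Highest protocols by interpreter (from the Python documentation):
# Python 3.7 supports up to protocol 4, Python 3.8 and later up to 5.
PY37_HIGHEST = 4
PY38_HIGHEST = 5

protocol = common_protocol(PY37_HIGHEST, PY38_HIGHEST, pickle.HIGHEST_PROTOCOL)

# Serialize with the agreed protocol instead of the interpreter's default.
payload = pickle.dumps({'job': 1, 'args': (2, 3)}, protocol=protocol)
restored = pickle.loads(payload)
```

Whatever value is agreed on would then be the one to configure on both client and nodes.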
- Version 4.12.2 released (2020-05-14)
dispy version 4.12.2 has been released. In this version:
- Fixed white space issues in httpd module in Python 3 so it works (same as httpd in Python 2).
- Version 4.12.1 released (2020-04-11)
dispy version 4.12.1 has been released. In this version:
- Fixed dispynode to run setup only if the client uses this feature; in version 4.12.0, dispynode crashed otherwise. With this fix, programs that don't use setup, including sample.py in examples, work now.
- Version 4.12.0 released (2020-03-11)
dispy version 4.12.0 has been released. In this version:
- Added resetup_node method to JobCluster. This method can be used to run cleanup and setup function with different depends and arguments to prepare the node for a new run of jobs. This is useful to, e.g., replace in-memory data to run jobs with data.
- pulse_interval can now be set to as small as 0.1 (seconds).
- Version 4.11.1 released (2019-11-25)
dispy version 4.11.1 has been released. In this version:
- Fixed dispyadmin.py to parse ports in configuration file (config.py) as integers.
- Version 4.11.0 released (2019-06-25)
dispy version 4.11.0 has been released. In this version:
- Default ports used have been changed from 51347-51349 to 61590-61592. This may require appropriate changes to firewall setup.
- Instead of configuring different parts of dispy with different ports (e.g., dispynode with node_port, dispyscheduler with scheduler_port etc.), now only base dispy port (61590) needs to be given to all of them.
- Ports and other parameters can be changed in config.py module in dispy for convenience (useful in site-wide configuration).
- Added dispyadmin, an administrative interface to manage / control nodes. All the nodes, even the ones not used by any scheduler, can be controlled with this program. Only nodes started with --admin_secret option can be controlled with dispyadmin; other nodes are shown in the web interface but can't be managed.
- Renamed dispy.py command line program to dispy_cmd.py.
- Added depends and setup_args optional parameters to NodeAllocate. This feature can be used to send node and computation specific dependent files and to use node and computation specific arguments to run setup function. setup and cleanup functions can no longer be partial functions.
- Added relay optional parameter to dispy_provisional_result and dispy_send_file functions (that can be used in computation functions). If this parameter is False (default), results and files are sent from nodes to client directly, even if SharedJobCluster is used. If this parameter is True, they are sent via dispyscheduler, which is required if nodes can't communicate with client directly (e.g., if secret between dispyscheduler and nodes is different from that between dispyscheduler and client or SSL setup is different).
- Version 4.10.6 released (2019-03-27)
dispy version 4.10.6 has been released. In this version:
- Added submit_job_id and submit_job_id_node methods to cluster. These methods take 'id' argument that is set to job's id attribute when job is created by scheduler so that when callback methods are called, the id can be used by those methods. If id is initialized by client after submit methods return job instance, the callback may be called before client can initialize (as dispy and callback run in other threads).
- Callback functions are now called with copies of jobs returned with submit methods (i.e., not the same instances). Thus callback functions should use id attribute of jobs to distinguish jobs instead of the job instance itself; if jobs need to be stored in a dictionary, for example, they should be managed with id attribute (which should be unique for each job created) and not job itself.
- If sending a job to a node fails (either node is disconnected from network or node is terminated), that node is removed from scheduler so if node is discovered later, it is initialized and reusable.
- Fixed job termination when suid feature is used (so one computation doesn't access another computation's files).
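The 4.10.6 note about callbacks receiving copies of jobs suggests keeping client-side bookkeeping keyed by the job's id rather than by the job object. A minimal sketch of that pattern follows; `FakeJob`, `remember` and `job_finished` are hypothetical stand-ins and not part of dispy.

```python
class FakeJob:
    """Stand-in for a job object; only the 'id' attribute matters here."""

    def __init__(self, job_id):
        self.id = job_id


pending = {}  # job id -> whatever the client wants to remember about the job


def remember(job, args):
    """Record the arguments of a submitted job under its id."""
    pending[job.id] = args


def job_finished(job_copy):
    """Callback-style function: it receives a copy, so look up by id."""
    return pending.pop(job_copy.id, None)


submitted = FakeJob(7)
remember(submitted, ('input.dat',))

copy_seen_by_callback = FakeJob(7)  # a different instance with the same id
args = job_finished(copy_seen_by_callback)
```

A dictionary keyed by the job instance itself would miss here, because the copy is a different object from the one the client stored.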
- Version 4.10.5 released (2019-01-08)
dispy version 4.10.5 has been released. In this version:
- DispyNode instances now have tx and rx attributes that maintain the amount of data sent to / received from that node. print_status method and web interface now show this information.
- Version 4.10.4 released (2018-12-21)
dispy version 4.10.4 has been released. In this release:
- Added support for Python 3.7.
- Version 4.10.3 released (2018-12-17)
dispy version 4.10.3 has been released. In this release:
- Fixed job scheduler to handle jobs submitted with submit_node.
- Added discover_nodes method to JobCluster that clients can use to establish communication with nodes that may not be found when cluster initialized.
- Version 4.10.2 released (2018-11-20)
dispy version 4.10.2 has been released. In this version:
- Added option force_cleanup to dispynode so all files transferred or created by computation are removed when computation is closed, even if computation disables cleanup.
- Fixed processing of provisional results so jobs are not dropped after first such result.
- Fixed close_node method.
- Version 4.10.1 released (2018-11-05)
dispy version 4.10.1 has been released. In this release:
- Added support for suid and sgid features in Unix so a computation cannot access files of other computations. See "Isolate Computation Files" for details and how to use this feature.
- id attribute of jobs is now set by dispy to successive integers starting with 1.
- dispynode checks if any job process is dead at zombie_interval interval and if so, sends reply for it.
- If client / scheduler determines that a node is dead (or disconnected), jobs scheduled on it are not discarded now, so if a node comes back, client can accept replies for those jobs.
- set_node_cpus, deallocate_node etc. now accept DispyNode instances as arguments (in addition to host name or IP address, as done before).
- Version 4.10.0 released (2018-10-02)
dispy version 4.10.0 has been released. In this version:
- Added support for *BSD (FreeBSD, NetBSD etc).
- Run setup and cleanup functions in a subprocess (instead of in main dispynode process). This prevents computations from interfering with each other (i.e., there are no side effects from one computation loading modules etc.).
- Semantics of return value from setup function has changed. See setup function description in dispy JobCluster for details.
- Added --unsafe_setup option to dispynode to run setup and cleanup functions in main dispynode process, as used to be the case until 4.10.0 release.
- Fixed SharedJobCluster so it accepts a remote address as scheduler_node parameter.
- Improved process termination when client cancels jobs and when clean option is used to terminate processes in prior run. If psutil module is available (which is strongly recommended), dispynode makes additional checks to make sure processes being terminated are from previous dispynode run (when clean option is used).
- Version 4.9.1 released (2018-07-26)
dispy version 4.9.1 has been released. In this version:
- Fixed host name resolution when netifaces module is not available.
- Changed option listen_port for dispynetrelay to relay_port.
- Version 4.9.0 released (2018-07-16)
dispy version 4.9.0 has been released. In this release:
- Added secret option and SSL support for dispynetrelay so it communicates only with schedulers that use same settings with those options.
- Version 4.8.9 released (2018-07-09)
dispy version 4.8.9 has been released. In this version:
- Fixed processing of ext_ip_addr option so it works even if netifaces module is available; in 4.8.8 this option didn't work if netifaces module was available.
- Added ipv4_udp_multicast and daemon options to dispynetrelay.
- Version 4.8.8 released (2018-07-05)
dispy version 4.8.8 has been released. In this version:
- Added ipv4_udp_multicast option to control whether to use multicast or broadcast for UDP (to discover nodes) with IPv4.
- Fixed shutdown so it cleans up completely, including shutting down pycos. Now it is possible to shutdown dispy and start another dispy (not just another cluster) in the same program.
- Set multicast group address to a distinct form (in earlier releases this was a simple, generic form that may interfere with other devices on the network).
- Version 4.8.7 released (2018-05-08)
dispy version 4.8.7 has been released. Changes since last release are:
- Fixed shutdown method to delete fault recovery file when SharedJobCluster is used (and client doesn't crash).
- In some cases with IPv6 sockets, attribute IPV6_V6ONLY is not defined, causing crash during start; this is now fixed.
- Fixed parsing of ip_addrs and ext_ip_addrs fields saved in config file with dispynode.
- Fixed dispynode so changes to os.environ by one client don't interfere with other clients.
- Version 4.8.6 released (2018-04-19)
dispy version 4.8.6 has been released. In this version:
- Added deallocate_node and close_node methods to cluster.
- Added controls to web interface to allocate / deallocate / close node(s).
- Added "release" command to dispynode to release from current client if client is not active.
- Version 4.8.5 released (2018-03-26)
dispy version 4.8.5 has been released. In this version:
- Fixed dispyscheduler crash when started in 'daemon' mode.
- When a node is rediscovered (i.e., after a node is lost and comes alive), use it again for any active clusters.
- When jobs are rescheduled or abandoned after a node is lost, call 'status' function (if provided) with appropriate job argument.
- Fixed issue with binding to UDP with Windows (issue #111).
- Version 4.8.4 released (2018-01-10)
dispy version 4.8.4 has been released. In this version:
- Fixed client crash when SharedJobCluster is closed with Python 3.
- Improved warning messages when client's UDP server binding to its port fails.
- Version 4.8.3 released (2017-08-01)
dispy version 4.8.3 has been released. In this version:
- Fixed parsing of ext_ip_addr option to dispynode.
- Fixed recover_jobs function (which can be used for fault-recovery in case client crashes or loses network connection to nodes).
- recover_file option to recover_jobs is now optional; if this option is not given, latest file with prefix _dispy_ is used (default fault-recovery files are of the form _dispy_*).
- Version 4.8.2 released (2017-06-27)
dispy version 4.8.2 has been released. In this version:
- License has been changed to Apache 2.0 (from MIT).
- If netifaces is not available, appropriate address is used for binding to UDP for broadcasting.
- Version 4.8.1 released (2017-06-07)
- dispy version 4.8.1 has been released. In this version crash with dispyscheduler (due to conflicts of names of variables and methods with change from asyncoro to pycos) has been fixed.
- Version 4.8.0 released (2017-05-31)
- dispy version 4.8.0 has been released. In this release asyncoro module has been replaced with pycos module.
- Version 4.7.7 released (2017-05-24)
- dispy version 4.7.7 has been released. This version fixes crash of dispyscheduler when starting.
- dispy 4.7.6 released (2017-05-03)
- dispy version 4.7.6 has been released. In this version httpd and dispynetrelay have been fixed to support IPv6.
- dispy 4.7.5 release (2017-04-20)
dispy version 4.7.5 has been released. In this version:
- Fixed IPv6 for Windows and OS X. IPv6 with Python 2.7 under Windows needs package win_inet_pton (it is not required for Python 3.6+). For IPv6 under OS X, package netifaces is required. Even if not required with other platforms, it is recommended to install netifaces to select suitable interface from multiple configured interfaces.
- Fixed dispyscheduler's httpd functionality (to monitor computations scheduled with SharedJobScheduler)
- Version 4.7.4 released (2017-04-06)
dispy version 4.7.4 has been released. In this version:
- Fixed (default) host name resolution under IPv6.
- Path and directory names in unicode are supported.
- Added show_job_args parameter to httpd module to control whether job arguments should be shown in browser or not.
- Version 4.7.3 released (2017-03-20)
dispy version 4.7.3 has been released. In this version:
- Fixed fault recovery of jobs (with dispy.recover_jobs function) after client crashes / loses connection to nodes.
- Fixed parsing and initializing configuration (with --config) saved in file by dispyscheduler.
- Version 4.7.2 released (2017-03-15)
dispy version 4.7.2 has been released. In this version:
- Fixed httpd module crash when viewing jobs (in 'Node' page).
- Added timeout optional parameter to wait and close methods of JobCluster and SharedJobCluster. timeout, if given, is maximum time in seconds to wait for all pending jobs to complete.
- Added terminate optional parameter to close. If it is set to True any pending jobs are cancelled/terminated before cluster is closed.
- Added save_config and config options to dispyscheduler to save and load configuration parameters in a file.
- Added ping_interval option to dispynode to send periodic messages to discover nodes.
- Version 4.7.1 released (2017-01-18)
dispy version 4.7.1 has been released. In this version:
- SSL has been fixed. The fix is in asyncoro 4.4.0 version. If dispy is installed with pip, asyncoro will automatically be upgraded. Otherwise, please upgrade asyncoro to latest version.
- Python function computations can now refer to 'dispy_node_name' and 'dispy_node_ip_addr'. These give name and IP address of node where computation is being executed.
- dispynode will remove any files generated by computation and left behind when computation is closed. Earlier such files were left behind by dispynode, taking up disk space.
- dispyscheduler has new option cleanup_nodes that, if set (to True), will cause dispynode to clean up (i.e., remove any files transferred and generated by computations) even if computations set cleanup=False. This can be used when nodes are being shared by multiple users, so that computations don't leave behind any files, which may take up disk space.
- Added support for IPv6. It is strongly recommended that netifaces module is installed to use IPv6.
- Version 4.7.0 released (2016-12-20)
dispy version 4.7.0 has been released. In this version:
- Jobs (DispyJob instances) created by cluster.submit no longer have args and kwargs attributes. These attributes used to store parameters used in job creation (i.e., parameters to cluster.submit). When many jobs are created and not freed as soon as a job is finished, it is likely that the memory used to store such arguments may cause problems at the client / scheduler. Now these arguments are still stored in DispyJob instances, but are cleared by scheduler as soon as possible (e.g., when job is finished / terminated). If access to parameters used in job creation is necessary, then they can be maintained in client program using id attribute of job. In any case, when submitting large number of jobs, consider using bounded_submit.py to schedule only enough jobs to keep the processors busy and not all at once (which can cause memory issues).
- When jobs are terminated (killed) at nodes, their status attribute is now set to DispyJob.Terminated. In the past few releases jobs may not always be terminated and even if they did, their status may indicate jobs finished and not terminated.
- Version 4.6.18 released (2016-11-30)
dispy version 4.6.18 has been released. In this version:
- Fixed dispy client and scheduler to use add_cluster etc. as coroutines instead of regular functions, as they use iterators shared with other coroutines. This prevents potential crashes when shared data structures are updated by other coroutines.
- Fixed issue https://github.com/pgiri/dispy/issues/51 with transferring large files with OS X. The fix is actually in asyncoro version 4.3.3. If dispy is upgraded with pip, asyncoro will be upgraded automatically; otherwise, please upgrade asyncoro.
- Version 4.6.17 released (2016-09-28)
dispy version 4.6.17 has been released. In this release:
- NodeAllocate's allocate method is now called with additional parameter platform, which is value returned by platform.platform() on the node. This can be used to filter / allocate nodes depending on platform when cluster consists of nodes with different platforms.
- Fixed dispynode to terminate when 'quit' or 'exit' commands are given.
- Version 4.6.16 released (2016-08-16)
dispy version 4.6.16 has been released. In this release:
- Added --client_shutdown option to dispynode. If this option is given, client program can call dispynode_shutdown() in cleanup function to shutdown dispynode.
- --save_config option in dispynode now takes filename argument to save configuration in. In earlier releases --config option (which is used to load configuration) had to be specified to give the filename to save configuration.
- dispynode shutdown improved - if dispynode is terminated (killed or interrupted with Ctrl+C), signal handler cleans up before quitting. In earlier releases (4.6.15 in particular), dispynode left behind some files that prevented next dispynode to start.
- Fixed a race condition with dispyscheduler and client. In earlier releases if a client submitted jobs that took very little time, the scheduler may create new jobs with same ID as some job that finished at dispyscheduler, but not finished at client, causing client to ignore such jobs.
- Version 4.6.15 released (2016-07-19)
dispy version 4.6.15 has been released. In this version:
- Modules and files transferred to remote servers are saved with same relative paths as at client. With this, modules with multiple files, and submodules etc. can be sent with depends options. The paths are preserved only for files relative to client's current working directory; files with paths not under client's working directory don't preserve paths, as saving them at remote server with such paths may not be possible (due to permissions), or unwarranted.
- Saving and restoring configuration to initialize options to dispynode (to avoid having to give options every time / on every node) have been fixed.
- Version 4.6.14 released (2016-06-09)
dispy version 4.6.14 has been released. In this version:
- Job arguments are stored only in the DispyJob instance. In earlier versions another copy of arguments was stored in _DispyJob_ internal structure of dispy scheduler. When job arguments are large in space usage (which can happen when large arrays or lists are sent as job arguments), this required almost double the space, as these structures are kept in memory at least until jobs are finished. In this version arguments in _DispyJob_ internal structure refer to arguments in DispyJob structure (instead of saving a serialized copy of arguments). Note that in many examples submitted jobs are kept in a list and processed one after another and the list is never cleared. This is acceptable for simple cases, but when submitting large structures as arguments, it may be necessary to dispose of DispyJob instance, or at least clear job.args and job.kwargs attributes as soon as job is done (e.g., with callback feature) to avoid issues with memory usage.
- Added bounded_submits.py example to illustrate how to submit enough jobs to keep processors busy, but not all at once, which can consume large amount of memory, especially if the job arguments are large structures, as explained above. The example can be easily customized as appropriate for specific cluster.
- Under some circumstances dispy scheduler may leave connections open, causing failures with "too many files open" issue. This has now been fixed. Thanks to Stylianos Kyriacou for pointing out this issue.
- Logging with 'error' or 'warning' levels is fixed; this was broken in 4.0 release. The fix actually is in asyncoro 4.1. If dispy is upgraded with pip, asyncoro will be upgraded as well; otherwise, please upgrade to asyncoro-4.1 (due to other changes, dispy-4.6.14 will require asyncoro-4.1).
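The bounded submission idea from the 4.6.14 notes (keep processors busy without creating every job up front) can be sketched without dispy. The code below is not the example shipped with dispy: it uses a thread pool from the standard library as a stand-in for a cluster, and `bounded_submit`, `workers` and `max_pending` are names chosen for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def bounded_submit(func, args_list, workers=2, max_pending=4):
    """Run func over args_list, keeping at most max_pending jobs outstanding."""
    slots = threading.BoundedSemaphore(max_pending)
    lock = threading.Lock()
    results = {}
    stats = {'pending': 0, 'peak': 0}

    def on_done(job_id, future):
        try:
            with lock:
                stats['pending'] -= 1
                results[job_id] = future.result()
        finally:
            slots.release()  # lets the submitting loop create one more job

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for job_id, arg in enumerate(args_list, start=1):
            slots.acquire()  # blocks while max_pending jobs are outstanding
            with lock:
                stats['pending'] += 1
                stats['peak'] = max(stats['peak'], stats['pending'])
            future = pool.submit(func, arg)
            future.add_done_callback(
                lambda f, job_id=job_id: on_done(job_id, f))
    # Leaving the 'with' block waits for all outstanding jobs to finish.
    return results, stats['peak']


def square(x):
    return x * x


results, peak = bounded_submit(square, range(20), workers=2, max_pending=4)
```

Results are keyed by a job id assigned at submission, so nothing but the id and the result needs to stay in memory once a job completes.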
- Version 4.6.13 released (2016-04-18)
dispy version 4.6.13 has been released. Following are changes since last release:
- Default value for MaxFileSize, which is maximum size of file that can be transferred to dispynode (server) or dispyscheduler, has been changed from 10MB to 0, which means there is now no limit on the size of files to transfer with the default value. This can be changed with --max_file_size n option (to dispynode and dispyscheduler) to use n as maximum file size allowed.
- By default dispynode removes any files transferred from client when computation is done. However, any files generated by computation were being left behind. Now, all files transferred / generated by computation are removed, unless cleanup is set to False (then it is up to clients to remove the files and directories).
- Swap space information is added to DispyNodeAvailInfo so clients can use it to monitor node / application performance. This information is also shown in web clients when httpd is used.
- Version 4.6.12 released (2016-03-21)
- dispy version 4.6.12 has been released. This release adds support for using psutil module to frequently gather and send node availability status information (available CPU as percent, memory in bytes and disk space in bytes) to cluster_status callback. This information is also shown in web browser (if httpd is used). The information is also sent to NodeAllocate's allocate method so clients can filter available nodes depending on computation's requirements.
- Version 4.6.11 released (2016-03-16)
dispy version 4.6.11 has been released. In this release:
- dispy_job_depends keyword argument is supported for computations that are programs. dispy_job_depends can be used to send job-specific file(s). Until now this was supported only for computations that are Python functions.
- Fixed dispynode to work with Python 3 under Windows. It seems multiprocessing.Process waits for reading line(s) under Python 3 under Windows if main program reads standard input, so new jobs won't start until 'Enter' is pressed a couple of times. For now the solution is to not support input commands with Python 3 under Windows; i.e., dispynode works as a daemon even if it is started from command line.
- Fixed rescheduling of jobs when nodes are detected as zombies or when nodes are restarted.
- Fixed node_setup.py example to work under Windows.
- Version 4.6.10 released (2016-03-07)
- dispy version 4.6.10 has been released. This release fixes initializing and closing nodes so a node is not initialized more than once (which then causes node not to respond to future computations due to a spurious computation still pending) and closing nodes is finished before client quits (otherwise the node may not be ready yet for new client if it starts quickly).
- dispy 4.6.9 release (2016-03-02)
dispy version 4.6.9 has been released. In this version:
- cooperative option has been added to dispyscheduler.py. If this option is given, or if client cluster is marked exclusive, the client(s) can update CPUs of node(s) (otherwise, clients won't be allowed to change CPUs, as this may prevent other clients from running their jobs). If node CPUs are changed by exclusive cluster and cooperative option is not used with dispyscheduler, the CPUs are reset to how they were before that cluster started. If CPUs are changed due to cooperative option, though, it is up to client clusters to cooperate and set/reset CPUs.
- dispynode now removes modules loaded by computations from sys.modules when they are done, so that previous computation's modules are not used by new computations (because they were cached in sys.modules).
- dispy 4.6.8 release (2016-02-15)
- dispy version 4.6.8 has been released. This version implements 3 scheduling algorithms for jobs in dispyscheduler (scheduler for SharedJobCluster):
- fair_cluster_scheduler first picks a node with least load (i.e., jobs running divided by CPUs available), then among the clusters that can use that node chooses the cluster that was least recently chosen (i.e., has been pending the longest since a job was last run for it). It then schedules earliest job created for that cluster on that node.
- early_cluster_scheduler first picks a node with least load, then among the clusters that can use that node chooses cluster that was created earliest (i.e., client created SharedJobCluster earliest). It then schedules earliest job created for that cluster on that node.
The default is to choose earliest created job among all clusters that can use the node with least load.
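The selection rule described for fair_cluster_scheduler can be written out as a small function. This is only a sketch of the description above, not dispyscheduler's implementation: the dictionaries used for nodes and clusters are hypothetical, and skipping nodes with no free CPU is an added assumption.

```python
def fair_cluster_pick(nodes, clusters):
    """Pick (node, cluster, job) following the fair_cluster_scheduler notes.

    nodes:    list of dicts with 'name', 'busy' (jobs running) and 'cpus'
    clusters: list of dicts with 'name', 'nodes' (names it may use),
              'last_run' (time a job was last scheduled for it) and
              'jobs' (pending job ids, oldest first)
    These structures are hypothetical, not dispyscheduler's own.
    """
    # Assumption: a node with every CPU busy cannot take another job.
    free = [n for n in nodes if n['busy'] < n['cpus']]
    if not free:
        return None
    # Least load = jobs running divided by CPUs available.
    node = min(free, key=lambda n: n['busy'] / n['cpus'])
    usable = [c for c in clusters if node['name'] in c['nodes'] and c['jobs']]
    if not usable:
        return None  # a real scheduler would likely try the next node
    # Least recently chosen cluster, then its earliest created job.
    cluster = min(usable, key=lambda c: c['last_run'])
    return node['name'], cluster['name'], cluster['jobs'][0]


nodes = [
    {'name': 'n1', 'busy': 2, 'cpus': 4},
    {'name': 'n2', 'busy': 1, 'cpus': 4},
]
clusters = [
    {'name': 'A', 'nodes': {'n1', 'n2'}, 'last_run': 10.0, 'jobs': [3, 4]},
    {'name': 'B', 'nodes': {'n2'}, 'last_run': 5.0, 'jobs': [7]},
]
choice = fair_cluster_pick(nodes, clusters)
```

Replacing the `last_run` key with a cluster creation time would give the early_cluster_scheduler rule described above.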
- dispy 4.6.7 release (2016-01-29)
- dispy version 4.6.7 has been released. This version fixes initialization of SharedJobCluster client (broken in 4.6.6 release).
- dispy 4.6.6 release (2016-01-27)
dispy version 4.6.6 has been released. Following is short summary of changes since last release:
- In the last couple of releases, node discovery messages were sent from the UDP server at a point when the TCP server may not have been ready to receive responses from nodes, causing the client to not detect some nodes; it may have worked most of the time, but due to concurrency of coroutines, this was not always guaranteed. Now discovery messages are sent only after both UDP and TCP servers are ready to process responses from nodes.
- In previous releases the computations were sent to nodes sequentially. Especially with SharedJobCluster, sending large computations to many nodes may cause unnecessary delays. Now nodes are set up concurrently.
- If servers don't accept a computation (due to issues with computation, for example), appropriate error message is sent back to the client so the issue can be fixed; earlier, error code -1 was sent, which doesn't help to understand the reasons.
- dispy 4.6.5 release (2016-01-22)
- dispy version 4.6.5 has been released. This version fixes timeout issues when transferring large files. In earlier releases MsgTimeout (defined as global variable in dispy module) could be used to adjust (from default of 5 seconds) to avoid timeout error. However, the client could send large amount of data before waiting for nodes to acknowledge the transfer, so adjusting MsgTimeout may not always work. Now MsgTimeout is for transferring at most 1MB of data.
- dispy 4.6.4 release (2016-01-13)
dispy version 4.6.4 has been released. In this version:
- Fixed sending pulse messages (issue #25) from client to dispyscheduler (used with SharedJobCluster).
- Scheduler uses TCP exclusively to contact nodes; UDP is used only for broadcasting and listeners will respond with TCP. This addresses a few issues: If scheduler is multi-homed and using ext_ip_addr, the nodes reply back on TCP and only the working configuration will be received by the scheduler so it can use that information in further communication. TCP works with port forwarding, is less likely to be blocked by a firewall etc.
- This release now depends on asyncoro 3.6.7 release, which fixes many issues with socket errors when used with SSL. If dispy is installed/upgraded with pip from PyPI, then asyncoro will be upgraded as well.
- dispy 4.6.3 release (2015-12-29)
dispy version 4.6.3 has been released. In this version:
- Added Dockerfile to build docker images to run dispynode program in containers. This fully isolates dispy so executing arbitrary programs does not affect host operating system.
- Added serve option to dispynode program to quit after serving given number of clients. This option can be used in conjunction with docker images to create same environment for every client.
- If dispynode program is started as background process or daemon option is given, standard input is not read and program won't block waiting to read. Otherwise, commands can be entered to stop and restart the service, and to change CPUs used for executing computations.
- Changed SharedJobCluster to use port 51347 (which is default port for JobCluster) as default. Earlier versions used any unused (random) port so both the client and dispyscheduler (which also uses port 51347) can be run on same computer. However, using a random port makes setting up firewall difficult. To run both client and dispyscheduler on same computer now, port=0 (or any other specific port) can be passed to SharedJobCluster.
- dispy 4.6.2 release (2015-12-14)
dispy version 4.6.2 has been released. The changes since last release are:
- Fixed transferring job dependencies (specified with dispy_job_depends) when using with SharedJobCluster.
- Fixed dispyscheduler so it can now be used with separate "secret"s for nodes and clusters; i.e., options node_secret and cluster_secret can be different.
- dispy 4.6.1 release (2015-12-07)
dispy version 4.6.1 has been released. The changes since last release are:
- Fixed SharedJobCluster in submitting jobs with arguments (broken in 4.6.0 release).
- SharedJobCluster now uses one port for getting all replies from (remote) scheduler. By giving specific port, and port forwarding (e.g., with SSH), scheduler in remote network can now be used.
- dispy 4.6.0 release (2015-10-28)
dispy version 4.6.0 has been released. The changes since last release are:
- Added service_start, service_stop and service_end options to dispynode to allow a node to run jobs during specific hours of the day.
- Added save_config and config options to dispynode to save given options in a file and use that file to start dispynode in all machines in a cluster with those options (instead of having to list options for each run).
- Added submit_node option to JobCluster and SharedJobCluster to schedule a job to given node. This method, along with cluster_status callback can be used to implement custom job schedulers in the client itself.
- Added exclusive option to SharedJobCluster. If this option is set to True, that cluster is executed exclusively, without sharing nodes with other clusters.
- An issue since 4.5.3 release with cluster not detecting nodes (apparently occurring with Google Cloud) has been fixed; thanks to Stylianos Kyriacou.
- dispy 4.5.5 release (2015-09-08)
- dispy version 4.5.5 has been released. This version fixes Issue #21 - Using same cluster more than once.
- dispy 4.5.4 release (2015-08-13)
- dispy version 4.5.4 has been released. This version supports setup and cleanup parameters under Windows with some limitations (compared to Linux, OS X and other Posix systems, where there are no limitations); these should be Python functions or partial functions. See JobCluster and node_setup.py and node_shvars.py examples for more details on limitations with Windows.
- dispy 4.5.3 release (2015-08-03)
- dispy version 4.5.3 has been released. This version supports passing class instances in client to servers with Python 3 under Windows. Processes started with multiprocessing in Python 3 under Windows use __mp_main__ namespace, so user defined code is executed in that namespace so unpickling/deserialization of objects works.
- dispy 4.5.2 release (2015-07-23)
- dispy version 4.5.2 has been released. Version 4.5 introduced a feature to keep a computation's global variables in a (dictionary) variable in that computation. This feature works with Linux, OS X and other Unix variants, but not with Windows. The implementation of this feature broke Windows, as the variable couldn't be sent as argument to multiprocessing.Process. Version 4.5.2 fixes this issue so dispynode works with Windows again.
- dispy 4.5.1 release (2015-07-08)
- dispy version 4.5.1 release fixes an issue in version 4.5 with loading httpd module in dispyscheduler.
- dispy 4.5 release (2015-06-30)
Short summary of changes in version 4.5:
- Fixed dispyscheduler to import httpd module.
- dispynode stores global variables of each computation in its own namespace. With this, if more than one cluster (from same scheduler) is being executed, global variables initialized by one computation don't interfere with those of another. Moreover, a computation doesn't need to worry about having to remove global variables in 'cleanup' - they will be thrown away when computation is closed.
- Fixed 'del_cluster' to remove cluster from all the nodes before removing cluster. Otherwise, because 'yield' is used to close node, if multiple clusters are used, a removed cluster can be accessed by scheduler through node's list of clusters. Thanks to Nikola Knezevic.
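The idea of keeping each computation's global variables in its own namespace can be illustrated with plain Python. This is a minimal sketch of the general technique, not dispynode's code; run_computation and the sample sources are made up for the example.

```python
def run_computation(source):
    """Execute source with its own dictionary as global namespace."""
    namespace = {'__name__': '__computation__'}
    exec(compile(source, '<computation>', 'exec'), namespace)
    return namespace


# Two computations that both use a global variable named 'counter'.
first = run_computation(
    "counter = 1\n"
    "def bump():\n"
    "    global counter\n"
    "    counter += 1\n"
    "    return counter\n"
)
second = run_computation("counter = 100\n")

# 'bump' modifies 'counter' only in the namespace it was defined in.
first['bump']()

# Discarding the dictionary throws away all of that computation's globals,
# so no explicit cleanup of variables is needed.
del second
```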
- dispy 4.4 release (2015-06-07)
Short summary of changes in version 4.4:
- 'setup' and 'cleanup' can be partial (Python) functions (in addition to being normal functions). This way nodes can be set up with different functions (i.e., partial functions can be called with different arguments for different nodes).
- httpd module and web pages support multiple clusters in the same program. The web pages show each cluster separately, as well as a special cluster that combines information for all clusters in a cluster named '* Combined'.
- dispyscheduler program supports '--httpd' option. When this option is given, httpd server is started so all the clusters currently using the scheduler can be monitored in web browser. The names of clusters have ' @ <client ip>' appended.
- Added 'timeout' and 'terminate_pending' to 'recover_jobs' function in dispy so fault recovery can be used to close the nodes, if necessary.
- 'nodes' parameter for JobCluster and SharedJobCluster can be given as a list of 'NodeAllocate' objects. This gives more flexibility to allocate nodes, as well as customize allocation (for example, to use fewer CPUs during specific time interval).
- dispynode and dispyscheduler programs have '--msg_timeout' option to adjust socket timeout used to send/receive messages. Default is 5 seconds. If necessary, for example, to send large files over slow network, the timeout can be adjusted with this option. dispy module also has MsgTimeout variable that can be adjusted in the client program appropriately.
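To illustrate the partial functions mentioned above for 'setup' and 'cleanup', the sketch below binds different arguments to the same function with functools.partial. The node_setup function, its parameters and the file names are hypothetical; how the resulting callables are passed to a cluster is described in the dispy documentation and is not shown here.

```python
import functools


def node_setup(data_file, cpus=1):
    """Hypothetical setup function; it just reports what it was asked to set up."""
    return {'data_file': data_file, 'cpus': cpus}


# The same function bound to different arguments; each result is a callable
# that takes no further arguments.
setup_small = functools.partial(node_setup, 'small.dat', cpus=2)
setup_large = functools.partial(node_setup, 'large.dat', cpus=8)
```

Calling setup_small() is equivalent to calling node_setup('small.dat', cpus=2), so one function definition can serve several differently configured nodes.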
- dispy 4.3 released (2015-05-25)
- dispy version 4.3 has been released. Most of the changes since last release are fixes. The one change is simplification of fault recovery. Now the clients always store recovery information and, in case of crash, function recover_jobs in dispy module can be used to retrieve the results of jobs and release the nodes. See Fault Recovery for more details.
- dispy 4.2 released (2015-05-07)
dispy version 4.2 has been released. The changes are:
- Fixes issue with distributing computations as programs (Bug 11: https://sourceforge.net/p/dispy/bugs/11/).
- Added 'close_node' function in dispy module so client can close currently running computations on the node after recovering from lost connection / crash. See doc strings for 'close_node' on how to use it.
- Added '--clean' option to dispynode and dispyscheduler. If this option is given, dispynode and dispyscheduler remove any files that are left behind by previous runs. The files are now stored by dispynode in /tmp/dispy/node and by dispyscheduler in /tmp/dispy/scheduler.
- dispy 4.1 released (2015-05-04)
dispy version 4.1 has been released with following changes from version 4.0:
- httpd module now includes interface to manage cluster (changing CPUs when JobCluster is used, adding nodes), monitoring jobs on nodes, terminating jobs etc.
- httpd can be used with SSL so https protocol can be used to monitor and manage cluster with web browsers.
- Using httpd module is slightly different from the way it was in version 4.0; see httpd module and example.
- dispy 4.0 released (2015-04-21)
dispy version 4.0 has been released. In this release:
- cluster_status parameter has been added to JobCluster and SharedJobCluster. This should be set to a function or method. Whenever there is a change in status of a job (such as created, submitted to a node, finished executing etc.) or a change in status of a node (initialized for given cluster, closed), given function is called. See dispy page for details.
- It includes httpd module that can be used to easily create a web interface to view cluster status. See View Cluster Status for information.
- asyncoro module is no longer included in dispy package; it must be installed separately. If Python's package index (PyPI / pip) is used for installing dispy, asyncoro will be automatically installed from PyPI. If dispy is installed manually, then asyncoro must be installed at appropriate dist-packages.
- dispy package includes basic documentation (in text form) and a few examples. If installed with PyPI, these are stored under the directory where dispy module is stored (e.g., /usr/local/lib/python2.7/dist-packages/dispy).
- dispy 3.22 released (2015-03-24)
- dispy version 3.22 has been released. With this release, dependencies can be distributed for each job (in addition to distributing dependencies for the whole computation). Job's dependencies (functions, classes, modules and files if computations are Python functions, and modules and files if computations are standalone programs) are sent with the job to the node and removed after the job is done. Job's dependencies should be given as a list to 'cluster.submit()' when creating a job. See Cluster methods for details.
- (This feature was available in earlier releases, probably from the first release, but with some limitations, so was not documented.)
- dispy 3.21 released (2015-03-10)
- dispy version 3.21 has been released. This version adds support for computations to transfer files to client.
- dispy 3.20 released (2014-11-21)
- dispy version 3.20 has been released. This version fixes SSL setup so dispy, dispynode and dispyscheduler can be configured to use SSL certificates as documented.
- Version 3.19 released (2014-11-04)
dispy version 3.19 has been released. This release:
- Fixes issue with 'print' function for Python 2.7 version (in dispy 3.18, print function from __future__ was imported, which made any user program using the standard print statement invalid),
- asyncoro has been upgraded to latest release 3.1,
- Packaging/installing with pypi for Python 3.1+ has been fixed.
- Version 3.18 released (2014-10-24)
- dispy version 3.18 has been released. In this version dispyscheduler.py (used with SharedJobCluster) has been fixed to close the socket used in transferring files.
- Version 3.17 released (2014-10-16)
- dispy version 3.17 has been released. This release adds 'setup' parameter to JobCluster and SharedJobCluster. If given, this must be a Python function which is run on a node before running any jobs. The parameter 'cleanup' can also be a function, which is run at a node after the computation is done. See dispy and Examples for details.
- Version 3.16 released (2014-10-07)
dispy version 3.16 has been released. This release fixes following issues:
- dispynode executes programs with subprocess.Popen without 'shell=True' to avoid security issues, as explained in subprocess reference. This apparently doesn't work under Windows when Python programs are executed, as Windows command shell is needed to invoke Python interpreter to execute the given program. This is fixed in this release by invoking the interpreter directly (without using 'shell=True' option).
- dispyscheduler for Python 2.7+ in dispy-3.15 was broken due to a typo - submitting jobs failed. dispyscheduler for Python 3.1+ didn't have this issue.
- dispyscheduler now works with VPN / NAT / Masquerading when behind firewalls.
- dispy-3.16 package includes files for both Python 2.7+(under 'py2' directory) and Python 3.1+ (under 'py3' directory). Earlier releases had two separate packages for these versions.
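The approach of invoking the interpreter directly, without 'shell=True', can be sketched with the standard subprocess module as below. This shows the general technique only and is not dispynode's code; run_python_program and demo are names made up for the example.

```python
import os
import subprocess
import sys
import tempfile


def run_python_program(program_path, args=()):
    """Run a Python program by invoking the interpreter directly, without a shell."""
    # Passing a list (and no shell=True) means no command shell parses the
    # arguments, so they can't be interpreted as shell syntax.
    cmd = [sys.executable, program_path] + list(args)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, out, err


def demo():
    """Write a tiny program to a temporary directory and run it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'hello.py')
        with open(path, 'w') as fd:
            fd.write("print('hello')\n")
        return run_python_program(path)
```

Since sys.executable is the full path of the running interpreter, this does not depend on the operating system associating '.py' files with Python.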
- Version 3.15 released (2014-06-27)
- dispy version 3.15 has been released. This version fixes crash with dispyscheduler in Windows. Files are now copied to tempfile.gettempdir() instead of '/tmp'.
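A portable way to choose such a directory is sketched below; the 'dispy' subdirectory name and the work_dir function are only illustrations, not necessarily what dispyscheduler uses.

```python
import os
import tempfile


def work_dir(name='dispy'):
    """Return a directory path under the platform's temporary directory."""
    # tempfile.gettempdir() honors TMPDIR / TEMP / TMP and works on Windows,
    # whereas a hard-coded '/tmp' exists only on Unix-like systems.
    return os.path.join(tempfile.gettempdir(), name)
```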
- Version 3.14 released (2014-06-15)
- dispy version 3.14 has been released. This version fixes socket timeout error processing in Windows (in asyncoro module).
- Version 3.13 released (2014-06-10)
- dispy version 3.13 has been released. This version fixes an issue with processing provisional results. asyncoro has also been updated to latest version.
- Note that asyncoro project now supports distributed / parallel computing (among other features) where computations are distributed to nodes, as done by dispy. With asyncoro the computation tasks and client(s) can communicate using message passing, which is not possible with dispy (other than sending provisional results). However, asyncoro doesn't provide job scheduling. See 'discoro_client.py' in asyncoro's files for an example.
- Version 3.12 released (2014-04-21)
dispy version 3.12 has been released. The major changes since version 3.11 are:
- With JobCluster, multiple network interfaces on the client computer are now supported:
- If ip_addr parameter is None (default), all interfaces are used.
- If ip_addr is a string (name or IP address), only that interface is used.
- If ip_addr is a list of strings, then each interface corresponding to that list is used.
- Similarly, ext_ip_addr can be either a string or list of strings.
- SharedJobCluster has extra parameter scheduler_port. This should be the port used by dispyscheduler (if different from default value 51349).
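The three accepted forms of ip_addr listed above amount to a small normalization step, sketched below. The select_interfaces function and the 'available' argument (the addresses of all interfaces on the client) are hypothetical and only illustrate the rule; they are not part of dispy's API.

```python
def select_interfaces(ip_addr, available):
    """Return the list of addresses to use for the given ip_addr value."""
    if ip_addr is None:
        # Default: use every interface on the client.
        return list(available)
    if isinstance(ip_addr, str):
        # A single name or IP address: use only that interface.
        return [ip_addr]
    # Otherwise assume an iterable of names / IP addresses.
    return [str(addr) for addr in ip_addr]
```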
- Version 3.11 released (2014-03-17)
dispy version 3.11 has been released. Summary of changes since version 3.10:
- ssh tunneling (remote port forwarding through ssh) supported. It can be used to work with nodes in remote networks. See Examples for details.
- Added 'polling_interval' option. This can be used if nodes can't connect to client (i.e., gateway/firewall prevents connections and there is no way to setup port forwarding, even with ssh tunneling). However, this is inefficient, so it must be used only when needed and if number of nodes is small.
- Added setup script for installing dispy. The script is provided by Roger-Bermudez Chacon.
- Updated asyncoro to latest release (1.5).
- Version 3.10 released (2013-10-05)
dispy version 3.10 has been released. In this release:
- Fixed 'callback' processing. In dispy-3.9 callback processing was broken (dispy would raise exception after calling callback).
- Fixed 'dest_path' processing in dispynode. Until dispy-3.9 (not sure since when) if 'dest_path' option is used and the given directory already exists on the node, dispynode still tries to create it, causing exception.
- Fixed a typo in dispyscheduler (used in case of SharedJobCluster), which would prevent sending 'ping' messages.
- Version 3.7 released (2013-07-16)
- dispy version 3.7 has been released. This release fixes job cancel with SharedJobCluster and race condition when shutting down clients (both JobCluster and SharedJobCluster).
- Version 3.6 released (2012-10-24)
- dispy version 3.6 has been released. In this release asyncoro has been updated to latest release (1.3).
- Version 3.5 released (2012-08-15)
- dispy version 3.5 has been released. This release changes license to MIT license and asyncoro module has been updated to latest release (1.2).
- Version 3.4 released (2012-07-23)
- dispy version 3.4 has been released. This version fixes occasional deadlock/potential crash issues during dispy shutdown (the fix is in implementation of Condition locking primitive in asyncoro, so if you use asyncoro, please update to asyncoro version 1.1), and crashes with dispyscheduler (shared scheduler).
- Updated dispy 3.3 with latest asyncoro (2012-06-16)
- dispy version 3.3 distribution files have been updated with latest asyncoro (mostly improved support for distributed coroutines) from project http://asyncoro.sourceforge.net
- Fix in dispyscheduler (2012-06-11)
- dispy-3.3 has been updated with a fix to dispyscheduler, thanks to an anonymous contributor (https://sourceforge.net/tracker/?func=d ... id=2189569).
- dispy version 3.3 (2012-06-11)
- dispy version 3.3 has been released. The major change since 3.2 is that asyncoro (from http://asyncoro.sourceforge.net) now supports exchanging messages to/from coroutines/channels over network so that distributed coroutines can communicate. These features are not documented (yet) in the HTML files, but doc strings should help understand how to use the methods. I am releasing now in the hope that others find current implementation useful/interesting. dispy itself does not use the new features, so it should work same as before.
- Distributed coroutines (asyncoro) (2012-06-08)
- This is not exactly dispy news, but related: As you may know, asyncoro (http://asyncoro.sourceforge.net) is used to implement dispy. asyncoro supports coroutines to exchange messages. I just added to asyncoro package 'dasyncoro.py' which implements support for coroutines running on different instances of asyncoro to exchange messages. With this, distributed coroutines/applications can be implemented. This is in early stages, so there will be some flux in both features and implementation. If you are interested in this, let me know.
- asyncoro updated (2012-06-02)
- Yet another update to dispy-3.2: Earlier, channels are added to asyncoro, so coroutines can send/receive messages over channels. Now channels have 'deliver' method that waits until at least 'n' coroutines can receive the message. With this, sender is guaranteed a message is delivered to at least 'n' receivers.
- Channels are not (yet) documented in the HTML page, but documented in the code - I am still experimenting with channels. In addition, I am thinking of extending asyncoro for distributed programming, so messages can be exchanged not just within a single instance of asyncoro, but also with other asyncoro instances over network. At some point in future, this can be extended to execute distributed coroutines, monitor them, etc., to facilitate distributed, fault-tolerant coroutines/applications. Thus, distributed asyncoro will be a generalization of most of the features of dispy.
- Updated dispy 3.2 (2012-05-24)
- Updated dispy version 3.2 with following changes: slots are used in DispyJob and DispyJob for space savings, asyncoro now has channels so messages can be broadcast (e.g., to many coroutines).
- dispy version 3.2 (2012-05-22)
- dispy version 3.2 has been released. In this release, slots are used in Coro and AsynCoroSocket for space savings. With Coro alone, memory consumption goes down by about 30%. This version also fixes couple of SSL issues in asyncoro, dispynode and dispyscheduler.
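For readers unfamiliar with slots, the sketch below shows the mechanism in plain Python. The classes are made up for the example (they are not asyncoro's Coro), and the actual savings depend on the Python version and the number of attributes.

```python
class PlainTask:
    """Regular class: every instance carries a per-instance __dict__."""

    def __init__(self, name, state):
        self.name = name
        self.state = state


class SlottedTask:
    """Class with __slots__: attributes are stored in fixed slots, no __dict__."""

    __slots__ = ('name', 'state')

    def __init__(self, name, state):
        self.name = name
        self.state = state
```

The trade-off is that instances of the slotted class can't be given attributes other than those listed in __slots__; attempting it raises AttributeError.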
- asyncoro with monitoring coroutines (2012-05-21)
- Updated dispy-3.1 with latest asyncoro. It now supports monitoring coroutines that get notification(s) about other coroutines when they finish or are terminated. With monitor, hot swap and restart features, fault-tolerant coroutines/applications can be developed.
- Until now send/receive methods to exchange messages to coroutines were synonyms for resume/suspend. This has now been changed so send/receive are asynchronous and suspend/resume are synchronous; i.e., sending a message to a coroutine that is currently not waiting queues the message for later delivery, whereas resume must be called on a suspended coroutine, otherwise resume is ignored.
See updated documentation and examples at asyncoro project http://asyncoro.sourceforge.net for details.
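The difference between the two pairs of methods described above can be modelled with a toy class. This is only a model of the stated semantics (queued messages versus resume being ignored unless the coroutine is suspended); ToyCoro is made up for the example and is not asyncoro's Coro class.

```python
from collections import deque


class ToyCoro:
    """Toy model of send/receive (asynchronous) vs suspend/resume (synchronous)."""

    def __init__(self):
        self.mailbox = deque()
        self.suspended = False
        self.resumed_with = None

    def send(self, msg):
        # Asynchronous: the message is queued even if nobody is waiting for it.
        self.mailbox.append(msg)

    def receive(self):
        # Return the oldest queued message, or None if there is none yet.
        return self.mailbox.popleft() if self.mailbox else None

    def suspend(self):
        self.suspended = True

    def resume(self, value=None):
        # Synchronous: has an effect only if the coroutine is suspended.
        if not self.suspended:
            return False  # ignored
        self.suspended = False
        self.resumed_with = value
        return True
```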
- dispy version 3.1 (2012-05-14)
- dispy version 3.1 is released. This version adds 'ext_ip_addr' option to dispy, dispynode and dispyscheduler. With this option, dispy, dispynode and dispyscheduler will work when they are behind a NAT firewall/gateway, provided appropriate ports are forwarded. This also works with Amazon EC2 cloud computing service (e.g., to use dispy client from outside EC2 and EC2 for additional processing units). See documentation for details.
- Bug fix in asyncoro (2012-05-12)
- There was an issue with asyncoro when receiving large messages over slow connections. I updated dispy-3.0 with fix.
- asyncoro with hot swapping and messages (2012-05-09)
- I updated dispy-3.0 files with latest asyncoro framework at http://asyncoro.sourceforge.net . The new features since last update are: Added facility to hot swap coroutine's generator function, added send/receive methods so coroutines can exchange data as messages.
- With hot swapping, specific coroutines' code can be upgraded while the application is running. See updated documentation for details on these features.
- asyncoro project (2012-04-28)
- I just refreshed dispy-3.0 with documentation updates. I also removed asyncoro files from downloads section (although asyncoro is included in dispy distribution files). Please note that there is now separate asyncoro project at http://asyncoro.sourceforge.net if you are interested in asyncoro.
- Marked "stable" (2012-04-24)
- FWIW, I have marked dispy 3.0 "stable". dispy and asyncoro grew in features and complexity more than I expected. I use only some of the features, so likely there are issues, but I expect the issues to be in the implementation that can be easily fixed. If you find an issue, please post in one of the forums or submit bug report.
- dispy version 3.0 (2012-04-23)
- dispy version 3.0 has been released. Most of the changes are in asyncoro (asynchronous framework used for dispy).
- asyncoro in this version supports full duplex communication; earlier versions supported only half duplex - earlier, at any time a socket could have either read operation or write operation pending, whereas now one coroutine may be waiting on read operation on a socket and another coroutine may be waiting on write operation on same socket.
- asyncoro uses suspend/resume facility to implement asynchronous I/O operations. Earlier versions used same interface for this. So user programs can inadvertently resume a coroutine that is waiting for I/O operation, causing failures. In the new version asyncoro uses different interface for asynchronous I/O operations to prevent this.