No status_task Message When Node Dropped Due To Connection Errors

Posted: Tue May 28, 2024 4:48 am
by OptrixAU
There doesn't seem to be a status_task message when a node is removed due to a time-out.

-------------

I've got a long-running task (i.e. it never finishes). The client system A creates the task on remote node B.

If I type 'terminate' into B, my client (A) gets StatusUpdates for each of the nodes and servers disconnecting - except for the one that is running the long task. In the log on the client program, I get...

dispycos - <A>: peer <B> terminated

for every taskless/free node, but I get...

dispycos - Waiting for 1 remote tasks at <B> to finish

...for the node that is still running the task. After some time, I get the

dispycos - too many connection errors to <B>; removing it
dispycos - 192.168.137.1:9705: peer <B> terminated

log entries, but no matching messages to my status_task function.

For consistency, wouldn't it be a good idea to send a message to status_task when the node is dropped/terminated?
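
In the meantime, a client-side workaround might look something like the sketch below. It uses no dispycos calls: NodeWatchdog, heard_from, forget and check are hypothetical names, and it assumes the long-running task sends some periodic heartbeat back to the client, since otherwise a busy node is silent by design. A node that stays quiet for longer than the timeout is reported once through a callback, standing in for the status message that never arrives.

```python
import time


class NodeWatchdog:
    """Client-side bookkeeping; not part of dispycos (all names are hypothetical)."""

    def __init__(self, timeout, on_dropped, clock=time.monotonic):
        self.timeout = timeout        # seconds of silence before a node is presumed gone
        self.on_dropped = on_dropped  # callback(node), invoked once per dropped node
        self.clock = clock            # injectable so the logic can be tested without sleeping
        self.last_seen = {}           # node address -> time of the last message from it

    def heard_from(self, node):
        # Call whenever a status message or task heartbeat mentions `node`.
        self.last_seen[node] = self.clock()

    def forget(self, node):
        # Call when a normal "node closed" status arrives, so no synthetic event fires.
        self.last_seen.pop(node, None)

    def check(self):
        # Call periodically; reports every node silent for longer than `timeout`.
        now = self.clock()
        dropped = [n for n, t in self.last_seen.items() if now - t > self.timeout]
        for node in dropped:
            del self.last_seen[node]
            self.on_dropped(node)
        return dropped
```

The idea would be to call heard_from() from the status_task handler and from whatever receives the heartbeats, and to call check() on a timer. The timeout should be comfortably longer than the heartbeat interval to avoid false alarms.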

Re: No status_task Message When Node Dropped Due To Connection Errors

Posted: Thu Jul 11, 2024 5:38 pm
by jhzheng_fzu
When you run dispynode.py on a computer, you can pass the parameter --zombie_interval=**** (e.g. 999999). This means the computing node will stay connected to the cluster for up to 999999 seconds.