How do I detect if a thread died, and then restart it?
Solution 1
You could put a try/except around the code where you expect it to crash (if that can be anywhere, wrap the whole run function) and keep an indicator variable that holds the thread's status.
So something like the following:
import random
import threading
import time


class MyThread(threading.Thread):
    # Status codes, shared by all instances
    RUNNING = 0
    FINISHED_OK = 1
    STOPPED = 2
    CRASHED = 3

    def __init__(self, pass_value):
        super(MyThread, self).__init__()
        self.running = False
        self.value = pass_value
        self.status = self.STOPPED

    def run(self):
        self.running = True
        self.status = self.RUNNING
        try:
            while self.running:
                time.sleep(0.25)
                rand = random.randint(0, 10)
                print('%s %s %s' % (threading.current_thread().name, rand, self.value))
                if rand == 4:
                    raise ValueError('Returned 4!')
            self.status = self.FINISHED_OK
        except Exception:
            # Record the crash so the monitoring loop can replace this thread
            self.status = self.CRASHED
        finally:
            self.running = False
Then you can use your loop:
while True:
    # Create a copy of our groups to iterate over,
    # so that we can delete dead threads if needed
    for group in (group1, group2):
        for m in group[:]:
            if m.status == m.CRASHED:
                # A finished thread cannot be started again, so build
                # a fresh one with the same value and start it
                replacement = MyThread(m.value)
                replacement.start()
                group.remove(m)
                group.append(replacement)
    time.sleep(5.0)
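As a variation on the status flag, Python 3.8 and later also provide threading.excepthook, which is called whenever a thread exits with an uncaught exception. The following is only a sketch under that version assumption; the names crashed, crashed_lock, record_crash and worker are hypothetical and not part of the original answer.

```python
import threading

# Hypothetical registry: names of threads that died from an uncaught exception
crashed = set()
crashed_lock = threading.Lock()


def record_crash(args):
    # args.thread is the Thread that raised (it can be None in rare cases)
    name = args.thread.name if args.thread is not None else '<unknown>'
    with crashed_lock:
        crashed.add(name)


# Installed process-wide; replaces the default hook that prints the traceback
threading.excepthook = record_crash


def worker(fail):
    # Stand-in for real work; raises to simulate a crash
    if fail:
        raise ValueError('Returned 4!')


good = threading.Thread(target=worker, args=(False,), name='good')
bad = threading.Thread(target=worker, args=(True,), name='bad')
for t in (good, bad):
    t.start()
for t in (good, bad):
    t.join()
```

After both joins, crashed holds only the name of the thread that raised, so a monitoring loop could consult it instead of a per-thread status attribute. Note that the hook is global, so it sees crashes from every thread in the process.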
Solution 2
I had a similar issue and stumbled across this question. I found that join takes a timeout argument, and that is_alive will return False once the thread is joined. So my audit for each thread is:
def check_thread_alive(thr):
    thr.join(timeout=0.0)
    return thr.is_alive()
This detects thread death for me.
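Building on that check, a restart step might look like the sketch below. It assumes Python 3; restart_dead, make_thread, crash and the stop event are hypothetical names used only for illustration. The key point is that a finished Thread object cannot be started again, so a new one has to be created and started.

```python
import threading


def crash():
    # Stand-in for a worker that dies, e.g. from a network error
    raise ValueError('Returned 4!')


def restart_dead(threads, make_thread):
    """Replace finished threads in place and return how many were restarted."""
    restarted = 0
    for i, thr in enumerate(threads):
        if not thr.is_alive():
            replacement = make_thread()
            replacement.start()  # must be a new Thread object
            threads[i] = replacement
            restarted += 1
    return restarted


stop = threading.Event()
threads = [threading.Thread(target=crash), threading.Thread(target=stop.wait)]
for t in threads:
    t.start()
threads[0].join()  # wait until the crashing thread has finished

count = restart_dead(threads, lambda: threading.Thread(target=stop.wait))

stop.set()  # let the remaining workers exit
for t in threads:
    t.join()
```

Here only the crashed thread is replaced, while the healthy one keeps running. Be aware that is_alive() is also False for a thread that was created but never started, so this helper should only be given threads that have already been started.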
NewGuy
Updated on June 04, 2022
NewGuy almost 2 years
I have an application that fires up a series of threads. Occasionally, one of these threads dies (usually due to a network problem). How can I properly detect a thread crash and restart just that thread? Here is example code:
import random
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, pass_value):
        super(MyThread, self).__init__()
        self.running = False
        self.value = pass_value

    def run(self):
        self.running = True
        while self.running:
            time.sleep(0.25)
            rand = random.randint(0,10)
            print threading.current_thread().name, rand, self.value
            if rand == 4:
                raise ValueError('Returned 4!')

if __name__ == '__main__':
    group1 = []
    group2 = []
    for g in range(4):
        group1.append(MyThread(g))
        group2.append(MyThread(g+20))
    for m in group1:
        m.start()
    print "Now start second wave..."
    for p in group2:
        p.start()
In this example, I start 4 threads, then I start 4 more threads. Each thread randomly generates an int between 0 and 10. If that int is 4, it raises an exception. Notice that I don't join the threads. I want both the group1 and group2 lists of threads to be running. I found that if I joined the threads, it would wait until the thread terminated. My thread is supposed to be a daemon process, and thus should rarely (if ever) hit the ValueError exception this example code is showing and should be running constantly. By joining it, the next set of threads doesn't begin.
How can I detect that a specific thread died and restart just that one thread?
I have attempted the following loop right after my for p in group2 loop.
while True:
    # Create a copy of our groups to iterate over,
    # so that we can delete dead threads if needed
    for m in group1[:]:
        if not m.isAlive():
            group1.remove(m)
            group1.append(MyThread(1))
    for m in group2[:]:
        if not m.isAlive():
            group2.remove(m)
            group2.append(MyThread(500))
    time.sleep(5.0)
I took this method from this question.
The problem with this is that isAlive() seems to always return True, because the threads never restart.
Edit
Would it be more appropriate in this situation to use multiprocessing? I found this tutorial. Is it more appropriate to have separate processes if I am going to need to restart the process? It seems that restarting a thread is difficult.
It was mentioned in the comments that I should check is_active() against the thread. I don't see this mentioned in the documentation, but I do see the isAlive that I am currently using. As I mentioned above, though, this returns True, thus I'm never able to see that a thread has died.