Writing to a log file from many threads at the same time in Python
Solution 1
You can simply create your own locking mechanism to ensure that only one thread is ever writing to a file.
import threading
import wmi  # third-party WMI module used in the original question

lock = threading.Lock()

def write_to_file(f, text, file_size):
    lock.acquire()  # thread blocks at this line until it can obtain the lock
    try:
        # in this section, only one thread can be present at a time
        print(text, file_size, file=f)
    finally:
        lock.release()

def filesize(asset):
    f = open("results.txt", 'a+')
    c = wmi.WMI(asset)
    wql = 'SELECT FileSize,Name FROM CIM_DataFile where (Drive="D:" OR Drive="E:") and Caption like "%file%"'
    for item in c.query(wql):
        write_to_file(f, item.Name.split("\\")[2].strip().upper(), str(item.FileSize))
You may want to consider placing the lock around the entire for item in c.query(wql): loop, so that each thread does a larger chunk of work before releasing the lock.
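As a minimal sketch of that coarser-grained locking (the data and the in-memory output stream here are illustrative stand-ins for the WMI query and the results file), each thread holds the lock for its whole batch, so rows from different threads never interleave:

```python
import io
import threading

lock = threading.Lock()

def process_items(f, items):
    # Hold the lock for the entire loop: one thread writes its whole
    # batch of rows before any other thread may write.
    with lock:
        for name, size in items:
            print(name, size, file=f)

out = io.StringIO()
t1 = threading.Thread(target=process_items, args=(out, [("A.TXT", 10), ("B.TXT", 20)]))
t2 = threading.Thread(target=process_items, args=(out, [("C.TXT", 30)]))
t1.start(); t2.start()
t1.join(); t2.join()
lines = out.getvalue().splitlines()
```

The trade-off is less concurrency while writing, in exchange for output that stays grouped per thread.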
Solution 2
print is not thread safe. Use the logging module instead (which is):
import logging
import threading
import time

FORMAT = '[%(levelname)s] (%(threadName)-10s) %(message)s'
logging.basicConfig(level=logging.DEBUG, format=FORMAT)

file_handler = logging.FileHandler('results.log')
file_handler.setFormatter(logging.Formatter(FORMAT))
logging.getLogger().addHandler(file_handler)

def worker():
    logging.info('Starting')
    time.sleep(2)
    logging.info('Exiting')

t1 = threading.Thread(target=worker)
t2 = threading.Thread(target=worker)
t1.start()
t2.start()
Output (and contents of results.log):
[INFO] (Thread-1 ) Starting
[INFO] (Thread-2 ) Starting
[INFO] (Thread-1 ) Exiting
[INFO] (Thread-2 ) Exiting
Instead of using the default name (Thread-n), you can set your own name with the name keyword argument, which the %(threadName)s formatting directive will then use:
t = threading.Thread(name="My worker thread", target=worker)
(This example was adapted from Doug Hellmann's excellent article about the threading module.)
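To see that concurrent logging calls stay intact, you can point a handler at an in-memory stream and hammer it from several threads; every record comes out as a complete line because each handler serializes emit calls with its own internal lock. (The logger name, thread names, and message counts below are illustrative.)

```python
import io
import logging
import threading

stream = io.StringIO()
logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep records out of the root logger
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('[%(levelname)s] (%(threadName)-10s) %(message)s'))
logger.addHandler(handler)

def worker():
    for i in range(100):
        logger.info('message %d', i)

threads = [threading.Thread(target=worker, name='worker-%d' % n) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 4 threads x 100 messages, each on its own intact line
lines = stream.getvalue().splitlines()
```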
Solution 3
For another solution, use a Pool to calculate the data, returning it to the parent process. The parent then writes all the data to the file. Since only one process writes to the file at a time, there's no need for additional locking.
Note that the following uses a pool of processes, not threads. This makes the code much simpler and easier than putting something together using the threading module. (There is a ThreadPool object, but it's not documented.)
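For reference, that ThreadPool lives in multiprocessing.pool and mirrors the Pool API, so swapping threads in for processes is a one-line change; a minimal sketch with a toy work function:

```python
from multiprocessing.pool import ThreadPool

def square(n):
    return n * n

pool = ThreadPool(4)  # 4 worker threads instead of worker processes
# imap_unordered yields results as workers finish, in arbitrary order
results = sorted(pool.imap_unordered(square, range(5)))
pool.close()
pool.join()
print(results)  # [0, 1, 4, 9, 16]
```

Threads avoid the process-spawn overhead but share the GIL, so this pays off mainly for I/O-bound work such as querying remote machines.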
import glob, os, time
from multiprocessing import Pool

def filesize(path):
    time.sleep(0.1)
    return (path, os.path.getsize(path))

if __name__ == '__main__':
    paths = glob.glob('*.py')
    pool = Pool()  # default: one worker process per CPU
    with open("results.txt", 'w+') as dataf:
        for (apath, asize) in pool.imap_unordered(filesize, paths):
            print(apath, asize, file=dataf)
Output in results.txt:
zwrap.py 122
usercustomize.py 38
tpending.py 2345
msimple4.py 385
parse2.py 499
user3515946
Updated on July 09, 2022

Comments
-
user3515946, almost 2 years ago

I am writing a script to retrieve WMI info from many computers at the same time and then write this info to a text file:

    f = open("results.txt", 'w+')  ## to clean the results file before the start

    def filesize(asset):
        f = open("results.txt", 'a+')
        c = wmi.WMI(asset)
        wql = 'SELECT FileSize,Name FROM CIM_DataFile where (Drive="D:" OR Drive="E:") and Caption like "%file%"'
        for item in c.query(wql):
            print >> f, item.Name.split("\\")[2].strip().upper(), str(item.FileSize)

    class myThread(threading.Thread):
        def __init__(self, name):
            threading.Thread.__init__(self)
            self.name = name
        def run(self):
            pythoncom.CoInitialize()
            print "Starting " + self.name
            filesize(self.name)
            print "Exiting " + self.name

    thread1 = myThread('10.24.2.31')
    thread2 = myThread('10.24.2.32')
    thread3 = myThread('10.24.2.33')
    thread4 = myThread('10.24.2.34')
    thread1.start()
    thread2.start()
    thread3.start()
    thread4.start()

The problem is that all the threads write at the same time.
-
George L, over 8 years ago: If a thread tries to call write_to_file while the file is locked by another thread, will both still get their turn to write?
-
Martin Konecny, over 8 years ago: Yes, when the thread that holds the lock releases it, the waiting thread will obtain the lock.
-
ProGrammer, almost 7 years ago: @MartinKonecny I am using this acquire-and-release method in a Python script where 4 threads write to one file simultaneously. It neatly aligns the tasks and prevents writing errors. I am working on a second (completely separate) script that writes to the same file as the first. When both run at the same time, does the lock implemented in the first script also prevent simultaneous access by the second script (where it isn't implemented)? I.e., does this method lock the file for every script attempting to access it at that time? ...curious, but the documentation is unclear.
-
Martin Konecny, almost 7 years ago: Nope, this lock will only coordinate threads running in the same script; a separate script is a separate process and is unaffected by it.
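To coordinate writers across separate processes you need an OS-level file lock rather than a threading.Lock. A minimal Unix-only sketch using fcntl.flock follows (on Windows you would use msvcrt.locking or a third-party package such as portalocker; the helper name and file path are illustrative):

```python
import fcntl  # Unix-only advisory file locking
import os
import tempfile

def append_line(path, text):
    # flock takes an OS-level lock on the file: any other process that
    # calls flock() on the same file blocks until this one releases it.
    with open(path, 'a') as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(text + '\n')
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

path = os.path.join(tempfile.gettempdir(), 'demo_results.txt')
append_line(path, 'FILE.TXT 2345')
```

Note that flock locks are advisory: every cooperating script must take the lock for the scheme to work.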
-
Mayur, over 4 years ago: Why do some people use with lock: around the operation?
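with lock: is the context-manager form of the same lock: it acquires on entry and releases on exit even if the body raises, so it is shorthand for acquire() followed by try/finally release(). A small self-contained sketch:

```python
import threading

lock = threading.Lock()
counter = 0

def increment():
    global counter
    # Equivalent to:
    #   lock.acquire()
    #   try:
    #       counter += 1
    #   finally:
    #       lock.release()
    with lock:
        counter += 1

threads = [threading.Thread(target=increment) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The with form is preferred because a forgotten or skipped release() (e.g. after an exception) would leave the lock held forever and deadlock the other threads.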