How to delete files with a Python script from a FTP server which are older than 7 days?
Solution 1
OK. Assuming your FTP server supports the MLSD command, make a module with the following code (this is code from a script I use to sync a remote FTP site with a local directory):
module code
# for Python >= 2.6
import calendar, posixpath, time
import collections

FTPDir= collections.namedtuple("FTPDir", "name size mtime tree")
FTPFile= collections.namedtuple("FTPFile", "name size mtime")

class FTPDirectory(object):
    def __init__(self, path='.'):
        self.dirs= []
        self.files= []
        self.path= path

    def getdata(self, ftpobj):
        ftpobj.retrlines('MLSD', self.addline)

    def addline(self, line):
        # an MLSD line looks like "fact=value;fact=value; name"
        data, _, name= line.partition('; ')
        entry_type= None
        size= 0  # some servers omit the size fact, e.g. for directories
        mtime= None
        for field in data.split(';'):
            field_name, _, field_value= field.partition('=')
            field_name= field_name.lower()  # some servers send "Type", "Modify", ...
            if field_name == 'type':
                entry_type= field_value.lower()
            elif field_name in ('sizd', 'size'):
                size= int(field_value)
            elif field_name == 'modify':
                # MLSD timestamps are UTC; drop any fractional seconds
                mtime= calendar.timegm(time.strptime(field_value[:14], "%Y%m%d%H%M%S"))
        if entry_type == 'dir':
            # FTP paths use '/', so join with posixpath rather than os.path
            subtree= self.__class__(posixpath.join(self.path, name))
            self.dirs.append(FTPDir(name, size, mtime, subtree))
        elif entry_type == 'file' and mtime is not None:
            self.files.append(FTPFile(name, size, mtime))
        # anything else (cdir, pdir, entries without a modify fact) is skipped

    def walk(self):
        for ftpfile in self.files:
            yield self.path, ftpfile
        for ftpdir in self.dirs:
            for path, ftpfile in ftpdir.tree.walk():
                yield path, ftpfile

class FTPTree(FTPDirectory):
    def getdata(self, ftpobj):
        super(FTPTree, self).getdata(ftpobj)
        for dirname in self.dirs:
            ftpobj.cwd(dirname.name)
            dirname.tree.getdata(ftpobj)
            ftpobj.cwd('..')
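The module only works if the server understands MLSD; a server without it answers with a 500 error, as one of the comments below reports. As a rough sketch, support can be checked up front by asking the server for its feature list with FEAT. The helper name supports_mlsd is made up for this example, and it assumes the server lists MLST (or MLSD) as a feature line in its FEAT reply, which most servers that implement MLSD do:

```python
import ftplib

def supports_mlsd(ftp):
    """Return True if the server's FEAT reply advertises MLST/MLSD."""
    try:
        reply = ftp.sendcmd('FEAT')
    except ftplib.error_perm:
        # 5xx reply: the server does not implement FEAT at all
        return False
    # Feature lines sit between the first and the last line of the reply.
    for line in reply.splitlines()[1:-1]:
        feature = line.strip().upper().partition(' ')[0]
        if feature in ('MLST', 'MLSD'):
            return True
    return False
```

Call it right after logging in, e.g. supports_mlsd(site), and fall back to parsing LIST output (see Solution 3) if it returns False.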
single directory case
If you want to work on the files of a directory, you can:
import ftplib, time
# assumes the module code above was saved as ftptool.py (the filename is only an example)
from ftptool import FTPDirectory

quite_old= time.time() - 7*86400 # seven days
site= ftplib.FTP(hostname, username, password)
site.cwd(the_directory_to_work_on) # if it's '.', you can skip this line
folder= FTPDirectory()
folder.getdata(site) # get the filenames
for path, ftpfile in folder.walk():
    if ftpfile.mtime < quite_old:
        site.delete(ftpfile.name)
This should do what you want.
a directory and its descendants
Now, if this should work recursively, you'll have to make the following two changes to the code for the “single directory case”:
folder= FTPTree()
and
site.delete(posixpath.join(path, ftpfile.name)) # needs "import posixpath"; FTP paths use '/' on every platform
Possible caveat
The servers I've worked with didn't have any issues with relative paths in the STOR and DELE commands, so site.delete with a relative path worked too. If your FTP server requires pathless filenames, you should first .cwd to the path provided, .delete the plain ftpfile.name and then .cwd back to the base folder.
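The cwd / delete / cwd-back sequence described above could look roughly like this. It is only a sketch: the helper name delete_without_path is invented for the example, and site is assumed to be a logged-in ftplib.FTP object.

```python
def delete_without_path(site, path, filename):
    """Delete `filename` inside `path` using a pathless DELE command."""
    base = site.pwd()        # remember the base folder
    site.cwd(path)           # move into the folder that holds the file
    try:
        site.delete(filename)
    finally:
        site.cwd(base)       # always return to the base folder
```

In the recursive loop, site.delete(...) would then be replaced by delete_without_path(site, path, ftpfile.name).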
Solution 2
I had to do this and it took a while, so I thought I could save someone's time here. We are using Python with the ftputil module installed:
#! /usr/bin/python
import time
import ftputil

host = ftputil.FTPHost('ftphost.com', 'username', 'password')
mypath = 'ftp_dir'
now = time.time()
host.chdir(mypath)
names = host.listdir(host.curdir)
for name in names:
    # older than seven days?
    if host.path.getmtime(name) < (now - (7 * 86400)):
        if host.path.isfile(name):
            host.remove(name)

print('Closing FTP connection')
host.close()
Solution 3
OK, well rather than analyze the code you have posted any further, here's an example instead that might put you on the right track.
from ftplib import FTP
import re

# captures the date column (e.g. "May 16 13:47") and the file name
pattern = r'.* ([A-Za-z].. .. .....) (.*)'

def callback(line):
    found = re.match(pattern, line)
    if found is not None:
        print(found.groups())

ftp = FTP('myserver.wherever.com')
ftp.login('elvis','presley')
ftp.cwd('testing123')
ftp.retrlines('LIST',callback)
ftp.close()
del ftp
Run it and you'll get output something like this, which should be a start towards what you're trying to achieve. To finish it out you'd need to parse the first result into a datetime, compare it with "now" and use ftp.delete() to get rid of the remote file if it's too old.
>>>
('May 16 13:47', 'Thumbs.db')
('Feb 16 17:47', 'docs')
('Feb 23 2007', 'marvin')
('May 08 2009', 'notes')
('Aug 04 2009', 'other')
('Feb 11 18:24', 'ppp.xml')
('Jan 20 2010', 'reports')
('Oct 10 2005', 'transition')
>>>
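To turn the date strings from the output above into something comparable with "now", a sketch along these lines might help. The function name parse_list_date is made up for this example; it assumes the two formats shown above ("Mon DD HH:MM" for recent files, "Mon DD YYYY" for older ones) and English month abbreviations. Other servers may format LIST output differently.

```python
from datetime import datetime, timedelta

def parse_list_date(text, now=None):
    """Parse 'May 16 13:47' or 'Feb 23 2007' into a datetime."""
    now = now or datetime.now()
    month, day, last = text.split()
    if ':' not in last:
        # "Mon DD YYYY": the year is given explicitly
        return datetime.strptime("%s %s %s" % (month, day, last), "%b %d %Y")
    # "Mon DD HH:MM": the year is implied; try this year, then last year
    for year in (now.year, now.year - 1):
        try:
            stamp = datetime.strptime(
                "%d %s %s %s" % (year, month, day, last), "%Y %b %d %H:%M")
        except ValueError:
            continue  # e.g. Feb 29 in a non-leap year
        if stamp <= now + timedelta(days=1):
            return stamp
    raise ValueError("cannot parse LIST date: %r" % text)
```

A file is then "too old" when parse_list_date(date_text) < datetime.now() - timedelta(days=7), at which point ftp.delete(name) can be called for it.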
Tom
From time to time I have to develop some scripts to deliver IT solutions.
Updated on July 09, 2022
Comments
-
Tom 11 months
I would like to write a Python script which allows me to delete files from an FTP server after they have reached a certain age. I prepared the script below, but it throws the error message:
WindowsError: [Error 3] The system cannot find the path specified: '/test123/*.*'
Does anyone have an idea how to resolve this issue? Thank you in advance!
import os, time
from ftplib import FTP
ftp = FTP('127.0.0.1')
print "Automated FTP Maintainance"
print 'Logging in.'
ftp.login('admin', 'admin')
# This is the directory that we want to go to
path = 'test123'
print 'Changing to:' + path
ftp.cwd(path)
files = ftp.retrlines('LIST')
print 'List of Files:' + files
#--everything works fine until here!...
#--The Logic which shall delete the files after the are 7 days old--
now = time.time()
for f in os.listdir(path):
    if os.stat(f).st_mtime < now - 7 * 86400:
        if os.path.isfile(f):
            os.remove(os.path.join(path, f))
except:
    exit ("Cannot delete files")
print 'Closing FTP connection'
ftp.close()
-
SilentGhost about 13 years
What is os.directory? Your code makes very little sense. Why are you trying to delete files from your local system?
-
Tom about 13 years
Yes, but it has to run on Windows, so shell/bash is not an option in this case.
-
-
eemz about 13 years
Note, however, that different FTP servers format the output of the LIST command differently, so you may have to modify the regular expression to match the one you're using.
-
Tom about 13 years
Hi, it is running on Windows 2003 Server, and it currently connects to a test FTP server which is running on Windows XP.
-
Tom about 13 years
No, it should change into the directory "test123" and then delete every file in it that is older than 7 days. The machine reports that it cannot find the directory.
-
Tom about 13 years
Hi, thank you for your answer. I will try to modify my code accordingly.
-
Tom almost 13 years
Hi ΤΖΩΤΖΙΟΥ, thank you for your idea, it looks very good to me. I have tried it out, and I had to modify the code slightly, but I get an error message:
site= ftplib.FTP('127.0.0.1, admin, admin')
File "C:\Python26\lib\ftplib.py", line 116, in __init__
self.connect(host)
File "C:\Python26\lib\ftplib.py", line 131, in connect
self.sock = socket.create_connection((self.host, self.port), self.timeout)
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno 11001] getaddrinfo failed
-
Tom almost 13 years
import os, time, FTP_AUTO
from ftplib import FTP
quite_old= time.time() - 7*86400 # seven days
# C:\Temp\ftp\test123
site= ftplib.FTP('127.0.0.1, admin, admin')
site.cwd(test123) # if it's '.', you can skip this line
folder= FTPDirectory()
print folder
folder.getdata(site) # get the filenames
for path, ftpfile in folder.walk():
    if ftpfile.mtime < quite_old:
        site.delete(ftpfile.name)
-
Ishbir almost 13 years
@Tom: '127.0.0.1, admin, admin' is not a valid hostname; that's what the error is about. You probably meant '127.0.0.1', 'admin', 'admin' in your code.
-
Tom almost 13 years
Thank you, the connection is now working. But the system stated that:
File "G:/MY_TCS/!!PROJECTS/Q3/FTP_auto_del/python/ftp_del.py", line 6, in <module>
folder= FTPDirectory()
NameError: name 'FTPDirectory' is not defined
-
Ishbir almost 13 years
@Tom: how did you name my module? Did you import it at the start of ftp_del.py? If you saved my code as, say, ftptool.py, then at the start of ftp_del.py you should import ftptool and later have the classes prefixed with the module name, e.g. folder = ftptool.FTPDirectory(). ISTM you need to read the Python tutorial first; it's like you lack basic knowledge about Python.
-
Tom almost 13 years
Hi ΤΖΩΤΖΙΟΥ, I named your module "FTP_dir" in that case and imported it as you mentioned. Now it seems to work! The old files are deleted from my test FTP server; now I will try it in the production environment. Thank you very much for your assistance and help! It responds in the console with <FTP_dir.FTPDirectory object at 0x00B6E590>. All looks GOOD!
-
Tom almost 13 years
It worked in the test environment (Windows-based FileZilla Server), but in the production environment I get the error: ftplib.error_perm: 500 Cannot understand 'MLSD'. Would there be a workaround for this issue? Can the provider just switch the "MLSD" command on?
-
SilentSteel almost 10 years
This is terrific code! Some things: @Tom, MLSD was only standardized in 2007 (RFC 3659), so you might need to update your FTP server. It was introduced because every FTP server used a different format for LIST. NOTE: in the function addline, you should convert field_name to lowercase, because there are servers such as ServU that return uppercase field names: field_name = field_name.lower()
-
Sérgio over 9 years
I like the solution; it is very easy, but it isn't complete: we still have to deal with dates, etc.
-
David over 6 years
I had to turn the field_name into lower case, as the FTP server was returning Type, Modify etc. and this checks for type, modify etc.