How to delete files older than 7 days from an FTP server with a Python script?


Solution 1

OK. Assuming your FTP server supports the MLSD command, make a module with the following code (this is code from a script I use to sync a remote FTP site with a local directory):

module code

# for python ≥ 2.6
import calendar, posixpath, time
import collections

FTPDir = collections.namedtuple("FTPDir", "name size mtime tree")
FTPFile = collections.namedtuple("FTPFile", "name size mtime")

class FTPDirectory(object):
    def __init__(self, path='.'):
        self.dirs = []
        self.files = []
        self.path = path

    def getdata(self, ftpobj):
        # MLSD lists the current remote directory, one entry per line
        ftpobj.retrlines('MLSD', self.addline)

    def addline(self, line):
        # each line looks like "fact=value;fact=value; name"
        data, _, name = line.partition('; ')
        # defaults if the server omits a fact; undated entries count as new
        kind, size, mtime = None, 0, time.time()
        for field in data.split(';'):
            field_name, _, field_value = field.partition('=')
            field_name = field_name.lower()  # some servers send "Type", "Modify"
            if field_name == 'type':
                kind = field_value.lower()
            elif field_name in ('sizd', 'size'):
                size = int(field_value)
            elif field_name == 'modify':
                # MLSD timestamps are UTC; drop optional fractional seconds
                mtime = calendar.timegm(time.strptime(field_value[:14], "%Y%m%d%H%M%S"))
        if kind == 'file':
            self.files.append(FTPFile(name, size, mtime))
        elif kind == 'dir':
            # remote paths always use '/', so join with posixpath, not os.path
            self.dirs.append(FTPDir(name, size, mtime, self.__class__(posixpath.join(self.path, name))))
        # other types (cdir, pdir) are the directory itself or its parent: skip them

    def walk(self):
        # yield (directory path, FTPFile) for this directory, then for subdirectories
        for ftpfile in self.files:
            yield self.path, ftpfile
        for ftpdir in self.dirs:
            for path, ftpfile in ftpdir.tree.walk():
                yield path, ftpfile

class FTPTree(FTPDirectory):
    def getdata(self, ftpobj):
        super(FTPTree, self).getdata(ftpobj)
        # descend into every subdirectory and list it as well
        for dirname in self.dirs:
            ftpobj.cwd(dirname.name)
            dirname.tree.getdata(ftpobj)
            ftpobj.cwd('..')
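
To make the parsing done in addline more concrete, here is a minimal standalone sketch of how a single MLSD response line can be taken apart. The names parse_mlsd_line and mlsd_mtime and the sample line are illustrative only and are not part of the module above. The sketch assumes the RFC 3659 layout "fact=value;fact=value; name" and treats the modify fact as UTC.

```python
import calendar
import time


def parse_mlsd_line(line):
    """Split one MLSD line into (name, facts) with lowercased fact names."""
    data, _, name = line.partition('; ')
    facts = {}
    for field in data.split(';'):
        if not field:
            continue
        key, _, value = field.partition('=')
        facts[key.lower()] = value
    return name, facts


def mlsd_mtime(facts):
    """Convert the 'modify' fact (UTC, YYYYMMDDHHMMSS[.sss]) to epoch seconds."""
    stamp = facts['modify'][:14]  # drop optional fractional seconds
    return calendar.timegm(time.strptime(stamp, "%Y%m%d%H%M%S"))


# Hypothetical sample line with mixed-case fact names, as some servers send them
name, facts = parse_mlsd_line("Type=file;Size=1024;Modify=20230516134700; my report.txt")
print(name, facts['type'], facts['size'], mlsd_mtime(facts))
```

Lowercasing the fact names up front means the later comparisons against 'type', 'size' and 'modify' work regardless of how the server capitalises them.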

single directory case

To delete old files from a single directory, save the module above (as ftptool.py, say) and run:

import ftplib, time
from ftptool import FTPDirectory  # assumes the module above was saved as ftptool.py
quite_old = time.time() - 7*86400  # seven days ago
site = ftplib.FTP(hostname, username, password)
site.cwd(the_directory_to_work_on)  # if it's '.', you can skip this line
folder = FTPDirectory()
folder.getdata(site)  # get the filenames
for path, ftpfile in folder.walk():
    if ftpfile.mtime < quite_old:
        site.delete(ftpfile.name)
site.quit()

This should do what you want.
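
On Python 3.3 or later, ftplib itself provides FTP.mlsd(), which parses the MLSD output into (name, facts) pairs, so the single-directory case can probably be handled without a custom parser. The following is a sketch under that assumption. The helper name delete_older_than and its parameters are made up for illustration, and it still requires a server that supports MLSD and reports the modify fact.

```python
import calendar
import time


def delete_older_than(site, max_age_days=7, now=None):
    """Delete plain files in the current remote directory older than max_age_days.

    `site` is expected to be a connected ftplib.FTP object (Python 3.3+).
    Returns the list of deleted names.
    """
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * 86400
    deleted = []
    # Collect the whole listing before issuing any DELE commands.
    for name, facts in list(site.mlsd()):
        if facts.get('type') != 'file' or 'modify' not in facts:
            continue  # skip directories, cdir/pdir entries and undated items
        stamp = facts['modify'][:14]  # ignore optional fractional seconds
        mtime = calendar.timegm(time.strptime(stamp, "%Y%m%d%H%M%S"))
        if mtime < cutoff:
            site.delete(name)
            deleted.append(name)
    return deleted
```

With a connection opened as in the snippet above, this would be called as delete_older_than(site) after site.cwd(...) has selected the directory.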

a directory and its descendants

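Now, if this should work recursively, make the following two changes to the code for the “single directory case”. First, build an FTPTree instead of an FTPDirectory (importing it from the module in the same way):

folder = FTPTree()

Second, delete each file by its path relative to the starting directory:

site.delete(posixpath.join(path, ftpfile.name))  # needs "import posixpath"; FTP paths always use '/'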
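
As a rough alternative for the recursive case on Python 3.3+, the tree can also be walked with ftplib's own FTP.mlsd(path). This is only a sketch: walk_files is a hypothetical helper, and it assumes the server accepts a relative path as the MLSD argument.

```python
import posixpath


def walk_files(site, path='.'):
    """Yield (remote_path, facts) for every plain file at or below `path`."""
    for name, facts in list(site.mlsd(path)):
        kind = facts.get('type')
        full = posixpath.join(path, name)  # FTP paths always use '/'
        if kind == 'dir':
            yield from walk_files(site, full)
        elif kind == 'file':
            yield full, facts
        # cdir and pdir entries (the directory itself and its parent) are ignored
```

Each yielded path can then be compared against the cutoff via its modify fact and passed to site.delete(), subject to the caveat about relative paths described in the next section.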

Possible caveat

The servers I've worked with didn't have any issues with relative paths in the STOR and DELE commands, so site.delete with a relative path worked too. If your FTP server requires pathless filenames, you should first .cwd to the path provided, .delete the plain ftpfile.name and then .cwd back to the base folder.
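
The cwd / delete / cwd-back sequence described above could look roughly like this. The function name delete_pathless and the base parameter are illustrative. The sketch assumes base is the absolute remote directory the walk started from, for example the value returned by site.pwd() before any directory changes.

```python
def delete_pathless(site, path, name, base):
    """cwd into `path`, delete `name` without any path prefix, then return to `base`.

    `site` is a connected ftplib.FTP object; `path` is relative to `base`.
    """
    site.cwd(path)
    try:
        site.delete(name)
    finally:
        site.cwd(base)  # always go back, even if DELE failed
```

In the recursive loop this would replace the site.delete(...) call, for instance as delete_pathless(site, path, ftpfile.name, base) with base = site.pwd() taken once before the loop.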

Solution 2

I had to do this and it took a while, so I thought I could save someone's time here. We are using Python with the ftputil module installed:

#! /usr/bin/python
import time
import ftputil
host = ftputil.FTPHost('ftphost.com', 'username', 'password')
mypath = 'ftp_dir'
now = time.time()
host.chdir(mypath)
names = host.listdir(host.curdir)
for name in names:
    # only plain files, and only those last modified more than 7 days ago
    if host.path.isfile(name) and host.path.getmtime(name) < now - 7 * 86400:
        host.remove(name)
print('Closing FTP connection')
host.close()

Solution 3

OK, well rather than analyze the code you have posted any further, here's an example instead that might put you on the right track.

from ftplib import FTP
import re
# capture the date column (e.g. "May 16 13:47" or "Feb 23  2007") and the file name
pattern = r'.* ([A-Za-z].. .. .....) (.*)'
def callback(line):
    found = re.match(pattern, line)
    if found is not None:
        print(found.groups())
ftp = FTP('myserver.wherever.com')
ftp.login('elvis', 'presley')
ftp.cwd('testing123')
ftp.retrlines('LIST', callback)
ftp.close()

Run it and you'll get output something like this, which should be a start towards what you're trying to achieve. To finish it out you'd need to parse the first result into a datetime, compare it with "now" and use ftp.delete() to get rid of the remote file if it's too old.

>>> 
('May 16 13:47', 'Thumbs.db')
('Feb 16 17:47', 'docs')
('Feb 23  2007', 'marvin')
('May 08  2009', 'notes')
('Aug 04  2009', 'other')
('Feb 11 18:24', 'ppp.xml')
('Jan 20  2010', 'reports')
('Oct 10  2005', 'transition')
>>> 
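
The remaining step, turning the date strings in the output above into datetime objects, might be sketched as follows. parse_list_date is a hypothetical helper. It assumes Unix-style LIST output with English month abbreviations, and it has to guess the year for recent entries, which only show a time. The time zone of the listing is whatever the server uses, so comparisons are approximate.

```python
from datetime import datetime, timedelta


def parse_list_date(text, now=None):
    """Parse 'May 16 13:47' or 'Feb 23  2007' (Unix-style LIST dates)."""
    now = now or datetime.now()
    month, day, last = text.split()
    if ':' in last:
        # Recent entries omit the year: assume the current one, and fall
        # back to the previous year if that would land in the future.
        parsed = datetime.strptime(
            "%d %s %s %s" % (now.year, month, day, last), "%Y %b %d %H:%M")
        if parsed > now + timedelta(days=1):
            parsed = parsed.replace(year=parsed.year - 1)
        return parsed
    # Older entries show the year instead of the time.
    return datetime.strptime("%s %s %s" % (month, day, last), "%b %d %Y")
```

Inside the callback, a file would then be a candidate for ftp.delete() when parse_list_date(date_text) < datetime.now() - timedelta(days=7). Since retrlines is still running while the callback executes, it is safer to collect the names in a list and delete them after the listing has finished.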

Author by

Tom

From time to time I have to develop some scripts to deliver IT solutions.

Updated on July 09, 2022

Comments

  • Tom 11 months

    I would like to write a Python script which allows me to delete files from an FTP server after they have reached a certain age. I prepared the script below, but it throws the error message: WindowsError: [Error 3] The system cannot find the path specified: '/test123/*.*'

    Does anyone have an idea how to resolve this issue? Thank you in advance!

    import os, time
    from ftplib import FTP
    ftp = FTP('127.0.0.1')
    print "Automated FTP Maintainance"
    print 'Logging in.'
    ftp.login('admin', 'admin')
    # This is the directory that we want to go to
    path = 'test123'
    print 'Changing to:' + path
    ftp.cwd(path)
    files = ftp.retrlines('LIST')
    print 'List of Files:' + files 
    #--everything works fine until here!...
    #--The Logic which shall delete the files after the are 7 days old--
    now = time.time()
    for f in os.listdir(path):
      if os.stat(f).st_mtime < now - 7 * 86400:
        if os.path.isfile(f):
            os.remove(os.path.join(path, f))
    except:
        exit ("Cannot delete files")
    print 'Closing FTP connection'
    ftp.close()
    
    • SilentGhost about 13 years
      what is os.directory? Your code makes very little sense. Why are you trying to delete files from your local system?
    • Tom about 13 years
      yeah, but it has to run on windows. therefore shell / bash is not an option in this case.
  • eemz about 13 years
    Note however that different ftp servers format the output of the LIST command differently, so you may have to modify the regular expression to match the one you're using.
  • Tom about 13 years
    Hi, it is running on Windows 2003 Server, and it currently connects to a test FTP server which is running on Windows XP.
  • Tom about 13 years
    No, it shall jump into the directory "test123", and then delete every file from it which is older than 7 days. The machine is indicating that it is not able to find the directory.
  • Tom about 13 years
    Hi thank you for your answer, I will try to modify my code accordingly.
  • Tom almost 13 years
    Hi ΤΖΩΤΖΙΟΥ, thank you for your idea, it looks very good to me. I have tried it out, and I had to modify the code slightly, but I get an error message:

    site= ftplib.FTP('127.0.0.1, admin, admin')
    File "C:\Python26\lib\ftplib.py", line 116, in __init__
      self.connect(host)
    File "C:\Python26\lib\ftplib.py", line 131, in connect
      self.sock = socket.create_connection((self.host, self.port), self.timeout)
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
    socket.gaierror: [Errno 11001] getaddrinfo failed
  • Tom almost 13 years
    import os, time, FTP_AUTO
    from ftplib import FTP
    quite_old= time.time() - 7*86400 # seven days
    # C:\Temp\ftp\test123
    site= ftplib.FTP('127.0.0.1, admin, admin')
    site.cwd(test123) # if it's '.', you can skip this line
    folder= FTPDirectory()
    print folder
    folder.getdata(site) # get the filenames
    for path, ftpfile in folder.walk():
        if ftpfile.mtime < quite_old:
            site.delete(ftpfile.name)
  • Ishbir almost 13 years
    @Tom: '127.0.0.1, admin, admin' is not a valid hostname; that's what the error is about. You probably meant '127.0.0.1', 'admin', 'admin' in your code.
  • Tom almost 13 years
    Thank you, the connection is now working. But the system stated that: File "G:/MY_TCS/!!PROJECTS/Q3/FTP_auto_del/python/ftp_del.py", line 6, in <module> folder= FTPDirectory() NameError: name 'FTPDirectory' is not defined
  • Ishbir almost 13 years
    @Tom: how did you name my module? Did you import it at the start of ftp_del.py? If you saved my code as, say, ftptool.py, then at the start of ftp_del.py you should import ftptool and later have the classes prefixed with the module name, e.g. folder = ftptool.FTPDirectory(). ISTM you need to read the Python tutorial first; it's like you lack basic knowledge about Python.
  • Tom almost 13 years
    Hi ΤΖΩΤΖΙΟΥ, I named your module "FTP_dir" in that case. I import it as you mentioned. Now it seems to work! The old files are deleted from my test FTP server, now I will try it on the productive environment. Thank you very much for your assistance and help! It responses in the console with <FTP_dir.FTPDirectory object at 0x00B6E590> All look GOOD!
  • Tom almost 13 years
    It worked in the test environment (a Windows-based FileZilla Server), but in the production environment I get the error: ftplib.error_perm: 500 Cannot understand 'MLSD'. Would there be a workaround for this issue? Can the provider just switch "MLSD" commands on?
  • SilentSteel almost 10 years
    This is terrific code! Some things: @Tom MLSD was officially standardized in 2007 (RFC 3659), so you might need to update your FTP server. It was introduced because every FTP server used a different format with LIST. NOTE: In function addline, you should convert field_name to lowercase. There are servers such as ServU that return uppercase field names. field_name = field_name.lower()
  • Sérgio over 9 years
    I like the solution; it's very easy, but it isn't complete: we still have to deal with dates, etc.
  • David over 6 years
    I had to turn the field_name into lower case as the FTP server was returning Type, Modify etc and this checks for type, modify etc.
