Using wildcard for "if .... in .." statement
Solution 1
Try the following:
if any(f.startswith(existingXML) and f.endswith('.xml') for f in check_meta):
    print "exists"
The any() built-in function takes an iterable as an argument and returns True if any of its elements is true. The argument passed here is a generator expression that yields the value of f.startswith(existingXML) and f.endswith('.xml') for each file f in your list check_meta.
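To see how this behaves on concrete data, here is a minimal sketch using a few file names from the directory listing in the question. The helper name has_xml_for is hypothetical, and print(...) with a single argument is used so the snippet runs on both Python 2 and Python 3.

```python
def has_xml_for(prefix, names):
    """Return True if any name starts with prefix and ends with '.xml'."""
    return any(n.startswith(prefix) and n.endswith('.xml') for n in names)

# A few entries taken from the directory listing in the question.
check_meta = ['hex.shp', 'hex.shp_BaseMetadata.xml',
              'Urbanlevy_bdy_trc.shp', 'Urbanlevy_bdy_trc.shp.xml',
              'trc_boundary_Metadata.xml.auto']

print(has_xml_for('Urbanlevy_bdy_trc', check_meta))  # True
print(has_xml_for('Rates', check_meta))              # False
```

Keep in mind that a plain prefix test also matches other datasets that share the prefix: in the question's listing, 'trc_boundary' would match 'trc_boundary_polygon.xml' as well as 'trc_boundary.shp.xml'.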
A regex solution might look something like this:
import re
# raw string so that \. reaches the regex engine unchanged
regex = re.compile(re.escape(existingXML) + r'.*\.xml$')
if any(regex.match(f) for f in check_meta):
    print "exists"
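Since the question asks for a wildcard, the standard library's fnmatch module is another option. This is a swapped-in alternative, not part of the answer above. A minimal sketch, assuming existingXML contains no glob metacharacters (*, ?, [); the sample values are illustrative:

```python
import fnmatch

existingXML = 'Urbanlevy_bdy_trc'
check_meta = ['Urbanlevy_bdy_trc.shp', 'Urbanlevy_bdy_trc.shp.xml',
              'Urbanlevy_bdy_region.shp.xml']

# fnmatchcase compares case-sensitively on every platform,
# unlike fnmatch.fnmatch, which normalizes case on Windows.
pattern = existingXML + '*.xml'
matches = [f for f in check_meta if fnmatch.fnmatchcase(f, pattern)]
if matches:
    print("exists")
```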
If you need to know which entry actually matches, use a for loop instead:
for f in check_meta:
    if f.startswith(existingXML) and f.endswith('.xml'):
        print "exists, file name:", f
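If the matching name is needed later, for example as the source of a copy, the same loop can be wrapped in a small function. This is only a sketch: find_metadata_xml is a hypothetical helper name, and returning None when nothing matches is an assumption about what the caller wants.

```python
def find_metadata_xml(prefix, names):
    """Return the first name that starts with prefix and ends with '.xml', else None."""
    for name in names:
        if name.startswith(prefix) and name.endswith('.xml'):
            return name
    return None

# Sample entries from the question's directory listing.
check_meta = ['Urbanlevy_bdy_trc.dbf', 'Urbanlevy_bdy_trc.shp',
              'Urbanlevy_bdy_trc.shp.xml']

found = find_metadata_xml('Urbanlevy_bdy_trc', check_meta)
if found is not None:
    print("exists, file name: " + found)
```

This may matter for the full script in the question, which copies FileNm+'.xml' after the check; that name is not necessarily the file that matched, so passing the returned name to the copy instead might be safer.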
Solution 2
Why not just use:
import os
searchtext = "sometext"
# endswith avoids matching names such as 'trc_boundary_Metadata.xml.auto'
matching = [f for f in os.listdir(currentPath) if f.startswith(searchtext) and f.endswith(".xml")]
If you want to check for different extensions, you can list them out.
exts = (".xml", ".tab", ".shp")
matching = [ f for f in os.listdir(currentPath) if f.startswith(searchtext) and os.path.splitext(f)[-1] in exts]
Of course you could figure out the regex to do the same thing as well.
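One way such a regex might look, as a sketch rather than a definitive version: the extensions are escaped and joined into an alternation. The names list here is a hypothetical stand-in for os.listdir(currentPath).

```python
import re

searchtext = "sometext"
exts = (".xml", ".tab", ".shp")

# Build a pattern equivalent to: sometext.*(?:\.xml|\.tab|\.shp)$
# re.escape keeps the dots (and any other special characters) literal.
alternatives = "|".join(re.escape(e) for e in exts)
regex = re.compile(re.escape(searchtext) + r".*(?:" + alternatives + r")$")

# Hypothetical file names standing in for os.listdir(currentPath).
names = ["sometext_a.shp", "sometext.shp.xml", "sometext.dbf", "other.xml"]

# regex.match anchors at the start of the string, so it acts as the prefix test.
matching = [f for f in names if regex.match(f)]
print(matching)
```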
GeorgeC
Updated on June 04, 2022
I am trying to find files in directories where the file name used is sometimes only a part of the full file name.
So
check_meta=os.listdir(currentPath)
gives
['ANZMeta.xsl', 'Benefited_Areas', 'divisons', 'emergency', 'Error_LOG.txt', 'hex.dbf', 'hex.shp', 'hex.shp_BaseMetadata.xml', 'hex.shx', 'Maintenance_Areas', 'Rates.mxd', 'Regulated_Parking', 'schema.ini', 'Service_Areas', 'Shortcut to Local_Govt.lnk', 'TAB', 'TRC.rar', 'trc_boundary.dbf', 'trc_boundary.kml', 'trc_boundary.prj', 'trc_boundary.sbn', 'trc_boundary.sbx', 'trc_boundary.shp', 'trc_boundary.shp.ATGIS29.1772.3444.sr.lock', 'trc_boundary.shp.ATGIS30.2668.2356.sr.lock', 'trc_boundary.shp.xml', 'trc_boundary.shx', 'trc_boundary_Metadata.xml.auto', 'trc_boundary_Polygon.dbf', 'trc_boundary_Polygon.prj', 'trc_boundary_Polygon.sbn', 'trc_boundary_Polygon.sbx', 'trc_boundary_Polygon.shp', 'trc_boundary_Polygon.shp.ATGIS29.1772.3444.sr.lock', 'trc_boundary_Polygon.shx', 'trc_boundary_polygon.xml', 'Urbanlevy_bdy_region.dbf', 'Urbanlevy_bdy_region.prj', 'Urbanlevy_bdy_region.shp', 'Urbanlevy_bdy_region.shp.xml', 'Urbanlevy_bdy_region.shx', 'Urbanlevy_bdy_trc.dbf', 'Urbanlevy_bdy_trc. prj', 'Urbanlevy_bdy_trc.sbn', 'Urbanlevy_bdy_trc.sbx', 'Urbanlevy_bdy_trc.shp', 'Urbanlevy_bdy_trc.shp.xml', 'Urbanlevy_bdy_trc.shx']
I want to
existingXML=FileNm[:FileNm.find('.')]
if existingXML+"*"+'.xml' in check_meta: # this is where the issue is
    print "exists"
So sometimes the XML to use is Urbanlevy_bdy_trc.shp.xml and at other times it is Urbanlevy_bdy_trc.xml, whichever exists. Note that simply using an OR test for ".shp.xml" is not enough, as the datasets have multiple file extensions such as tab, ecw, etc. Sometimes the related XML file may also be called Urbanlevy_bdy_trc_Metadata.shp.xml, so the key is just to search for the core file name "Urbanlevy_bdy_trc" with the extension .xml.
How can I specify this? The purpose of this is mentioned in Search and replace multiple lines in xml/text files using python.
FULL CODE
import os, xml, arcpy, shutil, datetime
from xml.etree import ElementTree as et

path=os.getcwd()
RootDirectory=path
arcpy.env.workspace = path
Count=0

Generated_XMLs=RootDirectory+'\GeneratedXML_LOG.txt'
f = open(Generated_XMLs, 'a')
f.write("Log of Metadata Creation Process - Update: "+str(datetime.datetime.now())+"\n")
f.close()

for root, dirs, files in os.walk(RootDirectory, topdown=False):
    #print root, dirs
    for directory in dirs:
        currentPath=os.path.join(root,directory)
        os.chdir(currentPath)
        arcpy.env.workspace = currentPath
        print currentPath
        #def Create_xml(currentPath):
        FileList = arcpy.ListFeatureClasses()
        zone="_Zone"
        for File in FileList:
            Count+=1
            FileDesc_obj = arcpy.Describe(File)
            FileNm=FileDesc_obj.file
            print FileNm
            check_meta=os.listdir(currentPath)
            existingXML=FileNm[:FileNm.find('.')]
            print "XML: "+existingXML
            print check_meta
            #if existingXML+'.xml' in check_meta:
            if any(f.startswith(existingXML) and f.endswith('.xml') for f in check_meta):
                print "exists"
                newMetaFile=FileNm+"_2012Metadata.xml"
                shutil.copy2(FileNm+'.xml', newMetaFile)
            else:
                print "Does not exist"
                newMetaFile=FileNm+"_BaseMetadata.xml"
                shutil.copy2('L:\Data_Admin\QA\Metadata_python_toolset\Master_Metadata.xml', newMetaFile)
            tree=et.parse(newMetaFile)
            print "Processing: "+str(File)
            for node in tree.findall('.//title'):
                node.text = str(FileNm)
            for node in tree.findall('.//northbc'):
                node.text = str(FileDesc_obj.extent.YMax)
            for node in tree.findall('.//southbc'):
                node.text = str(FileDesc_obj.extent.YMin)
            for node in tree.findall('.//westbc'):
                node.text = str(FileDesc_obj.extent.XMin)
            for node in tree.findall('.//eastbc'):
                node.text = str(FileDesc_obj.extent.XMax)
            for node in tree.findall('.//native/nondig/formname'):
                node.text = str(os.getcwd()+"\\"+File)
            for node in tree.findall('.//native/digform/formname'):
                node.text = str(FileDesc_obj.featureType)
            for node in tree.findall('.//avlform/nondig/formname'):
                node.text = str(FileDesc_obj.extension)
            for node in tree.findall('.//avlform/digform/formname'):
                node.text = str(float(os.path.getsize(File))/int(1024))+" KB"
            for node in tree.findall('.//theme'):
                node.text = str(FileDesc_obj.spatialReference.name +" ; EPSG: "+str(FileDesc_obj.spatialReference.factoryCode))
                print node.text
            projection_info=[]
            Zone=FileDesc_obj.spatialReference.name
            if "GCS" in str(FileDesc_obj.spatialReference.name):
                projection_info=[FileDesc_obj.spatialReference.GCSName, FileDesc_obj.spatialReference.angularUnitName, FileDesc_obj.spatialReference.datumName, FileDesc_obj.spatialReference.spheroidName]
                print "Geographic Coordinate system"
            else:
                projection_info=[FileDesc_obj.spatialReference.datumName, FileDesc_obj.spatialReference.spheroidName, FileDesc_obj.spatialReference.angularUnitName, Zone[Zone.rfind(zone)-3:]]
                print "Projected Coordinate system"
            x=0
            for node in tree.findall('.//spdom'):
                for node2 in node.findall('.//keyword'):
                    print node2.text
                    node2.text = str(projection_info[x])
                    print node2.text
                    x=x+1
            tree.write(newMetaFile)
            f = open(Generated_XMLs, 'a')
            f.write(str(Count)+": "+File+"; "+newMetaFile+"; "+currentPath+"\n")
            f.close()
        # Create_xml(currentPath)
RESULT