Python check if value in csv file
Solution 1
Read the file into a variable:
with open('urls_list.csv', 'r') as fp:
    s = fp.read()
Check whether each list item is in the file; if it is not, save it:
missing = []
for url in urls_list:
    if url not in s:
        missing.append(url + '\n')
Write the missing URLs to the file:
if missing:
    with open('urls_list.csv', 'a+') as fp:
        fp.writelines(missing)
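Note that `url not in s` is a substring test, so `http://google.ru` would count as present if the file only contained `http://google.ru/maps`, and a URL listed twice in `urls_list` is appended twice on the first run. A minimal sketch of an exact-match variant follows, assuming one URL per line and a file that ends with a newline. The function name `append_missing` is hypothetical and not part of the original answer.

```python
def append_missing(path, urls):
    """Append to `path` every URL from `urls` that is not already a line in it."""
    # Read the existing entries as a set of whole lines for exact matching.
    with open(path, 'r') as fp:
        existing = {line.strip() for line in fp}

    missing = []
    for url in urls:
        if url not in existing:
            missing.append(url)
            existing.add(url)  # also guards against duplicates within `urls`

    if missing:
        with open(path, 'a') as fp:
            fp.writelines(url + '\n' for url in missing)
    return missing
```

Calling `append_missing('urls_list.csv', urls_list)` a second time should then return an empty list and leave the file unchanged.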
Solution 2
Considering your file has only one column, the csv module might be overkill.
Here's a version that first reads all the lines from the file and reopens the file to write urls that are not already in the file:
# Read the whole file first and close it before reopening it for appending.
with open('urls_list.csv', 'r') as fp:
    lines = fp.read()
with open('urls_list.csv', 'a+') as fp:
    for url in urls_list:
        if url in lines:
            print("YEY!")
        else:
            fp.write(url + '\n')
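If some work has to happen for each new URL before it is recorded (the asker's comments describe parsing each page and then skipping URLs that were already parsed), the check and the write can share one loop. The sketch below is one possible shape, not the answerer's code: `process_new_urls` and the `handle` callback are hypothetical names, and it assumes one URL per line. It opens the file once in `'a+'` mode, which also creates the file if it does not exist yet.

```python
def process_new_urls(path, urls, handle):
    """Call `handle(url)` for each URL not yet recorded in `path`, then record it."""
    with open(path, 'a+') as fp:
        fp.seek(0)  # 'a+' starts at the end of the file; rewind before reading
        seen = {line.strip() for line in fp}
        for url in urls:
            if url in seen:
                continue  # already handled in this run or an earlier one
            handle(url)
            fp.write(url + '\n')  # in append mode, writes always go to the end
            fp.flush()  # keep the record on disk current if a later URL fails
            seen.add(url)
```

Because each URL is added to `seen` as soon as it is written, a URL that appears twice in `urls` is handled only once.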
Author: Konstantin Rusanov
Updated on June 04, 2022
Comments
-
Konstantin Rusanov almost 2 years
I have a list of URLs, for example:
urls_list = [ "http://yandex.ru", "http://google.ru", "http://rambler.ru", "http://google.ru", "http://gmail.ru", "http://mail.ru" ]
I need to open the CSV file and check each value from the list: if the value is already in the file, skip to the next value; otherwise (if the value is not in the file), add it to the file.
Expected result: the first run adds all the lines (if the file is empty); the second run does nothing, because all the elements are already in the file.
I wrote some code, but it works completely incorrectly:
import csv

urls_list = [
    "http://yandex.ru",
    "http://google.ru",
    "http://rambler.ru",
    "http://google.ru",
    "http://gmail.ru",
    "http://mail.ru"
]

with open('urls_list.csv', 'r') as fp:
    for row in fp:
        for url in urls_list:
            if url in row:
                print "YEY!"
            with open('urls_list.csv', 'a+') as fp:
                wr = csv.writer(fp, dialect='excel')
                wr.writerow([url])
-
Mauro Baraldi over 7 years
You are opening a file in read mode and then, while still reading it, reopening it to append. That is the root of all the problems.
-
ngulam over 7 years
As Mauro stated: use a second file to append.
-
Konstantin Rusanov over 7 years
But I need to check each list element against the CSV file: if the element isn't in the file, run some code and write the element to the file; if it is in the file, skip it and go to the next one; if the next element isn't in the file, run some code and write it to the file, and so on.
-
Moses Koledoye over 7 years
Do you actually need csv? Considering your file has only one column
Konstantin Rusanov over 7 years
My main goal is this: 1. I parse an XML sitemap and visit each link. 2. I parse the content and store it in a file. 3. I need to check whether I have already parsed a URL; if so, I skip it and go to the next URL.
-
Swapnil B. about 4 years
Would this work for a CSV file with 524,000 lines (or even 1.3B lines)? It would need a large machine with enough RAM to hold that much data.
-
wwii about 4 years
@SwapnilB. You would need to try it, or experiment with something smaller, and determine the memory requirements. There are other ways to approach the problem if either the search list or the search space is too large to fit in memory.
-
Swapnil B. about 4 years
This deserves another post, but this is how I would handle large-scale processing: read lazily, process each row (and discard it), and store the processed data in a pickle file. Finally, write distributed jobs (probably using dispy) to process the pickle file. This converts memory-based processing into file-based distributed processing at low cost.
-
wwii about 4 years
@SwapnilB. IIRC there are numerous Q&As here regarding processing large (CSV) files.
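For the large-file case raised in the comments, one option is to stream the file instead of reading it whole, so that only the search list is held in memory. The sketch below assumes one URL per line and a search list small enough to fit in a set. The name `find_missing_streaming` is hypothetical, and how long it takes on a file with billions of lines would have to be measured.

```python
def find_missing_streaming(path, urls):
    """Return the URLs from `urls` that never appear as a line in `path`."""
    remaining = set(urls)
    with open(path, 'r') as fp:
        for line in fp:  # the file object yields one line at a time
            remaining.discard(line.strip())
            if not remaining:
                break  # every URL was found; no need to read further

    # Preserve the input order and drop duplicates from `urls`.
    result = []
    for url in urls:
        if url in remaining:
            result.append(url)
            remaining.remove(url)
    return result
```

The returned list can then be appended to the file in a separate step, after the read has finished.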