How to iterate over space-separated ASCII file in Python

Solution 1

This code reads the whole space-separated file.txt into memory and prints each word:

with open("file.txt", "r") as f:
    words = f.read().split()
for w in words:
    print(w)

Solution 2

with open("test") as infile:
    for line in infile:
        # strip the newline so it does not stay attached to the last word
        for word in line.rstrip("\n").split(" "):
            print(word)

Solution 3

Untested:

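def produce_words(file_):
    """Yield each whitespace-separated word in file_, one at a time."""
    for line in file_:
        for word in line.split():
            yield word

def main():
    with open('in.txt', 'r') as file_:
        for word in produce_words(file_):
            print(word)

if __name__ == '__main__':
    main()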

Solution 4

If you want to loop over an entire file, the sensible thing to do is to iterate over it, taking the lines one at a time and splitting them into words. Working line by line is best because it means we don't read the entire file into memory first (which, for large files, could take a lot of time or cause us to run out of memory):

with open('in.txt') as infile:
    for line in infile:
        for word in line.split():
            ...

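Note that you could use line.split(" ") if you want to preserve more whitespace: it splits on every single space, so runs of spaces produce empty strings and the trailing newline stays attached to the last word, whereas line.split() splits on any run of whitespace and discards all of it.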
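To make the difference concrete, here is a small sketch. The sample line is made up for illustration and does not come from the original question:

```python
# Hypothetical sample line with a double space and a trailing newline.
line = "spam  eggs ham\n"

no_arg = line.split()        # splits on any run of whitespace
one_space = line.split(" ")  # splits on every single space character

print(no_arg)     # ['spam', 'eggs', 'ham']
print(one_space)  # ['spam', '', 'eggs', 'ham\n']
```

For counting words, the no-argument form is usually the more convenient one, since it never yields empty strings or words with a newline attached.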

Also note my use of the with statement to open the file, as it's more readable and handles closing the file, even on exceptions.
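As a quick illustration of the closing behaviour, the sketch below raises an exception inside a with block and then checks that the file was closed anyway. The temporary file and the simulated error are assumptions made purely so the example is self-contained:

```python
import os
import tempfile

# Create a throwaway file so the sketch does not depend on in.txt existing.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("spam eggs\n")

try:
    with open(path) as infile:
        handle = infile  # keep a reference so we can inspect it afterwards
        raise ValueError("simulated failure while processing")
except ValueError:
    pass

closed_after_error = handle.closed
print(closed_after_error)  # True: the with block closed the file

os.remove(path)  # clean up the temporary file
```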

While this is a good solution, if you are not doing anything else inside the outer loop, the nesting is unnecessary. To reduce this to one loop, we can use itertools.chain.from_iterable and a generator expression:

import itertools

with open('in.txt') as infile:
    for word in itertools.chain.from_iterable(line.split() for line in infile):
        ...
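Since the underlying goal in the question is to count how often each word appears, one possible way to finish the job is to feed the flattened stream of words into collections.Counter. This is only a sketch: the helper name count_words and the sample text are assumptions for illustration, and the counting is case-sensitive with punctuation left attached to words.

```python
import collections
import itertools
import os
import tempfile

def count_words(path):
    """Return a Counter mapping each whitespace-separated word to its count."""
    with open(path) as infile:
        words = itertools.chain.from_iterable(line.split() for line in infile)
        # Counter consumes the generator here, while the file is still open.
        return collections.Counter(words)

# Hypothetical sample data written to a temporary file for the demo.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("the cat and the hat\nthe end\n")

counts = count_words(path)
print(counts["the"])  # 3
print(counts["cat"])  # 1

os.remove(path)  # clean up the temporary file
```

The file is still read one line at a time, so only the counts (one entry per distinct word) are held in memory rather than the whole text.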
Author: Hoops

Updated on June 05, 2022

Comments

  • Hoops
    almost 2 years ago

    Strange question here.

    I have a .txt file that I want to iterate over. I can get all the words from the file into an array, which is good, but what I want to know is how to iterate over the whole file word by word rather than letter by letter.

    I want to be able to go through the array that holds all the text from the file and count every instance in which a word appears in it.

    The only problem is that I don't know how to write the code for it.

    I tried using a for loop, but that just iterates over every single letter, when I want whole words.

  • Gareth Latty
    about 12 years ago
    This is fine so long as the file is not too large to fit into memory.
  • dorien
    almost 10 years ago
    So the r is the indicator for space separated?
  • vz0
    almost 10 years ago
    @dorien No, the "r" is the mode argument to open() and tells it to open the file for reading. The other common option is "w" for writing (which is not related to this question), and more options are listed in the docs.