How can I get python to read every nth line of a .txt file?
Solution 1
You can't... [do it using pure I/O functions] You have to read all lines and simply make your code ignore the lines that you do not need.
For example:
with open(filename) as f:
    lines = f.readlines()
desired_lines = lines[start:end:step]
In the code above, replace start, end, and step with the desired values. For example, for "...if I wanted to read line 2 of a .txt file, but every 4 lines after that..." you would write:
desired_lines = lines[1::4]
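As a quick check of the indexing, here is a minimal sketch that uses an in-memory list in place of a real file. The sample lines and the helper name select_lines are made up for illustration. Because list indices start at 0, line 2 sits at index 1, so lines[1::4] picks lines 2, 6, 10, and so on.

```python
def select_lines(lines, first_line, step):
    """Return every `step`-th line, starting at 1-based line number `first_line`.

    `select_lines` is a hypothetical helper, not part of the original answer.
    """
    # Line numbers are 1-based, list indices are 0-based.
    return lines[first_line - 1::step]


# Stand-in for f.readlines(): twelve lines, each ending in a newline.
lines = [f"line{i}\n" for i in range(1, 13)]

desired_lines = select_lines(lines, first_line=2, step=4)
print(desired_lines)  # ['line2\n', 'line6\n', 'line10\n']
```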
Solution 2
A little late to the party, but here is a solution that doesn't require reading the whole file into RAM at once, which might cause trouble with large files:
# Print every second line.
step = 2
with open("file.txt") as handle:
    for lineno, line in enumerate(handle):
        if lineno % step == 0:
            # Each line already ends with "\n", so suppress print's own newline.
            print(line, end="")
File objects (handle) can be iterated over line by line. This means we can read the file one line at a time without accumulating all the lines if we don't need to. To select every n-th line, we apply the modulo operator to the current line number and the desired step size.
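The same loop can be adapted to start at a line other than the first. The sketch below is one way to do it, assuming 1-based line numbers and a hypothetical generator named every_nth. It works on any iterable of lines, so an open file handle can be passed in directly.

```python
import io


def every_nth(lines, first_line=1, step=1):
    """Yield every `step`-th line, starting at 1-based line `first_line`.

    `every_nth` is a hypothetical name used for illustration only.
    """
    for lineno, line in enumerate(lines, start=1):
        # Skip lines before the starting line, then keep those whose
        # distance from the starting line is a multiple of the step.
        if lineno >= first_line and (lineno - first_line) % step == 0:
            yield line


# io.StringIO stands in for an open file so the example is self-contained.
handle = io.StringIO("".join(f"line{i}\n" for i in range(1, 11)))

for line in every_nth(handle, first_line=2, step=4):
    print(line, end="")  # prints line2, line6, line10
```

Subtracting the starting line before taking the modulo is what makes "line 2, then every 4 lines" come out as 2, 6, 10 rather than 4, 8.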
Solution 3
You can first open the file as f with a with statement. Then, you can iterate through every line in the file using Python's slicing notation.
If we take f.read(), we get a string with a new-line (\n) character at the end of every line:
"line1\nline2\nline3\n"
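To convert this to a list of lines that we can slice to get every other line, we need to split it at each line break. Splitting with split("\n") would leave an empty string at the end because of the trailing \n, and split() with no argument splits on any whitespace, so splitlines() is the safer choice:
f.read().splitlines()
which for the above example will give:
["line1", "line2", "line3"]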
Finally, we need to get every other line; this is done with the slice [::2]. We know this from the way that slicing works:
list[start : stop : step]
Using all this, we can write a for-loop which will iterate through every other line:
with open("file.txt", "r") as f:
    # splitlines() avoids the trailing empty string that split("\n") leaves.
    for line in f.read().splitlines()[::2]:
        print(line)
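If the file is large, the same slice can be applied lazily with itertools.islice from the standard library, which takes start, stop and step arguments like a list slice but consumes the file one line at a time. This is a sketch of an alternative, not part of the answer above. The function name read_every_nth and the temporary sample file are made up for the example.

```python
import itertools
import os
import tempfile


def read_every_nth(path, start=0, step=1):
    """Return every `step`-th line of the file, starting at 0-based index `start`.

    Hypothetical helper for illustration; trailing newlines are stripped.
    """
    with open(path, "r") as f:
        # islice(f, start, None, step) behaves like f.readlines()[start::step]
        # without building the full list of lines first.
        return [line.rstrip("\n") for line in itertools.islice(f, start, None, step)]


# Write a small sample file so the example can run on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line1\nline2\nline3\nline4\nline5\nline6\n")

print(read_every_nth(tmp.name, step=2))           # ['line1', 'line3', 'line5']
print(read_every_nth(tmp.name, start=1, step=4))  # ['line2', 'line6']
os.remove(tmp.name)
```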
superasiantomtom95
Updated on December 02, 2020
Comments
-
superasiantomtom95 over 3 years
If I wanted python to read every 2nd (or every 4th) line of a file, how could I tell it to do so? Additionally, if I wanted to read line 2 of a .txt file, but every 4 lines after that (next line would be line 6, then 10, and so on forth), how could I make it do so?
-
Joe Iddon over 6 years
Python doesn't use arrays, it uses lists, and this just won't work...
-
Joe Iddon over 6 years
You also need a space between the end of the function definition and the call to the function...
-
Rehan Shikkalgar over 6 years
It is just the name of a variable; have you tried executing this code?
-
Joe Iddon over 6 years
Yes, it fails in many ways... And I know it's just a variable name, but it just shows a lack of understanding, that's all.
-
AGN Gazer over 6 years
I do not know why your answer was down-voted but I can reassure you it was not me.
-
Joe Iddon over 6 years
@AGNGazer Don't worry, I wasn't accusing anyone, I just wanted to see what someone thought was not useful about it... A downvote without a comment isn't that helpful
-
Rehan Shikkalgar over 6 years
Can you give me an example where it fails?
-
Joe Iddon over 6 years
Well it will give an IndexError even if you leave a line before the call to the function. This is because you are adding all the lines in the file to a list and then iterating through every index in this list. Then, you try to get the value from the list of that index multiplied by the input. This means that unless the input is 1, you are going to be trying to get an element in the list which is way over the end - giving an IndexError. You can fix the error by making the second for-loop go to len(array)/lineNo. Nevertheless, atm your code does not work as I said.
-
Tomerikoo over 3 years
Probably the fact that this reads the whole file into a string instead of iterating the file line-by-line...
-
Tomerikoo over 3 years
Downvoted because of the statement: "You have to read all lines and simply make your code ignore the lines that you do not need". Not true and not necessary, see other answers here... This creates two unnecessary lists
-
Tomerikoo over 3 years
This will fail for the second example of starting from line 2 and taking every 4th line. You would expect to take line 6, but 6 % 4 != 0...