Using Python, how do you select a random row of a CSV file?

Solution 1

Use the random and csv modules.

If your csv file is small enough to fit into memory, you could read the whole thing then select a line:

import csv
import random

with open(filename) as f:
    reader = csv.reader(f)
    chosen_row = random.choice(list(reader))

You have to read in the whole file at once because choice needs to know how many rows there are.

If you're happy to make more than one pass over the data, you could count the rows, choose a random row number, and then read the file again up to that row:

with open(filename) as f:
    lines = sum(1 for line in f)
    line_number = random.randrange(lines)

with open(filename) as f:
    reader = csv.reader(f)
    chosen_row = next(row for row_number, row in enumerate(reader)
                      if row_number == line_number)
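
One caveat with the two-pass version above: it counts physical lines, which can differ from the number of CSV rows when a quoted field contains a line break. Below is a minimal sketch that counts parsed rows instead and uses itertools.islice to skip ahead to the chosen one. The function name random_row_two_pass is only illustrative, and raising ValueError for an empty file is an assumption about the desired behaviour.

```python
import csv
import random
from itertools import islice


def random_row_two_pass(filename):
    """Return one uniformly chosen row, reading the file twice."""
    # First pass: count parsed CSV rows rather than physical lines.
    with open(filename, newline="") as f:
        row_count = sum(1 for _ in csv.reader(f))
    if row_count == 0:
        raise ValueError("file contains no rows")

    row_number = random.randrange(row_count)

    # Second pass: skip ahead to the chosen row and return it.
    with open(filename, newline="") as f:
        return next(islice(csv.reader(f), row_number, None))
```

Opening the file with newline="" is what the csv module documentation recommends, so that line breaks embedded in quoted fields are handled by the reader itself.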

If you want to choose a row at random in a single pass, without knowing in advance how many rows there are, you can use reservoir sampling. This may be slower, as it makes a random choice for each row after the first until the file is exhausted, but it only needs to keep one row in memory at a time:

with open(filename) as f:
    reader = csv.reader(f)
    for index, row in enumerate(reader):
        if index == 0:
            chosen_row = row
        else:
            r = random.randint(0, index)
            if r == 0:
                chosen_row = row
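
The reservoir-sampling loop above can also be wrapped in a small helper that accepts any iterable of rows and reports an empty input instead of leaving chosen_row undefined. This is only a sketch: reservoir_choice is a hypothetical name, and the in-memory sample data stands in for a real file.

```python
import csv
import io
import random


def reservoir_choice(rows):
    """Pick one item uniformly at random from an iterable in a single pass."""
    chosen = None
    seen = 0
    for seen, row in enumerate(rows, start=1):
        # Replace the current pick with probability 1/seen, which leaves
        # every item seen so far equally likely to be the final choice.
        if random.randrange(seen) == 0:
            chosen = row
    if seen == 0:
        raise ValueError("no rows to choose from")
    return chosen


# Usage with in-memory data standing in for a real file:
sample = io.StringIO("apple\nbanana\ncherry\n")
print(reservoir_choice(csv.reader(sample)))
```

Because the helper only iterates, it works equally well on a csv.reader, a plain file object, or a generator.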

Solution 2

You could use pandas:

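import pandas as pd

# read_csv treats the first line as a header by default;
# pass header=None if the file has no header row.
csvfile = pd.read_csv('/your/file/path/here')
print(csvfile.sample())  # sample() returns one random row by default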

Solution 3

Since you stated that all words are in one column, that makes it easier to parse the file. Here is my solution:

import random

with open('random_word_from_file.txt') as f:
    words = f.read().split()
    my_pick = random.choice(words)
    print(my_pick)

Notes

  • In this solution, I assume the file fits comfortably in memory.
  • I used f.read().split() instead of f.readlines() because the latter does not strip newline characters off the words.
  • Once you have a list of words, it is just a matter of calling random.choice() to pick one at random.
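
The difference between the two calls mentioned in the notes can be seen on a small in-memory example. The sample text here is made up for illustration, and the splitlines() variant is an extra suggestion rather than part of the original answer.

```python
import io

text = "apple\nbanana\ncherry\n"

# readlines() keeps the trailing newline on every line.
print(io.StringIO(text).readlines())     # ['apple\n', 'banana\n', 'cherry\n']

# read().split() splits on any whitespace and drops the newlines.
print(io.StringIO(text).read().split())  # ['apple', 'banana', 'cherry']

# Caveat: split() would also break a multi-word entry such as "ice cream"
# into two items; splitlines() keeps each line whole.
print(io.StringIO("ice cream\ntea\n").read().splitlines())  # ['ice cream', 'tea']
```
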

Author: Elliot Lee

Updated on June 17, 2022

Comments

  • Elliot Lee (almost 2 years ago)

    I need to select a random word from a csv file and I just don't know how to start it off. All the words are in one column, but I want to get a random row so that I can output a random word. Any thoughts?

    • Peter Wood (about 7 years ago)
      Use the random and csv modules.
    • CodeCupboard (about 7 years ago)
      I would count the number of rows. From this, a random integer could be generated in the range of 1 to the number of rows. After that, simply read the word off at that row.
    • matth (about 7 years ago)
      If the file is too large to read into memory all at once, you could use reservoir sampling.
  • Hai Vu (about 7 years ago)
    Excellent way to count lines in a file.