How to find the frequency of words in a list created from a .csv file


Solution 1

Try using set()

import csv
with open('input1.csv', 'r') as wordsfile:
    words_reader = csv.reader(wordsfile)
    for row in words_reader:
        # set() keeps one copy of each word in the row
        list_of_words = set(row)
        for word in list_of_words:
            count = row.count(word)
            print(word, count)

I am not very familiar with the csv library and I don't know whether row is a list or not, so sorry if this throws an error. If row is a string, you can probably use

row = row.split(',')
list_of_words = set(row)

Hope it helps.
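For what it's worth, csv.reader() yields each row as a list of strings, so set(row) works directly. A set has no guaranteed order, though, so the words can come out in a different order than in the file. Here is a minimal sketch of the same idea with sorted() added so the output is predictable. The function name count_with_set and the in-memory sample (standing in for input1.csv) are just assumptions for illustration.

```python
import csv
import io

# Stand-in for input1.csv, using the sample line from the question.
sample = io.StringIO(
    "hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy\n"
)

def count_with_set(lines):
    """Return (word, count) pairs for each distinct word in each row."""
    pairs = []
    for row in csv.reader(lines):
        # row is a list of strings; set(row) drops the duplicates,
        # and sorted() gives the words a stable (alphabetical) order.
        for word in sorted(set(row)):
            pairs.append((word, row.count(word)))
    return pairs

result = count_with_set(sample)
for word, count in result:
    print(word, count)
```

Note that sorted() puts capitalized words first, so this is alphabetical order, not the order in the file.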

Solution 2

import csv

input1 = input()

list_of_words = []
with open(input1, 'r') as wordsfile:
    words_reader = csv.reader(wordsfile)
    for row in words_reader:
        # collect the words from every row, not just the last one
        list_of_words.extend(row)

# dict.fromkeys() drops duplicates but keeps the first-seen order
no_duplicates_in_list = list(dict.fromkeys(list_of_words))

for word in no_duplicates_in_list:
    print(word, list_of_words.count(word))

Pretty much the same as Aryman's, but the words are printed in the same order as in the csv.
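To see why the order is kept, here is a small sketch of what dict.fromkeys() does to the word list from the question. Dictionary keys are unique, and since Python 3.7 a dict is guaranteed to remember insertion order, so each word stays where it first appeared. The variable names are only for illustration.

```python
words = ["hello", "cat", "man", "hey", "dog", "boy", "Hello",
         "man", "cat", "woman", "dog", "Cat", "hey", "boy"]

# Each word becomes a dict key once; later repeats are ignored,
# and the keys stay in the order they were first inserted.
unique_in_order = list(dict.fromkeys(words))
print(unique_in_order)
```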

Solution 3

Alright, so I'm pretty basic with Python but I was able to figure this out in about an hour of trying different for loops etc. I ended up sticking to using lists as that is what the assignment indicated in the instructions. In order to get rid of the duplicates within the first list, I made a second list and nested an if statement that only adds words that aren't contained within it, resulting in a new list of one copy of each word from the first.


import csv

filename = input()
words = []
new_words = []
with open(filename, 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    for row in reader:
        for word in row:
            words.append(word)

# count only after every row has been read
for word in words:
    freq = words.count(word)
    if word not in new_words:
        new_words.append(word)
        print(word, freq)
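If you want to stay with lists only but avoid calling count() over and over, one possible variation is to keep a second list of counts alongside the list of unique words. This is just a sketch: the function name count_words_lists_only is made up, and the words are typed in directly here instead of being read from the file.

```python
def count_words_lists_only(words):
    """Count words using two parallel lists instead of a set or dict."""
    new_words = []  # one copy of each word, in first-seen order
    counts = []     # counts[i] is how many times new_words[i] appeared
    for word in words:
        if word not in new_words:
            new_words.append(word)
            counts.append(1)
        else:
            # find the word's position and bump the matching count
            counts[new_words.index(word)] += 1
    return new_words, counts

words = "hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy".split(",")
new_words, counts = count_words_lists_only(words)
for word, freq in zip(new_words, counts):
    print(word, freq)
```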
         

Author by

Admin

Updated on June 14, 2022

Comments

  • Admin
    Admin almost 2 years

    I am trying to write a program that first reads in the name of an input file and then reads the file using the csv.reader() method. The file contains a list of words separated by commas. The program should output the words and their frequencies (the number of times each word appears in the file) without any duplicates.

    The file input1.csv has hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy

    So far I have this:

    import csv
    with open('input1.csv', 'r') as wordsfile:
        words_reader = csv.reader(wordsfile)
        for row in words_reader:
            for word in row:
                count = row.count(word)
                print(word, count)
    

    But my output is this: "hello 1 cat 2 man 2 hey 2 dog 2 boy 2 Hello 1 man 2 cat 2 woman 1 dog 2 Cat 1 hey 2 boy 2"

    I am trying to output this but without any duplicates. I'm stumped, and any help would be appreciated.

    • wwii
      wwii over 3 years
      When you printed row and word in the loops did you see what you expected? If you are using an IDE now is a good time to learn its debugging features - like setting breakpoints and examining values. Or you could spend a little time and get familiar with the built-in Python debugger. Also, printing stuff at strategic points in your program can help you trace what is or isn't happening.
    • Admin
      Admin over 3 years
      @wwii I'm gonna be honest, I didn't understand anything you just said. I'm just a beginner at this and this was something we just learned in class.
  • Admin
    Admin over 3 years
    I wish I could upvote your comment but I'm new here, and too low rep. You were right, the row from the .csv is a list. What does set() do and how does it work? The output order seems random. Is there a way to make it so that the words come out in the same order I put them in?
  • Aryman Deshwal
    Aryman Deshwal over 3 years
    set() creates a set from the list, so it removes all repeated words. A set has no guaranteed order, so to keep the words in their original order you could use list_of_words = list(dict.fromkeys(row)) instead.
  • Joshua Desir
    Joshua Desir about 2 years
    This helped me get the word's frequency. I hope that helps. I was able to create a dict out of the words and their frequency, then print those keys and their values for the answer.
  • Admin
    Admin about 2 years
    Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
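For anyone wondering what the dict approach mentioned in the comments might look like, here is a rough sketch using collections.Counter from the standard library, which is a dict subclass made for counting. The function name word_frequencies and the in-memory sample (standing in for the real input file) are assumptions for illustration, not the commenter's actual code.

```python
import csv
import io
from collections import Counter

def word_frequencies(lines):
    """Return a Counter mapping each word to how often it appears."""
    counts = Counter()
    for row in csv.reader(lines):
        # update() adds one to the count for every word in the row
        counts.update(row)
    return counts

# Stand-in for the input file, using the sample line from the question.
sample = io.StringIO(
    "hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy\n"
)
freqs = word_frequencies(sample)
for word, freq in freqs.items():
    print(word, freq)
```

Because a Counter remembers insertion order like a regular dict (Python 3.7+), the words print in the order they first appear in the file, and each word is counted in a single pass.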