How do I make a histogram from a CSV file that contains a single column of numbers in Python?


Solution 1

You can do it in one line with pandas:

import pandas as pd

pd.read_csv('D1.csv', quoting=2)['column_you_want'].hist(bins=50)
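If the file has no header row, or you would rather not hard-code the column name, something along these lines may help. This is only a sketch: `load_single_column` and `has_header` are names made up for illustration, and it assumes the numbers you want are in the first column.

```python
import pandas as pd


def load_single_column(path, has_header=True):
    """Return the first column of a CSV file as a numeric Series.

    Hypothetical helper: assumes the values of interest are in the first column.
    """
    # header=None tells pandas that the first line is data, not column names.
    frame = pd.read_csv(path, header=0 if has_header else None)
    # errors="coerce" turns cells that cannot be parsed as numbers into NaN,
    # which dropna() then removes.
    column = pd.to_numeric(frame.iloc[:, 0], errors="coerce")
    return column.dropna()
```

You could then call `load_single_column('D1.csv', has_header=False).hist(bins=50)`. Outside a notebook you will probably still need `plt.show()` from matplotlib to see the figure.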

Solution 2

Okay, I finally got something to work, with axis labels, a title, etc.

import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('D1.csv', quoting=2)
data.hist(bins=50)
plt.xlim([0,115000])
plt.title("Data")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()

My first problem was that matplotlib is needed to actually display the graph. I also needed to assign the result of

pd.read_csv('D1.csv', quoting=2)

to a variable, data, so I could plot its histogram with

data.hist

Thank you all for the help.

Author: Daniel Hodgkins

Updated on June 04, 2022

Comments

  • Daniel Hodgkins, almost 2 years

    I have a CSV file (Excel spreadsheet) containing a single column of roughly a million numbers. I want to make a histogram of this data, with frequency on the y-axis and the values on the x-axis. I know matplotlib can plot a histogram, but my main problem is converting the CSV values from strings to floats, since strings can't be plotted. This is what I have:

    import matplotlib.pyplot as plt
    import csv
    
    with open('D1.csv', 'rb') as data:
        rows = csv.reader(data, quoting = csv.QUOTE_NONNUMERIC) 
        floats = [[item for number, item in enumerate(row) if item and (1 <= number <= 12)] for row in rows]
    plt.hist(floats, bins=50)
    plt.title("histogram")
    plt.xlabel("value")
    plt.ylabel("frequency")
    plt.show()
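
For anyone who wants to stay with the csv module rather than pandas, here is a rough Python 3 sketch of the same idea. It opens the file in text mode instead of 'rb', converts each cell with float(), and flattens everything into one list before plotting. The function names `read_floats` and `plot_histogram` are made up for illustration, and silently skipping non-numeric cells (such as a header) is an assumption about what you would want.

```python
import csv


def read_floats(path):
    """Flatten every numeric cell of a CSV file into one list of floats."""
    values = []
    # On Python 3 the csv module expects a text-mode file opened with newline="".
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            for cell in row:
                cell = cell.strip()
                if not cell:
                    continue  # ignore empty cells
                try:
                    values.append(float(cell))
                except ValueError:
                    # Assumed behaviour: skip non-numeric cells such as a header.
                    pass
    return values


def plot_histogram(values, bins=50):
    """Plot the values with the same labels as the snippet in the question."""
    # Imported here so read_floats can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    plt.hist(values, bins=bins)
    plt.title("histogram")
    plt.xlabel("value")
    plt.ylabel("frequency")
    plt.show()
```

Usage would be `plot_histogram(read_floats('D1.csv'))`. Passing one flat list to `plt.hist` gives a single histogram, whereas a list of per-row lists is treated as several separate datasets.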