Iterating over columns with for loops in pandas dataframe


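iteritems iterates over columns, not rows: each step yields a (column name, column) pair. Your real problem is that you index with df[row], which passes the column's values as keys, when you should index with the column name, df[index]. I'd rename the loop variables to reflect columns and do this: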

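# items() replaces iteritems(), which was removed in pandas 2.0
for colname, col in df.items():
    if colname == 'Sample':
        continue  # skip the x-axis column itself
    # categorical x values need an explicit x_range, as in the question's code
    p = figure(x_range=sorted(set(df['Sample'])))
    p.scatter(df['Sample'], df[colname])
    show(p)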
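
To see why df[row] fails while df[colname] works, here is a minimal pandas-only sketch (no plotting). It assumes a pandas version where items() is available and rebuilds the sample table from the question by hand. The variable names (pairs, lookup_failed, same_column) are made up for illustration.

```python
import pandas

# Small frame mirroring the sample data in the question.
df = pandas.DataFrame({
    "Sample": ["1A", "1B", "2A", "2B"],
    "AMP": [239847, 245098, 238759, 230029],
    "ADP": [239084, 241210, 200554, 215408],
    "ATP": [987374, 988950, 921032, 899804],
})

# items() yields one (column name, Series) pair per column.
pairs = [(name, list(values)) for name, values in df.items()]
column_names = [name for name, _ in pairs]

# Take the first pair: the name "Sample" and the Series holding its values.
first_name, first_col = next(iter(df.items()))

# Indexing with the Series treats its values ('1A', '1B', ...) as column
# labels, and none of them are column labels, so pandas raises KeyError.
try:
    df[first_col]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Indexing with the name returns the column itself.
same_column = df[first_name].equals(first_col)
```

So the second loop variable is already the column's data. You can pass it to scatter directly, or look the column up again by its name.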
Author by

JeremyD

Updated on June 04, 2022

Comments

  • JeremyD
    JeremyD almost 2 years

    I am trying to take a dataframe read in from a CSV file and generate a scatter plot for each column within the dataframe. For example, I have read in the following with df = pandas.read_csv():

    Sample    AMP    ADP    ATP
    1A        239847 239084 987374
    1B        245098 241210 988950
    2A        238759 200554 921032
    2B        230029 215408 899804
    

    I would like to generate a scatter plot for each column, using Sample as the x values and the peak areas in that column as the y values.

    I am using the following code with bokeh.plotting to plot each column manually

    import pandas
    from bokeh.plotting import figure, show
    
    df = pandas.read_csv("data.csv")
    p = figure(x_axis_label='Sample', y_axis_label='Peak Area', x_range=sorted(set(df['Sample'])))
    p.scatter(df['Sample'], df['AMP'])
    show(p)
    

    This generates scatter plots successfully, but I would like to create a loop to generate a scatter plot for each column. In my full dataset, I have over 500 columns I would like to plot.

    I have followed references for using df.iteritems and df.itertuples for iterating through dataframes, but I'm not sure how to get the output I want.

    I have tried the following:

    for index, row in df.iteritems():
        p = figure()
        p.scatter(df['Sample'], df[row])
        show(p)
    

    I hit an error right away:

    raise KeyError('%s not in index' % objarr[mask])
    KeyError: "['1A' '1B' '2A' '2B'] not in index"

    Any guidance? Thanks in advance.