pyplot x-axis being sorted

29,226

Solution 1

You want to plot only the number of uses, then set the x-axis labels to the bike ID numbers. When you call plt.plot(b, c), the numeric bike IDs are used as x-coordinates, which is why the points come out in numerical order. So when you plot, don't include the bike IDs: just do plt.plot(c). If you give the plot function only one argument, it creates the x-values itself, in this case equivalent to range(len(c)) (for a Series with the default index that read_csv produces, that is 0, 1, 2, ...). Then you can change the labels on the x-axis to the bike IDs with plt.xticks, passing it the list of x-positions and the list of labels: plt.xticks(range(len(c)), b).

Try this:

import pandas as pd
import matplotlib.pyplot as plt

# read in the file and separate it into two lists
a = pd.read_csv('Sorted_Bike_Uses.csv', header=0)
b = a['Bike ID']
c = a['Number of Uses']

# create the graph
plt.plot(c)

# label the x and y axes
plt.xlabel('Bicycles', weight='bold', size='large')
plt.ylabel('Number of Rides', weight='bold', size='large')

# format the x and y ticks
plt.xticks(range(len(c)), b, rotation=50, horizontalalignment='right', weight='bold', size='large')
plt.yticks(weight='bold', size='large')

# give it a title
plt.title("Top Ten Bicycles (by # of uses)", weight='bold')

# displays the graph
plt.show()
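
As an alternative, here is a minimal sketch of a different technique from the one above: passing the IDs as strings so matplotlib treats them as categories and keeps them in the order given. This assumes a reasonably recent matplotlib (string categories arrived around version 2.1, as far as I know), and the data values are made up for illustration rather than taken from the CSV file.

```python
import matplotlib.pyplot as plt

# Hypothetical data, already ordered from least to most ridden.
bike_ids = [4432, 3314, 5454, 3432]
uses = [5, 9, 11, 23]

# Strings are treated as categories, so they are placed at positions
# 0, 1, 2, ... in the order they appear instead of by numeric value.
labels = [str(bike_id) for bike_id in bike_ids]

fig, ax = plt.subplots()
line, = ax.plot(labels, uses, marker='o')
ax.set_xlabel('Bicycles', weight='bold', size='large')
ax.set_ylabel('Number of Rides', weight='bold', size='large')
```

Call plt.show() afterwards to display it. With a DataFrame, b.astype(str) should give the same effect as the list comprehension.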

Solution 2

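If you use the .plot method of pandas.DataFrame, just grab the resulting axes and call set_xticklabels:

import pandas as pd

a = pd.DataFrame({'Bike ID': [5454, 3432, 4432, 3314],
                  'Number of Uses': [11, 23, 5, 9]})
# DataFrame.sort has been removed from pandas; sort_values replaces it
a.sort_values(by='Number of Uses', inplace=True)
ax = a.plot(y='Number of Uses', kind='bar')
# a is now sorted, so the labels line up with the bars
_ = ax.set_xticklabels(a['Bike ID'])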
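
If you are worried about the labels drifting out of step with the bars, one option is to let pandas do the labelling by passing the ID column as x. This is only a sketch with the same made-up data as above. The heights and labels lists are there purely to check that each bar sits over the right ID.

```python
import pandas as pd
import matplotlib.pyplot as plt

a = pd.DataFrame({'Bike ID': [5454, 3432, 4432, 3314],
                  'Number of Uses': [11, 23, 5, 9]})

# Sort a copy by number of uses, least ridden first.
ordered = a.sort_values(by='Number of Uses')

# Passing x='Bike ID' makes pandas label each bar with its own row's ID.
ax = ordered.plot(x='Bike ID', y='Number of Uses', kind='bar', legend=False)
ax.set_ylabel('Number of Rides')

# Read back what was drawn: bar heights and tick labels, left to right.
heights = [patch.get_height() for patch in ax.patches]
labels = [tick.get_text() for tick in ax.get_xticklabels()]
```

Here bike 3432 ends up as the last and tallest bar, with 23 uses.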

[Image: bar chart of Number of Uses by Bike ID]

Author: ay-ay-ron

Updated on June 30, 2020

Comments

  • ay-ay-ron
    ay-ay-ron almost 4 years

    This is all on a Windows 7 64-bit machine, running 64-bit Python 3.4.3 in PyCharm Educational Edition 1.0.1. The data being used for this program is taken from the Citi Bike program in New York City (data found here: http://www.citibikenyc.com/system-data).

    I have sorted the data so that I have a new CSV file with just the unique bike IDs and how many times each bicycle was ridden (the file is called Sorted_Bike_Uses.csv). I am trying to make a graph of the bike IDs against the number of uses (bike IDs on the x-axis, number of uses on the y-axis). My code looks like this:

    import pandas as pd
    import matplotlib.pyplot as plt
    
    # read in the file and separate it into two lists
    a = pd.read_csv('Sorted_Bike_Uses.csv', header=0)
    b = a['Bike ID']
    c = a['Number of Uses']
    
    # create the graph
    plt.plot(b, c)
    
    # label the x and y axes
    plt.xlabel('Bicycles', weight='bold', size='large')
    plt.ylabel('Number of Rides', weight='bold', size='large')
    
    # format the x and y ticks
    plt.xticks(rotation=50, horizontalalignment='right', weight='bold', size='large')
    plt.yticks(weight='bold', size='large')
    
    # give it a title
    plt.title("Top Ten Bicycles (by # of uses)", weight='bold')
    
    # displays the graph
    plt.show()
    

    It creates an almost correctly formatted graph. The only issue is that it sorts the bike IDs so that they are in numerical order, rather than in order of uses. I have tried repurposing old code that I used to make a similar graph, but it just makes an even worse graph that somehow has two sets of data being plotted. It looks like this:

    my_plot = a.sort_values(by='Number of Uses', ascending=True).plot(kind='bar', legend=None)
    
    # labels the x and y axes
    my_plot.set_xlabel('Bicycles')
    my_plot.set_ylabel('Number of Rides')
    
    # sets the labels along the x-axis to the bike IDs
    my_plot.set_xticklabels(b, rotation=45, horizontalalignment='right')
    
    # displays the graph
    plt.show()
    

    The second set of code uses the same data as the first, and has been changed from the original to fit the Citi Bike data. My Google-fu is exhausted. I have tried reformatting the xticks, adding pieces of the second code to the first, adding pieces of the first code to the second, etc. It is probably something staring me right in the face, but I can't see it. Any help is appreciated.

    • tacaswell
      tacaswell almost 9 years
      Because plot(b, c) uses b as the x-coordinates, so each point is placed at its bike ID's numeric value. If you want them plotted in order of rides, use their position in the sorted order as the x-values.
    • ay-ay-ron
      ay-ay-ron almost 9 years
      I want to plot them so that the bike IDs are on the x-axis and stay in the order they have in the CSV file. In the file they run from least ridden bike to most ridden. However, when they are plotted on the graph they are sorted in numerical order, not from least to most ridden. Somewhere in the code the ordering is being switched.
  • Roy Shilkrot
    Roy Shilkrot about 6 years
    The order in the chart is not correct though. 3432 has 23, not 9 like the chart suggests.