Pandas Dataframe - RemoteDataError - Python


pandas_datareader raises RemoteDataError when Yahoo does not make data for the requested ticker available through its API.

When you iterate over your .csv file line by line, each ticker still carries its trailing newline character, so pandas_datareader requests a symbol that Yahoo doesn't recognize. Stripping the newline first:

data = web.DataReader(ticker.strip('\n'), "yahoo", datetime(2011, 1, 1), datetime(2015, 12, 31))

works for me with a file that lists tickers in the first column.
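To make the newline issue concrete, here is a minimal sketch that uses an in-memory file as a stand-in for the real .csv (the ticker list and the `fake_file` name are assumptions for illustration only). It shows what the loop variable actually contains and one way to clean it up:

```python
import io

# Stand-in for open("testlist.csv", "r"); the contents are made up for illustration.
fake_file = io.StringIO("AAPL\nMSFT\n\nGOOG \n")

# Iterating over a text file yields each line with its newline still attached.
raw = [line for line in fake_file]

# Rewind, then strip whitespace and skip blank lines before using the tickers.
fake_file.seek(0)
clean = [line.strip() for line in fake_file if line.strip()]

print(raw)
print(clean)
```

Calling strip() with no argument also removes stray spaces and carriage returns, which is slightly more forgiving than strip('\n').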

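It might be easier to read the tickers with pandas:

tickers = pd.read_csv('Companylistnysenasdaq.csv')
for ticker in tickers.iloc[:, 0].tolist():
    data = web.DataReader(ticker, "yahoo", datetime(2011, 1, 1), datetime(2015, 12, 31))

This assumes your file is a simple list with the tickers in the first column. You might need header=None in read_csv, depending on whether your file has a header row.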
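The effect of header=None can be seen with a small sketch. The file contents below are an assumption (a bare list of tickers with no header row), read from an in-memory buffer instead of the real file:

```python
import io

import pandas as pd

# Assumed file layout: one ticker per line, no header row.
csv_text = "AAPL\nMSFT\nGOOG\n"

# Default behaviour: the first line is consumed as the column header.
with_header = pd.read_csv(io.StringIO(csv_text))
tickers_with_header = with_header.iloc[:, 0].tolist()

# header=None: every line is treated as data.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)
tickers_no_header = no_header.iloc[:, 0].tolist()

print(tickers_with_header)
print(tickers_no_header)
```

If the file has no header row and header=None is omitted, the first ticker silently disappears from the list, so it is worth checking the length of the result against the file.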

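To handle tickers that Yahoo has no data for, wrap the DataReader call itself in a try block:

from pandas_datareader._utils import RemoteDataError

try:
    stockData = web.DataReader(ticker, 'yahoo', datetime(2015, 1, 1), datetime.today())
except RemoteDataError:
    # handle the error, e.g. report the ticker and move on
    print("No data for ticker '%s'" % ticker)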
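Applied to a whole list of tickers, the same pattern might look like the sketch below. To keep it runnable without network access, the download function and the exception class are hypothetical stand-ins (`fake_fetch`, `FakeRemoteDataError`, and the helper `fetch_all` are not part of pandas_datareader); in real use you would pass a function that calls web.DataReader and use RemoteDataError as the error type:

```python
def fetch_all(tickers, fetch, error_type):
    """Call fetch(ticker) for each ticker, collecting failures instead of crashing."""
    results = {}
    failed = []
    for ticker in tickers:
        try:
            # The call that can fail must sit inside the try block.
            results[ticker] = fetch(ticker)
        except error_type:
            failed.append(ticker)
    return results, failed


class FakeRemoteDataError(IOError):
    """Hypothetical stand-in for pandas_datareader's RemoteDataError."""


def fake_fetch(ticker):
    # Hypothetical stand-in for web.DataReader: pretend one ticker has no data.
    if ticker == "TFSCU":
        raise FakeRemoteDataError("Unable to read URL")
    return "data for " + ticker


results, failed = fetch_all(["AAPL", "TFSCU", "MSFT"], fake_fetch, FakeRemoteDataError)
print(sorted(results))
print(failed)
```

Collecting the failed tickers in a list makes it easy to report or retry them once the loop has finished.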
Author by

RageAgainstheMachine

Updated on June 04, 2022

Comments

  • RageAgainstheMachine
    RageAgainstheMachine almost 2 years

    I'm trying to pull data from yahoo finance.

    Here is the error I'm getting:

    File "banana.py", line 35, in <module>
        data = web.DataReader(ticker, "yahoo", datetime(2011,1,1), datetime(2015,12,31))
      File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\data.py", line 94, in DataReader
        session=session).read()
      File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\yahoo\daily.py", line 77, in read
        df = super(YahooDailyReader, self).read()
      File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\base.py", line 173, in read
        df = self._read_one_data(self.url, params=self._get_params(self.symbols))
      File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\base.py", line 80, in _read_one_data
        out = self._read_url_as_StringIO(url, params=params)
      File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\base.py", line 91, in _read_url_as_StringIO
        response = self._get_response(url, params=params)
      File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\base.py", line 117, in _get_response
        raise RemoteDataError('Unable to read URL: {0}'.format(url))
    pandas_datareader._utils.RemoteDataError: Unable to read URL: http://ichart.finance.yahoo.com/table.csv
    

    The error shows up when I read from a .csv file instead of a list of tickers:

    This works:

    for ticker in ['MSFT']:
    

    This doesn't:

    input_file = open("testlist.csv", 'r')
    for ticker in input_file: 
    

    I've even added exception handlers (see below), but it's still not working:

        except RemoteDataError:
            print("No information for ticker '%s'" % t)
            continue
    
        except IndexError:
            print("Something went wacko for ticker '%s', trying again..." % t)
            continue
    
        except Exception, e:
            continue
    
        except:
            print "Can't find ", ticker
    

    My code:

    from datetime import datetime
    from pandas_datareader import data, wb
    import pandas_datareader.data as web
    import pandas as pd
    from pandas_datareader._utils import RemoteDataError
    import csv
    import sys 
    import os
    
    class MonthlyChange(object):
        months = { 0:'JAN', 1:'FEB', 2:'MAR', 3:'APR', 4:'MAY',5:'JUN', 6:'JUL', 7:'AUG', 8:'SEP', 9:'OCT',10:'NOV', 11:'DEC' }

        def __init__(self,month):
            self.month = MonthlyChange.months[month-1]
            self.sum_of_pos_changes=0
            self.sum_of_neg_changes=0
            self.total_neg=0
            self.total_pos=0

        def add_change(self,change):
            if change < 0:
                self.sum_of_neg_changes+=change
                self.total_neg+=1
            elif change > 0:
                self.sum_of_pos_changes+=change
                self.total_pos+=1

        def get_data(self):
            if self.total_pos == 0:
                return (self.month,0.0,0,self.sum_of_neg_changes/self.total_neg,self.total_neg)
            elif self.total_neg == 0:
                return (self.month,self.sum_of_pos_changes/self.total_pos,self.total_pos,0.0,0)
            else:
                return (self.month,self.sum_of_pos_changes/self.total_pos,self.total_pos,self.sum_of_neg_changes/self.total_neg,self.total_neg)
    
    input_file = open("Companylistnysenasdaq.csv", 'r')
    
    for ticker in input_file:
        print(ticker)
        data = web.DataReader(ticker, "yahoo", datetime(2011,1,1), datetime(2015,12,31))
        data['ymd'] = data.index
        year_month = data.index.to_period('M')
        data['year_month'] = year_month
        first_day_of_months = data.groupby(["year_month"])["ymd"].min()
        first_day_of_months = first_day_of_months.to_frame().reset_index(level=0)
        last_day_of_months = data.groupby(["year_month"])["ymd"].max()
        last_day_of_months = last_day_of_months.to_frame().reset_index(level=0)
        fday_open = data.merge(first_day_of_months,on=['ymd'])
        fday_open = fday_open[['year_month_x','Open']]
        lday_open = data.merge(last_day_of_months,on=['ymd'])
        lday_open = lday_open[['year_month_x','Open']]

        fday_lday = fday_open.merge(lday_open,on=['year_month_x'])
        monthly_changes = {i:MonthlyChange(i) for i in range(1,13)}
        for index,ym, openf,openl in fday_lday.itertuples():
            month = ym.strftime('%m')
            month = int(month)
            diff = (openf-openl)/openf
            monthly_changes[month].add_change(diff)
        changes_df = pd.DataFrame([monthly_changes[i].get_data() for i in monthly_changes],columns=["Month","Avg Inc.","Inc","Avg.Dec","Dec"])

        t = ticker.strip()
        j = 0
        while j < 13:

            try:
                if len(changes_df.loc[changes_df.Inc > 2,'Month']) != 0:
                    print ticker
                    print ("Increase Months: ")
                    print (changes_df.loc[changes_df.Inc > 2,'Month'])

                if len(changes_df.loc[changes_df.Dec > 2,'Month']) != 0:
                    print ticker
                    print ("Decrease Months: ")
                    print (changes_df.loc[changes_df.Dec > 2,'Month'])

                j += 13

            except RemoteDataError:
                print("No information for ticker '%s'" % t)
                j += 13
                continue

            except IndexError:
                print("Something went googoo for ticker '%s', trying again..." % t)
                j += 1
                time.sleep(30)
                continue

            except Exception, e:
                j+=13
                time.sleep(30)
                continue

            except:
                print "Can't find ", ticker
    
    
    input_file.close()
    
  • RageAgainstheMachine
    RageAgainstheMachine almost 8 years
    I just tested it with only MSFT in the .csv file and still no luck...
  • Stefan
    Stefan almost 8 years
    If you do print(ticker) right after for ticker in input_file:, does it show as expected, without spaces or other changes to the ticker?
  • RageAgainstheMachine
    RageAgainstheMachine almost 8 years
    You're on to something here... I get the SAME error. Actually, it prints the ticker, but I STILL see the error show up afterwards.
  • RageAgainstheMachine
    RageAgainstheMachine almost 8 years
    It says it's on line 40: data = web.DataReader(ticker, "yahoo", datetime(2011,1,1), datetime(2015,12,31))
  • Stefan
    Stefan almost 8 years
    Which ticker is on that line? Chances are that if you try the ticker in the online portal, you'll notice there is no 'download historical data to excel' button on the 'historical prices' tab.
  • RageAgainstheMachine
    RageAgainstheMachine almost 8 years
    The ticker is AAPL, no way that one doesn't have historical data! "ticker" refers to the tickers inside the .csv, and inside the .csv I put AAPL.
  • Stefan
    Stefan almost 8 years
    The query runs fine with 'AAPL' and your start/end dates for me.
  • Merlin
    Merlin almost 8 years
    Please see the edit and add the top few lines of the results to the question.
  • Stefan
    Stefan almost 8 years
    See updated answer, there's an issue with how you are reading the .csv file, your tickers include newline characters ('\n').
  • RageAgainstheMachine
    RageAgainstheMachine almost 8 years
    @Stefan, it works (I made your changes), but for some tickers it keeps crashing with the exact same error (e.g. TFSCU). How would you approach this? I tried using the RemoteDataError exception but it does nothing...
  • RageAgainstheMachine
    RageAgainstheMachine almost 8 years
    @Stefan, great job, much appreciated! There's one last thing: the way I set the date information in the for loop doesn't seem to be working. Can you figure out where I messed up? Thanks.
  • Stefan
    Stefan almost 8 years
    Would you mind asking a new question where you show how exactly you are setting the date info, and what error message you are receiving? Easier to respond that way.
  • RageAgainstheMachine
    RageAgainstheMachine almost 8 years
    @Stefan, there's no error message. I'm just not getting accurate returns based on how the dates are set up.
  • Stefan
    Stefan almost 8 years
    If it's not about the RemoteDataError but about what is going on where you iterate through your data, you should ask a new question as this is a bit beyond the scope of this one.
  • RageAgainstheMachine
    RageAgainstheMachine almost 8 years