How to get data from pickle files into a pandas dataframe
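You can use pd.read_pickle(filename) to load each file:
- add each loaded object to a list
- then combine them with pd.concat(thelist) (every item in the list must be a DataFrame or Series)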
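The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the names load_pickles, record_to_dict and the source_file column are hypothetical, and it assumes each pickle holds either a DataFrame or a list of records (dicts, or objects such as tweets that keep their fields as ordinary instance attributes). Unpickling custom objects also requires that the module defining their class can be imported.

```python
import glob

import pandas as pd


def record_to_dict(record):
    """Turn one unpickled item into a plain dict of column -> value."""
    if isinstance(record, dict):
        return record
    # Assumption: other objects (e.g. tweet objects) store their fields
    # as ordinary instance attributes, so vars() exposes them.
    return vars(record)


def load_pickles(pattern):
    """Load every pickle file matching `pattern` into one DataFrame."""
    frames = []
    for filename in sorted(glob.glob(pattern)):
        obj = pd.read_pickle(filename)
        if not isinstance(obj, pd.DataFrame):
            # A pickled list of records has to be converted first,
            # because pd.concat only accepts pandas objects.
            obj = pd.DataFrame([record_to_dict(r) for r in obj])
        obj["source_file"] = filename  # remember which file each row came from
        frames.append(obj)
    # pd.concat raises ValueError if no files matched the pattern.
    return pd.concat(frames, ignore_index=True)
```

Calling load_pickles("*.pkl") in the directory that holds the dump files would then return a single DataFrame with one row per record.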
Author: Andrew Smith
Updated on June 21, 2022
I'm working on a social media sentiment analysis for a class. I have saved all of the tweets about the Kentucky Derby from a two-month period into .pkl files.
My question is: how do I get all of these pickle dump files loaded into a dataframe?
Here is my code:
import pickle as pkl
from datetime import date, timedelta

import sklearn as sk
import pandas as pd
import got3


def daterange(start_date, end_date):
    # Yield each date from start_date up to, but not including, end_date.
    for n in range(int((end_date - start_date).days)):
        yield start_date + timedelta(n)


start_date = date(2016, 3, 31)
end_date = date(2016, 6, 1)

dates = []
for single_date in daterange(start_date, end_date):
    dates.append(single_date.strftime("%Y-%m-%d"))

# Fetch one day of tweets at a time and dump each day to its own pickle file.
for i in range(len(dates) - 1):
    this_date = dates[i]
    tomorrow_date = dates[i + 1]
    print("Getting tweets for " + tomorrow_date)
    tweetCriteria = got3.manager.TweetCriteria()
    tweetCriteria.setQuerySearch("Kentucky Derby")
    tweetCriteria.setQuerySearch("KYDerby")
    tweetCriteria.setSince(this_date)
    tweetCriteria.setUntil(tomorrow_date)
    Kentucky_Derby_tweets = got3.manager.TweetManager.getTweets(tweetCriteria)
    with open(tomorrow_date + ".pkl", "wb") as f:
        pkl.dump(Kentucky_Derby_tweets, f)