How can I retrieve all Tweets and attributes for a given user using Python?
Solution 1
If you're open to trying another library, you could give rauth a shot. There's already a Twitter example but if you're feeling lazy and just want a working example, here's how I'd modify that demo script:
from rauth import OAuth1Service

# Get a real consumer key & secret from https://dev.twitter.com/apps/new
twitter = OAuth1Service(
    name='twitter',
    consumer_key='J8MoJG4bQ9gcmGh8H7XhMg',
    consumer_secret='7WAscbSy65GmiVOvMU5EBYn5z80fhQkcFWSLMJJu4',
    request_token_url='https://api.twitter.com/oauth/request_token',
    access_token_url='https://api.twitter.com/oauth/access_token',
    authorize_url='https://api.twitter.com/oauth/authorize',
    base_url='https://api.twitter.com/1/')

request_token, request_token_secret = twitter.get_request_token()

authorize_url = twitter.get_authorize_url(request_token)

print 'Visit this URL in your browser: ' + authorize_url
pin = raw_input('Enter PIN from browser: ')

session = twitter.get_auth_session(request_token,
                                   request_token_secret,
                                   method='POST',
                                   data={'oauth_verifier': pin})

params = {'screen_name': 'github',  # User to pull Tweets from
          'include_rts': 1,         # Include retweets
          'count': 10}              # 10 tweets

r = session.get('statuses/user_timeline.json', params=params)

for i, tweet in enumerate(r.json(), 1):
    handle = tweet['user']['screen_name'].encode('utf-8')
    text = tweet['text'].encode('utf-8')
    print '{0}. @{1} - {2}'.format(i, handle, text)
You can run this as-is, but be sure to update the credentials! These are meant for demo purposes only.
Full disclosure, I am the maintainer of rauth.
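The script above pulls a single page of 10 tweets. To walk further back through a timeline, the user_timeline endpoint accepts a max_id parameter, so one approach is to keep requesting pages and set max_id just below the oldest tweet seen so far. The following is only a sketch in Python 3: iter_timeline and make_fetcher are hypothetical helper names, not part of rauth, and the page size of 200 and the cap of 3200 are the limits Twitter documents for this endpoint.

```python
def iter_timeline(fetch_page, page_size=200, limit=3200):
    """Yield tweets newest-first, paging backwards with max_id.

    fetch_page is a hypothetical callable: fetch_page(count, max_id) returns
    a list of tweet dicts, each carrying an integer 'id'. max_id=None means
    "start from the newest tweet".
    """
    max_id = None
    yielded = 0
    while yielded < limit:
        count = min(page_size, limit - yielded)
        page = fetch_page(count, max_id)
        if not page:
            return  # nothing older is available
        for tweet in page:
            if yielded >= limit:
                return
            yield tweet
            yielded += 1
        # max_id is inclusive, so step just below the oldest tweet in this page
        max_id = min(t['id'] for t in page) - 1


def make_fetcher(session, screen_name):
    """Hypothetical adapter around the rauth session from the script above."""
    def fetch_page(count, max_id):
        params = {'screen_name': screen_name, 'include_rts': 1, 'count': count}
        if max_id is not None:
            params['max_id'] = max_id
        return session.get('statuses/user_timeline.json', params=params).json()
    return fetch_page
```

With an authorized session in hand, something like `for tweet in iter_timeline(make_fetcher(session, 'github')): ...` would then iterate over as much of the timeline as the API returns.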
Solution 2
You're getting a 401 response, which means "Unauthorized" (see HTTP status codes).
Your code looks good. Using api.user_timeline(screen_name="some_screen_name") works for me in the old example I have lying around.
I'm guessing you either need to authorize the app, or there is some problem with your OAuth setup.
Maybe you found this already, but here is the short code example that I started from: https://github.com/nloadholtes/tweepy/blob/nloadholtes-examples/examples/oauth.py
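As a quick aid for reading the numeric codes in a TweepError message, the standard library can translate them into their HTTP reason phrases. This is a minimal Python 3 sketch; describe_status is a hypothetical helper name, and it only reports the generic HTTP meaning, not Twitter's specific reason for the response.

```python
from http import HTTPStatus


def describe_status(code):
    """Return 'code phrase' for a standard HTTP status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # Not a code the standard library knows about
        return '{0} (not a standard HTTP status)'.format(code)
    return '{0} {1}'.format(status.value, status.phrase)


for code in (401, 403, 404):
    print(describe_status(code))
```

Broadly, 401 points at credentials that are missing or were rejected, 403 at a request the server understood but refused, and 404 at a resource that could not be found.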
chowden
Updated on July 16, 2022
Comments
-
chowden almost 2 years
I am attempting to retrieve data from Twitter using Tweepy, for a username typed at the command line. I want to extract quite a bit of data about the status and user, so I have come up with the following:
Note that I am importing all the required modules OK and have OAuth + keys (just not included here), and the filename is correct, it has just been changed:
# define user to get tweets for. accepts input from user
user = tweepy.api.get_user(input("Please enter the twitter username: "))

# Display basic details for twitter user name
print(" ")
print("Basic information for", user.name)
print("Screen Name:", user.screen_name)
print("Name: ", user.name)
print("Twitter Unique ID: ", user.id)
print("Account created at: ", user.created_at)

timeline = api.user_timeline(screen_name=user, include_rts=True, count=100)
for tweet in timeline:
    print("ID:", tweet.id)
    print("User ID:", tweet.user.id)
    print("Text:", tweet.text)
    print("Created:", tweet.created_at)
    print("Geo:", tweet.geo)
    print("Contributors:", tweet.contributors)
    print("Coordinates:", tweet.coordinates)
    print("Favorited:", tweet.favorited)
    print("In reply to screen name:", tweet.in_reply_to_screen_name)
    print("In reply to status ID:", tweet.in_reply_to_status_id)
    print("In reply to status ID str:", tweet.in_reply_to_status_id_str)
    print("In reply to user ID:", tweet.in_reply_to_user_id)
    print("In reply to user ID str:", tweet.in_reply_to_user_id_str)
    print("Place:", tweet.place)
    print("Retweeted:", tweet.retweeted)
    print("Retweet count:", tweet.retweet_count)
    print("Source:", tweet.source)
    print("Truncated:", tweet.truncated)
I would eventually like this to iterate through all of a user's tweets (up to the 3200 limit). First things first, though. So far I have two problems. First, I get the following error message regarding retweets:
Please enter the twitter username: barackobama
Traceback (most recent call last):
  File " usertimeline.py", line 64, in <module>
    timeline = api.user_timeline(screen_name=user, count=100, page=1)
  File "C:\Python32\lib\site-packages\tweepy-1.4-py3.2.egg\tweepy\binder.py", line 153, in _call
    raise TweepError(error_msg)
tweepy.error.TweepError: Twitter error response: status code = 401

Traceback (most recent call last):
  File "usertimeline.py", line 42, in <module>
    user = tweepy.api.get_user(input("Please enter the twitter username: "))
  File "C:\Python32\lib\site-packages\tweepy-1.4-py3.2.egg\tweepy\binder.py", line 153, in _call
    raise TweepError(error_msg)
tweepy.error.TweepError: Twitter error response: status code = 404
Passing the username as a variable seems to be a problem also:
Traceback (most recent call last):
  File " usertimleline.py", line 64, in <module>
    timeline = api.user_timeline(screen_name=user, count=100, page=1)
  File "C:\Python32\lib\site-packages\tweepy-1.4-py3.2.egg\tweepy\binder.py", line 153, in _call
    raise TweepError(error_msg)
tweepy.error.TweepError: Twitter error response: status code = 401
I've isolated both of these errors, i.e. they occur independently of each other.
Forgive my ignorance, I am not too hot with Twitter APIs but am learning pretty rapidly. Tweepy documentation really does suck and I've done loads of reading around on the net, but I just can't seem to get this fixed. If I can get this sorted, I'll be posting up some documentation.
I know how to transfer the data into a MySQL db once extracted (it will do that, rather than print to screen) and manipulate it so that I can do stuff with it; it is just getting it out that I am having problems with. Does anyone have any ideas, or is there another method I should be considering?
Any help really appreciated. Cheers
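Since the end goal described above is to store these attributes rather than print them, one option is to collect each status into a dict keyed by attribute name, which maps naturally onto a database row. This is a sketch only, in Python 3: tweet_to_row and FIELDS are hypothetical names, the field list is a subset of the attributes printed in the script, and a SimpleNamespace stands in for a real Tweepy status object so the example runs without network access.

```python
from types import SimpleNamespace

# Subset of the attributes printed in the script above
FIELDS = ('id', 'text', 'created_at', 'retweet_count', 'source', 'truncated')


def tweet_to_row(tweet, fields=FIELDS):
    """Return a dict of the requested attributes; missing ones become None."""
    return {name: getattr(tweet, name, None) for name in fields}


# Stand-in for a status object returned by the API (assumption for the demo)
sample = SimpleNamespace(id=1, text='hello', retweet_count=3)
print(tweet_to_row(sample))
```

Using getattr with a default means a status that lacks one of the attributes yields None for that column instead of raising AttributeError.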
EDIT:
Following on from @Eric Olson's suggestion this morning, I did the following:
1) Created a completely brand new set of OAuth credentials to test.
2) Copied code across to a new script as follows:
OAuth
consumer_key = "(removed)"
consumer_secret = "(removed)"
access_key = "88394805-(removed)"
access_secret = "(removed)"

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

# confirm account being used for OAuth
print("API NAME IS: ", api.me().name)
api.update_status("Using Tweepy from the command line")
The first time I run the script, it works fine: it updates my status and returns the API name as follows:
>>> API NAME IS: Chris Howden
Then from that point on I get this:
Traceback (most recent call last):
  File "C:/Users/Chris/Dropbox/Uni_2012-3/6CC995 - Independent Studies/Scripts/get Api name and update status.py", line 19, in <module>
    api.update_status("Using Tweepy frm the command line")
  File "C:\Python32\lib\site-packages\tweepy-1.4-py3.2.egg\tweepy\binder.py", line 153, in _call
    raise TweepError(error_msg)
tweepy.error.TweepError: Twitter error response: status code = 403
The only reason I can see for it doing something like this is that it is rejecting the generated access token. I shouldn't need to renew the access token, should I?
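One possibility worth ruling out, offered as an assumption since nothing in this post confirms it: Twitter refuses a status update whose text duplicates the account's most recent tweet and reports that as a 403, which would match a script that succeeds once and fails on every later run with the same text. A simple way to test that theory is to make each status unique. The sketch below is Python 3, and unique_status is a hypothetical helper name.

```python
from datetime import datetime, timezone


def unique_status(text, now=None):
    """Append a UTC timestamp so repeated runs never post identical text."""
    now = now or datetime.now(timezone.utc)
    return '{0} [{1}]'.format(text, now.strftime('%Y-%m-%d %H:%M:%S'))


print(unique_status("Using Tweepy from the command line"))
```

If passing the result to api.update_status makes the 403 go away on repeated runs, the access token was probably fine all along.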
-
chowden about 11 years
Cheers. I've done a little bit more investigation this morning and I've added some additional findings onto the original post...
-
chowden about 11 years
Ace, thanks for your efforts. In the meantime I have managed to find another way to get everything I wanted out using the tweepy module, but this helps me understand JSON a little better.
-
chowden about 11 years
I will post what I've found when it's all complete.