How can I retrieve all Tweets and attributes for a given user using Python?


Solution 1

If you're open to trying another library, you could give rauth a shot. There's already a Twitter example but if you're feeling lazy and just want a working example, here's how I'd modify that demo script:

from rauth import OAuth1Service

# Get a real consumer key & secret from https://dev.twitter.com/apps/new
twitter = OAuth1Service(
    name='twitter',
    consumer_key='J8MoJG4bQ9gcmGh8H7XhMg',
    consumer_secret='7WAscbSy65GmiVOvMU5EBYn5z80fhQkcFWSLMJJu4',
    request_token_url='https://api.twitter.com/oauth/request_token',
    access_token_url='https://api.twitter.com/oauth/access_token',
    authorize_url='https://api.twitter.com/oauth/authorize',
    base_url='https://api.twitter.com/1/')

request_token, request_token_secret = twitter.get_request_token()

authorize_url = twitter.get_authorize_url(request_token)

# Python 3 syntax (the question runs Python 3.2): print() and input()
print('Visit this URL in your browser: ' + authorize_url)
pin = input('Enter PIN from browser: ')

session = twitter.get_auth_session(request_token,
                                   request_token_secret,
                                   method='POST',
                                   data={'oauth_verifier': pin})

params = {'screen_name': 'github',  # User to pull Tweets from
          'include_rts': 1,         # Include retweets
          'count': 10}              # 10 tweets

r = session.get('statuses/user_timeline.json', params=params)

# Strings are already Unicode in Python 3, so no .encode() is needed
for i, tweet in enumerate(r.json(), 1):
    handle = tweet['user']['screen_name']
    text = tweet['text']
    print('{0}. @{1} - {2}'.format(i, handle, text))

You can run this as-is, but be sure to update the credentials! These are meant for demo purposes only.

Full disclosure, I am the maintainer of rauth.
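
The loop in the script prints only the handle and text. Since each item in `r.json()` is a plain dictionary, other attributes can be pulled out the same way. A minimal sketch, assuming the field names match the attributes listed in the question's Tweepy code (`retweet_count`, `in_reply_to_screen_name`, and so on); `FIELDS`, `extract_fields` and the sample dictionary are made up for illustration:

```python
# Top-level tweet keys to keep; names follow the attributes listed in the question.
FIELDS = ("id", "text", "created_at", "retweet_count", "source",
          "in_reply_to_screen_name", "favorited", "truncated")


def extract_fields(tweet):
    """Flatten one tweet dictionary into a small, predictable record."""
    record = {name: tweet.get(name) for name in FIELDS}
    # The author is a nested dictionary; missing keys become None instead of raising.
    user = tweet.get("user") or {}
    record["user_id"] = user.get("id")
    record["screen_name"] = user.get("screen_name")
    return record


# Hand-written sample, not real API output.
sample = {"id": 1, "text": "hello", "retweet_count": 2,
          "user": {"id": 42, "screen_name": "github"}}
print(extract_fields(sample))
```

With the session from the script, `rows = [extract_fields(t) for t in r.json()]` would give a list of records that could be written to a database instead of printed.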

Solution 2

You're getting a 401 response, which means "Unauthorized" (see HTTP status codes).
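
The three status codes that appear in the question's tracebacks (401, 403 and 404) can be looked up with the standard library. A small sketch using `http.client.responses`; the `describe` helper is only for illustration:

```python
from http.client import responses


def describe(status_code):
    """Return the code with its standard reason phrase."""
    return "{0} {1}".format(status_code, responses.get(status_code, "Unknown"))


for code in (401, 403, 404):
    print(describe(code))
```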

Your code looks good. Using api.user_timeline(screen_name="some_screen_name") works for me in the old example I have lying around.

I'm guessing you either need to authorize the app, or there is some problem with your OAuth setup.
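
One detail in the question's snippet may be worth ruling out as well: `user` there is the object returned by `get_user`, while the call that works for me passes a plain string as `screen_name`. A minimal sketch of normalizing either form before the call; `as_screen_name` is a hypothetical helper, not part of Tweepy, and whether this contributes to the 401 is an assumption:

```python
from types import SimpleNamespace


def as_screen_name(user_or_name):
    """Accept either a handle string or an object with a screen_name attribute."""
    if isinstance(user_or_name, str):
        return user_or_name.lstrip("@")
    return user_or_name.screen_name


# Stand-in for the object a client library might return (illustrative only).
fake_user = SimpleNamespace(screen_name="barackobama", name="Barack Obama")
print(as_screen_name(fake_user))
print(as_screen_name("@github"))
```

The timeline call from the question would then read `api.user_timeline(screen_name=as_screen_name(user), include_rts=True, count=100)`.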

Maybe you found this already, but here is the short code example that I started from: https://github.com/nloadholtes/tweepy/blob/nloadholtes-examples/examples/oauth.py
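
For the eventual goal in the question, walking back through a timeline up to the 3,200-tweet cap, a common pattern is to request a page and then ask for tweets older than the lowest ID seen so far. A rough sketch of that loop, kept independent of any client library: `fetch_page` is a hypothetical callable that you would back with `api.user_timeline` or `session.get`, and the `max_id` parameter and page size of 200 are assumptions about the endpoint that should be checked against the current API documentation:

```python
def collect_timeline(fetch_page, limit=3200, page_size=200):
    """Gather up to `limit` tweets, newest first, by paging backwards on ID.

    fetch_page(count, max_id) must return a list of dicts with an "id" key,
    newest first, containing only tweets with id <= max_id (or any if None).
    """
    tweets = []
    max_id = None
    while len(tweets) < limit:
        count = min(page_size, limit - len(tweets))
        page = fetch_page(count=count, max_id=max_id)
        if not page:
            break  # Nothing older is available.
        tweets.extend(page)
        # The next request starts just below the oldest tweet received so far.
        max_id = min(t["id"] for t in page) - 1
    return tweets


# Fake in-memory timeline standing in for the network call (illustrative only).
_FAKE = [{"id": i} for i in range(500, 0, -1)]


def fake_fetch(count, max_id=None):
    older = [t for t in _FAKE if max_id is None or t["id"] <= max_id]
    return older[:count]


print(len(collect_timeline(fake_fetch, limit=450)))
```

Subtracting one from the lowest ID avoids receiving the boundary tweet twice, and the empty-page check stops the loop early for accounts with fewer tweets than the limit.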

Author by

chowden

Updated on July 16, 2022

Comments

  • chowden
    chowden almost 2 years

    I am attempting to retrieve data from Twitter using Tweepy for a username typed at the command line. I want to extract quite a bit of data about the status and user, so I have come up with the following:

    Note that I am importing all the required modules OK and have the OAuth setup and keys (just not included here); the filename is correct, it has just been changed:

    # define user to get tweets for. accepts input from user
    user = tweepy.api.get_user(input("Please enter the twitter username: "))
    
    # Display basic details for twitter user name
    print (" ")
    print ("Basic information for", user.name)
    print ("Screen Name:", user.screen_name)
    print ("Name: ", user.name)
    print ("Twitter Unique ID: ", user.id)
    print ("Account created at: ", user.created_at)
    
    timeline = api.user_timeline(screen_name=user, include_rts=True, count=100)
    for tweet in timeline:
        print ("ID:", tweet.id)
        print ("User ID:", tweet.user.id)
        print ("Text:", tweet.text)
        print ("Created:", tweet.created_at)
        print ("Geo:", tweet.geo)
        print ("Contributors:", tweet.contributors)
        print ("Coordinates:", tweet.coordinates)
        print ("Favorited:", tweet.favorited)
        print ("In reply to screen name:", tweet.in_reply_to_screen_name)
        print ("In reply to status ID:", tweet.in_reply_to_status_id)
        print ("In reply to status ID str:", tweet.in_reply_to_status_id_str)
        print ("In reply to user ID:", tweet.in_reply_to_user_id)
        print ("In reply to user ID str:", tweet.in_reply_to_user_id_str)
        print ("Place:", tweet.place)
        print ("Retweeted:", tweet.retweeted)
        print ("Retweet count:", tweet.retweet_count)
        print ("Source:", tweet.source)
        print ("Truncated:", tweet.truncated)
    

    Eventually I would like this to iterate through all of a user's tweets (up to the 3,200 limit). First things first, though: so far I have two problems. I get the following error message regarding retweets:

    Please enter the twitter username: barackobama
    Traceback (most recent call last):
      File " usertimeline.py", line 64, in <module>
        timeline = api.user_timeline(screen_name=user, count=100, page=1)
      File "C:\Python32\lib\site-packages\tweepy-1.4-py3.2.egg\tweepy\binder.py", line 153, in _call
        raise TweepError(error_msg)
    tweepy.error.TweepError: Twitter error response: status code = 401
    Traceback (most recent call last):
      File "usertimeline.py", line 42, in <module>
        user = tweepy.api.get_user(input("Please enter the twitter username: "))
      File "C:\Python32\lib\site-packages\tweepy-1.4-py3.2.egg\tweepy\binder.py", line 153, in _call
        raise TweepError(error_msg)
    tweepy.error.TweepError: Twitter error response: status code = 404
    

    Passing the username as a variable seems to be a problem also:

    Traceback (most recent call last):
      File " usertimleline.py", line 64, in <module>
        timeline = api.user_timeline(screen_name=user, count=100, page=1)
      File "C:\Python32\lib\site-packages\tweepy-1.4-py3.2.egg\tweepy\binder.py", line 153, in _call
        raise TweepError(error_msg)
    tweepy.error.TweepError: Twitter error response: status code = 401
    

    I've isolated these two errors; they occur independently of each other.

    Forgive my ignorance; I am not too hot with the Twitter APIs but am learning pretty rapidly. The Tweepy documentation really does suck, and although I've done loads of reading around on the net, I just can't seem to get this fixed. If I can get this sorted, I'll post some documentation.

    I know how to transfer the data into a MySQL db once it is extracted (the script will do that rather than print to screen) and how to manipulate it afterwards; it is just getting the data out that I am having problems with. Does anyone have any ideas, or is there another method I should be considering?

    Any help really appreciated. Cheers

    EDIT:

    Following on from @Eric Olson's suggestion this morning, I did the following:

    1) Created a brand new set of OAuth credentials to test.
    2) Copied the code across to a new script as follows:

    OAuth

    consumer_key = "(removed)"
    consumer_secret = "(removed)"
    access_key="88394805-(removed)"
    access_secret="(removed)"
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)
    
    # confirm account being used for OAuth
    print ("API NAME IS: ", api.me().name)
    api.update_status("Using Tweepy from the command line")
    

    The first time I run the script, it works fine: it updates my status and returns the API name as follows:

    >>> 
    API NAME IS:  Chris Howden
    

    Then from that point on I get this:

    Traceback (most recent call last):
      File "C:/Users/Chris/Dropbox/Uni_2012-3/6CC995 - Independent Studies/Scripts/get Api name and update status.py", line 19, in <module>
        api.update_status("Using Tweepy frm the command line")
      File "C:\Python32\lib\site-packages\tweepy-1.4-py3.2.egg\tweepy\binder.py", line 153, in _call
        raise TweepError(error_msg)
    tweepy.error.TweepError: Twitter error response: status code = 403
    

    The only reason I can see for this is that it is rejecting the generated access token. I shouldn't need to renew the access token, should I?

  • chowden
    chowden about 11 years
    Cheers. I've done a little bit more investigation this morning and I've added some additional findings onto the original post...
  • chowden
    chowden about 11 years
    Ace, thanks for your efforts. In the meantime I have managed to find another way to get everything I wanted out using the tweepy module, but this helps me understand JSON a little better.
  • chowden
    chowden about 11 years
    I will post what I've found when it's all complete.