How to retrieve text from a response made with Requests in Python

Solution 1

You are passing the arguments to re.search in the wrong order. The first argument is the pattern and the second is the string to search:

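import re
import requests

# Pattern first, then the string to search
s = r'<a class=gb1 href=[^>]+>'
r = requests.get('https://www.google.com/?q=python')
result = re.search(s, r.text)

# re.search returns None when nothing matches
if result:
    print(result.group(0))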

If you simply need the list of all matches you can use: re.findall(s, r.text)
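
As a minimal sketch of the re.findall suggestion, the snippet below uses a hard-coded string in place of r.text so that it runs without a network call. The sample HTML, the example.com URLs, and the variable names are assumptions made for illustration, not the real page content.

```python
import re

# Hypothetical sample standing in for r.text, so the sketch runs offline.
sample_html = (
    '<div><a class=gb1 href="https://example.com/images">Images</a> '
    '<a class=gb1 href="https://example.com/maps">Maps</a></div>'
)

s = r'<a class=gb1 href=[^>]+>'

# Without a capturing group, findall returns each full match as a string.
all_tags = re.findall(s, sample_html)

# With one capturing group, findall returns only the captured part.
all_links = re.findall(r'<a class=gb1 href="([^"]+)"', sample_html)

print(all_tags)
print(all_links)
```

If nothing matches, re.findall returns an empty list rather than None, so no extra check is needed before iterating over the result.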

Solution 2

You can access the decoded body of the response as a string through its text attribute (the raw bytes are available as content).

import re
import requests
res = requests.get("http://google.com")
re.search('pattern', res.text)  # pattern first, then the text to search

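Then use a regular expression on that text. Note that re.search scans the whole string for the first match, while re.match only succeeds if the pattern matches at the very beginning.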
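
To illustrate the difference, here is a small sketch that pulls a link out of a response body with a capturing group. The body string, the href="..." attribute layout, and the example.com URL are assumptions; in practice body would be res.text and the pattern would need to follow the real markup.

```python
import re

# Hypothetical response body; in practice this would be res.text.
body = (
    '<p>Status</p> Currently: '
    '<a href="https://example.com/file.zip">file</a> (latest)'
)

# Assumed markup: the link sits in an href attribute right after the marker.
link_pattern = r'Currently: <a href="([^"]+)"'

# re.match only succeeds if the pattern matches at position 0.
at_start = re.match(link_pattern, body)

# re.search scans the whole string for the first match.
anywhere = re.search(link_pattern, body)

# Guard against None before calling group().
link = anywhere.group(1) if anywhere else None

print(at_start)
print(link)
```

Here re.match finds nothing because the body does not start with "Currently:", while re.search locates the marker in the middle of the text.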

Author: user1919

Updated on July 14, 2022

Comments

  • user1919 almost 2 years

    I am trying to search inside the response of a request (I am using Requests with Python). I get the response and check its type, which is unicode.

    I want to retrieve a specific link that is located between two other strings. I have tried different approaches found online, such as:

    • result = re.search('Currently: <a ', s)
    • url_file = response.find('Currently: <a ', beg=0, end=len(response))

    I also tried to convert the unicode string to a normal string:

    • s = unicodedata.normalize(response, title).encode('ascii','ignore')

    I get an error.

    EDITED

    For example:

    This works:

        s = 'asdf=5;iwantthis123jasd'
        result = re.search('asdf=5;(.*)123jasd', s)
        print(result.group(1))
    

    This doesn't work (returns error):

        s = 'Currently: <a '
        result = re.search(r.text, s)
        print(result.group(1))
    
  • andreas over 7 years
    While this code snippet may solve the question, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion.