Loop through links using Selenium Webdriver (Python)


I'm not sure if this will fix the problem, but in general it is better to use WebDriverWait rather than implicitly_wait, since WebDriverWait.until keeps calling the supplied function (e.g. driver.find_element_by_xpath) until the returned value is not False-ish or the timeout (e.g. 5000 seconds) is reached, at which point it raises a selenium.common.exceptions.TimeoutException.
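To make the polling behaviour concrete, here is a simplified pure-Python model of that loop. It is only a sketch of the idea, not Selenium's actual implementation: the names poll_until and PollTimeout are hypothetical, and the interval and the set of ignored exceptions are assumptions chosen for illustration.

```python
import time


class PollTimeout(Exception):
    """Hypothetical stand-in for Selenium's TimeoutException."""


def poll_until(condition, timeout, interval=0.5, ignored=()):
    """Call condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = condition()
            if value:
                # A truthy result ends the wait and is handed back to the caller.
                return value
        except ignored:
            # Listed exceptions (e.g. "element not found yet") mean "not ready".
            pass
        if time.monotonic() >= deadline:
            raise PollTimeout("condition not met within %s s" % timeout)
        time.sleep(interval)
```

The key point is that the wait ends as soon as the condition succeeds, so a generous timeout costs nothing when the page loads quickly.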

import selenium.webdriver.support.ui as UI

def test_text_saver(self):
    driver = self.driver
    wait = UI.WebDriverWait(driver, 5000)
    with open("textsave.txt","w") as textsave:
        list_of_links = driver.find_elements_by_xpath("//*[@id=\"learn-sub\"]/div[4]/div/div/div/div[1]/div[2]/div/div/ul/li/a")
        for link in list_of_links:  # 2
            link.click()   # 1
            text = wait.until(
                lambda driver: driver.find_element_by_xpath("//*[@id=\"learn-sub\"]/div[4]/div/div/div/div[1]/div[1]/div[1]/h1").text)
            textsave.write(text+"\n\n")
            driver.back()
  1. After you click the link, you should wait until the linked URL is loaded, so the call to wait.until is placed directly after link.click().
  2. Instead of using

    while x <= link_count:
        ...
        x += 1
    

    it is better to use

    for link in list_of_links: 
    

    For one thing, it improves readability. Moreover, you don't need to care about the number x; all you really care about is looping over the links, which is what the for-loop does.
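One further variation, offered as a sketch rather than a confirmed fix for this page: on some sites, element handles located before navigating away can go stale once driver.back() reloads the list, so a later link.click() may fail. Reading each link's href up front and visiting the URLs directly sidesteps that. The function name save_headings and its parameters are hypothetical; the find_*_by_xpath calls match the Selenium version used in this post (newer releases use find_elements(By.XPATH, ...) instead).

```python
def save_headings(driver, wait, link_xpath, heading_xpath, out_path):
    """Visit every linked page and write its heading text to out_path."""
    # Read the URLs first: plain strings stay valid after navigation,
    # whereas element handles found on the list page may not.
    urls = [a.get_attribute("href")
            for a in driver.find_elements_by_xpath(link_xpath)]
    with open(out_path, "w") as out:
        for url in urls:
            driver.get(url)
            # Poll until the heading can be found and has non-empty text.
            text = wait.until(
                lambda d: d.find_element_by_xpath(heading_xpath).text)
            out.write(text + "\n\n")
    return urls
```

With a real browser, wait would be a UI.WebDriverWait(driver, timeout_in_seconds) instance, and the two XPath arguments would be the link and heading expressions from the question. This only works if the links carry real href attributes rather than JavaScript handlers.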

Author by

TRoch

Updated on June 09, 2022

Comments

  • TRoch
    TRoch almost 2 years

    Afternoon all. Currently trying to use Selenium webdriver to loop through a list of links on a page. Specifically, it's clicking a link, grabbing a line of text off said page to write to a file, going back, and clicking the next link in a list. The following is what I have:

        def test_text_saver(self):
            driver = self.driver
            textsave = open("textsave.txt","w")
            list_of_links = driver.find_elements_by_xpath("//*[@id=\"learn-sub\"]/div[4]/div/div/div/div[1]/div[2]/div/div/ul/li")
            """Initializing Link Count:"""
            link_count = len(list_of_links)
            while x <= link_count:
                print x
                driver.find_element_by_xpath("//*[@id=\"learn-sub\"]/div[4]/div/div/div/div[1]/div[2]/div/div/ul/li["+str(x)+"]/a").click()
                text = driver.find_element_by_xpath("//*[@id=\"learn-sub\"]/div[4]/div/div/div/div[1]/div[1]/div[1]/h1").text
                textsave.write(text+"\n\n")
                driver.implicitly_wait(5000)
                driver.back()
                x += 1
            textsave.close()
    

    When run, it goes to the initial page, and...goes back to the main page, rather than the subpage that it's supposed to. Printing x, I can see it's incrementing three times rather than one. It also crashes after that. I've checked all my xpaths and such, and also confirmed that it's getting the correct count for the number of links in the list.

    Any input's hugely appreciated--this is really just to flex my python/automation, since I'm just getting into both. Thanks in advance!!

  • TRoch
    TRoch about 10 years
    Aaaah gotcha, understood on the WebDriverWait. Tried it, but the behavior was still the same as before. Logically, it should be iterating properly in the li item. Admittedly though, I could very easily be missing something. I'd paste my shell output, but I'm afraid I've a character limit. What's odd is that it looks like it's completely ignoring the wait, and clicking...I'm not rightly sure what div it's clicking, but it's not the one it's supposed to be.
  • unutbu
    unutbu about 10 years
    Is the URL publicly accessible? If so, post it and I'll try it out.
  • TRoch
    TRoch about 10 years
    Unfortunately not--new behavior though! Once I figure out code formatting in comments, anyway...using the for loop above inside a while loop (to increment x for the list item), it's not even incrementing...yet it's printing the heading on the initial page out to the file 30 times, so it's obviously going through the loop the 30 times. (Oh the perks of being new to both Python and Selenium...)
  • TRoch
    TRoch about 10 years
    So: while x <= link_count: for element in list_of_links: link=driver.find_element... link.click text=wat.until... textsave.write driver.back() x+=1 textsave.close() Is spitting out 30 of the same line (I'm...really failing at the code formatting here in comments, I'm sorry :/)
  • unutbu
    unutbu about 10 years
    I made a change to the post. It shows how you can use a for-loop instead of that while-loop. You shouldn't use both, and in this case the for-loop is considered more "Pythonic". Also be sure you are calling link.click() (with parentheses) and not just link.click. The parentheses tell Python to call the function. Without the parentheses the expression evaluates to the function object itself.
  • TRoch
    TRoch about 10 years
    Sigh...so go figure, apparently something's wrong with my local installation of Python/the Selenium packages, because I sent my file over to a colleague and my solution ran without issue. Yay for bashing my head against a wall for two days over something that's been working! Thanks for the help though, you've at least pointed me to some better practices :)
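
On the link.click versus link.click() point raised in the comments above, a tiny illustration may help. FakeLink is a hypothetical stand-in for a Selenium element, used here only so the snippet runs without a browser.

```python
class FakeLink:
    """Hypothetical stand-in for a Selenium WebElement."""

    def __init__(self):
        self.clicks = 0

    def click(self):
        self.clicks += 1


link = FakeLink()

link.click                 # just evaluates to the bound method; nothing runs
assert link.clicks == 0

link.click()               # the parentheses actually call the method
assert link.clicks == 1
```

Because the bare attribute access is a valid expression, Python raises no error for it, which is why a missing pair of parentheses can silently turn a click into a no-op.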