NameError: name 'html' is not defined with BeautifulSoup4


The error is correct: you haven't defined html anywhere. The line "from html.parser import HTMLParser" binds only the name HTMLParser, not html. The documentation you link to shows that you should pass "html.parser" as a string, so you don't need to import HTMLParser at all.
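A minimal sketch of the corrected call, assuming the third-party beautifulsoup4 package is installed. The inline markup is a made-up stand-in for the contents of mech.html so the snippet runs anywhere; with the real file you would pass the opened file object as in the question.

```python
from bs4 import BeautifulSoup  # third-party package: beautifulsoup4

# Hypothetical stand-in for the contents of mech.html.
markup = "<html><head><title>Mech</title></head><body><p>Hello</p></body></html>"

# The parser is selected by name, so it is passed as a string.
# Nothing needs to be imported from html.parser for this.
soup = BeautifulSoup(markup, "html.parser")

print(soup.title.string)  # Mech
print(soup.p.get_text())  # Hello
```

The only change from the question's code is the quotes around html.parser, plus dropping the unused HTMLParser import.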

Author by

Tarun Uday


Updated on June 04, 2022

Comments

  • Tarun Uday
    Tarun Uday almost 2 years

    My Python 3.4.4 code is:

    import urllib.request
    from bs4 import BeautifulSoup
    from html.parser import HTMLParser
    
    urls = 'file:///C:/Users/tarunuday/Documents/scrapdata/mech.html'
    htmlfile = urllib.request.urlopen(urls)
    soup = BeautifulSoup(htmlfile,html.parser)
    

    I'm getting this error:

    Traceback (most recent call last):
      File "C:\Python34\saved\scrapping\scrapping2.py", line 7, in <module>
        soup = BeautifulSoup(htmlfile,html.parser)
    NameError: name 'html' is not defined
    

    Now I understand that HTMLParser is py2.x and html.parser is py3.x, but how can I get this to work? The bs4 site says: "If you get the ImportError “No module named html.parser”, your problem is that you’re running the Python 3 version of the code under Python 2." But I'm running 3.x, and I'm getting a NameError, not an ImportError.

  • Tarun Uday
    Tarun Uday about 8 years
    holy... Wow man, FML. My bad. I spent a couple of hours on that. Thanks.
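
On the NameError versus ImportError point raised in the question: the import itself succeeds on Python 3, so there is no ImportError, but it binds only HTMLParser and leaves html undefined. A small standard-library sketch of that behaviour follows; the namespace dict is an illustrative device, not part of the original script.

```python
# Run the question's import in a fresh namespace to see which names it binds.
namespace = {}
exec("from html.parser import HTMLParser", namespace)

print("HTMLParser" in namespace)  # True
print("html" in namespace)        # False

# Evaluating the bare expression html.parser in that namespace fails the
# same way the question's BeautifulSoup(htmlfile, html.parser) call did.
try:
    eval("html.parser", namespace)
except NameError as exc:
    message = str(exc)

print(message)  # name 'html' is not defined
```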