How to fix this AttributeError?

Solution 1

The code you posted has a couple of issues, mainly that it never initializes the HTMLParser base class, so attributes such as rawdata are never created.

Try running this amended version of your script:

from HTMLParser import HTMLParser

class MLStripper(HTMLParser):
    def __init__(self):
        # initialize the base class
        HTMLParser.__init__(self)

    def read(self, data):
        # clear the current output before re-use
        self._lines = []
        # re-set the parser's state before re-use
        self.reset()
        self.feed(data)
        return ''.join(self._lines)

    def handle_data(self, d):
        self._lines.append(d)

def strip_tags(html):
    s = MLStripper()
    return s.read(html)

html = """Python's <code>easy_install</code>
 makes installing new packages extremely convenient.
 However, as far as I can tell, it doesn't implement
 the other common features of a dependency manager -
 listing and removing installed packages."""

print strip_tags(html)
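
For readers on Python 3, the same approach can be sketched as follows. This is a port under assumptions, not part of the original answer: the module is html.parser rather than HTMLParser, print is a function, and the close() call is an addition that flushes any buffered text.

```python
from html.parser import HTMLParser


class MLStripper(HTMLParser):
    def __init__(self):
        # initialize the base class (this also calls reset())
        super().__init__()
        self._lines = []

    def read(self, data):
        # clear the current output before re-use
        self._lines = []
        # re-set the parser's state before re-use
        self.reset()
        self.feed(data)
        # assumption: flush any text the parser is still buffering
        self.close()
        return ''.join(self._lines)

    def handle_data(self, d):
        self._lines.append(d)


def strip_tags(html):
    return MLStripper().read(html)


print(strip_tags("Python's <code>easy_install</code> makes installing "
                 "new packages extremely convenient."))
```

Because read() resets the parser each time, one MLStripper instance can be re-used for several documents.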

Solution 2

This error also appears if you override the reset method in your HTMLParser subclass.

In my case I had added a method named reset for some other functionality. Python does not warn you about this (there was no indication I was overriding anything), but it breaks the HTMLParser class: HTMLParser.__init__ calls self.reset() to create rawdata, so an override that does not call the base reset leaves that attribute undefined.
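
A minimal sketch of this pitfall, assuming Python 3 (html.parser). The names BrokenStripper, FixedStripper and try_feed are hypothetical, used only for illustration:

```python
from html.parser import HTMLParser


class BrokenStripper(HTMLParser):
    def reset(self):
        # Shadows HTMLParser.reset, so rawdata is never created.
        self.fed = []

    def handle_data(self, d):
        self.fed.append(d)


class FixedStripper(HTMLParser):
    def reset(self):
        # Let the base class set up its own state first.
        super().reset()
        self.fed = []

    def handle_data(self, d):
        self.fed.append(d)


def try_feed(parser_cls, html):
    """Return the stripped text, or a description of the AttributeError."""
    parser = parser_cls()
    try:
        parser.feed(html)
        parser.close()
    except AttributeError as exc:
        return "AttributeError: %s" % exc
    return "".join(parser.fed)


print(try_feed(BrokenStripper, "<b>hi</b> there"))
print(try_feed(FixedStripper, "<b>hi</b> there"))
```

If you really need a custom reset, calling the base implementation first keeps the parser usable. Otherwise, picking a different method name avoids the clash entirely.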

Solution 3

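You need to call __init__ of the superclass HTMLParser.

You can also do it with super(). Note that this works on Python 3 only: on Python 2, HTMLParser is an old-style class, so super() raises a TypeError there and you should call HTMLParser.__init__(self) directly, as in Solution 1.

class MLStripper(HTMLParser):
    def __init__(self):
        # run the base-class initializer, which calls reset() and creates rawdata
        super(MLStripper, self).__init__()
        self.fed = []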
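
To see why the base-class call matters, here is a small reproduction, assuming Python 3 (html.parser). NoBaseInit and WithBaseInit are hypothetical names, and the strip_tags signature with a parser_cls argument is an illustration, not the asker's original function:

```python
from html.parser import HTMLParser


class NoBaseInit(HTMLParser):
    def __init__(self):
        # Base __init__ is never run, so rawdata does not exist.
        self.fed = []

    def handle_data(self, d):
        self.fed.append(d)


class WithBaseInit(HTMLParser):
    def __init__(self):
        # Base __init__ calls reset(), which creates rawdata.
        super().__init__()
        self.fed = []

    def handle_data(self, d):
        self.fed.append(d)


def strip_tags(html, parser_cls=WithBaseInit):
    s = parser_cls()
    s.feed(html)
    s.close()
    return "".join(s.fed)


try:
    strip_tags("<p>pitch</p>", NoBaseInit)
except AttributeError as exc:
    print("failed as expected:", exc)

print(strip_tags("<p>pitch</p>"))
```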
Author by

Zeynel

Updated on July 01, 2022

Comments

  • Zeynel
    Zeynel almost 2 years

    I installed a stripe package yesterday and now my app is not running. I am trying to understand where the problem is. Is it something to do with PyShell, HTMLParser, or something else? I am also posting with the GAE tag, hoping that the trace from the logs may give a clue about the problem:

    MLStripper instance has no attribute 'rawdata'
    Traceback (most recent call last):
      File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 703, in __call__
        handler.post(*groups)
      File "/base/data/home/apps/ting-1/1.354723388329082800/ting.py", line 2070, in post
        pitch_no_tags = strip_tags(pitch_original)
      File "/base/data/home/apps/ting-1/1.354723388329082800/ting.py", line 128, in strip_tags
        s.feed(html)
      File "/base/python_runtime/python_dist/lib/python2.5/HTMLParser.py", line 107, in feed
        self.rawdata = self.rawdata + data
    AttributeError: MLStripper instance has no attribute 'rawdata'
    

    This is MLStripper:

    from HTMLParser import HTMLParser
    
    class MLStripper(HTMLParser):
        def __init__(self):
            set()
            self.fed = []
        def handle_data(self, d):
            self.fed.append(d)
        def get_data(self):
            return ''.join(self.fed)
    
    def strip_tags(html):
        s = MLStripper()
        s.feed(html)
        return s.get_data()
    

    MLStripper was working fine until yesterday.

    And these are my other questions:

    https://stackoverflow.com/questions/8152141/how-to-fix-this-attributeerror-with-htmlparser-py

    https://stackoverflow.com/questions/8153300/how-to-fix-a-corrupted-pyshell-py

  • Zeynel
    Zeynel over 12 years
    Many thanks for the answer. It works great. Do you mind adding some comments to the code to help me understand it? Also, why do you think this worked for months and then suddenly stopped working? Thanks again.
  • ekhumoro
    ekhumoro over 12 years
    @Zeynel. I've added a few comments to show the main changes I made to your original script. As to why your previous script stopped working: that's very hard to say without knowing what's recently changed on your system. But in any case, I think the amended script is more generally correct.
  • noɥʇʎԀʎzɐɹƆ
    noɥʇʎԀʎzɐɹƆ almost 9 years
    Any future stackoverflow people who may visit here: this is especially good advice if you override the __init__ method.
  • ToonZ
    ToonZ over 8 years
    When defining __init__ in a derived class, remember to explicitly call __init__ of the base class. Otherwise the base class's __init__ is overridden by the derived class's and never runs, which causes the undefined-attribute problem in the original post.