TypeError: expected string or buffer with re.match and matchObj.group()


You are reading the entire file into line, so line is a list rather than a string or buffer, and that is why re.match raises this error. If you want to process the file line by line, put both the strip and the match inside the for loop. The sample below should help you get started.

import re

with open(filename) as f:
    for line in f:
        line = line.strip()
        matchObj = re.match(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs", line)
        # Rest of your code here. Use try/except to catch AttributeError
        # (no match, so matchObj is None) and IndexError (no such group).
        try:
            player = matchObj.group(1)
            atBat = matchObj.group(2)
            hit = matchObj.group(3)
            # Other stuff
        except AttributeError as ae:
            print str(ae), "skipping line:", line
        except IndexError as ie:
            print str(ie), "skipping line:", line

Also, unless you show sample data from your text file, I can't say whether your regex is accurate.
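As a quick check of both points (the list-versus-string error and whether the regex fits the data), here is a minimal sketch. It is written for Python 3, where the TypeError message is worded differently ("expected string or bytes-like object"). The helper name try_match and the two sample lines are assumptions made for illustration, not part of the original question.

```python
import re

PATTERN = r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs"


def try_match(value):
    """Return the captured groups, None if nothing matched,
    or the TypeError message if value is not a string."""
    try:
        m = re.match(PATTERN, value)
    except TypeError as exc:
        return "TypeError: %s" % exc
    return m.groups() if m else None


# Hypothetical sample data: roughly what reading a whole file into a list gives.
lines = ["Mr X batted 10 times with 6 hits and 50 runs\n", "junk data\n"]

print(try_match(lines))     # the whole list: re.match rejects it with a TypeError
print(try_match(lines[0]))  # a single string: the four captured groups
print(try_match(lines[1]))  # a string that does not fit the pattern: None
```

Running the pattern against one or two real lines from the file like this is usually enough to tell whether the regex itself needs adjusting.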

Update: Here is a working version based on your comments and update. Feel free to modify as you need:

import re

# Hard-code the file name for simplicity
filename = "in.txt"
# Maps each player name to [atBats, hits, runs]
playerStats = {}

with open(filename) as f:
    for line in f:
        line = line.strip()
        # The match must happen here, after you have read the line
        matchObj = re.match(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs", line)

        try:
            player = matchObj.group(1)
            atBat = matchObj.group(2)
            hit = matchObj.group(3)
            runs = matchObj.group(4)

            # Bad indent - fixed
            # Store the data against player; there is no players variable.
            # Initialize all stats to 0
            if player not in playerStats:
                playerStats[player] = [0, 0, 0]

            playerStats[player][0] += int(atBat)
            playerStats[player][1] += int(hit)
            playerStats[player][2] += int(runs)

        except AttributeError as ae:
            print str(ae), "skipping line:", line
        except IndexError as ie:
            print str(ie), "skipping line:", line

# Calculate the batting average (hits / at-bats) for each player
avgs = {}
for player in playerStats:
    hitsOfPlayer = playerStats[player][1]
    atBatsOfPlayer = playerStats[player][0]
    avgs[player] = round(float(hitsOfPlayer)/float(atBatsOfPlayer), 3)
    print "%s: %.3f" % (player, avgs[player])

Contents of in.txt:

Mr X batted 10 times with 6 hits and 50 runs
Mr Y batted 12 times with 1 hits and 150 runs
Mr Z batted 10 times with 4 hits and 250 runs
Mr X batted 3 times with 0 hits and 0 runs
junk data
junk data 2

Output:

'NoneType' object has no attribute 'group' skipping line: junk data
'NoneType' object has no attribute 'group' skipping line: junk data 2
Mr Y: 0.083
Mr X: 0.462
Mr Z: 0.400
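If you would rather not rely on catching AttributeError, one alternative is to test the match object for None explicitly. The following is only a sketch of that idea in Python 3, not the code from the answer above; the function name batting_averages and the in-memory sample list are assumptions for illustration.

```python
import re
from collections import defaultdict

PATTERN = re.compile(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs")


def batting_averages(lines):
    """Accumulate [atBats, hits, runs] per player and return hits / atBats."""
    totals = defaultdict(lambda: [0, 0, 0])
    for line in lines:
        match = PATTERN.match(line.strip())
        if match is None:
            # The line does not fit the pattern, so skip it.
            continue
        player = match.group(1)
        # groups()[1:] holds the three numeric captures in order.
        for index, value in enumerate(match.groups()[1:]):
            totals[player][index] += int(value)
    # Leave out players with zero at-bats to avoid dividing by zero.
    return {
        player: round(stats[1] / stats[0], 3)
        for player, stats in totals.items()
        if stats[0] > 0
    }


sample = [
    "Mr X batted 10 times with 6 hits and 50 runs",
    "Mr Y batted 12 times with 1 hits and 150 runs",
    "Mr Z batted 10 times with 4 hits and 250 runs",
    "Mr X batted 3 times with 0 hits and 0 runs",
    "junk data",
]

for player, average in sorted(batting_averages(sample).items()):
    print("%s: %.3f" % (player, average))
```

Because the function accepts any iterable of strings, an open file object can be passed in place of the sample list.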
Author by

Virge Assault

Updated on June 04, 2022

Comments

  • Virge Assault
    Virge Assault almost 2 years

    I keep getting the error:

    Traceback (most recent call last):
      File "ba.py", line 13, in <module>
        matchObj = re.match(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs", line)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 137, in match
        return _compile(pattern, flags).match(string)
    TypeError: expected string or buffer 
    
    1. line should use line.strip() to read each line in the file

    2. re.match uses a regex to look for matches to 3 groups (players, hits, atBats) in the string

    3. matchObj.group() should read each group and put the stats where they go in the playerStats dictionary

    How do I get re.match to attribute a type to the matchObj so I can pull with group() and add to playerStats?

    import re, sys, os
    
    if len(sys.argv) < 2:
        sys.exit("Usage: %s filename" % sys.argv[0])
    
    filename = sys.argv[1]
    
    if not os.path.exists(filename):
        sys.exit("Error: File '%s' not found" % sys.argv[1])
    
    playerStats = {'players': (0, 0, 0)} 
    
    matchObj = re.match(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs", line)
    
    with open(filename) as f:
        for line in f:
            line = line.strip()
    
        if player in playerStats:
            playerStats[players][0] += atBat
            playerStats[players][1] += hit
    
        if player not in players:
            player = matchObj.group(1)
            playerStats[players][0] = atBat
            playerStats[players][1] = hit
            avgs = 0
    
        else: 
            playerStats[players] = player
            playerStats[players][0] = atBat
            playerStats[players][1] = hit
            playerStats[players][2] = 0
    
        try:
            player = matchObj.group(1)
            atBat = matchObj.group(2)
            hit = matchObj.group(3)
    
            except AttributeError as ae:
                print str(ae), "\skipping line:", line
            except IndexError as ie:
                print str(ie), "\skipping line:", line
    
    #calculates averages
        for players in playerStats:
            avgs[player] = round(float(hits[player])/float(atBats[player]), 3) 
    
        print "%s: %.3f" % (player, avgs[player])
    
  • Virge Assault
    Virge Assault over 9 years
    I edited the above based on your suggestions. It seems to be running through fine, but I'm getting a syntax error on the line "for players in playerStats:". I'm not super familiar with try, and I'm not sure if it's causing issues. Do you see anything wrong?
  • user3885927
    user3885927 over 9 years
    You need an except block after the try block. Can you update your question with the new code?
  • Virge Assault
    Virge Assault over 9 years
    Updated. Gives me a syntax error (caret under the t in the first "except").
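A caret under the first "except" is consistent with the updated question as posted, where the except clauses are indented one level deeper than the try they belong to. Python requires try and its except clauses to start in the same column. A minimal Python 3 sketch of the correct layout follows; parse_player is a hypothetical helper used only for illustration.

```python
import re


def parse_player(line):
    """Return the player name captured from line, or None if it does not match."""
    match_obj = re.match(r"^(\w+ \w+) batted", line)
    try:
        # Raises AttributeError when match_obj is None (no match).
        return match_obj.group(1)
    except AttributeError:
        # "except" starts in the same column as its "try".
        return None


print(parse_player("Mr X batted 10 times with 6 hits and 50 runs"))
print(parse_player("junk data"))
```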