Extract data from lines of a text file
Solution 1
The following reads everything into a dictionary keyed by player name. The value for each player is itself a dictionary that acts as a record, with named fields holding the items converted to types suitable for further processing.
info = {}
with open('scoring_info.txt') as input_file:
    for line in input_file:
        player, stats, outcome, date = (
            item.strip() for item in line.split('-', 3))
        stats = dict(zip(('kills', 'deaths', 'assists'),
                         map(int, stats.split('/'))))
        date = tuple(map(int, date.split('-')))
        info[player] = dict(zip(('stats', 'outcome', 'date'),
                                (stats, outcome, date)))
print('info:')
for player, record in info.items():
    print(' player %r:' % player)
    for field, value in record.items():
        print(' %s: %s' % (field, value))
# sample usage
player = 'Fizz'
print('\n%s had %s kills in the game' % (player, info[player]['stats']['kills']))
Output:
info:
player 'Shyvana':
date: (2012, 11, 22)
outcome: Loss
stats: {'assists': 5, 'kills': 12, 'deaths': 4}
player 'Miss Fortune':
date: (2012, 11, 22)
outcome: Win
stats: {'assists': 3, 'kills': 12, 'deaths': 4}
player 'Fizz':
date: (2012, 11, 22)
outcome: Win
stats: {'assists': 5, 'kills': 12, 'deaths': 4}
Fizz had 12 kills in the game
Alternatively, rather than holding most of the data in dictionaries, which can make nested-field access a little awkward (info[player]['stats']['kills']), you could use a slightly more advanced "generic" class to hold them, which lets you write info2[player].stats.kills instead.
To illustrate, here's almost the same thing using a class I've named Struct because it's somewhat like the C language's struct data type:
class Struct(object):
    """ Generic container object """
    def __init__(self, **kwds):  # keyword args define attribute names and values
        self.__dict__.update(**kwds)
info2 = {}
with open('scoring_info.txt') as input_file:
    for line in input_file:
        player, stats, outcome, date = (
            item.strip() for item in line.split('-', 3))
        stats = dict(zip(('kills', 'deaths', 'assists'),
                         map(int, stats.split('/'))))
        victory = (outcome.lower() == 'win')  # change to boolean T/F
        date = dict(zip(('year','month','day'), map(int, date.split('-'))))
        info2[player] = Struct(champ_name=player, stats=Struct(**stats),
                               victory=victory, date=Struct(**date))
print('info2:')
for rec in info2.values():
    print(' player %r:' % rec.champ_name)
    print(' stats: kills=%s, deaths=%s, assists=%s' % (
        rec.stats.kills, rec.stats.deaths, rec.stats.assists))
    print(' victorious: %s' % rec.victory)
    print(' date: %d-%02d-%02d' % (rec.date.year, rec.date.month, rec.date.day))
# sample usage
player = 'Fizz'
print('\n%s had %s kills in the game' % (player, info2[player].stats.kills))
Output:
info2:
player 'Shyvana':
stats: kills=12, deaths=4, assists=5
victorious: False
date: 2012-11-22
player 'Miss Fortune':
stats: kills=12, deaths=4, assists=3
victorious: True
date: 2012-11-22
player 'Fizz':
stats: kills=12, deaths=4, assists=5
victorious: True
date: 2012-11-22
Fizz had 12 kills in the game
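One caveat with both versions: assigning to info[player] (or info2[player]) keeps only the last line seen for each player, so earlier games by the same champion are overwritten. If the file can hold several games per player, a minimal sketch is to collect a list of records per player instead. This assumes the same line format as above; parse_games and games_by_player are illustrative names, not part of the original answer.

```python
from collections import defaultdict

def parse_games(lines):
    """Group parsed records by player; each player maps to a list of games."""
    games_by_player = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        player, stats, outcome, date = (
            item.strip() for item in line.split('-', 3))
        kills, deaths, assists = map(int, stats.split('/'))
        games_by_player[player].append({
            'stats': {'kills': kills, 'deaths': deaths, 'assists': assists},
            'outcome': outcome,
            'date': tuple(map(int, date.split('-'))),
        })
    return games_by_player

# Made-up sample lines in the question's format (two games for one player).
sample = [
    "Fizz - 12/4/5 - Win - 2012-11-22\n",
    "Fizz - 3/7/10 - Loss - 2012-11-23\n",
]
games = parse_games(sample)
print(len(games['Fizz']))  # number of games recorded for Fizz
```

With a real file you would pass the open file object in place of the sample list, since iterating over a file also yields one line at a time.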
Solution 2
There are two ways to read the data from your example text file.
First method
You can use Python's csv module and specify - as your delimiter.
See http://www.doughellmann.com/PyMOTW/csv/
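As a rough sketch of the csv approach (my own illustration, not taken from the linked page): csv only accepts a single-character delimiter, so splitting on - also splits the date into three pieces and leaves the surrounding spaces in each field. The example below strips the fields and glues the date back together. io.StringIO stands in for the real file, and the sample line is copied from the question.

```python
import csv
import io

# io.StringIO stands in for open('scoring_info.txt', newline='')
sample = io.StringIO("Fizz - 12/4/5 - Win - 2012-11-22\n")

rows = []
for row in csv.reader(sample, delimiter='-'):
    fields = [field.strip() for field in row]
    # The first three fields are name, score and outcome.
    name, score, outcome = fields[:3]
    # The date was split on '-' too, so rejoin its pieces.
    date = '-'.join(fields[3:])
    rows.append((name, score, outcome, date))

print(rows)
```

Because of the extra rejoining step, plain str.split is arguably simpler for this particular format.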
Second method
Alternatively, if you don't want to use the csv module, you can simply call the split method on each line after reading it from the file as a string.
with open('myTextFile.txt', "r") as f:  # the file is closed automatically
    lines = f.readlines()
for line in lines:
    words = line.split("-")  # words is a list (of strings from a line), delimited by "-".
So in your example above, champname will actually be the first item in the words list, which is words[0] (call .strip() on it to remove the trailing space left by the split).
Solution 3
You want to use split(' - ') to get the parts, then perhaps again to get the numbers:
for line in yourfile.readlines():
    data = line.split(' - ')
    nums = [int(x) for x in data[1].split('/')]
Should get you all the stuff you need in data[] and nums[]. Alternatively, you can use the re module and write a regular expression for it. This doesn't seem complex enough for that, though.
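For comparison, here is a hedged sketch of what the regular-expression route might look like. The pattern and the name LINE_RE are my own assumptions about the line format shown in the question, not something the original answer specified.

```python
import re

# One named group per field; \s* absorbs the spaces around each dash.
LINE_RE = re.compile(
    r'^(?P<name>.+?)\s*-\s*'
    r'(?P<kills>\d+)/(?P<deaths>\d+)/(?P<assists>\d+)\s*-\s*'
    r'(?P<outcome>\w+)\s*-\s*'
    r'(?P<date>\d{4}-\d{2}-\d{2})\s*$'
)

match = LINE_RE.match("Miss Fortune - 12/4/3 - Win - 2012-11-22\n")
if match:
    record = match.groupdict()
    nums = [int(record[key]) for key in ('kills', 'deaths', 'assists')]
    print(record['name'], nums, record['outcome'], record['date'])
```

The main benefit over split is validation: a malformed line simply fails to match instead of raising an error halfway through unpacking.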
Solution 4
# Iterates over the lines in the file.
for line in open('data_file.txt'):
    # Splits the line in four elements separated by dashes. Each element is then
    # unpacked to the correct variable name. strip() removes the trailing
    # newline so that it does not end up in 'timestamp'.
    champname, score, winloss, timestamp = line.strip().split(' - ')
    # Since 'score' holds the string with the three values joined,
    # we need to split them again, this time using a slash as separator.
    # This results in a list of strings, so we apply the 'int' function
    # to each of them to convert to integer. This list of integers is
    # then unpacked into the kills, deaths and assists variables
    kills, deaths, assists = map(int, score.split('/'))
    # Now you are free to use the variables read for whatever you want. Since
    # kills, deaths and assists are integers, you can add, multiply and
    # average them easily.
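Since the question mentions averages for a particular name, here is a minimal sketch that builds on the same splitting steps. It assumes Python 3 (where / is true division) and fields separated by ' - '; average_stats is a hypothetical helper name and the sample lines are made up for illustration.

```python
def average_stats(lines, champion):
    """Return (avg kills, avg deaths, avg assists) for one champion, or None."""
    totals = [0, 0, 0]
    games = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        champname, score, winloss, timestamp = line.split(' - ')
        if champname != champion:
            continue  # a different champion's game
        for i, value in enumerate(map(int, score.split('/'))):
            totals[i] += value
        games += 1
    if games == 0:
        return None  # no games recorded for this champion
    return tuple(total / games for total in totals)

sample = [
    "Fizz - 12/4/5 - Win - 2012-11-22",
    "Shyvana - 12/4/5 - Loss - 2012-11-22",
    "Fizz - 4/6/9 - Loss - 2012-11-23",
]
print(average_stats(sample, 'Fizz'))
```

Returning None when there are no matching games avoids a division by zero and lets the caller decide what to show the user.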
Solution 5
First, you break the line into data fragments
>>> name, score, result, date = "Fizz - 12/4/5 - Win - 2012-11-22".split(' - ')
>>> name
'Fizz'
>>> score
'12/4/5'
>>> result
'Win'
>>> date
'2012-11-22'
Second, parse your score
>>> k,d,a = map(int, score.split('/'))
>>> k,d,a
(12, 4, 5)
And finally, convert the date string into a date object
>>> from datetime import datetime
>>> datetime.strptime(date, '%Y-%m-%d').date()
datetime.date(2012, 11, 22)
Now you have all your parts parsed and normalized to data types.
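Put together, the three steps might be wrapped in one function like this. It is only a sketch: parse_line is an illustrative name, and it assumes every line follows the 'name - k/d/a - result - YYYY-MM-DD' layout exactly. Note that %m is the month directive in strptime, whereas %M means minutes.

```python
from datetime import datetime

def parse_line(line):
    """Turn one 'name - k/d/a - result - YYYY-MM-DD' line into typed values."""
    # strip() drops the trailing newline before splitting into four fields.
    name, score, result, date = line.strip().split(' - ')
    kills, deaths, assists = map(int, score.split('/'))
    played_on = datetime.strptime(date, '%Y-%m-%d').date()
    return name, (kills, deaths, assists), result, played_on

print(parse_line("Fizz - 12/4/5 - Win - 2012-11-22\n"))
```

A line that does not have exactly four ' - ' separated fields raises ValueError during unpacking, which you may want to catch when reading a hand-edited file.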
Kassandra
Updated on July 19, 2022
I need to extract data from lines of a text file. The data is name and scoring information formatted like this:
Shyvana - 12/4/5 - Loss - 2012-11-22
Fizz - 12/4/5 - Win - 2012-11-22
Miss Fortune - 12/4/3 - Win - 2012-11-22
This file is generated by another part of my little Python program, where I ask the user for the name, look up the name they enter in a list of names to ensure it's valid, and then ask for kills, deaths, assists, and whether they won or lost. Then I ask for confirmation, write that data to the file on a new line, and append the date at the end like that. The code that prepares that data:
data = "%s - %s/%s/%s - %s - %s\n" % ( champname, kills, deaths, assists, winloss, timestamp)
Basically I want to read that data back in another part of the program and display it to the user and do calculations with it like averages over time for a particular name.
I'm new to Python and not very experienced with programming in general, so most of the string splitting and formatting examples I find are too cryptic for me to adapt to what I need here. Could anyone help? I could format the written data differently so token finding would be simpler, but I want it to be simple directly in the file.