'list' object has no attribute 'map' in pyspark


map(filterOut2, data) works:

>>> data = [[1,2,3,5],[1,2,5,2],[3,5,2,8],[6,3,1,2],[5,3,2,5],[4,1,2,5]]
>>> def filterOut2(line):
...     return [x for x in line if x != 2]
...
>>> list(map(filterOut2, data))
[[1, 3, 5], [1, 5], [3, 5, 8], [6, 3, 1], [5, 3, 5], [4, 1, 5]]

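If you instead get the error:

map() takes exactly 1 argument (2 given)

then it looks like you redefined map somewhere in your session. Try calling the built-in explicitly: after import __builtin__, use __builtin__.map(filterOut2, data). In Python 3 the module is named builtins.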
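To make the shadowing concrete, here is a minimal sketch in Python 3 (so it uses builtins rather than __builtin__). The one-argument map below is a hypothetical stand-in for whatever redefined the name in your session; the shortened data list is only for illustration.

```python
import builtins

def filterOut2(line):
    # Drop every element equal to 2 from one inner list.
    return [x for x in line if x != 2]

data = [[1, 2, 3, 5], [1, 2, 5, 2]]

# Hypothetical accident: a one-argument function named map hides the built-in.
def map(x):
    return x

try:
    map(filterOut2, data)
except TypeError as exc:
    # The call fails because this map accepts only one argument.
    error_message = str(exc)

# The original built-in is still reachable through the builtins module.
result = list(builtins.map(filterOut2, data))

# Deleting the module-level name removes the shadowing definition.
del map
```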

Or, use a list comprehension:

>>> [filterOut2(line) for line in data]
[[1, 3, 5], [1, 5], [3, 5, 8], [6, 3, 1], [5, 3, 5], [4, 1, 5]]
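As for the error in the title: a plain Python list has no .map method, whereas a PySpark RDD does. If you want Spark to do the mapping, one option is to turn the list into an RDD first. The following is only a sketch: it assumes you already have a SparkContext, here passed in as sc, and filter_with_rdd is a hypothetical helper name, not something from the question.

```python
def filterOut2(line):
    # Drop every element equal to 2 from one inner list.
    return [x for x in line if x != 2]

def filter_with_rdd(sc, data):
    # sc is assumed to be a pyspark.SparkContext (hypothetical parameter name).
    rdd = sc.parallelize(data)            # turn the Python list into an RDD
    return rdd.map(filterOut2).collect()  # apply per element, return a list

# Usage, assuming PySpark is installed and configured:
# from pyspark import SparkContext
# sc = SparkContext.getOrCreate()
# filter_with_rdd(sc, [[1, 2, 3, 5], [1, 2, 5, 2]])
```

For data this small, the list comprehension above is simpler; the RDD route only pays off when the data is large enough to be worth distributing.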
Author by

zahra

Updated on July 09, 2022

Comments

  • zahra
    zahra almost 2 years

    I'm new to PySpark. I wrote this code in PySpark:

    def filterOut2(line):
        return [x for x in line if x != 2]
    filtered_lists = data.map(filterOut2)
    

    but I get this error:

    'list' object has no attribute 'map'
    

    How do I perform a map operation specifically on my data in PySpark in a way that allows me to filter my data to only those values for which my condition evaluates to true?