Unable to use StructField with PySpark


The following code works for me. See the documentation for StructField and StringType. Note that Spark 1.3 is pretty old.

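from pyspark.sql.types import *  # wildcard import brings StructField and StringType into scope

schemaString = "name age"

# One nullable string field per whitespace-separated name
fields = [StructField(field_name, StringType(), True)
          for field_name in schemaString.split()]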
Author by

simplycoding

Updated on July 05, 2022

Comments

  • simplycoding
    simplycoding over 1 year

    I'm running the PySpark shell and am unable to create a DataFrame. I've run these imports:

    import pyspark
    from pyspark.sql.types import StructField
    from pyspark.sql.types import StructType
    

    all without any errors returned.

    Then I tried running these commands:

    schemaString = "name age"
    fields = [StructField(field_name, StringType(), True) for field_name in schemaString.split()]
    

    And I keep getting the error: name 'StructField' is not defined

    Basically, I'm following the Spark documentation here: https://spark.apache.org/docs/1.3.0/sql-programming-guide.html

    Weirdly, if I remove the for loop and do this instead, it works:

    fields = [StructField('field1', StringType(), True)]