SparkSQL Dataframe Error: value show is not a member of org.apache.spark.sql.DataFrameReader


Try this:

val df = sqlContext.read.option("header", "true").option("inferSchema", "true").csv(csvfile)

sqlContext.read gives you a DataFrameReader, and option and format both set options on it and hand you back the same DataFrameReader. You need to call one of the methods that actually produces a DataFrame (like csv or load) before you can do things like show with it.

See https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameReader for more info.
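The distinction is just the builder pattern: configuration methods return the builder, and only a terminal method produces the result. A minimal sketch in plain Scala (with made-up names FakeReader/FakeDataFrame, not Spark's actual implementation) shows why calling show on the reader can't compile:

```scala
// Stand-in for DataFrame: the finished product, which has show().
case class FakeDataFrame(options: Map[String, String], path: String) {
  def show(): String = s"DataFrame($path, $options)"
}

// Stand-in for DataFrameReader: a builder with no show() method.
class FakeReader(val options: Map[String, String] = Map.empty) {
  // Like DataFrameReader.option: records a setting, returns a reader.
  def option(key: String, value: String): FakeReader =
    new FakeReader(options + (key -> value))

  // Like DataFrameReader.csv/load: terminates the chain with a DataFrame.
  def csv(path: String): FakeDataFrame = FakeDataFrame(options, path)
}

object ReaderDemo {
  def main(args: Array[String]): Unit = {
    val reader = new FakeReader().option("header", "true")
    // reader.show()  // would not compile: FakeReader has no show() member
    val df = reader.csv("people.csv") // csv() produces the "DataFrame"
    println(df.show())
  }
}
```

The error "value show is not a member of org.apache.spark.sql.DataFrameReader" is the compiler telling you the chain stopped at the builder stage.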


Updated on June 04, 2022

Comments

  • stud333
    stud333 almost 2 years

    I'm new to Spark/Scala/Dataframes. I'm using Scala 2.10.5, Spark 1.6.0. I am trying to load in a csv file and then create a dataframe from it. Using the scala shell I execute the following in the order below. Once I execute line 6, I get an error that says:

    error: value show is not a member of org.apache.spark.sql.DataFrameReader

    Could someone advise what I might be missing? I understand I don't need to import sparkcontext if I'm using the REPL (shell) so sc will be automatically created, but any ideas what I'm doing wrong?

    1. import org.apache.spark.sql.SQLContext

    2. import sqlContext.implicits._

    3. val sqlContext = new SQLContext(sc)

    4. val csvfile = "path_to_filename in hdfs...."

    5. val df = sqlContext.read.format(csvfile).option("header", "true").option("inferSchema", "true")

    6. df.show()

  • stud333
    stud333 about 6 years
    Thank you!! I'm going to try that now
  • stud333
    stud333 about 6 years
    ..I just tried your suggestion and it's giving me the error: value csv is not a member of org.apache.spark.sql.DataFrameReader. Do you think it's b/c I'm not importing something I should?
  • Joe K
    Joe K about 6 years
    Ah, looks like that method was added relatively recently. Sorry, don't have a 1.6.0 installation handy to test... maybe try .format("csv").load(csvFile) instead?
  • stud333
    stud333 about 6 years
    omg thank you thank you! I even tried spark-shell --packages com.databricks:spark-csv_2.10:1.5.0 but that didn't work, but your suggestion did. I'm working on a project that's due tomorrow so I hope you wouldn't mind me reaching out again with more questions (if any)!
  • stud333
    stud333 about 6 years
    Actually, to clarify, I also needed to add spark-shell --packages com.databricks:spark-csv_2.10:1.5.0 and then use your suggestion in order for it to work. Thanks again!
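Pulling the thread together, the working combination on Spark 1.6 appears to be the spark-csv package plus format/load (sketch only; package coordinates and the com.databricks.spark.csv format string are taken from the comments above and the spark-csv package, and are not needed on Spark 2.0+, where csv() is built in):

```scala
// Start the shell with the external spark-csv package, since Spark 1.6's
// DataFrameReader has no built-in csv() method:
//   spark-shell --packages com.databricks:spark-csv_2.10:1.5.0

val csvfile = "path_to_filename in hdfs...."
val df = sqlContext.read
  .format("com.databricks.spark.csv")  // or .format("csv"), per the comment above
  .option("header", "true")
  .option("inferSchema", "true")
  .load(csvfile)                       // load() returns a DataFrame
df.show()
```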