AttributeError: 'NoneType' object has no attribute 'sc'


This should work (except that in your code you have a missing ')' at the end of the sc creation, which I imagine is a typo). You can try creating sc as follows:

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("app1").setMaster("local")
sc = SparkContext(conf=conf)
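
Since the code you posted goes on to call sqlContext.createDataFrame, sqlContext itself also has to be built from that working sc (in Spark 1.6 it is not always predefined). A minimal sketch, assuming Spark 1.6's SQLContext:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = SparkConf().setAppName("app1").setMaster("local")
sc = SparkContext(conf=conf)

# SQLContext wraps the SparkContext; createDataFrame fails if sc is broken
sqlContext = SQLContext(sc)

data2 = [("a", 5), ("b", 5), ("a", 5)]
df = sqlContext.createDataFrame(data2)
df.show()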

BTW, sc.stop() implies that you already have a Spark context, which is true if you used the pyspark shell but not if you use spark-submit. It is better to use SparkContext.getOrCreate, which works in both cases.
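
For example, a minimal sketch using getOrCreate (the conf argument is optional and is only applied when a new context actually has to be created):

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("app1").setMaster("local")

# Reuses an existing context (e.g. the pyspark shell's) or creates a new one
sc = SparkContext.getOrCreate(conf)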


Comments

  • Admin almost 2 years

    Excuse me. Today I wanted to run a program that creates a DataFrame with sqlContext in PySpark, and the result is an AttributeError: "AttributeError: 'NoneType' object has no attribute 'sc'". My computer runs Win7, my Spark version is 1.6.0, and the API is Python 3. I have googled several times and read the Spark Python API docs, but could not solve the problem, so I am asking for your help.

    My code is:

       #python version is 3.5
       sc.stop()
       import pandas as pd
       import numpy as np
       sc=SparkContext("local","app1"
       data2=[("a",5),("b",5),("a",5)]
       df=sqlContext.createDataFrame(data2)
    

    And the result is:


        AttributeError                            Traceback (most recent call last)
        <ipython-input-19-030b8faadb2c> in <module>()
        5 data2=[("a",5),("b",5),("a",5)]
        6 print(data2)
        ----> 7 df=sqlContext.createDataFrame(data2)
    
        D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\sql\context.py in  createDataFrame(self, data, schema, samplingRatio)
        426             rdd, schema = self._createFromRDD(data, schema, samplingRatio)
        427         else:
        --> 428             rdd, schema = self._createFromLocal(data, schema)
        429         jrdd = self._jvm.SerDeUtil.toJavaArray(rdd._to_java_object_rdd())
        430         jdf = self._ssql_ctx.applySchemaToPythonRDD(jrdd.rdd(), schema.json())
    
        D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\sql\context.py in _createFromLocal(self, data, schema)
       358         # convert python objects to sql data
       359         data = [schema.toInternal(row) for row in data]
       --> 360         return self._sc.parallelize(data), schema
       361 
       362     @since(1.3)
    
        D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in parallelize(self, c, numSlices)
       410         [[], [0], [], [2], [4]]
       411         """
       --> 412         numSlices = int(numSlices) if numSlices is not None else self.defaultParallelism
       413         if isinstance(c, xrange):
       414             size = len(c)
    
       D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in     defaultParallelism(self)
      346         reduce tasks)
      347         """
      --> 348         return self._jsc.sc().defaultParallelism()
      349 
      350     @property
    
     AttributeError: 'NoneType' object has no attribute 'sc'
    

    I am so confused: I did in fact create "sc", so why does it show the error "'NoneType' object has no attribute 'sc'"?