"TypeError: an integer is required (got type bytes)" when importing pyspark on Python 3.8
You must downgrade your Python version from 3.8 to 3.7, because PySpark 2.4.x doesn't support this version of Python (Python 3.8 support arrives with Spark 3.0.0).
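Rather than letting the import die deep inside cloudpickle, you can make the requirement explicit up front. A minimal sketch (the helper name and the version cutoffs are mine, based on this thread: PySpark 2.4.x works up to Python 3.7, and Python 3.8 support arrives with Spark 3.0.0):

```python
import sys

def pyspark_supports_this_python(spark_version="2.4.5"):
    """Return True if this interpreter can import the given PySpark.

    Spark 2.4.x bundles a cloudpickle that breaks on Python 3.8+;
    Python 3.8 support arrived with Spark 3.0.0.
    """
    major_minor = tuple(int(part) for part in spark_version.split(".")[:2])
    if major_minor < (3, 0):
        return sys.version_info < (3, 8)
    return True

# On a Python 3.8 interpreter with Spark 2.4.5 this returns False,
# signalling: downgrade Python or upgrade Spark.
```

Failing fast with a clear message beats the opaque `TypeError` from the traceback above.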
Author: Dmitry Deryabin
Updated on June 04, 2022

Comments
- Dmitry Deryabin over 1 year:
- Created a conda environment:
conda create -y -n py38 python=3.8
conda activate py38
- Installed Spark from Pip:
pip install pyspark # Successfully installed py4j-0.10.7 pyspark-2.4.5
- Tried to import pyspark:
python -c "import pyspark"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/context.py", line 31, in <module>
    from pyspark import accumulators
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/accumulators.py", line 97, in <module>
    from pyspark.serializers import read_int, PickleSerializer
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/serializers.py", line 72, in <module>
    from pyspark import cloudpickle
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 145, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)
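For context (my reading of the traceback, not stated in the thread): Python 3.8 added a `posonlyargcount` parameter to the `types.CodeType` constructor, so the bundled cloudpickle's positional call now passes a `bytes` object where an `int` is expected. Python 3.8 also added `code.replace()`, which sidesteps the problem entirely; a minimal sketch:

```python
import types

def copy_code(code, **changes):
    # code.replace() (new in Python 3.8) fills in every field you
    # don't override, so you never list the CodeType constructor
    # arguments positionally -- the pattern that broke when 3.8
    # inserted the new posonlyargcount slot.
    return code.replace(**changes)

f = lambda x: x + 1
clone = copy_code(f.__code__)       # identical copy of the code object
g = types.FunctionType(clone, {})   # rebuild a callable around it
```

Newer cloudpickle releases use exactly this kind of version-aware construction, which is why the standalone package works where the bundled copy fails.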
It seems that PySpark comes with a pre-packaged version of the cloudpickle package that had some issues on Python 3.8, which are now resolved (at least as of version 1.3.0) in the pip release; however, the version bundled with PySpark is still broken. Did anyone face the same issue / have any luck resolving this?
- 10465355 almost 4 years: Spark doesn't support Python 3.8 until 3.0.0
- blackbishop almost 4 years: Does this answer your question? How to fix 'TypeError: an integer is required (got type bytes)' error when trying to run pyspark after installing spark 2.4.4
- Dmitry Deryabin almost 4 years: @blackbishop No, unfortunately it doesn't, since downgrading is not an option for my use case.
- blackbishop almost 4 years: @cricket_007 See this issue
- OneCricketeer almost 4 years: @Dmitry Why not? Looks like you're creating your own env, so you're going to have to if you want to use pyspark
- Dmitry Deryabin almost 4 years: @cricket_007 Our library needs to support Python 3.8 and it also relies on PySpark. Python 3.7 is already supported :) So it seems clear that for now 3.8 is not an option (at least until Spark 3.0 is released)
- Megan over 3 years: Is there a way of downgrading to 3.7 for AWS EMR clusters? Docs just seem to be pointing to Python 3.4 -> 3.6 transitions...
- Paul Watson over 3 years: Can confirm this was my issue too: Python 3.8 failing, Python 3.7.8 working.
- brajesh jaishwal over 3 years: Doesn't work with Python 3.8; needs 3.7 to be installed
- Kubra Altun over 2 years: It did not work with Python 3.9, but worked with Python 3.7.
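Pulling the thread together: until Spark 3.0, one speculative workaround (my sketch, not confirmed anywhere above) is to shadow PySpark's bundled cloudpickle with the fixed standalone release (>= 1.3.0, per the question) before `pyspark` itself is imported:

```python
import sys

try:
    import cloudpickle  # standalone release: pip install "cloudpickle>=1.3.0"
    # Seed the dotted module name so that pyspark's internal
    # "from pyspark import cloudpickle" resolves to the fixed copy
    # instead of executing the broken bundled cloudpickle.py.
    sys.modules["pyspark.cloudpickle"] = cloudpickle
    import pyspark
except ImportError:
    # cloudpickle or pyspark is not installed in this environment;
    # the shim only applies when both are present.
    pass
```

Whether the rest of PySpark 2.4.x then behaves correctly on Python 3.8 is untested here; the supported routes remain downgrading to Python 3.7 or upgrading to Spark 3.0+.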