Databricks dbutils.fs.ls shows files. However, reading them throws an IO error


To access files on a DBFS mount using local file APIs, you need to prepend /dbfs to the path. In your case it should be:

with open('/dbfs/mnt/test_file.json', 'r') as f:
  for line in f:
    print(line)

See more details in the docs at https://docs.databricks.com/data/databricks-file-system.html#local-file-apis, especially regarding limitations. On Databricks Runtime 5.5 and below, local file APIs are limited to files under 2 GB. On Runtime 6.0 and above that limit no longer applies, as the FUSE mount has been optimized to handle larger file sizes.
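To illustrate the mapping between the two path styles, here is a minimal helper sketch (`to_local_path` is a hypothetical name for illustration, not part of the Databricks API): Spark and dbutils address DBFS as `dbfs:/mnt/...`, while local file APIs such as `open()` see the same files through the FUSE mount at `/dbfs/mnt/...`.

```python
def to_local_path(dbfs_path: str) -> str:
    """Translate a DBFS URI or absolute DBFS path to its local FUSE path."""
    if dbfs_path.startswith("dbfs:/"):
        # dbfs:/mnt/x -> /dbfs/mnt/x
        return "/dbfs/" + dbfs_path[len("dbfs:/"):]
    if dbfs_path.startswith("/"):
        # /mnt/x -> /dbfs/mnt/x
        return "/dbfs" + dbfs_path
    raise ValueError(f"Not an absolute DBFS path: {dbfs_path!r}")

print(to_local_path("dbfs:/mnt/test_file.json"))  # /dbfs/mnt/test_file.json
print(to_local_path("/mnt/test_file.json"))       # /dbfs/mnt/test_file.json
```

This is exactly why `dbutils.fs.ls("/mnt/test_file.json")` succeeds while `open("mnt/test_file.json")` fails: the former resolves against DBFS, the latter against the driver's local filesystem.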

Author: Sains

Updated on June 14, 2022

Comments

  • Sains almost 2 years

    I am running a Spark cluster, and when I execute the command below in a Databricks notebook, it gives me the output:

    dbutils.fs.ls("/mnt/test_file.json")
    
    [FileInfo(path=u'dbfs:/mnt/test_file.json', name=u'test_file.json', size=1083L)]
    

    However, when I try to read that file, I get the following error:

    with open("mnt/test_file.json", 'r') as f:
      for line in f:
        print line
    
    IOError: [Errno 2] No such file or directory: 'mnt/test_file.json'
    

    What might be the issue here? Any help/support is greatly appreciated.