How to move files with the same extension in the Databricks File System (DBFS)?
Solution 1
Wildcards are currently not supported with dbutils. You can move the whole directory:
dbutils.fs.mv("dbfs:/tmp/test", "dbfs:/tmp/test2", recurse=True)
or just a single file:
dbutils.fs.mv("dbfs:/tmp/test/test.csv", "dbfs:/tmp/test2/test2.csv")
As mentioned in the comments below, you can use Python to implement this wildcard logic yourself. See also some code examples in my following answer.
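One way to sketch that wildcard logic is to list the parent directory and filter the names with Python's standard `fnmatch` module. The helper name `expand_wildcard` and its `fs` parameter are hypothetical (you would pass `dbutils.fs` in a notebook), and the sketch assumes that directory entries returned by `dbutils.fs.ls` have paths ending in `/`, so they are skipped. It only handles a wildcard in the last path component.

```python
import fnmatch
import posixpath


def expand_wildcard(fs, pattern_path):
    """Return the paths of files whose name matches the last component of pattern_path.

    fs is assumed to behave like dbutils.fs: ls(path) returns entries
    that each expose a .path attribute.
    """
    # Split "dbfs:/dir/test*.csv" into ("dbfs:/dir", "test*.csv")
    src_dir, name_pattern = posixpath.split(pattern_path)
    matches = []
    for info in fs.ls(src_dir):
        # Assumption: directory paths end with "/", so their basename is empty
        name = posixpath.basename(info.path)
        if name and fnmatch.fnmatchcase(name, name_pattern):
            matches.append(info.path)
    return matches


# In a Databricks notebook, where dbutils is predefined:
# for path in expand_wildcard(dbutils.fs, "dbfs:/tmp/test/*.csv"):
#     dbutils.fs.mv(path, "dbfs:/tmp/test2/" + posixpath.basename(path))
```

Passing the file system object in as a parameter keeps the matching logic testable outside a notebook. Each match is then moved with an ordinary single-file `dbutils.fs.mv` call.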
Solution 2
Since wildcards are not allowed, the workaround is the traditional one: list the files, filter them, and then move or copy each match.
import os

def db_list_files(file_path, file_prefix):
    # Keep only the entries whose file name starts with the given prefix
    file_list = [file.path for file in dbutils.fs.ls(file_path) if os.path.basename(file.path).startswith(file_prefix)]
    return file_list

files = db_list_files('dbfs:/your/src_dir', 'foobar')

for file in files:
    # Use dbutils.fs.mv here instead of cp to move rather than copy
    dbutils.fs.cp(file, os.path.join('dbfs:/your/tgt_dir', os.path.basename(file)))
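Since the question is about files sharing an extension rather than a prefix, the same list-then-act pattern can be adapted roughly as follows. This is a minimal sketch: `move_by_extension` and its `fs` argument are hypothetical names (pass `dbutils.fs` in a notebook), the extension comparison is case-insensitive by choice, and it assumes directory paths from `dbutils.fs.ls` end in `/`.

```python
import os


def move_by_extension(fs, src_dir, tgt_dir, extension):
    """Move files in src_dir whose extension equals `extension` (e.g. ".csv").

    fs is assumed to behave like dbutils.fs: ls(path) returns entries with
    a .path attribute, and mv(src, dst) moves a single file.
    Returns a list of (source, target) pairs that were moved.
    """
    moved = []
    for info in fs.ls(src_dir):
        name = os.path.basename(info.path)
        if not name:
            # Assumption: a path ending in "/" is a directory; skip it
            continue
        if os.path.splitext(name)[1].lower() == extension.lower():
            target = tgt_dir.rstrip("/") + "/" + name
            fs.mv(info.path, target)
            moved.append((info.path, target))
    return moved


# In a Databricks notebook, where dbutils is predefined:
# move_by_extension(dbutils.fs, "dbfs:/your/src_dir", "dbfs:/your/tgt_dir", ".csv")
```

Returning the moved pairs makes it easy to log or verify what happened, since the move itself produces no listing.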
Author: Krishna Reddy
Updated on June 04, 2022
Comments
Krishna Reddy, almost 2 years ago:
I am getting a FileNotFoundException when I try to move a file using * in DBFS. Both the source and destination directories are in DBFS. The source file, "test_sample.csv", is in a DBFS directory, and I am running the command below from a notebook cell:
dbutils.fs.mv("dbfs:/usr/krishna/sample/test*.csv", "dbfs:/user/abc/Test/Test.csv")
Error:
java.io.FileNotFoundException: dbfs:/usr/krishna/sample/test*.csv
I appreciate any help. Thanks.