Spark: getting the current date as a string


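After applying date_format, you can convert the result into a typed Dataset with as[String] and call first to get the value into a String variable. (In spark-shell, the spark.implicits._ import that as[String] needs is already in scope.) For example: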

scala> val dateFormat = "yyyyMMdd_HHmm"
dateFormat: String = yyyyMMdd_HHmm

scala> val dateValue = spark.range(1).select(date_format(current_timestamp,dateFormat)).as[(String)].first
dateValue: String = 20190320_2341

scala> val fileName = "TestFile_" + dateValue+ ".csv"
fileName: String = TestFile_20190320_2341.csv

scala>

Without creating a DataFrame, you can evaluate the column's underlying expression directly with expr.eval() and get the result:

scala> val ts = (current_timestamp()).expr.eval().toString.toLong
ts: Long = 1553106289387000

scala> new java.sql.Timestamp(ts/1000)
res74: java.sql.Timestamp = 2019-03-20 23:54:49.387

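The above gives the result as a plain Scala value. It is in microseconds since the epoch, which is why ts/1000 is used to get the milliseconds that java.sql.Timestamp expects. From there you can format it with the usual date/time libraries.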
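As a cross-check of the epoch value printed above, here is a minimal Python sketch that converts it with the standard library only. It assumes the number is microseconds since the Unix epoch; the names ts_micros and micros_to_utc are made up for illustration. The result is in UTC, whereas the Scala output above is rendered in the session's local time zone, so the hour and minute differ.

```python
from datetime import datetime, timedelta, timezone

# Value copied from the Scala session above (assumed: microseconds since the Unix epoch).
ts_micros = 1553106289387000

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_utc(micros: int) -> datetime:
    """Convert epoch microseconds to a timezone-aware UTC datetime (hypothetical helper)."""
    # timedelta works on integers here, so no floating-point precision is lost.
    return EPOCH + timedelta(microseconds=micros)


utc_value = micros_to_utc(ts_micros)

# strftime equivalent of the Java pattern yyyyMMdd_HHmm
formatted = utc_value.strftime("%Y%m%d_%H%M")
print(formatted)
```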

EDIT1:

Here is one more way, with the formatting done in plain Scala.

scala> val dateFormat = "yyyyMMdd_HHmm"
dateFormat: String = yyyyMMdd_HHmm

scala> val ts = (current_timestamp()).expr.eval().toString.toLong
ts: Long = 1553108012089000

scala> val dateValue = new java.sql.Timestamp(ts/1000).toLocalDateTime.format(java.time.format.DateTimeFormatter.ofPattern(dateFormat))
dateValue: String = 20190321_0023

scala> val fileName = "TestFile_" + dateValue+ ".csv"
fileName: String = TestFile_20190321_0023.csv

scala>

Using PySpark:

>>> dateFormat = "%Y%m%d_%H%M"
>>> import datetime
>>> ts=spark.sql(""" select current_timestamp() as ctime """).collect()[0]["ctime"]
>>> ts.strftime(dateFormat)
'20190328_1332'
>>> "TestFile_" +ts.strftime(dateFormat) + ".csv"
'TestFile_20190328_1332.csv'
>>>
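If the timestamp does not have to come from Spark itself, the file name can be built with the Python standard library alone. The following is only a sketch under that assumption: it reads the driver's local clock, and build_file_name with its prefix, suffix, and now parameters is a hypothetical helper, not part of any Spark API.

```python
from datetime import datetime
from typing import Optional

# strftime equivalent of the Java pattern yyyyMMdd_HHmm used in the Scala examples
DATE_FORMAT = "%Y%m%d_%H%M"


def build_file_name(prefix: str = "TestFile_",
                    suffix: str = ".csv",
                    now: Optional[datetime] = None) -> str:
    """Return prefix + formatted timestamp + suffix (hypothetical helper)."""
    # Fall back to the local clock of the machine running this code.
    moment = now if now is not None else datetime.now()
    return f"{prefix}{moment.strftime(DATE_FORMAT)}{suffix}"


file_name = build_file_name()
print(file_name)
```

Passing an explicit now value, for example build_file_name(now=datetime(2019, 3, 28, 13, 32)), makes the result reproducible, which is convenient when the naming logic needs to be tested.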

Author: Sauron

Updated on May 29, 2022

Comments

  • Sauron
    Sauron almost 2 years

    I have the code below to get the date in the proper format to then be able to append to a filename string.

    %scala
    
    // Getting the date for the file name
    import org.apache.spark.sql.functions.{current_timestamp, date_format}
    val dateFormat = "yyyyMMdd_HHmm"
    val dateValue = spark.range(1).select(date_format(current_timestamp,dateFormat)).collectAsList().get(0).get(0)
    
    val fileName = "TestFile_" + dateValue+ ".csv"
    

    I feel this is fairly heavy-handed. Is there an easier way to simply get the current date as a string?

  • Sauron
    Sauron about 5 years
    Thank you. In Spark, is there a way to simply call current_timestamp() as a string and assign it to a variable?
  • stack0114106
    stack0114106 about 5 years
    yes.. it can be done.. but it returns Epoch.. let me try and update
  • Sauron
    Sauron about 5 years
    That would be great to see, just for reference
  • stack0114106
    stack0114106 about 5 years
    @Sauron.. added one more way by pushing the formatting to scala.. check my EDIT1
  • Sauron
    Sauron about 5 years
    Any way to convert this into Python?
  • Sauron
    Sauron about 5 years
    Thank you for the update. I ended up doing it in straight Python as: import datetime; dateValue = str(datetime.datetime.now().strftime("%Y%m%d_%H%M"))