Add leading zeros to Columns in a Spark Data Frame
Solution 1
You can do that with the built-in concat function:
import org.apache.spark.sql.functions.{col, concat, lit}
df.withColumn("iD", concat(lit("00"), col("iD")))
  .withColumn("val", concat(lit("0"), col("val")))
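Note that concat prepends the same prefix to every value, whatever its length, so 10 becomes 0010 (which is what the question's desired output literally shows) rather than a fixed-width 010. The plain-Scala sketch below contrasts the two behaviours without needing Spark. The names PrefixVsWidth, fixedPrefix and zeroPad are hypothetical, introduced only for illustration; the format specifier is the same printf-style pattern that Solution 2 passes to format_string.

```scala
object PrefixVsWidth {
  // Mirrors concat(lit("00"), col("iD")): always prepends the same two characters.
  def fixedPrefix(n: Long): String = "00" + n

  // Pads with zeros up to a minimum width; longer values are left unchanged.
  def zeroPad(n: Long, width: Int = 3): String = s"%0${width}d".format(n)

  def main(args: Array[String]): Unit = {
    Seq(1L, 10L, 100L).foreach { n =>
      println(s"$n -> fixedPrefix=${fixedPrefix(n)} zeroPad=${zeroPad(n)}")
    }
  }
}
```

If the goal is a fixed width of three characters, the width-based form is probably the one you want; if the goal is exactly the output shown in the question, the fixed prefix matches it.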
Solution 2
This solved it for me. Thank you all for the help.
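import org.apache.spark.sql.functions.format_string
import spark.implicits._ // enables the $"iD" column syntax
val df2 = df
  .withColumn("idLong", format_string("%03d", $"iD"))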
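For anyone who wants to try this end to end, here is a minimal sketch that pads both columns from the question. It assumes Spark is on the classpath and that iD and val are numeric columns (%03d fails on string columns). The object name LeadingZerosSketch, the helper withPadded, the output column names and the two sample rows are made up for illustration. The lpad variant is not from this thread; it is included only as an alternative that works on string columns.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, format_string, lpad}

object LeadingZerosSketch {
  // Adds zero-padded string versions of iD and val.
  def withPadded(df: DataFrame): DataFrame =
    df.withColumn("iD_fmt", format_string("%03d", col("iD")))
      .withColumn("val_fmt", format_string("%03d", col("val")))
      // Alternative: cast to string first, then left-pad to three characters.
      .withColumn("iD_lpad", lpad(col("iD").cast("string"), 3, "0"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("leading-zeros-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample rows standing in for the parsed XML output.
    val df = Seq((1L, 44L), (10L, 12L)).toDF("iD", "val")
    withPadded(df).show()

    spark.stop()
  }
}
```

One difference to keep in mind: format_string leaves values wider than the pattern untouched, while lpad shortens anything longer than the requested length.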
Comments
-
fletchr almost 2 years
In short, I'm using spark-xml to parse XML files. However, it strips the leading zeros from all the values I'm interested in, and I need the final output, a DataFrame, to include them. I can't figure out a way to add leading zeros to the columns I'm interested in.
val df = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "output")
  .option("excludeAttribute", true)
  .option("allowNumericLeadingZeros", true) // including this does not solve the problem
  .load("pathToXmlFile")
Example output that I'm getting
+------+---+--------------------+
|iD    |val|Code                |
+------+---+--------------------+
|1     |44 |9022070536692784476 |
|2     |66 |-5138930048185086175|
|3     |25 |805582856291361761  |
|4     |17 |-9107885086776983000|
|5     |18 |1993794295881733178 |
|6     |31 |-2867434050463300064|
|7     |88 |-4692317993930338046|
|8     |44 |-4039776869915039812|
|9     |20 |-5786627276152563542|
|10    |12 |7614363703260494022 |
+------+---+--------------------+
Desired output
+--------+----+--------------------+
|iD      |val |Code                |
+--------+----+--------------------+
|001     |044 |9022070536692784476 |
|002     |066 |-5138930048185086175|
|003     |025 |805582856291361761  |
|004     |017 |-9107885086776983000|
|005     |018 |1993794295881733178 |
|006     |031 |-2867434050463300064|
|007     |088 |-4692317993930338046|
|008     |044 |-4039776869915039812|
|009     |020 |-5786627276152563542|
|0010    |012 |7614363703260494022 |
+--------+----+--------------------+
-
fletchr almost 6 years
Thanks, that worked. I also tried another way and posted it.
-
Ramesh Maharjan almost 6 years
That was really good :) Thanks for accepting.
-
Bhaskar about 4 years
Perfect! Thanks. For reference, for those who visit after me, here is the documentation link for format_string