How to do OUTER JOIN in Scala

16,210

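Judging by your expected output, you need a LEFT OUTER JOIN: every row of df1 is kept, and count is null where df2 has no matching idValue. A full "outer" join would also return the df2 rows with idValue 13 and 23, which have no match in df1.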

// df1 has no "count" column; its second column is "value".
val groupedData = df1.join(df2, $"id" === $"idValue", "left_outer").
       select(df1("id"), df1("value"), df2("count"))

// Print the first 10 rows.
groupedData.take(10).foreach(println)
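
To try the join end to end, here is a minimal sketch that rebuilds the two frames from the question and applies the same left outer join. It assumes Spark 2.x or later with `spark-sql` on the classpath and a local master; the object name `LeftOuterJoinExample` and the helper `joinFrames` are hypothetical names chosen for illustration, not part of the original question or answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper object, used only for this illustration.
object LeftOuterJoinExample {

  // Builds df1 and df2 from the sample data in the question and
  // left-outer-joins them on id === idValue.
  def joinFrames(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val df1 = Seq((1, 23), (2, 23), (3, 23), (2, 25), (5, 25)).toDF("id", "value")
    val df2 = Seq((1, 33), (2, 23), (3, 34), (13, 34), (23, 34)).toDF("idValue", "count")

    // "left_outer" keeps every row of df1; count is null where no idValue matches.
    df1.join(df2, $"id" === $"idValue", "left_outer")
      .select($"id", $"value", $"count")
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for experimenting.
    val spark = SparkSession.builder()
      .appName("left-outer-join-example")
      .master("local[*]")
      .getOrCreate()

    joinFrames(spark).show()
    spark.stop()
  }
}
```

The row with id 5 should come back with a null count, since no idValue in df2 equals 5. Row order in the printed result is not guaranteed.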
Author by

Newbie

Updated on June 04, 2022

Comments

  • Newbie
    Newbie almost 2 years

    I have two data frames: df1 and df2

    df1

    |--- id---|---value---|
    |    1    |    23     |
    |    2    |    23     |
    |    3    |    23     |
    |    2    |    25     |
    |    5    |    25     |
    

    df2

    |-idValue-|---count---|
    |    1    |    33     |
    |    2    |    23     |
    |    3    |    34     |
    |    13   |    34     |
    |    23   |    34     |
    

    How do I get this?

    |--- id--------|---value---|---count---|
    |    1         |    23     |    33     |
    |    2         |    23     |    23     |
    |    3         |    23     |    34     |
    |    2         |    25     |    23     |
    |    5         |    25     |    null   |
    

    I am doing:

     val groupedData = df1.join(df2, $"id" === $"idValue", "outer")
    

    But I don't see the last column in groupedData. Is this the correct way to do it, or am I doing something wrong?