How to write a LEFT JOIN in BigQuery's Standard SQL?

Solution 1

Based on the recent update to your question and the comments, try the query below:

WITH Table_L AS (
SELECT 1 AS Row, 'A' AS Hour UNION ALL
SELECT 2 AS Row, 'B' AS Hour UNION ALL
SELECT 3 AS Row, 'C' AS Hour 
),
Table_R AS (
SELECT 1 AS Row, 10 AS Value UNION ALL
SELECT 2 AS Row, 20 AS Value UNION ALL
SELECT 3 AS Row, 30 AS Value 
)
SELECT 
  Row, 
  Hour, 
  (SELECT AVG(Value) FROM Table_R) AS AverageOfR,
  1 AS Key
FROM Table_L 

The query above is for testing only: the WITH clauses just supply sample data.

The query you should run in "production" is:

SELECT 
  Row, 
  Hour, 
  (SELECT AVG(Value) FROM Table_R) AS AverageOfR,
  1 AS Key
FROM Table_L 

If for some reason you are bound to a JOIN, use the CROSS JOIN version below:

SELECT 
  Row, 
  Hour, 
  AverageOfR,
  1 AS Key
FROM Table_L
CROSS JOIN ((SELECT AVG(Value) AS AverageOfR FROM Table_R))

Or use the LEFT JOIN version below, which keeps the Key field (in case Key really is important for your logic, which I somehow feel is true):

SELECT 
  Row, 
  Hour, 
  AverageOfR,
  L.Key AS Key
FROM (SELECT 1 AS Key, Row, Hour FROM Table_L) AS L
LEFT JOIN ((SELECT 1 AS Key, AVG(Value) AS AverageOfR FROM Table_R)) AS R
ON L.Key = R.Key
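
To try the LEFT JOIN version without touching real tables, the sample data from the test query can be reused. This is a minimal sketch, assuming the same inline Table_L and Table_R as in the test query; the ORDER BY is an addition here so the rows come back in a stable order.

```sql
-- Inline sample data standing in for the real tables (values from the question).
WITH Table_L AS (
  SELECT 1 AS Row, 'A' AS Hour UNION ALL
  SELECT 2, 'B' UNION ALL
  SELECT 3, 'C'
),
Table_R AS (
  SELECT 1 AS Row, 10 AS Value UNION ALL
  SELECT 2, 20 UNION ALL
  SELECT 3, 30
)
SELECT
  L.Row,
  L.Hour,
  R.AverageOfR,
  L.Key AS Key
-- Both sides get a constant Key of 1, so every row of L matches the single row of R.
FROM (SELECT 1 AS Key, Row, Hour FROM Table_L) AS L
LEFT JOIN (SELECT 1 AS Key, AVG(Value) AS AverageOfR FROM Table_R) AS R
ON L.Key = R.Key
ORDER BY Row
```

With this sample data, each of the three rows should carry an AverageOfR of 20.0 and a Key of 1, which matches the desired result in the question.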

Solution 2

Your error message suggests that key is not a column in table_L. If it is not, then don't include it in the query.

It looks like you simply want the average of the total from table_R. You can approach this as:

SELECT l.*, r.average
FROM test.table_L as l CROSS JOIN
     (SELECT Avg(Total) as average 
      FROM test.table_R
     ) R 
ORDER BY l.hour ASC;
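
The same CROSS JOIN can be checked against inline data before running it on the full tables. This is a sketch under assumptions: the CTE names table_L and table_R stand in for test.table_L and test.table_R, and the column Total is assumed from the question's original query (the question's sample table labels it Value).

```sql
-- Hypothetical inline stand-ins for test.table_L and test.table_R.
WITH table_L AS (
  SELECT 1 AS Row, 'A' AS Hour UNION ALL
  SELECT 2, 'B' UNION ALL
  SELECT 3, 'C'
),
table_R AS (
  SELECT 1 AS Row, 10 AS Total UNION ALL
  SELECT 2, 20 UNION ALL
  SELECT 3, 30
)
SELECT l.*, r.average
FROM table_L AS l
-- The subquery yields exactly one row, so the CROSS JOIN adds a column without multiplying rows.
CROSS JOIN (SELECT AVG(Total) AS average FROM table_R) AS r
ORDER BY l.Hour ASC;
```

Because the aggregated subquery returns a single row, the output has as many rows as table_L, however large it is.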

Author: Praxiteles

Updated on December 27, 2020

Comments

  • Praxiteles over 3 years

    We have a query that works in BigQuery's Legacy SQL. How do we write it in Standard SQL so it works?

    SELECT Hour, Average, L.Key AS Key FROM
    (SELECT 1 AS Key, * 
    FROM test.table_L AS L)
    LEFT JOIN 
    (SELECT 1 AS Key, Avg(Total) AS Average 
    FROM test.table_R) AS R 
    ON L.Key = R.Key ORDER BY Hour ASC
    

    Currently the error it gives is:

    Equality is not defined for arguments of type ARRAY<INT64> at [4:74]
    

    BigQuery has two modes for queries: Legacy SQL and Standard SQL. We have looked at the BigQuery Standard SQL documentation and found just one SO answer on Standard SQL joins in BigQuery, but so far it is unclear to us what the key change needed might be.

    Table_L looks like this:

    Row    Hour
     1      A
     2      B
     3      C
    

    Table_R looks like this:

    Row    Value
     1      10
     2      20
     3      30
    

    Results Desired:

    Row  Hour  Average(OfR)  Key
     1     A      20          1
     2     B      20          1 
     3     C      20          1
    

    How do we rewrite this BigQuery Legacy SQL query to work in Standard SQL?

  • Praxiteles over 7 years
    Just as a clarification - there are over 5 million rows. Does this answer require explicitly selecting each row?
  • Mikhail Berlyant over 7 years
    No, these are just presented as an example for you to test and confirm whether it is what you need (or not). In reality, you should run only the SELECT part of it.
  • Praxiteles over 7 years
    This answer worked for us as did the other answer below as well. (Thanks!)
  • Praxiteles over 7 years
    This answer worked for us. The different variations are a great addition. (Thanks!)
  • Gordon Linoff over 7 years
    @Praxiteles . . . I would consider this the simpler and more direct approach.