Extracting data using regexp_extract in Google BigQuery


It's very simple to do:

select regexp_extract(input,r'he=(.{32})');

Or, as an example with a literal input:

select regexp_extract('http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234355;he=5e3152eafc50ed0346df7f10095d07c4;catname=Horoscope',r'he=(.{32})')
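To check the pattern outside BigQuery, here is a minimal sketch in Python. It assumes Python's `re` module treats this simple pattern the same way BigQuery's regex engine does; the helper name `extract_he` is made up for illustration. It applies the pattern to the two sample URLs from the question.

```python
import re

# Same pattern as in the answer: capture the 32 characters that follow "he=".
PATTERN = re.compile(r"he=(.{32})")

# The two sample URLs from the question.
urls = [
    "http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234355;"
    "he=5e3152eafc50ed0346df7f10095d07c4;catname=Horoscope",
    "http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234335;"
    "he=5e3152eafc50ed0346df7f10095d07c4;catname=High+Speed+Internet",
]

def extract_he(url):
    """Return the first capture group, or None when the pattern does not match."""
    match = PATTERN.search(url)
    return match.group(1) if match else None

results = [extract_he(u) for u in urls]
for value in results:
    print(value)
```

Both inputs should yield the 32-character value that follows `he=`, which is the output the question asks for.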
Author: Teja

Updated on June 05, 2022

Comments

  • Teja
    Teja almost 2 years

    I am trying to extract a specific substring from a column whose values contain many other characters. My sample input and expected output are below. How can I implement this using the regexp_extract function? If you have worked with Google BigQuery (GBQ), please share your thoughts. Thanks.

    SQL:

       SELECT request.url AS url 
        FROM [xyz.abc]
        WHERE regexp_extract(input,r'he=(.{32})') 
    

    Input:

    http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234355;he=5e3152eafc50ed0346df7f10095d07c4;catname=Horoscope  
    http://mpp.xyz.com/conv/v=5;m=1;t=16901;ts=20150516234335;he=5e3152eafc50ed0346df7f10095d07c4;catname=High+Speed+Internet
    

    Output:

    5e3152eafc50ed0346df7f10095d07c4
    5e3152eafc50ed0346df7f10095d07c4
    


  • Teja
    Teja almost 9 years
    Thanks for the response, Pentium, but I get this error when I run it: Argument type mismatch in function LOGICAL_AND: first argument is type bool, second argument is type string
  • Pentium10
    Pentium10 almost 9 years
    Check your data types; these functions work with STRING columns. If you still have the issue, share your query so we can help.
  • Pentium10
    Pentium10 almost 9 years
    That SQL is confused: you cannot use regexp_extract as a WHERE condition this way, because it returns a string, not a boolean. Put it in the SELECT list instead: SELECT regexp_extract(request.url, r'he=(.{32})') AS output FROM [xyz.abc]
  • Teja
    Teja almost 9 years
    My bad... I removed it from the WHERE clause and ran it in the SELECT clause instead, but it returns NULLs.
  • Teja
    Teja almost 9 years
    What does .{32} mean?
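On the last two comments: `.` matches any single character (other than a newline, by default) and `{32}` repeats it exactly 32 times, so `he=(.{32})` captures whatever 32 characters follow `he=`. When the input has no `he=`, or has fewer than 32 characters after it, there is no match and regexp_extract returns NULL. The sketch below illustrates this in Python, assuming its `re` module behaves like BigQuery's engine for these patterns. The input strings, the `extract` helper and the stricter `[0-9a-f]{32}` variant are illustrative additions, not something from the thread.

```python
import re

def extract(pattern, text):
    """Return capture group 1, or None (the analogue of NULL) if there is no match."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

ANY32 = r"he=(.{32})"          # any 32 characters after "he="
HEX32 = r"he=([0-9a-f]{32})"   # stricter: exactly 32 lowercase hex digits

# Hypothetical inputs, made up for illustration.
full = "a=1;he=0123456789abcdef0123456789abcdef;b=2"    # 32 hex chars after he=
short = "a=1;he=0123456789abcdef"                        # only 16 chars after he=
missing = "a=1;b=2"                                      # no he= at all
loose = "he=short;catname=High+Speed+Internet;z=1"       # he= followed by non-hex text

print(extract(ANY32, full))     # the 32 characters after he=
print(extract(ANY32, short))    # None: fewer than 32 characters follow he=
print(extract(ANY32, missing))  # None: he= is absent
print(extract(ANY32, loose))    # 32 characters, including the ';' separator
print(extract(HEX32, loose))    # None: the stricter pattern rejects non-hex text
```

If the query returns NULLs, one thing to check is whether the column passed to regexp_extract actually contains `he=` followed by at least 32 characters.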