How to drop multiple tables in Big query using Wildcards TABLE_DATE_RANGE()?

12,067

Solution 1

I just used python to loop this and solve it using Graham example:

 from subprocess import call


   return_code = call('bq  rm -f -t dataset.' + table_name +'_'+  period + '', shell=True)

Solution 2

For dataset stats and tables like daily_table_20181017 keeping dates conventions, I would go with simple script and gcloud Command-Line Tool:

for table in `bq ls --max_results=10000000 stats |grep TABLE |grep daily_table |awk '{print $1}'`; do echo stats.$table; bq rm -f -t stats.$table; done

Solution 3

DDL e.g. DROP TABLE doesn't exist yet in BigQuery. However, I know Google are currently working on it.

In the meantime, you'll need to use the API to delete tables. For example, using the gCloud tool:

bq rm -f -t dataset.table

If you want to do bulk deletes, then you can use some bash/awk magic. Or, if you prefer, call the Rest API directly with e.g. the Python client.

See here too.

Solution 4

For a long time @graham's approach worked for me. Just recently the BQ CLI stopped working effectively and froze everytime I ran the above command. Hence I dug around for a new approach and used some parts of Google cloud official documentation. I followed the following approach using a Jupyter notebook.

from google.cloud import bigquery

# TODO(developer): Construct a BigQuery client object.
client = bigquery.Client.from_service_account_json('/folder/my_service_account_credentials.json')

dataset_id = 'project_id.dataset_id'
dataset = client.get_dataset(dataset_id)

# Creating a list of all tables in the above dataset
tables = list(client.list_tables(dataset))  # API request(s)


## Filtering out relevant wildcard tables to be deleted
## Mention a substring that's common in all your tables that you want to delete

tables_to_delete = ["{}.{}.{}".format(dataset.project, dataset.dataset_id, table.table_id) 
                    for table in tables if "search_sequence_" in format(table.table_id)]

for table in tables_to_delete:
    client.delete_table(table)
    print("Deleted table {}".format(table))  ```

Solution 5

DROP TABLE [table_name]; is now supported in bigquery. So here is a purely SQL/bigquery UI solution.

select concat("drop table ",table_schema,".",   table_name, ";" )
from <dataset-name>.backerkit_production_follower.INFORMATION_SCHEMA.TABLES
where table_name like "partial_table_name%"
order by table_name desc

Audit that you are dropping the correct tables. Copy and paste back into bigquery to drop listed tables.

Share:
12,067
jfk83
Author by

jfk83

Updated on June 07, 2022

Comments

  • jfk83
    jfk83 almost 2 years

    I was looking at the documentation but I haven't found the way to Drop multiple tables using wild cards.

    I was trying to do something like this but it doesn't work:

    DROP TABLE
     TABLE_DATE_RANGE([clients.sessions_], 
                      TIMESTAMP('2017-01-01'), 
                      TIMESTAMP('2017-05-31'))