Get the sysdate -1 in Hive
Solution 1
Try something like:
select * from table1
where dt >= from_unixtime(unix_timestamp()-1*60*60*24, 'yyyyMMdd');
This works if you don't mind that hive scans the entire table. from_unixtime
is not deterministic, so the query planner in Hive won't optimize for you. For many cases (for example log files), not specifying a deterministic partition key can cause a very large hadoop job to start since it will scan the whole table, not just the rows with the given partition key.
If this matters to you, you can launch hive with an additional option
$ hive -hiveconf date_yesterday=20150331
And in the script or hive terminal use
select * from table1
where dt >= ${hiveconf:date_yesterday};
The name of the variable doesn't matter, nor does the value, you can set them in this case to get the prior date using unix commands. In the specific case of the OP
$ hive -hiveconf date_yesterday=$(date --date yesterday "+%Y%m%d")
Solution 2
In mysql:
select DATE_FORMAT(curdate()-1,'%Y%m%d');
In sqlserver :
SELECT convert(varchar,getDate()-1,112)
Use this query:
SELECT FROM_UNIXTIME(UNIX_TIMESTAMP()-1*24*60*60,'%Y%m%d');
Solution 3
It looks like DATE_SUB
assumes date in format yyyy-MM-dd
. So you might have to do some more format manipulation to get to your format. Try this:
select * from table1
where dt = FROM_UNIXTIME(
UNIX_TIMESTAMP(
DATE_SUB(
FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd')
, 1)
)
, 'yyyyMMdd') limit 10;
arsenal
profile for ferhan on Stack Exchange, a network of free, community-driven Q&A sites http://stackexchange.com/users/flair/335839.png
Updated on August 21, 2022Comments
-
arsenal over 1 year
Is there any way to get the
current date -1
in Hive meansyesterdays date
always? And in this format-20120805
?I can run my query like this to get the data for
yesterday's date
as today isAug 6th
-select * from table1 where dt = '20120805';
But when I tried doing this way with
date_sub function
to get the yesterday's date as the below table is partitioned on date(dt) column.select * from table1 where dt = date_sub(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(), 'yyyyMMdd')) , 1) limit 10;
It is looking for the data in all the partitions? Why? Something wrong I am doing in my query?
How I can make the evaluation happen in a subquery to avoid the whole table scanned?