SQL Passthrough in SAS

sql sas proc-sql

14,772

Solution 1

Although this question is overly broad, I can provide an overly broad answer.

Pass-through SQL in SAS lets you communicate directly with a database. This is especially useful when you need database-specific functions, such as Oracle's statistical functions: the query is sent to the database as written, so you do not have to worry about how SAS will handle or translate your SQL.
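As a rough sketch of what that looks like, the query below sends Oracle's STATS_MODE aggregate straight to the database through explicit pass-through. The table sales and the columns region and order_size are hypothetical names chosen for illustration, and the connection options are placeholders to fill in for your own site.

```sas
proc sql;
   /* Connection options are placeholders; user and password are left elided */
   connect to oracle (path=dbserver user=... password=...);

   /* The inner query is sent to Oracle as written, so Oracle-only
      functions such as STATS_MODE can be used. Table and column
      names here are hypothetical. */
   create table work.typical_sizes as
   select * from connection to oracle
      ( select region, stats_mode(order_size) as typical_size
          from sales
         group by region );

   disconnect from oracle;
quit;
```

Only the result of the inner query comes back to SAS, where it is stored as work.typical_sizes.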

Additionally, pass-through SQL requires very little processing on the SAS side, which has been a benefit to us. If you have an extremely busy SAS box, you can opt to send the processing logic directly to the database. This is possible without pass-through SQL, but pass-through gives you a higher degree of control.

This is by no means an exhaustive list of benefits, simply a few high-level perks of using pass-through SQL. If you have a more concrete use case, we can discuss the specific differences in coding techniques.

Solution 2

PROC SQL will try to pass as much of the logic as it can to the database, but there are various times when it cannot. Using SAS functions that have no equivalent in the database (or in the SAS/ACCESS engine for the database) will prevent the whole query from being passed to the database. When the query is not fully passed to the database, the data is pulled into SAS and processed there. The more complicated your SQL is, the more likely it is to end up being processed in SAS. Here is a case that makes a larger difference than you might realize.

libname db <database> path=dbserver user=... password=...;
proc sql;
   create table db.new as
   select * from db.largedata where flag=1;
quit;

This will actually (at least through SAS 9.1.3) pull all the data that matches flag=1 down to SAS and then load it back into the database. If this is millions of rows, it really slows things down.

You would find explicit pass-through much faster in this case.

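proc sql;
   /* dbase stands for the SAS/ACCESS engine name, e.g. oracle */
   connect to dbase (server=dbserver user=... password=...);
   /* Everything inside execute ( ... ) runs in the database, so db here
      must be a schema the database knows, not the SAS libref */
   execute (create table db.new as
   select * from db.largedata where flag=1) by dbase;
   disconnect from dbase;
quit;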

I recently ran an example using Oracle and a table with about 250,000 rows. The first way took 20 seconds and the second way took 2 seconds.
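If you want to check how much of an implicit query actually reaches the database, one option is SAS/ACCESS tracing. The sketch below assumes the db libref from the libname statement shown earlier; with tracing on, the SQL that SAS hands to the database is written to the SAS log, so you can see whether the WHERE clause and the aggregate were passed down or handled in SAS.

```sas
/* Write the SQL that SAS/ACCESS sends to the database into the SAS log */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

proc sql;
   /* db is the libref assigned by the earlier libname statement */
   select count(*) from db.largedata where flag=1;
quit;

/* Turn tracing off again */
options sastrace=off;
```

Comparing the logged statement with the query you wrote is a quick way to spot cases where rows are being pulled into SAS for processing.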

Solution 3

If you don't use pass-through, you have to import all the records you need for the processing from the database into SAS. By using pass-through, you can have some of the processing done on the database side and bring only the resulting records over into SAS. The difference (in terms of processing time and network usage) can vary from tiny to huge, depending on what you do.
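A minimal sketch of that idea, assuming an Oracle connection and a table named largedata with a flag column (both are assumptions for illustration): the grouping and counting happen in the database, and only one row per flag value travels back to SAS.

```sas
proc sql;
   /* Connection options are placeholders to fill in for your site */
   connect to oracle (path=dbserver user=... password=...);

   /* The database does the aggregation; SAS receives only the summary */
   create table work.flag_counts as
   select * from connection to oracle
      ( select flag, count(*) as n_rows
          from largedata
         group by flag );

   disconnect from oracle;
quit;
```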
