Parser for Oracle SQL

34,468

Solution 1

Have you considered General SQL Parser? I don't have any experience with it myself but browsing their website it has potential. Personally I have rolled my own built on the parser in Eclipse Data Tools Platform (sorry I can't share, it's proprietary), but now I will have to evaluate the one I linked above because it claims to have more coverage of Oracle SQL than my parser does.

Solution 2

The ANTLR (v3, v4) parser generator has had a number of Oracle SQL and PL/SQL grammars written for it; see the grammar list (v3) for details. Of those:

  • I've used Andrey Kharitonkin's "Oracle PL/SQL Grammar for ANTLR v3"; from memory it supported most SQL and PL/SQL syntax from the 8i era, with a few bits and pieces that appeared in 9i and 10g
  • Patrick Higgins' "PL/SQL" grammar is newer and claims to support most 11g syntax, but it appears to just swallow most DML statements - not too useful if you're specifically interested in SQL

Solution 3

After working the same issue, I managed to get a SQL parser working:

My code looks like this:

import oracle.jdeveloper.db.DatabaseConnections;
import oracle.javatools.db.sql.SQLQueryBuilder;
import oracle.javatools.db.Database;
...
// load the database connections
// this is specific to Oracle SQL developer
DatabaseConnections connections = DatabaseConnections.getPrivateInstance(
    (new File("src/test/resources/connection.xml")).toURI().toURL(),
    "somePassword");
// get the one we are interested in
Database database = connections.getDatabase("the-name-of-a-sqldeveloper-connection");
SQLQueryBuilder queryBuilder = SQLQueryBuilderFactory.createBuilder(
      database, new Schema("OPTIONAL_SCHEMA"), "select * from some_table");

The challenges to get this working are:

  • Getting the Oracle SQL Developer is a challenge. To do this, you will need to hack the files Oracle SQL Developer creates to persist those connections; the connection.xml in the above example looks something like this:
<?xml version = '1.0' encoding = 'UTF-8'?>
<References xmlns="http://xmlns.oracle.com/adf/jndi">
   <Reference name="the-name-of-a-sqldeveloper-connection"     className="oracle.jdeveloper.db.adapter.DatabaseProvider" xmlns="">
      <Factory      className="oracle.jdevimpl.db.adapter.DatabaseProviderFactory1212"/>
  <RefAddresses>
     <StringRefAddr addrType="password">
        <Contents>HSx10FtlsPc=</Contents>
     </StringRefAddr>
     <StringRefAddr addrType="oraDriverType">
        <Contents>thin</Contents>
     </StringRefAddr>
...

To get such a file you will need to dig into the folder where Oracle SQL Developer settings are stored and just copy-paste that content into your own file.

Now, assuming you managed to get this far here are the problems and the points where I got disappointed by the end solution:

  • The API in the builder is decent but parsing will perform a query execution (and this might be a big issue - in my case I needed the parsing to be fast).
  • The API is not officially public. Without being able to quote here the precise wording, I got an Oracle answer that stated there is no officially supported Oracle parser (the alluded reason was that this is a very valuable technology that will not be sold or licensed).
  • While this is more of a hack than a solution, I realized that it might be useful for some cases (not mine). I consider that using it in real life scenarios might be highly risky from both technical and legal perspectives.

The reason I posted this answer is to drive community attention to the fact that having an Oracle SQL parser is perfectly feasible and maybe one day Oracle will consider exposing the parser as a competitive advantage (I am sure there are users out there that would happily pay some fees to get a license).

Solution 4

Will, why not use the Oracle parser?

create global temporary table plans as select * from table(dbms_xplan.display_cursor());
--/
declare
c number;
i varchar2(30);
l number;
stmt varchar2(4000);
begin
delete from plans;
stmt:= 'select z.* from z,skew1 where z.z = skew1.fillblocks';
l:= length(stmt);
c:=dbms_sql.open_cursor();
dbms_sql.parse (c, stmt,dbms_sql.native);
select distinct sql_id into i from v$open_cursor where sid in (select sid from v$mystat) and substr(sql_text,1,l) = substr(stmt,1,l);
insert into plans select * from table(dbms_xplan.display_cursor(i));
dbms_output.put_Line ('sql_id:'||i);
end;
/
select * from plans;

PLAN_TABLE_OUTPUT                                                             
----------------------------------------------------------------------------  
SQL_ID  97qc3ynmw1pa4, child number 0                                         
-------------------------------------                                         
select z.* from z,skew1 where z.z = skew1.fillblocks                          

Plan hash value: 942457544                                                    

----------------------------------------------------------------------------  
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |  
----------------------------------------------------------------------------  
|   0 | SELECT STATEMENT   |       |       |       |    85 (100)|          |  
|*  1 |  HASH JOIN         |       |     1 |   410 |    85   (2)| 00:00:02 |  
|   2 |   TABLE ACCESS FULL| Z     |     1 |     9 |     2   (0)| 00:00:01 |  
|   3 |   TABLE ACCESS FULL| SKEW1 |  6000 |  2349K|    82   (0)| 00:00:01 |  
----------------------------------------------------------------------------  

Predicate Information (identified by operation id):                           
---------------------------------------------------                           

   1 - access("Z"."Z"=INTERNAL_FUNCTION("SKEW1"."FILLBLOCKS"))   

You do need an oracle database connection. If the output is what you want, it is the easiest way to get what you want, without re-inventing an other color for a wheel. In this example I limited the sql to 4000 characters but you could feed a pl/sql array of varchar2 into to dbms_sql.parse function, doing so allows you to parse sql's of unimaginable sizes.

Solution 5

Given that Oracle Corporation couldn't keep the SQL parser for the SQL and PL/SQL VM's in sync when the two had different SQL parsers, it's unlikely that a third party would be able to create a "fully compliant" parser.

What data are you trying to extract from the query? The Oracle database itself may have other facilities that would allow you to extract that information without parsing the query first.

Share:
34,468

Related videos on Youtube

Will
Author by

Will

Updated on October 17, 2020

Comments

  • Will
    Will over 3 years

    For my current project I need a SQL parser that parses Oracle SQL statements. Currently I've been using jsqlparser, which worked well for simple queries. But when specific functions occur (e.g. cast() or (+)) the parser fails.

    Can anyone suggest a parser that is fully compliant to Oracle SQL?

    Best, Will

    • ik_zelf
      ik_zelf about 13 years
      what exactly do you want to get out of this parser?
    • Will
      Will about 13 years
      basically its enough if i get the AST of the statement, as I've already wrote my visitor for the jsqlparser, which I will need to adapt to the new parser, if there is any better.
    • ik_zelf
      ik_zelf about 13 years
      AST Automatic Statement Tuning ?
  • Will
    Will about 13 years
    I don't need to parse PL/SQL and I am restricted to the sql statements only (without access to a database).
  • Justin Cave
    Justin Cave about 13 years
    I'm not talking about a PL/SQL parser. Oracle has separate PL/SQL and SQL VM's internally. For many years, they each had their own SQL parser. And for many years, you'd find that you couldn't embed a particular valid SQL statement in a PL/SQL block because the PL/SQL VM's SQL parser couldn't parse it. But when you passed that same SQL statement to the SQL VM, the SQL VM's SQL parser had no problem with it. That's not a problem any longer because the two VM's share a single SQL parser. But it does show the difficulty of keeping up with the evolution of the SQL grammar.
  • Will
    Will about 13 years
    Yeah I also discovered their website and already tried parsing some queries using this parser. However their trial version only allows parsing 2000 characters so I can't fully test and evaluate whether it will be worth it to buy it. So I was thinking maybe someone knows some good alternatives or has some experience to share :)
  • Will
    Will about 13 years
    Thanks for the info.That's why I was saying that I only need a SQL parser. The problem that they couldn't keep their two different parsers in sync, does not affect the feasibilty for a third party to create a parser that can handle all the SQL.
  • Will
    Will about 13 years
    I actually only have the sql statement and nothing else. So I'm really restricted to statically analyze the SQL. Questions I need to answer include, what tables are in that sql statement.
  • Will
    Will about 13 years
    I eventually decided to get the license for this parser. Seems fine so far, but already discovered a bug in its grammar.
  • ChrisH
    ChrisH about 13 years
    It will be interesting to see how responsive they are to fixing the bug.
  • Will
    Will almost 13 years
    They are not very responsive, the first guy contacted me after one week and provided me with a link to download "improved" version of the parser. Tried it again and did not solve my problem. Contacted him again, and haven't gotten any response for over 6 weeks now..
  • darcyy
    darcyy over 11 years
    "Grammar list" link is broken
  • SimonJ
    SimonJ over 11 years
    Updated. Looks like things got moved around after the ANTLR 4 release.
  • Ira Baxter
    Ira Baxter over 11 years
    Flagger: You could have the courtesy to explain why you dinged this answer. It directly answers OP's question.
  • Silas Hansen
    Silas Hansen almost 11 years
    I wouldn't exactly say its easy to convert it into C#. At least not for a C# programmer.