Why does CONNECT BY LEVEL on a table return extra rows?

Solution 1

In the first query, you connect by the level alone. With level <= 1, you get each record once. With level <= 2, you get each record once at level 1 plus N times at level 2 (where N is the number of records in the table). It behaves like a cross join, because every record in the table is picked again at each level until the limit is reached, with no other condition to restrict the result. With level <= 3, the same happens again for each of those results.

So for 3 records:

  • Lvl 1: 3 records (all having level 1)
  • Lvl 2: 3 records having level 1 + 3*3 records having level 2 = 12
  • Lvl 3: 3 + 3*3 + 3*3*3 = 39 (indeed, 13 records for each ID).
  • Lvl 4: starting to see a pattern? :)

It's not really a cross join, though. A cross join would return only the records that have level 2 in this query result, whereas this connect by returns the records having level 1 as well as those having level 2, giving 3 + 3*3 records instead of just 3*3.
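
One way to check the arithmetic is to count the rows at each level. The query below is a minimal sketch, assuming a table A that holds exactly three rows (such as the table from the question); the aliases lvl and cnt are chosen here for illustration only.

```sql
-- Count how many rows the hierarchical query produces at each level.
-- Assumes table A contains exactly 3 rows.
select level as lvl
     , count(*) as cnt
  from a
connect by level <= 3
 group by level
 order by level
```

With three rows this should give 3 rows at level 1, 9 at level 2 and 27 at level 3, which adds up to the 39 rows above. In general, for N rows and a limit of L, the total is N + N^2 + ... + N^L.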

Solution 2

When connect by is used without a start with clause and without the prior operator, there is no restriction on joining a child row to a parent row. In this situation Oracle returns every possible hierarchy path, connecting each row to every row at the next level.

SQL> select b
  2       , level as lvl
  3       , sys_connect_by_path(b, '->') as ph
  4     from a
  5  connect by level <= 2
  6  ;

         B        LVL PH
---------- ---------- ----------
         1          1 ->1
         1          2 ->1->1
         2          2 ->1->2
         3          2 ->1->3
         2          1 ->2
         1          2 ->2->1
         2          2 ->2->2
         3          2 ->2->3
         3          1 ->3
         1          2 ->3->1
         2          2 ->3->2
         3          2 ->3->3

12 rows selected
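
To see the effect of the missing restriction, the same query can be limited to a single root. This is a sketch only, assuming the same table A with column B holding 1, 2 and 3; the start with condition is added here for illustration and is not part of the original query.

```sql
-- Only the row with b = 1 is used as a root (level 1).
-- The connect by condition still places no restriction on the children,
-- so every row of A is attached to that single root at level 2.
select b
     , level as lvl
     , sys_connect_by_path(b, '->') as ph
  from a
 start with b = 1
connect by level <= 2
```

This should return 4 rows instead of 12: the root ->1 and its three children ->1->1, ->1->2 and ->1->3.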

Solution 3

You're comparing apples to oranges when you compare the final query to the others, because in that query LEVEL is isolated to the one-row DUAL table.

Let's consider this query:

 select id, level as lvl
   from a
connect by level <= 2
  order by id, level

What that says is: start with the table set (select * from a). Then, for each row returned, connect this row to the prior row. As you have not defined a join condition in the connect by, this is in effect a Cartesian join. So with 3 rows (1, 2, 3), 1 joins to 2, 1->3, 2->1, 2->3, 3->1 and 3->2, and the rows also join to themselves: 1->1, 2->2 and 3->3. These joins are level = 2. That gives 9 joins, which is why you get 12 rows (the 3 original "level 1" rows plus the Cartesian set).

So for level <= 2, the number of rows output = rowcount + (rowcount^2).

In the last query you are isolating LEVEL to this:

select level  as lvl
           from dual
        connect by level  <= 2

That of course returns 2 rows. These are then cross joined to the original 3 rows, giving 6 rows as output.
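
If the goal is to get those 6 rows from a single connect by on the table itself, a commonly used idiom is to tie each row to itself with prior. The sketch below assumes that id is unique in table A. The prior sys_guid() is not null condition is part of the idiom: it is always true, and it keeps Oracle from reporting a CONNECT BY loop when a row is connected to itself.

```sql
-- Each row is connected only to itself, so LEVEL counts per row
-- instead of multiplying across the whole table.
-- Assumes id is unique in table A.
select id, level as lvl
  from a
connect by level <= 2
       and prior id = id
       and prior sys_guid() is not null
 order by id, level
```

With the three rows from the question this should return 6 rows: levels 1 and 2 for each id.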


Ben
Author by

Ben

Updated on January 09, 2020

Comments

  • Ben
    Ben over 4 years

    Using CONNECT BY LEVEL seems to return too many rows when performed on a table. What is the logic behind what's happening?

    Assuming the following table:

    create table a ( id number );
    
    insert into a values (1);
    insert into a values (2);
    insert into a values (3);
    

    This query returns 12 rows (SQL Fiddle).

     select id, level as lvl
       from a
    connect by level <= 2
      order by id, level
    

    That is one row for each row in table A where column LVL is 1, and three rows for each row in table A where column LVL is 2, i.e.:

    ID | LVL 
    ---+-----
     1 |  1 
     1 |  2 
     1 |  2 
     1 |  2 
     2 |  1 
     2 |  2 
     2 |  2 
     2 |  2 
     3 |  1 
     3 |  2 
     3 |  2 
     3 |  2 
    

    It is equivalent to this query, which returns the same results.

     select id, level as lvl
       from dual
      cross join a
    connect by level <= 2
      order by id, level
    

    I don't understand why these queries return 12 rows or why there are three rows where LVL is 2 and only one where LVL is 1 for each value of the ID column.

    Increasing the number of levels that are "connected" to 3 returns 13 rows for each value of ID. 1 where LVL is 1, 3 where LVL is 2 and 9 where LVL is 3. This seems to suggest that the rows returned are the number of rows in table A to the power of the value of LVL minus 1.

    I would have thought that these queries would be the same as the following, which returns 6 rows:

    select id, lvl
      from ( select level  as lvl
               from dual
            connect by level  <= 2
                    )
     cross join a
     order by id, lvl
    

    The documentation isn't particularly clear, to me, in explaining what should occur. What's happening with these powers and why aren't the first two queries the same as the third?

  • Oto Shavadze
    Oto Shavadze about 9 years
    sys_connect_by_path() <- this keyword explained everything. One of the best answers on this question (not only in this topic).