Split a string into rows using pure SQLite

12,163

Solution 1

This is possible with a recursive common table expression:

WITH RECURSIVE split(s, last, rest) AS (
  VALUES('', '', 'C:\Users\fidel\Desktop\Temp')
  UNION ALL
  SELECT s || substr(rest, 1, 1),
         substr(rest, 1, 1),
         substr(rest, 2)
  FROM split
  WHERE rest <> ''
)
SELECT s
FROM split
WHERE rest = ''
   OR last = '\';

(You did not ask for a reasonable way.)

Solution 2

Recursive CTE:

WITH RECURSIVE cte(org, part, rest, pos) AS (
  VALUES('C:\Users\fidel\Desktop\Temp', '','C:\Users\fidel\Desktop\Temp'|| '\', 0)
  UNION ALL
  SELECT org,
         SUBSTR(org,1, pos + INSTR(rest, '\')),
         SUBSTR(rest, INSTR(rest, '\')+1),
         pos + INSTR(rest, '\')
  FROM cte
  WHERE INSTR(rest, '\') > 0                         
)
SELECT *
FROM cte
WHERE pos <> 0
ORDER BY pos; 

SqlFiddleDemo

Output:

╔═════════════════════════════╗
║            part             ║
╠═════════════════════════════╣
║ C:\                         ║
║ C:\Users\                   ║
║ C:\Users\fidel\             ║
║ C:\Users\fidel\Desktop\     ║
║ C:\Users\fidel\Desktop\Temp ║
╚═════════════════════════════╝

How it works:

org  - original string does not change
part - simply `LEFT` equivalent of original string taking pos number of chars
rest - simply `RIGHT` equivalent, rest of org string
pos  - position of first `\` in the rest 

Trace:

╔══════════════════════════════╦══════════════════════════════╦════════════════════════════╦═════╗
║             org              ║            part              ║           rest             ║ pos ║
╠══════════════════════════════╬══════════════════════════════╬════════════════════════════╬═════╣
║ C:\Users\fidel\Desktop\Temp  ║ C:\                          ║ Users\fidel\Desktop\Temp\  ║   3 ║
║ C:\Users\fidel\Desktop\Temp  ║ C:\Users\                    ║ fidel\Desktop\Temp\        ║   9 ║
║ C:\Users\fidel\Desktop\Temp  ║ C:\Users\fidel\              ║ Desktop\Temp\              ║  15 ║
║ C:\Users\fidel\Desktop\Temp  ║ C:\Users\fidel\Desktop\      ║ Temp\                      ║  23 ║
║ C:\Users\fidel\Desktop\Temp  ║ C:\Users\fidel\Desktop\Temp  ║                            ║  28 ║
╚══════════════════════════════╩══════════════════════════════╩════════════════════════════╩═════╝

Solution 3

If you want to search for the values ​​individually, use the code below:

WITH RECURSIVE split(content, last, rest) AS (
VALUES('', '', 'value1§value2§value3§value4§value5§value6§value7')
UNION ALL
  SELECT 

    CASE WHEN last = '§' 
            THEN
                substr(rest, 1, 1)
            ELSE
                content || substr(rest, 1, 1)
    END,
     substr(rest, 1, 1),
     substr(rest, 2)
  FROM split
  WHERE rest <> ''
)
SELECT 
       REPLACE(content, '§','') AS 'ValueSplit'     
FROM 
       split
WHERE 
       last = '§' OR rest ='';

Result:

**ValueSplit**

value1
value2
value3
value4
value5
value6
value7

I hope I can help people with the same problem.

Solution 4

There's simpler alternative to the recursive CTE, that also can be applied to a number of file paths in a result set (or generally any delimited strings that you want to "split" into multiple rows by a separator).

SQLite has JSON1 extension. It's compatible with SQLite >= 3.9.0 (2015-10-14), but sqlite3 is almost always compiled with it now (e.g. Ubuntu, Debian, official Python Docker images and so on, and you can check it with PRAGMA compile_options and this answer has a little more detail on it).

JSON1 has json_each, which is one of the two table-valued functions in the extension that:

walk the JSON value provided as their first argument and return one row for each element.

Hence if you can turn your string into a JSON array string, this function will do the rest. And it's not hard to do.

const sql = `
  WITH input(filename) AS (
    VALUES  
      ('/etc/redis/redis.conf'),
      ('/run/redis/redis-server.pid'),
      ('/var/log/redis-server.log')
  ), tmp AS (
    SELECT 
      filename, 
      '["' || replace(filename, '/', '", "') || '"]' as filename_array
    FROM input
  )
  SELECT (
    SELECT group_concat(ip.value, '/') 
    FROM json_each(filename_array) ip 
    WHERE ip.id <= p.id
  ) AS path
  FROM tmp, json_each(filename_array) AS p
  WHERE p.id > 1  -- because the filenames start with the separator
`

async function run() {
  const wasmUrl = 'https://cdnjs.cloudflare.com/ajax/libs/sql.js/1.5.0/sql-wasm.wasm'
  const sqljs = await window.initSqlJs({locateFile: file => wasmUrl})
  const db = new sqljs.Database()
  const results = db.exec(sql) 
  ko.applyBindings(results[0])
}
run()
<script src="https://cdnjs.cloudflare.com/ajax/libs/knockout/3.4.2/knockout-min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/sql.js/1.5.0/sql-wasm.min.js"></script>
<table>
  <thead>
    <tr data-bind="foreach: columns"><th data-bind="text: $data"></th></tr>
  </thead>
  <tbody data-bind="foreach: values">
    <tr data-bind="foreach: $data"><td data-bind="text: $data"></td></tr>
  </tbody>
</table>

Solution 5

Inspired from Lukasz Szozda's answer:

 WITH RECURSIVE cte("pre","post") AS (   
   VALUES('C:', 'Users\fidel\Desktop\Temp' || '\')
 UNION ALL
   SELECT "pre" || '\' || left("post",     position('\' in "post")-1),
                     substring("post" from position('\' in "post")+1)
   FROM cte   
   WHERE "post" > ''                             
 ) 
 SELECT "pre" FROM cte

(tested on PostgreSQL)

The idea is now to replace the VALUES line

VALUES('C:', 'Users\fidel\Desktop\Temp' || '\')

with placeholders like

 VALUES(?, ? || '\')

which have been pre-split in the programming language that is going to run the SQL statement above against the data base.

Reading the SQLite docs, I see that substring(... from ...) has to be replaced by substr(..., ...) and position(... in ...) is to be replaced by instr(..., ...) with parameters swapped.

Very annoying for me since I wanted SQL code that runs on both PostgreSQL and SQLite.

Share:
12,163
Fidel
Author by

Fidel

Updated on June 12, 2022

Comments

  • Fidel
    Fidel almost 2 years

    Using SQLite, I'd like to split a string in the following way.

    Input string:

    C:\Users\fidel\Desktop\Temp
    

    and have the query return these rows:

    C:\
    C:\Users\
    C:\Users\fidel\
    C:\Users\fidel\Desktop\
    C:\Users\fidel\Desktop\Temp
    

    In other words, I'd like to split a file path into its constituent paths. Is there a way to do this in pure SQLite?