Split a string into rows using pure SQLite
Solution 1
This is possible with a recursive common table expression:
WITH RECURSIVE split(s, last, rest) AS (
VALUES('', '', 'C:\Users\fidel\Desktop\Temp')
UNION ALL
SELECT s || substr(rest, 1, 1),
substr(rest, 1, 1),
substr(rest, 2)
FROM split
WHERE rest <> ''
)
SELECT s
FROM split
WHERE rest = ''
OR last = '\';
(You did not ask for a reasonable way.)
Solution 2
Recursive CTE:
WITH RECURSIVE cte(org, part, rest, pos) AS (
VALUES('C:\Users\fidel\Desktop\Temp', '','C:\Users\fidel\Desktop\Temp'|| '\', 0)
UNION ALL
SELECT org,
SUBSTR(org,1, pos + INSTR(rest, '\')),
SUBSTR(rest, INSTR(rest, '\')+1),
pos + INSTR(rest, '\')
FROM cte
WHERE INSTR(rest, '\') > 0
)
SELECT *
FROM cte
WHERE pos <> 0
ORDER BY pos;
Output:
╔═════════════════════════════╗
║ part ║
╠═════════════════════════════╣
║ C:\ ║
║ C:\Users\ ║
║ C:\Users\fidel\ ║
║ C:\Users\fidel\Desktop\ ║
║ C:\Users\fidel\Desktop\Temp ║
╚═════════════════════════════╝
How it works:
org - original string does not change
part - simply `LEFT` equivalent of original string taking pos number of chars
rest - simply `RIGHT` equivalent, rest of org string
pos - position of first `\` in the rest
Trace:
╔══════════════════════════════╦══════════════════════════════╦════════════════════════════╦═════╗
║ org ║ part ║ rest ║ pos ║
╠══════════════════════════════╬══════════════════════════════╬════════════════════════════╬═════╣
║ C:\Users\fidel\Desktop\Temp ║ C:\ ║ Users\fidel\Desktop\Temp\ ║ 3 ║
║ C:\Users\fidel\Desktop\Temp ║ C:\Users\ ║ fidel\Desktop\Temp\ ║ 9 ║
║ C:\Users\fidel\Desktop\Temp ║ C:\Users\fidel\ ║ Desktop\Temp\ ║ 15 ║
║ C:\Users\fidel\Desktop\Temp ║ C:\Users\fidel\Desktop\ ║ Temp\ ║ 23 ║
║ C:\Users\fidel\Desktop\Temp ║ C:\Users\fidel\Desktop\Temp ║ ║ 28 ║
╚══════════════════════════════╩══════════════════════════════╩════════════════════════════╩═════╝
Solution 3
If you want to search for the values individually, use the code below:
WITH RECURSIVE split(content, last, rest) AS (
VALUES('', '', 'value1§value2§value3§value4§value5§value6§value7')
UNION ALL
SELECT
CASE WHEN last = '§'
THEN
substr(rest, 1, 1)
ELSE
content || substr(rest, 1, 1)
END,
substr(rest, 1, 1),
substr(rest, 2)
FROM split
WHERE rest <> ''
)
SELECT
REPLACE(content, '§','') AS 'ValueSplit'
FROM
split
WHERE
last = '§' OR rest ='';
Result:
**ValueSplit**
value1
value2
value3
value4
value5
value6
value7
I hope I can help people with the same problem.
Solution 4
There's simpler alternative to the recursive CTE, that also can be applied to a number of file paths in a result set (or generally any delimited strings that you want to "split" into multiple rows by a separator).
SQLite has JSON1 extension. It's compatible with SQLite >= 3.9.0 (2015-10-14), but sqlite3
is almost always compiled with it now (e.g. Ubuntu, Debian, official Python Docker images and so on, and you can check it with PRAGMA compile_options
and this answer has a little more detail on it).
JSON1 has json_each
, which is one of the two table-valued functions in the extension that:
walk the JSON value provided as their first argument and return one row for each element.
Hence if you can turn your string into a JSON array string, this function will do the rest. And it's not hard to do.
const sql = `
WITH input(filename) AS (
VALUES
('/etc/redis/redis.conf'),
('/run/redis/redis-server.pid'),
('/var/log/redis-server.log')
), tmp AS (
SELECT
filename,
'["' || replace(filename, '/', '", "') || '"]' as filename_array
FROM input
)
SELECT (
SELECT group_concat(ip.value, '/')
FROM json_each(filename_array) ip
WHERE ip.id <= p.id
) AS path
FROM tmp, json_each(filename_array) AS p
WHERE p.id > 1 -- because the filenames start with the separator
`
async function run() {
const wasmUrl = 'https://cdnjs.cloudflare.com/ajax/libs/sql.js/1.5.0/sql-wasm.wasm'
const sqljs = await window.initSqlJs({locateFile: file => wasmUrl})
const db = new sqljs.Database()
const results = db.exec(sql)
ko.applyBindings(results[0])
}
run()
<script src="https://cdnjs.cloudflare.com/ajax/libs/knockout/3.4.2/knockout-min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/sql.js/1.5.0/sql-wasm.min.js"></script>
<table>
<thead>
<tr data-bind="foreach: columns"><th data-bind="text: $data"></th></tr>
</thead>
<tbody data-bind="foreach: values">
<tr data-bind="foreach: $data"><td data-bind="text: $data"></td></tr>
</tbody>
</table>
Solution 5
Inspired from Lukasz Szozda's answer:
WITH RECURSIVE cte("pre","post") AS (
VALUES('C:', 'Users\fidel\Desktop\Temp' || '\')
UNION ALL
SELECT "pre" || '\' || left("post", position('\' in "post")-1),
substring("post" from position('\' in "post")+1)
FROM cte
WHERE "post" > ''
)
SELECT "pre" FROM cte
(tested on PostgreSQL)
The idea is now to replace the VALUES line
VALUES('C:', 'Users\fidel\Desktop\Temp' || '\')
with placeholders like
VALUES(?, ? || '\')
which have been pre-split in the programming language that is going to run the SQL statement above against the data base.
Reading the SQLite docs, I see that substring(... from ...)
has to be replaced by substr(..., ...)
and position(... in ...)
is to be replaced by instr(..., ...)
with parameters swapped.
Very annoying for me since I wanted SQL code that runs on both PostgreSQL and SQLite.
Fidel
Updated on June 12, 2022Comments
-
Fidel almost 2 years
Using
SQLite
, I'd like to split a string in the following way.Input string:
C:\Users\fidel\Desktop\Temp
and have the query return these rows:
C:\ C:\Users\ C:\Users\fidel\ C:\Users\fidel\Desktop\ C:\Users\fidel\Desktop\Temp
In other words, I'd like to split a file path into its constituent paths. Is there a way to do this in pure SQLite?