Regex to tell whether a string is a Linux file path or only contains a Linux file path as part of the string


Solution 1

Regex for full Linux filesystem paths can be:

^(/[^/ ]*)+/?$

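As a quick check, the pattern above can be run against the three strings from the question. This is a minimal Python sketch; the helper name is_full_path is made up for illustration and is not part of the original answer.

```python
import re

# Pattern from Solution 1: one or more "/segment" groups, where a segment
# contains neither a slash nor a space, plus an optional trailing slash.
FULL_PATH = re.compile(r"^(/[^/ ]*)+/?$")


def is_full_path(text: str) -> bool:
    """Return True when the whole string looks like a space-free absolute path."""
    return FULL_PATH.match(text) is not None


samples = [
    "/home/user/Documents/foo.log",
    "/home/user/Documents/foo.log was written",
    "the file /home/user/Documents/foo.log was written",
]
for s in samples:
    print(is_full_path(s), repr(s))
```

Only the first string is accepted. Because a segment may be empty, the pattern also accepts a lone / and repeated slashes such as /home//user.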

Solution 2

The only character not allowed in a Linux (or Unix) file path is the ASCII NUL character \0. It is excluded because it serves as the string terminator (here, the path name terminator) in the open(2) system call, so exactly one can appear, at the end, and it does not count as part of the name. Old Unix systems disallowed several consecutive / characters, so for them the right regexp would be (\/?[^\0/])+|\/ (a sequence of an optional slash followed by a character that is neither NUL nor a slash, or the / entry alone, indicating the root directory). That allows every character except ASCII NUL and does not allow two slashes to appear together. Recent implementations allow runs of slashes (collapsing them into one), so the valid path regexp would be [^\0]+.
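The difference between the two patterns shows up on paths with repeated slashes. Here is a minimal Python sketch, assuming re.fullmatch to force the whole string to match and writing NUL as \x00 (the same character as \0). The names STRICT, PERMISSIVE and matches are illustrative only.

```python
import re

# Strict form: no two slashes in a row (or the root directory alone).
STRICT = re.compile(r"(/?[^\x00/])+|/")
# Permissive form: any non-empty string without a NUL character.
PERMISSIVE = re.compile(r"[^\x00]+")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True when the pattern matches the entire string."""
    return pattern.fullmatch(text) is not None


for path in ["/", "/home/user", "/home//user", "/home/user/", "bad\x00name"]:
    print(repr(path), matches(STRICT, path), matches(PERMISSIVE, path))
```

The strict form rejects /home//user and, as written, also a trailing slash such as /home/user/, while the permissive form accepts both. Neither accepts a string containing NUL.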

But this matches every input you have shown (in fact it matches the whole input as a single file path, since \n characters are allowed in a filename), so you will have to be more precise about what you want to accept and what you don't. "foo.log was written" and "the file " (with that final space) are valid filenames in Linux (and in Unix). What about other control characters? What about escape sequences, wildcard characters (like * or ?), etc.?
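To make the point concrete, here is a small Python sketch (the variable names are illustrative) that applies [^\0]+ to the three strings from the question, plus one string containing a newline. All of them match in full, so the pattern alone cannot separate a bare path from a sentence that mentions one.

```python
import re

# Any non-empty run of characters other than NUL (\x00 is the same as \0).
ANY_PATH = re.compile(r"[^\x00]+")

inputs = [
    "/home/user/Documents/foo.log",
    "/home/user/Documents/foo.log was written",
    "the file /home/user/Documents/foo.log was written",
    "first line\nsecond line",  # a newline is a legal filename character
]

# Each entry records whether the entire string was accepted as one path.
results = [ANY_PATH.fullmatch(text) is not None for text in inputs]
for ok, text in zip(results, inputs):
    print(ok, repr(text))
```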

Solution 3

(/)+[a-zA-Z0-9\-_/ ]*(\.log)

or

(/)+[a-zA-Z0-9\-_/ ]*(\.cpp) to match a C++ file path in a string. It may help.
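A pattern of this shape is meant for finding a path inside a longer string, so it is used with a search rather than a whole-string match. The Python sketch below assumes the dot before the extension is meant literally (written \.) and that the hyphen in the character class is escaped as \-. The helper find_log_path is hypothetical.

```python
import re
from typing import Optional

# Assumption: literal dot before "log", literal hyphen inside the class.
LOG_PATH = re.compile(r"/+[a-zA-Z0-9\-_/ ]*\.log")


def find_log_path(text: str) -> Optional[str]:
    """Return the first substring that looks like a .log path, or None."""
    m = LOG_PATH.search(text)
    return m.group(0) if m else None


print(find_log_path("the file /home/user/Documents/foo.log was written"))
# -> /home/user/Documents/foo.log
print(find_log_path("/var/tmp/catalog"))
# -> None
```

With an unescaped dot, the second call would return /var/tmp/catalog instead, because . would match the "a" before "log".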

Author by

jgr208

Updated on August 30, 2020

Comments

  • jgr208
    jgr208 over 3 years

    I am writing a regular expression to tell whether a string is, in its entirety, a file path on a Linux system, as opposed to containing such a path as only part of the string. So when the file path is the whole string I want a match, but when the file path is just part of the string I don't want a match. For example, I want the following string to be a match

    /home/user/Documents/foo.log

    and this string not be a match

    /home/user/Documents/foo.log was written

    as well as this string not be a match

    the file /home/user/Documents/foo.log was written

    The only thing I have been able to come up with so far is this,

    ^(\/*)

    This only says, OK, you have a slash followed by a character, but I am not sure what else to use to get the regular expression to work as I would like. Does anyone have any input on how to expand my regular expression so that it does what I am looking for?

    EDIT

    Spaces are not allowed in file names under the naming convention. Yes, a user could include a space since it is a Linux system, but that would be a user error.