Shell valid function name characters
Solution 1
Since the POSIX documentation allows it as an extension, nothing prevents an implementation from doing so.
A simple check (run in zsh):
$ for shell in /bin/*sh 'busybox sh'; do
printf '[%s]\n' $shell
$=shell -c 'á() { :; }'
done
[/bin/ash]
/bin/ash: 1: Syntax error: Bad function name
[/bin/bash]
[/bin/dash]
/bin/dash: 1: Syntax error: Bad function name
[/bin/ksh]
[/bin/lksh]
[/bin/mksh]
[/bin/pdksh]
[/bin/posh]
/bin/posh: á: invalid function name
[/bin/yash]
[/bin/zsh]
[busybox sh]
sh: syntax error: bad function name
shows that bash, zsh, yash, ksh93 (which ksh is linked to on my system), pdksh and its derivatives allow multibyte characters in function names.
yash was designed to support multibyte characters from the beginning, so it is no surprise that it works.
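The loop above relies on zsh's `$=shell` to split `'busybox sh'` into two words. A minimal portable sketch of the same probe follows; the helper name `try_name` and the list of shells are assumptions for illustration, not part of the original test.

```shell
#!/bin/sh
# Portable variant of the zsh loop: in POSIX sh an unquoted $1 is
# word-split, so 'busybox sh' runs busybox with the argument sh.
try_name() {
    # $1: shell command line (may contain a space), $2: candidate function name
    # $1 is intentionally left unquoted so that it splits into words.
    if $1 -c "$2() { :; }" 2>/dev/null; then
        printf '%s: accepts %s\n' "$1" "$2"
    else
        printf '%s: rejects %s\n' "$1" "$2"
    fi
}

# Hypothetical candidate list; adjust it to the shells on your system.
for shell in bash dash ksh mksh yash zsh 'busybox sh'; do
    # Skip shells that are not installed (the first word is the executable).
    command -v "${shell%% *}" >/dev/null 2>&1 || continue
    try_name "$shell" 'á'
done
```

Note that a shell that cannot be started at all is also reported as rejecting the name, which is why the loop skips missing executables first.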
Other documentation you can refer to is that of ksh93:
A blank is a tab or a space. An identifier is a sequence of letters, digits, or underscores starting with a letter or underscore. Identifiers are used as components of variable names. A vname is a sequence of one or more identifiers separated by a . and optionally preceded by a .. Vnames are used as function and variable names. A word is a sequence of characters from the character set defined by the current locale, excluding non-quoted metacharacters.
So setting the locale to C:
$ export LC_ALL=C
$ á() { echo 1; }
ksh: á: invalid function name
makes it fail.
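The locale dependence can be checked non-interactively with a sketch like the one below. The helper `accepts_in_locale` is a hypothetical name, and `C.UTF-8` is an assumed locale name that may not exist on every system (`locale -a` lists the available ones).

```shell
#!/bin/sh
# Succeeds if shell $1 accepts function name $3 when run with LC_ALL=$2.
accepts_in_locale() {
    LC_ALL=$2 "$1" -c "$3() { :; }" 2>/dev/null
}

# C.UTF-8 is an assumption; substitute a UTF-8 locale from `locale -a`.
for loc in C C.UTF-8; do
    if accepts_in_locale ksh "$loc" 'á'; then
        echo "ksh, LC_ALL=$loc: accepted"
    else
        echo "ksh, LC_ALL=$loc: rejected"
    fi
done
```

If ksh is not installed, both lines report a rejection, so check that the shell exists before reading anything into the result.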
Solution 2
Note that functions share the same namespace as other commands including commands in the file system, which on most systems have no limitation on the characters or even bytes they may contain in their path.
So while most shells restrict the characters allowed in function names, there's no really good reason for them to do so. It means that in those shells, there are commands you can't replace with a function.
zsh and rc allow anything in their function names, including names containing / and the empty string. zsh even allows NUL bytes.
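The shared namespace can be seen with a name that every shell accepts. This is a minimal sketch: the function body and the string `fake-os` are made up for illustration.

```shell
#!/bin/sh
# A function with a portable name shadows the external utility of the same name.
uname() { echo "fake-os"; }

uname            # runs the function, prints fake-os
command uname    # bypasses function lookup and runs the real utility
```

A command whose name the shell refuses as a function name, such as one containing `/` in most shells, cannot be wrapped this way.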
$ zsh
$ $'\0'() echo nul
$ ^@
nul
$ ""() uname
$ ''
Linux
$ /bin/ls() echo test
$ /bin/ls
test
A simple command in shell is a list of arguments, and the first argument is used to derive the command to execute. So it is only logical that those arguments and function names share the same possible values, and in zsh arguments to builtins and functions can be any byte sequence.
There's no security issue here, as the functions you (the script author) define are the ones you invoke.
Where there may be security issues is when the parsing is affected by the environment, for instance with shells where the set of valid function names is affected by the locale.
Admin
Updated on September 18, 2022
Comments
-
Admin almost 2 years
Using extended Unicode characters is (no doubt) useful for many users.
Simpler shells (ash (busybox), dash) and ksh do fail with:
tést() { echo 34; } tést
But bash, mksh, lksh, and zsh seem to allow it.
I am aware that valid POSIX function names use this definition of Name, which corresponds to this regex:
[a-zA-Z_][a-zA-Z0-9_]*
However, in the first link it is also said:
An implementation may allow other characters in a function name as an extension.
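The portable rule expressed by the regex above can be sketched as a shell check. The function name `is_posix_name` is hypothetical, and the sample names are only illustrations.

```shell
#!/bin/sh
# Succeeds if $1 matches [a-zA-Z_][a-zA-Z0-9_]*, the portable Name syntax.
# The body runs in a subshell so the LC_ALL change does not leak out.
is_posix_name() (
    LC_ALL=C   # keep ranges like a-z from matching accented letters
    case $1 in
        '' | [!a-zA-Z_]* | *[!a-zA-Z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
)

for name in foo _bar2 2fast 'tést' 'a-b'; do
    if is_posix_name "$name"; then
        echo "$name: portable"
    else
        echo "$name: not portable"
    fi
done
```

A name that fails this check may still work in a given shell as an extension; it is just not guaranteed by POSIX.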
The questions are:
- Is this accepted and documented?
- Where?
- For which shells (if any)?
Related questions:
Its possible use special characters in a shell function name?
I am not interested in using meta-characters (>) in function names.
Upstart and bash function names containing “-”
I do not believe that an operator (subtraction "-") should be part of a name.
-
mr.spuratic over 8 years
One may play games in bash too, starting with function /bin/sh { echo "$0: $FUNCNAME: Permission denied"; return 126; }, and potentially useful things too with functions named --, //, @ or % etc.
-
mikeserv over 8 years
but don't shells tend to bypass a hash-table lookup when / is found in a name? and a function isn't just an executable name - it's code. i would think a simple implementation could encounter a lot of parse problems if its stored function names included metacharacters.
-
schily about 6 years
posh isn't worth including in such a list. It depends on Linux-specific bugs in libc and will not work on other platforms.
-
schily about 6 years
I cannot reproduce your claims about ksh93 using a self-compiled ksh93 built from the original sources. While ksh88 seems to accept non-7-bit-ASCII letters in function names, only the ksh93 binary from Ubuntu seems to accept them.
-
cuonglm about 6 years
@schily The ksh I used in this test is the binary from Debian (so it may be the same as the one on Ubuntu).