Segmentation fault when calling a recursive bash function

There are many reasons for a segmentation fault. The most common low-level cause is that the process tried to access a memory address which isn't defined, i.e. an invalid pointer dereference. This is often a bug in the program.

Here, you're running a shell program. The shell is a high-level programming language, without pointers, so your script can't cause an invalid pointer dereference as such.

Many programs have limited space for their call stack and die of a segmentation fault if the stack size is exceeded. In most cases, the stack size limit is large enough for any reasonable data, but an infinite recursion can blow the stack.

In bash, infinite recursion in a function call does cause a segmentation fault. (The same goes for dash and mksh; ksh and zsh are smarter and apply a maximum function call nesting depth at the shell level so that they don't segfault.)
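
As a rough illustration of the kind of nesting limit described above, a script can carry its own depth counter. This is only a sketch: `descend`, `max_depth` and the limit of 100 are made-up names and values, not anything from the script in the question. (Bash 4.2 and later also has a `FUNCNEST` variable that caps function nesting at the shell level.)

```shell
max_depth=100   # hypothetical limit, chosen arbitrarily

descend() {
  # $1 is the current nesting depth
  if [ "$1" -ge "$max_depth" ]; then
    echo "descend: nesting limit $max_depth reached" >&2
    return 1
  fi
  # Without the check above, this call would recurse until the stack overflows
  descend $(( $1 + 1 ))
}

if descend 0; then
  status=ok
else
  status=stopped
fi
echo "$status"
```

With the guard in place the function returns an error status instead of crashing the shell.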


Your script has several bugs. The one that's biting you is that in the case of a regular file, you always call recurse at the end, whereas you clearly meant to do it only for zip files.

Don't use && or || when you mean if. It's clearer to write what you mean; brevity through obscurity is not a good idea and it bit you here.

if [[ ${extension} = "zip" ]]; then
  unzip -uq "$currentItem" -d "${extractionDirectory}"
  recurse "${extractionDirectory}"
fi
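
The difference between the two forms can be seen in a small sketch, with `echo` standing in for `unzip` and `recurse`. The function names `with_and` and `with_if` are invented for the demonstration. In the `&&` form only the first command is guarded, so the second one runs for every file.

```shell
with_and() {
  [ "$1" = "zip" ] && echo "unzip"
  echo "recurse"            # not guarded: runs for every extension
}

with_if() {
  if [ "$1" = "zip" ]; then
    echo "unzip"
    echo "recurse"          # guarded: runs only for zip
  fi
}

and_txt=$(with_and txt)     # the "recurse" step still runs
if_txt=$(with_if txt)       # nothing runs
echo "with &&: '$and_txt'  with if: '$if_txt'"
```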

Another bug is that you're missing double quotes around variable substitutions, so your program will choke on file names containing whitespace (among others). Always use double quotes around variable substitutions unless you know that you need to leave them off.
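
A quick way to see the effect of the missing quotes is to count how many arguments a command receives. In this sketch `count_args` and the file name are hypothetical, and the default `IFS` is assumed.

```shell
count_args() { echo "$#"; }

name="my archive.zip"             # hypothetical file name containing a space
unquoted=$(count_args $name)      # split into "my" and "archive.zip"
quoted=$(count_args "$name")      # passed as a single argument
echo "unquoted: $unquoted, quoted: $quoted"
```

Unquoted, `unzip` would be handed two nonexistent file names instead of one real one.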

Use parameter expansion instead of calling basename and dirname. It's easier to deal with special cases (e.g. file name beginning with -) and it's faster.
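
For instance, the pieces the script needs can be derived like this. The path is a made-up example, and the sketch assumes the path contains at least one slash, which holds for anything produced by `"$1"/*`.

```shell
currentItem="/data/2016-01/WO1NWA1.zip"     # hypothetical path

directory=${currentItem%/*}                 # strip the last /component, like dirname
fileName=${currentItem##*/}                 # strip everything up to the last /, like basename
extension=${fileName##*.}                   # text after the last dot
extractionDirectory=${currentItem%.zip}     # path without the .zip suffix

printf '%s\n' "$directory" "$fileName" "$extension" "$extractionDirectory"
```

No external process is started, and a name beginning with `-` is never mistaken for an option.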

Another bug I happened to spot is that the pattern +(sh|xslt|dtd|log|txt) is clearly meant to be @(sh|xslt|dtd|log|txt) (match these extensions, not shsh, dtdtxtshdtd etc.).
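
The two operators can be compared side by side in bash. The helper names `matches_plus`, `matches_at` and `report` are invented for this sketch.

```shell
shopt -s extglob   # extended patterns; bash already enables them inside [[ ]]

matches_plus() { [[ $1 = +(sh|xslt|dtd|log|txt) ]]; }   # one or more repetitions
matches_at()   { [[ $1 = @(sh|xslt|dtd|log|txt) ]]; }   # exactly one alternative

report() {
  local plus=no at=no
  if matches_plus "$1"; then plus=yes; fi
  if matches_at "$1"; then at=yes; fi
  echo "$1: plus=$plus at=$at"
}

report txt
report shsh
report dtdtxtshdtd
report zip
```

Only `txt` matches both patterns; `shsh` and `dtdtxtshdtd` match the `+(...)` form alone, and `zip` matches neither.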

Here's the regular file case, with the bugs above fixed and rewritten with case for clarity:

case "$extension" in
  sh|xslt|dtd|log|txt) break;;  # as in the original, this leaves the enclosing for loop
  zip)
    extractionDirectory="${currentItem%.zip}"
    unzip -uq "$currentItem" -d "${extractionDirectory}"
    recurse "${extractionDirectory}"
    ;;
esac
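
Putting the pieces together, the whole function might look roughly like the sketch below. The variable name `item`, the `local` declarations and the `|| continue` after `unzip` are additions made for this sketch rather than parts of the original script. The skip list for sh, xslt and so on is left out on the assumption that files other than zip archives need no action.

```shell
recurse() {
  local item extractionDirectory
  for item in "$1"/*; do
    if [ -d "$item" ]; then
      recurse "$item"
    elif [ -f "$item" ]; then
      case "$item" in
        *.zip)
          extractionDirectory=${item%.zip}
          # Skip this archive if extraction fails, so we never recurse
          # into a directory that was not created
          unzip -uq "$item" -d "$extractionDirectory" || continue
          recurse "$extractionDirectory"
          ;;
      esac
    fi
  done
}
```

It would be started with `recurse "$PWD"`. Because only zip files trigger another call, the recursion ends once no archives are left to extract.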

Note that I haven't verified the logic or tested the code. This seems to be a complicated way of writing

find . -type f -name '*.zip' -exec sh -c 'unzip -uq "$0" -d "${0%.zip}"' {} \;
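
In that command, find substitutes each file name for `{}`, and `sh -c` receives it as `$0`; `${0%.zip}` then strips the suffix to form the target directory. The expansion can be checked without unzipping anything (the path here is made up):

```shell
# The argument after the script text becomes $0 inside sh -c
target=$(sh -c 'printf "%s\n" "${0%.zip}"' "some dir/WO1NWA1.zip")
echo "$target"
```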

Author: Noel Alex Makumuli

Updated on September 18, 2022

Comments

  • Noel Alex Makumuli over 1 year

    I have hundreds of folders containing thousands of zip files, and some of those zip files contain further zip files nested inside them, as shown in the tree below:

    start tree structure
    012016/
    ├── 2016-01
    │   └── 2016-01
    │       ├── build
    │       ├── DOC
    │       │   ├── WONWA1
    │       │   │   ├── WO1NWA1
    │       │   │   │   ├── WO2016000001NWA1.xml
    │       │   │   ├── WO1NWA1.zip
    │       │   │   ├── WO2NWA1
    │       │   │   │   ├── WO2016000002NWA1_tr.xml
    │       │   │   ├── WO2NWA1.zip
    └── 2016-01.zip
    
    end tree structure
    

    I have created a short script (below) that checks a folder and its contents recursively. If it finds any zip files, it extracts them and then continues checking the contents of the extracted folder.

    When I try to run the script below:

    recurse() {
        for i in "$1"/*;
        do
            currentItem="$i"
            extension="${currentItem##*.}"
    
            if [ -d "$i" ]; then
                #echo "dir: $i"
                recurse "$i"
            elif [ -f "$i" ];   then
                #echo "file: $i"
                #echo "ext: $extension"
    
                [[ ${extension} = +(sh|xslt|dtd|log|txt) ]] && break
    
                extractionDirectory=$(dirname $currentItem)/$(basename -s .zip $currentItem )
    
                [[ ${extension} = "zip" ]] && unzip -uq $currentItem -d "${extractionDirectory}"
    
                recurse ${extractionDirectory}
            fi
        done }
        recurse $PWD
    

    However, when I run the above script, I get the error:

    Segmentation fault (core dumped)

    • Kenpachi almost 8 years
      So basically you just recurse and unzip the archives in the same directory as the archive right?
    • Gilles 'SO- stop being evil' almost 8 years
      @MelBurslan Bash does core dump if the function call stack grows too large, which I think is what happens here due to an out-of-control recursion.
    • Gilles 'SO- stop being evil' almost 8 years
      @NoelAlexMakumuli Attempting to read from a file when fopen returned NULL is just one example among many, many, many of a segfault.
  • Noel Alex Makumuli almost 8 years
    Thanks a bunch for pointing out my bugs. The extension block is unnecessary; just putting the zip extension check in a condition solved my problem, as you pointed out:

        if [[ ${extension} = "zip" ]]; then
          unzip -uq $currentItem -d "${extractionDirectory}"
          recurse ${extractionDirectory}
        fi