Parameter expansion in variable assigned with a wildcard


The command Forward=*R1*.at.fastq sets the variable Forward to the string *R1*.at.fastq (star, capital R, digit 1, star, dot, lowercase A, etc.). Wildcards are only expanded in contexts that allow multiple words; the right-hand side of a variable assignment expects a single word, so no wildcard expansion occurs.

In a command like cat $Forward, the wildcards in the value of Forward are expanded. When a variable is expanded outside double quotes, its value is interpreted as a whitespace-delimited list of wildcard patterns, and if any pattern matches one or more files, it's replaced by the list of files.
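The difference between the assignment and the later unquoted expansion can be checked with a small experiment. This is only a sketch: the scratch directory and the variable names quoted and unquoted are made up for the demonstration, and it assumes bash with default globbing options.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so the demo does not touch real data.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch MA502_TAAGGCGA-TCGCAGG_L001_R1_001.at.fastq

# No wildcard expansion here: Forward holds the literal pattern.
Forward=*R1*.at.fastq

# Inside double quotes the value is used as-is.
quoted=$(printf '%s\n' "$Forward")
# Outside quotes the value is treated as a pattern and matched against files.
unquoted=$(printf '%s\n' $Forward)

echo "quoted:   $quoted"
echo "unquoted: $unquoted"
```

The first line printed shows the pattern itself, the second shows the matching file name.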

In ${Forward/.at/.atqt}, first the variable's value is looked up: *R1*.at.fastq. Then the text substitution is applied to this string, yielding *R1*.atqt.fastq. The result is an unquoted variable expansion so it is interpreted as a wildcard pattern. But *R1*.atqt.fastq doesn't match any file, so it's left unchanged.
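A quick way to see this, again as a sketch in a made-up scratch directory (the variable name target is an assumption for the demo, and default bash options such as nullglob being off are assumed):

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch MA502_TAAGGCGA-TCGCAGG_L001_R1_001.at.fastq

Forward=*R1*.at.fastq

# The substitution runs on the stored pattern, not on a file name.
# The resulting pattern matches nothing, so it is passed through literally.
target=$(printf '%s\n' ${Forward/.at/.atqt})

echo "$target"
```

This is why the redirection in the question ends up creating a file whose name contains the stars.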

To expand the wildcard when setting Forward, you could make it an array.

Forward=(*R1*.at.fastq)

This sets Forward to a 1-element array, the element being the string MA502_TAAGGCGA-TCGCAGG_L001_R1_001.at.fastq. The wildcard pattern is expanded to the list of matches because it's in a context (the parentheses of array assignment) where multiple words are expected.

In bash, $Forward when Forward is an array is equivalent to ${Forward[0]} — referencing an array variable with the same syntax as a scalar variable refers to the first element of the array. So you can leave your awk command unchanged.
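Putting it together for several folders might look like the following sketch. The MA503 file name, the scratch directory, and the awk '{ print }' script are placeholders invented for the demo; substitute your real awk script. Quoting the expansions keeps file names intact even if they contain spaces.

```shell
#!/usr/bin/env bash
# Scratch setup so the sketch is runnable on its own.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir MA502 MA503
touch MA502/MA502_TAAGGCGA-TCGCAGG_L001_R1_001.at.fastq
touch MA503/MA503_EXAMPLE_L001_R1_001.at.fastq   # hypothetical file name

for dir in MA*/; do
    # Array assignment: the wildcard is expanded here, at assignment time.
    Forward=("$dir"*R1*.at.fastq)

    # If nothing matched, the pattern is left as-is; skip such folders.
    [ -e "${Forward[0]}" ] || continue

    # The substitution now operates on a real file name.
    out=${Forward[0]/.at/.atqt}

    # Placeholder awk script that copies its input unchanged.
    awk '{ print }' "${Forward[0]}" > "$out"
done

ls MA502 MA503
```

The same loop body can be repeated for Reverse with the *R2*.at.fastq pattern.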


Admin
Author by

Admin

Updated on September 18, 2022

Comments

  • Admin
    Admin almost 2 years

    I have two files in my current folder (MA502) whose names are -

    MA502_TAAGGCGA-TCGCAGG_L001_R1_001.at.fastq
    MA502_TAAGGCGA-TCGCAGG_L001_R2_001.at.fastq
    

    I have many such folders - ex MA503, MA504 etc, and I want to loop over those.

    I assign my variable names using wild cards -

    Forward=*R1*.at.fastq
    Reverse=*R2*.at.fastq
    

    I want to process these files in a script, with the output file names replacing .at with .atqt, so that the final names would look like -

    MA502_TAAGGCGA-TCGCAGG_L001_R1_001.atqt.fastq
    MA502_TAAGGCGA-TCGCAGG_L001_R2_001.atqt.fastq
    

    I tried

    awk 'script' $Forward > ${Forward/.at/.atqt}
    

    My final file name looks like -

    *R1*.atqt.fastq
    

    instead of my expectation which was

    MA502_TAAGGCGA-TCGCAGG_L001_R1_001.atqt.fastq
    

    I've learnt everything by necessity on unix, so I'm not sure how variable names are processed. Any help is appreciated!