Using AWK to read line from file and create a variable


The output of the AWK command is assigned to the variable. To see the contents of the variable, do this:

echo "$INFILE"
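
As a minimal sketch of why the unescaped form matters, the snippet below compares the two spellings of the echo command. The value beta.txt is a made-up placeholder, not something from the original question.

```shell
infile="beta.txt"             # hypothetical value, for illustration only

expanded=$(echo "$infile")    # the shell substitutes the variable's value
literal=$(echo "\$infile")    # the backslash keeps the dollar sign literal

echo "$expanded"              # prints beta.txt
echo "$literal"               # prints $infile
```

With the backslash in place, the shell never looks the variable up, so the output is the variable's name rather than its contents.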

You should use single quotes around your AWK program so that the shell leaves the dollar sign alone and you don't have to escape it. Inside the AWK program, the literal string should itself be quoted (see below if you want to substitute a shell variable instead):

awk 'NR == "$Line"' /myPath/fileList.txt

The $() form is much preferred over the backtick form (I don't understand why you have the backticks escaped, by the way). Also, you should habitually use lowercase or mixed-case variable names to avoid name collisions with shell or environment variables.

infile=$(awk 'NR == "$Line"' /myPath/fileList.txt)
echo "$infile"
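
One common reason for preferring $() is nesting. The sketch below assumes a hypothetical path, /myPath/sub/fileList.txt, which does not need to exist because dirname and basename only manipulate the string.

```shell
# Nested command substitution with $(): no extra escaping is needed
nested=$(basename "$(dirname /myPath/sub/fileList.txt)")
echo "$nested"        # prints sub

# The same nesting with backticks: the inner backticks must be escaped
nested_bt=`basename \`dirname /myPath/sub/fileList.txt\``
echo "$nested_bt"     # prints sub
```

Both forms give the same result, but the backtick version becomes harder to read with every additional level of nesting.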

If you intend the value of the shell variable Line to be substituted, rather than the literal string "$Line" being used, then you should use AWK's -v option to pass the variable in:

infile=$(awk -v "line=$Line" 'NR == line' /myPath/fileList.txt)
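
To try the -v approach without the real file, here is a self-contained sketch. The temporary file and its three entries are assumptions standing in for /myPath/fileList.txt, and Line is set by hand here.

```shell
# Hypothetical sample data standing in for /myPath/fileList.txt
list=$(mktemp)
printf '%s\n' alpha.txt beta.txt gamma.txt > "$list"

Line=2    # the line number to fetch (set by hand for this demo)

# -v copies the shell value into the AWK variable "line" before the program runs
infile=$(awk -v "line=$Line" 'NR == line' "$list")

echo "$infile"    # prints beta.txt
rm -f "$list"
```

Because the AWK program stays in single quotes, the shell never touches it; the only place the shell expands anything is the -v assignment.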
Author by

ghoti


Updated on July 06, 2022

Comments

  • ghoti
    ghoti almost 2 years

    I have a text file with a list of filenames. I would like to create a variable from a specific line number using AWK. I get the correct output using:

    awk "NR==\$Line" /myPath/fileList.txt
    

    I want to assign this output to a variable and from documentation I found I expected the following to work:

    INFILE=$(awk "NR==\$Line" /myPath/fileList.txt)
    

    or

    INFILE=`awk "NR==\$Line" /myPath/fileList.txt`
    

    However,

    echo "\$INFILE" 
    

    is blank. I am new to bash scripting and would appreciate any pointers.

    • Mat
      Mat about 12 years
      Neither of those commands is supposed to give any output; they set the variable INFILE. (The first version is "better".)
    • Admin
      Admin about 12 years
      Sorry, I should have clarified. echo "\$INFILE" is blank.
    • ghoti
      ghoti about 12 years
      If you escape your dollar sign, you don't expand the INFILE variable. Try echo "$INFILE" instead.
    • Admin
      Admin about 12 years
      @ghoti: I found when I tested this by hard coding the line number that the escape was necessary. I think this is because I am submitting my script to a job scheduler.
    • Kaz
      Kaz about 12 years
      echo "\$INFILE" is definitely not blank. It means echo the characters $INFILE literally: dollar sign, followed by INFILE.
    • Admin
      Admin about 12 years
      @Kaz: Normally that is true. That's how I know that there's a problem.
    • Admin
      Admin about 12 years
      I did not find out why I need to escape the variables, this is probably an SGE issue. I did figure out how to get them to be evaluated correctly. I needed another escape in front of the first $ and the quotes removed. The following works for me: infile=\$(awk -v line=\$SGE_TASK_ID 'NR == line' /myPath/my_outfile_list.txt)
  • SourceSeeker
    SourceSeeker about 12 years
    No, that won't produce INFILE ten times. It will produce INFILE with a "0" after it. In this context, the curly braces aren't used for both parameter expansion and brace expansion - only the former. You could do INFILE=foo; echo "${INFILE}"{0..9} which would output foo0 foo1 foo2 foo3 foo4 foo5 foo6 foo7 foo8 foo9. By the way, you should always quote variables for output. And those aren't apostrophes (single quotes), they're double quotes.
  • Admin
    Admin about 12 years
    the $line variable is an environmental variable set by my job scheduler, SGE. I believe I need to escape the variable name in my submit script. I attempted infile=$(awk -v "line=$SGE_TASK_ID" 'NR == line' /myPath/fileList.txt) but am getting the same result from echo $infile.
  • SourceSeeker
    SourceSeeker about 12 years
    @Sara: Then probably my last example would be the way to go. Then you can echo "$infile" or use the variable in other ways, of course.
  • Admin
    Admin about 12 years
    I can get this to work by hard coding the line number: infile=$(awk "NR==1" /myPath/fileList.txt). But I can only see the value using echo \$infile. echo $infile is blank. I'm still working on understanding why this is.
  • Admin
    Admin about 12 years
    Thanks, your example makes sense and looks like it should work, but after following it I'm still getting the same blank result from echo "$infile".
  • Peter.O
    Peter.O about 12 years
    Dennis Williamson's example should (and does) output abc; assuming you have set SGE_TASK_ID to a suitable value, eg. SGE_TASK_ID=1; echo abc >file; infile=$(awk -v "line=$SGE_TASK_ID" 'NR == line' file); echo "$infile"
  • Admin
    Admin about 12 years
    Thanks for verifying. I suspect I have some additional problem since I cannot duplicate these results. I think it's interesting that I need to escape both my variables and the environment variable when using echo. I am wondering if the job scheduler could cause this behavior, although I cannot find this in the documentation. Any thoughts or suggestions would be appreciated.
  • SourceSeeker
    SourceSeeker about 12 years
    @Sara: I just noticed that you used the phrase "submit script" and made the connection with all the extra escaping and a light went off. You're submitting this script to some program for it to actually execute it rather than running the script in a more usual way - is that correct? I would then recommend that you make your script in a separate file and only submit the filename to the program that's executing it. Then no special escaping should be necessary.
  • Admin
    Admin about 12 years
    I found that: echo awk -v "line=\$SGE_TASK_ID" 'NR == line' /myPath/fileList.txt produces: awk -v line=undefined NR == line /myPath/fileList.txt. Adding an escape before the $ produces: awk -v line=3 NR == line /myPath/fileList.txt. This is exactly what I want to execute and assign to infile. But infile=$(awk -v "line=\$SGE_TASK_ID" 'NR == line' /myPath/fileList.txt) echo "$infile" produces a blank. The problem is definitely the variable assignment because I can hard code a number and get the correct result.
  • Admin
    Admin about 12 years
    @Dennis: Thanks for your input. My script is saved as a file and submitted to the job scheduler which uses the script to create an array of jobs. I'm not completely clear on how to apply what you're recommending to this case?
  • user unknown
    user unknown about 12 years
    @DennisWilliamson: If INFILE is 7, it will produce 70, that's what I meant. But my first impression, that the content of a line is a number, is wrong, and while I realized it in the end, I was interrupted while writing, and forgot to adjust the first example. I don't agree about "you should always ..." - that's cargo cult programming.
  • user unknown
    user unknown about 12 years
    @Sara: Do you want to echo the command, or the result of the command? I updated my answer to address Dennis's questions (the "10 times" part was really badly explained) and also added a short variation, which takes a parameter for $LINE. Note that $LINE isn't masked in my example either, but maybe I've got you completely wrong.
  • ghoti
    ghoti about 12 years
    @Sara - Why not simplify things, write your automation in your own script somewhere else in the filesystem, then have that script called by a small wrapper that you submit to your job scheduler?
  • Admin
    Admin about 12 years
    @ghoti: I'm a beginner and am not familiar with what you're describing, so this would probably not be simple for me. I hoped I was getting close to a solution here, since I just need to get awk to properly interpret my variable. Please correct me if I'm wrong and need to try a different approach.
  • ghoti
    ghoti about 12 years
    @Sara - Far be it from me to push you in a new direction if you feel you're getting close. :-) But the largest part of your struggle seems to be making things work with your scheduler. If you can avoid the issue of escaping $ characters by running a more "pure" script stored in a separate file, then the help you get here would be more about the code and less about the analysis of your scheduler. OTOH, it's great that you've attracted some folks who seem to know your environment. Good luck with it. :)
  • Admin
    Admin about 12 years
    I tried echoing the command from inside the $() to see how it was being evaluated. The variable was undefined w/ out the escape, and looked correct with the escape. I think this is because I am submitting my script through a job scheduler. Since the command looked correct using echo, I'm not sure why it's not being evaluated correctly when I put it inside the $().
  • user unknown
    user unknown about 12 years
    Which job scheduler? There are some pitfalls with cron jobs. No. 1: no path is set (by default). No. 2: your (or the crontab owner's) environment variables aren't set. Test your script without cron and specify where the problem is, or describe your scheduler problem if it only exists in the scheduler. Note that you tagged your question bash. Use env -i to test how your script behaves when it is started without an environment.
  • SourceSeeker
    SourceSeeker about 12 years
    @userunknown: It's not "cargo cult programming". It protects against unexpected behavior when the value contains whitespace. "Also unfortunately, quoting in shell programming is extremely important. It's something no one can avoid learning. Improper shell quoting is one of the most common sources of scripting bugs and security issues." The scheduler isn't cron, it's something called "SGE" (which may be the Sun Grid Engine) which Sara reveals in one of her comments.
  • SourceSeeker
    SourceSeeker about 12 years
    @Sara: If SGE is the Sun Grid Engine, you may find the information and example scripts here to be instructive. Basically, you create a shell script file with its first line as #!/bin/sh or #!/bin/bash and put the rest of your script after that without any extra escaping (read: "perhaps almost none"). Save the file with a name you choose and in a directory that the scheduler has access to. Then submit /dir/where/script/is/scriptname (substituting the actual path and file name) to the scheduler.
  • SourceSeeker
    SourceSeeker about 12 years
    You'll notice that the example scripts at the link don't have any escaping.
  • Admin
    Admin about 12 years
    @Dennis: That's correct, I'm using Sun Grid Engine as my job scheduler. I do have these lines in my script, so I'm not sure why I'm still needing to use escapes. I noticed the online examples did not use them while examples from another user on my compute cluster did. I will look into whether this could be system specific or if there is another error in my submit script.
  • user unknown
    user unknown about 12 years
    @DennisWilliamson: Quoting everything is often a technique to avoid learning when it is necessary to quote and when it is not. I often see people quoting literal constants which don't need a quote.
  • SourceSeeker
    SourceSeeker about 12 years
    @userunknown: You'll note that I said "on output". I see lots of unnecessary quoting, too, however it's safer to quote variables than not and it doesn't hurt to do so. I certainly know when it's safe not to. Perhaps the warning should be modified to "quote all variables until you learn when you don't have to and don't come crying to me when you don't quote one and it bites you." ;-)
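
The advice in the comments above (put the logic in an ordinary script file, with no extra escaping, and submit that file to the scheduler) could be sketched roughly as follows. This is only an illustration under assumptions: pick_line is a hypothetical helper name, the temporary list stands in for the real file list, and the fallback to 1 exists only so the script can be dry-run outside the scheduler.

```shell
#!/bin/bash
# pick_line LIST N: print line N of the file LIST (hypothetical helper name)
pick_line() {
    awk -v "line=$2" 'NR == line' "$1"
}

# Sample list for a local dry run; a real job would point at its own file list
list=$(mktemp)
printf '%s\n' alpha.txt beta.txt gamma.txt > "$list"

# SGE sets SGE_TASK_ID for each array task; fall back to 1 outside the scheduler
task=${SGE_TASK_ID:-1}

infile=$(pick_line "$list" "$task")
echo "$infile"
rm -f "$list"
```

Saved as a file, a script like this would typically be submitted as an array job (for example with qsub -t 1-3 followed by the script's path), and each task would then pick up its own line. Whether your cluster needs additional options is something to check against its own documentation.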