Reading from a CSV file and extracting certain data columns based on the first column value

Solution 1

(
FOR /F "tokens=1-18* delims=," %%A IN (%_InputFile%) DO (
  if "%%~A"=="RH" echo %%~B
  if "%%~A"=="SH" echo %%~D
 )
)>youroutputfilename

This should work; there is no need to assign all the values to different variables. If you do plan to use the variables, then:

FOR /F "tokens=1-18* delims=," %%A IN (%_InputFile%) DO (
...
Set _var17=%%Q
Set _var18=%%R
CALL :PROCESS
)
...
GOTO :EOF

:PROCESS
IF %_var1%=="RH" echo %_var2%
IF %_var1%=="SH" echo %_var4%
GOTO :EOF

Note that with this method, because you assign %%x to _varx, any quotes enclosing %%x are INCLUDED in the value assigned. To remove the enclosing quotes (if they exist), use SET _varx=%%~x.
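A minimal sketch of the difference between %%A and %%~A. The file sample.csv and the variable names _raw and _clean are hypothetical, used only for this demonstration:

```batch
@ECHO OFF
SETLOCAL
:: sample.csv is a hypothetical one-line file created only for this demonstration
>"%TEMP%\sample.csv" ECHO "RH",2013/06/15 02:14:58 -0400,"X"
:: _raw receives the token as-is; _clean receives it with enclosing quotes removed
FOR /F "usebackq tokens=1-3 delims=," %%A IN ("%TEMP%\sample.csv") DO (
  SET _raw=%%A
  SET _clean=%%~A
)
ECHO raw:   %_raw%
ECHO clean: %_clean%
DEL "%TEMP%\sample.csv"
```

The first ECHO should show the value with its quotes ("RH"), the second without them (RH).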


Addendum 20130703-1956Z for OP's problem

@ECHO OFF
SETLOCAL
SET _Inputfile=u:\noname1.txt
(
FOR /F "tokens=1-18* delims=," %%A IN (%_InputFile%) DO (
  SET "RH="
  SET "SH="
  ECHO(%%A|FINDSTR /l /c:"\"RH\"" >NUL
  IF NOT ERRORLEVEL 1 SET RH=Y
  ECHO(%%A|FINDSTR /l /c:"\"SH\"" >NUL
  IF NOT ERRORLEVEL 1 SET SH=Y
  if DEFINED RH echo %%~B
  if DEFINED SH echo %%~D
 )
)>u:\youroutputfilename
TYPE u:\youroutputfilename
del u:\youroutputfilename
echo========First way

(
FOR /F "tokens=1-18* delims=," %%A IN (%_InputFile%) DO (
  SET _var1=%%A
  SET "RH="
  SET "SH="
  CALL :process
  if DEFINED RH echo %%~B
  if DEFINED SH echo %%~D
 )
)>u:\youroutputfilename

TYPE u:\youroutputfilename
del u:\youroutputfilename
echo========Second way

SETLOCAL ENABLEDELAYEDEXPANSION 
(
FOR /F "tokens=1-18* delims=," %%A IN (%_InputFile%) DO (
  SET _var1=%%A
  IF "!_var1:~-4!"==""RH"" echo %%~B
  IF "!_var1:~-4!"==""SH"" echo %%~D
 )
)>u:\youroutputfilename

TYPE u:\youroutputfilename
del u:\youroutputfilename
echo========Third way
ENDLOCAL

GOTO :EOF

:process
IF "%_var1:~-4%"==""RH"" SET RH=Y
IF "%_var1:~-4%"==""SH"" SET SH=Y
GOTO :EOF

Solution 2

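You have a parsing issue: %_var1% inside the parenthesised block is expanded when the whole block is parsed, before any Set has run. First close the for loop with ); after that you can use the new variables, which then hold the values from the last line read: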

@echo off
:: Set input file in variable
::Set _InputFile=%1

:: Store input line into different variables
FOR /F "tokens=1-18* delims=," %%A IN (%_InputFile%) DO (
    Set "_var1=%%A"
    Set "_var2=%%B"
    Set "_var3=%%C"
    Set "_var4=%%D"
    Set "_var5=%%E"
    Set "_var6=%%F"
    Set "_var7=%%G"
    Set "_var8=%%H"
    Set "_var9=%%I"
    Set "_var10=%%J"
    Set "_var11=%%K"
    Set "_var12=%%L"
    Set "_var13=%%M"
    Set "_var14=%%N"
    Set "_var15=%%O"
    Set "_var16=%%P"
    Set "_var17=%%Q"
    Set "_var18=%%R"
)

IF "%_var1%"=="RH" echo %_var2%

Solution 3

You need to enable delayed expansion, so that the variables are read when each line executes rather than when the block is parsed:

@echo off

setlocal EnableDelayedExpansion

set "_InputFile=..."

for /f "tokens=1-18* delims=," %%A in (%_InputFile%) do (
  Set "_var1=%%~A"
  Set "_var2=%%~B"
  ...

  if "!_var1!"=="RH" echo !_var2!
)
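A small sketch of why the ! form matters inside a parenthesised block. The variable _val and the loop values are hypothetical, chosen only to make the timing visible:

```batch
@ECHO OFF
SETLOCAL EnableDelayedExpansion
SET _val=before
:: The percent form is substituted once, when the whole block is parsed.
:: The exclamation-mark form is substituted each time the line runs.
FOR %%N IN (1 2) DO (
  SET _val=line%%N
  ECHO percent: %_val%
  ECHO bang:    !_val!
)
```

Both "percent:" lines should print before, while the "bang:" lines should print line1 and then line2.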

Solution 4

As nobody answered the "why does my first line start with garbage characters before "RH"" part, I'll do some gravedigging.

So, the garbage characters come from the BOM (Byte Order Mark), which indicates that the file is encoded in UTF and, where it matters, the order in which the bytes are written. As for the fix, you can use:

if x%_var1:RH=%x NEQ x%_var1%x (echo %_var2%)

This checks whether RH occurs anywhere in %_var1%: if the value is unchanged after replacing RH with nothing, RH is not in the variable. Whether the BOM is present or not is therefore unimportant. You will have problems, though, if you need an exact match.
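Inside a FOR block the same test needs delayed expansion. A sketch under stated assumptions: bomtest.csv is a hypothetical sample file, and the leading ## merely stands in for the BOM bytes, since writing a real BOM from a batch file is awkward:

```batch
@ECHO OFF
SETLOCAL EnableDelayedExpansion
:: bomtest.csv is a hypothetical sample; the leading ## stands in for the BOM bytes
>"%TEMP%\bomtest.csv" ECHO ##"RH",2013/06/15 02:14:58 -0400,"X"
>>"%TEMP%\bomtest.csv" ECHO "FH",01
:: Echo column 2 only when removing RH from column 1 changes its value
FOR /F "usebackq tokens=1,2 delims=," %%A IN ("%TEMP%\bomtest.csv") DO (
  SET "_var1=%%A"
  IF "!_var1:RH=!" NEQ "!_var1!" ECHO %%~B
)
DEL "%TEMP%\bomtest.csv"
```

Only the first record should produce output (its second column). Keep in mind that the substitution is case-insensitive and matches RH anywhere in the field, so this is a containment test, not an exact match.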

Another way to deal with it is to keep the BOM out of your file: save it either as ASCII or as UTF-8 without BOM, or use a tool to strip the BOM from your UTF-8 file.

Author by

user2550880

Updated on February 05, 2020

Comments

  • user2550880 about 4 years

    This is my first batch program and I have been searching online but still struggling to write up a solution.

    I have the following CSV file:

    "RH",2013/06/15 02:14:58 -0400,"X","LQ3SUEEWPWKL6",005,
    "FH",01
    "SH",2013/06/14 00:00:00 -0400,2013/06/14 23:59:59 -0400,"LQ3SUEEWPWKL6",""
    "CH","TransactionID","InvoiceID", 
    ......
    

    I'm trying to write a simple program to do the following:

    • If column1 = "RH", then extract column2 value (2013/06/15 02:14:58 -0400)
    • If column1 = "SH", then extract column4 value (LQ3SUEEWPWKL6)

    and pipe output to a file.


    This is my code so far, but the IF condition is not working for me:

    @echo off
    :: Set input file in variable
    ::Set _InputFile=%1
    
    :: Store input line into different variables
    FOR /F "tokens=1-18* delims=," %%A IN (%_InputFile%) DO (
    Set _var1=%%A
    Set _var2=%%B
    Set _var3=%%C
    Set _var4=%%D
    Set _var5=%%E
    Set _var6=%%F
    Set _var7=%%G
    Set _var8=%%H
    Set _var9=%%I
    Set _var10=%%J
    Set _var11=%%K
    Set _var12=%%L
    Set _var13=%%M
    Set _var14=%%N
    Set _var15=%%O
    Set _var16=%%P
    Set _var17=%%Q
    Set _var18=%%R
    
    
    IF "%_var1%"=="RH" echo %var2%
    
    )
    

    My CSV file looks fine in Excel and Notepad, but when I execute the script to display the first variable, it looks like there are some garbage characters just before the "RH" on the first record. I cannot bypass it, since I need to extract additional column data if var1 = "RH":

    "RH"
    FH
    01
    SH
    CH
    TransactionID,PaymentTrackingID,
    SF
    SF
    SC
    RF
    CAD,CR,0
    RF
    USD,CR,0
    RC
    FF