Split a string containing command-line parameters into a String[] in Java

Solution 1

Here is a fairly easy way to split a line of text from a file into an argument vector that you can feed to your options parser:

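import org.apache.tools.ant.types.Commandline;

public static void main(String[] args) {
    // Split one line into arguments; quoted sections stay together.
    String[] myArgs = Commandline.translateCommandline("-a hello -b world -c \"Hello world\"");
    for (String arg : myArgs) {
        System.out.println(arg);
    }
}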

The Commandline class is part of Ant. You can either put Ant on the classpath or copy just the Commandline class, since the method used here is static.

Solution 2

If you only need to support UNIX-like OSes, there is an even better solution. Unlike Ant's Commandline, ArgumentTokenizer from DrJava is more sh-like: it supports escapes!

Seriously, even something insane like sh -c 'echo "\"un'\''kno\"wn\$\$\$'\'' with \$\"\$\$. \"zzz\""' gets properly tokenized into [sh, -c, echo "\"un'kno\"wn\$\$\$' with \$\"\$\$. \"zzz\""]. (By the way, when run, this command outputs "un'kno"wn$$$' with $"$$. "zzz".)
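To make the sh-like rules concrete, here is a minimal sketch of a splitter that honors backslash escapes. It is not DrJava's ArgumentTokenizer; the class name ShellSplit and its split method are hypothetical, and it assumes only a subset of sh quoting: single quotes are fully literal, a backslash inside double quotes escapes only \ " $ and `, and an unquoted backslash makes the next character literal. No expansion of any kind is attempted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not DrJava's ArgumentTokenizer.
final class ShellSplit {
    static List<String> split(String line) {
        List<String> args = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inToken = false; // true once an argument has started, even an empty one such as ""
        char quote = 0;          // 0 = unquoted, otherwise the active quote character
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (quote == '\'') {
                // Single quotes: everything is literal up to the closing quote.
                if (c == '\'') {
                    quote = 0;
                } else {
                    cur.append(c);
                }
            } else if (quote == '"') {
                if (c == '"') {
                    quote = 0;
                } else if (c == '\\' && i + 1 < line.length()
                        && "\\\"$`".indexOf(line.charAt(i + 1)) >= 0) {
                    // Inside double quotes a backslash only escapes \ " $ and `.
                    cur.append(line.charAt(++i));
                } else {
                    cur.append(c);
                }
            } else if (c == '\\') {
                // Unquoted backslash: the next character is taken literally.
                if (i + 1 >= line.length()) {
                    throw new IllegalArgumentException("trailing backslash in " + line);
                }
                cur.append(line.charAt(++i));
                inToken = true;
            } else if (c == '\'' || c == '"') {
                quote = c;
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                // Unquoted whitespace ends the current argument, if one has started.
                if (inToken) {
                    args.add(cur.toString());
                    cur.setLength(0);
                    inToken = false;
                }
            } else {
                cur.append(c);
                inToken = true;
            }
        }
        if (quote != 0) {
            throw new IllegalArgumentException("unbalanced quotes in " + line);
        }
        if (inToken) {
            args.add(cur.toString());
        }
        return args;
    }
}
```

Calling ShellSplit.split("a\\ b \"c \\\"d\\\"\" ''") in Java source, for instance, should give three arguments: a b, then c "d", then an empty string.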

Solution 3

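If you would rather not depend on Ant, the method can be copied into your own code. It needs only two classes from java.util:

import java.util.ArrayList;
import java.util.StringTokenizer;

/**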
 * [code borrowed from ant.jar]
 * Crack a command line.
 * @param toProcess the command line to process.
 * @return the command line broken into strings.
 * An empty or null toProcess parameter results in a zero sized array.
 */
public static String[] translateCommandline(String toProcess) {
    if (toProcess == null || toProcess.length() == 0) {
        //no command? no string
        return new String[0];
    }
    // parse with a simple finite state machine

    final int normal = 0;
    final int inQuote = 1;
    final int inDoubleQuote = 2;
    int state = normal;
    final StringTokenizer tok = new StringTokenizer(toProcess, "\"\' ", true);
    final ArrayList<String> result = new ArrayList<String>();
    final StringBuilder current = new StringBuilder();
    boolean lastTokenHasBeenQuoted = false;

    while (tok.hasMoreTokens()) {
        String nextTok = tok.nextToken();
        switch (state) {
        case inQuote:
            if ("\'".equals(nextTok)) {
                lastTokenHasBeenQuoted = true;
                state = normal;
            } else {
                current.append(nextTok);
            }
            break;
        case inDoubleQuote:
            if ("\"".equals(nextTok)) {
                lastTokenHasBeenQuoted = true;
                state = normal;
            } else {
                current.append(nextTok);
            }
            break;
        default:
            if ("\'".equals(nextTok)) {
                state = inQuote;
            } else if ("\"".equals(nextTok)) {
                state = inDoubleQuote;
            } else if (" ".equals(nextTok)) {
                if (lastTokenHasBeenQuoted || current.length() != 0) {
                    result.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(nextTok);
            }
            lastTokenHasBeenQuoted = false;
            break;
        }
    }
    if (lastTokenHasBeenQuoted || current.length() != 0) {
        result.add(current.toString());
    }
    if (state == inQuote || state == inDoubleQuote) {
        throw new RuntimeException("unbalanced quotes in " + toProcess);
    }
    return result.toArray(new String[result.size()]);
}
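The state machine above is driven by a StringTokenizer that returns its delimiters as tokens. As a small illustration (the class name TokenStreamDemo and the sample input are made up for this sketch), the following lists the raw token stream for one line, using the same delimiter set as the method above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Illustration only: shows what the tokenizer hands to the state machine.
final class TokenStreamDemo {
    static List<String> rawTokens(String line) {
        // Same delimiters as translateCommandline: double quote, single quote, space.
        // The third argument (true) makes each delimiter come back as its own token.
        StringTokenizer tok = new StringTokenizer(line, "\"' ", true);
        List<String> tokens = new ArrayList<>();
        while (tok.hasMoreTokens()) {
            tokens.add(tok.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Square brackets make the single-space tokens visible.
        for (String t : rawTokens("-d \"here's my\" x")) {
            System.out.println("[" + t + "]");
        }
    }
}
```

Two consequences follow from the delimiter set: a tab is not treated as a separator, and a backslash has no special meaning, so escaped quotes are not supported by this method.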

Solution 4

Expanding on Andreas_D's answer, instead of copying, use CommandLineUtils.translateCommandline(String toProcess) from the excellent Plexus Common Utilities library.

Author: Kaleb Pederson

Software Craftsman, Husband, Father.

Updated on April 19, 2020

Comments

  • Kaleb Pederson about 4 years

    Similar to this thread for C#, I need to split a string containing the command line arguments to my program so I can allow users to easily run multiple commands. For example, I might have the following string:

    -p /path -d "here's my description" --verbose other args
    

    Given the above, Java would normally pass the following in to main:

    Array[0] = -p
    Array[1] = /path
    Array[2] = -d
    Array[3] = here's my description
    Array[4] = --verbose
    Array[5] = other
    Array[6] = args
    

    I don't need to worry about any shell expansion, but it must be smart enough to handle single and double quotes and any escapes that may be present within the string. Does anybody know of a way to parse the string as the shell would under these conditions?

    NOTE: I do NOT need to do command-line parsing; I'm already using joptsimple for that. Rather, I want to make my program easily scriptable. I want the user to be able to put a set of commands in a single file, each of which would be valid on the command line. For example, they might type the following into a file:

    --addUser admin --password Admin --roles administrator,editor,reviewer,auditor
    --addUser editor --password Editor --roles editor
    --addUser reviewer --password Reviewer --roles reviewer
    --addUser auditor --password Auditor --roles auditor
    

    Then the user would run my admin tool as follows:

    adminTool --script /path/to/above/file
    

    main() will then find the --script option and iterate over the different lines in the file, splitting each line into an array that I would then fire back at a joptsimple instance which would then be passed into my application driver.

    joptsimple comes with a Parser that has a parse method, but it only supports a String array. Similarly, the GetOpt constructors also require a String[] -- hence the need for a parser.
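The script-file workflow described above could be sketched roughly as follows. The class ScriptRunner and its readCommands method are hypothetical names, and the splitter is passed in as a parameter so that any of the solutions can be plugged in. Skipping blank lines is an assumption of this sketch, not something the question asks for.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: turns a script into one String[] per non-blank line.
final class ScriptRunner {
    static List<String[]> readCommands(Reader script, Function<String, String[]> splitter) {
        List<String[]> commands = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(script)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // assumption: blank lines are ignored
                }
                commands.add(splitter.apply(line));
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers are not forced to handle IOException.
            throw new UncheckedIOException(e);
        }
        return commands;
    }
}
```

With Ant on the classpath, a call along the lines of ScriptRunner.readCommands(new FileReader(path), Commandline::translateCommandline) should work, and each resulting array can then be handed to the options parser in turn.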