Python: Typehints for argparse.Namespace objects

Solution 1

Typed argument parser was made for exactly this purpose. It wraps argparse. Your example is implemented as:

from tap import Tap


class ArgumentParser(Tap):
    somearg: str


parsed = ArgumentParser().parse_args(['--somearg', 'someval'])
the_arg = parsed.somearg

It's on PyPI and can be installed with: pip install typed-argument-parser

Full disclosure: I'm one of the creators of this library.

Solution 2

Consider defining an extension class to argparse.Namespace that provides the type hints you want:

import argparse

class MyProgramArgs(argparse.Namespace):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.somearg = 'defaultval'  # type: str

Then use namespace= to pass that to parse_args:

def process_argv():
    parser = argparse.ArgumentParser()
    parser.add_argument('--somearg')
    nsp = MyProgramArgs()
    parsed = parser.parse_args(['--somearg','someval'], namespace=nsp)  # type: MyProgramArgs
    the_arg = parsed.somearg  # <- Pycharm should not complain
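
A variation on the same idea, as a minimal sketch: class-level annotations give the checker the attribute types without writing an __init__. The count argument is hypothetical and only there to show a second type. This assumes the usual argparse behaviour that a parser default is written only when the namespace does not already have that attribute, so the class-level values act as the defaults.

```python
import argparse
from typing import List


class MyProgramArgs(argparse.Namespace):
    # Class-level annotations double as type hints and default values.
    somearg: str = "defaultval"
    count: int = 0  # hypothetical second argument, for illustration only


def process_argv(argv: List[str]) -> MyProgramArgs:
    parser = argparse.ArgumentParser()
    parser.add_argument("--somearg")
    parser.add_argument("--count", type=int)
    # parse_args hands back the very object passed as namespace=.
    return parser.parse_args(argv, namespace=MyProgramArgs())


parsed = process_argv(["--somearg", "someval"])
print(parsed.somearg, parsed.count)
```

Because --count was not given and the class already defines count, the attribute keeps its class-level value rather than becoming None.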

Solution 3

I don't know anything about how PyCharm handles these typehints, but I do understand the Namespace code.

argparse.Namespace is a simple class: essentially an object with a few methods that make its attributes easier to view. For ease of unit testing it also has an __eq__ method. You can read the definition in the argparse.py file.

The parser interacts with the namespace in the most general way possible - with getattr, setattr, hasattr. So you can use almost any dest string, even ones you can't access with the .dest syntax.
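
A small sketch of that last point. The dest string "foo-bar" is a made-up example of a name that is not a valid Python identifier:

```python
import argparse

parser = argparse.ArgumentParser()
# "foo-bar" is not a valid identifier, but argparse stores it via setattr anyway.
parser.add_argument("--foo", dest="foo-bar")
args = parser.parse_args(["--foo", "x"])

# args.foo-bar would be read as a subtraction, so getattr is the only way in.
value = getattr(args, "foo-bar")
print(value)
```

An attribute like this can never be described by a normal class annotation, which is one reason to stick to identifier-style dest names if you want type hints.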

Don't confuse the add_argument type= parameter with a type hint; it is a function that converts the input string.

Using your own namespace class (from scratch or subclassed), as suggested in the other answer, may be the best option. This is described briefly in the documentation, in the section on the Namespace object. I haven't seen this done much, though I've suggested it a few times to handle special storage needs, so you'll have to experiment.

If you use subparsers, a custom Namespace class may break; see http://bugs.python.org/issue27859

Pay attention to handling of defaults. The default default for most argparse actions is None. It is handy to use this after parsing to do something special if the user did not provide this option.

 if args.foo is None:
     # user did not use this optional
     args.foo = 'some post parsing default'
 else:
     # user provided value
     pass

That could get in the way of type hints. Whatever solution you try, pay attention to the defaults.
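
One way this might look with hints, as a sketch only: the attribute is annotated Optional[str] so the None default is part of its declared type, and the post-parsing default narrows it back to str. The Args class and the resolve_foo helper are hypothetical names.

```python
import argparse
from typing import List, Optional


class Args(argparse.Namespace):
    foo: Optional[str] = None  # stays None until the user supplies --foo


def resolve_foo(argv: List[str]) -> str:
    parser = argparse.ArgumentParser()
    parser.add_argument("--foo")
    args = parser.parse_args(argv, namespace=Args())
    if args.foo is None:
        # user did not use this optional
        return "some post parsing default"
    # user provided a value; a checker can narrow Optional[str] to str here
    return args.foo


print(resolve_foo([]))
print(resolve_foo(["--foo", "bar"]))
```

Annotating foo as plain str would hide the None case from the checker, so the test against None would look redundant to it.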


A namedtuple won't work as a Namespace.

First, the proper use of a custom Namespace class is:

nm = MyClass(<default values>)
args = parser.parse_args(namespace=nm)

That is, you initialize an instance of that class, and pass it as the parameter. The returned args will be the same instance, with new attributes set by parsing.
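
A runnable sketch of that pattern with a class written from scratch. The foo and bar attributes and their initial values are placeholders:

```python
import argparse


class MyClass:
    """A namespace written from scratch; argparse only needs getattr/setattr/hasattr."""

    def __init__(self, foo: str = "initial", bar: int = 0) -> None:
        self.foo = foo
        self.bar = bar


parser = argparse.ArgumentParser()
parser.add_argument("--foo")
parser.add_argument("--bar", type=int)

nm = MyClass(foo="from MyClass")
args = parser.parse_args(["--bar", "3"], namespace=nm)

print(args is nm)          # the same object comes back
print(args.foo, args.bar)  # foo keeps its initial value, bar was set by parsing
```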

Second, a namedtuple can only be created; it can't be changed.

In [72]: MagicSpace=namedtuple('MagicSpace',['foo','bar'])
In [73]: nm = MagicSpace(1,2)
In [74]: nm
Out[74]: MagicSpace(foo=1, bar=2)
In [75]: nm.foo='one'
...
AttributeError: can't set attribute
In [76]: getattr(nm, 'foo')
Out[76]: 1
In [77]: setattr(nm, 'foo', 'one')    # not even with setattr
...
AttributeError: can't set attribute

A namespace has to work with getattr and setattr.

Another problem with namedtuple is that it doesn't set any kind of type information. It just defines field/attribute names. So there's nothing for the static typing to check.

While it is easy to get expected attribute names from the parser, you can't get any expected types.

For a simple parser:

In [82]: parser.print_usage()
usage: ipython3 [-h] [-foo FOO] bar
In [83]: [a.dest for a in parser._actions[1:]]
Out[83]: ['foo', 'bar']
In [84]: [a.type for a in parser._actions[1:]]
Out[84]: [None, None]

The Action's dest is the normal attribute name. But type is not the expected static type of that attribute. It is a function that may or may not convert the input string. Here None means the input string is saved as is.
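
To make the difference concrete, here is a sketch with a hypothetical converter, comma_list. The action's type is the function itself, while the static type of the parsed attribute is what that function returns:

```python
import argparse
from typing import List


def comma_list(text: str) -> List[str]:
    """Converter: turns 'a,b,c' into ['a', 'b', 'c']."""
    return text.split(",")


parser = argparse.ArgumentParser()
parser.add_argument("--names", type=comma_list)
parser.add_argument("--count", type=int)

args = parser.parse_args(["--names", "a,b,c", "--count", "2"])

# Skip the automatic help action, then map each dest to its converter.
types_by_dest = {a.dest: a.type for a in parser._actions[1:]}
print(types_by_dest["names"] is comma_list)  # a function, not List[str]
print(args.names, args.count)
```

For int the converter and the resulting type happen to coincide, which is why the two ideas are easy to mix up.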

Because static typing and argparse require different information, there isn't an easy way to generate one from the other.

I think the best you can do is create your own database of parameters, probably in a dictionary, and create both the Namespace class and the parser from that, with your own utility function(s).

Let's say dd is a dictionary with the necessary keys. Then we can create an argument with:

parser.add_argument(dd['short'],dd['long'], dest=dd['dest'], type=dd['typefun'], default=dd['default'], help=dd['help'])

You or someone else will have to come up with a Namespace class definition that sets the default (easy), and static type (hard?) from such a dictionary.
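
A rough sketch of what such utilities might look like, under loud assumptions: SPEC, build_parser and build_namespace_class are invented names, and reusing typefun as the field annotation only makes sense when the converter is itself a type such as str or int. The class is built at runtime with dataclasses.make_dataclass, so it covers the defaults (the easy part) but a static checker will most likely not see its fields (the hard part remains).

```python
import argparse
from dataclasses import field, make_dataclass

# Hypothetical parameter database: one dictionary per argument.
SPEC = [
    {"short": "-f", "long": "--foo", "dest": "foo",
     "typefun": str, "default": "x", "help": "a string option"},
    {"short": "-n", "long": "--num", "dest": "num",
     "typefun": int, "default": 0, "help": "an integer option"},
]


def build_parser(spec):
    parser = argparse.ArgumentParser()
    for dd in spec:
        parser.add_argument(dd["short"], dd["long"], dest=dd["dest"],
                            type=dd["typefun"], default=dd["default"],
                            help=dd["help"])
    return parser


def build_namespace_class(spec):
    # Assumption: typefun doubles as the annotation, which is not true in general.
    fields = [(dd["dest"], dd["typefun"], field(default=dd["default"]))
              for dd in spec]
    return make_dataclass("SpecNamespace", fields)


parser = build_parser(SPEC)
SpecNamespace = build_namespace_class(SPEC)
args = parser.parse_args(["-n", "5"], namespace=SpecNamespace())
print(args)
```

To get real static checking you would still need a hand-written or generated class definition in source form.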

Solution 4

If you can start from scratch, there are interesting solutions that infer the arguments from type annotations.

However, in my case they weren't an ideal solution because:

  1. I have many existing CLIs based on argparse, and I cannot afford to re-write them all using such args-inferred-from-types approaches.
  2. When inferring args from types it can be tricky to support all advanced CLI features that plain argparse supports.
  3. Re-using common arg definitions in multiple CLIs is often easier in plain imperative argparse compared to alternatives.

Therefore I worked on a tiny library, typed_argparse, that lets you introduce typed args without much refactoring. The idea is to add a type derived from a special TypedArgs class, which then simply wraps the plain argparse.Namespace object:

import argparse
import sys
from typing import List, Optional

from typed_argparse import TypedArgs

# Step 1: Add an argument type.
class MyArgs(TypedArgs):
    foo: str
    num: Optional[int]
    files: List[str]


def parse_args(args: List[str] = sys.argv[1:]) -> MyArgs:
    parser = argparse.ArgumentParser()
    parser.add_argument("--foo", type=str, required=True)
    parser.add_argument("--num", type=int)
    parser.add_argument("--files", type=str, nargs="*")
    # Step 2: Wrap the plain argparser result with your type.
    return MyArgs(parser.parse_args(args))


def main() -> None:
    args = parse_args(["--foo", "foo", "--num", "42", "--files", "a", "b", "c"])
    # Step 3: Done, enjoy IDE auto-completion and strong type safety
    assert args.foo == "foo"
    assert args.num == 42
    assert args.files == ["a", "b", "c"]

This approach slightly violates the single-source-of-truth principle, but the library performs a full runtime validation to ensure that the type annotations match the argparse types, and it is just a very simple option to migrate towards typed CLIs.

Solution 5

Most of these answers involve using another package to handle the typing. This would be a good idea only if there wasn't such a simple solution as the one I am about to propose.

Step 1. Type Declarations

First, define the types of each argument in a dataclass like so:

from dataclasses import dataclass

@dataclass
class MyProgramArgs:
    first_var: str
    second_var: int

Step 2. Argument Declarations

Then you can set up your parser however you like with matching arguments. For example:

import argparse

parser = argparse.ArgumentParser(description="This CLI program uses type hints!")
parser.add_argument("-a", "--first-var")
parser.add_argument("-b", "--another-var", type=int, dest="second_var")

Step 3. Parsing the Arguments

And finally, we parse the arguments in a way that the static type checker will know about the type of each argument:

my_args = MyProgramArgs(**vars(parser.parse_args()))

Now the type checker knows that my_args is of type MyProgramArgs so it knows exactly which fields are available and what their type is.
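
Putting the three steps together, as a self-contained sketch. The parse_cli wrapper and the explicit argv list are additions for illustration, so the example runs without a real command line:

```python
import argparse
from dataclasses import dataclass
from typing import List


@dataclass
class MyProgramArgs:
    first_var: str
    second_var: int


def parse_cli(argv: List[str]) -> MyProgramArgs:
    parser = argparse.ArgumentParser(description="This CLI program uses type hints!")
    parser.add_argument("-a", "--first-var")
    parser.add_argument("-b", "--another-var", type=int, dest="second_var")
    # vars() turns the Namespace into a dict; a dest without a matching
    # field (or a field without a matching dest) raises TypeError here.
    return MyProgramArgs(**vars(parser.parse_args(argv)))


my_args = parse_cli(["-a", "hello", "-b", "3"])
print(my_args)
```

Keep in mind that a dataclass does not validate types at runtime: if -a is omitted, first_var is None even though it is annotated as str, so mark such arguments as required or annotate them as Optional.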

Author: Billy

Updated on June 07, 2022

Comments

  • Billy
    Billy about 2 years

    Is there a way to have Python static analyzers (e.g. in PyCharm, other IDEs) pick up on Typehints on argparse.Namespace objects? Example:

    parser = argparse.ArgumentParser()
    parser.add_argument('--somearg')
    parsed = parser.parse_args(['--somearg','someval'])  # type: argparse.Namespace
    the_arg = parsed.somearg  # <- Pycharm complains that parsed object has no attribute 'somearg'
    

    If I remove the type declaration in the inline comment, PyCharm doesn't complain, but it also doesn't pick up on invalid attributes. For example:

    parser = argparse.ArgumentParser()
    parser.add_argument('--somearg')
    parsed = parser.parse_args(['--somearg','someval'])  # no typehint
    the_arg = parsed.somaerg   # <- typo in attribute, but no complaint in PyCharm.  Raises AttributeError when executed.
    

    Any ideas?


    Update

    Inspired by Austin's answer below, the simplest solution I could find is one using namedtuples:

    from collections import namedtuple
    ArgNamespace = namedtuple('ArgNamespace', ['some_arg', 'another_arg'])
    
    parser = argparse.ArgumentParser()
    parser.add_argument('--some-arg')
    parser.add_argument('--another-arg')
    parsed = parser.parse_args(['--some-arg', 'val1', '--another-arg', 'val2'])  # type: ArgNamespace
    
    x = parsed.some_arg  # good...
    y = parsed.another_arg  # still good...
    z = parsed.aint_no_arg  # Flagged by PyCharm!
    

    While this is satisfactory, I still don't like having to repeat the argument names. If the argument list grows considerably, it will be tedious updating both locations. What would be ideal is somehow extracting the arguments from the parser object like the following:

    parser = argparse.ArgumentParser()
    parser.add_argument('--some-arg')
    parser.add_argument('--another-arg')
    MagicNamespace = parser.magically_extract_namespace()
    parsed = parser.parse_args(['--some-arg', 'val1', '--another-arg', 'val2'])  # type: MagicNamespace
    

    I haven't been able to find anything in the argparse module that could make this possible, and I'm still unsure if any static analysis tool could be clever enough to get those values and not bring the IDE to a grinding halt.

    Still searching...


    Update 2

    Per hpaulj's comment, the closest thing I could find to the method described above that would "magically" extract the attributes of the parsed object is something that would extract the dest attribute from each of the parser's _actions:

    parser = argparse.ArgumentParser()
    parser.add_argument('--some-arg')
    parser.add_argument('--another-arg')
    MagicNamespace = namedtuple('MagicNamespace', [act.dest for act in parser._actions])
    parsed = parser.parse_args(['--some-arg', 'val1', '--another-arg', 'val2'])  # type: MagicNamespace
    

    But this still does not cause attribute errors to get flagged in static analysis. This is also true if I pass namespace=MagicNamespace in the parser.parse_args call.

    • aghast
      aghast over 7 years
      A quick google says that you can use type hints on the first use of local variables. Try it on parser = argparse.ArgumentParser() # type: argparse.Namespace and see if it works.
    • Billy
      Billy over 7 years
      @Austin: parser in this case is an argparse.ArgumentParser object, not an argparse.Namespace object. I want the parsed object to be populated with the args as attributes.
    • aghast
      aghast over 7 years
      You're right. I missed parsed vs. parser. What you really want seems to be that PyCharm parses the method arguments when building your ArgumentParser. I doubt that works well.
    • hpaulj
      hpaulj over 7 years
      add_argument returns the Action object it just created. Look at its attributes. parser._actions is a list of all these actions, which the parser uses during parsing. I've mentioned them in previous SO answers.
    • hpaulj
      hpaulj over 7 years
      In your new edits, are you passing the new namespace to the parse_args?
    • hpaulj
      hpaulj over 7 years
      I added a discussion of namedtuple to my answer.
  • hpaulj
    hpaulj over 7 years
    The defaultval defined in this class overrides any default parameters defined in the parser methods. That is probably desirable, but it's a detail to watch out for when using custom namespaces.
  • Paul Biggar
    Paul Biggar about 2 years
    This is a solid library, but very basic. It lacks things like callbacks and doesn't support typed subparsers well, which are problems I found after spending an hour on it. While it seems like an exciting project (and they seem to have plans to resolve these issues), as of April 2022, imo it isn't a usable replacement for a moderately complex use of argparse.