How does one ignore extra arguments passed to a data class?
Solution 1
I would just provide an explicit __init__ instead of using the autogenerated one. The body of the loop sets only recognized values and ignores unexpected ones.
Note, though, that a missing value for a field without a default is not reported at construction time; it only surfaces later, when the attribute is accessed.
import dataclasses
from dataclasses import dataclass

@dataclass(init=False)
class Config:
    VAR_NAME_1: str
    VAR_NAME_2: str

    def __init__(self, **kwargs):
        # Only set attributes that correspond to declared fields.
        names = {f.name for f in dataclasses.fields(self)}
        for k, v in kwargs.items():
            if k in names:
                setattr(self, k, v)
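The caveat about missing values can be seen in a small sketch. It repeats the class above; the input keys (including EXTRA) are made up for illustration and are not part of the original answer.

```python
import dataclasses
from dataclasses import dataclass


@dataclass(init=False)
class Config:
    VAR_NAME_1: str
    VAR_NAME_2: str

    def __init__(self, **kwargs):
        names = {f.name for f in dataclasses.fields(self)}
        for k, v in kwargs.items():
            if k in names:
                setattr(self, k, v)


# Hypothetical input: VAR_NAME_2 is missing, EXTRA is not a field.
c = Config(VAR_NAME_1="a", EXTRA="ignored")

print(c.VAR_NAME_1)         # the recognized key was set
print(hasattr(c, "EXTRA"))  # the unexpected key was dropped

# Construction succeeded; the missing field only fails on access.
try:
    c.VAR_NAME_2
except AttributeError as exc:
    print("missing field surfaced late:", exc)
```

With the autogenerated __init__, the same omission would be rejected right away with a TypeError, which is the trade-off this approach makes.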
Alternatively, keep the autogenerated Config.__init__ and pass it a filtered environment:
import dataclasses
import os

field_names = {f.name for f in dataclasses.fields(Config)}
c = Config(**{k: v for k, v in os.environ.items() if k in field_names})
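As a rough sketch, the filtering step can be wrapped in a small helper. The name filter_for and the fake_env mapping are assumptions made for this example (a plain dictionary stands in for os.environ so the result is reproducible).

```python
import dataclasses
from dataclasses import dataclass


@dataclass
class Config:
    VAR_NAME_1: str
    VAR_NAME_2: str


def filter_for(cls, mapping):
    """Keep only the keys of `mapping` that name a field of dataclass `cls`."""
    field_names = {f.name for f in dataclasses.fields(cls)}
    return {k: v for k, v in mapping.items() if k in field_names}


# Made-up stand-in for os.environ.
fake_env = {"VAR_NAME_1": "a", "VAR_NAME_2": "b", "HOME": "/home/user"}

c = Config(**filter_for(Config, fake_env))
print(c)

# The generated __init__ is still in use, so a missing required value
# is reported immediately as a TypeError.
try:
    Config(**filter_for(Config, {"VAR_NAME_1": "a"}))
except TypeError as exc:
    print("rejected:", exc)
```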
Solution 2
Cleaning the argument list before passing it to the constructor is probably the best way to go about it. I'd advise against writing your own __init__ function, though, since the dataclass's __init__ does a couple of other convenient things that you'll lose by overwriting it.
Also, since the argument-cleaning logic is very tightly bound to the behavior of the class and returns an instance, it might make sense to put it into a classmethod:
from dataclasses import dataclass
import inspect

@dataclass
class Config:
    var_1: str
    var_2: str

    @classmethod
    def from_dict(cls, env):
        return cls(**{
            k: v for k, v in env.items()
            if k in inspect.signature(cls).parameters
        })

# usage:
params = {'var_1': 'a', 'var_2': 'b', 'var_3': 'c'}
c = Config.from_dict(params)  # works without raising a TypeError
print(c)
# prints: Config(var_1='a', var_2='b')
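To illustrate why this answer introspects the constructor's signature, here is a sketch comparing three ways of collecting names. The Example class and its members are invented for the comparison; only fields(), __dataclass_fields__ and inspect.signature() come from the discussion on this page.

```python
import inspect
from dataclasses import InitVar, dataclass, field, fields
from typing import ClassVar


@dataclass
class Example:
    plain: str
    seed: InitVar[int]                # accepted by __init__, not stored as a field
    derived: int = field(init=False)  # a field, but not an __init__ parameter
    count: ClassVar[int] = 0          # class-level, not an __init__ parameter

    def __post_init__(self, seed):
        self.derived = seed * 2


from_fields = {f.name for f in fields(Example)}
from_mapping = set(Example.__dataclass_fields__)
from_signature = set(inspect.signature(Example).parameters)

print(sorted(from_fields))     # ['derived', 'plain']
print(sorted(from_mapping))    # ['count', 'derived', 'plain', 'seed']
print(sorted(from_signature))  # ['plain', 'seed']

# Only the signature-based set matches what __init__ actually accepts.
raw = {"plain": "x", "seed": 3, "other": 1}
e = Example(**{k: v for k, v in raw.items() if k in from_signature})
print(e)  # Example(plain='x', derived=6)
```

In this sketch the signature lookup is also computed once and reused, rather than being re-evaluated for every key inside the comprehension.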
Solution 3
I used a combination of both answers; setattr can be a performance killer. Naturally, if the dictionary may lack entries for some of the dataclass's fields, you'll need to set field defaults for them.
from __future__ import annotations
from dataclasses import dataclass, field, fields

@dataclass()
class Record:
    name: str
    address: str
    zip: str | None = field(default=None)  # won't fail if dictionary doesn't have a zip key

    @classmethod
    def create_from_dict(cls, dict_) -> Record:
        class_fields = {f.name for f in fields(cls)}
        # Use cls rather than Record so subclasses get instances of themselves.
        return cls(**{k: v for k, v in dict_.items() if k in class_fields})
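A short usage sketch, assuming a made-up input row, shows both behaviours: extra keys are dropped, a field with a default may be absent, and a missing required field is still rejected by the generated __init__. The class is restated so the example runs on its own.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Record:
    name: str
    address: str
    zip: Optional[str] = None  # optional: may be absent from the input

    @classmethod
    def create_from_dict(cls, dict_) -> "Record":
        class_fields = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in dict_.items() if k in class_fields})


# Hypothetical row: "phone" is not a field, "zip" is missing.
row = {"name": "Ann", "address": "1 Main St", "phone": "555-0100"}
r = Record.create_from_dict(row)
print(r)  # Record(name='Ann', address='1 Main St', zip=None)

# A required field without a default still raises immediately.
try:
    Record.create_from_dict({"name": "Ann"})
except TypeError as exc:
    print("missing required field:", exc)
```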
Comments
-
Californian almost 2 years
I'd like to create a config dataclass in order to simplify whitelisting of and access to specific environment variables (typing os.environ['VAR_NAME'] is tedious relative to config.VAR_NAME). I therefore need to ignore unused environment variables in my dataclass's __init__ function, but I don't know how to extract the default __init__ in order to wrap it with, e.g., a function that also includes *_ as one of the arguments.
import os
from dataclasses import dataclass

@dataclass
class Config:
    VAR_NAME_1: str
    VAR_NAME_2: str

config = Config(**os.environ)
Running this gives me:
TypeError: __init__() got an unexpected keyword argument 'SOME_DEFAULT_ENV_VAR'
-
Californian over 5 years
Yeah, that was my concern; it looked like the function was a little more complicated, with some checks etc. (but I only looked for a second). Is there any way to just rip out the autogenerated function and wrap it? I also don't really want the other environment variables in there.
-
chepner over 5 years
You don't want to wrap the autogenerated function; you want to replace it. That said, you can always filter the environment mapping before calling the default __init__:
c = Config(**{k: v for k, v in kwargs.items() if k in set(f.name for f in dataclasses.fields(Config))})
-
Californian over 5 years
Filtering the arguments before initializing the instance worked great! If you make that into a separate answer I'll accept it. Code I ended up with:
from dataclasses import dataclass, fields
...
config = Config(**{k: v for k, v in os.environ.items() if k in set(f.name for f in fields(Config))})
-
Martijn Pieters almost 5 years
Don't use cls.__annotations__, use dataclasses.fields() so you can introspect their configuration (e.g. ignore init=False fields).
-
Arne almost 5 years
But you'd want InitVars in this context, no? They also get skipped by dataclasses.fields(), so there might be a bit more I'll have to fix here.
-
Arne almost 5 years
@MartijnPieters cls.__dataclass_fields__ works with InitVar inclusion and has access to the init field.
-
Martijn Pieters almost 5 years
Unfortunately that mapping also includes ClassVar fields, and the init flag is not set to False for those.
-
Martijn Pieters almost 5 years
I don't see a way to reliably achieve this without using the private API of the dataclasses module, actually :-/
-
Martijn Pieters almost 5 years
Unless you instead introspected the __init__ method.
-
Arne almost 5 years
Updated with an _is_classvar check. I found no way to get it to work without it that didn't include essentially writing my own buggy version of it =( Introspecting __init__ sounds even riskier, or do you see a way that doesn't boil down to using regexes?
-
Arne almost 5 years
Here is an alternative with inspect.getsource, which sadly can't give me __init__. It's worse than the current one imo, because typing type aliases are quite common in my experience.
-
Martijn Pieters almost 5 years
That's not what I meant. inspect.signature() will give you a Signature instance, which will let you trivially create a set of acceptable parameter names.
-
Arne almost 5 years
I wasn't aware of inspect.signature(), thanks for the hint. The version right now seems to just work for all my test cases, and it gets rid of all the private attribute/function accesses.
-
Martijn Pieters almost 5 years
I've applied this to my metaclass in the other post; it is a little more complex than just verifying that the argument exists, as there may be positional-only arguments.
-
tboschi over 2 years
If performance is a concern, it's much faster to check cls.__dataclass_fields__ directly. Performance can also be improved by assigning inspect.signature(cls).parameters to a variable outside the dictionary comprehension.
-
Arne over 2 years
@tboschi see revision 5 of my post, it's not easy to get right. You're right about moving the condition into a variable, though.
-
tboschi over 2 years
@Arne ah sweet, thanks for linking your revision!
-
Elysiumplain about 2 years
Following "favor composition over inheritance", you may want iterative calls to this as a helper function (e.g., when pulling an already joined query that might be a pain to splice) to properly separate base dataclasses.
-
jonathan almost 2 years
You are losing all of the magic that dataclass is doing in __init__. This is not a solution to this problem!