Evaluating PyTorch models: `with torch.no_grad()` vs `model.eval()`
TL;DR:
Use both. They do different things, and have different scopes:

- `with torch.no_grad()` disables tracking of gradients in autograd.
- `model.eval()` changes the `forward()` behaviour of the module it is called upon, e.g. it disables dropout and makes batch norm use the population statistics learned during training.
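Because the two mechanisms are independent, a typical evaluation pass uses both. A minimal sketch (the model here is illustrative; any `nn.Module` behaves the same way):

```python
import torch
import torch.nn as nn

# A tiny illustrative model; Dropout is included to show why eval() matters.
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 2))

model.eval()  # switch Dropout/BatchNorm layers to evaluation behaviour
with torch.no_grad():  # stop autograd from tracking operations
    x = torch.randn(3, 4)
    out = model(x)

print(out.requires_grad)  # False: no graph was built under no_grad
print(model.training)     # False: eval() flipped the training flag
```

Note that `model.eval()` is persistent (it stays in effect until `model.train()` is called), while `torch.no_grad()` only applies inside the `with` block.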
`with torch.no_grad()`

The `torch.autograd.no_grad` documentation says:

Context-manager that disabled [sic] gradient calculation.

Disabling gradient calculation is useful for inference, when you are sure that you will not call `Tensor.backward()`. It will reduce memory consumption for computations that would otherwise have `requires_grad=True`. In this mode, the result of every computation will have `requires_grad=False`, even when the inputs have `requires_grad=True`.
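That documented behaviour is easy to verify: even when an input has `requires_grad=True`, results computed inside the context do not. A small sketch:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

y = x * 2               # outside the context: tracked by autograd
with torch.no_grad():
    z = x * 2           # inside the context: no graph is built

print(y.requires_grad)  # True
print(z.requires_grad)  # False
```

Calling `z.backward()` here would raise an error, since `z` is detached from any computation graph.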
`model.eval()`

The `nn.Module.eval` documentation says:

Sets the module in evaluation mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. `Dropout`, `BatchNorm`, etc.
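`Dropout` makes the difference visible: in training mode it zeroes elements at random (scaling the survivors), while in evaluation mode it is the identity. A small sketch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(10)

drop.train()               # training mode: elements are randomly zeroed
train_out = drop(x)

drop.eval()                # evaluation mode: dropout is a no-op
eval_out = drop(x)

print(torch.equal(eval_out, x))  # True: eval-mode dropout passes input through
```

Crucially, none of this affects gradient tracking, which is why `model.eval()` alone does not save the memory that `torch.no_grad()` does.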
The creator of PyTorch said the documentation should be updated to suggest using both, and I raised a pull request to that effect.
Tom Hale
Updated on September 14, 2022

Comments
- Tom Hale (about 1 year ago): When I want to evaluate the performance of my model on the validation set, is it preferred to use `with torch.no_grad():` or `model.eval()`?
- Nagabhushan S N (almost 4 years ago): Similar discussion here: discuss.pytorch.org/t/model-eval-vs-with-torch-no-grad/19615/2