How to increase nvme_core.io_timeout on my c5 EC2 instance
Based on my own experimentation, we do this while building our AMIs.
cp /etc/default/grub /tmp/grub
cat >>/tmp/grub <<'EOF'
GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX} nvme_core.io_timeout=255"
EOF
sudo mv /tmp/grub /etc/default/grub
sudo update-grub
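Note that update-grub is the Debian/Ubuntu wrapper; on the CentOS 7 instances described in the question, the equivalent step (assuming BIOS boot and the default config location) is:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg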
Then create an AMI from the instance. When you start a new EC2 instance from the AMI, it comes up with the correct setting.
Obviously this can be modified to set any kernel parameter.
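To verify the setting once a new instance boots from the AMI, check the kernel command line and the live module parameter:
cat /proc/cmdline
cat /sys/module/nvme_core/parameters/io_timeout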
mchawre asked, over 1 year ago:
We have a Mesos cluster where we're running CentOS 7 c5 instances on AWS. The kernel version is the latest, 4.16.1-1. In the c5 instance type the volumes use nvme drivers. The NVMe volumes seem to have the behavior mentioned here: if there is an I/O timeout on a volume, the volume mount becomes read-only and no further writes can happen. So if there are heavy read-write operations on a device, such as the root drive, then after the I/O timeout no further writes can happen, which is dangerous.
The AWS documentation says to set the I/O timeout as high as possible, which seems to be 4294967295 sec. The AWS docs specify that the default I/O timeout is 30 sec, but the maximum is 255 sec for kernels prior to version 4.15 and 4294967295 sec for kernel 4.15+. Since we have the latest 4.16.1 kernel, we should set it to the maximum of 4294967295 sec.
But when I try to set the nvme_core.io_timeout parameter to the max value, it doesn't get reflected. I tried this:
sh-4.2# modprobe nvme_core io_timeout=123457
sh-4.2# cat /sys/module/nvme_core/parameters/io_timeout
30
sh-4.2#
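A likely explanation (an assumption, not something stated in the question): modprobe only applies parameters when it actually loads a module, and on an instance whose root disk is NVMe, nvme_core is either built into the kernel or already loaded from the initramfs by the time you can run commands, so the modprobe above is a no-op. A quick way to check which case applies, using the standard CentOS config and module listings:
grep NVME_CORE /boot/config-$(uname -r)   # =y means built in, =m means built as a module
lsmod | grep nvme                         # shows whether the module is already loaded
Either way, the parameter then has to be set at boot on the kernel command line, which is what the grub-based answer above does.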
What is the correct way to set nvme_core.io_timeout?
I tried a lot of other things, like:
- setting it in the /etc/default/grub file
- the sysctl command
- overriding the /sys/module/nvme_core/parameters/io_timeout file
But nothing helped.