My server was not responding today. Some of the services were still responding, but ssh was not working. When I logged in from the console, I got the message below:
*********************************************
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
INFO: task java:23541 blocked for more than 120 seconds.
Tainted: G W -- ----------------------------- 2.6.32-
*********************************************
Root cause: Linux lets dirty (not yet written) file system cache pages
grow up to a percentage of the available memory, set by vm.dirty_ratio (40% on
this server). Once this mark has been reached, writing processes are blocked
until the outstanding data has been flushed to disk, so all following IOs
effectively become synchronous. Separately, the kernel logs the message above
for any task that stays blocked for more than 120 seconds, which is the default
value of hung_task_timeout_secs. In this case the IO subsystem was not fast
enough to flush the data within 120 seconds. As the IO subsystem responds
slowly and more requests are served, system memory fills up with dirty data,
resulting in this issue.
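To see where a server stands, the relevant values can be read straight from /proc. The sketch below is only an illustration: dirty_limit_kb and read_proc are hypothetical helper names, and the estimate simply multiplies a memory figure by the ratio, whereas the kernel bases its own calculation on free plus reclaimable memory, so the real limit is somewhat lower.

```shell
#!/bin/sh
# Rough estimate (in kB) of how much dirty data is allowed before writers block.
# Assumption: the memory figure passed in stands in for "available memory".
dirty_limit_kb() {
    mem_kb=$1
    ratio=$2
    echo $(( mem_kb * ratio / 100 ))
}

# Print the content of a /proc file, or "unavailable" if it cannot be read
# (for example when hung task detection is not built into the kernel).
read_proc() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo "unavailable"
    fi
}

echo "vm.dirty_ratio:         $(read_proc /proc/sys/vm/dirty_ratio)"
echo "hung_task_timeout_secs: $(read_proc /proc/sys/kernel/hung_task_timeout_secs)"

# Dirty = data waiting to be written, Writeback = data being written right now
if [ -r /proc/meminfo ]; then
    grep -E '^(Dirty|Writeback):' /proc/meminfo
fi
```

For example, dirty_limit_kb 16000000 40 gives 6400000, i.e. roughly 6.4 GB of dirty data may pile up on a 16 GB machine before writers are blocked, while a ratio of 10 brings that down to about 1.6 GB. If the Dirty figure in /proc/meminfo stays high for a long time, the disks are probably not keeping up.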
Fix:
1>
Lower vm.dirty_ratio from 40% to 10%. The kernel then forces writeback
earlier and in smaller batches, so each flush has less data to push out and
finishes sooner. On most workloads this should not hurt performance noticeably.
Set "vm.dirty_ratio=10" in /etc/sysctl.conf to make it persistent across reboots.
2>
Increase the memory (only go for this if the server is actually short on memory).
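For fix 1, a minimal sketch of one way to write the setting into a sysctl-style file without piling up duplicate lines on repeated runs. set_sysctl_conf is a hypothetical helper name, not a standard tool, and it assumes the file uses plain "key = value" lines.

```shell
#!/bin/sh
# Set "key = value" in a sysctl-style config file, replacing any existing
# active entry for that key. Commented-out lines are left untouched.
set_sysctl_conf() {
    key=$1
    value=$2
    file=$3
    touch "$file"
    # Copy every line whose key (text before "=", whitespace removed)
    # differs from the one being set.
    awk -F= -v k="$key" '{
        name = $1
        gsub(/[ \t]/, "", name)
        if (name != k) print
    }' "$file" > "$file.tmp"
    echo "$key = $value" >> "$file.tmp"
    # Write back in place so the original file keeps its owner and permissions.
    cat "$file.tmp" > "$file"
    rm -f "$file.tmp"
}
```

As root, this would be called as set_sysctl_conf vm.dirty_ratio 10 /etc/sysctl.conf, followed by sysctl -p to load the file without rebooting and sysctl vm.dirty_ratio to confirm the new value. To try the value temporarily first, sysctl -w vm.dirty_ratio=10 changes it until the next reboot.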