Introduction
In this blog we look at basic troubleshooting commands. When a customer reports that a filesystem is full or that the system is very slow, these are the steps we follow for a quick check and remediation.
Find filesystems above 80% usage
Suppose a filesystem is nearly full and services or applications stop working as a result. Run the command below to list the filesystems at or above 80% usage, or whatever percentage you need.
[root@rhel8s log]# df -Ph | awk '0+$5 >=80 {print}'
/dev/mapper/rhel-root   18G   17G   15G  90% /
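The same filter can be wrapped in a small function so that the percentage becomes a parameter. This is only a sketch: the name over_threshold is hypothetical, and it assumes the df -P layout in which column 5 is the usage and column 6 is the mount point (a mount point containing spaces would be cut at the first space).

```shell
# over_threshold is a hypothetical helper, not a standard command.
# It reads `df -P` output on stdin and prints the mount point and usage
# of every filesystem at or above the given percentage (default 80).
over_threshold() {
    threshold="${1:-80}"
    # NR > 1 skips the header; "$5 + 0" turns "90%" into the number 90.
    awk -v t="$threshold" 'NR > 1 && $5 + 0 >= t { print $6, $5 }'
}

# List every filesystem that is at least 80% full.
df -P | over_threshold 80
```

Changing the argument, for example over_threshold 90, tightens the check without editing the awk expression.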
Clean up the system log files
In this scenario the files under /var/log take up so much space that an application or service does not come up.
[root@rhel8s log]# ls -lh messages
-rw-------. 1 root root 6G Feb 15 10:18 messages
Method 1
In the example above, the messages file occupies 6GB of space, so we need to move or rotate the log file. In the normal case, when the filesystem is not 100% full, we can just run the command below. logrotate rotates a log only when its configured criteria are met; the -f option forces a rotation.
[root@rhel8s log]# logrotate /etc/logrotate.conf
[root@rhel8s log]#
We can see below that the log file has been archived
[root@rhel8s log]# ls -lrt | grep messages
-rw-------. 1 root root 86055 Feb 15 09:43 messages-20230215.gz
-rw-------. 1 root root   621 Feb 15 10:18 messages
Method 2
Archive the log file that occupies the most space. Do not move the file or remove it with the rm command; instead, archive it to a different filesystem that has free space.
[root@rhel8s ~]# gzip -c /var/log/messages > /opt/bkp/messages.gz
[root@rhel8s ~]# ls -lrt /opt/bkp/messages.gz
-rw-r--r--. 1 root root 300M Feb 15 10:33 /opt/bkp/messages.gz
After compressing the file, truncate the original to zero length
[root@rhel8s log]# >messages
[root@rhel8s log]# ls -l messages
-rw-------. 1 root root 0 Feb 15 10:52 messages
Make sure new log entries are being written to the file by sending a test message with the logger command.
[root@rhel8s ~]# logger test test
[root@rhel8s ~]# tail -f /var/log/messages
Feb 16 07:30:22 rhel8s systemd[1]: fprintd.service: Succeeded.
Feb 16 07:30:35 rhel8s systemd[1]: Started Session 3 of user root.
Feb 16 07:30:35 rhel8s systemd-logind[855]: New session 3 of user root.
Feb 16 07:30:41 rhel8s root[1547]: test test
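The Method 2 steps can be combined so that the log is truncated only after the archive has been verified. The following is a sketch rather than a finished tool: archive_and_truncate is a hypothetical name, and the paths in the comments are the ones used in this example.

```shell
# archive_and_truncate is a hypothetical helper that mirrors Method 2:
# compress a log to another filesystem, verify the archive, then empty
# the original in place so the logging daemon keeps its open file handle.
archive_and_truncate() {
    src="$1"        # log file to archive, e.g. /var/log/messages
    dest_dir="$2"   # directory on a filesystem with free space, e.g. /opt/bkp
    dest="$dest_dir/$(basename "$src").gz"

    gzip -c "$src" > "$dest" || return 1   # write the compressed copy
    gzip -t "$dest" || return 1            # check that the archive is readable
    : > "$src"                             # truncate only after the check passes
}

# Example (commented out so nothing is truncated by accident):
# archive_and_truncate /var/log/messages /opt/bkp
```

Any lines written to the log between the gzip step and the truncation are not in the archive, so this is best done when the log is relatively quiet.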
Find files greater than 100MB
Find the files that are larger than a given size; in our example we look for files greater than 100MB.
[root@rhel8s ~]# find /usr -xdev -type f -size +100M
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.282.b08-4.el8.x86_64/jre/lib/rt.jar
/usr/share/GeoIP/GeoLite2-City.mmdb
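The find output lists paths only, so it can help to print each size and sort the largest first. A minimal sketch, assuming GNU find, du and sort are available; largest_files is a hypothetical name.

```shell
# largest_files is a hypothetical helper: it lists files above a minimum
# size under a directory, largest first. Assumes GNU find, du and sort.
largest_files() {
    dir="${1:-/usr}"
    min_size="${2:-100M}"
    # -xdev stays on one filesystem; sort -rh orders human-readable sizes.
    find "$dir" -xdev -type f -size +"$min_size" -exec du -h {} + 2>/dev/null | sort -rh
}
```

For example, largest_files /usr 100M repeats the search above with a size shown next to every file.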
If the files or folders cannot be deleted, extend the filesystem.
Clean up sar files
In a few scenarios the sar files take up noticeable space; we can compress the files that are older than 15 days.
[root@rhel8s ~]# find /var/log/sa -mtime +15 -type f |xargs gzip
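A slightly more defensive variant skips files that are already compressed and avoids xargs, so file names with spaces are handled. This is a sketch; compress_old is a hypothetical name and the defaults simply repeat the values used above.

```shell
# compress_old is a hypothetical helper: gzip regular files under a
# directory that were last modified more than the given number of days ago.
compress_old() {
    dir="${1:-/var/log/sa}"
    days="${2:-15}"
    # Skip files that are already compressed; -exec ... {} + passes the
    # names straight to gzip, so spaces in file names are handled safely.
    find "$dir" -type f -mtime +"$days" ! -name '*.gz' -exec gzip {} +
}
```

Running compress_old /var/log/sa 15 has the same effect as the command above, and it does nothing when no file is old enough.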
Clean up the yum cache
Whenever we patch the system or install packages, the downloaded RPMs are kept in a local cache. Cleaning the cache reclaims that space temporarily.
[root@rhel8s ~]# yum clean all
Check file and folder space
Use the du command to estimate file and folder usage.
[root@rhel8s usr]# du -sh * | egrep 'M|G'
214M    bin
22M     include
893M    lib
357M    lib64
220M    libexec
64M     sbin
901M    share
76M     src
The example above shows the files and folders whose size is reported in MB or GB.
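Note that the egrep 'M|G' filter also matches any entry whose name contains an M or a G, whatever its size. One alternative is to sort by size instead of filtering. A sketch, assuming GNU du and sort; top_usage is a hypothetical name.

```shell
# top_usage is a hypothetical helper: show the largest subdirectories
# directly under a directory, biggest first. Assumes GNU du and sort.
top_usage() {
    dir="${1:-/usr}"
    count="${2:-10}"
    # -x stays on one filesystem; --max-depth=1 reports one level only.
    # The total for the directory itself is part of the listing.
    du -xh --max-depth=1 "$dir" 2>/dev/null | sort -rh | head -n "$count"
}
```

For example, top_usage /usr 5 prints the five largest lines, which include the total for /usr itself.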
Check the inode usage
In a few scenarios the filesystem still shows 100% full even after files and folders have been removed. Check first whether the inodes are exhausted.
[root@rhel8s usr]# df -ih
Filesystem            Inodes IUsed IFree IUse% Mounted on
devtmpfs                223K   358  222K    1% /dev
tmpfs                   228K     1  228K    1% /dev/shm
tmpfs                   228K   559  227K    1% /run
tmpfs                   228K    17  228K    1% /sys/fs/cgroup
/dev/mapper/rhel-root   8.7M   87K  8.6M    1% /
/dev/sda1               512K   309  512K    1% /boot
tmpfs                   228K     5  228K    1% /run/user/0
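When IUse% is close to 100%, the next question is which directory holds the most files. The following is a rough sketch for narrowing that down; count_entries is a hypothetical name, and it assumes GNU find.

```shell
# count_entries is a hypothetical helper: for each subdirectory directly
# under a directory, count the files and directories it contains, then
# print the busiest first. Each entry uses roughly one inode.
count_entries() {
    dir="${1:-/}"
    find "$dir" -xdev -mindepth 1 -maxdepth 1 -type d 2>/dev/null |
    while IFS= read -r sub; do
        printf '%s\t%s\n' "$(find "$sub" -xdev 2>/dev/null | wc -l | tr -d ' ')" "$sub"
    done | sort -rn
}
```

Running count_entries / and then repeating it on the top result drills down towards the directory that consumes the inodes.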
If inode usage is low, as in the output above, use the lsof command to look for deleted files that are still held open by a process.
[root@rhel8s usr]# lsof|grep deleted
systemd-u 707 root 8r REG 253,0 6940392 51201777 /var/lib/sss/mc/group (deleted)
systemd-u 707 root 9r REG 253,0 9253600 51201776 /var/lib/sss/mc/passwd (deleted)
auditd 798 root 4r REG 253,0 6940392 51201777 /var/lib/sss/mc/group (deleted)
auditd 798 799 auditd root 4r REG 253,0 6940392 51201777 /var/lib/sss/mc/group (deleted)
auditd 798 801 auditd root 4r REG 253,0 6940392 51201777 /var/lib/sss/mc/group (deleted)
dbus-daem 822 dbus 5r REG 253,0 9253600 51201776 /var/lib/sss/mc/passwd (deleted)
Either kill or restart the process that holds the deleted file, because the space is released only when the process closes it. Before doing so, make sure nobody is using that process.
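On a host where lsof is not installed, similar information can be read from /proc. This is a sketch that assumes a Linux /proc filesystem; deleted_open_files is a hypothetical name, and it should be run as root to see the descriptors of all processes.

```shell
# deleted_open_files is a hypothetical helper that reads Linux /proc
# instead of calling lsof. It prints every file descriptor whose target
# has been deleted but is still held open by a process.
deleted_open_files() {
    for fd in /proc/[0-9]*/fd/*; do
        # Skip descriptors that vanish or cannot be read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)') printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}
```

Each output line starts with /proc/PID/fd/N, so the PID identifies the process to restart. Truncating through that path (for example : > /proc/PID/fd/N) is a commonly used way to free the space without a restart, but check first that the application tolerates it.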
Conclusion
I hope this article helps you quickly check filesystem usage and work through the troubleshooting steps. The links below might also help:
Quick Health check on Linux servers v1.0 – https://computercarriage.com/2020/05/25/qhealth-check-on-linux-servers/
On-Demand sar command usage – https://computercarriage.com/2020/06/01/on-demand-sar-command-usage/
Easy listing block devices using lsblk – https://computercarriage.com/2020/05/18/list-block-devices-using-lsblk-command/