The df and du commands provide different system information, and I cannot write to a partition that df says is 100% full. Which is correct, and why does the system not allow any data writes to this partition?
Environment
- Red Hat Enterprise Linux (RHEL)
Issue
- Unable to find which files are consuming disk space.
- Why is there a mismatch between df and du outputs?
- The df and du commands provide different information, and I cannot write to a partition that df says is 100% full. Which is correct, and why does the system not allow any data writes to this partition?
- df says that the system is out of disk space on one of my partitions, but du shows plenty of space left. For example:
  # df -h
  Filesystem Size Used Avail Use% Mounted on
  /dev/md1 7.9G 7.8G 0 100% /var
  # du -sh /var
  50M var
Resolution
There are multiple possible causes for a difference between df and du output:
- [Mounted-over] - Files in a directory that now has another file system mounted over it. Please see the Resolution section of How can I see what is consuming space underneath a mounted partition?.
- [Open/deleted] - Running processes holding open deleted files. To free up the space, the processes holding open the deleted files must exit. Please see the Diagnostic Steps section for more information on locating such processes.
- [Filesystem-state] - Filesystem corruption can also lead to such a difference. If the filesystem in question is on an LVM volume and there is free space in its volume group, creating a snapshot of the volume and running fsck on the snapshot is recommended to find out whether the filesystem is corrupted.
Root Cause
Running processes holding open deleted files [Open/deleted]
df may report that the partition is full if the inodes or disk space are consumed by a file that a running process still holds open. This can happen when a service such as Samba has filled up its log file. When logrotate runs nightly, it deletes the big log file but does not restart the service, so the space remains allocated to the process. Restarting the process should release the handle on the now-deleted file, and the space should be freed.
- [Open/Deleted] - Check lsof output for files listed as (deleted). These files have been deleted (or, more properly, unlinked) from the filesystem tree, but because one or more processes still have them open, the disk space they occupy cannot be reclaimed. The df command therefore still accounts for these open/deleted files, while du, which scans the filesystem tree, no longer sees them and does not account for them. For example, below we see four deleted files listed in lsof, but an attempt to list the files fails because they have been removed/unlinked from the filesystem's directory structure.
  # lsof | grep COMMAND ; lsof | grep deleted
  COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
  oracle 21869 1000 16w REG 253,5 7476824 130802 /oracle/diag/EIPDB2_ping_21869.trc (deleted)
  oracle 21869 1000 17w REG 253,5 8676828 130804 /oracle/diag/EIPDB2_ping_21869.trm (deleted)
  oracle 21927 1000 18w REG 253,5 7375910 130983 /oracle/diag/EIPDB2_asmb_21927.trc (deleted)
  oracle 21927 1000 19w REG 253,5 6748612 130658 /oracle/diag/EIPDB2_asmb_21927.trm (deleted)
  # ls -lh /oracle/diag/EIPDB2*.tr*
  #
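The open/deleted state can be reproduced safely in a scratch directory. The following is a minimal sketch, assuming a Linux system with /proc mounted and a bash- or dash-compatible sh; the helper name demo_open_deleted, the file name demo.log, and the choice of file descriptor 9 are illustrative and not part of the original article.

```shell
#!/bin/sh
# Sketch: reproduce an "open but deleted" file in a throwaway directory.
# Assumes Linux with /proc mounted. demo_open_deleted is a hypothetical helper.
demo_open_deleted() (
    dir=$(mktemp -d) || exit 1
    file="$dir/demo.log"
    head -c 1048576 /dev/zero > "$file"    # write 1 MiB of data

    exec 9< "$file"                        # keep the file open on descriptor 9
    rm -f "$file"                          # unlink it from the directory tree

    # du walks the directory tree, so it can no longer find the file, but the
    # kernel still tracks the open descriptor and marks its path "(deleted)".
    readlink /proc/self/fd/9

    exec 9<&-                              # closing the descriptor releases the space
    rmdir "$dir"
)

demo_open_deleted
```

The function body runs in a subshell, so the descriptor and the temporary variables do not leak into the calling shell. The printed path ends in (deleted), which is the same marker lsof shows in its NAME column.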
Files in a directory that now has another file system mounted over it [Mounted-over]
If a path given to df / du contains a secondary file system mounted on top of a primary file system, and the primary file system contains files which are now hidden, the outputs will differ. For example, suppose /opt/foo is a separate file system mounted on top of /opt. Prior to mounting the secondary file system /opt/foo, the primary file system contained a directory /opt/foo which had files in it. Once the secondary file system is mounted onto /opt/foo, the files from the primary file system inside the /opt/foo directory are no longer visible but still consume space.
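Before bind mounting anything, it can help to list which mount points sit underneath the primary file system, since each of them is a directory that could be hiding files. The following is a minimal sketch, assuming a mount table in the /proc/mounts format (device, mount point, type, options); the helper name nested_mounts is hypothetical.

```shell
#!/bin/sh
# Sketch: print mount points located strictly below a given directory.
# Reads a mount table in /proc/mounts format on stdin.
# nested_mounts is a hypothetical helper, not an existing tool.
nested_mounts() (
    fs_root=${1%/}                 # drop one trailing slash, if present
    [ -n "$fs_root" ] || fs_root=/ # "/" becomes empty above, so restore it
    awk -v fs_root="$fs_root" '
        BEGIN { prefix = (fs_root == "/") ? "/" : fs_root "/" }
        # Keep entries whose mount point starts with "<fs_root>/",
        # excluding the primary mount point itself.
        $2 != fs_root && index($2, prefix) == 1 { print $2 }
    '
)

# Usage on a live system (Linux):
#   nested_mounts /opt < /proc/self/mounts
```

Every path printed is only a candidate: it shows where another file system is mounted, not whether files are actually hidden underneath it. Mount points nested more than one level deep are listed as well, even though they may sit on an intermediate file system rather than on the primary one.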
- [Mounted Over] - There may be files hidden under a mounted file system. For instance, if the directory /mnt/test contained large files, and an NFS file system was then mounted on /mnt/test, df would continue to account for the size of those files but du would not. If the NFS file system mounted on /mnt/test cannot be unmounted for inspection because of production outage concerns, this can be verified by bind mounting the suspected filesystem on a different directory and inspecting its contents:
  $ mkdir /tmp/root_chk
  $ mount --bind / /tmp/root_chk
  $ du -h /tmp/root_chk/mnt/test
- For example, starting with the following, we see the df command reports /tmp is using 3.1G of allocated space, but du is only reporting 11M:
  # df -TPh /tmp/* | sort -u -k 1 -r
  Filesystem Type Size Used Avail Use% Mounted on
  /dev/sdc2 ext3 147G 188M 140G 1% /tmp/mnt
  /dev/mapper/vga-lvtmp ext4 167G 3.1G 156G 2% /tmp
  # du -sxkh /tmp
  11M /tmp
- Now we run the bind mount commands and find that there are files hiding under the /tmp/mnt directory used as a mount point. These files exist within the /tmp filesystem, but once another filesystem is mounted on that directory, they become invisible to the du command's filesystem scanning.
  # mkdir /mnt_check               # [1]
  # mount --bind /tmp /mnt_check   # [2]
  # du -sxkh /mnt_check            # [3]
  3.1G /mnt_check
  # ls -lh /mnt_check/mnt          # [4]
  total 3.1G
  -rw-rw-r--. 1 user user 1.0G Nov 19 14:08 file1
  -rw-rw-r--. 1 user user 2.0G Nov 19 14:22 file2
  # umount /mnt_check              # [5]
  # rmdir /mnt_check               # [6]
- Notes on above:
  - [1] Create a temporary directory onto which we can bind mount the main filesystem that has another filesystem mounted on top of it. In this case, the main filesystem is /tmp, and another filesystem is mounted on top of /tmp/mnt, a directory within the /tmp filesystem. The temporary directory must be created on a different filesystem, in this case / (root).
  - [2] Bind mount (--bind) the main filesystem to the temporary directory created above. This mounts only the main filesystem of interest; any filesystems mounted on top of it do not follow to the new mount point.
  - [3] Run du on the new bind-mounted filesystem, and the expected 3.1G shows up (versus the starting point above of just 11M). This means that one or more mount point directories on the main filesystem, here /tmp, contain files that belong to the main (/tmp) filesystem and become invisible to du once another filesystem is mounted over those directories.
  - [4] Check the /tmp/mnt directory, which is where another filesystem is mounted. The ls -lh command shows two pre-existing files that belong to the /tmp filesystem and sit in its /tmp/mnt directory. Once another filesystem is mounted over that directory, du cannot see those two files, but df still accounts for them against the main /tmp filesystem. file1 and file2 need to be removed (moved or deleted) from the /tmp/mnt directory. These actions can be performed through the bind mount point (/mnt_check/mnt in this example, which is equivalent to /tmp/mnt before another filesystem is mounted onto that directory).
  - [5] Unmount the bind-mounted filesystem after correcting the issue of files being present within the mount point.
  - [6] Remove the temporary directory used as a mount point.
- To correct the above problem, the two files, file1 and file2, contained within the /tmp/mnt directory need to be moved or deleted so that /tmp/mnt is empty before a filesystem is mounted on top of it. See step [4] above.
- Note that even after the files under the mount point are deleted, df output will not exactly match du output. Each filesystem allocates space for internal structures that do not show up within its visible files. This filesystem-specific overhead typically makes df output 2-5% larger than du output, which is normal and expected. As an example, after deleting file1 and file2 above, df showed 98M in use but du only 11M. That extra 87M is roughly 0.05% of the 167G filesystem, well below the threshold for concern.
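To judge whether a remaining gap between df and du is just filesystem overhead, the difference can be expressed as a percentage of the filesystem size. The following is a minimal sketch; the helper name overhead_pct and its argument order are hypothetical, and it assumes the two usage figures are given in MiB and the filesystem size in GiB, as in the df -h and du -h output above.

```shell
#!/bin/sh
# Sketch: express the df/du gap as a percentage of the filesystem size.
# overhead_pct is a hypothetical helper.
#   $1: "Used" as reported by df, in MiB
#   $2: total reported by du, in MiB
#   $3: filesystem size reported by df, in GiB
overhead_pct() {
    awk -v df_used="$1" -v du_used="$2" -v size_gib="$3" \
        'BEGIN { printf "%.2f\n", (df_used - du_used) / (size_gib * 1024) * 100 }'
}

# Figures from the example above: df shows 98M used, du shows 11M, size is 167G.
overhead_pct 98 11 167
```

Because df -h and du -h round their figures, the result is only an approximation; for an exact comparison, the same calculation can be fed with block counts from df -k and du -sk after adjusting the units.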
Diagnostic Steps
Files in a directory that now has another file system mounted over it
To check and remedy this condition, please see How can I see what is consuming space underneath a mounted partition?.
Running processes holding open deleted files
Use the lsof command as follows:
# lsof | grep deleted
nmbd 16408 root cwd DIR 9,1 0 163846 /var/log/samba (deleted)
nmbd 16408 root 13w REG 9,1 924442067 163964 /var/log/samba/nmbd.log (deleted)
Note: To list all processes that may be holding deleted files open, run the command as root.
The 7th column in the output shows the size of the file in bytes, the 9th column shows which file is being held open, and the 1st column shows which process is holding the file descriptor open. In the above example, /var/log/samba/nmbd.log is 924442067 bytes, which is almost 1 GB. Since the nmbd service is part of Samba, running /sbin/service smb restart should fix the problem.
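When many deleted files are held open, adding up the 7th column by hand is tedious. The following is a minimal sketch that totals the sizes from lsof output; the helper name sum_deleted_bytes is hypothetical, and it assumes the nine-column layout shown above (no TID or TASKCMD columns).

```shell
#!/bin/sh
# Sketch: total the bytes held by open-but-deleted regular files.
# Reads lsof output on stdin. sum_deleted_bytes is a hypothetical helper.
# Assumes the column layout shown above, where SIZE/OFF is field 7.
sum_deleted_bytes() {
    awk '
        # Only regular files (TYPE == REG) occupy data blocks; the last
        # field is the "(deleted)" marker appended to the NAME column.
        $5 == "REG" && $NF == "(deleted)" { total += $7 }
        END { printf "%d\n", total }
    '
}

# Sample input taken from the lsof output above.
sum_deleted_bytes <<'EOF'
nmbd 16408 root cwd DIR 9,1 0 163846 /var/log/samba (deleted)
nmbd 16408 root 13w REG 9,1 924442067 163964 /var/log/samba/nmbd.log (deleted)
EOF
```

On a live system the same function can be fed with lsof | sum_deleted_bytes, run as root. A file that is open on several descriptors or in several processes appears once per descriptor, so the total is an upper bound rather than an exact figure. lsof also offers the +L1 option to select open files whose link count is below 1, which avoids relying on grep for the (deleted) marker.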