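Symptom: a routine inspection found that the 100G filesystem mounted at /u01/ogg was 97% used.
1. Immediately truncate several rpt process report files, which frees part of the space.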
$ cd /u01/ogg/dirrpt
$ ls -lrt
$ > xx.rpt
$ more Rxx.rpt
Operating System Version:
Linux
Version #1 SMP Tue Feb 26 12:53:17 EST 2018, Release 2.6.32-696.el6.x86_64
Node: dsapdb21
Machine: x86_64
                         soft limit   hard limit
Address Space Size   :    unlimited    unlimited
Heap Size            :    unlimited    unlimited
File Size            :    unlimited    unlimited
CPU Time             :    unlimited    unlimited
Process id: 656457
······
Truncating the rpt files is only an emergency measure.
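The emergency step above empties report files in place with a redirect instead of removing them. A minimal sketch of doing that for every large report in one pass is shown here; the function name truncate_large_reports and the size threshold are hypothetical and not part of OGG, and it assumes GNU find as shipped with Red Hat.

```shell
# Hypothetical helper: empty every *.rpt file in DIR that is larger than MIN_KB.
# Truncating in place keeps the same inode, so a process that still has the
# file open does not end up holding a deleted file.
truncate_large_reports() {
    local dir=$1 min_kb=$2 f
    # -maxdepth 1: only the directory itself; -size +Nk: larger than N KB
    find "$dir" -maxdepth 1 -type f -name '*.rpt' -size +"${min_kb}"k -print |
    while IFS= read -r f; do
        : > "$f"                 # same effect as "> xx.rpt" in the note
        echo "truncated $f"
    done
}
```

For example, truncate_large_reports /u01/ogg/dirrpt 102400 would empty only reports above roughly 100 MB and leave smaller ones readable for later analysis.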
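2. Check which files are using the space.
Subtracting the du -sm total for /u01/ogg from the used figure reported by df -h shows that about 50G of used space is not accounted for by any visible file.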
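The df versus du comparison can be scripted so that the gap is printed directly. This is only a sketch: the function names usage_gap_mb and report_gap are made up for illustration, and the mount point is passed in as an argument.

```shell
# usage_gap_mb USED_KB VISIBLE_KB -> difference converted to MB
usage_gap_mb() {
    echo $(( ($1 - $2) / 1024 ))
}

# report_gap MOUNTPOINT
# Compares what the filesystem reports as used (df) with the sum of the
# files that are still visible in the directory tree (du).
report_gap() {
    local mnt=$1 used visible
    # df -Pk: POSIX single-line format, 1 KB blocks; field 3 is "Used"
    used=$(df -Pk "$mnt" | awk 'NR==2 {print $3}')
    # du -skx: one total in KB, without crossing into other filesystems
    visible=$(du -skx "$mnt" 2>/dev/null | awk '{print $1}')
    echo "df used: $((used / 1024)) MB, du visible: $((visible / 1024)) MB, gap: $(usage_gap_mb "$used" "$visible") MB"
}
```

Calling report_gap /u01/ogg prints the three figures on one line. A gap of tens of GB, as in this incident, usually points at files that were deleted while a process still had them open.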
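Save the lsof entries that are marked as deleted:
$ lsof|grep deleted >> lsof_deleted_20200508.log
The output contains a very large number of log files that have been deleted but are still held open, like zombies that cannot actually be removed.
A quick look with more: the entries from oracle, ocssd.bin and similar can be ignored for now. The biggest problem is the very large number of deleted files still held by the replicat processes.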
ohasd.bin
gpnpd.bin
osysmond.
ocssd.bin
ocssd.bin
oracle /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_vmb0_310782.trc (deleted)
replicat
replicat
replicat
replicat
replicat
replicat 520060 oracle 81w REG 251,23553 9673428 37733 /u01/ogg/adapter/ndslogs/NdsJdbcTrace.log (deleted)
replicat 520060 oracle 85w REG 251,23553 5397502500 39821 /u01/ogg/adapter/ndslogs/sgcc.nds.jdbc.driver.NdsConnection@xxx.log (deleted)
replicat 650196 oracle 81w REG 251,23553 9673428 37733 /u01/ogg/adapter/ndslogs/NdsJdbcTrace.log (deleted)
replicat 650196 oracle 85w REG 251,23553 37155915949 34364 /u01/ogg/adapter/ndslogs/sgcc.nds.jdbc.driver.NdsConnection@xxx.log (deleted)
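To see which process holds the most deleted space, the saved lsof output can be totalled per command and PID. The sketch below assumes the column layout shown in the lines above (COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME, with no TID column); the function name summarize_deleted is hypothetical.

```shell
# Reads lsof output on stdin and prints "COMMAND PID GB" for deleted files,
# largest first. Lines without a numeric size in field 7 are skipped.
summarize_deleted() {
    awk '$NF == "(deleted)" && $7 ~ /^[0-9]+$/ {
             bytes[$1 " " $2] += $7          # key: command name + PID
         }
         END {
             for (k in bytes)
                 printf "%s %.1f\n", k, bytes[k] / 1024 / 1024 / 1024
         }' | sort -k3,3nr
}
```

Typical use is summarize_deleted < lsof_deleted_20200508.log. A file that is open in two processes is counted once under each PID, so the per-process figures should not simply be added together.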
# ps -ef|grep 650196
oracle 650196 78557 3 May07 ? 01:03:07 /u01/ogg/replicat PARAMFILE /u01/ogg/dirprm/a.prm REPORTFILE a.rpt PROCESSID a USESUBDIRS
# ps -ef|grep 520060
oracle 520060 1 36 Apr30 ? 2-22:30:12 /u01/ogg/replicat PARAMFILE /u01/ogg/dirprm/b.prm REPORTFILE b.rpt PROCESSID b USESUBDIRS
GGSCI > info all     -- all processes show as running normally
GGSCI > info *       -- record the RBA of every process as a backup
GGSCI > stop R*
$ lsof|grep deleted
(no output)          -- all deleted entries are gone
GGSCI > start r*     -- restart the OGG processes; the problem is resolved
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/asm/acfsvol-99999999 100G 25G 76G 25% /u01/ogg
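Environment: Red Hat 6.9, OGG version 12.2.0.1.160823. The replicat (REP) apply processes delete a large number of files, but the space is not released because the OGG processes still hold those files open. On Linux the blocks of a deleted file are freed only after the last open file descriptor is closed, which is why stopping the replicat processes released the space. Why the processes keep these files open internally could not be analyzed for now.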