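Background:
A customer reported that the database went down at around 2 a.m. and asked for help finding the cause.
1> Check the alert log for the period around the outage: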
Tue Jan 14 02:12:31 2020
AUD: Audit Commit Delay exceeded, written a copy to OS Audit Trail
Tue Jan 14 02:12:31 2020
AUD: Audit Commit Delay exceeded, written a copy to OS Audit Trail
Tue Jan 14 02:12:31 2020
Tue Jan 14 02:12:31 2020
AUD: Audit Commit Delay exceeded, written a copy to OS Audit TrailAUD: Audit Commit Delay exceeded, written a copy to OS Audit Trail
Tue Jan 14 02:30:35 2020

Tue Jan 14 02:12:31 2020
Process 0x0x2421f86848 appears to be hung while dumpingAUD: Audit Commit Delay exceeded, written a copy to OS Audit Trail

Current time = 2940567245, process death time = 2940493559 interval = 60000
Attempting to kill process 0x0x2421f86848 with OS pid = 117667 -- the SMON process was killed
OSD kill skipped for process 0x2421f86848
Tue Jan 14 02:30:39 2020
Error occured while spawning process m001; error = 12751
Tue Jan 14 02:30:40 2020
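The hang-detection line in the excerpt carries enough numbers to check by hand. The following is a minimal sketch, not an Oracle tool. It assumes the three values share one unit (the log does not state it) and that the process is treated as hung once the gap between the two times exceeds the interval. The function name `parse_hang_line` is hypothetical.

```python
import re

# Line copied from the alert log excerpt above.
LINE = "Current time = 2940567245, process death time = 2940493559 interval = 60000"


def parse_hang_line(line):
    """Return (current, death, interval) as integers, or None if the line does not match."""
    m = re.search(
        r"Current time = (\d+), process death time = (\d+) interval = (\d+)", line
    )
    return tuple(int(g) for g in m.groups()) if m else None


current, death, interval = parse_hang_line(LINE)
# Assumption: the gap between the two times is what gets compared with the interval.
overrun = current - death
print(overrun, overrun > interval)  # prints: 73686 True
```

Under that assumption the gap (73686) exceeds the interval (60000), which is consistent with the "Attempting to kill process" line that follows in the log.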
Key points:
Process 0x0x2421f86848 appears to be hung while dumpingAUD: Audit Commit Delay exceeded, written a copy to OS Audit Trail
Error occured while spawning process m001; error = 12751
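Picking such lines out of a long alert log by eye is error-prone. The sketch below shows one possible way to do it, assuming the log alternates timestamp lines and message lines as in the excerpt. The pattern list and the names `parse_ts` and `key_events` are illustrative choices, not part of any Oracle utility. The timestamp format relies on English day and month names.

```python
import re
from datetime import datetime

TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Tue Jan 14 02:30:39 2020"
KEY_PATTERNS = [
    re.compile(r"appears to be hung"),
    re.compile(r"Error occured while spawning process"),  # spelling as in the log
]


def parse_ts(line):
    """Return a datetime if the line is a timestamp line, else None."""
    try:
        return datetime.strptime(line.strip(), TS_FORMAT)
    except ValueError:
        return None


def key_events(lines):
    """Pair each matching message with the most recent timestamp line seen before it."""
    events = []
    last_ts = None
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            last_ts = ts
            continue
        if any(p.search(line) for p in KEY_PATTERNS):
            events.append((last_ts, line.strip()))
    return events


# Sample lines taken from the excerpt above.
SAMPLE = """\
Tue Jan 14 02:12:31 2020
AUD: Audit Commit Delay exceeded, written a copy to OS Audit Trail
Tue Jan 14 02:30:39 2020
Error occured while spawning process m001; error = 12751
""".splitlines()

for ts, msg in key_events(SAMPLE):
    print(ts, msg)
```

Because concurrent writers interleave their output in this log, the timestamp attached to a message is only the nearest preceding one and may not be exact.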
MOS (My Oracle Support) explanation and solution for the audit messages:
--------------------------------------------------------------------------------
Applies to:
Oracle Server - Enterprise Edition - Version 10.2.0.3 to 11.2.0.1 [Release 10.2 to 11.2]
Information in this document applies to any platform.
Checked for relevance on 17-Sep-2011
Symptoms
You see the following messages appear in your alert.log:
AUD: Audit Commit Delay exceeded, written a copy to OS Audit Trail
Changes
You have applied the Audit Cleanup Patch or any superseding patch as referenced in Note 731908.1.
Cause
This change was introduced in the audit functionality to support Audit Vault. These messages can appear in your alert.log occasionally
even if this database is not a source for Audit Vault. The reason is as follows:
The database guarantees that the transaction writing the audit record will commit within a predefined maximum allowed interval,
called the Audit Commit Delay interval. If the transaction takes longer than the Audit Commit Delay to commit the audit record,
the database writes the same record to the OS audit trail. This is a fallback mechanism to make sure there is always written evidence
of an audited action within the defined timeframe; as such, it is a feature to enhance audit security.
The commit delay is fixed at 5 seconds and cannot be tuned.
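The fallback rule described above can be restated as a toy model. This is only an illustration of the rule as the note words it, not Oracle's implementation. The function name, the destination labels, and the strict greater-than comparison at exactly 5 seconds are all assumptions.

```python
AUDIT_COMMIT_DELAY_SECONDS = 5.0  # fixed value stated in the note


def audit_destinations(commit_seconds, delay=AUDIT_COMMIT_DELAY_SECONDS):
    """Toy model: where an audit record ends up for a given commit duration."""
    destinations = ["database audit trail"]
    if commit_seconds > delay:
        # Fallback from the note: the same record is also written to the OS audit trail.
        destinations.append("OS audit trail")
    return destinations


print(audit_destinations(0.2))  # fast commit: database trail only
print(audit_destinations(7.5))  # slow commit: a copy also goes to the OS trail
```

The second case corresponds to the situation in which the "Audit Commit Delay exceeded" message is written to the alert.log.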
Solution
The problem occurs because the audit functionality was not able to commit an audit record within 5 seconds. This means that at the time the message
was written to the alert.log, your database was under stress. The auditing layer is not the cause of the problem; the messages in the alert.log
only show that auditing is suffering from general performance problems in the environment, which might
affect other components as well.
These messages are purely informational and no direct action can or should be taken to avoid them. This is most likely because of a resource problem on your database.
If the message appears only incidentally, you can ignore it. If it appears regularly, you likely have a resource problem and will see other symptoms of it as well; analyze and solve the general performance problem first, and these messages will then go away.
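One rough way to judge whether the message is incidental or regular is to count its occurrences per hour in the alert log. The sketch below assumes timestamp lines precede message lines as in the excerpt at the top of this post. The function name `delay_messages_per_hour` is hypothetical. The note gives no threshold for "regular", so the counts still have to be interpreted by the reader.

```python
from collections import Counter
from datetime import datetime

TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Tue Jan 14 02:12:31 2020"
MESSAGE = "AUD: Audit Commit Delay exceeded"


def delay_messages_per_hour(lines):
    """Count occurrences of the audit delay message, bucketed by hour."""
    counts = Counter()
    current_hour = None
    for line in lines:
        text = line.strip()
        try:
            ts = datetime.strptime(text, TS_FORMAT)
        except ValueError:
            ts = None
        if ts is not None:
            current_hour = ts.strftime("%Y-%m-%d %H:00")
            continue
        # Interleaved writes can put the message twice on one line, so count substrings.
        n = text.count(MESSAGE)
        if n and current_hour is not None:
            counts[current_hour] += n
    return counts


# Sample lines taken from the alert log excerpt in this post.
SAMPLE = """\
Tue Jan 14 02:12:31 2020
AUD: Audit Commit Delay exceeded, written a copy to OS Audit Trail
Tue Jan 14 02:12:31 2020
AUD: Audit Commit Delay exceeded, written a copy to OS Audit TrailAUD: Audit Commit Delay exceeded, written a copy to OS Audit Trail
Tue Jan 14 02:30:39 2020
Error occured while spawning process m001; error = 12751
""".splitlines()

for hour, n in sorted(delay_messages_per_hour(SAMPLE).items()):
    print(hour, n)
```

Running this over several days of log would show whether the messages cluster around one event or recur at similar hours.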
Update: the fix to unpublished bug 8642202 changes the behaviour as follows:
Audit Commit Delay increased to 15 seconds and enforced only when AUD$ is initialized for cleanup.
So if you have the fix for Bug 8642202, the delay is increased to 15 seconds. If you still get these messages, do not want them,
and are not using package DBMS_AUDIT_MGMT for cleanup, you can now disable this security feature by calling:
set serveroutput on
begin
  if dbms_audit_mgmt.IS_CLEANUP_INITIALIZED(
       audit_trail_type => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD) then
    dbms_audit_mgmt.DEINIT_CLEANUP(
      audit_trail_type => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD);
    dbms_output.put_line('DEINIT_CLEANUP for AUDIT_TRAIL_AUD_STD');
  end if;
end;
/
This may alleviate the problem in some cases but there can still be an underlying performance problem.
Bug 8642202 is fixed in patchset 10.2.0.5, PSU 11.2.0.1.1 and patchset 11.2.0.2 and future releases.
Merge patches that include this fix:
11.1.0.7: Patch 9821987
On Windows this is fixed in 11.1.0.7 patch bundle 40 and higher, see Note 161549.1 for more info.
References
BUG:8642202 - LX64: TOO MANY AUDIT FILES GENERATED, 500,000 AUD FILES AFTER 2 DAYS
NOTE:161549.1 - Oracle Database, Networking and Grid Agent Patches for Microsoft Platforms
NOTE:731908.1 - New Feature DBMS_AUDIT_MGMT To Manage And Purge Audit Information
NOTE:8642202.8 - Bug 8642202 - Lots of audit files due to "Audit Commit Delay exceeded"