Reposted from: http://blog.csdn.net/tianlesoftware/article/details/6015400
A few days ago a friend asked me a question. She said she had seen the following messages in her alert log:
Thread 1 cannot allocate new log, sequence 415
Private strand flush not complete
Current log# 4 seq# 414 mem# 0: /dev/rora_redotb04
Thread 1 advanced to log sequence 415
Current log# 5 seq# 415 mem# 0: /dev/rora_redotb05
Thu Nov 11 16:01:51 2010
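What I have run into myself is "Checkpoint not complete". For an explanation of that message, see my blog:
Redo Log and Checkpoint not complete
http://blog.csdn.net/tianlesoftware/archive/2009/12/01/4908066.aspx
I searched the Oracle support site and found this note: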
Alert Log Messages: Private Strand Flush Not Complete [ID 372557.1]
Modified 01-SEP-2010 Type PROBLEM Status MODERATED
In this Document
Symptoms
Cause
Solution
References
This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process and therefore has not been subject to an independent technical review.
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.1.0.7 - Release: 10.2 to 11.1
Information in this document applies to any platform.
"Checked for relevance on 04-Dec-2007"
Private strand flush not complete
"Private strand flush not complete" messages are being populated to the alert log for unknown reasons.
ALERT LOG EXAMPLE:
>>>
Fri May 19 12:47:29 2006
Thread 1 cannot allocate new log, sequence 18358
Private strand flush not complete
Current log# 7 seq# 18357 mem# 0: /u03/oradata/bitst/redo07.log
Thread 1 advanced to log sequence 18358
Current log# 8 seq# 18358 mem# 0: /u03/oradata/bitst/redo08.log
<<<
The message means that we haven't completed writing all the redo information to the log when we are trying to switch. It is similar in nature to a "checkpoint not complete" except that it only involves the redo being written to the log. The log switch cannot occur until all of the redo has been written.
A "strand" is new terminology for 10g and it deals with latches for redo.
Strands are a mechanism to allow multiple allocation latches for processes to write redo more efficiently in the redo buffer, and they are related to the log_parallelism parameter present in 9i.
The concept of a strand is to ensure that the redo generation rate for an instance is optimal and that when there is some kind of redo contention then the number of strands is dynamically adjusted to compensate.
The initial number of strands depends on the number of CPUs and starts at 2 strands, with one strand used for active redo generation.
For large-scale enterprise systems the amount of redo generated is large, so these strands are *made active* as and when the foreground processes encounter redo contention (allocation latch related contention); this is where the concept of dynamic strands comes into play.
There are always shared strands and a number of private strands.
Oracle 10g has some major changes in the mechanisms for redo (and undo), which seem to be aimed at reducing contention.
Instead of redo being recorded in real time, it can be recorded 'privately' and pumped into the redo log buffer on commit.
Similarly, the undo can be generated as 'in memory undo' and applied in bulk.
This affects the memory used for redo management and the possibility to flush it in pieces.
The message you get is related to internal Cache Redo File management.
You can disregard these messages as normal messages.
When you switch logs all private strands have to be flushed to the current log before the switch is allowed to proceed.
These messages are not a cause for concern unless there is a significant gap in seq# between the "cannot allocate new log" message and the "advanced to log sequence" message.
This issue is in fact not a bug and is expected behavior.
In some cases, this message can be resolved by increasing the db_writer_processes value.
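The note's rule of thumb (the messages only matter when there is a significant gap in seq# between the "cannot allocate new log" message and the "advanced to log sequence" message) can be checked mechanically. Below is a minimal Python sketch that pairs the two messages and reports the gap. It assumes the alert log lines look exactly like the excerpts quoted in this post; the function name find_switch_gaps and the shape of the result are my own choices for illustration, not an Oracle tool.

```python
import re

# Patterns follow the alert log lines quoted in this post.
CANNOT_ALLOCATE = re.compile(r"Thread (\d+) cannot allocate new log, sequence (\d+)")
ADVANCED = re.compile(r"Thread (\d+) advanced to log sequence (\d+)")


def find_switch_gaps(lines):
    """Pair each 'cannot allocate new log' message with the next
    'advanced to log sequence' message of the same thread."""
    pending = {}          # thread -> {"wanted": seq, "reason": text or None}
    needs_reason = None   # thread whose reason line is expected next
    results = []
    for raw in lines:
        line = raw.strip()
        blocked = CANNOT_ALLOCATE.search(line)
        advanced = ADVANCED.search(line)
        if blocked:
            thread = int(blocked.group(1))
            pending[thread] = {"wanted": int(blocked.group(2)), "reason": None}
            needs_reason = thread
        elif advanced:
            thread = int(advanced.group(1))
            entry = pending.pop(thread, None)
            if entry is not None:
                seq = int(advanced.group(2))
                results.append({
                    "thread": thread,
                    "reason": entry["reason"],
                    "wanted": entry["wanted"],
                    "advanced_to": seq,
                    "gap": seq - entry["wanted"],  # seq# gap named in the note
                })
            needs_reason = None
        elif needs_reason is not None:
            # The line right after "cannot allocate" carries the reason,
            # e.g. "Private strand flush not complete".
            pending[needs_reason]["reason"] = line
            needs_reason = None
    return results


sample = """Thread 1 cannot allocate new log, sequence 415
Private strand flush not complete
Current log# 4 seq# 414 mem# 0: /dev/rora_redotb04
Thread 1 advanced to log sequence 415
Current log# 5 seq# 415 mem# 0: /dev/rora_redotb05""".splitlines()

report = find_switch_gaps(sample)
for item in report:
    print(item)
```

For my friend's excerpt the requested and the reached sequence are both 415, so the gap is 0, which is the harmless case the note describes.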
This touches on some of the redo mechanisms. For details, see my blog:
Oracle Redo parallel mechanism
http://blog.csdn.net/tianlesoftware/archive/2010/11/17/6014898.aspx
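A redo entry contains all the information about the database changes made by the corresponding operation, and every redo entry must eventually be written to the redo files. The redo log buffer is a region of memory allocated in the SGA so that redo file I/O does not become a performance bottleneck. A redo entry is first generated in user memory (PGA), then copied into the log buffer by the Oracle server process, and, once certain conditions are met, written to the redo files by the LGWR process.
Because the log buffer is "shared" memory, it is protected by the redo allocation latch to avoid conflicts: each server process must acquire this latch before it can allocate space in the redo buffer. This is why waits on the redo allocation latch are commonly observed in highly concurrent OLTP systems with frequent data modifications.
To reduce redo allocation latch waits, Oracle 9.2 introduced a parallel mechanism for the log buffer. The basic idea is to divide the log buffer into several smaller buffers, called shared strands. Each strand is protected by its own redo allocation latch. With multiple shared strands, redo buffer allocation, which used to be serialized, becomes a parallel process, which reduces redo allocation latch waits.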
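To make the shared strand idea concrete, here is a toy Python model: a log buffer split into several strands, each guarded by its own lock standing in for a redo allocation latch. This is only a conceptual sketch. The class names, the "take the first free latch" policy, and the fallback of waiting on a strand picked by thread id are my assumptions for illustration; the post does not say how Oracle actually chooses a strand.

```python
import threading


class SharedStrand:
    """One slice of the log buffer with its own latch (toy model)."""

    def __init__(self, strand_id):
        self.strand_id = strand_id
        self.latch = threading.Lock()  # stands in for a redo allocation latch
        self.entries = []


class StrandedLogBuffer:
    """Log buffer divided into several independently latched strands."""

    def __init__(self, strand_count):
        self.strands = [SharedStrand(i) for i in range(strand_count)]

    def write(self, redo_entry):
        # Assumed policy: take any strand whose latch is free right now.
        for strand in self.strands:
            if strand.latch.acquire(blocking=False):
                try:
                    strand.entries.append(redo_entry)
                finally:
                    strand.latch.release()
                return strand.strand_id
        # Every latch is busy: wait on one strand instead of spinning.
        strand = self.strands[threading.get_ident() % len(self.strands)]
        with strand.latch:
            strand.entries.append(redo_entry)
        return strand.strand_id


def worker(buffer, name, count):
    for i in range(count):
        buffer.write(f"{name}-change-{i}")


log_buffer = StrandedLogBuffer(strand_count=2)
threads = [
    threading.Thread(target=worker, args=(log_buffer, f"session{n}", 1000))
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total_entries = sum(len(s.entries) for s in log_buffer.strands)
print(total_entries)
```

With a single strand every session would queue on the same lock; with several strands a session only waits when all latches happen to be held at once, which is the effect the 9.2 change was after.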
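To further reduce redo buffer contention, 10g introduced a new strand mechanism: the private strand. Private strands are not carved out of the log buffer; they are memory areas allocated in the shared pool.
The introduction of private strands changed Oracle's redo/undo mechanism considerably. Each private strand is protected by its own redo allocation latch, and each one, being "private", serves only a single active transaction. A transaction that has obtained a private strand generates its redo in the private strand rather than in the PGA, and when the private strand is flushed or the transaction commits, the contents of the private strand are written to the log file in bulk. If a new transaction cannot get the redo allocation latch of any private strand, it falls back to the old redo buffer mechanism and requests space in a shared strand. Whether a transaction is using a private strand can be determined from the newly added 13th bit of the ktcxbflg column of x$ktcxb.
A transaction that uses a private strand does not need to acquire the Redo Copy Latch first, nor the redo allocation latch of a shared strand; instead, its redo is written to disk in bulk at flush or commit time. This reduces the number of gets and releases of the Redo Copy Latch and the redo allocation latch, reduces waits on these latches, and therefore lowers CPU load.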
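The private strand path and its fallback can also be sketched as a toy model in Python. Everything here is hypothetical and only mirrors the description above: the class RedoModel, its method names, and the shared_latch_gets counter are my own, and the model ignores real latching, LGWR scheduling and undo.

```python
class RedoModel:
    """Toy model: private strands with fallback to a shared strand."""

    def __init__(self, private_strand_count):
        self.free_private = list(range(private_strand_count))
        self.private = {}          # txn id -> (strand id, buffered redo)
        self.shared_strand = []    # redo copied entry by entry (old path)
        self.log_file = []         # redo that has reached the log file
        self.shared_latch_gets = 0 # illustrative counter, not an Oracle statistic

    def begin(self, txn_id):
        """Try to give the transaction a private strand."""
        if self.free_private:
            self.private[txn_id] = (self.free_private.pop(), [])
            return True
        return False               # falls back to the shared strand

    def change(self, txn_id, redo_entry):
        if txn_id in self.private:
            # Redo is buffered privately; no shared latch is needed.
            self.private[txn_id][1].append(redo_entry)
        else:
            # Old mechanism: one latch get per redo entry.
            self.shared_latch_gets += 1
            self.shared_strand.append(redo_entry)

    def flush_private(self, txn_id):
        """Write one private strand to the log in bulk and free it."""
        strand_id, entries = self.private.pop(txn_id)
        self.log_file.extend(entries)
        self.free_private.append(strand_id)

    def flush_all_private(self):
        """What has to finish before a log switch can proceed."""
        for txn_id in list(self.private):
            self.flush_private(txn_id)

    def commit(self, txn_id):
        if txn_id in self.private:
            self.flush_private(txn_id)
        # Stand-in for LGWR draining the shared strand.
        self.log_file.extend(self.shared_strand)
        self.shared_strand.clear()


model = RedoModel(private_strand_count=1)
got_t1 = model.begin("T1")   # gets the only private strand
got_t2 = model.begin("T2")   # none left, uses the shared strand
for i in range(3):
    model.change("T1", f"T1-redo-{i}")
    model.change("T2", f"T2-redo-{i}")
model.commit("T1")
model.commit("T2")
print(got_t1, got_t2, model.shared_latch_gets, len(model.log_file))
```

In this run only T2 pays a latch get per change, while T1's three entries arrive in the log in one batch. The flush_all_private step is the part that, when it has not finished at log switch time, would correspond to the "Private strand flush not complete" message.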
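With this theory in mind, let's look again at the earlier message:
Private strand flush not complete
When we flush or commit, the contents of the buffer must first be written to the redo log before the buffer can accept new records. The message is raised during this process. Oracle gives two ways to deal with it:
(1) Ignore it. The buffer contents must finish flushing before the buffer can be used again, and the process hangs briefly while that happens.
(2) Increase the value of db_writer_processes.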