Subscribe to the 《Microsoft SQL Server 2008 Internals》 reading notes:
http://www.cnblogs.com/downmoon/category/230397.html/rss
Index of the 《Microsoft SQL Server 2008 Internals》 reading notes:
《Microsoft SQL Server 2008 Internals》 Reading Notes: Table of Contents
■Deleting Rows
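The previous post looked at how SQL Server stores data internally when rows are inserted. This post looks at what happens internally when rows are deleted.
When you delete rows from a table, you have to consider what happens to both the data pages and the index pages. Remember that the data is actually the leaf level of a clustered index, so deleting rows from a table with a clustered index works the same way as deleting rows from the leaf level of a nonclustered index. Deleting rows from a heap is handled differently, as is deleting rows from the non-leaf pages of an index.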
■Deleting Rows From a Heap
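When a row is deleted from a heap, SQL Server 2008 does not automatically compact the space on the page. As a performance optimization, compaction is deferred until the page needs additional contiguous space to insert a new row. In the following example, we delete a row from the middle of a page and then use DBCC PAGE to inspect the result.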
USE TestDb;
go
CREATE TABLE smallrows
(
a int identity,
b char(10)
);
go
INSERT INTO smallrows
VALUES ('row 1');
INSERT INTO smallrows
VALUES ('row 2');
INSERT INTO smallrows
VALUES ('row 3');
INSERT INTO smallrows
VALUES ('row 4');
INSERT INTO smallrows
VALUES ('row 5');
go
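TRUNCATE TABLE sp_tablepages;
INSERT INTO sp_tablepages
EXEC ('DBCC IND (TestDb, smallrows, -1)' );
SELECT PageFID, PagePID
FROM sp_tablepages
WHERE PageType = 1;
-- RESULTS: (Yours may vary; these values match the DBCC PAGE call below.)
-- PageFID PagePID
-- ------- -----------
-- 1 262
DBCC TRACEON(3604);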
go
-- Be sure to enter YOUR PagePID:
DBCC PAGE(TestDb, 1, 262, 1);
go
DBCC PAGE output:
BUFFER:
BUF @0x03E596E8
bpage = 0x0CA4C000 bhash = 0x00000000 bpageno = (1:262)
bdbid = 15 breferences = 0 bUse1 = 35486
bstat = 0x1c0000b blog = 0x212121bb bnext = 0x00000000
PAGE HEADER:
Page @0x0CA4C000
m_pageId = (1:262) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0x4 m_level = 0 m_flagBits = 0x8000
m_objId (AllocUnitId.idObj) = 105 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594044809216
Metadata: PartitionId = 72057594043236352 Metadata: IndexId = 0
Metadata: ObjectId = 1493580359 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 18 m_slotCnt = 5 m_freeCnt = 7981
m_freeData = 201 m_reservedCnt = 0 m_lsn = (41:433:3)
m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0
m_tornBits = 0
Allocation Status
GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED
PFS (1:1) = 0x61 MIXED_EXT ALLOCATED 50_PCT_FULL DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED
DATA:
Slot 0, Offset 0x60, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x642FC060
00000000: 10001200 01000000 726f7720 31202020 †........row 1
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 1, Offset 0x75, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x642FC075
00000000: 10001200 02000000 726f7720 32202020 †........row 2
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 2, Offset 0x8a, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x642FC08A
00000000: 10001200 03000000 726f7720 33202020 †........row 3
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 3, Offset 0x9f, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x642FC09F
00000000: 10001200 04000000 726f7720 34202020 †........row 4
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 4, Offset 0xb4, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x642FC0B4
00000000: 10001200 05000000 726f7720 35202020 †........row 5
00000010: 20200200 00†††††††††††††††††††††††††† ...
OFFSET TABLE:
Row - Offset
4 (0x4) - 180 (0xb4)
3 (0x3) - 159 (0x9f)
2 (0x2) - 138 (0x8a)
1 (0x1) - 117 (0x75)
0 (0x0) - 96 (0x60)
Now we delete the middle row (WHERE a = 3) and look at the page again:
DELETE FROM smallrows
WHERE a = 3;
go
-- Be sure to enter YOUR PagePID:
DBCC PAGE(TestDb, 1, 262, 1);
go
Here is the DBCC PAGE output:
BUFFER:
BUF @0x03E596E8
bpage = 0x0CA4C000 bhash = 0x00000000 bpageno = (1:262)
bdbid = 15 breferences = 0 bUse1 = 35691
bstat = 0x1c0000b blog = 0x212121bb bnext = 0x00000000
PAGE HEADER:
Page @0x0CA4C000
m_pageId = (1:262) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0x4 m_level = 0 m_flagBits = 0x8008
m_objId (AllocUnitId.idObj) = 105 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594044809216
Metadata: PartitionId = 72057594043236352 Metadata: IndexId = 0
Metadata: ObjectId = 1493580359 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 18 m_slotCnt = 5 m_freeCnt = 8002
m_freeData = 201 m_reservedCnt = 21 m_lsn = (41:451:2)
m_xactReserved = 21 m_xdesId = (0:2038) m_ghostRecCnt = 0
m_tornBits = 0
Allocation Status
GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED
PFS (1:1) = 0x61 MIXED_EXT ALLOCATED 50_PCT_FULL DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED
DATA:
Slot 0, Offset 0x60, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x63C4C060
00000000: 10001200 01000000 726f7720 31202020 †........row 1
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 1, Offset 0x75, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x63C4C075
00000000: 10001200 02000000 726f7720 32202020 †........row 2
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 3, Offset 0x9f, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x63C4C09F
00000000: 10001200 04000000 726f7720 34202020 †........row 4
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 4, Offset 0xb4, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x63C4C0B4
00000000: 10001200 05000000 726f7720 35202020 †........row 5
00000010: 20200200 00†††††††††††††††††††††††††† ...
OFFSET TABLE:
Row - Offset
4 (0x4) - 180 (0xb4)
3 (0x3) - 159 (0x9f)
2 (0x2) - 0 (0x0)
1 (0x1) - 117 (0x75)
0 (0x0) - 96 (0x60)
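Look at the row offsets. The row in slot 2 (the row with a = 3) no longer appears in the DATA section, and its entry in the offset table is now 0, which means that no row uses slot 2. The row in slot 3 is at the same offset (0x9f) as before the delete, so SQL Server has not compacted the data on the page.
In addition to space on a page not being reclaimed, empty pages in a heap frequently cannot be reclaimed either. Even if you delete all the rows from a heap, SQL Server does not mark the empty pages as unallocated, so the space is not available to other objects. The dynamic management view sys.dm_db_partition_stats still reports the space as belonging to the heap.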
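One way to check this is to query sys.dm_db_partition_stats for the table. The following is a minimal sketch that assumes the smallrows heap from this example still exists in TestDb; for a heap, index_id is 0. On a one-page table like this one the counts stay small either way, so the effect is easier to see on a heap that spans many pages.

```sql
USE TestDb;
go
-- Page and row counts that SQL Server still attributes to the heap.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       ps.index_id,                  -- 0 = heap
       ps.row_count,
       ps.in_row_data_page_count,    -- in-row data pages in use
       ps.used_page_count,
       ps.reserved_page_count
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'smallrows');
go
```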
■Deleting Rows From a B-tree
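At the leaf level of an index, whether clustered or nonclustered, deleted rows are marked as ghost records. The row stays on the page, but a bit in the row header is changed to indicate that the row is really a ghost. The page header also records the number of ghost records on the page. Ghost records serve several purposes. They make rollbacks more efficient: if the row has not been physically removed, rolling back a delete only requires resetting the bit that marks the row as a ghost. They are also a concurrency optimization for key-range locking (covered in Chapter 10 of the book), along with other locking modes. Finally, they support row-level versioning, which Chapter 10 also covers.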
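The ghost bit can be read straight from the first byte of a record in a DBCC PAGE dump. The sketch below assumes the usual layout of that status byte, in which bits 1 to 3 hold the record type (0 = primary record, 6 = ghost data record) and bit 4 indicates a NULL bitmap. The two byte values are the ones that start the live rows (0x10) and the ghosted row (0x1c) in the dumps in this post.

```sql
-- Decode the first status byte of a record:
-- bits 1-3 = record type, bit 4 = NULL bitmap present.
SELECT v.label,
       (v.status_byte & 14) / 2  AS record_type,     -- mask 0x0E, then shift right one bit
       (v.status_byte & 16) / 16 AS has_null_bitmap  -- mask 0x10
FROM (VALUES ('live row',  CAST(0x10 AS int)),
             ('ghost row', CAST(0x1C AS int))) AS v(label, status_byte);
-- live row:  record_type = 0 (primary record),    has_null_bitmap = 1
-- ghost row: record_type = 6 (ghost data record), has_null_bitmap = 1
```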
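Ghost records are cleaned up sooner or later, depending on the load on the system, and sometimes they are cleaned up before you have a chance to inspect them. A background thread called the ghost-cleanup thread removes ghost records that are no longer needed to support active transactions (or any other feature). In the code here, if you run the DELETE and wait a minute or two before running DBCC PAGE, the ghost record might already be gone.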
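The stored procedure sp_clean_db_free_space removes all ghost records from an entire database (as long as they are not part of an uncommitted transaction), and sp_clean_db_file_free_space does the same for a single file of a database.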
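Calling these procedures might look like the following. This is a sketch that assumes the TestDb database used in this post and a data file with file ID 1; both procedures also accept an optional @cleaning_delay argument, which is omitted here.

```sql
-- Clean ghost records in every data file of the database.
EXEC sp_clean_db_free_space @dbname = N'TestDb';
go
-- Clean a single file only (file ID 1 is an assumption; check sys.database_files).
EXEC sp_clean_db_file_free_space @dbname = N'TestDb', @fileid = 1;
go
```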
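The following example uses a table like the previous one, but with a primary key declared, which means a clustered index is built. The data is now the leaf level of the clustered index, so a deleted row is marked as a ghost.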
go
DROP TABLE smallrows;
go
CREATE TABLE smallrows
(
a int identity PRIMARY KEY,
b char(10)
);
go
INSERT INTO smallrows
VALUES ('row 1');
INSERT INTO smallrows
VALUES ('row 2');
INSERT INTO smallrows
VALUES ('row 3');
INSERT INTO smallrows
VALUES ('row 4');
INSERT INTO smallrows
VALUES ('row 5');
go
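TRUNCATE TABLE sp_tablepages;
INSERT INTO sp_tablepages
EXEC ('DBCC IND (TestDb, smallrows, -1)' );
SELECT PageFID, PagePID
FROM sp_tablepages
WHERE PageType = 1;
-- RESULTS: (Yours may vary.)
-- PageFID PagePID
-- ------- -----------
-- 1 269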
DBCC TRACEON(3604);
go
-- Be sure to enter YOUR PagePID:
DBCC PAGE(TestDb, 1, 269, 1);
go
-- Next, we delete the middle row (WHERE a = 3) and look at
-- the page again:
DELETE FROM smallrows
WHERE a = 3;
go
-- Be sure to enter YOUR PagePID:
DBCC PAGE(TestDb, 1, 269, 1);
go
Results (DBCC PAGE output after the delete):
BUFFER:
BUF @0x03E607B8
bpage = 0x0CD44000 bhash = 0x00000000 bpageno = (1:269)
bdbid = 15 breferences = 3 bUse1 = 45474
bstat = 0x1c0000b blog = 0x212159bb bnext = 0x00000000
PAGE HEADER:
Page @0x0CD44000
m_pageId = (1:269) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0x4 m_level = 0 m_flagBits = 0x0
m_objId (AllocUnitId.idObj) = 106 m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594044874752
Metadata: PartitionId = 72057594043301888 Metadata: IndexId = 1
Metadata: ObjectId = 1509580416 m_prevPage = (0:0) m_nextPage = (0:0)
pminlen = 18 m_slotCnt = 5 m_freeCnt = 7981
m_freeData = 201 m_reservedCnt = 0 m_lsn = (41:504:2)
m_xactReserved = 0 m_xdesId = (0:2054) m_ghostRecCnt = 1
m_tornBits = 0
Allocation Status
GAM (1:2) = ALLOCATED SGAM (1:3) = ALLOCATED
PFS (1:1) = 0x68 MIXED_EXT ALLOCATED 0_PCT_FULL DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED
DATA:
Slot 0, Offset 0x60, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x63C4C060
00000000: 10001200 01000000 726f7720 31202020 †........row 1
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 1, Offset 0x75, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x63C4C075
00000000: 10001200 02000000 726f7720 32202020 †........row 2
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 2, Offset 0x8a, Length 21, DumpStyle BYTE
Record Type = GHOST_DATA_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x63C4C08A
00000000: 1c001200 03000000 726f7720 33202020 †........row 3
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 3, Offset 0x9f, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x63C4C09F
00000000: 10001200 04000000 726f7720 34202020 †........row 4
00000010: 20200200 00†††††††††††††††††††††††††† ...
Slot 4, Offset 0xb4, Length 21, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP Record Size = 21
Memory Dump @0x63C4C0B4
00000000: 10001200 05000000 726f7720 35202020 †........row 5
00000010: 20200200 00†††††††††††††††††††††††††† ...
OFFSET TABLE:
Row - Offset
4 (0x4) - 180 (0xb4)
3 (0x3) - 159 (0x9f)
2 (0x2) - 138 (0x8a)
1 (0x1) - 117 (0x75)
0 (0x0) - 96 (0x60)
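Note that the row still appears on the page itself (using DBCC PAGE style 1), because the table has a clustered index. With a different output style, the deleted row may be listed as an empty slot, as a GHOST_DATA_RECORD, or as both. The header information for the row also shows that it is a ghost record. The slot array at the bottom of the page shows that slot 2 still uses the same offset and that all rows are in the same position as before the delete. In addition, the page header value m_ghostRecCnt gives the number of ghost records on the page. To see the total number of ghost records in a table, you can use the sys.dm_db_index_physical_stats function.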
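A query along these lines returns the ghost record count per index level. It is a sketch that assumes the TestDb database and the smallrows table from this example; DETAILED mode is used because ghost_record_count is NULL in the default LIMITED mode. If the ghost cleanup thread has already run, the count will be 0.

```sql
USE TestDb;
go
-- One row per index level; ghost_record_count counts rows marked as ghosts.
SELECT index_id,
       index_level,
       page_count,
       record_count,
       ghost_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(N'TestDb'), OBJECT_ID(N'smallrows'), NULL, NULL, 'DETAILED');
go
```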
A more detailed discussion is available here:
http://www.SQLskills.com/BLOGS/PAUL/post/Inside-the-Storage-Engineer-Ghost-cleanup-in-depth.aspx
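Because ghost records that belong to an uncommitted transaction are not cleaned up, one way to keep a ghost on the page long enough to inspect it is to run the delete inside an explicit transaction. The following is a sketch, not part of the original walkthrough: it assumes the clustered smallrows table above, trace flag 3604 already turned on, and page 269 (substitute your own PagePID). The row a = 4 is an arbitrary choice.

```sql
BEGIN TRAN;
DELETE FROM smallrows
WHERE a = 4;
-- The transaction is still open, so the row should remain on the page as a ghost.
DBCC PAGE(TestDb, 1, 269, 1);
-- Rolling back only has to reset the ghost bit; the row was never physically removed.
ROLLBACK TRAN;
go
-- After the rollback the record should be reported as a normal row again.
DBCC PAGE(TestDb, 1, 269, 1);
go
```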
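■Deleting Rows in the Non-leaf Levels of an Index
When you delete a row from a table, all nonclustered indexes must be maintained, because every nonclustered index has a pointer to the row that is now gone. Rows in the non-leaf pages of an index are not ghosted when they are deleted. Instead, just as with heap pages, the space is not compacted until new index rows need space on that page.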
■Reclaiming Pages
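When the last row is deleted from a data page, the entire page is deallocated by the ghost cleanup background thread. The exception is a heap, as discussed earlier. (If the page is the only one remaining in the table, it is not deallocated; a table always keeps at least one page, even when it is empty.) Deallocating a data page also deletes the row in the index page that pointed to it. A non-leaf index page is deallocated when deleting an index row (which an update can also cause when it is carried out as a delete followed by an insert) leaves only one entry on the page: that entry is moved to a neighboring page if there is room, and the empty page is then deallocated.
Deleting rows is relatively simple. The next post looks at updating rows.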