(Check out my Pluralsight online training course: SQL Server: Detecting and Correcting Database Corruption.)

This is a question that I was sent over email that I thought would make a good post, and I’m waiting to do a user group presentation in Australia at 2am, so this will keep me busy :-)

Imagine you come to work in the morning and notice that some new rows have been entered into the msdb.dbo.suspect_pages table during the night. Usually the first thing you’re going to do is run DBCC CHECKDB, but if your database is a few TB, that could be several hours before you know where the problem is, and which table may have lost data. You want to find out which table is involved as soon as possible so you can explore your disaster recovery options.

Another scenario is troubleshooting a poorly performing query workload, running my script to look at the currently waiting threads using sys.dm_os_waiting_tasks, seeing a lot of PAGELATCH_EX waits and needing to figure out which table is involved from the page ID in the resource_description column in the sys.dm_os_waiting_tasks output.

Going back to the first scenario, getting the data from the suspect_pages table is easy:

1
2
SELECT * FROM [msdb].[dbo].[suspect_pages];
GO
database_id file_id     page_id              event_type  error_count last_update_date
----------- ----------- -------------------- ----------- ----------- -----------------------
6           1           295                  2           2           2014-09-25 01:18:22.910

Finding the table name requires first using DBCC PAGE. The syntax for DBCC PAGE is:

dbcc page ( {'dbname' | dbid}, filenum, pagenum [, printopt={0|1|2|3} ])

You can just use print option 0, as that just displays the page’s header. You also must enable trace flag 3604 to get any output from DBCC PAGE – it’s perfectly safe. So taking the values from our suspect_pages output, that gives us:

1
2
3
4
DBCC TRACEON (3604);
DBCC PAGE (6, 1, 295, 0);
DBCC TRACEOFF (3604);
GO
PAGE: (1:295)
 
BUFFER:
 
BUF @0x00000004FD8C7980
 
bpage = 0x00000004A2D14000          bhash = 0x0000000000000000          bpageno = (1:295)
bdbid = 6                           breferences = 0                     bcputicks = 0
bsampleCount = 0                    bUse1 = 55116                       bstat = 0x809
blog = 0x15ab215a                   bnext = 0x0000000000000000         
 
PAGE HEADER:
 
Page @0x00000004A2D14000
 
m_pageId = (1:295)                  m_headerVersion = 17                m_type = 17
m_typeFlagBits = 0x0                m_level = 0                         m_flagBits = 0x8200
m_objId (AllocUnitId.idObj) = 84    m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594043432960
Metadata: PartitionId = 72057594039042048                                Metadata: IndexId = 0
Metadata: ObjectId = 245575913      m_prevPage = (0:0)                  m_nextPage = (0:0)
pminlen = 8008                      m_slotCnt = 1                       m_freeCnt = 83
m_freeData = 8107                   m_reservedCnt = 0                   m_lsn = (35:200:9)
m_xactReserved = 0                  m_xdesId = (0:0)                    m_ghostRecCnt = 0
m_tornBits = 1093512791             DB Frag ID = 1                     
 
Allocation Status
 
GAM (1:2) = ALLOCATED               SGAM (1:3) = ALLOCATED
PFS (1:1) = 0x64 MIXED_EXT ALLOCATED 100_PCT_FULL                        DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED

We’re interested in the output beginning Metadata:. Those fields are not stored on the page itself. When I rewrote DBCC PAGE for SQL Server 2005, I added the Metadata:output to make it easier to find the object and index ID that the page is part of (as these used to be the m_objId and m_indexId fields in SQL Server 7.0 and 2000).

The Metadata: ObjectId field is what we want. If you see it is 99, then stop as that means the damaged page is part of the allocation system and not part of a table and you’ll need to wait for DBCC CHECKDB to complete to know the extent of the damage.

If you see the ObjectId is 0, that means there was no metadata found. This could be because:

  • The table that the page was part of has been deleted since the page corruption was logged
  • The system catalogs are corrupt in some way
  • The page is corrupt and so incorrect values were used to look up the metadata

In any case, you’ll need to wait for DBCC CHECKDB to complete to know the extent of the damage.

If the ObjectId is not 0 or 99, we can plug it into the OBJECT_NAME function to give us the name of the table:

1
2
SELECT (245575913);
GO
--------------------------------------------------------------------------------------------------------------------------------
NULL

If you get the result above, then there are two possibilities:

  1. You are in the wrong database context
  2. The metadata for the database is corrupt, so wait for DBCC CHECKDB to complete

It’s most likely #1 that’s the problem, at least in my experience with helping people out. You can get the database name by taking the database_id in the suspect_pagesoutput and plugging it into the DB_NAME function. Go into the correct database context and try again.

1
2
3
4
USE [company];
GO
SELECT (245575913);
GO
--------------------------------------------------------------------------------------------------------------------------------
CustomerNames

So there you go – hope this helps!

PS1 Kenneth Fisher commented with some neat code that will do the job on 2012 and 2014 using the new sys.dm_db_database_page_allocations DMF – check it out. That will work as long as there aren’t any metadata or allocation bitmap corruptions.

PS2 Wayne Sheffield reminded me over email and in a comment that DBCC PAGE doesn’t need 3604 if you use the WITH TABLERESULTS option and programmatically crack the DBCC PAGE results.

Related Posts

命令语法信息

So what’s the syntax?

dbcc traceon(3604)

dbcc page ( {‘dbname’ | dbid}, filenum, pagenum [, printopt={0|1|2|3} ])

The filenum and pagenum parameters are taken from the page IDs that come from various system tables and appear in DBCC or other system error messages. A page ID of, say, (1:354) has filenum = 1 and pagenum = 354.

The printopt parameter has the following meanings:

  • 0 – print just the page header
  • 1 – page header plus per-row hex dumps and a dump of the page slot array (unless its a page that doesn’t have one, like allocation bitmaps)
  • 2 – page header plus whole page hex dump
  • 3 – page header plus detailed per-row interpretation