Server-side Query interception with MS SQL Server

Question

I'm researching how to intercept queries that arrive at the SQL Server 2008 process.

The SQLOS architecture is divided into the following system DLLs:

sqlmin.dll: Storage, replication, security features, etc.
sqllang.dll: Transact-SQL query execution engine, expression evaluation, etc.
sqldk.dll: Task scheduling and dispatch, worker thread creation, message loops, etc.

The SQLSERVR service process instantiates the SQLOS components through sqlboot.dll and sqldk.dll, and the worker threads receive queries through whichever connection method is selected on the server (TCP/IP, local shared memory, or named pipes).

I've debugged the sqlservr.exe process address space, searching for textual queries. The query strings appear to be readable in memory, but I could not find a point where queries can be intercepted as they enter the SQLOS scheduler.

Listening to pipes or TCP/IP is not an option at this moment; I would like to inject at a higher level, preferably at SQLOS-component level.

Any ideas on where to start looking?

Why are you ruling out the obvious first choices in your next to last paragraph? – 0xC0000022L Apr 5 '13 at 22:08
I'm with 0c00l, but my question is: why not enter a query with some thin shim program (an ODBC connector, for instance) and do a run trace, then connect with a socket and do another run trace (starting with recv or whatever)? Then check where the two traces begin to intersect to see if it's a viable approach. – RobotHumans Apr 6 '13 at 2:14
@0xC0000022L Hernan is obviously aware of the other alternatives. He wants to know specifically where to intercept the functions at a higher level (i.e., using hooks). It would be interesting to know, because I searched and couldn't find a resource about it, while there are a lot of resources about hooking other things. I only found a recommendation from Microsoft not to instrument SQL Server: "The use of third-party detours or similar techniques is not supported in SQL Server" support.microsoft.com/kb/920925 – sw. Apr 6 '13 at 15:29
@0xC0000022L, sorry if I was rude. I think that knowing how to hook SQL Server would be interesting because of what Hernan said: "I would like to inject at a higher level, preferably at SQLOS-component level." – sw. Apr 6 '13 at 15:49
Just like @sw says, I'm aware of other alternatives such as ODBC-level interception. To be more precise, I don't want to patch at multiple points (e.g., shared memory, pipes and TCP/IP I/O) but at a single point where all queries are scheduled for planning and execution, independent of the client-server interface or communication method. – Hernán Apr 8 '13 at 13:37

Brendan Dolan-Gavitt · Accepted Answer · 2013-04-15 01:05:42Z

This seemed like a fun project for a Sunday afternoon, so I had a go at it. To get straight to the point, here's the call stack, captured while SQL Server parses and then executes the query (addresses and offsets taken from SQL Server 2008 R2 running on Windows 7 SP1 32-bit):

0x7814500a msvcr80.i386!memcpy+0x5a
0x013aa370 sqlservr!CWCharStream::CwchGetWChars+0x5c
0x013a9db5 sqlservr!CSQLStrings::CbGetChars+0x35
0x012ffa50 sqlservr!CParser::FillBuffer+0x3d
0x0138bbfd sqlservr!CParser::CParser+0x3c8
0x01352e96 sqlservr!sqlpars+0x7b
0x013530f2 sqlservr!CSQLSource::FParse+0x16d
0x013531ed sqlservr!CSQLSource::FParse+0x268
0x012ff9e8 sqlservr!`string'+0x3c
0x015894b8 sqlservr!CSQLSource::Execute+0x2c8
0x0158ad31 sqlservr!process_request+0x2ac
0x0158a328 sqlservr!process_commands+0x15f
0x015cf8b4 sqlservr!SOS_Task::Param::Execute+0xdd
0x015cf9ea sqlservr!SOS_Scheduler::RunTask+0xb4
0x015cf575 sqlservr!SOS_Scheduler::IsShrinkWorkersNecessary+0x48
0x77f06854 ntdll!ZwSignalAndWaitForSingleObject+0xc
0x77e479e2 kernel32!SignalObjectAndWait+0x82

Based on this, you probably want to take a close look at the CSQLSource class, and particularly its Execute method.
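To make a stack like the one above easier to work with, the frames can be split into module, symbol, and offset. The following is a minimal Python sketch; the Frame type and the helper names are illustrative and not part of any tool mentioned here, and only a few of the frames are reproduced. Subtracting the offset from the frame address gives the start of the symbolized function, which is only meaningful for this exact build.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    address: int   # address recorded in the stack frame
    module: str    # e.g. "sqlservr"
    symbol: str    # e.g. "CSQLSource::Execute"
    offset: int    # distance from the start of the symbol

    @property
    def entry(self) -> int:
        """Start address of the symbolized function for this build."""
        return self.address - self.offset


# "0xADDR module!symbol+0xOFF" as printed in the call stack.
FRAME_RE = re.compile(r"^(0x[0-9a-fA-F]+)\s+([^!\s]+)!(.+)\+0x([0-9a-fA-F]+)$")


def parse_frames(text: str) -> List[Frame]:
    """Parse every line that looks like a stack frame; skip the rest."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append(
                Frame(int(m.group(1), 16), m.group(2), m.group(3), int(m.group(4), 16))
            )
    return frames


def frames_in_class(frames: List[Frame], cls: str) -> List[Frame]:
    """Return the frames whose symbol is a member of the given C++ class."""
    return [f for f in frames if f.symbol.startswith(cls + "::")]


# A subset of the frames from the stack above.
STACK = """\
0x013530f2 sqlservr!CSQLSource::FParse+0x16d
0x013531ed sqlservr!CSQLSource::FParse+0x268
0x012ff9e8 sqlservr!`string'+0x3c
0x015894b8 sqlservr!CSQLSource::Execute+0x2c8
0x0158ad31 sqlservr!process_request+0x2ac
"""

if __name__ == "__main__":
    for f in frames_in_class(parse_frames(STACK), "CSQLSource"):
        print(f"{f.symbol} starts at 0x{f.entry:08x}")
```

Filtering by class name is just a convenience for spotting candidate hook targets; whether a given function is a safe place to interpose still has to be checked in a disassembler.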

Armed with this information, I was also able to dig up a couple of blog posts by someone at Microsoft on how to extract the query string from a memory dump of SQL Server. Those posts seem to confirm that we're on the right track, and give you a place to interpose and a way to extract the query string.

Methodology

I felt like this would be most easily tackled using some form of Dynamic Binary Instrumentation (DBI); since we suspect the query string will be processed somewhere in the SQL Server process, we can look at memory reads and writes made by the process, searching for a point that reads or writes the query string. We can then dump the callstack at that point and see what interesting addresses show up, and map them back to symbols (since, as Rolf points out, SQL Server has debug symbols available). It really was basically as simple as that!
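The search step can be sketched independently of any particular instrumentation tool. The Python below assumes the DBI framework can export a trace of memory accesses as (pc, address, bytes) records; the Access type and that trace format are assumptions made for illustration, not the output of any specific tool. Because the stack goes through CWCharStream, the query is searched for as UTF-16LE as well as ASCII. Copies usually happen a few bytes at a time, so the sketch keeps a shadow copy of the touched memory and reports the access that completes the string.

```python
from typing import Dict, Iterable, List, NamedTuple


class Access(NamedTuple):
    pc: int       # address of the instruction performing the access
    addr: int     # first byte touched
    data: bytes   # bytes that were read or written


def needle_variants(query: str) -> List[bytes]:
    """Encodings of the query worth searching for (empty variants dropped)."""
    variants = [query.encode("utf-16-le"), query.encode("ascii", errors="ignore")]
    return [v for v in variants if v]


def known_runs(shadow: Dict[int, int], lo: int, hi: int) -> List[bytes]:
    """Contiguous runs of already-observed bytes inside [lo, hi)."""
    runs, cur = [], bytearray()
    for a in range(lo, hi):
        if a in shadow:
            cur.append(shadow[a])
        elif cur:
            runs.append(bytes(cur))
            cur = bytearray()
    if cur:
        runs.append(bytes(cur))
    return runs


def find_hits(trace: Iterable[Access], query: str) -> List[int]:
    """Return the pc of every access after which the query is visible nearby."""
    needles = needle_variants(query)
    if not needles:
        return []
    longest = max(len(n) for n in needles)
    shadow: Dict[int, int] = {}
    hits = []
    for acc in trace:
        for i, b in enumerate(acc.data):
            shadow[acc.addr + i] = b
        # Only look at the neighbourhood of the bytes just touched.
        lo = acc.addr - longest
        hi = acc.addr + len(acc.data) + longest
        if any(n in run for run in known_runs(shadow, lo, hi) for n in needles):
            hits.append(acc.pc)
    return hits
```

In a real setup each hit would trigger a call stack dump at that pc; here the function only returns the program counters so that the matching logic can be checked in isolation.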

Of course, the trick is having something around that lets you easily instrument a process. I solved this using a (hopefully soon-to-be-released) whole-system dynamic analysis framework based on QEMU; this let me avoid any unpleasantness involved in getting SQL Server to run under, e.g., PIN. Because the framework includes record and replay support, I also didn't have to worry about slowing down the server process with my instrumentation. Once I had the callstack, I used PDBParse to get the function names.
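The symbolization step amounts to finding, for each address, the nearest symbol that starts at or below it. A minimal Python sketch of that lookup follows. It does not use PDBParse's API: the symbol table is filled in by hand with start addresses derived from the stack above (frame address minus offset), and the SymbolTable class is purely illustrative.

```python
import bisect
from typing import Iterable, List, Tuple


class SymbolTable:
    """Maps addresses to 'module!symbol+0xoffset' by nearest preceding symbol."""

    def __init__(self, module: str, symbols: Iterable[Tuple[int, str]]):
        self.module = module
        self._entries: List[Tuple[int, str]] = sorted(symbols)
        self._starts = [start for start, _ in self._entries]

    def lookup(self, address: int) -> str:
        # Index of the last symbol whose start is <= address.
        i = bisect.bisect_right(self._starts, address) - 1
        if i < 0:
            return f"{self.module}+0x{address:x}"  # below the first known symbol
        start, name = self._entries[i]
        return f"{self.module}!{name}+0x{address - start:x}"


# Start addresses derived from the stack above; valid for that build only.
SQLSERVR = SymbolTable(
    "sqlservr",
    [
        (0x015891F0, "CSQLSource::Execute"),
        (0x0158A1C9, "process_commands"),
        (0x0158AA85, "process_request"),
    ],
)

if __name__ == "__main__":
    for addr in (0x015894B8, 0x0158AD31, 0x0158A328):
        print(f"0x{addr:08x} {SQLSERVR.lookup(addr)}")
```

Nearest-preceding-symbol lookup can attribute an address to the wrong function when the real function has no public symbol, so surprising names in a stack are worth double-checking.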

Just to clarify: does your method only work if the server runs inside QEMU? And a side question: can you actually modify the query? – Ange Apr 15 '13 at 10:15
The particular method I used only works if the server runs in QEMU, yes. But you could do the same thing with, e.g., PIN, by instrumenting memory reads/writes. I'm not sure if you could modify the query at this point; I saw at least one other place where it uses the query string to construct an MD5 hash (CSQLStrings::GenerateDurableSqlHandle+0x40), so you'd have to modify it before that point. – Brendan Dolan-Gavitt Apr 15 '13 at 21:55
Sophisticated, very interesting approach, and technically correct. Thank you for your time and help! – Hernán Apr 17 '13 at 1:05
NOTE about debugging symbols: SQL Server 2012 RTM (11.0.2100.60) has public debugging symbols. At this moment, I could not obtain PDBs for 11.0.3128.0, and I don't know whether they are available for SP1 (11.0.3000) yet. So keep this in mind when playing with SQL 2012 system DLLs. For build information, see sqlserverbuilds.blogspot.com.ar – Hernán Apr 17 '13 at 20:20
Following Brendan's answer, Hernan's working example is available at: github.com/nektra/SQLSvrIntercept – sw. Jun 27 '13 at 12:58

Community · Answer 2 · 2017-04-13 12:49:39Z

Sniffing traffic only ... is easy

If you merely wanted to sniff the traffic, you could use the TDS protocol sniffer that comes with Wireshark.

Let the laziness guide you - laziness is the reverser's friend

"Listening to pipes or TCP/IP is not an option at this moment; I would like to inject at a higher level, preferably at SQLOS-component level."

I don't know why you insist on doing this a particular way when all the information is readily available and all you need to do is put the jigsaw pieces together. Working at the network level would seem to be the easiest, fastest (in short: laziest) method. Besides, TCP/IP is the higher level, because you can intercept it even before it reaches the actual SQL Server machine if you can hijack the IP/name of the SQL server and put a "proxy" in between. How high level do you want it? What you insist on is actually drilling down into the lower-level guts of MS SQL Server.

MS SQL Server uses a documented protocol (TDS), and using a Winsock Layered Service Provider (LSP) you should be able to sniff, intercept and even manipulate that traffic. As far as I recall, LSPs run within the process space of the application whose traffic they're filtering. You can consider them a makeshift application-level firewall, literally.

Alternatively - and probably the better choice anyway - you could write a proxy based on the existing and free FreeTDS (licensed under LGPL). The tdspool program would be a good point to start this endeavor. And yes, this should be suitable for actual interception, not just sniffing forwarded traffic. You can use the library (FreeTDS) to decode and re-encode the queries. That library would also be the one to use inside your LSP, obviously.
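To give an idea of what a proxy has to decode, here is a minimal Python sketch that parses the 8-byte TDS packet header and pulls the text out of a single-packet SQL batch. It does not use FreeTDS; the field layout follows my reading of the public MS-TDS specification and should be verified against it. The function names are illustrative, and the handling of the ALL_HEADERS prefix (present in TDS 7.2 and later) is an assumption exposed as a flag. Batches that span several packets would have to be reassembled first, which is not shown.

```python
import struct
from typing import NamedTuple

TDS_SQL_BATCH = 0x01   # packet type carrying a plain SQL batch
TDS_STATUS_EOM = 0x01  # status bit: last packet of the message


class TdsHeader(NamedTuple):
    ptype: int      # packet type
    status: int     # status flags
    length: int     # total packet length, header included (big-endian)
    spid: int       # server process id
    packet_id: int  # packet sequence number
    window: int     # unused


def parse_header(packet: bytes) -> TdsHeader:
    """Decode the fixed 8-byte header at the start of a TDS packet."""
    if len(packet) < 8:
        raise ValueError("TDS packet shorter than its 8-byte header")
    return TdsHeader(*struct.unpack(">BBHHBB", packet[:8]))


def sql_batch_text(packet: bytes, has_all_headers: bool = True) -> str:
    """Extract the query text from a single-packet SQL batch.

    has_all_headers: assume the payload starts with an ALL_HEADERS block
    whose first little-endian DWORD is its total length (itself included).
    """
    hdr = parse_header(packet)
    if hdr.ptype != TDS_SQL_BATCH:
        raise ValueError(f"not a SQL batch packet (type 0x{hdr.ptype:02x})")
    if not hdr.status & TDS_STATUS_EOM:
        raise ValueError("multi-packet batch: reassembly is not handled here")
    payload = packet[8:hdr.length]
    if has_all_headers:
        if len(payload) < 4:
            raise ValueError("payload too short for ALL_HEADERS")
        (total,) = struct.unpack_from("<I", payload, 0)
        if total < 4 or total > len(payload):
            raise ValueError("implausible ALL_HEADERS length")
        payload = payload[total:]
    return payload.decode("utf-16-le")  # query text is sent as UTF-16LE
```

A proxy that wants to modify a query would re-encode the text and rewrite the length field in the header before forwarding; encrypted connections need the TLS layer terminated first.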

I'll spare myself going into the details of the disassembly, although I installed MS SQL Server 2008 and briefly looked at it in IDA Pro. Brendan's answer provides a good overview, even if I disagree with this overly involved method when an easier one is available. But then, you (Hernán) asked for it.

Rolf Rolles · Answer 3 · 2013-04-13 00:19:47Z

In general, what I would say is that problems like this one are application-specific. Therefore, despite the fact that the user broadway was down-voted for his answer, it was exactly the same advice I'd give if I wasn't aware of any nice, special solutions specific to the problem. What you're going to have to do is watch the data come into the process and then follow it as it is copied and manipulated throughout the program. This task will be easier than the general case owing to the fact that debug symbols are available for SQL Server. Have you attempted anything along these lines? Say, setting a breakpoint on network receive-type functions in the context of SQL Server, setting a hardware RW breakpoint on the data that comes in over the network, and then watching how the data moves through the mass of code?

Yeah, if there is some application-specific knowledge someone already has for this case, then perhaps they'll reveal it, though short of that, this is the obvious path of investigation. However, SQL servers may have fairly complex systems that execute SQL statements, so reversing may be time-consuming. Or maybe it's simple. One imagines some function that accepts the SQL statement as input ;). – bitsum Apr 14 '13 at 17:31

broadway · Answer 4 · 2013-04-11 14:34:55Z

I don't have any specific knowledge about that target, but the approach I would probably take is to send the same message over a pipe, TCP, and shared memory and trace each run with Pin. Looking for where the basic-block hits of all the traces converge should give you some starting points for fine-tuning a good injection point.
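The comparison step could look something like the Python sketch below. It assumes each run has already been reduced to an ordered list of basic-block start addresses (for example by a Pin tool, which is not shown); the trace format, the function name, and the optional baseline of blocks seen on an idle server are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional


def convergence_points(
    traces: Dict[str, List[int]],
    baseline: Optional[Iterable[int]] = None,
) -> List[int]:
    """Basic blocks hit by every transport, in order of first execution.

    traces:   transport name -> ordered basic-block addresses for one query
    baseline: blocks also executed without any query (scheduler noise, etc.)
    """
    if not traces:
        return []
    names = list(traces)
    common = set(traces[names[0]])
    for name in names[1:]:
        common &= set(traces[name])
    if baseline is not None:
        common -= set(baseline)
    # Keep the execution order of the first trace, without duplicates.
    ordered, seen = [], set()
    for block in traces[names[0]]:
        if block in common and block not in seen:
            seen.add(block)
            ordered.append(block)
    return ordered


if __name__ == "__main__":
    # Toy addresses: transport-specific receive code, then shared code.
    runs = {
        "tcp": [0xA1, 0xA2, 0x100, 0x200, 0x100],
        "pipe": [0xB1, 0x100, 0x200],
        "shm": [0xC1, 0xC2, 0x100, 0x200],
    }
    print([hex(b) for b in convergence_points(runs)])
```

The first few blocks in the result are the candidates worth inspecting by hand; subtracting a baseline trace helps because plenty of code (scheduling, locking, logging) is shared by every run regardless of the query.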

Maybe it's just me, but I can't follow your solution. Perhaps you could flesh this out with links, more information, descriptions, and specifics? – Lizz Apr 12 '13 at 0:49

I think he is just saying execute SQL commands, make notes of the code that seems to be executing, dive into that code, and trace it back to the first function in the chain of SQL command execution. Of course, who knows if it's 'that simple', it probably has a fairly complex execution system. Anyway, not my answer, just clarifying what I think I read. – bitsum Apr 14 '13 at 17:30

Not exactly. I am saying use each of the input types to input the same query and examine the path through the program. Pin is a very accessible tool to do this sort of thing (although of course it's not the only one). – broadway Apr 14 '13 at 19:21