I'm researching into intercepting queries that arrive at the SQL Server 2008 process. SQLOS architecture is divided in the following system DLLs:
SQLSERVR service process instances the SQLOS components through sqlboot.dll and sqldk.dll, and the worker threads receive queries through the selected connection method in the server (TCP/IP, local shared memory or named-pipes). I've debugged the sqlservr.exe process address space searching for textual queries. It seems that query strings are readable, but I could not find a point where queries can be intercepted while they enter the SQLOS scheduler. Listening to pipes or TCP/IP is not an option at this moment; I would like to inject at a higher level, preferably at SQLOS-component level. Any idea on where to start looking into? |
|||||||||||||||||||||
|
This seemed like a fun project for a Sunday afternoon, so I had a go at it. To get straight to the point, here's the call stack for a function in SQL server that parses and then executes the query (addresses and offsets taken from SQL Server 2008 R2 running on Windows 7 SP1 32-bit):
Based on this, you probably want to take a close look at the Armed with this information, I was also able to dig up a couple blog posts by someone at Microsoft on how to extract the query string from a memory dump of SQL Server. That post seems to confirm that we're on the right track, and gives you a place to interpose and a way to extract the query string. MethodologyI felt like this would be most easily tackled using some form of Dynamic Binary Instrumentation (DBI); since we suspect the query string will be processed somewhere in the SQL Server process, we can look at memory reads and writes made by the process, searching for a point that reads or writes the query string. We can then dump the callstack at that point and see what interesting addresses show up, and map them back to symbols (since, as Rolf points out, SQL Server has debug symbols available). It really was basically as simple as that! Of course, the trick is having something around that lets you easily instrument a process. I solved this using a (hopefully soon-to-be-released) whole-system dynamic analysis framework based on QEMU; this let me avoid any unpleasantness involved in getting SQL Server to run under, e.g., PIN. Because the framework includes record and replay support, I also didn't have to worry about slowing down the server process with my instrumentation. Once I had the callstack, I used PDBParse to get the function names. |
|||||||||||||||||||||
|
Sniffing traffic only ... is easyIf you merely wanted to sniff the traffic you could use the TDS protocol sniffer that comes withWireShark. Let the laziness guide you - laziness is the reverser's friend
I don't know why you insist on doing this a particular way when all information is readily available and all you need to do is put the jigsaw pieces together. This would seem to be the easiest, fastest - in short: laziest - method. Besides TCP/IP is the higher level, because you can intercept it even before it reaches the actual SQL server machine if you can hijack the IP/name of the SQL server and put a "proxy" in between. How high level do you want it? What you insist on is actually drilling down into the lower level guts of the MS SQL Server. MS SQL Server uses a documented protocol and using an LSP you should/would be able to sniff, intercept and even manipulate that traffic. As far as I recall LSPs run within the process space of the application whose traffic they're filtering. You can consider them a makeshift application-level firewall, literally. Alternatively - and probably the better choice anyway - you could write a proxy based on the existing and free FreeTDS (licensed under LGPL). The I'll save the time to go into details of the disassembly, although I installed MS SQL Server 2008 and briefly looked at it in IDA Pro. Brendan's answer provides a good overview, even if I disagree with this overly involved method where an easier one is available. But then, you (Hernán) asked for it. |
||||
In general, what I would say is that problems like this one are application-specific. Therefore, despite the fact that the user broadway was down-voted for his answer, it was exactly the same advice I'd give if I wasn't aware of any nice, special solutions specific to the problem. What you're going to have to do is watch the data come into the process and then follow it as it is copied and manipulated throughout the program. This task will be easier than the general case owing to the fact that debug symbols are available for SQL Server. Have you attempted anything along these lines? Say, setting a breakpoint on network receive-type functions in the context of SQL Server, setting a hardware RW breakpoint on the data that comes in over the network, and then watching how the data moves through the mass of code? |
|||||
|
I don't have any specific knowledge about that target, but the approach I would probably take is to send the same message over a pipe, tcp, and shared memory and trace them with pin, looking for where the basic block's hit converge with all traces should give you some starting points for fine tuning a good injection point. |
|||||||||||||
|