Use P/Invoke to Develop a .NET Base Class Library for Serial Device Communications
Out-of-the-box, the only way of coding RS232 serial communications applications in the .NET environment is to import the outdated and somewhat limited MSComm ActiveX control. This article describes the development of a lean, multithreaded, and modern RS232 base class library in C# managed code. The library uses Platform Invocation Services to interact with the Win32 API directly. Application programmers can use the library from any .NET language through inheritance; the article explores examples written in C# and Visual Basic .NET.
The Microsoft® .NET Framework Class Library (FCL) provides reasonably comprehensive coverage of the functionality of the underlying Win32® API, greatly contributing to the sophistication of the C# and Visual Basic® .NET languages. However, RS232 serial communications is one area that is conspicuously absent from the library. To be fair, most people probably now regard these ports as legacy baggage. These days, you interact with serial modems via software layers such as TAPI or PPP. Other devices that once used these ports are now migrating to USB. Nevertheless, the need for device drivers for specialized RS232 devices such as GPS receivers, barcode and swipe card readers, programmable controllers, and device programmers will continue for the foreseeable future. (For RS232 port specs, see the sidebar "Hardware Specs".)
Platform Invocation Services (P/Invoke) is the .NET technology that enables managed code in the common language runtime (CLR) to make calls into unmanaged DLLs, including those that implement the Win32 API. In this article, I will wrap the API functions provided for RS232 communications in CLR managed classes using C#. The resultant base class library will make it relatively easy to develop drivers for specific devices using any .NET language. Full source code for the library and examples is available for download from the link at the top of this article.
Design Principles
There are at least four design options that you should consider when encapsulating the Win32 serial communications functions in managed classes:
1. Use P/Invoke to wrap the API functions, constants, and structure definitions as static members of a managed class. While I use this approach internally, I do not expose this class to application programmers.
2. Write a stream handler. This is a generalized, extensible abstraction used by the framework for file, console, and network communications. At first sight, this is attractive, but on closer examination it is better suited to traditional modem communications than to the command-response syntaxes of modern serial devices.
3. Build a direct replacement for the MSComm OLE Control Extension (OCX). In other words, create a class that encapsulates the API file handle and provides a number of generalized methods and events (Open, Close, Read, Write, and so on). You could reuse this functionality by instantiating an object of the library class within the application class, that is, by aggregation, COM-style.
4. Write a base class from which the application code inherits. This is a truly object-oriented approach that exploits the wonderful .NET trick of language-independent run-time inheritance. The generalized methods are inherited into the application object and virtual methods are used rather than events. The application object provides a public interface in terms appropriate to the actual RS232 device (for example, a GPS receiver driver might have public properties for latitude and longitude).
I will employ the fourth approach. The library will contain two classes declared as abstract, so they cannot be instantiated directly; instead, they serve as base classes for building application-specific classes through inheritance. Figure 1 illustrates the inheritance hierarchy.
Figure 1 Inheritance Hierarchy
The first library class, CommBase, makes no assumptions about data formatting and provides facilities for opening and closing the communications port, sending and receiving bytes of data, and interacting with the control inputs and outputs.
The second library class, CommLine, inherits from CommBase and makes two assumptions: that the coding of the bytes sent and received is ASCII and that a reserved ASCII control code will mark the termination of variable-length lines of data, enabling transmission and reception of strings. Of course, this model is extensible; you could write alternative versions of CommLine for Unicode communications, for example.
Using the Base Classes
Two example applications, BaseTerm and LineTerm, are available for download. These are general-purpose terminal emulators allowing experimentation with just about any serial device, including a modem. I will take a brief look from a user perspective at BaseTerm, then analyze the source code for LineTerm in detail.
Figure 2 BaseTerm
BaseTerm (see Figure 2) is a full Windows® Forms-based application that inherits from CommBase to provide a byte-oriented diagnostic terminal. The Settings button brings up a form to enable editing of the full range of communications settings (see Figure 3). The menus on this form let the user save settings to and load them from XML-structured files, and also offer a number of presets for common flow control scenarios. Tooltips explain the use of the individual settings. Once the settings have been saved to XML, you can specify the file on the command line when you start the program again. Once the link is online, typed characters get sent to the remote device immediately. Keyboard keys send the appropriate ASCII byte, and if you want you can send non-keyboard codes using an escape facility.
Figure 3 Comm Settings
To start an escape, type the open angle bracket (<) character. Then, type either an ASCII control code name or a decimal number between 0 and 255. To terminate the escape, type the > character, which will cause the appropriate ASCII code to be sent immediately. The < character can be sent by typing it twice. You can view a list of valid ASCII control code names in the "Xon Code" dropdown on the settings dialog, and sites like http://www.asciitable.com provide additional useful information. The larger text box on the terminal window displays all received bytes in either ASCII or hexadecimal without further interpretation.
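The escape rules just described can be sketched as a small encoder. This is only an illustration, not BaseTerm's actual code: the class name EscapeEncoder is hypothetical, and the control-code names are the standard ASCII mnemonics, which may not match the names listed in the "Xon Code" dropdown exactly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the library): converts typed text
// containing <...> escapes into the bytes that would be sent.
public static class EscapeEncoder
{
    // Standard ASCII mnemonics for codes 0..31; the array index is the code.
    static readonly string[] Names = {
        "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
        "BS",  "HT",  "LF",  "VT",  "FF",  "CR",  "SO",  "SI",
        "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
        "CAN", "EM",  "SUB", "ESC", "FS",  "GS",  "RS",  "US" };

    public static byte[] Encode(string typed)
    {
        List<byte> output = new List<byte>();
        int i = 0;
        while (i < typed.Length)
        {
            char c = typed[i];
            if (c != '<')
            {
                if (c > 127) throw new FormatException("Not an ASCII character: " + c);
                output.Add((byte)c);
                i++;
            }
            else if (i + 1 < typed.Length && typed[i + 1] == '<')
            {
                output.Add((byte)'<');          // "<<" sends a literal '<'
                i += 2;
            }
            else
            {
                int close = typed.IndexOf('>', i + 1);
                if (close < 0) throw new FormatException("Unterminated escape");
                output.Add(Lookup(typed.Substring(i + 1, close - i - 1)));
                i = close + 1;
            }
        }
        return output.ToArray();
    }

    // Accepts either a decimal number 0..255 or a control-code name.
    static byte Lookup(string token)
    {
        int value;
        if (int.TryParse(token, out value))
        {
            if (value < 0 || value > 255) throw new FormatException("Out of range: " + token);
            return (byte)value;
        }
        string name = token.ToUpperInvariant();
        int index = Array.IndexOf(Names, name);
        if (index >= 0) return (byte)index;
        if (name == "DEL") return 127;
        throw new FormatException("Unknown control code: " + token);
    }
}
```

For example, EscapeEncoder.Encode("AT<CR>") yields the bytes for "A" and "T" followed by a carriage return (13).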
You can use display settings to break received lines either after reception of a specific ASCII character or after a fixed number of characters. The Status button displays a report of the status of the transmission and reception queues.
LineTerm uses CommLine as its base class and illustrates how to use the library in source code. You will need to run it from the Visual Studio® IDE because there is no user interface for settings. In Visual Studio .NET, create a new Visual Basic console application. Remove the supplied default module from the project. Copy LineTerm.vb, CommBase.dll, and CommBase.xml into the project directory (the XML file provides IntelliSense® information for the library). Use Add Existing Item in the project explorer to import LineTerm.vb and Add Reference to set a reference to CommBase.dll. You should now be able to build and run the project.
Figure 4 shows the complete source code for this example. In the first line, I import the namespace for the library. Then I create a new class, LineTerm, that inherits from CommLine. This provides the public methods Open and Close (actually inherited from CommBase), and the protected method Send, which I make public as SendCommand. In my new class, I override a number of virtual methods from the base classes. The CommSettings method is invoked by Open to configure the communications port; it must return an initialized CommBaseSettings object.
I actually use CommLineSettings here, which is fine because it inherits from CommBaseSettings. In the last two lines of this function, I first pass the object to the Setup method, which is inherited from CommLine, and then return it to CommBase. All the settings are public members and can be set directly, but there is also a helper method, SetStandard, which automatically configures the most common settings for CommBase. You will probably need to edit the parameters to this function and the line terminator and filter members to suit the device you have available for testing.
The Main method for the application simply creates an instance of my class, invokes the Open method, and provides a console harness for sending strings and displaying received strings. There are two methods of doing this: blocking and non-blocking. Using SendCommand starts non-blocking communications. This function returns immediately, then sometime later the send will complete and the OnTxDone override will report this. Later still, when the remote device has transmitted a complete response line, the OnRxLine override will display it on the console. During this time, the Main routine waits for user input, but it could have been getting on with other work. If you comment out SendCommand and enable TransactCommand instead, the same action will be performed using blocking communications. Here the Main routine will block at this line until the response is available. You will still see the message from OnTxDone, but instead of the RECEIVED message from OnRxLine you will see the RESPONSE message from TransactCommand.
Figure 5 Flow Control for GPS
In a real application, such as a driver for a GPS receiver, you would not simply make the Send or Transact methods public in the same way as I did in this example. Instead, you would provide public methods and properties exposing the functionality of the device (for example, properties like Speed and Altitude, or an event like PositionChanged). These methods should assemble the necessary command, use the Transact method, and then parse the response to extract the return value. Figure 5 illustrates the flow control for such a device driver.
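A minimal sketch of that driver pattern follows. Everything in it is hypothetical: LineDeviceStub merely stands in for CommLine and assumes a Transact method that takes a command string and returns the response line, and the "ALT?" command and "ALT=" reply format are an invented protocol, not that of any real GPS receiver.

```csharp
using System;
using System.Globalization;

// Stand-in for the library's CommLine; only a blocking Transact is modeled,
// and its string-in/string-out signature is an assumption.
public abstract class LineDeviceStub
{
    // Sends one command line and returns the device's response line.
    protected abstract string Transact(string command);
}

// Hypothetical driver: the public surface speaks in device terms, not bytes.
public abstract class GpsDriverSketch : LineDeviceStub
{
    public double Altitude
    {
        get
        {
            const string prefix = "ALT=";            // invented reply format
            string reply = Transact("ALT?");         // invented command
            if (reply == null || !reply.StartsWith(prefix, StringComparison.Ordinal))
                throw new FormatException("Unexpected reply: " + reply);
            // Parse with the invariant culture so "123.5" reads the same everywhere.
            return double.Parse(reply.Substring(prefix.Length), CultureInfo.InvariantCulture);
        }
    }
}

// Test double that answers from memory instead of a serial port.
public sealed class FakeGps : GpsDriverSketch
{
    protected override string Transact(string command)
    {
        return command == "ALT?" ? "ALT=123.5" : "ERR";
    }
}
```

Keeping Transact protected means callers never see the command syntax; they read Altitude and the driver handles the assembly and parsing.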
Sending
In serial communications, as in so much of life, sending messages is much easier than receiving them. For reception, you are at the whim of the remote device, whereas for transmission, you remain in control of the timing. However, at typical speeds of between 2 and 20,000 baud compared to computer speeds in the gigahertz range, you do not want to be cooling your heels waiting for transmission operations to complete. The Win32 API treats serial communications as a special case of file operations and uses a technique that is known as overlapped I/O to provide non-blocking operations.
The CommBase class provides the public method Open, which uses the Win32 API function CreateFile to open a serial port and store the resulting operating system handle as a private member variable:
hPort = Win32Com.CreateFile(cs.port,
    Win32Com.GENERIC_READ | Win32Com.GENERIC_WRITE, 0, IntPtr.Zero,
    Win32Com.OPEN_EXISTING, Win32Com.FILE_FLAG_OVERLAPPED, IntPtr.Zero);
The first parameter is the port name as a string, usually COM1: or COM2:, but in theory the name could be anything, so I have used a string rather than a number. I haven't found any way to determine a list of valid port names, so I have chosen to let the caller attempt to open any port, accepting the fact that this may fail. Failure may also occur if the port exists but is already in use by another application, in which case Open returns false. I use FILE_FLAG_OVERLAPPED to specify that all operations on this file handle will be non-blocking, and the rest of the parameters are boilerplate for serial communications.
Win32Com is a helper class used as a container for static definitions of the API functions, structures, and constants that I will be using via P/Invoke. CreateFile is declared as follows in C#:
[DllImport("kernel32.dll", SetLastError=true)]
internal static extern IntPtr CreateFile(String lpFileName,
    UInt32 dwDesiredAccess, UInt32 dwShareMode, IntPtr lpSecurityAttributes,
    UInt32 dwCreationDisposition, UInt32 dwFlagsAndAttributes, IntPtr hTemplateFile);
The various constants are also defined here. For example:
internal const UInt32 FILE_FLAG_OVERLAPPED = 0x40000000;
Since there is currently very little tool support for P/Invoke, I have had to handcraft these definitions. The key resources include the Win32 API documentation and the header files provided for use with C++. (The excellent file search facilities in Visual Studio .NET are invaluable for finding definitions in the header files. I use these for documentation purposes only; you don't need them to compile the library.) A full discussion of interop marshaling, the process by which managed data types are translated into the unmanaged C definitions used by the API, is beyond the scope of this article. However, you can get an idea of what is going on under the hood from another piece of code in the Open method:
wo.Offset = 0;
wo.OffsetHigh = 0;
if (checkSends)
    wo.hEvent = writeEvent.Handle;
else
    wo.hEvent = IntPtr.Zero;
ptrUWO = Marshal.AllocHGlobal(Marshal.SizeOf(wo));
Marshal.StructureToPtr(wo, ptrUWO, true);
Here, wo is a local variable of type Win32Com.OVERLAPPED, and ptrUWO is a private class variable of type IntPtr. Marshal is a class in System.Runtime.InteropServices whose static methods provide access to the interop marshaler. In this code, I am doing manually what the marshaler normally does automatically when an external function is called. The first step is to allocate an appropriately sized block of unmanaged memory, then copy the contents of the managed structure into it, remapping the memory layout as required. After the function call, the marshaler would normally use Marshal.PtrToStructure to perform the reverse copy, then Marshal.FreeHGlobal to release the memory. I perform this operation manually because of the special way the API uses the OVERLAPPED structure. I will specify it in a WriteFile call, but then the operating system will continue to use it after the call returns.
Some time later, I will call GetOverlappedResult, specifying the same structure again. If I left this up to automatic marshaling, the risk would be that the unmanaged memory would be reallocated between the two calls. In this case, unmarshaling is not necessary since the fields do not need to be accessed again. The memory must, however, be deallocated when the port is closed:
if (ptrUWO != IntPtr.Zero) Marshal.FreeHGlobal(ptrUWO);
With this groundwork in place, actually sending an array of bytes is quite straightforward:
if (!Win32Com.WriteFile(hPort, tosend, (uint)writeCount, out sent, ptrUWO))
    if (Marshal.GetLastWin32Error() != Win32Com.ERROR_IO_PENDING)
        ThrowException("Unexpected failure");
The tosend parameter is a pointer to an array of bytes; writeCount is the number of bytes in the array; the sent parameter will return the number of bytes actually sent; and ptrUWO is a pointer to the unmanaged version of OVERLAPPED, as already created. Normally, the function will return false and the error code will be ERROR_IO_PENDING. This is a pseudo-error which indicates that the operation has been queued because it could not be completed immediately. Any other error code means that the operation could not be queued. With buffered serial hardware and short send strings, it is possible that the operation will be completed immediately, in which case the function will return true.
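For readers who want to see the declarations behind that call, here is a sketch of how the OVERLAPPED structure, WriteFile, and the manual marshaling steps might be written. The class name Win32ComSketch and the two helper methods are hypothetical, and the field types are my own mapping of the Win32 header definitions; the library's actual Win32Com declarations may differ.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class Win32ComSketch
{
    internal const int ERROR_IO_PENDING = 997;

    // Mirrors the Win32 OVERLAPPED layout; the first two fields are pointer-sized.
    [StructLayout(LayoutKind.Sequential)]
    internal struct OVERLAPPED
    {
        public UIntPtr Internal;
        public UIntPtr InternalHigh;
        public uint Offset;
        public uint OffsetHigh;
        public IntPtr hEvent;
    }

    // lpOverlapped is passed as a raw pointer so the caller controls its lifetime.
    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    internal static extern bool WriteFile(IntPtr hFile, byte[] lpBuffer,
        uint nNumberOfBytesToWrite, out uint lpNumberOfBytesWritten, IntPtr lpOverlapped);

    // Hypothetical helper: copies the structure into unmanaged memory that
    // stays at a fixed address until FreeOverlapped is called.
    internal static IntPtr AllocOverlapped(IntPtr hEvent)
    {
        OVERLAPPED wo = new OVERLAPPED();
        wo.hEvent = hEvent;
        IntPtr ptr = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(OVERLAPPED)));
        // false: the block is freshly allocated, so there is no old content to destroy.
        Marshal.StructureToPtr(wo, ptr, false);
        return ptr;
    }

    // Hypothetical helper: releases the block and clears the caller's pointer.
    internal static void FreeOverlapped(ref IntPtr ptr)
    {
        if (ptr != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(ptr);
            ptr = IntPtr.Zero;
        }
    }
}
```

Passing the structure as an IntPtr rather than as a ref parameter is what keeps the same unmanaged copy alive between WriteFile and the later completion check.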
Before sending new data, the result of any previous Send should be checked, enabling the detection of error conditions or timeouts. (Strangely, the API treats a pending operation as an error, but a timeout as perfectly normal—the only way of detecting it is that fewer bytes are sent than were queued. Ironing out this kind of anomaly is one of the hidden pleasures of writing a wrapper library!) Although I could allow multiple pending sends, each with its own OVERLAPPED structure, it would add a lot of complexity. Instead, I have blocked a subsequent Send until the previous one completes. If this blocking is a problem, it can be disabled by setting the checkAllSends member to false, in which case the OVERLAPPED structure is reused and there is no guarantee that errors or timeouts will be caught.
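The pending-versus-timeout anomaly can be made concrete with a small classification routine. This is a sketch under assumptions: the names SendOutcome and SendCheck are hypothetical, and I assume the previous send is polled with GetOverlappedResult without waiting, in which case an unfinished operation is reported as ERROR_IO_INCOMPLETE. The library itself may check differently.

```csharp
using System;

public enum SendOutcome { Completed, StillPending, TimedOut, Failed }

// Hypothetical helper: interprets what a GetOverlappedResult call reported
// for the previous send.
public static class SendCheck
{
    // Win32 error code: the overlapped operation has not finished yet.
    public const int ERROR_IO_INCOMPLETE = 996;

    // succeeded:   return value of the API call
    // lastError:   Marshal.GetLastWin32Error() captured right after the call
    // bytesSent:   byte count reported by the API
    // bytesQueued: byte count originally passed to WriteFile
    public static SendOutcome Classify(bool succeeded, int lastError,
                                       uint bytesSent, uint bytesQueued)
    {
        if (!succeeded)
        {
            return lastError == ERROR_IO_INCOMPLETE
                ? SendOutcome.StillPending
                : SendOutcome.Failed;
        }
        // The API reports success even on a timeout; a short count is the only clue.
        return bytesSent < bytesQueued ? SendOutcome.TimedOut : SendOutcome.Completed;
    }
}
```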
Receiving
As you might guess, receiving data is simply a matter of calling the API ReadFile function. As previously mentioned, the difficult part is not so much receiving, but knowing when to receive. To avoid forcing application programmers to continually poll for data, some form of callback mechanism is required. Virtual methods called from a worker thread can perform this function. CommBase invokes a virtual method on reception of each byte. This method is overridden in CommLine to buffer up the bytes and call a different virtual method when a line terminator is received.
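The two-level callback scheme can be sketched without any serial hardware. The classes below are simplified stand-ins, not the library's code: ByteReceiverStub plays the role of CommBase, LineReceiverStub the role of CommLine, and the Feed method is a hypothetical test hook that replaces the worker thread. OnRxChar and OnRxLine are the method names used in this article; their exact signatures in the library are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Stand-in for CommBase: only the per-byte callback is modeled.
public abstract class ByteReceiverStub
{
    protected abstract void OnRxChar(byte ch);

    // Test hook standing in for the worker thread delivering received bytes.
    public void Feed(params byte[] data)
    {
        foreach (byte b in data) OnRxChar(b);
    }
}

// Stand-in for CommLine: buffers bytes until the terminator arrives.
public abstract class LineReceiverStub : ByteReceiverStub
{
    private readonly byte terminator;
    private readonly StringBuilder buffer = new StringBuilder();

    protected LineReceiverStub(byte terminator)
    {
        this.terminator = terminator;
    }

    protected abstract void OnRxLine(string line);

    protected override void OnRxChar(byte ch)
    {
        if (ch == terminator)
        {
            string line = buffer.ToString();
            buffer.Length = 0;              // reset before the callback runs
            OnRxLine(line);
        }
        else
        {
            buffer.Append((char)ch);        // treats each byte as one ASCII character
        }
    }
}

// Application-level class: collects completed lines, using CR (13) as terminator.
public sealed class CollectingReceiver : LineReceiverStub
{
    public readonly List<string> Lines = new List<string>();

    public CollectingReceiver() : base(13) { }

    protected override void OnRxLine(string line)
    {
        Lines.Add(line);
    }
}
```

Bytes that arrive after the last terminator simply stay in the buffer until the next terminator completes the line.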
To make this work, I create a second thread of execution using the following code in the Open method:
rxThread = new Thread(new ThreadStart(this.ReceiveThread));
rxThread.Name = "ComBaseRx";
rxThread.Priority = ThreadPriority.AboveNormal;
rxThread.Start();
Thread.Sleep(1);
This starts up a new thread running the code in the private method ReceiveThread. The need for the final line surprised me; I assumed that the new, higher priority thread would preempt the original thread at the Start command. For some reason it does not, and this created problems because the worker thread was not always ready when it was first needed. As a second attempt, I used Sleep(0), which the documentation suggests should allow preemption without wasting time (most of a whole millisecond, for goodness sake!), but again this did not work in practice.
ReceiveThread is an infinite loop of code broken only on an exception. I terminate the thread on closing the port using the following line:
rxThread.Abort();
This throws a ThreadAbortException in the thread, causing it to terminate via the catch clause, which is used for tidying up. A finally clause could also be used, but in this case it makes no difference because the only exit route is through an exception.
Figure 6 shows a simplified version of ReceiveThread. SetCommMask indicates that I want to be notified when a new byte arrives. WaitCommEvent may return true, in which case there is already one or more bytes waiting. If it returns false with error code ERROR_IO_PENDING, I can suspend the thread until a byte arrives. The OVERLAPPED structure that was passed to WaitCommEvent includes the handle to an AutoResetEvent, which will be signaled when the byte arrives. When I execute the AutoResetEvent's WaitOne method, execution is suspended until the event is signaled.
Whether WaitCommEvent returns true immediately or signals completion later, the eventMask variable will contain a bitmask identifying which of the conditions requested in SetCommMask actually occurred (in the real code, I specify some other housekeeping conditions as well).
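Decoding that bitmask is simple once the event bits are named. The values below are the EV_ constants from the Win32 headers; the enum and helper names are hypothetical, and which housekeeping conditions the real code requests is not shown here, so the grouping of modem-line events is only an assumption.

```csharp
using System;

// Event bits as defined in the Win32 headers for SetCommMask and WaitCommEvent.
[Flags]
public enum CommEvents : uint
{
    None    = 0x0000,
    RxChar  = 0x0001,   // EV_RXCHAR: a character was received
    RxFlag  = 0x0002,   // EV_RXFLAG: the event character was received
    TxEmpty = 0x0004,   // EV_TXEMPTY: the last character in the output buffer was sent
    Cts     = 0x0008,   // EV_CTS: CTS changed state
    Dsr     = 0x0010,   // EV_DSR: DSR changed state
    Rlsd    = 0x0020,   // EV_RLSD: RLSD changed state
    Break   = 0x0040,   // EV_BREAK: a break was detected
    Err     = 0x0080,   // EV_ERR: a line-status error occurred
    Ring    = 0x0100    // EV_RING: a ring indicator was detected
}

// Hypothetical helper for interpreting the mask filled in by WaitCommEvent.
public static class EventMaskDecoder
{
    const CommEvents ModemLines =
        CommEvents.Cts | CommEvents.Dsr | CommEvents.Rlsd | CommEvents.Ring;

    // True when at least one byte is waiting to be read.
    public static bool HasReceivedData(uint eventMask)
    {
        return ((CommEvents)eventMask & CommEvents.RxChar) != 0;
    }

    // True when any of the input pins changed state.
    public static bool ModemLinesChanged(uint eventMask)
    {
        return ((CommEvents)eventMask & ModemLines) != 0;
    }
}
```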
Notice that I use the same manual marshaling trick for eventMask as previously discussed for OVERLAPPED. I suspect this is not actually necessary and that automatic marshaling might work, but the exact behavior is not documented so it's better to be safe than sorry. Replacing the unmarshaled pointer with the managed variable as a reference parameter seems to work, but it may just be luck that the memory has not been reused.
Depending on timing, more than one character may be waiting, so the queue is drained one character at a time using ReadFile repeatedly and calling the virtual function OnRxChar with each character. When I get ERROR_IO_PENDING, I call CancelIo because I don't want to wait here; I want to loop back and wait at WaitCommEvent.
Error handling and exceptions need to be carefully considered when using worker threads. Any unhandled exceptions that occur in ReceiveThread, any of the virtual methods called from it, and any methods or events called or fired from any of these will propagate up and be handled by the catch clause. If the exception is not ThreadAbortException, it is stored in a private member of CommBase and the thread is then terminated. The next time the application code invokes a method on the primary thread, the exception is raised again and the port is closed. This makes good use of the inner exception mechanism since a generic "Receive Thread Fault" is raised, containing a reference to the original stored exception. ThrowException is provided as a helper method for use in derived classes; it adjusts its behavior according to which thread it is called on.
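The store-and-rethrow pattern can be sketched in isolation. This is not the library's code: the class and method names are hypothetical, InvalidOperationException stands in for whatever exception type CommBase actually raises, and the sketch lets the worker run to completion rather than aborting it, since Thread.Abort is not supported on .NET Core and later runtimes.

```csharp
using System;
using System.Threading;

// Hypothetical illustration of carrying a worker-thread fault back to the
// primary thread as an inner exception.
public class WorkerFaultSketch
{
    private readonly object sync = new object();
    private Exception storedFault;      // written by the worker, read by the caller
    private Thread worker;

    public void Start(Action body)
    {
        worker = new Thread(delegate()
        {
            try
            {
                body();
            }
            catch (Exception e)
            {
                // Remember the fault; the worker thread then ends quietly.
                lock (sync) { storedFault = e; }
            }
        });
        worker.IsBackground = true;
        worker.Start();
    }

    public void Join()
    {
        worker.Join();
    }

    // Call at the start of each public method that runs on the primary thread.
    public void CheckWorker()
    {
        Exception fault;
        lock (sync)
        {
            fault = storedFault;
            storedFault = null;         // report each fault only once
        }
        if (fault != null)
            throw new InvalidOperationException("Receive Thread Fault", fault);
    }
}
```

The caller sees a generic fault on its own thread, and the original exception, with its stack trace, remains available through InnerException.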
Settings and Other Details
I read all the configurable settings from a helper object of class CommBaseSettings. Open acquires this object by calling the virtual method CommSettings and then copies the values into the API structures. CommBaseSettings also provides methods for saving and restoring settings to XML configuration files and for applying common settings scenarios in bulk. I provide documentation for the settings in the form of IntelliSense help. This design provides an extensible settings infrastructure since derived classes can provide their own settings class that inherits from CommBaseSettings. I have inherited CommLineSettings in this way to add additional settings required by CommLine.
There are three API functions for configuring communications protocols: SetupComm, SetCommState, and SetCommTimeouts. SetupComm requests sizes for the reception and transmission queues. Normally you can just set these to zero and the operating system decides, but it may be worth adjusting the requested size for some file transfer and similar applications. There is no guarantee that the system will honor this request; in Windows XP, there appears to be a dynamic transmission queue and only the reception queue length is honored.
SetCommState supplies the settings for baud rate, word format, and handshake settings in a structure called the device control block (DCB).
SetCommTimeouts provides three reception and two transmission timeout values in a COMMTIMEOUTS structure. The reception timeouts are not useful in the design I have chosen because individual characters are processed asynchronously. If a reception timeout is required, then it needs to be implemented at a higher level (for example, CommLine provides a timeout for its Transact method). The transmission timeouts are useful, however, for multibyte sends. The number of bytes in the send is multiplied by the sendTimeoutMultiplier and then sendTimeoutConstant is added to this to give the total time allowed in milliseconds.
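As a worked example of the transmission timeout arithmetic, the sketch below declares a COMMTIMEOUTS structure matching the Win32 definition (five DWORD values in milliseconds) and computes the total time allowed for a send. The SendTimeout helper is hypothetical, and I am assuming that the library's sendTimeoutMultiplier and sendTimeoutConstant settings map onto the WriteTotalTimeoutMultiplier and WriteTotalTimeoutConstant fields.

```csharp
using System;
using System.Runtime.InteropServices;

// Mirrors the Win32 COMMTIMEOUTS structure: five DWORD values in milliseconds.
[StructLayout(LayoutKind.Sequential)]
public struct COMMTIMEOUTS
{
    public uint ReadIntervalTimeout;
    public uint ReadTotalTimeoutMultiplier;
    public uint ReadTotalTimeoutConstant;
    public uint WriteTotalTimeoutMultiplier;
    public uint WriteTotalTimeoutConstant;
}

// Hypothetical helper showing how the two write values combine.
public static class SendTimeout
{
    // Total time allowed = bytes * multiplier + constant, in milliseconds.
    public static long TotalMilliseconds(COMMTIMEOUTS timeouts, int byteCount)
    {
        return (long)timeouts.WriteTotalTimeoutMultiplier * byteCount
             + timeouts.WriteTotalTimeoutConstant;
    }
}
```

With a multiplier of 2 and a constant of 100, a 64-byte send is allowed 2 × 64 + 100 = 228 milliseconds before it times out.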
Once the port is opened and configured, Open invokes a virtual method AfterOpen, which may be overridden to check the status of the connection to the remote device and possibly also to configure it. If this returns false, the port will be closed again and Open itself will return false. There is also a BeforeClose method for shutting down the remote device if necessary.
CommBase provides two overloaded versions of Send, one taking a byte array and the other taking a single byte. CommLine provides a third version of Send, taking a string. All of these ultimately use the byte array version after appropriate data conversion. There is also a SendImmediate method taking a single byte. This sends the byte ahead of any other bytes in the transmission queue and may be useful for implementing custom flow control schemes. There are also properties providing direct control of the Request to Send (RTS) and Data Terminal Ready (DTR) output pins and the ability to put the TX output into a break condition. The input pins—Clear-to-Send (CTS), Data Set Ready (DSR), Received Line Signal Detector (RLSD), and Ring Detect—can be read directly using GetModemStatus and the virtual function OnStatusChange is called when the state of any of these input and output pins changes.
GetQueueStatus returns a QueueStatus object, giving the size and content of the transmission and reception queues and what, if any, flow control condition is currently blocking transmission.
Conclusion
I've used Platform Invocation Services to address one of the gaps in the functionality of the FCL. This turned out to be a nontrivial but perfectly feasible exercise. Much of the difficulty arises because full tool support and documentation for P/Invoke is not yet in place.
Finally, I have a confession to make. As part of this project, I wrote and tested a complete wrapper for the Win32 Waitable Events API before stumbling on the ManualResetEvent and AutoResetEvent framework classes that already encapsulated all the functionality I needed. Remember: you just might be spending all your free time writing brand new classes from scratch when just what you needed already existed. Check your local hardware store before reinventing the wheel. On this principle, I hope that the base classes developed here will help other programmers bring RS232 device communications into the .NET world.