  Effective C# Item 41: Prefer DataSets to Custom Structures

    Item 41: Prefer DataSets to Custom Structures
    DataSets have gotten a bad reputation for two reasons. First, XML serialized DataSets do not interact well with non-.NET code. Using DataSets as part of a web service API makes it more difficult to interact with systems that don't use the .NET Framework. Second, they are a very generic container. You can misuse a DataSet by circumventing some of the .NET Framework's type safety. But the DataSet still solves a large number of common requirements for modern systems. If you understand its strengths and avoid its weaknesses, you can make extensive use of the type.

    The DataSet class is designed to be an offline cache of data stored in a relational database. You already know that it stores DataTables, which store rows and columns of data that can match the layout of a database. You know that the DataSet and its members support data binding. You might even have seen examples of how the DataSet supports relations between the DataTables it contains. It's even possible that you've seen examples of constraints that validate the data being placed in a DataSet.
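    As a rough illustration of the relations and constraints mentioned above, the sketch below builds two related tables and shows the DataSet rejecting a child row that has no parent. The table, column, and relation names are hypothetical and chosen only for this example.

```csharp
using System;
using System.Data;

public static class RelationSketch
{
    // Builds a two-table DataSet (hypothetical names) linked by a DataRelation.
    public static DataSet Build()
    {
        DataSet ds = new DataSet("Sketch");

        DataTable customers = ds.Tables.Add("Customers");
        customers.Columns.Add("Id", typeof(int));
        customers.Columns.Add("Name", typeof(string));
        customers.PrimaryKey = new DataColumn[] { customers.Columns["Id"] };

        DataTable addresses = ds.Tables.Add("Addresses");
        addresses.Columns.Add("CustomerId", typeof(int));
        addresses.Columns.Add("City", typeof(string));

        // Adding the relation also creates a foreign-key constraint by default.
        ds.Relations.Add("CustomerAddresses",
            customers.Columns["Id"], addresses.Columns["CustomerId"]);

        customers.Rows.Add(1, "Ann");
        addresses.Rows.Add(1, "Ann Arbor");
        addresses.Rows.Add(1, "Detroit");
        return ds;
    }

    // Returns true when the DataSet rejects an address with no matching customer.
    public static bool OrphanIsRejected(DataSet ds)
    {
        try
        {
            ds.Tables["Addresses"].Rows.Add(99, "Nowhere");
            return false;
        }
        catch (InvalidConstraintException)
        {
            return true;
        }
    }
}
```

    Calling GetChildRows( "CustomerAddresses" ) on the customer row then returns its two address rows, which is the navigation a hand-written collection would have to provide itself.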

    But there's even more than that. DataSets also support transactions through the AcceptChanges and RejectChanges methods, and they can be stored as DiffGrams that contain the history of changes to the data. Multiple DataSets can be merged to provide a common storage repository. DataSets support views, which enable you to examine portions of your data that satisfy search criteria. You can create views that cross several tables.
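    A minimal sketch of two of these features, rollback through RejectChanges and filtering through a DataView, might look like the following. The table layout and sample rows are assumptions made up for the example.

```csharp
using System;
using System.Data;

public static class ChangeTrackingSketch
{
    // Builds a small, hypothetical address table and commits its rows.
    public static DataTable BuildAddresses()
    {
        DataTable t = new DataTable("Addresses");
        t.Columns.Add("City", typeof(string));
        t.Columns.Add("State", typeof(string));
        t.Rows.Add("Ann Arbor", "MI");
        t.Rows.Add("Columbus", "OH");
        t.Rows.Add("Detroit", "MI");
        t.AcceptChanges();   // commit: every row becomes Unchanged
        return t;
    }

    // Edits a row, then rolls back to the last accepted state.
    public static string EditThenReject(DataTable t)
    {
        t.Rows[0]["City"] = "Typo City";   // row state is now Modified
        t.RejectChanges();                 // restores the accepted value
        return (string)t.Rows[0]["City"];
    }

    // Counts the rows visible through a filtered view.
    // The filter is built by concatenation only to keep the sketch short;
    // do not pass untrusted input this way.
    public static int CountInState(DataTable t, string state)
    {
        DataView view = new DataView(t);
        view.RowFilter = "State = '" + state + "'";
        return view.Count;
    }
}
```

    The view does not copy any data; it is a filtered window onto the same rows, so edits made through the table show up in the view.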

    Yet, some of us want to develop our own storage structures rather than use the DataSet. The DataSet is a general container. Performance suffers a little to support that generality. A DataSet is not a strongly typed container. The collection of DataTables is a dictionary. The collection of columns in a table is also a dictionary. Items are stored as System.Object references. That leads us to write these kinds of constructs:

    int val = ( int )MyDataSet.Tables[ "table1" ].
      Rows[ 0 ][ "total" ];

    To the strongly typed C# mind, this construct is troublesome. If you mistype table1 or total, you get a runtime error. An access to the data element requires a cast. Multiply these problems by the number of times you access the elements of a DataSet, and you will really want to find a strongly typed solution. So we try typed DataSets. On the surface, it's what we want:

    int val = MyDataSet.table1[ 0 ].
      total;

    It's perfect, until you look inside the generated C# that comprises the typed DataSet. It wraps the existing DataSet and provides strongly typed access in addition to the weakly typed access in the DataSet class. Your clients can still access the weakly typed API. That's less than optimal.

    Live with it. To illustrate how much you give up, I'll show you how some of the features inside the DataSet class are implemented, in the context of creating your own custom collection. You're thinking that it can't be that hard. You're thinking that you don't need all the features of the DataSet, so it won't take that long. Okay, fine, I'll play along.

    Imagine that you need to create a collection that stores addresses. An individual item must support data binding, so you create a struct with public properties:

    public struct AddressRecord
    {
      private string _street;
      public string Street
      {
        get { return _street; }
        set { _street = value; }
      }

      private string _city;
      public string City
      {
        get { return _city; }
        set { _city = value; }
      }

      private string _state;
      public string State
      {
        get { return _state; }
        set { _state = value; }
      }

      private string _zip;
      public string Zip
      {
        get { return _zip; }
        set { _zip = value; }
      }
    }

    Next, you need to create the collection. You want a type-safe collection, so you derive one from CollectionBase:

    public class AddressList : CollectionBase
    { }

    CollectionBase supports the IList interface, so you can use it as a data-binding source. Now you discover your first serious problem: All your data-binding actions fail when your list of addresses is empty. That did not happen with the DataSet. Data binding consists of late-binding code built on reflection. The control uses reflection to load the first element in the list, and then uses reflection to determine its type and all the properties that are members of that type. That's how a DataGrid learns what columns to add. It finds all the public properties of the first element in the collection, and those are displayed. When the collection is empty, that won't work. You have two possible solutions to this problem. The first is the ugly but simple solution: Never allow an empty list. The second is the elegant but more time-consuming solution: Implement the ITypedList interface. ITypedList provides two methods that describe the types in the collection. GetListName returns a human-readable string that describes the list. GetItemProperties returns a list of PropertyDescriptors that describe each property that should form a column in the grid:

    public class AddressList : CollectionBase, ITypedList
    {
      public string GetListName(
        PropertyDescriptor[ ] listAccessors )
      {
        return "AddressList";
      }

      public PropertyDescriptorCollection GetItemProperties(
        PropertyDescriptor[ ] listAccessors )
      {
        // Describe the columns from the type, not from the first element.
        Type t = typeof( AddressRecord );
        return TypeDescriptor.GetProperties( t );
      }
    }
    It's getting a little better. Now you have a collection that supports simple binding. You're missing a lot of features, though. The next requested feature is transaction support. If you had used a DataSet, your users would be able to cancel all changes to a single row in the DataGrid by pressing the Esc key. For example, a user could type the wrong city, press Esc, and have the original value restored. The DataGrid also supports error notification. You could attach a ColumnChanged event handler to perform any validation rules you need on a particular column. For instance, the state code must be a two-letter abbreviation. Using the DataSet framework, that's coded like this:

    ds.Tables[ "Addresses" ].ColumnChanged += new
      DataColumnChangeEventHandler( ds_ColumnChanged );

    private void ds_ColumnChanged( object sender,
      DataColumnChangeEventArgs e )
    {
      if ( e.Column.ColumnName == "State" )
      {
        string newVal = e.ProposedValue.ToString( );
        if ( newVal.Length != 2 )
        {
          e.Row.SetColumnError( e.Column,
            "State abbreviation must be two letters" );
          e.Row.RowError = "Error on State";
        }
        else
        {
          // Valid value: clear any earlier error.
          e.Row.SetColumnError( e.Column,
            "" );
          e.Row.RowError = "";
        }
      }
    }

    To support both concepts on your custom collection, you have quite a bit more work ahead of you. You need to modify your AddressRecord structure to support two new interfaces, IEditableObject and IDataErrorInfo. IEditableObject provides transaction support for your object. IDataErrorInfo provides the error-handling routines. To support the transactions, you must modify your data storage to provide your own rollback capability. You might have errors on multiple columns, so your storage must also include a collection of errors for each column. Here's the updated listing for the AddressRecord:

    using System.Collections;      // IList, Hashtable
    using System.ComponentModel;   // IEditableObject, IDataErrorInfo

    public class AddressRecord : IEditableObject, IDataErrorInfo
    {
      private struct AddressRecordData
      {
        public string street;
        public string city;
        public string state;
        public string zip;
      }

      // Committed values, and the working copy used during an edit.
      private AddressRecordData permanentRecord;
      private AddressRecordData tempRecord;

      private bool _inEdit = false;
      private IList _container;

      // Maps a property name to its current error message.
      private Hashtable errors = new Hashtable();

      public AddressRecord( AddressList container )
      {
        _container = container;
      }

      public string Street
      {
        get
        {
          return ( _inEdit ) ? tempRecord.street :
            permanentRecord.street;
        }
        set
        {
          if ( value.Length == 0 )
            errors[ "Street" ] = "Street cannot be empty";
          else
            errors.Remove( "Street" );
          if ( _inEdit )
            tempRecord.street = value;
          else
          {
            permanentRecord.street = value;
            // Reassign the slot so the container is told about the change.
            int index = _container.IndexOf( this );
            _container[ index ] = this;
          }
        }
      }

      public string City
      {
        get
        {
          return ( _inEdit ) ? tempRecord.city :
            permanentRecord.city;
        }
        set
        {
          if ( value.Length == 0 )
            errors[ "City" ] = "City cannot be empty";
          else
            errors.Remove( "City" );
          if ( _inEdit )
            tempRecord.city = value;
          else
          {
            permanentRecord.city = value;
            int index = _container.IndexOf( this );
            _container[ index ] = this;
          }
        }
      }

      public string State
      {
        get
        {
          return ( _inEdit ) ? tempRecord.state :
            permanentRecord.state;
        }
        set
        {
          if ( value.Length == 0 )
            errors[ "State" ] = "State cannot be empty";
          else
            errors.Remove( "State" );
          if ( _inEdit )
            tempRecord.state = value;
          else
          {
            permanentRecord.state = value;
            int index = _container.IndexOf( this );
            _container[ index ] = this;
          }
        }
      }

      public string Zip
      {
        get
        {
          return ( _inEdit ) ? tempRecord.zip :
            permanentRecord.zip;
        }
        set
        {
          if ( value.Length == 0 )
            errors[ "Zip" ] = "Zip cannot be empty";
          else
            errors.Remove( "Zip" );
          if ( _inEdit )
            tempRecord.zip = value;
          else
          {
            permanentRecord.zip = value;
            int index = _container.IndexOf( this );
            _container[ index ] = this;
          }
        }
      }

      public void BeginEdit( )
      {
        // Snapshot the committed values only when a clean edit starts.
        if ( ( ! _inEdit ) && ( errors.Count == 0 ) )
          tempRecord = permanentRecord;
        _inEdit = true;
      }

      public void EndEdit( )
      {
        // Can't end editing if there are errors:
        if ( errors.Count > 0 )
          return;

        if ( _inEdit )
          permanentRecord = tempRecord;
        _inEdit = false;
      }

      public void CancelEdit( )
      {
        // Discard the working copy along with its errors.
        errors.Clear( );
        _inEdit = false;
      }

      // IDataErrorInfo: the error for one column, or null if there is none.
      public string this[ string columnName ]
      {
        get { return errors[ columnName ] as string; }
      }

      // IDataErrorInfo: a summary of every column that has an error.
      public string Error
      {
        get
        {
          if ( errors.Count > 0 )
          {
            System.Text.StringBuilder errString = new
              System.Text.StringBuilder( );
            foreach ( string s in errors.Keys )
            {
              errString.Append( s );
              errString.Append( ", " );
            }
            errString.Append( "Have errors" );
            return errString.ToString( );
          }
          else
            return "";
        }
      }
    }

    That's several pages of code, all to support features already implemented in the DataSet. In fact, this still doesn't have all the DataSet features working properly. Interactively adding new records to the collection and supporting transactions require some more hoops for BeginEdit, CancelEdit, and EndEdit. You need to detect when CancelEdit is called on a new object rather than a modified object. CancelEdit must remove the new object from the container if the object was created after the last BeginEdit. It requires more modification to the AddressRecord and a couple of event handlers added to the AddressList class.
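    To give a feel for that extra bookkeeping, here is a deliberately simplified sketch of a CancelEdit that distinguishes a new item from a modified one. It is not the missing AddressRecord code: EditableItemSketch, its single Value property, and the _isNew flag are all hypothetical, and it treats an item as new until its first EndEdit.

```csharp
using System;
using System.Collections;
using System.ComponentModel;

public class EditableItemSketch : IEditableObject
{
    private readonly IList _container;
    private bool _isNew = true;   // true until the first EndEdit commits the item
    private bool _inEdit;
    private string _saved;        // value captured when the edit began

    public string Value { get; set; }

    public EditableItemSketch(IList container)
    {
        _container = container;
    }

    public void BeginEdit()
    {
        if (!_inEdit)
        {
            _saved = Value;
            _inEdit = true;
        }
    }

    public void EndEdit()
    {
        _inEdit = false;
        _isNew = false;           // the item has now been committed once
    }

    public void CancelEdit()
    {
        if (!_inEdit)
            return;
        _inEdit = false;

        if (_isNew)
            _container.Remove(this);   // never committed: drop it from the list
        else
            Value = _saved;            // committed before: restore the old value
    }
}
```

    Even this reduced version needs a reference back to its container and an extra state flag, and it still ignores validation errors and change notification.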

    Finally, there's the IBindingList interface. This interface contains more than 20 methods and properties that controls query to describe the capabilities of the list. You must implement IBindingList for read-only lists or interactive sorting, or to support searching. That's before you get to anything involving navigation and hierarchies. I'm not even going to add an example of all that code.

    Several pages later, ask yourself, do you still want to create your own specialized collection? Or do you want to use a DataSet? Unless your collection is part of a performance-critical set of algorithms or must have a portable format, use the DataSet, especially the typed DataSet. It will save you tremendous amounts of time. Yes, you can argue that the DataSet is not the best example of object-oriented design. Typed DataSets break even more rules. But this is one of those times when your productivity far outweighs what might be a more elegant hand-coded design.


