OVAL Principles for Designing SSIS Packages
Created as a development framework for SSIS packages, the OVAL principles of package design encompass four facets of SSIS applications: identifying the Operations to be performed; the data Volume to be processed (in production); the Application of the right tools, tasks, sequence, and flow; and Location, that is, determining where the SSIS application will run.
(1) For data conversion, the best practice is first to tighten the data types of your data sources and targets so that they meet your business requirements as efficiently as possible, and to avoid unnecessary data type conversions.
(2) Another best practice is to remove redundant or unused columns from your data sources.
(3) Similarly, remove redundant columns after every asynchronous component.
(4) If you can filter your rows in the WHERE clause of your data source, that is optimal for your design. Sometimes, however, you might need to filter rows further along in the data flow; if so, filter them there, for example with the Conditional Split transformation.
(5) Rather than reading data from multiple tables in a single source database and combining them with the Merge Join task inside an SSIS package, it is significantly better to create database views or stored procedures and let the SQL Server database engine prepare the data for the other SSIS operations. Joining at the database level is one advantage; you can also eliminate Sort tasks, Conditional Splits, and Filter and Derived Column tasks by including ORDER BY and doing basic data cleansing with functions such as ISNULL, NULLIF, and TRIM.
(6) Remember that you can generate an SSIS package for bulk import by using the SQL Server Import and Export Wizard from within Microsoft SQL Server Management Studio.
(7) Using Variables
(8) Using the Lookup Task versus Merge Join
(9) Using Database Snapshots
(10) When using parallel-design techniques, always remember to allocate enough threads for server processing. The EngineThreads property is found on the Data Flow task in the control flow. As a rule of thumb: number of threads = number of sources + number of execution trees.
(11) Use Row Counts to show whether you have all the rows expected.
(12) Break out transformation logic specific to your environment.
(13) Always configure lookups to handle errors, especially when a fact table business key has no corresponding entry in the dimension table.
(14) Always follow an iterative design, development, and testing methodology such as Agile, rapid application development (RAD), Extreme Programming (XP), or the Microsoft Solutions Framework (MSF), all of which promote modularity and short cycles.
(15) Break complex, multisource, multidestination ETL into logically distinct packages (versus monolithic design).
(16) Separate subprocesses within a package into separate containers. This is a more elegant and easier way to develop, and it lets you simply disable whole containers when debugging.
(17) Use Script Task/Transform for one-off problems.
(18) Build custom components to reduce redundant script logic.
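
Items (4), (5), (11), and (13) above can be illustrated together with a small sketch. It is not SSIS itself: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, so IFNULL appears where T-SQL would use ISNULL. The tables, columns, sample rows, and the UNKNOWN_KEY value are all hypothetical, invented only for this example. The idea shown is that the view performs the join, filter, and cleansing inside the database, the source query does the sorting, the lookup routes unmatched business keys to a placeholder instead of failing, and a row count confirms that nothing was lost.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER,
                         status TEXT, note TEXT);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    CREATE TABLE dim_customer (customer_key INTEGER, customer_name TEXT);

    INSERT INTO orders VALUES
        (3, 10, 'OPEN', '  rush '), (1, 11, 'OPEN', ''),
        (2, 10, 'CANCELLED', NULL), (4, 12, 'OPEN', NULL);
    INSERT INTO customers VALUES (10, ' Ann '), (11, 'Bob'), (12, NULL);
    INSERT INTO dim_customer VALUES (100, 'Ann'), (101, 'Bob');

    -- Items (4) and (5): join, filter, and cleanse in the database
    -- rather than with Merge Join / Derived Column steps downstream.
    CREATE VIEW v_open_orders AS
    SELECT o.order_id,
           IFNULL(TRIM(c.name), 'Unknown') AS customer_name,
           NULLIF(TRIM(o.note), '')        AS note
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE o.status = 'OPEN';
""")

# Sorting is requested in the source query, so no separate sort step is needed.
source_rows = conn.execute(
    "SELECT order_id, customer_name, note FROM v_open_orders ORDER BY order_id"
).fetchall()

# Item (13): a lookup that handles business keys missing from the dimension.
dim = {
    name: key
    for key, name in conn.execute(
        "SELECT customer_key, customer_name FROM dim_customer"
    )
}
UNKNOWN_KEY = -1  # hypothetical placeholder key for unmatched rows

loaded, unmatched = [], []
for order_id, customer_name, note in source_rows:
    key = dim.get(customer_name)
    if key is None:
        unmatched.append(order_id)  # record the miss instead of failing
        key = UNKNOWN_KEY
    loaded.append((order_id, key, note))

# Item (11): compare row counts to confirm that every expected row arrived.
expected = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE status = 'OPEN'"
).fetchone()[0]
row_counts_match = len(loaded) == expected

print(loaded, unmatched, row_counts_match)
```

In an actual package, the view would typically be the source of the data flow, the unmatched rows would go to the Lookup's no-match or error output, and the count check would be done with Row Count transformations.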