Data Flow Diagram (DFD) provides a visual representation of the flow of information (i.e. data) within a system. By creating a Data Flow Diagram, you can tell the information provided by and delivered to someone who takes part in system processes, the information needed in order to complete the processes and the information needed to be stored and accessed. Data Flow Diagram is widely-used in software engineering. You can use DFD in modeling information systems. This article describes and explain Data Flow Diagram (DFD) by using a customer service system as an example.
The CS System Example
The data flow diagram is a hierarchy of diagram consist of:
- Context Diagram (conceptually level zero)
- The Level-1 DFD
- And possible Level-2 DFD and further levels of functional decomposition depending on the complexity of your system
The figure below shows a context Data Flow Diagram that is drawn for a railway company's Customer Service System. It contains a process (shape) that represents the system to model, in this case, the "CS System". It also shows the participants who will interact with the system, called the external entities. In this example, CS Assistant and Passenger are the two entities who will interact with the system. In between the process and the external entities, there are data flow (connectors) that indicate the existence of information exchange between the entities and the system.
Context DFD is the entrance of a data flow model. It contains one and only one process and does not show any data store.
Level 1 DFD
The figure below shows the level 1 DFD, which is the decomposition (i.e. break down) of the CS System process shown in the context DFD. Read through the diagram and then we will introduce some of the key concepts based
The CS System Data Flow Diagram example contains four processes, two external entities and four data stores. Although there is no design guidelines that governs the positioning of shapes in a Data Flow Diagram, we tend to put the processes in the middle and data stores and external entities on the sides to make it easier to comprehend.
Based on the diagram, we know that a Passenger can receive Transport details from the Inquiry Transport Details process, and the details are provided by the data stores Transport Details and Railway Live Statistic. While data stored in Transport Details are persistent data (indicated by the label "D"), data stored in Railway Live Statistic are transient data that are held for a short time (indicated by the label "T"). A callout shape is used to list out the kind of details that can be inquired by passenger.
CS Assistant can initiate the Buy Souvenir process, which will result in having the Order details stored in the Order data store. Although customer is the real person who buy souvenir, it is the CS Assistant who accesses the system for storing the order details. Therefore, we make the data flow from CS Assistant to the Buy Souvenir process.
CS Assistant can also initiate the Buy Ticket process by providing Order details and the details will be stored again in the Order data store. Data Flow Diagram is a high level diagram that is drawn with a high degree of abstraction. The data store Order which is drawn here does not necessarily imply a real order database or order table in a database. The way how order details are stored physically is to be decided later on when implementing the system.
Finally, CS Assistant can initiate the Report Lost process by providing the Incident and item details and the information will be stored in the Lost Item database.
Data Flow Diagram Tips and Cautions
Stating the type of data with D, M and T
Each data store which is drawn in a Data Flow Diagram are prefixed by a letter, which is 'D' by default. The letter indicates the kind of data the data store holds. The letter 'D' is used to represent a persistent computerized data, which is probably the most common kind of data type in a typical information system. Besides computerized data, data can also be held for a short time in temporary. We call this kind of data transient data and is represented by letter 'T'. Sometimes, data is stored without the use of a computer. We call this kind of data manual data and is represented by letter 'M'. Finally, if the data is stored without using computer and also is held for a short time, this is known as manual transient data and is represented by T(M).
Be aware of the level of details
In this Data Flow Diagram example, the word "details" is used many times when labeling data. We have "transport details" and "order details". What if we write them explicitly as "route information, train times and delays", "souvenir name, quantity and amount" and "ticket type and amount"? Is this correct? Well, there is no definite answer to this question but try to ask yourself a question when making a decision. Why are you drawing a DFD?
In most cases, Data Flow Diagram is drawn in the early phase of system development, where many details are yet to be confirmed. The use of general terminologies like "details", "information", "credential" certainly leave room for discussion. However, using general terms can be kind of lacking details and make the design lost its usefulness. So it really depends on the purpose of your design.
In a Data Flow Diagram, we focus on the interactions between the system and external parties, rather than the internal communications among interfaces. Therefore, data flows between interfaces and the data stores used are considered to be out of scope and should not be shown in the diagram.
Don't mix up data flow and process flow
Some designers may feel uncomfortable when coming across a connector connecting from a data store to a process, without showing the step of data request being specified on the diagram. Some designers will attempt to put a request attached to the connector between a process and a data store, labeling it "a request" or "request for something", which is surely unnecessary.
Keep in mind that Data Flow Diagram was designed for representing the exchange of information. Connectors in a Data Flow Diagram are for representing data, not for representing process flow, step or anything else. When we label a data flow that ends at a data store "a request", this literally means we are passing a request as data into a data store. Although this may be the case in implementation level as some of the DBMS do support the use of functions, which intake some values as parameters and return a result, however, in data flow diagram, we tend to treat data store as a sole data holder that does not possess any processing capability. If you want to model the system flow or process flow, you could use either Activity Diagram or BPMN Business Process Diagram instead. If you want to model the internal structure of data store, you may use Entity Relationship Diagram.