Pipelined Functions
Can you illustrate the usage of pipelined functions with a simple (EMP, DEPT) example? Under what circumstances can using a pipelined function be effective?
Pipelined functions are simply code that you can treat as if it were a database table. They give you the (amazing, to me) ability to write SELECT * FROM <PLSQL_FUNCTION>; in practice the function call is wrapped in the TABLE operator, as in SELECT * FROM TABLE(<PLSQL_FUNCTION>).
Anytime you think you could use SELECT * from a function instead of a table, a pipelined function might be useful. Consider the following extract/transform/load (ETL) process, whereby a flat file is fed into a PL/SQL function that transforms it and the transformed data is then used to update existing table data. It demonstrates quite a few database features, including external tables, pipelined functions, and MERGE.
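Before getting to the larger example, here is a minimal sketch of the idea on its own. The names num_list_type and gen_numbers are hypothetical, made up purely for illustration; the function does nothing more than pipe the numbers 1 through p_count back to the query:

```sql
-- Hypothetical collection type: a pipelined function must return a collection.
create or replace type num_list_type as table of number
/
-- Hypothetical function: pipes the integers 1 .. p_count, one row at a time.
create or replace function gen_numbers (p_count in number)
return num_list_type
PIPELINED
as
begin
    for i in 1 .. p_count
    loop
        pipe row (i);   -- hand one row back to the calling query
    end loop;
    return;             -- a pipelined function returns no value here
end;
/
-- Query the function as if it were a table. For a collection of scalars,
-- the single column is exposed under the name COLUMN_VALUE.
select column_value
  from TABLE(gen_numbers(3));
```

Run against a database where the type and function compile, the query should return three rows, one for each number piped.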
To create and use an external table, I need to use a directory object. I'll start by mapping a directory object to the temp directory:
SQL> create or replace
     directory data_dir as '/tmp/'
     /

Directory created.
Now, I'll create the external table. Part of its definition looks like a control file, because part of creating an external table is, in effect, creating a control file:
SQL> create table external_table
     (EMPNO    NUMBER(4),
      ENAME    VARCHAR2(10),
      JOB      VARCHAR2(9),
      MGR      NUMBER(4),
      HIREDATE DATE,
      SAL      NUMBER(7, 2),
      COMM     NUMBER(7, 2),
      DEPTNO   NUMBER(2)
     )
     ORGANIZATION EXTERNAL
     (type oracle_loader
      default directory data_dir
      access parameters
      (fields terminated by ',')
      location ('emp.dat')
     )
     /

Table created.
Now I'll use the flat utility to create a flat file from my EMP table data. You can find the flat utility at asktom.oracle.com/~tkyte/flat.
SQL> host flat scott/tiger -
> emp > /tmp/emp.dat
Now I am ready to test the external table; the flat file I created now works just like a database table:
SQL> select empno, ename, hiredate
       from external_table
      where ename like '%A%'
     /

     EMPNO ENAME  HIREDATE
---------- ------ ---------
      7499 ALLEN  20-FEB-81
      7521 WARD   22-FEB-81
      7654 MARTIN 28-SEP-81
      7698 BLAKE  01-MAY-81
      7782 CLARK  09-JUN-81
      7876 ADAMS  12-JAN-83
      7900 JAMES  03-DEC-81

7 rows selected.
I'll set up a PL/SQL ETL routine to ingest the flat file and output live data to be merged or inserted. A pipelined function needs to return a collection type, and I want to return a collection that looks like the EMP table itself, so I create the scalar object type and then I create a table of that type:
SQL> create or replace type
     emp_scalar_type as object
     (EMPNO    NUMBER(4),
      ENAME    VARCHAR2(10),
      JOB      VARCHAR2(9),
      MGR      NUMBER(4),
      HIREDATE DATE,
      SAL      NUMBER(7, 2),
      COMM     NUMBER(7, 2),
      DEPTNO   NUMBER(2)
     )
     /

Type created.

SQL> create or replace type
     emp_table_type as table
     of emp_scalar_type
     /

Type created.
Now I am ready to create the pipelined function itself. Note that the ETL function below is very simplistic; all it does is convert the ename column to lowercase, but you can include any complex logic you want, including the ability to log error records and the like:
create or replace
function emp_etl (p_cursor in sys_refcursor)
return emp_table_type
PIPELINED
as
    l_rec external_table%rowtype;
begin
    loop
        fetch p_cursor into l_rec;
        exit when (p_cursor%notfound);

        -- validation routine
        -- log bad rows elsewhere
        -- lookup some value
        -- perform conversion
        pipe row(
            emp_scalar_type(l_rec.empno,
                LOWER(l_rec.ename), l_rec.job,
                l_rec.mgr, l_rec.hiredate,
                l_rec.sal, l_rec.comm,
                l_rec.deptno) );
    end loop;

    return;
end;
/

Function created.
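To give a feel for what the "validation routine" and "log bad rows elsewhere" placeholders might turn into, here is a rough sketch. Everything new in it is an assumption of mine for illustration: the emp_etl_errors table, the log_bad_emp procedure, the emp_etl_checked function, and the two validation rules. The logging procedure is declared as an autonomous transaction so that it can insert and commit even though it is called from inside a query:

```sql
-- Hypothetical error table; not part of the original example.
create table emp_etl_errors
( empno   number,
  ename   varchar2(10),
  reason  varchar2(100),
  logged  date default sysdate
)
/
-- Hypothetical logging procedure. The autonomous transaction lets it
-- perform DML and commit independently of the calling query.
create or replace procedure log_bad_emp
( p_empno  in number,
  p_ename  in varchar2,
  p_reason in varchar2 )
as
    pragma autonomous_transaction;
begin
    insert into emp_etl_errors (empno, ename, reason)
    values (p_empno, p_ename, p_reason);
    commit;
end;
/
-- Hypothetical variant of emp_etl: bad rows are logged and skipped,
-- good rows are transformed and piped exactly as before.
create or replace function emp_etl_checked (p_cursor in sys_refcursor)
return emp_table_type
PIPELINED
as
    l_rec external_table%rowtype;
begin
    loop
        fetch p_cursor into l_rec;
        exit when (p_cursor%notfound);
        if l_rec.empno is null
        then
            -- assumed rule: every row needs a key
            log_bad_emp(l_rec.empno, l_rec.ename, 'missing empno');
        elsif l_rec.sal < 0
        then
            -- assumed rule: salaries cannot be negative
            log_bad_emp(l_rec.empno, l_rec.ename, 'negative salary');
        else
            pipe row(
                emp_scalar_type(l_rec.empno,
                    LOWER(l_rec.ename), l_rec.job,
                    l_rec.mgr, l_rec.hiredate,
                    l_rec.sal, l_rec.comm,
                    l_rec.deptno) );
        end if;
    end loop;
    return;
end;
/
```

Because the logging commits in its own transaction, rejected rows would stay in emp_etl_errors even if the statement that called the function were later rolled back; whether that is what you want depends on your process.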
The emp_etl pipelined function works just like a table. The following query selects two columns (empno, ename) from the function, and the function in turn reads every column of the external table through the cursor passed to it:
SQL> select empno, ename
       from TABLE(emp_etl(
                  cursor(select *
                           from external_table
                        ) ) )
      where ename like '%a%';

     EMPNO ENAME
---------- ------
      7499 allen
      7521 ward
      7654 martin
      7698 blake
      7782 clark
      7876 adams
      7900 james

7 rows selected.
Note the use of the keyword PIPELINED in the definition of this function; the keyword is mandatory in the making of a pipelined function. Also note the use of the pipe row directive in PL/SQL—that is the magic that makes a pipelined function really interesting. The pipe row directive returns data to the client immediately, meaning that I am getting output from this function in my client routine before the function generates the last row of data. If the cursor I send to this pipelined function returns 1,000,000 rows, I will not have to wait for PL/SQL to process all 1,000,000 rows to get the first row; data will start coming back as soon as it is ready. That is why these are called pipelined functions: Data streams—as if in a big pipe—from the cursor to the PL/SQL function to the caller.
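One simple way to see this streaming behavior for yourself is to ask for only the first few rows. The query below is a sketch that assumes the emp_etl function and external_table created above; with a ROWNUM limit in place, the query can stop fetching once it has three rows, rather than waiting for the function to work through the entire file:

```sql
-- Ask the pipelined function for just the first three rows it produces.
select empno, ename
  from TABLE(emp_etl(cursor(select * from external_table)))
 where rownum <= 3;
```

With a 14-row file the difference will not be noticeable, but against a large input the first rows should come back well before the whole file could have been processed.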
Now, to finish the job, I'll create a table of data I would like to refresh from the source system, which sends me the flat file I produced above. The logic is as follows: If the record already exists in my database, UPDATE the ename and the sal columns; if the record does not exist, INSERT it. I'll start with some of the data from the EMP table:
SQL> create table emp as
     select * from scott.emp
      where mod(empno,2) = 0;

Table created.
And here is the MERGE, which manages data from the flat file, through ETL, straight to the table, without hitting the disk with staging files:
SQL> merge into EMP e1
     using (select *
              from TABLE
                   (emp_etl(
                      cursor(select *
                               from external_table))
                   )
           ) e2
     on (e2.empno = e1.empno)
     when matched then
       update set e1.sal = e2.sal,
                  e1.ename = e2.ename
     when not matched then
       insert (empno, ename, job, mgr,
               hiredate, sal, comm, deptno)
       values (e2.empno, e2.ename,
               e2.job, e2.mgr,
               e2.hiredate, e2.sal,
               e2.comm, e2.deptno)
     /

14 rows merged.