You Asked
I have a table from a third-party application that is used to track
an order through the various manufacturing operations. A subset of
the information looks like this:
ORDER OPN STATION CLOSE_DATE
----- --- ------- ----------
12345  10 RECV    07/01/2003
12345  20 MACH1   07/02/2003
12345  25 MACH1   07/05/2003
12345  30 MACH1   07/11/2003
12345  36 INSP1   07/12/2003
12345  50 MACH1   07/16/2003
12345  90 MACH2   07/30/2003
12345 990 STOCK   08/01/2003
Each row is a process that the order had to go through, and OPN
gives the order of the processes.
What I would like to receive is the output grouped by consecutive
STATION values and include the start and close dates for each
STATION group. The start date is defined as the date the prior
station closed. So the output expected from the above data subset
would be:
ORDER STATION START_DATE CLOSE_DATE
----- ------- ---------- ----------
12345 RECV               07/01/2003
12345 MACH1   07/01/2003 07/11/2003
12345 INSP1   07/11/2003 07/12/2003
12345 MACH1   07/12/2003 07/16/2003
12345 MACH2   07/16/2003 07/30/2003
12345 STOCK   07/30/2003 08/01/2003
Is this possible? I've tried using analytics, but I can't seem to
get what I want. I can use the LAG function to get the start and
close dates, grouped by STATION, but it will group all the different
STATION values together (i.e. all MACH1 STATIONS will be grouped
together), not just the consecutive STATION values. I could use
procedural code to get this answer, but I wanted to see whether
it could be done in one statement.
I'm sure it will be something easy, but I've been racking my tiny
brain over this for the last few days and can't come up with a
solution. Can you help?
Many thanks,
Michael T.
and we said...
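To follow along, the queries below can be run against a table like this one. It is only a sketch of a setup: the table name T and the columns ORDER#, STATION and CLOSE_DATE are taken from the queries, the OPN column from the listing in the question, and the datatypes are assumptions.

```sql
-- Assumed definition; the real third-party table is not shown in the question.
create table t
( order#     number,
  opn        number,
  station    varchar2(10),
  close_date date
);

-- Sample rows copied from the question.
insert into t values ( 12345,  10, 'RECV',  to_date('07/01/2003','mm/dd/yyyy') );
insert into t values ( 12345,  20, 'MACH1', to_date('07/02/2003','mm/dd/yyyy') );
insert into t values ( 12345,  25, 'MACH1', to_date('07/05/2003','mm/dd/yyyy') );
insert into t values ( 12345,  30, 'MACH1', to_date('07/11/2003','mm/dd/yyyy') );
insert into t values ( 12345,  36, 'INSP1', to_date('07/12/2003','mm/dd/yyyy') );
insert into t values ( 12345,  50, 'MACH1', to_date('07/16/2003','mm/dd/yyyy') );
insert into t values ( 12345,  90, 'MACH2', to_date('07/30/2003','mm/dd/yyyy') );
insert into t values ( 12345, 990, 'STOCK', to_date('08/01/2003','mm/dd/yyyy') );
commit;
```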
So, we want to keep the rows that are listed below. Here a "pair" is
the first and last row of a run of consecutive rows with the same STATION:
a) the first row in the partition (where lag_station is null)
b) the last row in the partition (where lead_station is null)
c) the first of a possible pair (where lag_station <> station)
d) the second of a possible pair (where lead_station <> station)
This query does that:
ops$tkyte@ORA920> select order#,
                         station,
                         lag_close_date,
                         close_date,
                         decode( lead_station, station, 1, 0 ) first_of_pair,
                         decode( lag_station, station, 1, 0 ) second_of_pair
                    from (
                  select order#,
                         lag(station) over (partition by order# order by close_date)
                            lag_station,
                         lead(station) over (partition by order# order by close_date)
                            lead_station,
                         station,
                         close_date,
                         lag(close_date) over (partition by order# order by close_date)
                            lag_close_date,
                         lead(close_date) over (partition by order# order by close_date)
                            lead_close_date
                    from t
                         )
                   where lag_station is null
                      or lead_station is null
                      or lead_station <> station
                      or lag_station <> station
                  /
ORDER# STATION LAG_CLOSE_ CLOSE_DATE FIRST_OF_PAIR SECOND_OF_PAIR
------ ------- ---------- ---------- ------------- --------------
 12345 RECV               07/01/2003             0              0
 12345 MACH1   07/01/2003 07/02/2003             1              0
 12345 MACH1   07/05/2003 07/11/2003             0              1
 12345 INSP1   07/11/2003 07/12/2003             0              0
 12345 MACH1   07/12/2003 07/16/2003             0              0
 12345 MACH2   07/16/2003 07/30/2003             0              0
 12345 STOCK   07/30/2003 08/01/2003             0              0

7 rows selected.
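We can see from the 1's which rows are the first and second of a pair. All we need to do now is
"reach forward" from the first of a pair and grab the close date from the next record. Because
analytic functions are evaluated after the WHERE clause of their own query block, a LEAD placed in
the same block as the filter sees only the rows we kept, so the next record is the second of the pair: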
ops$tkyte@ORA920> select order#,
                         station,
                         lag_close_date,
                         close_date
                    from (
                  select order#,
                         station,
                         lag_close_date,
                         decode( lead_station,
                                 station,
                                 lead(close_date) over (partition by order# order by close_date),
                                 close_date ) close_date,
                         decode( lead_station, station, 1, 0 ) first_of_pair,
                         decode( lag_station, station, 1, 0 ) second_of_pair
                    from (
                  select order#,
                         lag(station) over (partition by order# order by close_date)
                            lag_station,
                         lead(station) over (partition by order# order by close_date)
                            lead_station,
                         station,
                         close_date,
                         lag(close_date) over (partition by order# order by close_date)
                            lag_close_date,
                         lead(close_date) over (partition by order# order by close_date)
                            lead_close_date
                    from t
                         )
                   where lag_station is null
                      or lead_station is null
                      or lead_station <> station
                      or lag_station <> station
                         )
                   where second_of_pair <> 1
                  /
ORDER# STATION LAG_CLOSE_ CLOSE_DATE
------ ------- ---------- ----------
 12345 RECV               07/01/2003
 12345 MACH1   07/01/2003 07/11/2003
 12345 INSP1   07/11/2003 07/12/2003
 12345 MACH1   07/12/2003 07/16/2003
 12345 MACH2   07/16/2003 07/30/2003
 12345 STOCK   07/30/2003 08/01/2003

6 rows selected.
The outermost WHERE clause (second_of_pair <> 1) then discards the second-of-pair rows.
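The "reach forward" step depends on where the analytic function sits relative to the filter. A small sketch of the difference, assuming the same table T with the sample rows from the question (NEXT_CLOSE is just an illustrative alias):

```sql
-- LEAD in the same query block as the WHERE clause: the MACH1 rows are
-- filtered out first, so for RECV the "next" row is INSP1.
select station,
       close_date,
       lead(close_date) over (partition by order# order by close_date) next_close
  from t
 where station <> 'MACH1';

-- LEAD inside an inline view, filter applied outside: LEAD has already
-- looked at every row, so for RECV the "next" row is the first MACH1 row.
select station,
       close_date,
       next_close
  from (
select order#,
       station,
       close_date,
       lead(close_date) over (partition by order# order by close_date) next_close
  from t
       )
 where station <> 'MACH1';
```

The query above uses the first form on purpose: its LEAD sits in the block that has already dropped the middle rows of each run.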
That is another way to do it, and an insight into how I develop analytic queries: I add extra
columns such as FIRST_OF_PAIR and SECOND_OF_PAIR just to see visually what I want to do.
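For comparison, here is a sketch of a different technique from the one above: flag the first row of each run, turn the flags into a group number with a running SUM, and aggregate per group. It rests on assumptions not confirmed here: that T has an OPN column (as in the question's listing) with unique values per order, and that ordering by OPN rather than CLOSE_DATE is what is wanted. NEW_GRP and GRP are illustrative names.

```sql
select order#,
       station,
       -- start date = close date of the row just before the run began
       max( decode( new_grp, 1, lag_close_date ) ) start_date,
       -- close date = last close date within the run
       max( close_date ) close_date
  from (
select order#, station, close_date, lag_close_date, new_grp,
       -- running total of the flags numbers the runs 1, 2, 3, ...
       sum(new_grp) over (partition by order# order by opn) grp
  from (
select order#, opn, station, close_date,
       lag(close_date) over (partition by order# order by opn) lag_close_date,
       -- 1 when the station differs from the prior row (or there is no prior row)
       decode( lag(station) over (partition by order# order by opn),
               station, 0, 1 ) new_grp
  from t
       )
       )
 group by order#, station, grp
 order by order#, grp;
```

Against the sample data this should give the same six rows as the query above, and it also copes with runs of any length without the pair logic, at the cost of a GROUP BY.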