最近一直在看《SQL puzzles and answers》,其中EMPLOYMENT AGENCY一文让我眼前一亮。在这里把我对此文的概括与理解写下来与大家分享(代码在SQL Server 2005中测试通过)。
需求描述
对于人力资源网站或职业中介,会保存大量求职者的简历信息。通过让求职者勾选技能关键字或对简历进行分词都可以得到求职者所掌握的技能列表,而且可提供的技能关键字很多(书中提到的系统中的关键字来自于某职业字典,超过250000个)。
求职者的技能列表如下:
create table CandidateSkills ( candidate_id int not null, skill_code char(15) not null, primary key(candidate_id, skill_code) ); insert into CandidateSkills select 100, 'accounting' union all select 100, 'inventory' union all select 100, 'manufacturing' union all select 200, 'accounting' union all select 200, 'inventory' union all select 300, 'manufacturing' union all select 400, 'inventory' union all select 400, 'manufacturing' union all select 500, 'accounting' union all select 500, 'manufacturing';
另一方面,人力资源网站还会存储企业开放出来的职位信息,这些职位对于技能有不同的要求。例如:某职位要求技能为('inventory’ and ‘manufacturing’) or 'accounting’。现在的问题是:
1. 如何设计职位技能要求表(JobOrders)?
2. 如何把求职者技能(CandidateSkills)与职位技能要求(JobOrders)进行匹配?
职位要求表设计
首先我们来解决JobOrders表设计的问题。要表达技能间的逻辑关系,我们可以引入了skill_group列。JobOrders表的脚本如下:
create table JobOrders ( job_id int not null, skill_group int not null, skill_code char(15) not null, primary key(job_id, skill_group, skill_code) );
假设有如下职位需求:
Job 1 = (‘inventory’ and ‘manufacturing’) or ‘accounting’
Job 2 = (‘inventory’ and ‘manufacturing’) or (‘accounting’ and ‘manufacturing’)
Job 3 = ‘manufacturing’
Job 4 = (‘inventory’ and ‘manufacturing’ and ‘accounting’)
插入上述需求的脚本如下:
insert into JobOrders select 1, 1, 'inventory' union all select 1, 1, 'manufacturing' union all select 1, 2, 'accounting' union all select 2, 1, 'inventory' union all select 2, 1, 'manufacturing' union all select 2, 2, 'accounting' union all select 2, 2, 'manufacturing' union all select 3, 1, 'manufacturing' union all select 4, 1, 'inventory' union all select 4, 1, 'manufacturing' union all select 4, 1, 'accounting';
JobOrders中数据的规则如下:
- 对于相同job_id,相同skill_group的skill_code之间为‘与’关系。对于相同job_id,不同skill_group的skill_code之间为‘或’关系。
- 除此之外,还有一条约束为‘与’关系优先‘或’关系。
对于'manufacturing’ and (‘inventory’ or ‘accounting’)逻辑,可以转化为('manufacturing’ and ‘inventory’) or (‘manufacturing’ and ‘accounting’)。
职位要求与求职者的匹配
现在我们来解决如何匹配职位要求和求职者技能。
- 解决方案1
select distinct J1.job_id, C1.candidate_id from JobOrders as J1 inner join CandidateSkills as C1 on J1.skill_code = C1.skill_code group by C1.candidate_id, J1.skill_group, J1.job_id having count(*) >= (select count(*) from JobOrders as J2 where J1.skill_group = J2.skill_group and J1.job_id = J2.job_id);
- 解决方案2
With JobRequirementCnt as ( select job_id, skill_group, count(*) grp_cnt from JobOrders group by job_id, skill_group ), CandidateSkillGrpCnt as ( select CS.candidate_id, JO.job_id, JO.skill_group, count(*) grp_cnt from CandidateSkills CS inner join JobOrders JO on CS.skill_code = JO.skill_code group by CS.candidate_id, JO.job_id, JO.skill_group ) select distinct CandidateSkillGrpCnt.candidate_id, CandidateSkillGrpCnt.job_id from JobRequirementCnt inner join CandidateSkillGrpCnt on JobRequirementCnt.job_id = CandidateSkillGrpCnt.job_id and JobRequirementCnt.skill_group = CandidateSkillGrpCnt.skill_group and JobRequirementCnt.grp_cnt = CandidateSkillGrpCnt.grp_cnt;