Can we use big data for skills anticipation and matching? The case - - PowerPoint PPT Presentation

Can we use big data for skills anticipation and matching? The case of Online Job Vacancies. Fabio Mercorio, PhD in AI, Assistant Professor in AI and Data Science, University of Milano-Bicocca. Geneva, September 20th, 2019.


slide-1
SLIDE 1

“Can we use big data for skills anticipation and matching?” The case of Online Job Vacancies

Fabio Mercorio, PhD in AI
Assistant Professor in AI and Data Science
University of Milano-Bicocca
Geneva, September 20th, 2019

slide-2
SLIDE 2

The CRISP Centre @Unimib

The Interuniversity Research Centre on Public Services (CRISP) is an interdisciplinary academic network of universities in Milan, led by Unimib.

Research Goal. To study socio-economic phenomena, and to support policy and decision makers in analysing them, through novel AI, Big Data and Statistics algorithms and pipelines that process statistical, administrative and web data.

slide-3
SLIDE 3
Some Research Projects on Big Data for LM [selection]

  • (2014-2016) Cedefop I [Prototype real-time skill and vacancies analysis]. 5 countries (United Kingdom, Ireland, Czech Republic, Italy and Germany). Granted by Cedefop.
  • (2016-ongoing) Cedefop II [Development of Cedefop I real-time system for Europe]. 28 EU Countries, 32 languages supported. Granted by Cedefop.
  • (2016-ongoing) ETF [Guide on putting Big Data into LMI + source selection for Tunisia and Morocco]. Granted by the European Training Foundation.
  • (2016-ongoing) Italian Digital Competences Observatory 2017, 2018 and 2019 [Estimate Skill Impact on ICT jobs]. Granted by Italian ICT Unions.
  • (2018-ongoing) EXCELSIOR [Put OJV into Official Occupation Statistics]. Granted by the Italian Union of Chambers of Commerce, Industry, Crafts and Agriculture.
  • (2017-ongoing) REPLY-VET [Strengthening key competencies of low-skilled people in VET to cover future replacement positions]. Erasmus+.

slide-4
SLIDE 4

Labour Market Challenging Factors

LM CHALLENGING FACTORS

1. Skills Evolution
2. New Emerging Occupations
3. Job Automatisation/Replacement
4. Mobility

LM NEEDS

1. Updated information (near-real-time)
2. Data driven decisions (let data speak)
3. Process LM data at scale

Labour Market Intelligence (LMI): Design, define and implement AI-based framework and algorithms to derive knowledge from labour market information

slide-5
SLIDE 5

“Can we use big data for skills anticipation and matching?”

  • Occupation and Skill Discovery: Focus on occupations and skills requested by the online-LM
  • Soft/Digital/Hard Skill Rates: How to estimate the impact of digitalization within occupations?
  • New Emerging Occupations on the basis of skill (dis)similarities
  • Training Course Design through skills identification
  • Taxonomy Extension: Improve skills/occupations taxonomy through semantic similarities within OJVs

slide-6
SLIDE 6

How to deal with OJVs at scale?

slide-7
SLIDE 7

Web Job Vacancy example

Job Title: Data Scientist.

Description: We’re looking for a talented Computer Scientist to join our growing development team. Your expertise in data will help us take this to the next level. You will be responsible for identifying opportunities to further improve how we connect recruiters with jobseekers, and designing and implementing solutions. […]

Required skills and experience:

  • SQL and relational databases;
  • Data analysis with R (or Matlab);
  • Processing large data sets with MapReduce and Hadoop;
  • Real time analytics with Spark, Storm or similar;
  • Machine Learning;
  • Natural Language Processing (NLP) and text mining;
  • Development in C++, Python, Perl;
  • Experience with search engines e.g. Lucene/Solr or ElasticSearch advantageous

slide-8
SLIDE 8

Web Job Vacancy example (annotated)

[Figure: the Data Scientist vacancy from the previous slide, annotated. Diagram labels: Job Vac., classified on, ISCO/ESCO Occ., Novel Occ., contained, linkage, ESCO SKILL, Web SKILL, NEW SKILL]
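The annotation idea on this slide (skills contained in a vacancy are linked to ESCO where possible, and the rest are flagged as new) can be illustrated with a minimal dictionary-lookup sketch. Everything named here is an assumption for illustration: `ESCO_SKILLS` is a tiny hypothetical stand-in rather than real ESCO data, `CANDIDATE_TERMS` and `extract_skills` are made-up names, and plain term matching replaces whatever extraction method the real system uses.

```python
import re

# Hypothetical, tiny stand-in for the ESCO skill vocabulary (not real ESCO data).
ESCO_SKILLS = {"sql", "machine learning", "natural language processing", "python"}

# Hypothetical list of candidate skill terms spotted in Web vacancies.
CANDIDATE_TERMS = ESCO_SKILLS | {"hadoop", "spark", "elasticsearch"}


def extract_skills(text):
    """Split the skill terms found in a vacancy into ESCO-linked and new ones."""
    lowered = text.lower()
    found = set()
    for term in CANDIDATE_TERMS:
        # Whole-term match so that a term does not match inside a longer word.
        if re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", lowered):
            found.add(term)
    linked = sorted(found & ESCO_SKILLS)
    new = sorted(found - ESCO_SKILLS)
    return linked, new


vacancy = (
    "SQL and relational databases; Processing large data sets with MapReduce "
    "and Hadoop; Real time analytics with Spark, Storm or similar; "
    "Machine Learning; Development in C++, Python, Perl"
)
linked, new = extract_skills(vacancy)
```

Here `linked` holds the terms that match the stand-in vocabulary and `new` holds the terms that would be candidates for human review.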

slide-9
SLIDE 9

The Methodological path

1. Source selection, ranking and data ingestion
2. Data pre-processing, transformation and cleansing
3. Classification
4. Skills extraction
5. Analysis and data visualisation
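One task that usually falls under the cleansing step of such a path is collapsing the same vacancy collected from several sources into one record. The sketch below is only an illustration under a strong assumption: two vacancies count as duplicates when their title and description are identical after simple text normalisation. The field names and the functions `normalise` and `deduplicate` are hypothetical, and a production system would likely need fuzzier matching.

```python
import hashlib
import re


def normalise(text):
    """Lower-case the text and collapse punctuation and whitespace runs."""
    return re.sub(r"[\W_]+", " ", text.lower()).strip()


def deduplicate(vacancies):
    """Keep the first vacancy for each normalised (title, description) pair."""
    seen = set()
    unique = []
    for vacancy in vacancies:
        key_text = normalise(vacancy["title"]) + "|" + normalise(vacancy["description"])
        key = hashlib.sha1(key_text.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(vacancy)
    return unique


# Toy input: the first two records are the same vacancy posted on two sources.
raw = [
    {"source": "A", "title": "Data Scientist", "description": "SQL, Python; R."},
    {"source": "B", "title": "data scientist ", "description": "SQL, Python, R"},
    {"source": "A", "title": "Web Developer", "description": "HTML and CSS"},
]
unique = deduplicate(raw)
```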

slide-10
SLIDE 10

ITALIAN Real-Time Labour Market Monitor

[Diagram labels: Big Data, AI, Eco + Stat, Labour Market Intelligence]

OJVs collected since 2013: 4M+ unique vacancies

slide-11
SLIDE 11

EUROPEAN Real-Time Labour Market Monitor

[Diagram labels: Big Data, AI, Eco + Stat, Labour Market Intelligence]

28 EU Countries, 32 languages, more than 6M unique vacancies per month

slide-12
SLIDE 12

Occupation and Skill Discovery: Focus on occupations and skills requested by the online-LM

slide-13
SLIDE 13

Live demo: Territorial dimension

Credits to WollyBI: a trademark of TabulaeX

slide-14
SLIDE 14

Live demo: skills gap

Credits to WollyBI: a trademark of TabulaeX

slide-15
SLIDE 15

Soft/Digital/Hard Skill Rates: How to estimate the impact of digitalization within occupations?

slide-16
SLIDE 16

Compute Skills Rates

Goal: Estimate the pervasiveness of ICT in both ICT-related and non-ICT-related jobs.

Idea: Exploit the informative power of classified OJVs to compute the Digital Skill Rate (DSR), the Soft Skill Rate and the Hard non-digital Skill Rate.

The DSR estimates the incidence of digital skills in a single profession. It comes from observing the pervasiveness of digital skills in all professions, whether or not they are related to the ICT world.
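One possible reading of the DSR is the share of digital skill mentions among all skill mentions in the vacancies classified under an occupation. The sketch below follows that reading only as an assumption: the formula, the category labels ("digital", "soft", "hard"), the function `skill_rates` and the toy data are all illustrative, and the slide does not state the exact computation.

```python
from collections import Counter, defaultdict

# Toy classified vacancies: (occupation, [(skill, category), ...]).
# Occupations, skills and categories are illustrative, not real data.
VACANCIES = [
    ("HR training specialist", [("ERP", "digital"), ("teamwork", "soft"),
                                ("labour law", "hard")]),
    ("HR training specialist", [("database usage", "digital"),
                                ("communication", "soft")]),
    ("Graphic designer", [("3D modelling", "digital"),
                          ("web programming", "digital"),
                          ("creativity", "soft"), ("drawing", "hard")]),
]


def skill_rates(vacancies):
    """Return, per occupation, the share of skill mentions in each category."""
    counts = defaultdict(Counter)
    for occupation, skills in vacancies:
        for _skill, category in skills:
            counts[occupation][category] += 1
    rates = {}
    for occupation, by_category in counts.items():
        total = sum(by_category.values())
        rates[occupation] = {c: n / total for c, n in by_category.items()}
    return rates


rates = skill_rates(VACANCIES)
dsr_hr = rates["HR training specialist"]["digital"]  # assumed Digital Skill Rate
```

Under this definition the three rates of an occupation sum to 1, which makes occupations directly comparable regardless of how many vacancies each one has.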

slide-17
SLIDE 17

SOURCE WOLLYBI

Compute Skills Rates

Demand for digital, specialist and soft skills, by sector

Ad hoc analyses at different levels of granularity, here focusing on the «sector»…

Credits to WollyBI: a trademark of TabulaeX

slide-18
SLIDE 18

SOURCE WOLLYBI

Compute Skills Rates

  • Applied and Management Skills = ability to use tools and software to manage both operational and decisional processes
  • ICT Techniques Skills = highly specialized skills on solutions, platforms and programming languages
  • Basic Skills = for everyday use of basic IT tools
  • Information Brokerage Skills = for the use of IT tools aimed at corporate communication

[Chart: skill categories (Applied and Management Skills, Basic, ICT techniques, Information Brokerage) by occupation (Secretary personnel, HR training specialist, Graphic and multimedia designers, Industrial and management engineers)]

… and more, looking at each occupation…

Credits to WollyBI: a trademark of TabulaeX

slide-19
SLIDE 19

SOURCE WOLLYBI

Compute Skills Rates [ESCO skills + novel]

… and more, looking at elementary skills

Occupation: HR training specialist
  • Skill categories: Applied and Management Skills, Information Brokerage
  • Elementary skills: Database usage, ERP, Digital data management, SEO (Search Engine Optimiz.), Social Network Usage

Occupation: Graphic and multimedia designers
  • Skill categories: Applied and Management Skills, Information Brokerage, ICT techniques
  • Elementary skills: Database usage, Programs for draughtsman, 3D modelling, Front-end Website implementation, Web programming, Graphic Software Usage, SW markup usage

Credits to WollyBI: a trademark of TabulaeX

slide-20
SLIDE 20

New Emerging Occupations on the basis of skill (dis)similarities

slide-21
SLIDE 21

Detecting new emerging occupations through AI

[Diagram: Job Vacancy Sources A, B, …, N → ISCO IV digit Classifier → vacancies classified over codes A, B, …, Z of ISCO-08 → word-embeddings per code → word similarities on codes A, B, …, Z → suggested new potential occupations → Human-AI interaction → approved new occupations]

1. Classify OJVs over ISCO IV digit
2. Build up several vector-space representations of words (occupations and skills) to catch lexicon similarities between OJVs
3. Compute similarities between known terms (occupations and skills) and new ones
4. Suggest new potential occupations for Human-AI validation
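The similarity and suggestion steps can be illustrated with a toy sketch. It makes several assumptions that are not in the slides: skill co-occurrence counts stand in for trained word embeddings, cosine similarity is the similarity measure, and a title is suggested for review when its best match among known titles falls below an arbitrary `threshold`. All names and data are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Toy vacancies already classified under one ISCO code: (job title, skills).
# Titles and skills are illustrative only.
VACANCIES = [
    ("software developer", ["java", "sql", "git"]),
    ("software developer", ["python", "sql", "git"]),
    ("statistician", ["r", "statistics", "sas"]),
    ("data scientist", ["python", "sql", "statistics", "r"]),
]
KNOWN_TITLES = {"software developer", "statistician"}


def title_vectors(vacancies):
    """Represent each title by the counts of skills it co-occurs with."""
    vectors = defaultdict(Counter)
    for title, skills in vacancies:
        vectors[title].update(skills)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_candidates(vacancies, known, threshold=0.8):
    """Flag unknown titles whose best match among known titles is weak."""
    vectors = title_vectors(vacancies)
    suggestions = {}
    for title, vec in vectors.items():
        if title in known:
            continue
        best = max(known, key=lambda k: cosine(vec, vectors[k]))
        score = cosine(vec, vectors[best])
        if score < threshold:
            # Candidate for human validation, with its closest known title.
            suggestions[title] = (best, score)
    return suggestions


candidates = suggest_candidates(VACANCIES, KNOWN_TITLES)
```

The output keeps the closest known title next to each candidate so that a human reviewer can judge whether the term is a new occupation or an alternative label.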

slide-22
SLIDE 22

(Some) New Emerging Occupations

  • Data Scientist
  • Cloud Computing
  • Cyber Security Expert
  • Business Intelligence Analyst
  • Big Data Analyst
  • Social Media Marketing

9,000 Web Job Vacancies related to (some) new emerging occupations (above) were collected between Jan-2014 and August-2019

slide-23
SLIDE 23

DATA SCIENTIST – 1.7k vacancies, 2014-2019

HARD SKILLS
Categories: MATH & STAT, COMPUTING, BUS & ADM
Shares shown: 48%, 39%, 13%

  • Data Analysis, Statistical Learning
  • SAS, R
  • SAP & SPSS*
  • SQL, Python, Hadoop
  • BI, Machine-Learning
  • Data Integration
  • Public Relations
  • Management
  • Clients Relations Management

SOFT SKILLS
Categories: BEHAVIOURS, FOREIGN LANGUAGES, PROBLEM SOLVING, COLLABORATION, LEADERSHIP
Shares shown: 38%, 29%, 13%, 7%, 5%

Variation 2019 vs 2018: +31%
Variation 2019 vs 2017: +149%

slide-24
SLIDE 24

SOCIAL MEDIA SPECIALIST – 0.4k vacancies, 2014-2019

HARD SKILLS
Categories: BUS & ADM, COMPUTING, MATH & STAT
Shares shown: 55%, 39%, 16%

  • Adobe Photoshop & HTML5
  • Google Analytics & AdWords*
  • CMS (Content Management System)*
  • Management
  • Public Relations
  • Marketing
  • Data Analysis

SOFT SKILLS
Categories: BEHAVIOURS, FOREIGN LANGUAGES, CREATIVE & ENTREPN. THINKING, PROBLEM SOLVING, INFORMATION & COMMUNICATION
Shares shown: 42%, 35%, 7%, 6%, 4%

Variation 2019 vs 2018: +105%
Variation 2019 vs 2017: +123%

slide-25
SLIDE 25

New Occupations can be compared against traditional ones using different taxonomies (the European Competence Framework in this case)

slide-26
SLIDE 26

Training Course Design through skills identification http://tiny.cc/masterBI

slide-27
SLIDE 27

LMI for Course Design allows one to:

1. Identify the most requested occupational profiles related to the course goals and characteristics (fine-grained geo level)
2. Estimate the salary for those professions (course attractiveness)
3. Improve course programmes by including top skills (soft/digital and hard) requested by LM
4. Iteratively refine and improve the whole Course to stay focused on real LM expectations
5. Design paths according to student backgrounds guiding them through the course
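The point about including top skills requested by the LM can be sketched as a simple frequency ranking over the vacancies of the occupations a course targets. This is a hypothetical illustration: the function `top_skills`, the toy vacancies and the choice to count each skill once per vacancy are assumptions, not the procedure used for the course mentioned in the slides.

```python
from collections import Counter

# Toy vacancies for occupations relevant to a course: (occupation, skills).
VACANCIES = [
    ("Business Intelligence Analyst", ["sql", "data visualisation", "teamwork"]),
    ("Business Intelligence Analyst", ["sql", "etl", "problem solving"]),
    ("Data Scientist", ["python", "sql", "machine learning"]),
    ("Web Developer", ["javascript", "html"]),
]


def top_skills(vacancies, target_occupations, k=3):
    """Rank skills by the number of target-occupation vacancies requesting them."""
    counts = Counter()
    for occupation, skills in vacancies:
        if occupation in target_occupations:
            counts.update(set(skills))  # count each skill once per vacancy
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


ranking = top_skills(VACANCIES, {"Business Intelligence Analyst", "Data Scientist"})
```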

slide-28
SLIDE 28

Taxonomy Extension: Improve skills/occupations taxonomy through semantic similarities within OJVs

slide-29
SLIDE 29

Main Idea

[Diagram: three layers of labels]

  • ISCO IV digit: 2512 SW dev, 2514 App Prog
  • ESCO V digit & alternative labels: Web Developer, Application Engineer, App Developer, SW designer
  • Mentions from OJV: Full-stack Developer, Full-stack Engineer, Data Migration Analyst, SCRUM Tester

KEY Questions: How to…

1. keep the taxonomy up-to-date with labour market expectations and lexicon?
2. enrich the taxonomy by linking those mentions to the corresponding entities in the taxonomy?
3. estimate similarities between all taxonomy entities?
4. estimate the relevance of taxonomy entities?

slide-30
SLIDE 30

Extending Skills Taxonomy

Compare different Web labour markets [IT, UK and DE here] to perform Skill Gap Analysis using the taxonomy as a baseline.

Question: “Starting from a given occupation of the ISCO taxonomy for the ITA labour market, what is the occupation in the UK labour market whose requested skills fit best?”

Skills associated with “Web Technician” in ITA are more similar to those of a “Web and Multimedia Developer” in UK.

[Figure labels: Skills in common, Skills GAP]
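The best-fit question on this slide can be sketched as a set comparison. The example assumes Jaccard similarity over skill sets as the fit measure and defines the gap as the skills requested in the target market but not in the source one; both choices, the function names and the toy skill sets are illustrative and are not taken from the slides.

```python
# Toy skill sets per occupation and country; all values are illustrative.
ITA = {"Web Technician": {"html", "css", "javascript", "cms", "seo"}}
UK = {
    "Web Technician": {"html", "networking", "hardware support"},
    "Web and Multimedia Developer": {"html", "css", "javascript", "php", "seo"},
}


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_fit(source_skills, target_market):
    """Return the target occupation with the most similar skill set."""
    name = max(target_market,
               key=lambda occ: jaccard(source_skills, target_market[occ]))
    target_skills = target_market[name]
    return {
        "occupation": name,
        "similarity": jaccard(source_skills, target_skills),
        "in_common": sorted(source_skills & target_skills),
        "gap": sorted(target_skills - source_skills),  # assumed gap definition
    }


match = best_fit(ITA["Web Technician"], UK)
```

With this toy data the Italian “Web Technician” is matched to the UK “Web and Multimedia Developer” rather than to the occupation with the same label, mirroring the kind of result described on the slide.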

slide-31
SLIDE 31

OJV COLLECTED FROM UK, IT AND DE IN 2018

Compare Different Countries (Web Technician)

slide-32
SLIDE 32

Some other ongoing research activities…

  • H/S Skills as factor for Job Replacement: Is there a correlation between the request for hard/soft skills and the probability of a job being replaced by computerisation? [Colombo, Mezzanzanica & Mercorio. AI Meets Labor Market: Exploring the Link Between Automation and Skills. Information Economics and Policy, 2019.]
  • Explainable LMI: Improve the believability of the analyses provided by explaining the behaviour of black-box AI algorithms in LMI [Under review]
  • GraphLMI: Use data from Web vacancies to enrich ISCO/ESCO taxonomies with analytics (job/skills relevance) and mentions from the market [Under review]

slide-33
SLIDE 33

Some references to our research

(Take a picture of the QR code to get the paper. Notice: PDFs are for personal use only.)

1. Colombo, Mezzanzanica & Mercorio. AI Meets Labor Market: Exploring the Link Between Automation and Skills. Information Economics and Policy, 2019.
2. Mezzanzanica & Mercorio. Big Data enables Labour Market Intelligence. In Encyclopedia of Big Data Technologies. Springer, 2019.
3. Mezzanzanica et al. WoLMIS: a labor market intelligence system for classifying web job vacancies. Journal of Intelligent Information Systems, 2018.
4. Mezzanzanica, Mercorio et al. Using Machine Learning for Labour Market Intelligence. In ECML PKDD 2017, LNCS. Springer, 2017.
5. Mezzanzanica, Mercorio et al. A Language Modelling Approach for Discovering Novel Labour Market Occupations from the Web. In 2017 IEEE/WIC/ACM International Conference on Web Intelligence, 2017.

Thank you!!!