Funded by the EU H2020 ICT Programme under Grant Agreement 688139 http://summa-project.eu
Scalable Understanding of Multilingual Media
Steve Renals
University of Edinburgh
SUMMA in a nutshell
- Significantly improve media monitoring, by the automatic
- analysis of media streams across many languages
- aggregation and distillation of stream content
- construction of knowledge bases from reported facts
- supply of media data visualisations at scale
BBC Monitoring
- 300 journalists, each monitoring up to 4 TV channels and several online text sources
- 30 languages; the most important include Russian, Arabic and Farsi
Big Data
- 250 video channels
- 2.5Tb/day, 19Tb/week, 1Pb/year
- BBC Monitoring has access to
- 1,500 TV channels
- 1,350 radio sources
- But… ~700 free-to-air Arabic satellite channels, increasing at ~100/year
- Current monitoring processes are largely manual and cannot keep up with the scale of the task
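The round figures on this slide can be sanity-checked with a few lines of arithmetic. The sketch below assumes a constant daily rate and linear channel growth; neither is stated on the slide, and the function and constant names are illustrative. The unit is kept exactly as written on the slide ("Tb"). A flat 2.5 per day gives 17.5 per week and roughly 0.9 thousand per year, so the quoted 19/week and 1Pb/year look like rounded values from a slightly higher daily rate (an inference, not something the slide confirms).

```python
# Figures quoted on the slide; units are kept as written there ("Tb").
DAILY_VOLUME = 2.5        # per day, across all monitored video channels
VIDEO_CHANNELS = 250

# Assumption: the daily rate is constant over the week and the year.
weekly_volume = DAILY_VOLUME * 7
yearly_volume = DAILY_VOLUME * 365
per_channel_daily = DAILY_VOLUME / VIDEO_CHANNELS


def projected_channels(current, growth_per_year, years):
    """Linear projection of the channel count (assumed growth model)."""
    return current + growth_per_year * years


# ~700 free-to-air Arabic satellite channels, growing at ~100 per year.
arabic_in_three_years = projected_channels(700, 100, 3)

print(f"weekly volume:       {weekly_volume}")
print(f"yearly volume:       {yearly_volume}")
print(f"per channel per day: {per_channel_daily}")
print(f"Arabic channels in 3 years: {arabic_in_three_years}")
```

Under these assumptions the number of Arabic satellite channels alone would pass 1,000 within three years, which illustrates why a manual process does not scale.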
Use cases
1. External Media Monitoring
- identify emerging trends
- track people in the news
- monitor the evolution of storylines
2. Internal Media Monitoring
- manage multilingual content creation
- reuse content efficiently across languages
3. Data Journalism
- use the SUMMA platform for data-driven journalism
SUMMA Prototypes
- Channel ID & native language
- Semantic tag word cloud: size indicates current frequency across region/group
- Segment, unique timestamp
- Player (SD? HD?)
- Player controller: tag instances marked
- Tools: screen grab, snip video, save, attach
- Translated transcript
- “Now playing” text highlighted
- Tags shown underlined
- Add new tag: click pencil to ‘underline’, and enter text
- Segment machine analysis confidence (possibly better represented graphically?)
UI Concept 1
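In the UI concept, the size of each tag in the word cloud indicates its current frequency. One simple way to get that behaviour is to scale font size linearly between a minimum and a maximum. This is a minimal sketch, not the SUMMA implementation: the function name, the point-size range and the sample tags are all made up for illustration.

```python
from collections import Counter


def tag_font_sizes(tags, min_pt=10, max_pt=36):
    """Map each tag to a font size proportional to its frequency.

    Hypothetical helper: linear scaling between min_pt and max_pt is an
    assumption, not something specified in the UI concept.
    """
    counts = Counter(tags)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    spread = highest - lowest
    sizes = {}
    for tag, count in counts.items():
        # If every tag is equally frequent, show them all at the largest size.
        fraction = (count - lowest) / spread if spread else 1.0
        sizes[tag] = round(min_pt + fraction * (max_pt - min_pt))
    return sizes


# Invented sample tags, standing in for tags seen in recent segments.
recent_tags = ["election", "election", "election", "flood", "summit", "summit"]
print(tag_font_sizes(recent_tags))
```

Restricting `tags` to one region or group before calling the function would give the per-region view that the concept describes.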
Platform & Technologies
- Speech recognition
- Machine translation
- Segmentation & clustering
- Ingest audio, video, text
- Identify entities & relations
- Summarisation & distillation
- Sentiment detection
- Visualisation & prototypes
Multilingual technologies
SUMMA system v0.1