

SLIDE 1

Geospatial data on enterprises – challenges to geolocate enterprises and their local units for statistical purposes

Zrinka Pavlović

Head of Statistical Business Register, Classifications, Sampling, Statistical Methods and Analyses Department

INSPIRE Conference 2016, Barcelona

SLIDE 2

Enterprises vs. people

  • Lifecycle
      – Birth, existence, death
      – Demography / business demography statistics
      – Enterprises can be transformed through the years
          • "Same enterprise or not?"
  • Location
      – Residence / headquarters address
      – Enterprises can operate in many locations (local units)
  • Both are in the focus of statistics

SLIDE 3

Enterprises vs. territory

  • Allocation of enterprises within the country
      – Cities / parts of cities
      – Territorial / climate / relief characteristics
      – Other
  • Is "doing business" easier in some parts of the country?
  • Are more enterprises born, do they survive longer, grow faster or more steadily, or die faster in some parts of the country?
  • Are enterprises more successful in some parts of the country?
  • Which activities are present or missing in some areas?
  • Why?
  • Should the government put more effort and assets into improving conditions for "doing business" in some areas to stop depopulation?

SLIDE 4

Statistical observation of enterprises

  • Statistics can give a lot of information about the business population related to location (number of units, demography, density, activities, employment, efficiency, etc.)
  • Precondition: a business register that is
      – Accurate
      – Comprehensive
      – Of good coverage
      – Relevant
      – Of good quality

SLIDE 5

Legal background

  • Regulation (EC) No 177/2008 requires that the Business Register contain a geographical location code for local units

    2.11 Geographical location code
    Purpose: The geographical location code complements the address and postal codes (2.2) and can be used to derive classifications relating to the geographical location of units at the most detailed level. Other national classifications such as administrative regions, travel-to-work areas, health or education regions etc. can also be derived from it.

SLIDE 6

Merging statistics and geospatial information

Geolocation of enterprises

Address data consist of:

  • Settlement (city/village), code
  • Street name
  • House number
  • Appendix to house number

Only the settlement code is unique and official and is used by the majority of administrative sources. Other parts of the address are in free-text form.

SLIDE 7

Sources of address data

  • The Statistical Business Register is compiled from several sources which provide address data:
      – Administrative Business Register
      – Central Register of Craft Businesses
      – Tax Administration
      – Commercial Court Register

SLIDE 8

Sources of spatial data

  • Central Register of Territorial Units: State Geodetic Administration (SGA) in the Republic of Croatia
  • The Statistical Spatial Register in the Croatian Bureau of Statistics is updated from the SGA register
  • Not all administrative registers and administrative records are directly connected to or updated with SGA register data

SLIDE 9

Project – merging statistics and geospatial information

  • Goal: geolocate enterprises and their local units by assigning them a geocode
  • Activities:
      – Establish a system that will enable every new address entry in the SBR to get a unique official street code, street name and geocode
      – Develop an application that will match new address entries with official addresses, automatically or through clerk intervention
      – Code existing addresses in the Statistical Business Register
      – Publish a geographical presentation of selected SBR data on the CBS website

SLIDE 10

Starting point

  • The SBR contains addresses for enterprises and their local units that had to be coded:
      – More than 500,000 enterprises (regardless of activity status)
      – More than 600,000 local units (98,000 with an address different from the legal unit address)
      – More than 1,100,000 address records
      – 109,215 different forms of street names within certain settlements
      – 68,863 different forms of street names regardless of settlement
  • In the Statistical Spatial Register:
      – 51,798 streets in 6,759 settlements

SLIDE 11

Existing situation

SLIDE 12

Existing situation

  • Street names in the Statistical Business Register are stored as they appear in the administrative sources, not as the official street name in the Register of Territorial Units. E.g.:

    Street code   Street name             Sett. code   Sett. name
    0721501371    ULICA KNEZA BRANIMIRA   072150       Zagreb

  • Many variations of the same street name exist

SLIDE 13

Conditions:

  • In order to assign a code to these forms of the name of the same street in the same town, all these forms should be stored in the Thesaurus.
  • Matching of names must be 100%, because one letter can make the difference.

    ULICA MARIJA BABIĆA    Ulica M. Babića
    ULICA NENADA BABIĆA    Ulica N. Babića
    ULICA MATIJE BAKIĆA    Ulica M. Bakića
    ULICA VOJINA BAKIĆA    Ulica V. Bakića
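
The exact-match condition can be illustrated with a minimal sketch. The thesaurus layout (a dictionary keyed by settlement code and name variant), the settlement code and the street codes are all hypothetical and invented for illustration; the presentation does not describe how the thesaurus is actually stored.

```python
# Hypothetical thesaurus: (settlement code, name variant) -> street code.
# Settlement code "000001" and both street codes are made up.
THESAURUS = {
    ("000001", "ULICA MARIJA BABIĆA"): "0000010001",
    ("000001", "ULICA M. BABIĆA"): "0000010001",
    ("000001", "ULICA MATIJE BAKIĆA"): "0000010002",
    ("000001", "ULICA M. BAKIĆA"): "0000010002",
}


def lookup_street_code(settlement_code, street_name):
    """Return a street code only when the variant matches exactly, else None."""
    key = (settlement_code, street_name.upper())
    return THESAURUS.get(key)


# One letter separates two different streets, so no fuzzy matching is used.
print(lookup_street_code("000001", "Ulica M. Babića"))
print(lookup_street_code("000001", "Ulica M. Bakića"))
print(lookup_street_code("000001", "Ulica M. Bagića"))
```

A variant that is not in the thesaurus returns `None` rather than the nearest name, which is why every accepted form has to be stored explicitly.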

SLIDE 14

Components: SBR, module for coding of streets, thesaurus of street names, manual connecting of entries with the official street name.

  1) Street name without code + settlement code
  2a) The street is identified and the code is selected; the street code is saved in the SBR
  2b) The street is not identified automatically: manual connecting is needed
  3a) The entry is found in the list of units in the selected settlement: the thesaurus is updated
  3b) The SBR is updated with the street code
  4) The entry is not found in the list of units in the selected settlement: investigation is needed (possible mis-linking of settlement and street); the settlement code is changed in the SBR
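
The flow on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not the module developed in the project: the function name, the dictionary layouts and the `ask_clerk` callback are hypothetical, and the name variant "KNEZA BRANIMIRA" is invented for the example.

```python
def code_street(entry, thesaurus, official_streets, ask_clerk):
    """Try to assign a street code to one SBR entry (hypothetical sketch).

    entry: dict with 'settlement_code' and 'street_name', no street code yet
    thesaurus: dict mapping (settlement_code, street_name) -> street code
    official_streets: dict mapping settlement_code -> {official name: code}
    ask_clerk: callable(entry, candidates) -> chosen official name or None
    """
    key = (entry["settlement_code"], entry["street_name"])

    # 2a) The street is identified automatically through the thesaurus.
    if key in thesaurus:
        entry["street_code"] = thesaurus[key]
        return "coded automatically"

    # 2b) Manual connecting against the streets of the selected settlement.
    candidates = official_streets.get(entry["settlement_code"], {})
    chosen = ask_clerk(entry, candidates)
    if chosen in candidates:
        thesaurus[key] = candidates[chosen]        # 3a) update the thesaurus
        entry["street_code"] = candidates[chosen]  # 3b) update the SBR entry
        return "coded manually"

    # 4) Not found in the settlement: settlement and street may be mis-linked.
    return "investigation needed"


thesaurus = {}
official = {"072150": {"ULICA KNEZA BRANIMIRA": "0721501371"}}
first = {"settlement_code": "072150", "street_name": "KNEZA BRANIMIRA"}
print(code_street(first, thesaurus, official,
                  lambda e, c: "ULICA KNEZA BRANIMIRA"))
```

Because a manual decision is written back to the thesaurus, the same variant is coded automatically the next time it appears.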

SLIDE 15

Challenges faced during project:

  • Complex spatial situation with many incorrect address entries:
      – Streets that do not belong to the settlement as registered
      – Streets that do not exist in the Stat. Spatial Register
          • Address contains an invalid / former street name
          • Address contains a street name that never existed
      – House number is not registered in the Stat. Spatial Register
  • Various formats of street names and house numbers with appendices
      – Street name contains house numbers and appendices
      – House number is mixed with appendices
      – Appendices are separated with different characters
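
As an illustration of the house-number problem, a small sketch that separates a house number from its appendix. The set of separator characters and the rule that the leading digits are the house number are assumptions made for this example; the presentation does not list the formats actually encountered.

```python
import re

# Leading digits are taken as the house number; whatever follows, after
# optional separator characters, is taken as the appendix (assumed rule).
HOUSE_NUMBER = re.compile(r"^\s*(\d+)\s*[-/.,\s]*\s*([A-Za-z0-9]*)\s*$")


def split_house_number(raw):
    """Return (house number, appendix) or None when no number is found."""
    match = HOUSE_NUMBER.match(raw)
    if match is None:
        return None
    number, appendix = match.groups()
    return int(number), appendix.upper()


for raw in ["12a", "12/A", "12 - a", "12", "bb"]:
    print(repr(raw), split_house_number(raw))
```

Entries for which the function returns `None` would be left for clerical review rather than guessed.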

SLIDE 16

Challenges faced during project (cont.)

  • House numbers and appendices in different formats; house numbers that do not exist in the Register of Territorial Units
  • No unique and comprehensive database of historical street names
  • The most difficult cases: units that are no longer active
  • Too much clerical work to investigate all problematic street names, hence the focus on active units

SLIDE 17

Way forward

  • Normalising of data from both databases was needed (removing special characters and double spaces, normalising usual abbreviations and some very common titles in street names)
  • Automatic matching at the beginning of the project: a little above 40% for unique street names, with little possibility to increase the share of automatic matching, since only a 100% match of cleaned data counts (one character might make the difference)
  • Clerical matching of the rest of the non-coded streets
  • Thesaurus filled with normalised data and pairs of street names (the Stat. Business Register name and the official street name from the Stat. Spatial Register) and subsequently supplemented with new manual entries
  • Many dead units with non-existing addresses: disregarded
  • Searching for files with changed street names in several institutions (in larger cities)
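
The normalising step could look like the sketch below, applied to both registers before exact matching. The abbreviation table is hypothetical (the real list used in the project is not given), and upper-casing is an assumption of this example.

```python
import re

# Hypothetical abbreviation table; the list used in the project is not given.
ABBREVIATIONS = {"UL": "ULICA"}


def normalise(name):
    """Upper-case, drop special characters, collapse spaces, expand abbreviations."""
    name = name.upper()
    # Anything that is not a word character or whitespace becomes a space.
    name = re.sub(r"[^\w\s]", " ", name)
    # split() without arguments also collapses double spaces.
    tokens = [ABBREVIATIONS.get(token, token) for token in name.split()]
    return " ".join(tokens)


print(normalise("Ul.  kneza   Branimira"))
print(normalise("Ulica M. Babića") == normalise("Ulica M. Bakića"))
```

Normalisation only removes differences in notation. Two names that differ by a real letter still differ afterwards, so the 100% match rule is unaffected.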

SLIDE 18

Results:

  • IT application developed for coding streets with a thesaurus of streets
      – Stand-alone application consisting of three modules:
          • Street database from the Statistical Spatial Register
          • Module for matching street names that enables:
              – Transmission of non-coded streets from the source database
              – Automatic coding of streets
              – Manual matching of street names
          • Thesaurus of street names
  • 98% of addresses of active enterprises coded with street codes

SLIDE 19

Results (cont.):

  • Automatic procedure developed for assigning geocodes from the Statistical Spatial Register database, based on street code and house number
      – At the moment of entry into the SBR database
      – Batch updating
      – Manual updating
  • Statistical Business Register upgraded with additional attributes
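
Assigning a geocode from street code and house number can be pictured as a keyed lookup. In this sketch the register extract, the coordinate values and the representation of a geocode as a coordinate pair are assumptions for illustration only; the presentation does not specify what the geocode contains.

```python
# Hypothetical extract of the spatial register:
# (street code, house number, appendix) -> geocode, here an invented (x, y) pair.
SPATIAL_REGISTER = {
    ("0721501371", 1, ""): (100.0, 200.0),
    ("0721501371", 2, "A"): (110.0, 205.0),
}


def assign_geocode(unit, register=SPATIAL_REGISTER):
    """Set unit['geocode'] from the register; return True when a match exists."""
    key = (unit["street_code"], unit["house_number"], unit.get("appendix", ""))
    unit["geocode"] = register.get(key)
    return unit["geocode"] is not None


def batch_update(units, register=SPATIAL_REGISTER):
    """Geocode all units and return those left for manual updating."""
    return [unit for unit in units if not assign_geocode(unit, register)]


units = [
    {"street_code": "0721501371", "house_number": 1},
    {"street_code": "0721501371", "house_number": 99},
]
print(batch_update(units))
```

The same lookup would serve a single new entry and a batch run; units without a match are returned so that they can be handled manually.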

SLIDE 20

Geographical presentation of Business Register Data

  • Business demography statistics are convenient for geographical presentation
  • Available tools for presentation:
      – ArcMap 10
      – Geostat Portal of CBS

SLIDE 21

Lessons learned

  • Addresses of enterprises are problematic in many cases
      – The main types of mistakes were identified
      – The new system enables detecting mistakes and gives the opportunity to correct them
  • Accuracy of spatial data varies between counties
  • Spatial registers need to be improved in the following years; historical data are needed
  • Administrative sources should be contacted so that they put much more focus on the accuracy of their address data (link of settlement and street) and on connection to the official Register of Territorial Units

SLIDE 22

Potential reuse

  • The module for coding streets, with the thesaurus and the functionality for manual matching and supplementing the content of the thesaurus, was developed as a separate application
  • Envisaged use by other users in the CBS, e.g. Farm Register, Population Register, Population Census
  • Possible availability for other institutions (administrative sources of the SBR): incoming data of better quality?

SLIDE 23

Thank you for your attention!

Zrinka Pavlović

Head of Statistical Business Register, Classifications, Sampling, Statistical Methods and Analyses Department
Tel: +385 (0)1 4893 514
E-mail: pavlovicz@dzs.hr