Information Made Accountable: The Data Projection Model



SLIDE 1

Michel Biezunski, Infoloom Semantic Conference, San Francisco, 6/23/2010

Information Made Accountable

The Data Projection Model

Michel Biezunski
Infoloom
mb@infoloom.com
http://www.infoloom.com

SLIDE 2

Michel Biezunski

  • IT Consultant/Innovator, dba Infoloom, based in New York.
  • Created the Topic Maps paradigm.
  • Initiator of the ISO/IEC 13250 standard.
  • Major current project: TaxMap, an electronic delivery tool for IRS publications and forms.

Background: History/Philosophy of Science

SLIDE 3

Rationale

  • The problem

– Accountability works with financial information.
– Accountability doesn't work as well with non-financial information.

  • The fix

– Use accounting-like approaches for information management.

SLIDE 4

Outline

  • Theoretical Introduction: Why bother?
  • The Data Projection Model: Double Entry Bookkeeping for Information.
  • Application Examples.

SLIDE 5

Luca Pacioli

The father of accounting... and of other things

Luca Pacioli, painting attributed to Jacopo de' Barbari, 1495. http://www.art-prints-on-demand.com/kunst/_1123580422666163/alg55519.jpg

SLIDE 6

Luca Pacioli

1445-1514 (or 1517)

  • Worked on Perspective.
  • “Invented” Double Entry Bookkeeping.
  • Wrote on Accounting Ethics and Cost Accounting.
  • Elementary Algebra.
  • Taught mathematics to Leonardo da Vinci.
  • Wrote De Divina Proportione.

SLIDE 7

RDF, the foundation of the Semantic Web

[Diagram: Subject (Resource) – Predicate – Object (Resource)]

  • Based on “triples”.
  • Uses URIs to name the relationship between things (“Predicate”) as well as the 2 ends of the link.
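A minimal sketch of one such triple, written with plain Python tuples rather than any RDF library. The `http://example.org/` URIs are placeholders invented for illustration; only the `rdf:type` URI is a real, standard identifier.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: both ends and the link are named by URIs."""
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace, for illustration only
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# "New York is a city", with every part of the statement named by a URI.
statement = Triple(EX + "NewYork", RDF_TYPE, EX + "City")
print(statement)
```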

SLIDE 8

Subject - Object

In the Quattrocento perspective system, the ideal point of localization is “the one which places as opposite, but parallel, the subject and the object.”

This is called a “one point perspective” or “central perspective”. It is governed by a single vanishing point.

Several conceptions of perspective:

  • Science of Vision
  • Technique of Representation
  • Technique of Measurement
  • Architecture

SLIDE 9

Perspective as a Science

It is among our senses, the wise men conclude, that the sight is the noblest one. That is why vulgarly said, with reason, that the eye is the first door from which the intellect understands and likes.
(L. Pacioli, De Divina Proportione, f.4r)

The eye, which is said is the window of the soul...
(Leonardo da Vinci, Paragone)

SLIDE 10

Flattening the world

Perspectives are defined according to projections. Perspectives express the ways 3-dimensional space is rendered into 2-dimensional space, i.e. projections.

Pacioli's book De Divina Proportione, illustrated by Leonardo da Vinci.

SLIDE 11

Real World Information:

  • Is multidimensional.
  • Can be flattened to be processed.
  • Binary relations correspond to 2D space.
  • Translating a world of n-ary relations into a world of binary relations is a kind of projection.
  • The result of projecting is a graph.
  • Perspective is what accompanies projection from n-ary relations to binary relations.
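The projection from an n-ary relation to binary relations can be sketched as follows. This is one possible reading, not the author's implementation: the relation itself becomes a node, and each role becomes a labeled edge from that node. The `purchase_1` identifier and the role names are hypothetical.

```python
def project(relation_id, roles):
    """Project one n-ary relation onto binary (node, role, value) edges."""
    return [(relation_id, role, value) for role, value in roles.items()]


# A 4-ary relation: who bought what, for how much, and when (illustrative data).
purchase = {
    "buyer": "Alice",
    "item": "plane ticket",
    "price": "450.00",
    "date": "2009-10-13",
}

# The result is a small graph: one hub node with four labeled edges.
edges = project("purchase_1", purchase)
for edge in edges:
    print(edge)
```

Choosing which element becomes the hub node is the "perspective" that accompanies the projection: a different hub yields a different graph over the same facts.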

SLIDE 12

The tree of proportions and proportionality, by Luca Pacioli.

Proportions as a Semantic Network

SLIDE 13

Entity-Relationship Model

Can always be decomposed into binary relations.

A simple entity-relationship model. http://en.wikipedia.org/wiki/Entity-relationship_model

SLIDE 14

The limitations of RDF

Or rather, the way it is used

  • The problem

– Subject – Predicate – Object: a one-point perspective on things.
– Not contextual: who said so? (Statements are not automatically reified.)
– Operates in a closed world. First-order logic works if everything is known.

  • The fix

– Nodes derive meaning from connectors (which provide context).
– Enable multiple perspectives.
– Every node is an “account”.
– Logic is not built in; multiple logics can be superimposed.
– Still RDF, but used differently.

SLIDE 15

Introducing the Data Projection Model

  • The Data Projection Model enables information systems to become auditable.
  • It aims at facilitating system maintenance and knowledge management.
  • The Data Projection Model can be used to integrate information assembled from a variety of sources and to express multiple perspectives on the same information set.

SLIDE 16

What is Data Projection for?

  • Tracking the background of any information item for:

– Search engine efficiency: Why is a given item at the top of the hits?
– Identity theft investigation: Where is the leak?
– Privacy: Who knows what?

  • Managing funding with strings attached: How is this grant being spent? Where does the money go? Accounting++
  • Integrating heterogeneous sources.
  • Creating diverse views for targeted audiences.

SLIDE 17

Data Projection. How it works

  • Views result from linking data.
  • Semantics is in the views.
  • Multiple views are possible:

– Filtering out unwanted information.
– Focusing on details (microscopic views).
– Anything in between.

  • Views can be created after information is produced.
  • Different people can have different perspectives on the same information set.
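As a rough illustration, a view can be modeled as a filter applied to an existing set of links after the fact. The `view` helper and the two predicates are assumptions made for this sketch; the sample links reuse examples from elsewhere in this deck.

```python
links = [
    ("New York", "is a", "city"),
    ("city", "added to the system by", "MB"),
    ("Washington", "is a name for", "_city_of_Washington"),
]


def view(data, keep):
    """Return the subset of links selected by the predicate `keep`."""
    return [link for link in data if keep(link)]


# Filtering out unwanted information: hide provenance links.
public_view = view(links, lambda l: l[1] != "added to the system by")

# Focusing on details: everything that touches the node "city".
city_view = view(links, lambda l: "city" in (l[0], l[2]))

print(public_view)
print(city_view)
```

Both views are computed from the same unchanged data, so different audiences can hold different perspectives without duplicating the information set.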

SLIDE 18

Information and Accounting

Accounting

– Transactions occur between accounts.
– Account statements: all transactions from or to an account.

Information

– Any information item is always related to at least one other item.
– Audit trail: all links from or to an information item.
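The analogy between an account statement and an audit trail can be sketched in a few lines. The function name `audit_trail` and the tuple layout are assumptions for illustration; the sample links reuse examples from elsewhere in this deck.

```python
links = [
    ("New York", "is a", "city"),
    ("city", "added to the system by", "MB"),
]


def audit_trail(item, data):
    """All links from or to `item`, like a statement for one account."""
    return [link for link in data if item in (link[0], link[2])]


# "city" appears once as a target and once as a source.
trail = audit_trail("city", links)
print(trail)

# Every item that occurs in the data has a non-empty trail by construction.
items = {end for x, _, y in links for end in (x, y)}
assert all(audit_trail(i, links) for i in items)
```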

SLIDE 19

Single Entry

– List of expenses per category.
– List of income sources per category.
– Some money may be unaccounted for (although not desirable).

Double Entry

– Organized by accounts.
– Each transaction affects two accounts in a way that keeps the overall system of accounts in balance.
– No money amount unaccounted for.

Bookkeeping made accountable
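A minimal sketch of the double-entry invariant, assuming signed amounts where the entries of each transaction must sum to zero. The `post` function is hypothetical; the account names and the $450.00 amount are taken from the plane-ticket example used later in this deck.

```python
from decimal import Decimal


def post(ledger, entries):
    """Apply one transaction; `entries` maps account name -> signed amount."""
    if sum(entries.values()) != 0:
        raise ValueError("unbalanced transaction")
    for account, amount in entries.items():
        ledger[account] = ledger.get(account, Decimal("0")) + amount


ledger = {}
post(ledger, {"Air": Decimal("450.00"), "Checking Account": Decimal("-450.00")})

# The system of accounts stays in balance: nothing is unaccounted for.
total = sum(ledger.values())
print(ledger, total)
```

A single-entry list has no such check: an expense can be recorded without saying where the money came from.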

SLIDE 20

Double Entry Bookkeeping

SLIDE 21

Double Entry Bookkeeping & DPM

[Diagram: Plane Ticket to SFO; PTSFO 091013; $450.00; 2009-10-13; Checking Account; Air; Travel; Expenses]

SLIDE 22

Double Entry Bookkeeping & DPM

[Diagram: Plane Ticket to SFO; PTSFO 091013; $450.00; 2009-10-13; Checking Account; Air; Travel; Expenses. Link labels: Description; Date; +$450.00; -$450.00; Subaccount]

SLIDE 23

DPM Notation

[Diagram: Plane Ticket to SFO; PTSFO 091013; $450.00; 2009-10-13; Checking Account; Air; Travel; Expenses. Link labels: Description; Date; +$450.00; -$450.00; Subaccount]

< PTSFO_09-10-13_450.00 | date | 2009-10-13 >
< PTSFO_09-10-13_450.00 | description | Plane ticket to SFO >
< PTSFO_09-10-13_450.00 | 450.00 | Air >
< PTSFO_09-10-13_450.00 | -450.00 | Checking Account >
< Air | subaccount of | Travel >
< Travel | subaccount of | Expenses >
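The notation above is regular enough to parse mechanically. The sketch below is an assumption about how one might do so, not a published DPM tool: it splits each line on `|`, checks that the two amount operators cancel out, and walks the "subaccount of" chain. It would not handle operands that themselves contain a `|` character.

```python
from decimal import Decimal, InvalidOperation


def parse(line):
    """Parse '< x | o | y >' into an (x, o, y) tuple of stripped strings."""
    text = line.strip()
    if not (text.startswith("<") and text.endswith(">")):
        raise ValueError(f"not a perspector: {line!r}")
    parts = [part.strip() for part in text[1:-1].split("|")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields: {line!r}")
    return tuple(parts)


def amount(operator):
    """Return the operator as a Decimal, or None if it is not numeric."""
    try:
        return Decimal(operator)
    except InvalidOperation:
        return None


def ancestors(account, data):
    """Follow 'subaccount of' links upward, stopping on a missing or repeated parent."""
    chain, current = [], account
    while True:
        parents = [y for x, o, y in data if x == current and o == "subaccount of"]
        if not parents or parents[0] in chain:
            return chain
        current = parents[0]
        chain.append(current)


lines = [
    "< PTSFO_09-10-13_450.00 | date | 2009-10-13 >",
    "< PTSFO_09-10-13_450.00 | description | Plane ticket to SFO>",
    "< PTSFO_09-10-13_450.00 | 450.00 | Air >",
    "< PTSFO_09-10-13_450.00 | -450.00 | Checking Account >",
    "< Air | subaccount of | Travel >",
    "< Travel | subaccount of | Expenses >",
]
perspectors = [parse(line) for line in lines]
amounts = [amount(o) for _, o, _ in perspectors]
balance = sum(a for a in amounts if a is not None)
print(balance, ancestors("Air", perspectors))
```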

SLIDE 24

A “perspector” is notated: < x | o | y >

– x and y are operands (order matters).
– o is an operator.

A perspector can represent a semantic relation, for example:

< New York | is a | city >
(This is an instance/class relationship.)

or

< city | added to the system by | MB >
(This is usually considered metadata.)

Perspector

SLIDE 25

From Multiple to Multiple Via One

SGML / XML

– One source,
– Multiple outputs

(Ex Uno Plures)

DPM

– Diverse inputs,
– One common representation,
– Multiple outputs

(E Pluribus Plures Via Unum)

SLIDE 26

DPM and RDF

RDF

– RDF is based on triples that express statements: subject – predicate – object.
– RDF connects URIs.
– RDF statements are not automatically reified.

DPM

– DPM is based on triples that express operations: x operand – operator – y operand.
– DPM is not limited to URIs.
– DPM perspectors are automatically reified.
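One way to picture automatic reification: every perspector receives an identifier when it is stored, so it can immediately serve as an operand of another perspector ("who said so?"). The `Store` class and the `_p1`, `_p2` identifier scheme are invented for this sketch and are not part of any DPM specification.

```python
import itertools


class Store:
    """Keeps perspectors by identifier so each one can itself be referenced."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.perspectors = {}

    def add(self, x, operator, y):
        pid = f"_p{next(self._counter)}"  # hypothetical identifier scheme
        self.perspectors[pid] = (x, operator, y)
        return pid


store = Store()
p1 = store.add("New York", "is a", "city")
# The statement p1 is now an operand: this records who asserted it.
p2 = store.add(p1, "added to the system by", "MB")
print(store.perspectors)
```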

SLIDE 27

DPM and Topic Maps

Topic Maps is a navigation system using topics as nodes for representing subjects. Names, Types, and Occurrences are topics connected through specific relationships.

DPM is a navigation system based on nodes. All nodes are related to other nodes.

Topic Maps can be considered an application of DPM.

SLIDE 28

A Name does not identify a Subject:

  • Variant names may be used to designate the same subject.

– Synonyms
– Typographical variations

  • One name may identify several subjects.

Example: Name versus Subject

SLIDE 29

Washington; Washington, DC; Wash D.C.; George Washington; Denzel Washington; Washington State; Wa; General Washington

Names

SLIDE 30

Names

< Washington | is an alternate name for | Wash. D.C. >
< Washington | is an alternate name for | Washington, DC >
< Washington | is an alternate name for | General Washington >
< Washington | is an alternate name for | George Washington >
< Washington | is an alternate name for | Wa >
< Washington | is an alternate name for | Washington State >
< Washington | is an alternate name for | Denzel Washington >

SLIDE 31

[Diagram: Washington; Washington, DC; Wash D.C.; George Washington; Denzel Washington; Washington State; Wa; General Washington]

Emerging Subjects

SLIDE 32

Strings Become Subjects

[Diagram: Washington; Washington, DC; Wash D.C.; George Washington; Denzel Washington; Washington State; Wa; General Washington]

SLIDE 33

Generalization

[Diagram: Washington; Washington, DC; Wash D.C.; George Washington; Denzel Washington; Washington State; Wa; General Washington. Every link is labeled “is a name for”.]

SLIDE 34

Names and Subjects

< Washington | is a name for | _city_of_Washington >
< Washington DC | is a name for | _city_of_Washington >
< Wash. D.C. | is a name for | _city_of_Washington >
< Washington | is a name for | _General_G_Washington >
< General Washington | is a name for | _General_G_Washington >
< George Washington | is a name for | _General_G_Washington >
< Washington | is a name for | _Washington_State >
< Wa | is a name for | _Washington_State >
< Washington State | is a name for | _Washington_State >
< Washington | is a name for | _Denzel_Washington >
< Denzel Washington | is a name for | _Denzel_Washington >
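With names and subjects kept as separate nodes, both lookup directions fall out of the same list. The sketch below assumes the perspectors are held as plain tuples; the index names `subjects_of` and `names_of` are illustrative.

```python
from collections import defaultdict

perspectors = [
    ("Washington", "is a name for", "_city_of_Washington"),
    ("Washington DC", "is a name for", "_city_of_Washington"),
    ("Wash. D.C.", "is a name for", "_city_of_Washington"),
    ("Washington", "is a name for", "_General_G_Washington"),
    ("General Washington", "is a name for", "_General_G_Washington"),
    ("George Washington", "is a name for", "_General_G_Washington"),
    ("Washington", "is a name for", "_Washington_State"),
    ("Wa", "is a name for", "_Washington_State"),
    ("Washington State", "is a name for", "_Washington_State"),
    ("Washington", "is a name for", "_Denzel_Washington"),
    ("Denzel Washington", "is a name for", "_Denzel_Washington"),
]

subjects_of = defaultdict(set)  # name -> subjects it may designate
names_of = defaultdict(set)     # subject -> names that designate it
for name, operator, subject in perspectors:
    if operator == "is a name for":
        subjects_of[name].add(subject)
        names_of[subject].add(name)

# One name may identify several subjects...
print(sorted(subjects_of["Washington"]))
# ...and variant names may designate the same subject.
print(sorted(names_of["_Washington_State"]))
```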

SLIDE 35

Strings as Subjects

< Washington | is in character set | UTF-8 >
< Washington | is a name for | _city_of_Washington >
< Washington | is a name in the language | English >

SLIDE 36

[Diagram: Washington; General Washington; George Washington; Wa; Washington State; Denzel Washington; Washington, DC; Wash D.C. Link labels: abbreviates; indicates; is usually called; designates; is the last name of; is a code name for; stands for; is a name for; represents; also known as]

Integration

SLIDE 37

Diversity

< _city_of_Washington | is usually called | Washington >
< Washington DC | indicates | _city_of_Washington >
< Wash. D.C. | abbreviates | _city_of_Washington >
< Washington | is a name for | _General_G_Washington >
< _General_G_Washington | also_known_as | General Washington >
< George Washington | represents | _General_G_Washington >
< Washington | stands for | _Washington_State >
< Wa | is a code name for | _Washington_State >
< Washington State | is a name for | _Washington_State >
< Washington | is last name of | _Denzel_Washington >
< Denzel Washington | designates | _Denzel_Washington >

SLIDE 38

Perspective on Naming

< _city_of_Washington | is named | Washington >
< Washington DC | is a name for | _city_of_Washington >
< Wash. D.C. | is a name for | _city_of_Washington >
< Washington | is a name for | _General_G_Washington >
< _General_G_Washington | is named | General Washington >
< George Washington | is a name for | _General_G_Washington >
< Washington | is a name for | _Washington_State >
< Wa | is a name for | _Washington_State >
< Washington State | is a name for | _Washington_State >
< Washington | is a name for | _Denzel_Washington >
< Denzel Washington | is a name for | _Denzel_Washington >
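The step from the diverse operators to this single perspective on naming can be sketched as a mapping applied over the original perspectors, leaving the source data untouched. The mapping table is reconstructed from the two listings in this deck; treating "is named" as the inverse of "is a name for" is an assumption of this sketch.

```python
# Operator -> operator under the "naming" perspective (operand order is kept).
NAMING = {
    "is usually called": "is named",
    "also_known_as": "is named",
    "indicates": "is a name for",
    "abbreviates": "is a name for",
    "represents": "is a name for",
    "stands for": "is a name for",
    "is a code name for": "is a name for",
    "is last name of": "is a name for",
    "designates": "is a name for",
}


def apply_perspective(data, mapping):
    """Rewrite operators through `mapping`; unknown operators pass through."""
    return [(x, mapping.get(o, o), y) for x, o, y in data]


def name_subject_pairs(data):
    """Yield (name, subject) pairs, assuming 'is named' inverts 'is a name for'."""
    for x, o, y in data:
        if o == "is a name for":
            yield (x, y)
        elif o == "is named":
            yield (y, x)


diverse = [
    ("_city_of_Washington", "is usually called", "Washington"),
    ("Washington DC", "indicates", "_city_of_Washington"),
    ("_General_G_Washington", "also_known_as", "General Washington"),
    ("Wa", "is a code name for", "_Washington_State"),
]
unified = apply_perspective(diverse, NAMING)
pairs = list(name_subject_pairs(unified))
print(unified)
print(pairs)
```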

SLIDE 39

Multidimensional Information

< New York | is a name for | _New_York_City >
< New York | is a name for | _New_York_State >
< New York | is a name for | _New_York_County >
< New York | is a name for | _Manhattan >
< New York | is a name for | _Wall_Street >
< New York | is an old name for | _Manhattan >
< Nueva York | is a name for | _New_York_City >
< ניו יורק | is a name for | _New_York_City >
< New York | is a name in the language | _English >
< Nueva York | is a name in the language | _Spanish >
< New York | is a name in the language | _French >
< English | is a name for | _English >
< English | is a name in the language | _English >
< Anglais | is a name for | _English >
< Anglais | is a name in the language | _French >
< Inglés | is a name for | _English >
< Inglés | is a name in the language | _Spanish >

etc.

SLIDE 40

Why was the DPM created?

  • TaxMap is a topic map-based application, published every other week, containing IRS publications, forms, and FAQs, accessible by topics.

– Topics are extracted from XML document structure.
– Relations between topics are created using rules and by applying a semantic layer authored by tax experts.

  • Rendition hides the creation process. Hence the question (among others): why are topics a and b related?

SLIDE 41

TaxMap is built by a combination of automatic and manual processes. Names are added, modified, sometimes deleted, or regarded as synonyms. It's hard to know where a topic name comes from.

Operations on Names

SLIDE 42

TaxMap audited

SLIDE 43

TaxMap Audited

Living Abroad

SLIDE 44

Where does “Living Abroad” come from?

SLIDE 45

Containment Rule Results

If one topic name is entirely contained in another, the two topics are automatically related.
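A rough sketch of such a containment rule, assuming case-insensitive substring matching. TaxMap's actual matching logic is not described here, and the topic names other than “Living Abroad” as well as the “is contained in” operator label are hypothetical.

```python
from itertools import permutations


def containment_links(names):
    """Relate every pair where one topic name is contained in another."""
    links = []
    for shorter, longer in permutations(names, 2):
        if shorter.lower() in longer.lower():
            # Record the rule's result as a link, so its origin stays auditable.
            links.append((shorter, "is contained in", longer))
    return links


topic_names = ["Living Abroad", "U.S. Citizens Living Abroad", "Air Travel"]
links = containment_links(topic_names)
print(links)
```

Plain substring matching can over-match (for example, a short name that happens to occur inside a longer unrelated word), so a real rule would probably compare whole words.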

SLIDE 46

Synonyms created by Tax Experts

SLIDE 47

Demos, other presentations available at: http://www.infoloom.com

Michel Biezunski Infoloom (718) 921-0901 mb@infoloom.com

More Information