SLIDE 16 To index structured documents
The formal representation of a document is defined by exploiting its logical structure (e.g. XML documents). Given a term t, for each subpart s_i of the document a distinct term weight is computed, expressing the importance of the term as a descriptor in that document subpart. The overall index term weight is computed by aggregating, in a user-driven way, the "partial" weights. The user specifies both her/his preference about the sections in which to privilege the search and the aggregation function.
[Figure: Personalized document indexing. The document sections (TITLE, AUTHORS, ABSTRACT, INTRODUCTION, ...) yield the partial weights F_s1(d,t), F_s2(d,t), F_s3(d,t), F_s4(d,t), which the aggregation function A combines into the overall weight F(d,t).]
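The scheme on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say how the partial weights are computed or which aggregation functions are available, so the relative-frequency weight in `partial_weight`, the two aggregators `weighted_mean` and `best_section`, and all names here are hypothetical choices, not the authors' definitions.

```python
from typing import Callable, Dict, List

# A document is modelled as a mapping from section name to section text.
Sections = Dict[str, str]
# An aggregator takes the partial weights and the user's section preferences.
Aggregator = Callable[[List[float], List[float]], float]


def partial_weight(section_text: str, term: str) -> float:
    """Assumed F_s(d, t): relative frequency of the term in one section."""
    tokens = section_text.lower().split()
    if not tokens:
        return 0.0
    return tokens.count(term.lower()) / len(tokens)


def weighted_mean(weights: List[float], prefs: List[float]) -> float:
    """One possible aggregation A: mean of the partial weights,
    weighted by the user's preference for each section."""
    total = sum(prefs)
    if total == 0:
        return 0.0
    return sum(w * p for w, p in zip(weights, prefs)) / total


def best_section(weights: List[float], prefs: List[float]) -> float:
    """Another possible A: the highest partial weight among the
    sections the user has not excluded (preference > 0)."""
    return max((w for w, p in zip(weights, prefs) if p > 0), default=0.0)


def index_weight(doc: Sections, term: str, prefs: Dict[str, float],
                 aggregate: Aggregator = weighted_mean) -> float:
    """Overall F(d, t) = A(F_s1(d, t), ..., F_sn(d, t))."""
    names = list(doc)
    partial = [partial_weight(doc[name], term) for name in names]
    # Sections the user did not mention get preference 0.
    importance = [prefs.get(name, 0.0) for name in names]
    return aggregate(partial, importance)


doc = {
    "title": "fuzzy indexing",
    "abstract": "fuzzy indexing of documents",
}
prefs = {"title": 1.0, "abstract": 1.0}

print(index_weight(doc, "fuzzy", prefs))                # weighted mean
print(index_weight(doc, "fuzzy", prefs, best_section))  # best section only
```

Passing the aggregator as a parameter mirrors the point of the slide: the same partial weights give a different overall index weight depending on the sections and the aggregation function the user selects.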