SLIDE 6 Notation
Documents $x_1, \dots, x_n \in X$ of dimensionality $d$
◮ Can be, e.g., bag-of-word indicators or tf-idf scores
$y_1, \dots, y_n \in \{1, \dots, c\}$ are topic labels in some taxonomy $T$
Indices
◮ $i, j \in \{1, \dots, n\}$ for documents
◮ $\alpha, \beta \in \{1, \dots, c\}$ for classes
The taxonomy gives rise to a cost matrix $C \in \mathbb{R}^{c \times c}$, where $C_{\alpha\beta} \ge 0$ is the cost of misclassifying class $\alpha$ as $\beta$, and $C_{\alpha\alpha} = 0$
We wish to represent
◮ each topic $\alpha$ as a prototype $p_\alpha \in F$
◮ each document $x_i$ as a low-dimensional vector $z_i \in F$
We assume C is given
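The slide takes C as given. As a minimal sketch of one way a tree-shaped taxonomy could induce such a matrix, the snippet below uses the path length between two class nodes as the misclassification cost. The choice of path length, the toy taxonomy, and the helper name `tree_cost_matrix` are all assumptions made for illustration, not taken from the slides.

```python
def tree_cost_matrix(parent, classes):
    """Build a c x c cost matrix from a tree taxonomy (illustrative assumption).

    parent  : dict mapping each node to its parent (the root maps to None)
    classes : list of the c nodes used as class labels
    Cost C[a][b] = number of edges on the tree path between class a and class b.
    """
    def path_to_root(node):
        # Walk upward from the node until the root is reached.
        path = [node]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

    c = len(classes)
    C = [[0] * c for _ in range(c)]
    for a in range(c):
        path_a = path_to_root(classes[a])
        steps_from_a = {node: k for k, node in enumerate(path_a)}
        for b in range(c):
            # The first node on b's upward path that also lies on a's path
            # is their lowest common ancestor.
            for steps_from_b, node in enumerate(path_to_root(classes[b])):
                if node in steps_from_a:
                    C[a][b] = steps_from_a[node] + steps_from_b
                    break
    return C


# Hypothetical toy taxonomy, for illustration only.
parent = {
    "root": None,
    "science": "root",
    "sports": "root",
    "physics": "science",
    "biology": "science",
    "tennis": "sports",
}
classes = ["physics", "biology", "tennis"]
C = tree_cost_matrix(parent, classes)
for row in C:
    print(row)
```

Under this assumed cost, confusing two sibling topics (physics and biology) is cheaper than confusing topics from different branches (physics and tennis), and the diagonal is zero, matching the requirements $C_{\alpha\beta} \ge 0$ and $C_{\alpha\alpha} = 0$.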
Large Margin Taxonomy Embedding with an Application to Document Categorization May 13, 2011 6 / 16