Faster Cover Trees
Mike Izbicki and Christian R. Shelton UC Riverside
Izbicki and Shelton (UC Riverside) Faster Cover Trees July 7, 2015 1 / 21
Outline
Why care about faster cover trees?
Making cover trees faster.
  Experimental setup
  Simpler definition reduces the number of nodes
  The nearest ancestor invariant
  Better cache performance
  Constructing and querying the tree in parallel
Methods for fast nearest neighbor queries:

method       provable speedup   arbitrary metric   high dimensions
quadtree     yes                no                 no
kd-tree      yes                no                 somewhat
hashing      yes                no                 yes
ball tree    no                 yes                somewhat
cover tree   yes                yes                yes
(Beygelzimer, Kakade, and Langford, 2006)
Other uses of cover trees
Any learning algorithm that depends on distances can be made faster using cover trees. Examples:
  k-nearest neighbor
  Support vector machines (Segata and Blanzieri, 2010)
  Dimensionality reduction (Lisitsyn et al., 2010)
  Reinforcement learning (Tziortziotis et al., 2014)
Experimental setup
Three data sources:
  MLPack benchmarks with Euclidean distance
  Protein dataset with the random walk graph distance
  Yahoo! 1.5 million Creative Commons images with the earth mover's distance
Benchmarking procedure:
  Construct a cover tree on the dataset
  For each data point in the dataset, find the nearest neighbor
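The later plots report cost as a number of distance comparisons, so the benchmarking procedure can be sketched with a metric wrapper that counts its own calls. The snippet below is only an illustration under assumptions: `CountingMetric` and `brute_force_all_nn` are hypothetical names, the brute-force search stands in for a cover tree query, and "nearest neighbor" is assumed to mean the nearest other point in the dataset.

```python
import math


class CountingMetric:
    """Wraps a distance function and counts how many times it is called."""

    def __init__(self, dist):
        self.dist = dist
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.dist(a, b)


def brute_force_all_nn(points, metric):
    """For each point, return the index of its nearest other point.

    This is the O(n^2) baseline, not a cover tree query.
    """
    result = []
    for i, p in enumerate(points):
        best_j, best_d = None, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue  # assumption: a point is not its own neighbor
            dist = metric(p, q)
            if dist < best_d:
                best_j, best_d = j, dist
        result.append(best_j)
    return result


# Toy data, not one of the datasets from the slides.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
metric = CountingMetric(math.dist)
neighbors = brute_force_all_nn(points, metric)

# Distance comparisons normalized by n^2, as on the query-time plot.
normalized = metric.calls / len(points) ** 2
```

Swapping the brute-force loop for a tree-based search while keeping the same counting wrapper would give directly comparable counts.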
The simplified cover tree
[Figure: example cover tree over the points 10, 8, 7, 9, 12 on levels 3, 2, 1]
The covering invariant. For every node p, define the function covdist(p) = 2^{level(p)}. For each child q of p,
  d(p, q) ≤ covdist(p)
The separating invariant. For every node p, define the function sepdist(p) = 2^{level(p)−1}. For all distinct children q1 and q2 of p,
  d(q1, q2) ≥ sepdist(p)
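As a rough illustration of the two invariants, the following sketch checks them recursively on a small tree. The `Node` class and `satisfies_invariants` are hypothetical names, the metric is absolute difference on the real line, and the parent/child structure of the example tree is an assumed reading of the figure, not something the slide states.

```python
class Node:
    """A cover tree node: one data point, its level, and its children."""

    def __init__(self, point, level, children=None):
        self.point = point
        self.level = level
        self.children = children or []


def covdist(node):
    return 2.0 ** node.level


def sepdist(node):
    return 2.0 ** (node.level - 1)


def satisfies_invariants(node, d):
    """Check the covering and separating invariants for a whole subtree."""
    # Covering: every child lies within covdist of its parent.
    for q in node.children:
        if d(node.point, q.point) > covdist(node):
            return False
    # Separating: distinct children are at least sepdist apart.
    for i, q1 in enumerate(node.children):
        for q2 in node.children[i + 1:]:
            if d(q1.point, q2.point) < sepdist(node):
                return False
    return all(satisfies_invariants(q, d) for q in node.children)


def d(a, b):
    return abs(a - b)


# Assumed structure: 10 at level 3; 8 and 12 at level 2; 7 and 9 under 8.
tree = Node(10, 3, [Node(8, 2, [Node(7, 1), Node(9, 1)]), Node(12, 2)])
tree_ok = satisfies_invariants(tree, d)

# Children 8 and 11 are only 3 apart, below sepdist = 4 at level 3.
bad = Node(10, 3, [Node(8, 2), Node(11, 2)])
bad_ok = satisfies_invariants(bad, d)
```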
Advantages of the simplified cover tree:
  Maintains all runtime guarantees of the original cover tree.
  Significantly easier to understand and implement. The original cover tree was described in terms of an infinitely large tree, only a subset of which actually gets implemented.
  Requires exactly n nodes instead of O(n) nodes. Fewer nodes means a smaller constant factor for all algorithms.
[Figure: fraction of nodes in the original cover tree required for the simplified cover tree (axis from 0.1 to 1), for the datasets yearpredict, twitter, tinyImages, mnist, corel, covtype, artificial40, and faces]
The nearest ancestor cover tree
[Figure: two cover trees over the points 7, 8, 9, 10, 11, 12, 13 on levels 3, 2, 1]
A nearest ancestor cover tree is a simplified cover tree where every point p satisfies the additional invariant that if q1 is an ancestor of p and q2 is a sibling of q1, then
  d(p, q1) ≤ d(p, q2)
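The invariant can be checked by walking the tree while remembering each ancestor together with that ancestor's siblings. This is a minimal sketch under assumptions: `Node` and `satisfies_nearest_ancestor` are hypothetical names, the metric is absolute difference, and the two example trees are an assumed reading of the figure.

```python
class Node:
    """A cover tree node: one data point, its level, and its children."""

    def __init__(self, point, level, children=None):
        self.point = point
        self.level = level
        self.children = children or []


def satisfies_nearest_ancestor(root, d):
    """True if every point is at least as close to each of its ancestors
    as it is to any sibling of that ancestor."""

    def visit(node, siblings, constraints):
        # constraints holds (q1, siblings of q1) for each strict ancestor q1.
        for q1, q1_siblings in constraints:
            for q2 in q1_siblings:
                if d(node.point, q1.point) > d(node.point, q2.point):
                    return False
        # Below this node, the node itself also counts as an ancestor.
        child_constraints = constraints + [(node, siblings)]
        for child in node.children:
            child_siblings = [c for c in node.children if c is not child]
            if not visit(child, child_siblings, child_constraints):
                return False
        return True

    return visit(root, [], [])


def d(a, b):
    return abs(a - b)


# Assumed "before" tree: 11 sits under 8 although it is closer to 12.
before = Node(10, 3, [
    Node(8, 2, [Node(7, 1), Node(11, 1)]),
    Node(12, 2, [Node(9, 1), Node(13, 1)]),
])
# Assumed "after" tree: every point sits under its nearest level-2 node.
after = Node(10, 3, [
    Node(8, 2, [Node(7, 1), Node(9, 1)]),
    Node(12, 2, [Node(11, 1), Node(13, 1)]),
])

before_ok = satisfies_nearest_ancestor(before, d)
after_ok = satisfies_nearest_ancestor(after, d)
```

Both example trees satisfy the covering and separating invariants; only the second also satisfies the nearest ancestor invariant.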
Insertions require rebalancing.
No runtime guarantees on the rebalance step.
In practice, queries are much faster and construction is only slightly slower.
Comparing cover trees on construction time
[Figure: number of distance comparisons in tree construction only (normalized by the original cover tree), on an axis from 1 to 5 with one value labeled 19.1, for the datasets yearpredict, twitter, tinyImages, mnist, corel, covtype, artificial40, and faces; series: original cover tree, simplified cover tree, nearest ancestor cover tree]
Comparing cover trees on construction and query time
[Figure: number of distance comparisons in tree construction and query (normalized by n^2), on an axis from 0.2 to 1.2, for the datasets yearpredict, twitter, tinyImages, mnist, corel, covtype, artificial40, and faces; series: original cover tree, simplified cover tree, nearest ancestor cover tree]
All of the cover trees scale similarly
This experiment uses the protein data and the random walk graph kernel.
[Figure: total distance comparisons (millions, 200 to 1600) versus number of data points (thousands, 50 to 250); series: original cover tree, simplified cover tree, nearest ancestor cover tree]
Cache oblivious cover tree
Fast, modern data structures need to account for cache accesses.
Image from: http://1024cores.net
Arrange nodes in memory according to a preorder traversal of the tree (van Emde Boas et al., 1966; Demaine, 2002)
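A preorder arrangement can be sketched by flattening the tree into parallel arrays, so that every subtree occupies one contiguous index range. The names `Node`, `preorder_layout`, and `subtree_end` are hypothetical, and the example tree is an assumed one. Python lists only illustrate the ordering; an actual cache-friendly implementation would store the nodes in a contiguous array of records.

```python
class Node:
    """A cover tree node: one data point, its level, and its children."""

    def __init__(self, point, level, children=None):
        self.point = point
        self.level = level
        self.children = children or []


def preorder_layout(root):
    """Flatten a tree into parallel arrays in preorder.

    The subtree rooted at index i occupies indices i .. subtree_end[i] - 1,
    so descending into a subtree touches one contiguous range.
    """
    points, levels, subtree_end = [], [], []

    def visit(node):
        index = len(points)
        points.append(node.point)
        levels.append(node.level)
        subtree_end.append(None)  # filled in once the subtree is laid out
        for child in node.children:
            visit(child)
        subtree_end[index] = len(points)

    visit(root)
    return points, levels, subtree_end


tree = Node(10, 3, [
    Node(8, 2, [Node(7, 1), Node(9, 1)]),
    Node(12, 2, [Node(11, 1), Node(13, 1)]),
])
points, levels, subtree_end = preorder_layout(tree)
```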
Image from: Wikipedia
The cache efficiency of three cover tree implementations
[Figure: cache miss rate (cache misses / cache accesses), on an axis from 0.2 to 1, for the datasets yearpredict, twitter, tinyImages, mnist, corel, covtype, artificial40, and faces; series: without van Emde Boas, with van Emde Boas]
Measured using Linux’s perf stat utility on an Amazon AWS instance
Merging cover trees
Merging cover trees gives us a parallel tree construction algorithm.
Sometimes, merging cover trees is easy:
[Figure: cover trees over the points 10, 8, 7, 9, 12, 11, 13 on levels 3, 2, 1]
No runtime bound on the merge operation, but it is fast in practice.
Sometimes, merging cover trees is hard:
[Figure: cover trees over the points 10, 8, 7, 9, 11.5, 11, 13 on levels 3, 2, 1]
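One way to see the difference between the two cases is to ask whether the root of the second tree can simply become a new child of the first tree's root without breaking the covering and separating invariants. The check below is only a sketch of that easy-case test, not the merge algorithm from the talk. `can_attach_directly` is a hypothetical name, the metric is absolute difference, and the example assumes the first tree has root 10 at level 3 with a single child 8, while the second tree's root (12 or 11.5) sits one level below.

```python
def covdist(level):
    return 2.0 ** level


def sepdist(level):
    return 2.0 ** (level - 1)


def can_attach_directly(root_point, root_level, child_points, other_root, d):
    """True if other_root can be added as a child of the root without
    violating the covering or separating invariant at that root.

    Assumes other_root belongs one level below root_level.
    """
    # Covering: the new child must lie within covdist of the root.
    if d(root_point, other_root) > covdist(root_level):
        return False
    # Separating: the new child must keep sepdist from the existing children.
    return all(d(c, other_root) >= sepdist(root_level) for c in child_points)


def d(a, b):
    return abs(a - b)


# Easy case: 12 is exactly sepdist = 4 away from the existing child 8.
easy = can_attach_directly(10, 3, [8], 12, d)

# Hard case: 11.5 is only 3.5 away from 8, so the subtree cannot be
# attached as-is and its points would have to be redistributed.
hard = can_attach_directly(10, 3, [8], 11.5, d)
```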
The effect of parallel tree construction on small datasets
[Figure: normalized tree construction time (log scale, 2^-4 to 2^1) versus number of processors (1, 2, 4, 8, 16) for our cover tree on yearpredict (77 sec), twitter (107 sec), tinyImages (65 sec), and mnist (12 sec)]
Experiments run on an Amazon AWS instance with 16 true cores
Parallel tree construction really matters on larger data sets
On large datasets with an expensive metric, parallelism is more useful.
Yahoo! Flickr dataset with 1.5 million images and earth mover's distance

num cores   simplified tree         nearest ancestor tree
            time        speedup     time         speedup
1           70.7 min    1.0         210.9 min    1.0
2           36.6 min    1.9         94.2 min     2.2
4           18.5 min    3.8         48.5 min     4.3
8           10.2 min    6.9         25.3 min     8.3
16          6.7 min     10.5        12.0 min     17.6
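The speedup column is the single-core time divided by the time on p cores. The short calculation below recomputes it from the tabulated times and also derives parallel efficiency (speedup divided by core count), which is not shown on the slide. The function names are hypothetical, and because the tabulated times are already rounded, the recomputed values can differ from the slide in the last digit.

```python
cores = [1, 2, 4, 8, 16]
simplified_min = [70.7, 36.6, 18.5, 10.2, 6.7]
nearest_ancestor_min = [210.9, 94.2, 48.5, 25.3, 12.0]


def speedups(times):
    """Speedup relative to the single-core time (first entry)."""
    return [times[0] / t for t in times]


def efficiencies(times, core_counts):
    """Parallel efficiency: speedup divided by the number of cores."""
    return [s / c for s, c in zip(speedups(times), core_counts)]


simplified_speedup = speedups(simplified_min)
nearest_ancestor_speedup = speedups(nearest_ancestor_min)
simplified_eff = efficiencies(simplified_min, cores)
nearest_ancestor_eff = efficiencies(nearest_ancestor_min, cores)

for row in zip(cores, simplified_speedup, nearest_ancestor_speedup):
    print("%2d cores: simplified %.1fx, nearest ancestor %.1fx" % row)
```

An efficiency above 1, as for the nearest ancestor tree at 16 cores, means the measured speedup is superlinear.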
The effect of parallel tree construction and query
[Figure: normalized total runtime (both construction and query; log scale, 2^-4 to 2^1) on yearpredict (277 min), twitter (51 min), tinyImages (34 min), and mnist (30 min); series: reference cover tree (1 processor), MLPack's cover tree (1 processor), our cover tree (1, 2, 4, 8, 16 processors)]
Experiments run on an Amazon AWS instance with 16 true cores
Summary
You should use cover trees.
We made them easier to implement and faster.
All the code is licensed under the BSD3 license and available at: http://github.com/mikeizbicki/hlearn