Data Driven Algorithm Design


  1. Data Driven Algorithm Design Maria-Florina (Nina) Balcan Carnegie Mellon University

  2. Analysis and Design of Algorithms
Classic algo design: solve a worst-case instance.
• Easy domains have optimal poly-time algos, e.g., sorting, shortest paths.
• Most domains are hard, e.g., clustering, partitioning, subset selection, auction design, …
Data-driven algo design: use learning & data for algo design.
• Suited when we repeatedly solve instances of the same algorithmic problem.

  3. Data Driven Algorithm Design
Data-driven algo design: use learning & data for algo design.
• Different methods work better in different settings.
• Large family of methods – what's best in our application?
Prior work: largely empirical.
• Artificial Intelligence: e.g., [Horvitz-Ruan-Gomes-Kautz-Selman-Chickering, UAI 2001], [Xu-Hutter-Hoos-Leyton-Brown, JAIR 2008]
• Computational Biology: e.g., [DeBlasio-Kececioglu, 2018]
• Game Theory: e.g., [Likhodedov-Sandholm, 2004]

  4. Data Driven Algorithm Design
Data-driven algo design: use learning & data for algo design.
• Different methods work better in different settings.
• Large family of methods – what's best in our application?
Prior work: largely empirical.
Our work: data-driven algos with formal guarantees.
• Several case studies of widely used algo families.
• General principles: push the boundaries of algo design and ML.
• Related to: hyperparameter tuning, AutoML, meta-learning, program synthesis (Sumit Gulwani's talk on Mon).

  5. Structure of the Talk
• Data-driven algo design as batch learning: a formal framework.
• Case studies: clustering, partitioning problems, auction problems.
• General sample complexity theorem.
• Data-driven algo design as online learning.

  6. Example: Clustering Problems
Clustering: Given a set of objects, organize them into natural groups.
• E.g., cluster news articles, web pages, or search results by topic.
• Or, cluster customers according to purchase history.
• Or, cluster images by who is in them.
Often we need to solve such problems repeatedly.
• E.g., clustering news articles (Google News).

  7. Example: Clustering Problems
Clustering: Given a set of objects, organize them into natural groups.
Objective-based clustering:
• k-means. Input: set of objects S with distance d. Output: centers {c_1, c_2, …, c_k} minimizing Σ_p min_i d²(p, c_i).
• k-median: minimize Σ_p min_i d(p, c_i).
• k-center / facility location: minimize the maximum radius.
Finding OPT is NP-hard, so there is no universal efficient algo that works on all domains.
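To make the objectives concrete, here is a minimal Python sketch of the k-means and k-median costs from this slide; the function names and the toy numpy data are ours, not from the talk:

```python
import numpy as np

def kmeans_cost(points, centers):
    """k-means objective: sum over points of the squared distance
    to the nearest center."""
    # pairwise squared distances: |S| x k
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def kmedian_cost(points, centers):
    """k-median objective: sum of (unsquared) distances to the nearest center."""
    d = np.sqrt(((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2))
    return d.min(axis=1).sum()

# toy usage with made-up data: two tight pairs, two centers
S = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
C = np.array([[0.0, 0.5], [5.0, 5.5]])
print(kmeans_cost(S, C))   # 1.0  (each point is 0.5 from its center, squared)
print(kmedian_cost(S, C))  # 2.0
```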

  8. Algorithm Design as Distributional Learning
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
Large family F of algorithms, e.g., MST + dynamic programming, greedy + farthest location.
[Figure: samples of typical inputs — Input 1, Input 2, …, Input N — for clustering and for facility location.]

  9. Sample Complexity of Algorithm Selection
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
Approach: ERM — find Â, a near-optimal algorithm over the set of samples.
Key question: will Â do well on future (new, unseen) instances?
Sample complexity: how large should our sample of typical instances be in order to guarantee good performance on new instances?

  10. Sample Complexity of Algorithm Selection
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
Approach: ERM — find Â, a near-optimal algorithm over the set of samples.
Key tools from learning theory:
• Uniform convergence: for any algo in F, average performance over the samples is "close" to its expected performance.
• This implies that Â has high expected performance.
• N = O(dim(F)/ε²) instances suffice to be ε-close.
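A minimal sketch of the ERM approach over a finite (e.g., discretized) algorithm family; `erm_select` and the stand-in `cost` evaluator are hypothetical names for illustration, not from the talk:

```python
def erm_select(algorithms, instances, cost):
    """Empirical risk minimization over an algorithm family:
    pick the algorithm with the best average cost on the sampled
    instances. `algorithms` is any finite/discretized family;
    `cost(algo, inst)` evaluates one algorithm on one instance
    (e.g., the clustering objective value it reaches)."""
    def avg_cost(algo):
        return sum(cost(algo, inst) for inst in instances) / len(instances)
    return min(algorithms, key=avg_cost)

# toy usage: tune a single real parameter alpha over a grid
alphas = [i / 10 for i in range(11)]       # discretized one-parameter family
sample = [0.2, 0.4, 0.3]                   # stand-in "typical instances"
cost = lambda a, inst: (a - inst) ** 2     # stand-in cost function
print(erm_select(alphas, sample, cost))    # 0.3: grid point nearest the sample mean
```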

  11. Sample Complexity of Algorithm Selection
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
Key tool from learning theory: N = O(dim(F)/ε²) instances suffice to be ε-close.
dim(F) (e.g., pseudo-dimension): the ability of functions in F to fit complex patterns. The more complex the patterns a family can fit, the more samples it needs for uniform convergence and generalization.
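For a concrete sense of scale: plugging in dim(F) = O(k log n) for the parametrized linkage family (slide 14) gives N = O(k log n / ε²) sample instances; with k = 10, n = 1000, and ε = 0.1, that is on the order of 10⁴ typical instances (constants and log base omitted, so this is only an order-of-magnitude illustration).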

  12. Sample Complexity of Algorithm Selection
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
Key tool from learning theory: N = O(dim(F)/ε²) instances suffice to be ε-close.
dim(F) (e.g., pseudo-dimension): the ability of functions in F to fit complex patterns.
[Figure: overfitting — an overly complex function fit to a training set y_1, …, y_7.]

  13. Statistical Learning Approach to AAD (automated algorithm design)
Challenge: "nearby" algos can have drastically different behavior.
[Figures: IQP objective value as a volatile function of a rounding parameter α ∈ ℝ; revenue as a volatile function of the reserve price r in a second-price auction with reserve, jumping between the highest and second-highest bid.]
Challenge: design a computationally efficient meta-algorithm.

  14. Algorithm Design as Distributional Learning
Prior work: [Gupta-Roughgarden, ITCS'16 & SICOMP'17] proposed the model; analyzed greedy algos for subset-selection problems (knapsack & independent set).
Our results: new algorithm classes for a wide range of problems.
Clustering:
• Parametrized linkage [Balcan-Nagarajan-Vitercik-White, COLT 2017]: α-weighted combinations spanning single linkage, complete linkage, Ward's algo, …, followed by DP post-processing for k-means, k-median, or k-center; dim(F) = O(k log n).
• Parametrized Lloyd's [Balcan-Dick-White, NeurIPS 2018; Balcan-Dick-Lang, 2019]: seeding spanning random, k-means++ / d^α sampling, farthest-first traversal, …, followed by parametrized local search; dim(F) = O(log n).
Alignment problems (e.g., string alignment): parametrized dynamic programming [Balcan-DeBlasio-Dick-Kingsford-Sandholm-Vitercik, 2019].

  15. Algorithm Design as Distributional Learning
Our results: new algo classes applicable for a wide range of problems.
• Partitioning problems via IQPs: SDP + rounding [Balcan-Nagarajan-Vitercik-White, COLT 2017]. E.g., Max-Cut, Max-2SAT, correlation clustering. Pipeline: integer quadratic program (IQP) → semidefinite programming relaxation (SDP) → parametrized rounding (GW rounding, 1-linear rounding, …, s-linear rounding) → feasible solution to the IQP.
• Automated mechanism design [Balcan-Sandholm-Vitercik, EC 2018]: generalized parametrized VCG auctions, posted prices, lotteries.
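A sketch of the rounding stage only, assuming the SDP embedding vectors are already computed (solving the SDP itself is omitted). s = 0 recovers Goemans-Williamson sign rounding; the s > 0 branch is our reading of the slide's s-linear rounding, where projections near the random hyperplane are rounded probabilistically:

```python
import numpy as np

def round_sdp(U, s=0.0, seed=0):
    """Round SDP embedding vectors U (one unit row per IQP variable)
    to +/-1. s = 0: Goemans-Williamson sign rounding. s > 0: a
    sketch of s-linear rounding -- projections within s of the
    hyperplane are rounded probabilistically, biased by how far
    they fall on each side (our interpretation, for illustration)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(U.shape[1])        # random hyperplane normal
    proj = U @ z
    if s == 0.0:
        return np.where(proj >= 0, 1.0, -1.0)  # pure sign rounding
    phi = np.clip(proj / s, -1.0, 1.0)         # s-linear function of the projection
    # treat phi as a bias: P[x_i = +1] = (1 + phi_i) / 2
    return np.where(rng.random(len(phi)) < (1 + phi) / 2, 1.0, -1.0)

# toy usage: three unit vectors in the plane
U = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
print(round_sdp(U))         # rows 0 and 2 are antipodal, so they get opposite signs
print(round_sdp(U, s=0.5))  # near-hyperplane variables may flip
```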

  16. Algorithm Design as Distributional Learning
Our results: new algo classes applicable for a wide range of problems.
• Branch and bound techniques for solving MIPs [Balcan-Dick-Sandholm-Vitercik, ICML'18].
MIP form: max c·x s.t. Ax ≤ b, x_i ∈ {0,1} for all i ∈ I. Example instance: max (40, 60, 10, 10, 30, 20, 60)·x s.t. (40, 50, 30, 10, 10, 40, 30)·x ≤ 100, x ∈ {0,1}^7.
Branch and bound repeatedly: (1) choose a leaf of the search tree (e.g., best-bound or depth-first), (2) choose a variable to branch on (e.g., α-linear, product, or most-fractional scoring), (3) fathom and terminate when possible.
[Figure: the branch-and-bound search tree for this instance, annotated with LP bounds and fractional LP solutions at each node. A sketch of the procedure follows.]
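A compact branch-and-bound sketch on the slide's knapsack-style instance, using the fractional-knapsack LP relaxation for bounds, best-bound node selection, and branching on the LP's fractional variable. This illustrates the three tunable choices above; it is not the paper's parametrized algorithm:

```python
import heapq

V = (40, 60, 10, 10, 30, 20, 60)   # objective coefficients from the slide
W = (40, 50, 30, 10, 10, 40, 30)   # constraint coefficients from the slide
CAP = 100

def lp_relaxation(fixed):
    """Bound a node (a dict of variables fixed to 0/1) by the LP
    relaxation: fill remaining capacity greedily by value density,
    taking at most one item fractionally.
    Returns (upper bound, index of the fractional variable or None)."""
    value = sum(V[i] for i, x in fixed.items() if x == 1)
    room = CAP - sum(W[i] for i, x in fixed.items() if x == 1)
    if room < 0:
        return float("-inf"), None                # infeasible node: fathom
    free = sorted((i for i in range(len(V)) if i not in fixed),
                  key=lambda i: V[i] / W[i], reverse=True)
    for i in free:
        if W[i] <= room:
            value += V[i]
            room -= W[i]
            if room == 0:
                break                             # knapsack exactly full
        else:
            return value + V[i] * room / W[i], i  # break item is fractional
    return value, None                            # LP optimum is integral

def branch_and_bound():
    incumbent = 0
    bound, frac = lp_relaxation({})
    heap = [(-bound, 0, {}, frac)]                # best-bound: max-heap via negation
    pushed = 1
    while heap:
        neg_bound, _, fixed, frac = heapq.heappop(heap)
        if -neg_bound <= incumbent:
            continue                              # fathom: cannot beat incumbent
        if frac is None:
            incumbent = -neg_bound                # integral LP: new best solution
            continue
        for x in (0, 1):                          # branch on the fractional variable
            child = {**fixed, frac: x}
            b, f = lp_relaxation(child)
            if b > incumbent:
                heapq.heappush(heap, (-b, pushed, child, f))
                pushed += 1
    return incumbent

print(branch_and_bound())                         # 160 on this instance
```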

  17. Clustering Problems
Clustering: Given a set of objects (news articles, customer surveys, web pages, …), organize them into natural groups.
Objective-based clustering, e.g., k-means. Input: set of objects S with distance d. Output: centers {c_1, c_2, …, c_k} minimizing Σ_p min_i d²(p, c_i). Or minimize distance to a ground-truth clustering.

  18. Clustering: Linkage + Post-processing
Family of poly-time two-stage algorithms [Balcan-Nagarajan-Vitercik-White, COLT 2017]:
1. Greedy linkage-based algo to build a hierarchy (tree) of clusters.
2. Fixed algo (e.g., DP or last k merges) to select a good pruning — see the sketch below.
[Figure: a hierarchy over points A–F, with subtrees such as {A, B, C} and {D, E, F}, and a pruning selected from it.]
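A sketch of the step-2 dynamic program: given the cluster tree, it finds the cheapest pruning into exactly k clusters under a pluggable per-cluster cost. The function name `best_pruning` and the node encoding are ours, chosen for brevity:

```python
from functools import lru_cache

def best_pruning(nodes, root, k, cost):
    """DP post-processing: minimum objective over all prunings of the
    cluster tree into exactly k clusters. `nodes[i] = (cluster, left, right)`
    with left/right child ids, or None at leaves; `cost(cluster)` is the
    chosen objective on one cluster (e.g., best-center k-means cost)."""
    @lru_cache(maxsize=None)
    def dp(i, j):
        cluster, left, right = nodes[i]
        if j == 1:
            return cost(cluster)      # keep this whole subtree as one cluster
        if left is None:
            return float("inf")       # a leaf cannot be split further
        # split the budget of j clusters between the two children
        return min(dp(left, a) + dp(right, j - a) for a in range(1, j))
    return dp(root, k)

# toy usage: four 1-D points, tree built by hand, k-means-style cost
import statistics
pts = {0: 0.0, 1: 1.0, 2: 5.0, 3: 6.0}
nodes = [
    ((0,), None, None), ((1,), None, None),
    ((2,), None, None), ((3,), None, None),
    ((0, 1), 0, 1), ((2, 3), 2, 3),
    ((0, 1, 2, 3), 4, 5),
]
def cost(cluster):
    mu = statistics.fmean(pts[i] for i in cluster)
    return sum((pts[i] - mu) ** 2 for i in cluster)
print(best_pruning(nodes, 6, 2, cost))  # 1.0: prune into {0,1} and {2,3}
```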

  19. Clustering: Linkage + Post-processing
1. Linkage-based algo to build a hierarchy.
2. Post-processing to identify a good pruning.
Both steps can be done efficiently.
[Figure: pipeline from DATA, through a linkage rule (α-weighted combination of single linkage, complete linkage, Ward's algo, …) and DP for k-means / k-median / k-center, to a CLUSTERING.]

  20. Linkage Procedures for Hierarchical Clustering
Bottom-up (agglomerative):
• Start with every point in its own cluster.
• Repeatedly merge the "closest" two clusters.
Different definitions of "closest" give different algorithms.
[Figure: a topic hierarchy — all topics, splitting into sports (tennis, soccer) and fashion (Lacoste, Gucci).]

  21. Linkage Procedures for Hierarchical Clustering
Have a distance measure on pairs of objects: d(x, y) = distance between x and y, e.g., # keywords in common, edit distance, etc.
• Single linkage: dist(A, B) = min_{x∈A, x′∈B} d(x, x′)
• Complete linkage: dist(A, B) = max_{x∈A, x′∈B} d(x, x′)
• Parametrized family, α-weighted linkage: dist_α(A, B) = (1 − α) min_{x∈A, x′∈B} d(x, x′) + α max_{x∈A, x′∈B} d(x, x′)
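A plain agglomerative loop implementing the α-weighted rule above (α = 0 gives single linkage, α = 1 complete linkage). This naive sketch recomputes all pairwise linkage distances each round, favoring clarity over the efficient implementations analyzed in the paper:

```python
def alpha_weighted_linkage(points, d, alpha, k):
    """Agglomerative clustering with the alpha-weighted rule:
    dist(A, B) = (1 - alpha) * min d(x, x') + alpha * max d(x, x').
    Stops once k clusters remain (recording the merge sequence
    instead would yield the full hierarchy for step 2)."""
    clusters = [[p] for p in points]
    def dist(A, B):
        gaps = [d(x, y) for x in A for y in B]
        return (1 - alpha) * min(gaps) + alpha * max(gaps)
    while len(clusters) > k:
        # find the closest pair of clusters under the alpha-weighted rule
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters[j]   # merge the pair
        del clusters[j]
    return clusters

# toy usage with 1-D points and absolute-difference distance
print(alpha_weighted_linkage([0.0, 1.0, 5.0, 6.0],
                             lambda x, y: abs(x - y), alpha=0.5, k=2))
# [[0.0, 1.0], [5.0, 6.0]]
```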
