Classification from Positive, Unlabeled and Biased Negative Data

Dec 31, 2022

Poster #180
Yu-Guan Hsieh (1), Gang Niu (2), Masashi Sugiyama (2,3)
(1) ENS Paris, France; (2) RIKEN, Japan; (3) The University of Tokyo, Japan

Background and problem setup

Training data available in each learning setting:
● Supervised: Positive (P), Negative (N)
● Semi-supervised: Positive (P), Negative (N), Unlabeled (U)
● PUbN: Positive (P), Biased Negative (bN), Unlabeled (U)

Motivating examples

[Figure legend: Positive Samples, Labeled Negative Samples, Other Negative Samples]
● Information retrieval, text classification, sentiment analysis
● Medical diagnosis: the healthy population requesting physical exams is biased

Method: Empirical risk estimator

[Diagram labels: Risk Minimization, Unbiased labeled data, Unbiased Estimator, Empirical Risk Minimization]
[Equation: empirical risk estimator, with terms annotated "#P data", "#bN data" and "#U data"]
● σ(x) = p(s=+1|x): probability of x being labeled
● η > 0: determines how much we rely on the U data to approximate the risk

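The estimator's formula did not survive on this page, so the following is only a rough sketch of how the quantities σ and η could enter an empirical risk. It assumes the risk decomposes as π·E_P[ℓ(g(x))] + ρ·E_bN[ℓ(−g(x))] + E_x[(1 − σ(x))·ℓ(−g(x))], and that the last term is estimated from U data where σ̂(x) ≤ η and from importance-weighted P and bN data where σ̂(x) > η. Reading π as p(y=+1) and ρ as p(y=−1, s=+1), the sigmoid loss, and the function names `pubn_risk` and `labeled_weight` are all assumptions made for illustration, not taken from the slide.

```python
import numpy as np

def sigmoid_loss(z):
    """Sigmoid loss l(z) = 1 / (1 + exp(z)); small when z is large and positive."""
    return 1.0 / (1.0 + np.exp(z))

def labeled_weight(sigma, eta):
    """Weight (1 - sigma) / sigma on labeled points with sigma > eta, else 0.

    Assumes eta > 0, so the denominator below is never zero.
    """
    safe = np.maximum(sigma, eta)
    return np.where(sigma > eta, (1.0 - sigma) / safe, 0.0)

def pubn_risk(g_p, g_bn, g_u, sig_p, sig_bn, sig_u, pi, rho, eta, loss=sigmoid_loss):
    """Hypothetical empirical risk from P, bN and U samples.

    g_*   : classifier outputs g(x) on each sample set
    sig_* : estimates of sigma(x) = p(s=+1|x) on each sample set
    pi    : assumed to be p(y=+1)
    rho   : assumed to be p(y=-1, s=+1)
    eta   : threshold in (0, 1]; larger eta relies more on the U data
    """
    r_p = pi * np.mean(loss(g_p))                # P data, treated as positive
    r_bn = rho * np.mean(loss(-g_bn))            # bN data, treated as negative
    # Part of the remaining negative risk read directly off U data (sigma <= eta)
    r_u = np.mean((sig_u <= eta) * (1.0 - sig_u) * loss(-g_u))
    # Rest of that risk recovered from labeled data by importance weighting
    r_lab = (pi * np.mean(labeled_weight(sig_p, eta) * loss(-g_p))
             + rho * np.mean(labeled_weight(sig_bn, eta) * loss(-g_bn)))
    return r_p + r_bn + r_u + r_lab

# Toy usage with made-up numbers
rng = np.random.default_rng(0)
risk = pubn_risk(rng.normal(size=5), rng.normal(size=4), rng.normal(size=8),
                 rng.uniform(0.5, 1.0, 5), rng.uniform(0.2, 1.0, 4),
                 rng.uniform(0.0, 1.0, 8), pi=0.4, rho=0.2, eta=0.3)
print(float(risk))
```

With η set to 1 the weighted terms on P and bN vanish and the third term comes entirely from U data; smaller η shifts more of it onto the labeled samples.
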
Method: Illustration

[Figure labels: bN, P, U, "Regarded as N", y = +1, y = -1, σ ↑]
● Step 1: ERM to estimate σ = p(s=+1|·), with s as label; nnPU classifier (Kiryo+ NeurIPS 2017)
● Step 2: final classifier, with y as label; pseudo labeling + weight adjustment

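As a rough sketch of what the nnPU objective named in Step 1 might look like when s is used as the label: the code below implements the non-negative PU risk in its commonly stated form, prior·E_L[ℓ(g)] + max(0, E_U[ℓ(−g)] − prior·E_L[ℓ(−g)]). Treating all labeled data (P and bN together) as the "positive" set, taking `prior` to be p(s=+1), and using the sigmoid loss are assumptions made here for illustration; the function name `nnpu_risk` is hypothetical.

```python
import numpy as np

def sigmoid_loss(z):
    """Sigmoid loss l(z) = 1 / (1 + exp(z))."""
    return 1.0 / (1.0 + np.exp(z))

def nnpu_risk(g_labeled, g_unlabeled, prior, loss=sigmoid_loss):
    """Non-negative PU risk with s (labeled or not) as the target.

    g_labeled   : outputs on labeled data (assumed here: P and bN pooled)
    g_unlabeled : outputs on U data
    prior       : assumed to be p(s=+1), the fraction of data that gets labeled
    """
    # Risk of predicting "labeled" on the labeled set
    pos = prior * np.mean(loss(g_labeled))
    # Risk of predicting "not labeled", estimated from U with the labeled part removed
    neg = np.mean(loss(-g_unlabeled)) - prior * np.mean(loss(-g_labeled))
    # Clip at zero so the estimated negative-class risk cannot go below 0
    return pos + max(0.0, float(neg))

# Toy usage with made-up scores
print(float(nnpu_risk(np.array([1.5, 0.3]), np.array([-0.8, 0.1, -2.0]), prior=0.6)))
```

The clipping is what distinguishes this from the plain unbiased PU risk: without it, a flexible model can drive the bracketed term negative on the training set.
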
Estimation error bound

With probability at least 1-δ:
[Equation: estimation error bound, with terms annotated "#P data", "#bN data", "#U data" and "Bias due to inexact approximation of σ"]

Experiments

Models: ConvNet / ResNet / FCN
Training: Amsgrad

Dataset       | P                                 | π    | bN                             | ρ    | nnPU/nnPNU   | PUbN(\N)     | PU→PN
MNIST         | 2, 4, 6, 8, 10                    | 0.49 | Not given                      | NA   | 5.76 ± 1.04  | 4.64 ± 0.62  | NA
              |                                   |      | 1, 3, 5                        | 0.3  | 5.33 ± 0.97  | 4.05 ± 0.27  | 4.00 ± 0.30
              |                                   |      | 9 > 5 > others                 | 0.2  | 4.60 ± 0.65  | 3.91 ± 0.66  | 3.77 ± 0.31
CIFAR-10      | Airplane, automobile, ship, truck | 0.4  | Not given                      | NA   | 12.02 ± 0.65 | 10.70 ± 0.57 | NA
              |                                   |      | Cat, dog, horse                | 0.3  | 10.25 ± 0.38 | 9.71 ± 0.51  | 10.37 ± 0.65
              |                                   |      | Horse > deer = frog > others   | 0.25 | 9.98 ± 0.53  | 9.92 ± 0.42  | 10.17 ± 0.35
CIFAR-10      | Cat, deer, dog, horse             | 0.4  | Not given                      | NA   | 23.78 ± 1.04 | 21.13 ± 0.90 | NA
              |                                   |      | Bird, frog                     | 0.2  | 22.00 ± 0.53 | 18.83 ± 0.71 | 19.88 ± 0.62
              |                                   |      | Car, truck                     | 0.2  | 22.00 ± 0.74 | 20.19 ± 1.06 | 21.83 ± 1.36
20 Newsgroups | alt., comp., misc., rec.          | 0.56 | Not given                      | NA   | 14.67 ± 0.87 | 13.30 ± 0.53 | NA
              |                                   |      | sci.                           | 0.21 | 14.69 ± 0.46 | 13.10 ± 0.90 | 13.58 ± 0.97
              |                                   |      | talk.                          | 0.17 | 14.38 ± 0.74 | 12.61 ± 0.75 | 13.76 ± 0.66
              |                                   |      | soc. > talk. > sci.            | 0.1  | 14.41 ± 0.76 | 12.18 ± 0.59 | 12.92 ± 0.51


