Beyond Credential Stuffing: Password Similarity Models using Neural - PowerPoint PPT Presentation

Beyond Credential Stuffing: Password Similarity Models using Neural Networks
Bijeeta Pal*, Tal Daniel+, Rahul Chatterjee*, and Thomas Ristenpart*
*Cornell Tech, +Technion

Password Breaches
• Millions of passwords are leaked every year.
• In the first half of 2018 alone, about 4.5 billion records were exposed [1].

[1] "Data breaches compromised 4.5bn records in half year 2018 – Gemalto", The Citizen, October 17, 2018.

Implication of breaches

Leaked Dataset              Authentication Database
Username   Password         Username   Password
mark       jicDfba1         mark       jicDfba1
julia      password         charlie    123456
tom        abc123           amelie     y567dty56
…          …                …          …

Attacker → Server: mark, jicDfba1

• Prior work: 40% of users reuse passwords [2].
• Credential stuffing attack: 90% of login traffic and the most prevalent form of account compromise! [3]

[2] S. Pearman et al., "Let's go in for a closer look: Observing passwords in their natural habitat," ACM CCS 2017, pp. 295–310.
[3] Shape Security, "2017 Credential spill report," http://info.shapesecurity.com/rs/935-ZAM-778/images/Shape-2017-Credential-Spill-Report.pdf/, 2018.

Countermeasures

A breach notification service tells mark that his credentials appear in the leaked dataset: Reset Password!

After the reset, the authentication database holds mark's new password, while the leaked dataset still holds the old one:

Leaked Dataset              Authentication Database
Username   Password         Username   Password
mark       jicDfba1         mark       jicDfba123
julia      password         charlie    123456
tom        abc123           amelie     y567dty56
…          …                …          …

Attacker → Server: mark, jicDfba1

Credential tweaking attacks

Leaked Dataset              Authentication Database
Username   Password         Username   Password
mark       jicDfba1         mark       jicDfba123
julia      password         charlie    123456
tom        abc123           amelie     y567dty56
…          …                …          …

The attacker submits variants of the leaked password:

Attacker → Server: mark, jicDfba1
                   mark, JicDfba
                   …
                   mark, jicDfba123

Our contributions

Attack: most damaging credential tweaking attack to date
§ Built using a state-of-the-art deep learning framework
§ 16% of accounts compromised in less than 1000 guesses
§ Evaluated on real user accounts of a large university

Defense: personalized password strength meters (PPSM)
§ Built using neural network based embedding models
§ Robust against all known attacks
§ Fast and light-weight (3 MB)

Starting point: breach data

First discovered by 4iQ on the Dark Web [4]
• 1.4 billion email, password pairs
• 1.1 billion unique emails
• 463 million unique passwords
• More than 150 million users with 2 or more passwords

User    Password List
mark    jicDfba1, jicDfba123
julia   password, 123456, 1234567
tom     abcd123, abcd
…       …

Lots of similar passwords: around 10% of distinct password pairs of the same user are within 1 edit distance.

[4] J. Casal, "1.4 Billion Clear Text Credentials Discovered in a Single Database," https://medium.com/4iqdelvedeep/1-4-billion-clear-text-credentials-discovered-in-a-single-database-3131d0a1ae14, Dec. 2017.
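The "within 1 edit distance" statistic can be checked with a standard Levenshtein distance. The following is a minimal sketch for illustration, not the authors' code; the function name edit_distance is an assumption, and the sample pairs are the ones shown in the slide's table.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


# Password pairs of the same user, taken from the slide's example table.
pairs = [("jicDfba1", "jicDfba123"), ("123456", "1234567"), ("abcd123", "abcd")]
for old, new in pairs:
    print(old, new, edit_distance(old, new))
```

Of the three sample pairs, only 123456 and 1234567 fall within 1 edit of each other.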

Prior work: manually chosen transformation rules

Previous work [5][6]
• Can't generate new guesses once the rules are exhausted
• Might have missed similarity patterns:
  markFacebook → mark@facebook
  markSuperman → marcSuperman

User    Password List
mark    jicDfba1, jicDfba123
julia   password, 123456, 1234567
tom     abcd123, abcd
…       …

[5] A. Das et al., "The tangled web of password reuse," in NDSS, vol. 14, 2014, pp. 23–26.
[6] D. Wang et al., "Targeted online password guessing: An underestimated threat," in ACM CCS, 2016, pp. 1242–1254.
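To make the two limitations concrete, here is a small sketch of a rule-based guesser. The five rules are hypothetical examples chosen for illustration; they are not the rule sets of Das et al. or Wang et al.

```python
import string
from typing import Callable, List

# Hypothetical hand-picked rules, for illustration only.
RULES: List[Callable[[str], str]] = [
    lambda pw: pw[:1].upper() + pw[1:],   # capitalize the first character
    lambda pw: pw + "1",                  # append a digit
    lambda pw: pw + "!",                  # append a symbol
    lambda pw: pw[:-1],                   # drop the last character
    lambda pw: pw.rstrip(string.digits),  # strip trailing digits
]


def rule_guesses(leaked: str) -> List[str]:
    """Apply each rule once to the leaked password and return the
    distinct, non-empty results in rule order."""
    seen = {leaked}
    guesses = []
    for rule in RULES:
        guess = rule(leaked)
        if guess and guess not in seen:
            seen.add(guess)
            guesses.append(guess)
    return guesses


guesses = rule_guesses("jicDfba1")
print(guesses)
print("jicDfba123" in guesses)
```

With this rule list the guesser stops after at most five guesses, and it never produces jicDfba123 because no rule appends "23". Both limitations from the slide show up even in this toy setting.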

Data-driven approach for learning similarity

User    Password List
mark    jicDfba1, jicDfba123
julia   password, 123456, 1234567
tom     abcd123, abcd
…       …

Password lists → Machine learning → Similarity model P(w' | w)

P(w' | w) models the probability that a user selects w' given old password w.

Goal: build credential tweaking attacks using P(w' | w)

Example for w = jicDfba1:

Password     P(w' | w)
jicDfba123   0.6
jicDfba      0.2
JicDfba1     0.1

Training generative similarity models

Encoder-decoder architecture built using character-level recurrent neural networks (RNNs)

Pass2Path:
jicDfba1 → key-press representation → Encoder RNN → vector (0.2, -0.1, -0.4, 0.1) → Decoder RNN → edit path → jicDfba123
Decoder output: <add,2,-1>, 0.4; <add,3,-1>, 0.3

• Trained on 144 million password pairs
• Took 2 days on an Nvidia GTX 1080 GPU and an Intel Core i9 processor
• Model has 2.4 million parameters and takes 60 MB of space
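The decoder emits a path of edits rather than the new password itself. The sketch below shows how such a path could be applied to a leaked password. The tuple layout (operation, character, position), the operation names "del" and "sub", and the reading of position -1 as "end of the string" are assumptions made for this illustration; the slide only shows <add,2,-1> and <add,3,-1>. The sketch also works on plain characters and skips the key-press representation.

```python
from typing import List, Tuple

Edit = Tuple[str, str, int]  # (operation, character, position), assumed layout


def apply_path(password: str, path: List[Edit]) -> str:
    """Apply a sequence of edits, one after another, to a password."""
    chars = list(password)
    for op, ch, pos in path:
        if op == "add":
            if pos == -1:
                chars.append(ch)       # assumed: -1 means append at the end
            else:
                chars.insert(pos, ch)  # insert before index pos
        elif op == "del":
            del chars[pos]             # remove the character at pos
        elif op == "sub":
            chars[pos] = ch            # replace the character at pos
        else:
            raise ValueError(f"unknown edit operation: {op!r}")
    return "".join(chars)


# The two edits shown on the slide turn jicDfba1 into jicDfba123.
print(apply_path("jicDfba1", [("add", "2", -1), ("add", "3", -1)]))
```

Predicting a short edit path keeps the output space small when the old and new passwords differ by only a few key presses.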

Simulation-based evaluation

User    Password List
mark    jicDfba1, jicDfba123
julia   password, 123456, 1234567
tom     abc123, ftgKdu45
…       …

• Training data (144 million w, w' pairs) → Pass2Path
• Test data (100,000 w', w pairs)

Online credential tweak attack setting:
• Given w, guess w' with q attempts
• q ≤ 1000
• Report the fraction of passwords guessed
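The reported metric, the fraction of test passwords guessed within a budget of q attempts, can be sketched as follows. The names fraction_guessed and toy_guesser are hypothetical, and the toy guesser is a fixed stand-in for a trained model such as Pass2Path. The three test pairs are made-up examples, not data from the evaluation.

```python
from typing import Callable, List, Sequence, Tuple


def fraction_guessed(
    test_pairs: Sequence[Tuple[str, str]],
    guesser: Callable[[str], List[str]],
    q: int,
) -> float:
    """Fraction of (leaked, target) pairs whose target appears among
    the first q ranked guesses produced from the leaked password."""
    if not test_pairs:
        raise ValueError("test_pairs must not be empty")
    hits = sum(1 for leaked, target in test_pairs
               if target in guesser(leaked)[:q])
    return hits / len(test_pairs)


def toy_guesser(leaked: str) -> List[str]:
    """Stand-in for a trained model: a fixed, ranked list of tweaks."""
    return [leaked + "23", leaked[:-1], leaked[:1].upper() + leaked[1:]]


# Made-up test pairs for illustration only.
toy_pairs = [("jicDfba1", "jicDfba123"), ("123456", "12345"), ("abc123", "ftgKdu45")]
for q in (1, 10):
    print(q, fraction_guessed(toy_pairs, toy_guesser, q))
```

Because only the first q guesses count, the ranking of guesses matters as much as their coverage. This is why results are reported separately for small and large budgets.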

Credential tweaking attacks

[Bar chart: % of passwords cracked given a leaked password of the user (axis from 0 to 18), comparing Das et al., Wang et al., and our algorithm (Pass2Path) at q ≤ 10 and q ≤ 1000. Annotations: "53% increase", "23% increase".]

• Pass2Path: almost 16% of accounts compromised
• Using multiple leaked passwords, P(w' | w1, w2, …): Pass2Path-based attack compromises 23% of accounts (see paper)

Credential tweaking in practice

• No prior real-world evaluation of credential tweaking attacks
• Partnered with Cornell University IT Security (ITSO)
• Large-scale authentication system:
  • ~500,000 accounts
  • Uses credential stuffing defenses
  • Password rules

Audit:
• 19,868 Cornell emails in the leaked dataset
• Ran our attack on these accounts
• Total of 1,374 active accounts vulnerable
• Vulnerable accounts put under watchlist by ITSO

Defense against these attacks

• To date, no defenses against credential tweaking attacks
• 71% of vulnerable passwords are considered strong by zxcvbn, which only considers the population-wide password distribution

Goal: warn users when passwords are vulnerable to credential tweaking attacks
• Run audits using credential tweaking attacks: expensive to run
• Our solution: personalized password strength meter (PPSM)

Personalized password strength meter (PPSM)

Authentication Database     Leaked Dataset
Username   Password         Username   Password
mark       jicDfba1         mark       jicDfba1
charlie    123456           julia      password
…          …                …          …

Breach Notification Service → Mark: reset notification

Personalized password strength meter (PPSM)

Authentication Database     Leaked Dataset
Username   Password         Username   Password
mark       jicDfba1         mark       jicDfba1
charlie    123456           julia      password
…          …                …          …

Mark submits candidate passwords to the server, where the PPSM checks them against the breach notification service:

Candidate    PPSM verdict
jicDfba123   Similar password
password     Weak password
DioWs@194    Accepted

Building PPSMs

• Pass2Path is too big and slow for a PPSM
• Password embedding model: a feed-forward neural network
• Example passwords shown in the embedding space: qwerty, QWERTY1, QWERTY; jicDfba123, jicDfba1; Qfhjs3$4fg4; 123456
• Compressed model detects 96% of vulnerable passwords
• Easy to deploy: 3 MB
• Fast: 0.3 ms
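The idea of an embedding-based check can be sketched as follows: embed the candidate and each leaked password as vectors, then flag the candidate if it lies close to any leaked one. This sketch swaps the learned embedding for a much simpler stand-in, lowercase character bigram counts compared by cosine similarity. The function names and the 0.5 threshold are assumptions for illustration, not values from the talk.

```python
import math
from collections import Counter
from typing import Iterable


def ngram_vector(password: str, n: int = 2) -> Counter:
    """Stand-in embedding: counts of lowercase character n-grams,
    with ^ and $ marking the start and end of the password."""
    padded = f"^{password.lower()}$"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def is_vulnerable(candidate: str, leaked: Iterable[str],
                  threshold: float = 0.5) -> bool:
    """Flag the candidate if it is similar to any leaked password.
    The threshold is a hypothetical value chosen for this sketch."""
    cand_vec = ngram_vector(candidate)
    return any(cosine(cand_vec, ngram_vector(old)) >= threshold
               for old in leaked)


print(is_vulnerable("jicDfba123", ["jicDfba1"]))
print(is_vulnerable("DioWs@194", ["jicDfba1"]))
```

A learned embedding can pick up similarity patterns that bigram overlap misses, but the deployment shape is the same: a small model plus a distance threshold, evaluated once per password change.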

Beyond credential stuffing

Modeling similarity of human-chosen passwords: we build both a damaging tweaking attack and the first-ever defense against it.

Attack
• Data-driven, state-of-the-art deep learning
• Outperforms the best previous attacks
• 1,374 active user accounts at Cornell University vulnerable

Defense
• PPSM using a password embedding model
• Prevents credential tweaking attacks
• Fast and lean (3 MB)

Thank you!
Email: bp397@cornell.edu
Website: cs.cornell.edu/~bijeeta/
Github: github.com/Bijeeta/credtweak
