Analyzing Pwned Passwords with Spark
Kelley Robinson
@kelleyrobinson
Developer Evangelist
- Spark: then and now
- The state of passwords
- Spark in action
- Big Data ∩ Security
Apache Spark Ecosystem
Spark Abstractions
Then: RDD (Resilient Distributed Dataset)
Now: DataFrames / Datasets
RDDs
- Immutable & distributed collection
- Unstructured data
- Low-level transformation and control
Datasets
- Structured data
- Strongly typed
- Fast
- SQL DSLs
Apache Spark Ecosystem
Scala has the most robust language API
Benefits
- Fast
- Flexible
- Good for exploration
- Proven for large systems
Challenges
- Opaque error messages
- Operationalizing
- Documentation
http://heather.miller.am/blog/launching-a-spark-cluster-part-1.html
👎💰
The missing Spark documentation
THANK YOU!
@kelleyrobinson
Spark Resources
- Apache Spark
- Jacek's Spark Documentation
- Zeppelin
- RDDs vs. Datasets
- Running Spark on a Cluster
Security Resources
- Pwned Passwords
- Reverse SHA1 hashes
- LastPass and 1Password
- 2FA Guides
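For readers following the Pwned Passwords and SHA1 links above, here is a minimal sketch of how a plaintext password maps onto that data. Pwned Passwords identifies passwords by their SHA-1 hash, and its range lookup is keyed on the first five hex characters of that hash. The function names (`sha1_hex`, `split_hash`, `lookup_count`) are hypothetical, the `SUFFIX:COUNT` line layout is an assumption about the response format, and the sample counts are made up for illustration. This is not code from the talk.

```python
import hashlib
from typing import Iterable, Tuple


def sha1_hex(password: str) -> str:
    """Return the uppercase hex SHA-1 digest of a UTF-8 encoded password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def split_hash(digest: str, prefix_len: int = 5) -> Tuple[str, str]:
    """Split a digest into the prefix used for a range lookup and the remaining suffix."""
    return digest[:prefix_len], digest[prefix_len:]


def lookup_count(password: str, range_lines: Iterable[str]) -> int:
    """Find the password's hash suffix among lines assumed to look like SUFFIX:COUNT.

    Returns the count on a match and 0 when the suffix is absent.
    """
    _, suffix = split_hash(sha1_hex(password))
    for line in range_lines:
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


if __name__ == "__main__":
    prefix, suffix = split_hash(sha1_hex("password"))
    # Hypothetical sample data: both counts below are invented, not real breach figures.
    sample_lines = [
        "0" * 35 + ":1",
        suffix + ":42",
    ]
    print("prefix:", prefix)
    print("count:", lookup_count("password", sample_lines))
```

Only the five-character prefix would need to leave the machine in a range lookup; the comparison against the returned suffixes happens locally, so the full hash is never sent.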