
Hadoop Security Design? Just Add Kerberos? Really? Andrew Becherer - PowerPoint PPT Presentation

Hadoop Security Design? Just Add Kerberos? Really? Andrew Becherer, Black Hat USA 2010, https://www.isecpartners.com


  1. Hadoop Security Design? Just Add Kerberos? Really? Andrew Becherer, Black Hat USA 2010, https://www.isecpartners.com

  2. Agenda
      - Conclusion
      - What is Hadoop
      - Old School Hadoop Risks
      - The New Approach to Security
      - Concerns
      - Alternative Strategies
      - A Security Consultant Walks Into a Datacenter

  3. Conclusion: Did Hadoop Get Safer?

  4. Conclusion: Hadoop made significant advances but faces several significant challenges

  5. What is Hadoop
      - MapReduce
      - Simplified View
      - Who Is Using It

  6. MapReduce
      - Name Nodes & Data Nodes
          - Data Access
      - Job Tracker
          - Job Submission
      - Task Tracker
          - Work
      - Optional other services
          - Workflow managers
          - Bulk data distribution

  7. Simplified View
      [Diagram: User, Job Tracker, two Task Trackers each with a Task, and HDFS]

  8. Who is Using It

  9. Hadoop Risks
      - Insufficient Authentication
      - No Privacy & No Integrity
      - Arbitrary Code Execution
      - Exploit Scenario

  10. Insufficient Authentication
      - Hadoop did not authenticate users
      - Hadoop did not authenticate services

  11. No Privacy & No Integrity
      - Hadoop used insecure network transports
      - Hadoop did not provide message level security

  12. Arbitrary Code Execution
      - Malicious users could submit jobs which would execute with the permissions of the Task Tracker

  13. Exploit Scenario
      - Alice had access to the Hadoop cluster
      - Bob had access to the Hadoop cluster
      - Alice and Bob had to trust each other completely
      - If Mallory got access to the cluster, Alice and Bob both died in a fire.

  14. The New Approach
      - Kerberos
      - Delegation Tokens
      - New Workflow Manager
      - Stated Limitations

  15. Kerberos
      - Users authenticate to the edge of the cluster with Kerberos (via GSSAPI)
      - User and group access is maintained in cluster-specific access control lists

  16. Delegation Tokens
      - To prevent bottlenecks at the KDC, Hadoop uses various tokens internally.
          - Delegation Token
          - Job Token
          - Block Access Token
      - SASL with an RPC Digest mechanism

  17. New Workflow Manager
      - Oozie
          - Users authenticate using some “pluggable” authentication mechanism
          - Oozie is a superuser and able to communicate with Job Trackers and Name Nodes on behalf of the user.

  18. Stated Limitations
      - Users cannot have administrator access to nodes in the cluster
      - HDFS will not transmit data over untrusted networks
      - MapReduce will not transmit data over untrusted networks
      - Security changes will not impact GridMix performance by more than 3%.

  19. Concerns
      - Quality of Protection (QoP)
      - Massive Scale Symmetric Cryptography
      - Pluggable Web UI Authentication
      - IP Based Authentication

  20. Quality of Protection (QoP)
      - Authentication
      - Integrity
      - Privacy
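
The three QoP levels on this slide line up with the standard SASL quality-of-protection values (`auth`, `auth-int`, `auth-conf`). The sketch below is only an illustration of what each level buys you; the `QoP` enum and the `guarantees` helper are hypothetical names, not part of Hadoop.

```python
from enum import Enum


class QoP(Enum):
    """Slide's QoP levels mapped to the standard SASL qop strings."""
    AUTHENTICATION = "auth"       # peers are authenticated, nothing more
    INTEGRITY = "auth-int"        # plus tamper detection on each message
    PRIVACY = "auth-conf"         # plus encryption of each message


def guarantees(qop: QoP) -> dict:
    """Hypothetical helper: which properties a given level provides."""
    return {
        "authenticated": True,
        "tamper_evident": qop in (QoP.INTEGRITY, QoP.PRIVACY),
        "encrypted": qop is QoP.PRIVACY,
    }


for level in QoP:
    print(level.name, level.value, guarantees(level))
```

The point of the concern is visible in the first row: with authentication only, traffic after the handshake is neither tamper-evident nor encrypted.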

  21. Symmetric Cryptography
      - Block Access Tokens are used to access data
      - TokenAuthenticator = HMAC-SHA1(key, TokenID)
      - The secret key must be shared between the Name Nodes and all of the Data Nodes
      - SHARED WITH ALL OF THE DATA NODES!!! That is a lot of nodes.
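
A minimal sketch of why the shared key matters, using the formula from the slide. The key value and the TokenID layout are made up for illustration; the real token fields are not given in the slides.

```python
import hashlib
import hmac


def token_authenticator(key: bytes, token_id: bytes) -> bytes:
    """TokenAuthenticator = HMAC-SHA1(key, TokenID), as on the slide."""
    return hmac.new(key, token_id, hashlib.sha1).digest()


def verify(key: bytes, token_id: bytes, authenticator: bytes) -> bool:
    """What a Data Node can do with the shared key: recompute and compare."""
    expected = token_authenticator(key, token_id)
    return hmac.compare_digest(expected, authenticator)


# Hypothetical values, for illustration only.
shared_key = b"key-held-by-name-node-and-every-data-node"
token_id = b"user=alice;block=blk_1;mode=READ"

# The Name Node issues a token and a Data Node accepts it.
issued = token_authenticator(shared_key, token_id)
accepts_legit = verify(shared_key, token_id, issued)

# Anyone who lifts the key from a single node can mint a token for any TokenID.
forged_id = b"user=mallory;block=blk_1;mode=READ"
forged = token_authenticator(shared_key, forged_id)
accepts_forged = verify(shared_key, forged_id, forged)

print(accepts_legit, accepts_forged)
```

Because the scheme is symmetric, the ability to verify a token is the same as the ability to create one, so every Data Node holding the key is a place the key can be stolen from.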

  22. Pluggable Web UI Authentication
      - There are multiple web UIs
          - Oozie
          - Job Tracker
          - Task Tracker
      - With no standard HTTP authentication mechanism, I hope your developers are up to it.

  23. IP Based Authentication
      - HDFS proxies use the HSFTP protocol for bulk data transfers
      - HDFS proxies are authenticated by IP address
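
To make the concern concrete, here is a rough sketch of what an IP-based check amounts to. The allowlist, the address, and the function name are hypothetical; the slides do not show how the check is actually implemented.

```python
import ipaddress

# Hypothetical allowlist of HDFS proxy addresses.
TRUSTED_PROXIES = {ipaddress.ip_address("10.0.0.5")}


def is_trusted_proxy(peer_ip: str) -> bool:
    """The whole 'authentication' step: is the source address on the list?"""
    return ipaddress.ip_address(peer_ip) in TRUSTED_PROXIES


print(is_trusted_proxy("10.0.0.5"))  # a real proxy, or anyone holding its address
print(is_trusted_proxy("10.0.0.6"))
```

No secret is involved anywhere in the check, so whoever can take over or spoof the proxy's address inherits its access.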

  24. Alternative Strategies
      - Tahoe

  25. Tahoe - A Least Authority File System
      - Deserves its own talk
      - Aaron Cordova gave one at HadoopWorld NYC 2009
      - Disk is not trusted
      - Network is not trusted
      - Memory is trusted
      - Intended for use in Infrastructure as a Service cloud computing environments
      - Write performance is terrible but read performance is not so bad

  26. Assessing Hadoop
      - Targets
      - Tokens

  27. Targets
      - Oozie is a superuser capable of performing any operation as any user
      - A compromised Name Node or Data Node can give access to all of the data stored in HDFS, because it holds the shared “secret key”
      - Data may be transmitted over insecure transports including HSFTP, FTP and HTTP
      - Stealing the IP of an HDFS Proxy could allow one to extract large amounts of data quickly

  28. Tokens: Gotta Catch ’em All
      - Kerberos Ticket Granting Ticket
      - Delegation Token
          - Get the Shared Key if Possible
      - Job Token
          - Get the Shared Key if Possible
      - Block Access Token
          - Get the Shared Key if Possible

  29. Thank you for coming! andrew@isecpartners.com
