Hadoop Security Design? Just Add Kerberos? Really?
Andrew Becherer, Black Hat USA 2010
https://www.isecpartners.com


SLIDE 1

Hadoop Security Design?
Just Add Kerberos? Really?

Andrew Becherer
Black Hat USA 2010
https://www.isecpartners.com

SLIDE 2

Agenda

• Conclusion
• What is Hadoop
• Old School Hadoop Risks
• The New Approach to Security
• Concerns
• Alternative Strategies
• A Security Consultant Walks Into a Datacenter

SLIDE 3

Conclusion

Did Hadoop Get Safer?

SLIDE 4

Conclusion

Hadoop made significant advances but faces several significant challenges

SLIDE 5

What is Hadoop

• MapReduce
• Simplified View
• Who Is Using It

SLIDE 6

MapReduce

• Name Nodes & Data Nodes
  • Data Access
• Job Tracker
  • Job Submission
• Task Tracker
  • Work
• Optional other services
  • Workflow managers
  • Bulk data distribution

SLIDE 7

Simplified View

[Diagram labels: User; Job Tracker; two nodes, each showing Task Tracker, Task, and HDFS]

SLIDE 8

Who is Using It

SLIDE 9

Hadoop Risks

• Insufficient Authentication
• No Privacy & No Integrity
• Arbitrary Code Execution
• Exploit Scenario

SLIDE 10

Insufficient Authentication

• Hadoop did not authenticate users
• Hadoop did not authenticate services
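
To make the missing user authentication concrete, here is a minimal Python sketch of a service that authorizes on whatever user name the client asserts. The `Request` type, the `OWNERS` table, and `can_read` are hypothetical illustrations, not Hadoop code; they only model the trust relationship this slide describes.

```python
from dataclasses import dataclass


@dataclass
class Request:
    claimed_user: str  # filled in by the client and never verified
    path: str


# Hypothetical ownership table standing in for file metadata.
OWNERS = {"/data/alice/report": "alice", "/data/bob/logs": "bob"}


def can_read(req: Request) -> bool:
    """Authorize purely on the name the client sent."""
    return OWNERS.get(req.path) == req.claimed_user


honest = Request(claimed_user="mallory", path="/data/alice/report")
spoofed = Request(claimed_user="alice", path="/data/alice/report")
print(can_read(honest))   # False
print(can_read(spoofed))  # True
```

Because nothing binds `claimed_user` to a credential, the access check is only as strong as the client's honesty.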

SLIDE 11

No Privacy & No Integrity

• Hadoop used insecure network transports
• Hadoop did not provide message level security

SLIDE 12

Arbitrary Code Execution

• Malicious users could submit jobs which would execute with the permissions of the Task Tracker

SLIDE 13

Exploit Scenario

• Alice had access to the Hadoop cluster
• Bob had access to the Hadoop cluster
• Alice and Bob had to trust each other completely
• If Mallory got access to the cluster, Alice and Bob both died in a fire.

SLIDE 14

The New Approach

• Kerberos
• Delegation Tokens
• New Workflow Manager
• Stated Limitations

SLIDE 15

Kerberos

• Users authenticate to the edge of the cluster with Kerberos (via GSSAPI)
• User and group access is maintained in cluster-specific access control lists

SLIDE 16

Delegation Tokens

• To prevent bottlenecks at the KDC, Hadoop uses various tokens internally
  • Delegation Token
  • Job Token
  • Block Access Token
• SASL with an RPC Digest mechanism
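
A rough sketch of why token-based digest authentication avoids a KDC round trip on every request. This is a simplified HMAC challenge-response written in Python, standing in for the SASL digest exchange, which carries more fields than shown here. It assumes the token secret is derived as HMAC-SHA1 over the token ID (the formula the deck gives on the Symmetric Cryptography slide); `MASTER_KEY`, the function names, and the token ID layout are hypothetical.

```python
import hashlib
import hmac
import os

# Hypothetical master key held only by the issuing service.
MASTER_KEY = os.urandom(20)


def issue_token_secret(token_id: bytes) -> bytes:
    """Derive the per-token secret handed out after the Kerberos login."""
    return hmac.new(MASTER_KEY, token_id, hashlib.sha1).digest()


def client_response(token_secret: bytes, nonce: bytes) -> bytes:
    """Client proves possession of the token secret without sending it."""
    return hmac.new(token_secret, nonce, hashlib.sha1).digest()


def server_verify(token_id: bytes, nonce: bytes, response: bytes) -> bool:
    """Server recomputes the secret from the token ID; no KDC is contacted."""
    expected = client_response(issue_token_secret(token_id), nonce)
    return hmac.compare_digest(expected, response)


token_id = b"owner=alice;renewer=jobtracker"  # made-up token ID layout
secret = issue_token_secret(token_id)         # issued once, after Kerberos auth
nonce = os.urandom(16)                        # fresh server challenge
print(server_verify(token_id, nonce, client_response(secret, nonce)))  # True
```

The server needs only its own key and the presented token ID to check the response, which is what lets the KDC stay out of the hot path.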

SLIDE 17

New Workflow Manager

• Oozie
  • Users authenticate using some “pluggable” authentication mechanism
  • Oozie is a superuser and able to communicate with Job Trackers and Name Nodes on behalf of the user.

SLIDE 18

Stated Limitations

• Users cannot have administrator access to nodes in the cluster
• HDFS will not transmit data over untrusted networks
• MapReduce will not transmit data over untrusted networks
• Security changes will not impact GridMix performance by more than 3%.

SLIDE 19

Concerns

• Quality of Protection (QoP)
• Massive Scale Symmetric Cryptography
• Pluggable Web UI Authentication
• IP Based Authentication

SLIDE 20

Quality of Protection (QoP)

• Authentication
• Integrity
• Privacy
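
These three levels appear to line up with the standard SASL quality-of-protection values (`auth`, `auth-int`, `auth-conf`). The Python sketch below assumes that mapping and shows that each level is cumulative; the `QoP` enum, `GUARANTEES` table, and `provides` helper are illustrative names, not part of any Hadoop API.

```python
from enum import Enum


class QoP(Enum):
    AUTHENTICATION = "auth"    # peers are authenticated only
    INTEGRITY = "auth-int"     # plus tamper detection on messages
    PRIVACY = "auth-conf"      # plus encryption of message contents


# Each level includes everything the previous level offers.
GUARANTEES = {
    QoP.AUTHENTICATION: {"authentication"},
    QoP.INTEGRITY: {"authentication", "integrity"},
    QoP.PRIVACY: {"authentication", "integrity", "confidentiality"},
}


def provides(level: QoP, guarantee: str) -> bool:
    """Return True if the chosen level covers the named guarantee."""
    return guarantee in GUARANTEES[level]


print(provides(QoP.AUTHENTICATION, "confidentiality"))  # False
print(provides(QoP.PRIVACY, "integrity"))               # True
```

Under this reading, a cluster running at the lowest level authenticates peers but leaves traffic readable and modifiable on the wire.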

SLIDE 21

Symmetric Cryptography

• Block Access Tokens are used to access data
• TokenAuthenticator = HMAC-SHA1(key, TokenID)
• The secret key must be shared between the Name Nodes and all of the Data Nodes
• SHARED WITH ALL OF THE DATA NODES!!! That is a lot of nodes.
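
The formula above can be sketched in a few lines of Python to show why the shared key matters: anyone who holds it can mint an authenticator for any token ID. The key value and the token ID layout below are made up for illustration, and `data_node_accepts` is a hypothetical stand-in for the check a Data Node would perform.

```python
import hashlib
import hmac


def token_authenticator(key: bytes, token_id: bytes) -> bytes:
    """TokenAuthenticator = HMAC-SHA1(key, TokenID), as given on the slide."""
    return hmac.new(key, token_id, hashlib.sha1).digest()


def data_node_accepts(key: bytes, token_id: bytes, authenticator: bytes) -> bool:
    """Hypothetical Data Node check: recompute and compare in constant time."""
    return hmac.compare_digest(token_authenticator(key, token_id), authenticator)


# Hypothetical key, identical on the Name Node and every Data Node.
shared_key = b"hypothetical-cluster-wide-key"

# An attacker who extracts the key from any one node can forge a token.
forged_id = b"user=mallory;block=blk_42;mode=READ"  # made-up TokenID layout
forged_auth = token_authenticator(shared_key, forged_id)
print(data_node_accepts(shared_key, forged_id, forged_auth))  # True
```

With a symmetric scheme, verifying a token and forging one require the same secret, so every Data Node is as sensitive as the Name Node.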

SLIDE 22

Pluggable Web UI Authentication

• There are multiple web UIs
  • Oozie
  • Job Tracker
  • Task Tracker
• With no standard HTTP authentication mechanism, I hope your developers are up to it.

SLIDE 23

IP Based Authentication

• HDFS proxies use the HSFTP protocol for bulk data transfers
• HDFS proxies are authenticated by IP address
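
A minimal Python sketch of what authentication by IP address amounts to. The allowlist contents and the `is_trusted_proxy` function are hypothetical; the point is only that the decision depends on the source address and nothing else.

```python
import ipaddress

# Hypothetical allowlist of HDFS proxy addresses.
TRUSTED_PROXY_IPS = {ipaddress.ip_address("10.0.0.5")}


def is_trusted_proxy(source_ip: str) -> bool:
    """Accept the caller if, and only if, its source address is on the list."""
    return ipaddress.ip_address(source_ip) in TRUSTED_PROXY_IPS


print(is_trusted_proxy("10.0.0.5"))   # True
print(is_trusted_proxy("10.0.0.99"))  # False
```

No credential is checked, so any host that can send traffic from a listed address is treated as the proxy.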

SLIDE 24

Alternative Strategies

Tahoe

SLIDE 25

Tahoe - A Least Authority File System

• Deserves its own talk
  • Aaron Cordova gave one at HadoopWorld NYC 2009
• Disk is not trusted
• Network is not trusted
• Memory is trusted
• Intended for use in Infrastructure as a Service cloud computing environments
• Write performance is terrible but read performance is not so bad

SLIDE 26

Assessing Hadoop

• Targets
• Tokens

SLIDE 27

Targets

• Oozie is a superuser capable of performing any operation as any user
• Compromising a Name Node or Data Node exposes the shared “secret key”, which gives access to all of the data stored in HDFS
• Data may be transmitted over insecure transports including HSFTP, FTP and HTTP
• Stealing the IP of an HDFS Proxy could allow one to extract large amounts of data quickly

SLIDE 28

Tokens: Gotta Catch ’em All

• Kerberos Ticket Granting Ticket
• Delegation Token
  • Get the Shared Key if Possible
• Job Token
  • Get the Shared Key if Possible
• Block Access Token
  • Get the Shared Key if Possible

SLIDE 29

Thank you for coming!

andrew@isecpartners.com