Security Service Challenge & Security Monitoring Jinny Chien - - PowerPoint PPT Presentation


SLIDE 1

Enabling Grids for E-sciencE

Training and Dissemination

www.eu-egee.org

Security Service Challenge & Security Monitoring

Jinny Chien, ASGC (Academia Sinica Grid Computing)
OSCT Security Workshop, 7th March, Taipei

SLIDE 2

Motivation

  • After today’s training, we expect you to be able to:

– Handle the Incident Response Procedure
– Ensure communication channels with the involved admins are in place
– Deal with sudden security attacks
– Etc.

  • Overview

– Introduction
– Security Service Challenge
– Security Monitoring
– Conclusion

SLIDE 3

Security Service Challenge (SSC)

  • The objective:

The goal of the LCG/EGEE Security Service Challenge is to investigate whether sufficient information is available to conduct an audit trace as part of an incident response, and to ensure that appropriate communication channels are available.

  • The concept:

First, the CERN security team submits a test job to specific sites. The site security contact must then investigate according to the clues provided and reply with the answers within a limited time. In general, the challenge is executed once a year.

SLIDE 4

SSC-Objective

SLIDE 5

Stages / Role of SSC

  • Stages of the SSC
  • 1. Security Challenge targeting the principal site of each of the LCG/EGEE Regional Operation Centers (ROC)
  • 2. Security Challenge targeting the individual sites in each ROC
  • Roles
  • 1. The Test Operator (TOP), who submits the challenging job, issues the alert, escalates the alert as required and checks the response.
  • 2. The Security Contact of the target site, who receives and acknowledges the alert, makes the necessary investigation and submits the response back to the TOP.

SLIDE 6

SSC

The challenge is executed by submitting a Grid Job from a User Interface (UI).

  • SSC level 1 :

– challenges the Workload Management System(WMS) of the Grid: Resource Broker(RB) and Computing Element(CE)

  • SSC level 2 :

– challenges the Storage Elements(SE) on the Grid

  • SSC level 3 :

– challenges the Operational Diligence of the LCG/EGEE Grid Sites

  • SSC level 4 : coming soon
  • Materials for SSC

– The materials are available for download from https://twiki.cern.ch/twiki/bin/view/LCG/LCGSecurityChallenge

SLIDE 7

SSC Common Setup

  • SSCs were run in two stages:

– Stage 1: targeting the principal sites in the regions
– Stage 2: targeting the individual sites in each ROC

  • The jobs were submitted from a User Interface (UI) to a chosen Grid Computing Element (CE) via a Resource Broker (RB) using standard Grid commands.
  • They consist of a set of small, non-intrusive programs.
  • Not intrusive: only ‘legal’ operations are executed (job submission, file transfer, …).
  • No penetration tests, no execution of exploits, etc.

SLIDE 8

Security Service Challenge 1

SLIDE 9

SSC-1 Objective and Setup

  • SSC-1 (2005 to March 2006) targeted the Workload Management System (WMS): Resource Broker (RB) and Computing Element (CE).
  • It tested whether sufficient information was available and whether communication channels were sufficiently open.
  • It did not address the Security Incident Response Procedure.
  • It used Savannah as the vehicle for communication between the Test Operator (TOP) and the target sites.

SLIDE 10

SSC-1 - Task

  • Given: time range, IP address of the target computer, UNIX-UID of the challenging job on the target
  • The sites had to find out:
  • 1. The DN of the grid credentials/certificate used by the job submitter
  • 2. The IP address of the submitting network device (UI)
  • 3. The name of the executable which ran on the target computer
  • 4. The date and the precise time when the executable ran
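
The audit trace asked for here amounts to filtering site logs by the given UNIX-UID and time window. The following is a minimal Python sketch only: the one-line record layout, the sample DN, the IP address and the executable name are all invented for illustration, and real RB/CE logs are spread over several files with different formats. The UID, date and UTC window are the ones quoted in the sample alert.

```python
import re
from datetime import datetime, timezone

# Hypothetical record layout, invented for this sketch:
#   <ISO timestamp> uid=<n> dn="<DN>" src=<IP> exe=<name>
LINE_RE = re.compile(
    r'^(?P<ts>\S+) uid=(?P<uid>\d+) dn="(?P<dn>[^"]*)" '
    r'src=(?P<src>\S+) exe=(?P<exe>\S+)$'
)

def trace_job(lines, uid, start, end):
    """Return the records for `uid` whose timestamp lies in [start, end]."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or int(m["uid"]) != uid:
            continue  # unparsable line or a different local account
        ts = datetime.fromisoformat(m["ts"])
        if start <= ts <= end:
            hits.append({
                "time": ts,                # answer 4: date and time
                "dn": m["dn"],             # answer 1: submitter DN
                "ui_ip": m["src"],         # answer 2: submitting UI
                "executable": m["exe"],    # answer 3: executable name
            })
    return hits

# Made-up log lines; only the UID and the time window come from the alert.
SAMPLE_LOG = [
    '2006-03-08T08:25:10+00:00 uid=18118 dn="/C=XX/O=Example/CN=Test Operator" src=192.0.2.10 exe=ssc_probe.sh',
    '2006-03-08T08:26:00+00:00 uid=18120 dn="/C=XX/O=Example/CN=Someone Else" src=192.0.2.20 exe=analysis.sh',
    '2006-03-08T09:00:00+00:00 uid=18118 dn="/C=XX/O=Example/CN=Test Operator" src=192.0.2.10 exe=later.sh',
]
START = datetime(2006, 3, 8, 8, 23, 0, tzinfo=timezone.utc)
END = datetime(2006, 3, 8, 8, 34, 0, tzinfo=timezone.utc)

if __name__ == "__main__":
    for hit in trace_job(SAMPLE_LOG, 18118, START, END):
        print(hit)
```

Keeping all timestamps timezone-aware avoids mixing local time and UTC when the alert states its window in UTC.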

SLIDE 11

Sample: SSC-1

  • Subject: Security Service Challenge
  • Local date and time of request creation:
  • 2006-03-08 10:38:39 (CET, UTC+2)
  • Initials of test operator: psa
  • Dear LCG/EGEE Site Security Officer,
  • This e-mail constitutes a security service challenge
  • alert. You have received this because you have opened
  • an e-mail destined to this site's security officer. In
  • case you are not the security officer of this site,
  • please forward this e-mail to -
  • aproc-security@list.grid.sinica.edu.tw
  • just stating so. This will allow us to improve our
  • procedures, and we thank you in advance.
  • ……
  • We thank you for your collaboration,
  • Date - 2006-03-08
  • and time period of challenge,
  • between: 08:23:00 -and- 08:34:00 UTC
  • Virtual Organization (VO):
  • LCG/EGEE siteName:
  • Resource Broker (RB):
  • Regional Operation Center (ROC):
  • IP-address of the target computer:
  • lcg00189.grid.sinica.edu.tw
  • UNIX-UID of challenging job on target: 18118
  • --- Security_Service_Challenge_Description
  • Within the time period indicated above, a security
  • service challenge was launched on your site. The
  • UNIX-UID on the target computer as noted above, was
  • associated with the challenge.
SLIDE 12

SSC-1 in AP

  • Executed time : 2006/3/5 – 2006/3/13
  • Targeted Sites :
  • Australia-UNIMELB-LCG2
  • GOG-Singapore
  • INDIACMS-TIFR
  • LCG_KNU
  • Taiwan-IPAS-LCG2
  • Taiwan-NCUCC-LCG2
  • TOKYO-LCG2
  • TW-NCUHEP

– 8 sites in total

  • The final report

– https://twiki.cern.ch/twiki/pub/LCG/SSC1/SSC_1_Debrief_2006-04-18.pdf

SLIDE 13

Security Service Challenge 2

SLIDE 14

SSC-2 Objective and Setup

  • SSC-2 (2007) tested the traceability of storage operations.
  • From the Worker Node (WN), a sequence of seven storage operations was executed.

– lcg_crx, lcg_lgx, lcg_repx, lcg_rx, lcg_cpx, lcg_delx

  • It did not address the Security Incident Response Procedure.
  • It used the Global Grid User Support (GGUS) as the vehicle for communication between the Test Operator and the target sites.

SLIDE 15

SSC-2 - Task

  • Given: user DN, time range and SE
  • The sites had to find out:
  • 1. For each of the identified storage operations:
  • The exact time (UTC)
  • The type of operation
  • The URLs, filenames, catalog names and file paths involved
  • 2. The IP address of the User Interface (UI) that was used for the job submission
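
Because the answer must give exact times in UTC while site logs are often written in local time, a small conversion step helps when assembling the reply. This is a sketch under stated assumptions: the tuple layout, the file paths and the UTC+8 offset are hypothetical, and only the operation names are taken from the SSC-2 slide.

```python
from datetime import datetime, timedelta, timezone

def to_utc(local_stamp, utc_offset_hours):
    """Read a naive 'YYYY-MM-DD HH:MM:SS' stamp in the site's zone, return UTC."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(local_stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=zone)
    return local.astimezone(timezone.utc)

def build_report(operations, utc_offset_hours):
    """operations: iterable of (local_stamp, operation, path); returns sorted lines."""
    rows = sorted(
        (to_utc(stamp, utc_offset_hours), op, path)
        for stamp, op, path in operations
    )
    return [f"{t:%Y-%m-%d %H:%M:%S} UTC  {op}  {path}" for t, op, path in rows]

# Hypothetical entries as a site at UTC+8 might find them in its SE logs.
OPERATIONS = [
    ("2007-04-20 16:07:30", "lcg_delx", "/grid/examplevo/ssc2/file1"),
    ("2007-04-20 16:05:00", "lcg_crx", "/grid/examplevo/ssc2/file1"),
]

if __name__ == "__main__":
    for line in build_report(OPERATIONS, 8):
        print(line)
```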

SLIDE 16

SSC-2 in AP

  • Executed time : 2007/4/20 – 2007/5/4
  • Targeted Sites :

– 18 sites, 8 countries

  • The procedure is described at

http://lists.grid.sinica.edu.tw/apwiki/Security_Service_Challenge?highlight=%28security%29

  • The final report can be found at

https://twiki.cern.ch/twiki/pub/LCG/SSC2/SSC_2_Stage_2_Report_AsiaPacific.pdf

SLIDE 17

The result of SSC2

Site name                Status  Reply  Feedback
Australia-UNIMELB-LCG2   OK      YES    YES
GOG-Singapore            Error   NO     NO
HK-HKU-CC-01             OK      YES    YES
IN-DAE-VECC-01           OK      NO     NO
INDIACMS-TIFR            Error   NO     NO
JP-KEK-CRC-01            Error   NO     NO
JP-KEK-CRC-02            OK      YES    NO
KR-KISTI-GCRT-01         OK      YES    YES
LCG_KNU                  OK      YES    NO
NCP-LCG2                 OK      YES    YES
PAKGRID-LCG2             OK      YES    NO
Taiwan-IPAS-LCG2         OK      YES    NO
Taiwan-NCUCC-LCG2        OK      YES    YES
TOKYO-LCG2               OK      YES    YES
TW-FTT                   Error   NO     NO
TW-NTCU-HPC-01           OK      YES    YES
TW-NIU-EECS-01           OK      YES    NO
TW-NCUHEP                OK      NO     NO

Status: (1) Error – could not submit an SSC job (2) OK – success
Reply: (1) Yes – replied with the answer (2) No – did not reply
Feedback: (1) Yes – provided feedback (2) No – did not provide feedback
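
The result table can be tallied with a few lines of Python. The rows below are copied from the SSC-2 table on this slide; the function name and the dictionary keys are just illustrative choices.

```python
from collections import Counter

# (site, status, reply, feedback), copied from the SSC-2 result table.
RESULTS = [
    ("Australia-UNIMELB-LCG2", "OK", "YES", "YES"),
    ("GOG-Singapore", "Error", "NO", "NO"),
    ("HK-HKU-CC-01", "OK", "YES", "YES"),
    ("IN-DAE-VECC-01", "OK", "NO", "NO"),
    ("INDIACMS-TIFR", "Error", "NO", "NO"),
    ("JP-KEK-CRC-01", "Error", "NO", "NO"),
    ("JP-KEK-CRC-02", "OK", "YES", "NO"),
    ("KR-KISTI-GCRT-01", "OK", "YES", "YES"),
    ("LCG_KNU", "OK", "YES", "NO"),
    ("NCP-LCG2", "OK", "YES", "YES"),
    ("PAKGRID-LCG2", "OK", "YES", "NO"),
    ("Taiwan-IPAS-LCG2", "OK", "YES", "NO"),
    ("Taiwan-NCUCC-LCG2", "OK", "YES", "YES"),
    ("TOKYO-LCG2", "OK", "YES", "YES"),
    ("TW-FTT", "Error", "NO", "NO"),
    ("TW-NTCU-HPC-01", "OK", "YES", "YES"),
    ("TW-NIU-EECS-01", "OK", "YES", "NO"),
    ("TW-NCUHEP", "OK", "NO", "NO"),
]

def summarize(rows):
    """Count sites per status and how many replied / gave feedback."""
    status = Counter(row[1] for row in rows)
    return {
        "sites": len(rows),
        "ok": status["OK"],
        "error": status["Error"],
        "replied": sum(1 for row in rows if row[2] == "YES"),
        "feedback": sum(1 for row in rows if row[3] == "YES"),
    }

if __name__ == "__main__":
    print(summarize(RESULTS))
```

For this table the tally is 18 sites: 14 OK, 4 Error, 12 replies and 7 sites that provided feedback.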

SLIDE 18

Security Service Challenge 3

SLIDE 19

Preparing/Running Regional SSC3

TestOperator (TOp) is attacker and incident coordinator and ...

– Get/install the SSC software from the svn repository.

  • Malicious binary (might need some tweaking)
  • Job-submission framework (scripts). Available for gLite, globus (Aashish).
  • Job-monitoring webserver.

– Certificate, VO and all the rest.

  • Get a (short-lived) grid certificate for the TOp.
  • Negotiate an identity used for the TOp with a VO (this VO has to be supported by all sites).
  • Make sure the default communication channels to the sites to be challenged work.
  • Check for sufficient queue length/WallClockTime: 72h is ideal; anything less needs some additional tweaking but is possible. The minimum is 12h.
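
The queue-length rule of thumb (72h ideal, 12h minimum) can be written down as a small check. This is only a sketch: the function name, the labels and the idea of feeding it per-queue limits in hours are assumptions, not part of the SSC toolkit.

```python
def classify_walltime(limit_hours):
    """Classify a queue's wall-clock limit against the 72h / 12h guidance."""
    if limit_hours >= 72:
        return "ok"
    if limit_hours >= 12:
        return "needs tweaking"
    return "too short"

if __name__ == "__main__":
    # Hypothetical queue limits in hours.
    for name, hours in {"long": 72, "medium": 48, "short": 6}.items():
        print(name, classify_walltime(hours))
```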

SLIDE 20

SSC-3 Objective and Setup

  • SSC-3 is a more realistic simulation of an incident; it challenges the Operational Responsiveness of LCG/EGEE Grid Sites.
  • The job is launched from a User Interface (UI):

– It runs with valid credentials.
– Once running, it will exploit its environment to conceal its activities.
– Signs of life will be reported through an out-of-band channel.

SLIDE 21

SSC-3 Objective and Setup II

Alert

  • The Alert is sent to the CSIRT e-mail address registered in the Grid Operations Center Data Base (GOCDB)

– The text clearly identifies the alert as a test.
– The Grid identity of the submitting user is indicated.
– The Site is asked to deal with the Alert following approved Incident Response Procedures.

  • Send alert mails to :

– VO managers 4 weeks ago – Alert-mail to sites – roc-security-contact to 2 weeks ago

SLIDE 22

SSC-3 –Incident Response

The Incident Response is broken up into three activities:

  • Communication

– Acknowledgment/heads-up report to the indicated e-mail address.
– Alert to the VO manager.
– Verification that the responsible Certification Authority (CA) has been notified.
– Filing of the final report.

  • Containment

– Identification of the job and killing of its processes.
– Suspension of the offending user at the challenged site.

  • Forensics

– Discovery of the emitting site and contact with that site's CSIRT.
– Analysis of network traffic.
– Analysis of the submitted binaries.
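
For the containment step, the first task is to identify which processes belong to the local account the job was mapped to. A minimal Linux-only sketch follows, assuming the owner of each `/proc/<pid>` directory reflects the process owner (usually, but not always, the case). It covers identification only; actually killing the processes, for example with `os.kill`, is left to the site admin once the list has been reviewed.

```python
import os

def pids_owned_by(uid, proc_root="/proc"):
    """List PIDs whose entry under `proc_root` is owned by `uid`."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            owner = os.stat(os.path.join(proc_root, name)).st_uid
        except OSError:
            continue  # the process exited between listdir() and stat()
        if owner == uid:
            pids.append(int(name))
    return sorted(pids)

if __name__ == "__main__":
    # Example: list the processes of the account running this script.
    print(pids_owned_by(os.getuid()))
```

The `proc_root` parameter exists only so the function can be exercised against a scratch directory.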

SLIDE 23

SSC-3 in AP


  • Receive a ticket from GGUS
  • Send a notification to ROC
  • Initial analysis and classification
  • Contact Certification Authority manager
  • Contact Virtual Organization manager
  • Post-incident analysis
SLIDE 24

Result of SSC3

SLIDE 25

Comment for SSC

  • Material for SSC

– The material is available for download from https://twiki.cern.ch/twiki/bin/view/LCG/LCGSecurityChallenge

  • More details at OSCT public web

– http://osct.web.cern.ch/osct/ssc.html

  • SSC4 is coming soon

SLIDE 26

Security Monitoring

SLIDE 27

Security Monitoring


Goals

  • Detecting operational problems or even incidents
  • Help sites to keep their resources secure

– Warning sites exposing vulnerabilities

  • Only a basic set of probes currently
  • Main focus on higher levels (ROC, project)

– Provide the project and ROC (OSCT) with information about site status – not concerned with site level

  • No special privileges required from sites

– Only public interfaces used

  • https://twiki.cern.ch/twiki/pub/LCG/OSCT-EGEEIII-tasks/security-monitoring-v0.12.pdf

SLIDE 28

Security Monitoring


Current Status

  • A few SAM tests used

– CRL, file permission checks, Pakiti (patching status)
– Results encrypted and only available to ROC security contacts

  • Further focus on Nagios-based framework

– Project and ROC view

  • SAM probes ported
  • Tests to be launched from ROC-level Nagios

– Results collected in a standard way via message bus

  • Encryption must be applied

– Access allowed to ROC security contacts and site admins

  • Synchronized with GOC DB

– Hopefully new probes will be developed
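
To illustrate what a file permission check can look like, here is a minimal sketch that lists world-writable regular files under a directory. It is not the actual SAM or Nagios probe, whose implementation is not shown in these slides; the function name and the choice of "world-writable" as the test are assumptions.

```python
import os
import stat

def world_writable(root):
    """Return regular files under `root` that any user may write to."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(mode) and mode & stat.S_IWOTH:
                found.append(path)
    return sorted(found)

if __name__ == "__main__":
    for path in world_writable("."):
        print("world-writable:", path)
```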

SLIDE 29

Incident statistics


  • A number of local root exploits released in 2009
  • Main entry points:
  • Compromised user accounts at other sites (very difficult to control)
  • Vulnerable Web applications
  • Weak passwords (!)
  • Main escalation factors (= how the attacker got root)
  • Failure to apply security patches (Pakiti does help here)
  • Weak passwords (!)
SLIDE 30

Recent patching campaigns

  • Lots of efforts to eliminate critical vulnerabilities in 2009
  • Most common reasons for not patching were:

– In the majority of cases, this was due to a communication problem (the recipients of our alerts, in the ROCs, at the sites, etc. thought somebody else would take care of it)
– Only part of the farm was upgraded for some reason
– Some tried an exploit that did not work and concluded they were safe
– Some did not understand or agree with the implications of the risk and ignored our alerts
– Some thought they had closed the job queues and were surprised that (malicious) jobs could still be submitted
– Some upgraded, but did not reboot the hosts
– A very small number of sites reported they could not upgrade due to missing third-party drivers
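
The "upgraded but did not reboot" case can be caught by comparing the running kernel with the newest installed one. The sketch below assumes release strings of the form `2.6.18-164.el5` and uses a simplified ordering (numeric chunks compared as integers), not RPM's own version comparison; the sample release strings are made up.

```python
import re

def version_key(release):
    """Split a release string into comparable chunks (ints sort before text)."""
    return [
        (0, int(part)) if part.isdigit() else (1, part)
        for part in re.findall(r"\d+|[A-Za-z]+", release)
    ]

def needs_reboot(running, installed):
    """True if some installed kernel is newer than the running one."""
    newest = max(installed, key=version_key)
    return version_key(running) < version_key(newest)

if __name__ == "__main__":
    # Hypothetical host: the new kernel is installed, the old one still runs.
    print(needs_reboot("2.6.18-128.el5", ["2.6.18-128.el5", "2.6.18-164.el5"]))
```

On a real host, `running` could come from `platform.release()`; how the list of installed kernels is obtained depends on the distribution.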

SLIDE 31


Improve

SLIDE 32

Pakiti

  • Security Patching status monitoring
  • Simple design:

– A lightweight, unprivileged, shell client sends data to a server:

  • List of installed packages (“rpm -qa”)
  • Running kernel and operating system version
  • The Pakiti client DOES NOT modify/patch the system

– The Pakiti server:

  • Collects security + repository data from vendors
  • Compares the input from the client and the repo information
  • Concludes on the missing packages and applicable CVEs
  • Displays the results on a Web interface and offers many views/search options

– Pakiti can help with many common issues:

  • Is my cluster fully patched? Is there any node where auto-update is broken?

  • Do I have any node vulnerable to CVE-2010-1234?
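
The server-side comparison described on this slide (client package list against vendor advisory data) can be sketched as follows. This is not Pakiti's code: the data structures, the package versions and the advisory entries are invented for illustration, `CVE-2010-1234` is simply the placeholder used on the slide, and the version ordering is a simplification of RPM's rules.

```python
import re

def version_key(version):
    """Split a version string into comparable chunks (ints sort before text)."""
    return [
        (0, int(part)) if part.isdigit() else (1, part)
        for part in re.findall(r"\d+|[A-Za-z]+", version)
    ]

def missing_patches(installed, advisories):
    """installed: {package: version}; advisories: [(advisory_id, package, fixed_version)].

    Returns the advisories whose fixed version is newer than what is installed.
    """
    findings = []
    for advisory_id, package, fixed in advisories:
        have = installed.get(package)
        if have is not None and version_key(have) < version_key(fixed):
            findings.append((advisory_id, package, have, fixed))
    return findings

# Hypothetical client report (as could be derived from "rpm -qa") and advisories.
INSTALLED = {"kernel": "2.6.18-128", "openssl": "0.9.8e-12"}
ADVISORIES = [
    ("CVE-2010-1234", "kernel", "2.6.18-164"),
    ("EXAMPLE-0001", "openssl", "0.9.8e-12"),
]

if __name__ == "__main__":
    for finding in missing_patches(INSTALLED, ADVISORIES):
        print(finding)
```

Doing the comparison on the server keeps the client small and unprivileged: it only reports what is installed and never changes the system.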

SLIDE 33

Security Monitoring


Pakiti (cont.)

  • Open source tool to check patching status

– http://sourceforge.net/projects/pakiti/
– Any site can run its own Pakiti server to monitor internal machines

  • Server evaluates packages installed on clients

– Detects security patches not applied
– Allows searching for particular vulnerabilities (CVE)

  • Proved very useful recently (CVE-2009-2692, CVE-2009-2698)
  • Currently maintained by OSCT

– A lot of improvements applied recently
– New version designed and prototyped during summer

  • OSCT operates Pakiti server for EGEE

– Information collected with SAM/Nagios probes (WNs)
– Attention: only OSCT members are allowed access

SLIDE 34

EGEE09: Security Monitoring


Pakiti (cont.)

  • Pakiti server

– https://pakiti.cern.ch/
– Data collected by production SAM probes (4500 hosts)
– Any OSCT member can ask for access
– Check the results and talk to sites – avoid miscommunications (PMB)

  • Maintenance, development

– New version prototyped

  • Site installation possible

– New release is now available to all from SourceForge

  • Metrics for proper evaluation missing

– Many vulnerable packages often cause no harm

SLIDE 35

Security Monitoring


Pakiti Results

  • 4500 machines (all ROCs represented)
  • Only 135 sites fully patched

– Note that not all unpatched sites are vulnerable!

SLIDE 36

New Release

  • Pakiti has been used internally by the OSCT to track CVE-2009-3547, CVE-2009-2692, CVE-2009-2698, etc.
  • Pakiti 2.1 is now available to all from SourceForge: http://pakiti.sourceforge.net/

SLIDE 37

Conclusion

SSC

  • The challenge comes from the EGEE Operational Security Coordination Team (OSCT)
  • The goal of the LCG/EGEE Security Service Challenge is to check whether sufficient information is available to conduct an audit trace as part of an incident response, and to ensure that appropriate communication channels are available
  • SSC4 is coming soon!

Pakiti

  • The open-source code can be found at http://sourceforge.net/projects/pakiti/
  • Security Patching status monitoring
  • Any site can run its own Pakiti server to monitor internal machines
  • Do not forget to restart your hosts after a kernel update

SLIDE 38

Reference

  • OSCT public webpage: http://osct.web.cern.ch/osct/
  • Security Service Challenge: https://twiki.cern.ch/twiki/bin/view/LCG/LCGSecurityChallenge
  • Incident Response Procedure: https://edms.cern.ch/file/428035/LAST_RELEASED/Incident_Response_Guide.pdf
  • The SSC toolkit: https://twiki.cern.ch/twiki/bin/view/LCG/LCGSecurityChallenge
  • Pakiti source: https://www.sf.net/projects/pakiti

SLIDE 39

Question
