

SLIDE 1

SLIDE 2

Achieving High Throughput and Scalability with JRuby

Fernando Castano (fernando.castano@sun.com), Sun Microsystems

SLIDE 3

Agenda

  • What is Project Kenai
  • Early tests and re-architecture
  • How, where and what we benchmark
  • Tuning our stack
  • References
  • Q&A
SLIDE 4

Project Kenai (Kenai.com)

  • Project Kenai is a platform for:
  • Developer Collaboration and Tools as a Service
  • Enables building communities for the “connected developer”
  • Integrated collaboration services stack
  • We develop Project Kenai using Kenai
  • Features (per project):
  • SCM (SVN, Hg)
  • Bug Tracking
  • Forums
  • Wiki
  • Mailing Lists
SLIDE 5

First Design: Junction1

[Architecture diagram: Apache2 in front of Junction1; Junction1 exchanges XML with the services tier (JIRA issues, Bugzilla, SVN/Hg SCM, Sympa mailing lists, Solr search, wiki, forum, auth API) and renders HTML]

SLIDE 6

Simple Test: Junction1

  • why so slow?
  • mpstat + jstack
  • too chatty
  • XML expensive
  • JSON slow too
  • CPU hungry
  • no CPU scaling
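A minimal sketch of the diagnosis implied by the mpstat + jstack bullet (the Java PID lookup is hypothetical):

    # watch per-CPU utilization while the test runs
    mpstat 5
    # snapshot the app server's thread stacks to see where time goes
    jstack $(pgrep -f java | head -1) > stacks.txt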
SLIDE 7

Improved Design: Junction2

[Architecture diagram: Apache2 in front of Junction2; Junction2 serves the API/HTML directly and fronts the services (JIRA issues, Bugzilla, SVN/Hg SCM, Sympa mailing lists, Solr search, auth)]

SLIDE 8

Simple Test: Junction2

  • no chatter
  • better CPU usage
  • CPU scales
  • much better
SLIDE 9

Infrastructure

  • Sun Fire T2000 (web and app tier)
  • 8 cores x 4 threads @ 1.4 GHz
  • Sun Fire X4500 (storage)
  • 4 AMD cores, 9.7 TB mirrored, NFS server
  • OpenSolaris Nevada build 70b
  • containers
  • SMF
  • ZFS (Solaris feature)
  • storage pool with RAID-Z
  • NFS protocol
  • snapshots
  • Cool Stack and Blastwave packages (~ LAMP stack)
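A hedged sketch of the ZFS setup these bullets describe (pool, disk, and filesystem names are hypothetical):

    # RAID-Z storage pool on the X4500, exported over NFS
    zpool create kenai raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0
    zfs create kenai/repos
    zfs set sharenfs=on kenai/repos
    # point-in-time snapshot, e.g. before an upgrade
    zfs snapshot kenai/repos@pre-upgrade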
SLIDE 10

Workload Definition

  • statistics from one of Sun's busiest collaboration sites
  • less than 2,000,000 trans/month (~46 trans/min)
  • less than 800 logins/day
  • extracted mix of activity (R/W = 80/20)
  • Requirements:
  • avg response time for 90% of requests in steady state less than 2 sec
  • 500 projects and 1,000 concurrent users
  • match the 80/20 mix
  • achieve at least 2,000 trans/min
  • randomized activities for each user
  • don't fetch static content (images, JSP, etc.)
  • no think time for now
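A back-of-envelope check of the stated rates, assuming a 30-day month:

    # 2,000,000 trans/month over 30*24*60 = 43,200 minutes
    echo 'scale=1; 2000000 / (30*24*60)' | bc   # ~46.3 trans/min observed
    # the 2,000 trans/min target is roughly 43x that observed rate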
SLIDE 11

Kenai Benchmark Kit

  • jmeter chosen (vs Faban and loadrunner)
  • gnuplot + light scripting for reporting
  • beanshell vs TCP server (for forking unix commands)
  • not requesting embedded objects (no cache)
  • dtrace very helpful (permspace, io, mysql, etc)
  • collect mpstat, vmstat, trapstat, netsum, iostat, ... (~ nagios)
  • save everything and document changes
  • scale 1 dimension at the time
  • stickshift profiling (or newrelic) very useful
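One plausible shape for the gnuplot + light-scripting reporting (the CSV file and its columns are hypothetical):

    # plot trans/min against concurrent users from a JMeter summary
    gnuplot <<'EOF'
    set terminal png; set output 'tpm_vs_users.png'
    set datafile separator ','
    set xlabel 'concurrent users'; set ylabel 'trans/min'
    plot 'summary.csv' using 1:2 with linespoints title 'Junction2'
    EOF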
SLIDE 12

Baselines

  • single thread
  • exclusive operation
  • prstat (-L -m -p)
  • jstack
  • Stickshift

Operation           Baseline (sec)   Comment
Login               0.45
Logout              0.26
home                0.16
people              0.17
update profile      -                internal error
project create      -                internal error
projects            0.43             parameter show=5
hg_del              5.30
hg_pull             3.10             recurring proxy error
hg_push             6.90
svn_del             5.04
svn_pull            3.05             recurring proxy error
svn_push            12.06
Forum_Edit          1.03             OASIS-1625 (out of memory)
Forum_Topic_Show    0.64
Forum_Topics_List   1.90
Wiki_Post           1.18             short wiki, regex bug, 401 returned & jsession lost
Wiki_verify         0.68             view + assertion
Wiki_view           0.42

  • overhead
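A sketch of the per-thread measurement implied by prstat (-L -m -p); the PID lookup is hypothetical:

    # per-thread microstate accounting while one operation runs in isolation
    prstat -mLp $(pgrep -f glassfish | head -1) 5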
SLIDE 13

Response Time vs users

SLIDE 14

trans/min vs users

SLIDE 15

CPU vs users

SLIDE 16

Application server at peak

  • vmstat and prstat
SLIDE 17

2 Application servers

SLIDE 18

High Availability strategy

  • Web tier
  • 2 servers with Apache2 (hardware load balancer)
  • Application tier
  • 2 or more servers (Apache2 in web tier load balancing)
  • 1 GlassFish with 6 domains (JVMs) on each app server
  • Feature server (Sympa, Bugzilla, search)
  • active-standby with manual failover (change DNS alias)
  • MySQL 5.0.45 database
  • active-standby with manual failover (change DNS alias)
  • local database (146 GB), replication coming soon
  • NFS server
  • active-standby with rsync and manual failover (DNS change)
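A minimal sketch of the rsync-based NFS standby described above (hosts and paths are hypothetical):

    # periodically mirror the primary's exports to the standby
    rsync -az --delete /export/kenai/ standby:/export/kenai/
    # failover = repoint the DNS alias at the standby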
SLIDE 19

Low Level Tuning

  • OpenSolaris (build 70b)
  • maxusers=4096
  • TCP tuning in web tier (per spec.org T2000 publications)
  • use FX scheduler in app tier: priocntl -s -c FX -i all
  • 8K blocksize for ZFS pool on the NFS server
  • Java 1.6
  • -server, LargePageSizeInBytes=256m
  • ParallelGC, AggressiveOpts, MaxPermSize=512m
  • Xmx = Xms = 2560m
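The same knobs sketched as commands and JVM flags (the ZFS dataset name is hypothetical; flags expanded from the shorthand above):

    # /etc/system: set maxusers=4096 (reboot required)
    # FX scheduler for all processes in the app tier
    priocntl -s -c FX -i all
    # 8K recordsize for the NFS-served pool
    zfs set recordsize=8K kenai/repos
    # per-domain JVM options
    JAVA_OPTS='-server -Xms2560m -Xmx2560m -XX:MaxPermSize=512m \
      -XX:+UseParallelGC -XX:+AggressiveOpts -XX:LargePageSizeInBytes=256m'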
SLIDE 20

More Tuning

  • Apache 2.2.8
  • built our own (Studio compiler with -fast)
  • using the prefork module (threaded MPM not so good for us)
  • MaxClients = ServerLimit = 600
  • 4 virtual hosts to serve static content (jpg, jsp, etc.)
  • proxy balancing with sticky sessions
  • memcached 1.1.12
  • so far only for SCM permissions
  • adding more as needed if SQL becomes heavy
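A hedged sketch of the prefork limits and sticky-session balancer (backend hosts and routes are hypothetical):

    cat >> conf/httpd.conf <<'EOF'
    <IfModule prefork.c>
      ServerLimit 600
      MaxClients  600
    </IfModule>
    <Proxy balancer://kenai>
      BalancerMember http://app1:8080 route=jvm1
      BalancerMember http://app2:8080 route=jvm2
    </Proxy>
    ProxyPass / balancer://kenai/ stickysession=JSESSIONID
    EOF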
SLIDE 21

JRuby 1.1.3 (Rails 2.1) Tuning

  • need many runtimes on the T2000
  • First approach: one 32-bit JVM with 20 runtimes
  • runtimes are memory hungry (20 MB + objects)
  • expensive and frequent full GCs
  • bad performance
  • Second approach:
  • use 6 to 8 GlassFish domains per app server
  • deploy only 5 runtimes per domain (JVM)
  • full GCs under control; can use more memory (32 GB available)
  • compile.mode=JIT
  • objectspace.enabled=false
  • bugs fixed: permspace, joni, ActiveRecord (found with DTrace + prstat)
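Both JRuby settings are plain JVM system properties; a sketch of setting them per GlassFish domain with asadmin:

    asadmin create-jvm-options '-Djruby.compile.mode=JIT'
    asadmin create-jvm-options '-Djruby.objectspace.enabled=false'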
SLIDE 22

Glassfish Tuning

  • 5 acceptor-threads
  • 5 request-processing threads (and Warbler)
  • connection-pool validation = table
  • accepts lots of connections
  • connection-pool queue-size-in-bytes=30000
  • connection-pool max-pending-count=30000
  • -Dcom.sun.enterprise.server.ss.ASQuickStartup=false
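A sketch of these settings via asadmin (dotted attribute names as in GlassFish v2; the listener name is an assumption):

    asadmin set server.http-service.http-listener.http-listener-1.acceptor-threads=5
    asadmin set server.http-service.request-processing.thread-count=5
    asadmin set server.http-service.connection-pool.queue-size-in-bytes=30000
    asadmin set server.http-service.connection-pool.max-pending-count=30000
    asadmin create-jvm-options '-Dcom.sun.enterprise.server.ss.ASQuickStartup=false'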
SLIDE 23

MySQL 5.0.45 Tuning

  • so far, query cache hit rate is 98%
  • CPU usage < 10%
  • planning to move to 64-bit MySQL
  • 32 GB of RAM available for buffers
  • ZFS/NFS slow compared to an FC storage array
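One way to verify the quoted query-cache hit rate (a sketch; credentials omitted):

    # hit rate ~ Qcache_hits / (Qcache_hits + Com_select)
    mysql -e "SHOW GLOBAL STATUS LIKE 'Qcache_hits'; SHOW GLOBAL STATUS LIKE 'Com_select'"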
SLIDE 24

Benchmark constantly or ...

SLIDE 25

Project Kenai live

SLIDE 26

References

  • Nick Sieger (team leader)
  • http://blog.nicksieger.com
  • Dtrace toolkit
  • http://opensolaris.org/os/community/dtrace/dtracetoolkit/
  • More Kenai performance details
  • http://jfdo.blogspot.com
  • Project Kenai
  • http://kenai.com
  • Solaris Internals (Richard McDougall)
  • http://www.solarisinternals.com
SLIDE 27

Q&A