SLIDE 1

BIO PRESENTATION SUPPLEMENTAL MATERIALS

International Conference On Software Testing Analysis and Review May 15-19, 2006 Orlando, Florida USA

W13

Wednesday, May 17, 2006 3:00PM

PROGRESSIVE PERFORMANCE TESTING: ADAPTING TO CHANGING CONDITIONS

Jeff Jewell, ProtoTest LLC

SLIDE 2

Jeff Jewell

Jeff Jewell is a senior software testing consultant with ProtoTest in Denver, Colorado, and is a Certified Software Test Engineer. Jeff has more than ten years of experience in software development and quality assurance. He has held lead and management positions in software testing at a variety of companies, from small web startups to developers of ERP systems. He has experience with automated regression testing of web and Windows client applications, and performance testing of web applications. Jeff’s tool experience includes implementing and using both commercial and open source test tools for test management, automated testing, performance testing, and defect tracking. As a ProtoTest consultant Jeff has been involved in a number of different performance testing projects for clients. He has also conducted QA process and test tool assessments and implementations for a number of clients. Jeff also helps to develop and deliver ProtoTest’s software testing training classes.

SLIDE 3

ProtoTest, LLC - www.prototest.com - 303.703.1510

Progressive Performance Testing

Adapting to Changing Conditions
Jeff Jewell

SLIDE 4

Overview

  • Two projects covered

– Online Banking
– E-commerce Web Site

  • For each project we will show

– Initial Approach
– Initial Results and Issues
– Test and System Adaptations
– Intermediate Results and Issues
– Test and System Adaptations
– Final Results
– Lessons Learned

SLIDE 5

Online Banking Project

  • Rewrite of existing banking system

– New ASP.NET front end
– Third party legacy middleware and back-end

  • Performance Test Goals

– Match performance of previous system
– Test peak number of concurrent users
– Page loads usually under 5 seconds

SLIDE 6

Tool Selection

  • Semi-formal tool selection process

– Based on requirements
– Budget a big concern

  • Selected OpenSTA

– Open Source tool
– Active user & developer communities
– Supported recording HTTPS

SLIDE 7

Initial Approach

  • Script typical user tasks

– Login
– Account Detail
– Account History
– Balance Transfer

  • System administrators to monitor servers

– Web, Application, and Database Servers
– Specific processes identified to monitor

  • Establish baselines with initial test runs
  • Initial pool of 200 test user accounts

SLIDE 8

Initial Results

  • Poor login performance with light load

SLIDE 9

Initial Troubleshooting

  • Only login process slowed down

– Login includes authentication and account info
– Web server still responded to other requests
– No significant processor usage on web and application servers

  • No monitoring of database servers

– Some servers shared with production
– Not concerned about database performance
– Harder to monitor

SLIDE 10

Initial Issues and Resolutions

  • Page size for accounts page > 200KB

– ASP.NET session state variables large
– Reduced initial size to < 30KB

  • Login database server maxed out

– Only used for authentication
– Old hardware
– Shared with production
– Replaced server hardware

SLIDE 11

Test Adaptations

  • Better server monitoring
  • Focusing only on login script

– Login seemed to be the biggest problem
– Retrieved all current account info

  • Better handling of session state
  • Better logging of test info

SLIDE 12

Intermediate Results

  • Login performance much improved
  • Still higher than expected

SLIDE 13

Intermediate Issues & Resolutions

  • Accounts info now a bottleneck
  • Application server doing the most work

– Java processes
– Third party applications

  • Bypassed some middleware processes
  • Optimized SQL Queries
  • First run of each day was slow due to server caching

SLIDE 14

Final Results

Chart: Login Timer Comparison (seconds vs. active users), April 19 run vs. May 4 run.

SLIDE 15

Lessons Learned

  • Faulty assumptions waste time

– Did need to worry about database servers
– Hardware was an issue

  • Didn’t need all the scripts we originally planned

  • Didn’t need as many VUs as we thought
  • Had to rerecord scripts after code changes
  • Account variations created test anomalies
  • Understand server caching mechanisms

SLIDE 16

E-Commerce Project

  • Rewrite of web-based order entry system
  • Company located in Dallas
  • Java back-end, Oracle DB, HPUX servers
  • Third party payment processing
  • Performance Test Goals

– Want to own test scripts when finished
– Initial goal of 50 concurrent users
– Transactions under 10 seconds
– Average 60 items on each order

SLIDE 17

Initial Approach

  • Lots of test scripts

– Main scripts submitted orders

  • 3 different payment methods
  • Random # of items on orders (30-100)
  • Random items selected for each order

– Secondary scripts somewhat useful

  • 30 test user accounts for each payment type

  • Lots of logging of script info
  • Testing remotely from Denver office

SLIDE 18

Initial Results – Timers

  • Poor timer values across the board

SLIDE 19

Initial Results – Portal Page

  • Primary HTTP request didn’t take much time
  • Other page elements contributed to timer length
  • Response size directly proportional to time

Chart: Portal HTTP Get vs. Timer (elapsed time in seconds): HTTP GET for the portal page vs. the T03_PORTAL timer.

Chart: HTTP Response Time vs. Response Size (time in milliseconds vs. size in bytes).

SLIDE 20

Issues & Resolutions

  • No noticeable load on server processors
  • Bandwidth constraints on load generator

– Only T1 connection from Denver office
– Wasted time for all involved in test run
– Planned future test runs on-site

  • Load balancing wasn’t working

– Originally configured based on IP Address
– Changed to use session cookie

SLIDE 21

Test Adaptations

  • Numbers of VUs increased

– Peak times on web site have 1500 users

  • Orders were standardized to 60 items

– Numbers of items impacted performance
– Variability made comparison difficult

  • 1000 accounts for each payment type
  • Changed the user ramp-up mechanisms

SLIDE 22

Intermediate Results – Slow Timers

Chart: Timer Averages by Active Users (seconds vs. virtual users, in buckets of 10 from 11-20 through 171-180) for T06_PRICE_ORDER, T07_PRICE_ORDER, and T12_CONFIRM_SUBMIT.

  • Pricing and Submitting orders very slow

SLIDE 23

Intermediate Results – Response Times

Average (ms)  URL
789           GET http://URL/secure/login
1,072         POST http://URL/secure/login
1,047         GET http://URL/portal
9,192         POST http://URL/oe/buildOrder.jsp
1,763         GET http://URL/oe/orderList.jsp?showHistory=false
2,312         POST http://URL/oe/orderList.jsp
54,329        GET http://URL/oe/priceOrder.jsp
2,089         GET http://URL/account/accountPayment.jsp?fromOrder=true
2,467         POST http://URL/account/accountPayment.jsp
2,449         POST http://URL/account/accountPaymentCreditCard.jsp
4,360         POST http://URL/account/accountPaymentElectronicCheck.jsp
12,637        GET http://URL/oe/submitOrder.jsp
85,822        POST http://URL/oe/submitOrder.jsp
958           GET http://URL/secure/logout

SLIDE 24

Intermediate Results – HTTP Responses

  • Shows responses per second from server
  • Wide variations near the end of the test

SLIDE 25

Issues & Resolutions

  • Database record locking conflicts
  • Page requests timing out
  • System Adaptations

– New hardware
– Server configuration changes
– Software code fixes
– Database query optimization

  • Fewer system changes between runs

SLIDE 26

Test Adaptations

  • Ran only a single script
  • Increased repeatability

– Standard list of items on each order
– Looping through user list sequentially

  • Script tuning

– Rerecorded test script
– Looking for and logging specific errors
– Better simulation of browser caching

  • User/product data cleanup

SLIDE 27

Final Results – Timers

  • Huge improvements in timer values
  • Still seeing some problems at 700+ users

Chart: Average elapsed time (seconds) by active users (1-25 through 876-900) for timers T01_HOMEPAGE through T14_LOGOUT.

SLIDE 28

Results – Timer Comparisons

Timer Name            Final Run  Intermediate Run  % Improvement
T01_HOMEPAGE          0.91       2.15              58%
T02_LOGIN             0.2        0.89              77%
T03_PORTAL            1.92       2.57              25%
T04_ORDER_LIST        2.3        3.33              31%
T05_BUILD_ORDER       2.75       4.32              36%
T06_SAVE_ORDER        0.8        4.67              83%
T07_PRICE_ORDER       10.49      50.65             79%
T08_PAY_ORDER         2.33       5.09              54%
T09_CREDIT_CARD       3          5.15              42%
T10_CARD_PROFILE      2.4        3.9               38%
T11_SUBMIT_PAYMENT    4.02       5.27              24%
T12_SUBMIT_ORDER      2.67       4.28              38%
T13_CONFIRM_SUBMIT    12.05      37.55             68%
T14_LOGOUT            0.39       1.35              71%

SLIDE 29

Lessons Learned

  • Get enough bandwidth for generating load

– Consider internal and external network limitations
– Caching is more important with lots of users

  • Script/Test variations made comparisons difficult

– Keep ramp-up methods consistent
– Reduce randomness in test data
– Change one thing at a time

  • Log lots of test script information

– User info, data variations, errors, results

  • Modularize scripts to ease maintainability
  • Clean up test data before test runs

SLIDE 30

Conclusions

  • Develop a plan for performance testing

– Document assumptions and risks
– Be flexible in its implementation

  • Make adjustments as you learn about

– System bottlenecks
– Script/Test variations
– Monitoring needs

  • Don’t jump to conclusions about issues

Progressive Performance Testing: Adapting to Changing Conditions By Jeff Jewell

When you start working on a performance testing project (or any software testing project) you usually approach the job with a plan. You define performance objectives, what test scripts you’re going to need, what measurements you will take during test executions, and what types of tests you will run. Planning all of those things gives you a guide to follow as you start to execute your plan. But things don’t always go as planned, especially with performance testing. There are a lot of things that you will uncover during your performance testing project, which should lead you to adjust parts of your system to improve performance, including code changes, server and network hardware, and software configurations. However, you will also likely find that you need to adjust the way you are conducting your testing so that you can better test and measure system performance. This paper will outline the progression that occurred during two performance testing projects conducted by Jeff Jewell of ProtoTest. The testing approach for each project had to be adapted as the testing progressed, and each presented unique learning opportunities and chances for improvement.

Online Banking Project

ProtoTest was asked to help with performance testing of an online banking application. This application was a complete rewrite of the user interface for users to conduct their banking transactions. The new interface was developed with ASP.NET as the main application interface layer running on the web server. There were several third party Java middleware programs running on a separate application server. These programs interfaced with a legacy database system which maintained all of the user bank account information. Because the main back-end components were third party legacy applications, the software developers had little control over those systems.

The performance goal was that the new system should perform as well as or better than the old online banking site that was in production at the time. This would require some baseline testing of the production system to which we could compare results. There were no hard targets defined for the number of concurrent users or transactions that should be supported. Analysis of production web server logs yielded some guidance to help identify how many users were logged into the system during normal and peak hours.

A test tool selection process found that OpenSTA was the best product to use for this project. It met the cost requirements because it is an open source tool, and it was the only tool we found that could record HTTPS transactions. OpenSTA has an active user community and an active developer community, so some support would be available if needed.
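The paper does not show how the production log analysis was carried out; the Python sketch below only illustrates the general idea of estimating concurrent logged-in users from web server logs. The log format, field positions, and the 20-minute session window are assumptions, not details from the project.

```python
# Hypothetical sketch: estimate peak concurrent users per hour from a web
# server access log. Log format and the 20-minute session window are assumed.
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_WINDOW = timedelta(minutes=20)   # assumed session/idle timeout

def peak_concurrency_by_hour(log_lines):
    last_seen = {}             # user id -> timestamp of most recent request
    peaks = defaultdict(int)   # hour of day -> peak concurrent users seen
    for line in log_lines:
        # assumed format: "2006-03-01 14:05:12 user123 GET /accounts"
        date_str, time_str, user = line.split()[:3]
        ts = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
        last_seen[user] = ts
        active = sum(1 for seen in last_seen.values() if ts - seen <= SESSION_WINDOW)
        peaks[ts.hour] = max(peaks[ts.hour], active)
    return dict(peaks)
```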

Initial Approach

We began this performance testing project with a test plan that identified objectives, important metrics, test scripts, and other elements of our project. Initially we planned to develop four test scripts, simulating the following typical user banking activities:

  • Logging in and viewing account balances
  • Viewing account details
  • Viewing account history
  • Transferring money between accounts

We identified a number of processes that we would measure on the various servers, and hoped to measure them through OpenSTA directly. However, system security prevented us from gaining direct access to the servers from our load testing machines, so we could not set up automatic monitoring during our test runs. We were forced to rely on having system administrators manually monitor the web, application, and database servers while we ran our tests. An initial pool of 200 test user accounts was created to use for testing. Four test scripts were recorded based on the four typical user activities. The first test runs were configured to run up to 50 virtual users, with 1 new user added every 3 seconds.
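For reference, a couple of lines of arithmetic show what that ramp-up configuration implies for the shape of the load; the 60-second iteration time used below is an assumed figure for illustration, not a number from the project.

```python
# Ramp-up arithmetic for the configuration above: 50 virtual users,
# one new user added every 3 seconds.
max_vus = 50
ramp_interval_s = 3
print(f"Full load reached after {max_vus * ramp_interval_s} seconds")  # 150 s

# If one full iteration (log in, view accounts, transfer) takes roughly 60 s
# (an assumed figure), steady-state throughput once ramped is about:
iteration_s = 60
print(f"~{max_vus / iteration_s:.1f} logins per second at full load")   # ~0.8/s
```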

Initial Results

Results from the first few runs indicated that there was a problem with performance of the system when users logged in. Logins were slow even under light user load, and only got worse as more users connected to the system. The following chart shows the average amount of time it took to accomplish user tasks in one of the early runs. Notice that the T_LOGIN_LOGIN timer rises steadily as the number of virtual users increases. Even at its lowest point it took the script an average of 10 seconds to log in.


What is also noticeable from this graph is that the other timers did not increase dramatically like the login timer did. This indicates that the web server was still responding well to requests, and that the bottleneck was most likely within the code that logged the user in. The login timer consists of two primary server transactions. The first is authentication of the username and password. The second process consists of retrieving a summary of the user’s account information for all bank accounts and loans.

Issues and Adaptations

There were two things that we discovered through this testing. First, the page size for the accounts page that was displayed after logging in was over 200 KB. Some of this size came from large JavaScript and style sheet files, but a good portion of the page consisted of the ASP.NET session state variable. A great deal of the user’s session information was being stored in this variable, and it was contributing to a very large page for the user to download.

The second problem that we uncovered concerned the servers. Monitoring of the web and application servers did not point to any significant processor, disk, or memory usage issues. However, we had done no monitoring of the database servers during our test run. There were two different database servers, one for maintaining user login information and the other for storing account information. The login database was shared with the production system, and we did not anticipate it having any problems because it did nothing but validate login information. After seeing the poor results we began monitoring the performance of the login database server. We found that the processor on this server was being used nearly 100% of the time during our test run. We then discovered that the server was very old and not very powerful, so it was having a great deal of difficulty keeping up with production usage on top of our small test. We also realized that our test executions were having a negative effect on production, and that we would have to stop testing until changes had been made.

The first step needed to resolve the system’s performance problems was to order new hardware for the login database. A new server was purchased and configured so that we would not be sharing it with production. We never again saw any problems with processor usage on our login database server. The second problem that was resolved was the page size of the accounts page. By removing most of the unnecessary information that was being stored in the session state variable we were able to reduce the size of the page from more than 200 KB to less than 30 KB. This decreased the amount of time it would take to download the accounts page.

Changes were also needed in the way that we conducted our tests. First we realized that we needed better monitoring of the servers while we ran tests. This meant scheduling test runs with system administrators, and setting up saved configurations for Windows Performance Monitor settings. We also realized that our biggest problem was going to be logging the user in, because that process required the most work by the application server to retrieve all of the user’s account information. We stopped development of other test scripts and ran most of the rest of our tests with just the login script. Finally, we found that we did not know enough about what was going on in each script as the tests ran. We made changes to the scripts to check for and log information about the user and the script during testing. This helped us to better investigate what was causing tests to slow down.
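The actual scripts were written in OpenSTA's SCL scripting language; the Python-style sketch below only illustrates the kind of per-user, per-step logging and validation that was added. The field names and the error check are assumptions.

```python
# Illustrative only: per-step logging of user, timer, elapsed time, and result,
# in the spirit of the script changes described above.
import csv
import time

def log_step(writer, user_id, step_name, start_time, status_code, body_text):
    elapsed = time.time() - start_time
    # Treat a non-200 status or an error marker in the page body as a failure.
    ok = status_code == 200 and "error" not in body_text.lower()
    writer.writerow([user_id, step_name, f"{elapsed:.2f}", status_code,
                     "OK" if ok else "FAIL"])
    return ok

with open("run_log.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["user", "step", "seconds", "http_status", "result"])
    # Inside the virtual-user loop the script would call, for example:
    # log_step(writer, "user042", "LOGIN", start, response_code, response_body)
```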

Intermediate Results

After making changes to the code, the system hardware, and our tests we were able to see significant improvements in the test results. The following chart again shows the timer averages for the number of virtual users in the system. We can see here that the average times for logging in are better than before, starting out at less than 2 seconds (as compared with 10 seconds before), and topping out at less than 30 seconds (as compared with 45 seconds before). However, the login timer values increased significantly under higher user loads, and were much higher than we hoped for.

What we found was that retrieving the accounts information was now the bottleneck, and took longer and longer as the number of users increased. Monitoring of the application server showed significant processor usage by the Java processes, with lots of connections and calls to the accounts database. Because these middleware components were third party applications there wasn’t much that we could do to improve performance within them. Instead, the developers wrote their own queries and processes to bypass most of these slower third party systems. We also found that the database server was caching accounts data, which resulted in some differences between test runs. We needed to keep this factor in mind as we analyzed results.

Final Results

When we completed our testing the results exceeded our initial performance objectives. The chart below shows a comparison of the login performance between the intermediate and final test runs. The final results show no significant increase in the amount of time for logging in as the number of virtual users increased, and the average time for logging in was under five seconds.

Chart: Login Timer Comparison (seconds vs. active users), April 19 run vs. May 4 run.

Lessons Learned

There are a number of things that we learned as we went through this project that would have helped us do things better from the beginning. First, we had made several faulty assumptions that wasted time in our testing. We had assumed that we did not need to worry about performance of the login database server because it was doing so little work and was already managing production load just fine. We had also assumed that the hardware that was already in place would be sufficient. We found out fairly quickly that the hardware for the login database server was insufficient for our needs, and we lost a lot of testing time while we waited for that hardware to be replaced.

We also learned some things about how we had planned to do our testing, and had to change those plans as our testing progressed. We had initially planned to run tests with four different scripts, but soon found that only one of those was really necessary. We did run small tests with the other scripts we had created, but we did not need to generate load with those scripts. We also did not need as many virtual users as we had originally planned to use, because the number of concurrent users in production was not as high as we first expected. We found that we could simulate production concurrent user levels with smaller numbers of users logging in faster. This could have been a very costly mistake if we had used commercial software and purchased licenses for virtual users that we didn’t need.

Additionally, we discovered that we needed to rerecord our test scripts several times after code changes were made. The old versions of the scripts no longer represented what users would actually be doing on the site. Breaking the scripts down into different components and creating functions for some tasks made it easier for us to get the newly recorded scripts working again.

Finally, we found it difficult to understand how the accounts database server was caching information and what impact this had on our testing. When we realized that some database information was cached we increased our virtual user pool so that we did not reuse the same users during a single test run. We also coordinated with database administrators to clear caches before starting a test run.
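As an illustration of the modularization mentioned above, the structure we moved toward looked roughly like the sketch below: one small function per user task, so that rerecording the login flow after a code change did not disturb the other tasks. All names are made up; the real scripts were OpenSTA/SCL.

```python
# Structural sketch only: one reusable function per banking task.
def login(session, user):
    """Authenticate and return the user's account summary."""
    # ... recorded login requests go here ...
    return ["CHECKING-1", "SAVINGS-1"]   # placeholder account list

def view_account_detail(session, account_id):
    # ... recorded account-detail requests ...
    pass

def transfer(session, from_acct, to_acct, amount):
    # ... recorded balance-transfer requests ...
    pass

def banking_iteration(session, user):
    accounts = login(session, user)
    view_account_detail(session, accounts[0])
    transfer(session, accounts[0], accounts[1], 10.00)
```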

E-Commerce Project

The e-commerce project that I worked on was for a company located in Dallas. The company wanted ProtoTest to help with performance testing of the system, and came to us because of our experience with OpenSTA. They wanted to be able to learn how to do the testing themselves and to keep the tests when we were finished. This application was a rewrite of a web-based ordering system, and the software was being developed by a third party. It was developed in Java, the servers ran HPUX, and it relied on an Oracle database. Performance test goals for this project included an initial target of 50 concurrent users with transactions taking less than 10 seconds.

Initial Approach

We started the project by identifying which test scripts we would need to simulate the common user activities on the site. The main thing that users did was to create and submit orders, using one of two payment methods: credit card or electronic check. A separate script was created for each payment type.

The average user’s order had 60 different items on it. We wanted to try to simulate a fairly realistic load on the server, so we set up our scripts to randomize the number of items on each order, from 30 to 100 items. We also randomly selected which items went on each order. We applied some of the lessons we had learned from the online banking project to this effort, and made sure that we created reusable components to reduce rework between the scripts. We also made better use of validation within the test scripts to check for and log errors.

A set of 30 test user accounts was set up for each payment type, and we were not worried about reuse because the same user could be logged in more than once at the same time. Each user was set up with a large credit limit in case there were problems with third party payment processing.

Because the client was located in Dallas we decided to do the work remotely. All the test scripts were created and debugged from the ProtoTest headquarters in Denver. Tests were scheduled to be run with system administrators and developers monitoring servers manually, and we coordinated execution with one another through conference calls.
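A short sketch of the order randomization described above (30 to 100 line items, items picked at random). The catalog and quantity range are assumptions; in the real scripts these values were fed into the recorded OpenSTA HTTP requests.

```python
# Illustrative order generator: random item count between 30 and 100,
# random items, no duplicates within an order.
import random

CATALOG = [f"SKU{n:05d}" for n in range(1, 5001)]   # assumed product list

def build_random_order(rng):
    item_count = rng.randint(30, 100)
    items = rng.sample(CATALOG, item_count)
    return {sku: rng.randint(1, 5) for sku in items}   # sku -> quantity (assumed 1-5)

order = build_random_order(random.Random())
print(f"{len(order)} line items on this order")
```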

Initial Results

The first few test runs showed some very poor results in all the timer values. We ran several tests, none of which seemed to generate much processor or disk usage on the servers, but the response times for requests increased significantly as the number of virtual users increased. The following chart shows some of the timer values for one of the order submission scripts. Even those timers that were not very server intensive show marked increases.

slide-37
SLIDE 37

Since the servers did not appear to be taxed we wondered whether the network bandwidth was an issue. One way to examine this was to evaluate how long it was taking to download files other than HTML pages. We looked at how much time the primary request for the portal page took compared with how long the entire portal timer was. The single HTTP request for the portal page is shown in the graph below in blue, and the timer is shown in magenta. Notice that the timer values are much higher than the request for the page itself, indicating that other files associated with that page were adding greatly to the overall timer value.

Chart: Portal HTTP Get vs. Timer (elapsed time in seconds): HTTP GET for the portal page vs. the T03_PORTAL timer.

We then compared the amount of time each page took to load with how large the page size was, and came up with the following graph:

Chart: HTTP Response Time vs. Response Size (time in milliseconds vs. size in bytes).

This graph shows that the size of each file is almost directly proportional to how long it took to load that file. This was almost a sure indication that bandwidth from the web server to our load generating machines was a bottleneck.
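Some rough arithmetic makes the bandwidth suspicion concrete. A T1 line carries about 1.544 Mbit/s, and the response-size chart shows responses approaching 40 KB; the figures below are back-of-the-envelope estimates, not measurements from the project.

```python
# Back-of-the-envelope check on the T1 bottleneck.
t1_bits_per_s = 1_544_000      # nominal T1 capacity
response_bits = 40_000 * 8     # ~40 KB response

per_response_s = response_bits / t1_bits_per_s
print(f"A single 40 KB response needs ~{per_response_s:.2f} s of the T1 link")

# With dozens of virtual users downloading at once, responses queue behind one
# another, so multi-second page times appear even while the servers sit idle.
concurrent_users = 50
print(f"{concurrent_users} simultaneous downloads: ~{concurrent_users * per_response_s:.0f} s of link time")
```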

Issues and Adaptations

What we discovered was that our plan to run the tests from our Denver office was flawed, because we did not have sufficient bandwidth there to receive the requested server responses. This resulted in a lot of wasted time for all of the people who had been involved in the test executions because the results were not usable. Future test runs were run at the client’s site in Dallas to avoid this problem.

We did gain one good piece of information from the tests, however. Load balancing logs indicated that the entire load was being handled by one web server, and not shared among the three servers available. We discovered that the load balancing was configured to spread load based on the user’s IP address, and all of our virtual users had the same IP address. A change was made to use cookies for load balancing in the future.

In addition to running tests at the client site instead of from our Denver office, we needed to make some other changes to our testing approach. Further analysis of current user trends showed that we needed to try to approximate the load of about 1,500 users on the site submitting orders within an hour. Since we had seen so many problems with just 100 users this became a concern. We adjusted our test configurations to simulate this new load and appropriate ramp-up time. We also increased the number of test user accounts from 30 to 1,000 for each payment type, so that we would not be using the same user accounts simultaneously.

Finally, we realized that the randomness within our test scripts made it difficult to compare results. An order with only 30 items took much less time to process than an order with 100 items. So we changed our scripts so that they used the same 60 items on each order, reducing the variability between each virtual user.


Intermediate Results

The results from our intermediate tests showed that we were no longer experiencing the bandwidth problems that we had seen at first. We didn’t observe large increases in every timer value, but we did see large increases in a few of them.

Chart: Timer Averages by Active Users (seconds vs. virtual users, in buckets of 10 from 11-20 through 171-180) for T06_PRICE_ORDER, T07_PRICE_ORDER, and T12_CONFIRM_SUBMIT.

This chart shows three timer values and their average times for the number of virtual users. Two of these timers are for determining pricing of the order for different payment types, and the third shows how long it took to submit the order at the end. These timer values were slow to begin with, and got worse as the user load increased. Analysis of the individual HTTP requests showed that the server was taking a long time to process the pricing and submission requests, especially as the user load increased. Similarly, the average number of HTTP responses became more erratic as the tests progressed, as seen in the chart below. The wide variations near the end of the test show that the server was having difficulty responding to the load.
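Charts like the one above can be produced by bucketing raw timer samples by the number of active virtual users. The sketch below shows one way to do that with pandas; the column names are assumptions about an exported results file, not OpenSTA's actual export format.

```python
# Bucket timer samples into 10-user ranges (1-10, 11-20, ...) and average them.
import pandas as pd

df = pd.read_csv("timer_results.csv")   # assumed columns: timer, active_users, elapsed_s
df["vu_bucket"] = (df["active_users"] - 1) // 10 * 10 + 1   # lower bound of each bucket

slow_timers = ["T06_PRICE_ORDER", "T07_PRICE_ORDER", "T12_CONFIRM_SUBMIT"]
averages = (
    df[df["timer"].isin(slow_timers)]
    .groupby(["vu_bucket", "timer"])["elapsed_s"]
    .mean()
    .unstack("timer")
)
print(averages)
```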

Issues and Adaptations

We discovered a few problems after analyzing the results, which gave us great information about where to focus our troubleshooting efforts. We found that a number of the page requests were timing out, or were ending in errors that we did not log. Also, the application servers were having a harder time keeping up with requests than the web servers, so one of the web servers was reconfigured as an application server to help handle the load. We also discovered that because we were using the same items on every order there were record locking conflicts within the database. This resulted in processes queuing up to update inventory amounts as orders were submitted. Database queries were refactored to optimize the pricing and submission processes. The developers found bugs in the code and places where they could further optimize pricing and order submission. In addition, the developers and system administrators began to optimize the server software configurations to see which settings would provide the best results. They began making one or two configuration changes between each test run to make it easier to identify which settings were or were not making things better.

We also continued to make adjustments to our test scripts and test configurations. We realized that the type of payment processing we used did not have an impact on the system, so we started using just a single script in most of our test runs. We did run other scripts with light load, just to make sure that there were no other big bottlenecks, but our focus became trying to reduce the time for pricing and order submittal. We also tuned our test scripts to make them more efficient. This involved rerecording the scripts to match the latest code, looking for and logging additional errors, and paying better attention to simulated wait times. We also became concerned about bandwidth again because we were testing with larger numbers of virtual users. We needed to be more mindful of simulating browser caching and understanding our current bandwidth limitations. We generated load from two machines instead of one, and used a third for controlling the tests and collecting the results. Finally, we made a better effort to keep user and product data cleaned up between test runs, so that we did not run out of money to pay for orders and didn’t run out of items.
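The "better simulation of browser caching" point is worth a sketch: a real browser fetches static assets once and then serves them from its cache, so a virtual user that re-downloads every image and script on every page exaggerates bandwidth use. The snippet below illustrates the idea with the Python requests library; it is not the mechanism OpenSTA itself uses.

```python
# Simplified cache-aware virtual user: static assets are fetched only once.
import requests

class CachingUser:
    STATIC_SUFFIXES = (".js", ".css", ".gif", ".jpg", ".png")

    def __init__(self):
        self.session = requests.Session()
        self.cached_urls = set()

    def get(self, url):
        if url in self.cached_urls:
            return None                     # pretend it came from the local cache
        response = self.session.get(url, timeout=30)
        if url.endswith(self.STATIC_SUFFIXES):
            self.cached_urls.add(url)       # remember static assets we have seen
        return response
```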

Final Results

During our final tests we were very gratified to see huge improvements in our timer data. These improvements were a result of the combination of changes to server hardware, software configurations, and code/database optimizations. The following chart shows all of the timer values for our order submission script vs. the number of virtual users.

Chart: Average elapsed time (seconds) by active users (1-25 through 876-900) for timers T01_HOMEPAGE through T14_LOGOUT.

You can see here that all of the timer averages are less than 5 seconds, including order pricing and submission, until the user load reached more than 700 users. This is a tremendous improvement over the first few test runs. The following table compares the final and intermediate runs’ timer averages.

Timer Name            Final Run  Intermediate Run  % Improvement
T01_HOMEPAGE          0.91       2.15              58%
T02_LOGIN             0.2        0.89              77%
T03_PORTAL            1.92       2.57              25%
T04_ORDER_LIST        2.3        3.33              31%
T05_BUILD_ORDER       2.75       4.32              36%
T06_SAVE_ORDER        0.8        4.67              83%
T07_PRICE_ORDER       10.49      50.65             79%
T08_PAY_ORDER         2.33       5.09              54%
T09_CREDIT_CARD       3          5.15              42%
T10_CARD_PROFILE      2.4        3.9               38%
T11_SUBMIT_PAYMENT    4.02       5.27              24%
T12_SUBMIT_ORDER      2.67       4.28              38%
T13_CONFIRM_SUBMIT    12.05      37.55             68%
T14_LOGOUT            0.39       1.35              71%

We still saw an increase in all timer values above 700 users, but the data gives some indications that this could again be related to bandwidth problems because of the large number of users involved.
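The "% Improvement" column is simply the relative drop from the intermediate run to the final run, for example for T07_PRICE_ORDER:

```python
# % Improvement = (intermediate - final) / intermediate
intermediate, final = 50.65, 10.49      # T07_PRICE_ORDER averages in seconds
print(f"{(intermediate - final) / intermediate:.0%}")   # -> 79%
```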

Lessons Learned

One of the most important lessons we learned from this project was not to overestimate your network bandwidth. We wasted the time of a lot of people with initial test runs that were slowed down simply by network limitations. When running performance tests you need to be aware of internal and external network limits, and be mindful of how browser caching may impact system performance.

We also saw again that variations in the scripts between test runs can make it difficult to compare results. We found that keeping the following things consistent between runs made it easier for us to isolate and resolve problems:

  • User ramp-up schemes
  • Number of items on orders
  • Configuration changes

We also found that better checking and logging of errors within test scripts made it much easier to troubleshoot the problems that we came across. We were able to see whether user transactions were successful, and find out what users would actually be experiencing on the system. Finally, we saw that modularizing our test scripts made them more maintainable, and that we needed to have better control of test data.

Conclusion

The key to our success in these performance testing projects was making a plan for testing and then being flexible in applying that plan. Documenting your assumptions up front can help you to see them more clearly, and determine if they are valid. Likewise, it is important to document risks and make contingency plans for dealing with them.

As you apply your plan and carry out your testing you should do careful analysis of test results to help prevent you from jumping to conclusions about the real cause of system problems. This analysis should extend not only to the system under test but also to the test scripts and the tests themselves. Then you need to make adjustments to the system hardware, software configurations, code, and tests to better identify and resolve performance problems.