SE 433/333 Software Testing & Quality Assurance

Author: May Robbins

1. SE 433/333 Software Testing & Quality Assurance
Dennis Mumaugh, Instructor
Office: CDM, Room 428
Office Hours: Tuesday, 4:00 – 5:30
May 23, 2017, SE 433: Lecture 9

2. Administrivia
- Comments and feedback
- Assignment 1-8 solutions are posted
- Assignments and exams:
  - Assignment 9: due May 30
  - Final exam: June 1-7
  - Take-home exam/paper: due June 6 [for SE 433 students only]

3. Assignment Comments
- A student wrote: "I have tried for days to get my test files to compile and have failed to get the JUnit tests to recognize the .class files included in the zip. The files were placed in the bin/edu.depaul.se433 directory and the classpath was set to C:\Users\xxxx\workspace\Assignment7\bin\edu.depaul.se433."
- Answer: Java maps each package component to a subdirectory, so try the bin\edu\depaul\se433 directory and put the bin directory itself on the classpath.

4. SE 433 – Class 9
- Topic: Integration & System Testing
- Reading:
  - Chapters 21.1, 21.2, and 22 of the textbook
  - Articles on the reading list

5. Final Exam Paper
- This paper is for SE 433 students only.
- Due June 6; no late submissions!
- Scenario: Cloud Storage plans to immediately hire a Director of Software Quality Assurance, and you have been identified as a finalist. The CEO of the company has asked you to present an executive summary of your recommendations and plans to steer the project through its initial launch, assuming you were hired as the Director of Software Quality Assurance and given the necessary authority to execute your plans.
- If you have questions regarding the take-home part of the final exam, you may post them on the mailing list on or before Sunday, June 4, 2017.

6. Thought for the Day
- "Software testing proves the existence of bugs, not their absence." – Anonymous
- "The principal objective of software testing is to give confidence in the software."

7. Case Study: Mars Climate Orbiter

8. Case Study: Mars Climate Orbiter
- NASA's Mars Climate Orbiter launched on December 11, 1998.
- It was intended to enter an orbit 140–150 km above Mars.
- On September 23, 1999, it smashed into the planet's atmosphere and was destroyed.
- Cost: $328M

9. Case Study: Mars Climate Orbiter
- Cause of failure: the software controlling the thrusters on the spacecraft used different units.
- Software modules were developed by teams in the US and Europe.
- Engineers failed to convert the measure of rocket thrust:
  - English unit: pound-force (lbf)
  - Metric unit: newton (N), kg·m/s²
  - Difference: a factor of ≈ 4.45
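The interface fault above can be sketched in a few lines. This is a deliberately simplified illustration, not the actual spacecraft code: all class, method, and module names are invented, and the "ground software" and "navigation software" comments only gesture at which side of the interface each team owned.

```java
// Hypothetical sketch of the Mars Climate Orbiter unit mismatch:
// one module reports thruster impulse in pound-force seconds,
// while the caller assumes newton-seconds.
public class UnitMismatchDemo {
    static final double NEWTONS_PER_POUND_FORCE = 4.448222;

    // Ground software side: returns impulse in lbf·s (English units)
    static double impulseInPoundForceSeconds() {
        return 100.0; // a measured value, for illustration
    }

    // Navigation software side: expects newton-seconds (metric units)
    static double trajectoryCorrection(double impulseNewtonSeconds) {
        return impulseNewtonSeconds; // grossly simplified model
    }

    public static void main(String[] args) {
        double raw = impulseInPoundForceSeconds();

        // Faulty integration: the value crosses the interface unconverted
        double wrong = trajectoryCorrection(raw);

        // Correct integration: convert at the interface boundary
        double right = trajectoryCorrection(raw * NEWTONS_PER_POUND_FORCE);

        // The two results disagree by the factor of roughly 4.45 on the slide
        System.out.println(right / wrong);
    }
}
```

Each module is internally correct, which is why unit testing cannot catch this: only a test of the interface between the two modules, against an interface specification that states the units, reveals the fault.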

10. Integration Testing
- Integration testing (sometimes called integration and testing, abbreviated I&T) is the phase in software testing in which individual software modules are combined and tested as a group.
- It occurs after unit testing and before validation testing.

11. Objectives
- Understand the purpose of integration testing
- Distinguish typical integration faults from faults that should be eliminated in unit testing
- Understand the nature of integration faults and how to prevent as well as detect them
- Understand strategies for ordering construction and testing
- Approaches to incremental assembly and testing to reduce effort and control risk

12. Integration vs. Unit Testing
- Unit (module) testing is a necessary foundation
  - The unit level has maximum controllability and visibility
  - Integration testing can never compensate for inadequate unit testing
- Integration testing may serve as a process check
  - If module faults are revealed in integration testing, they signal inadequate unit testing
  - If integration faults occur in interfaces between correctly implemented modules, the errors can be traced to the module breakdown and interface specifications

13. Integration Testing
- The entire system is viewed as a collection of subsystems (sets of classes) determined during system and object design
- Goal: test all interfaces between subsystems and the interaction of subsystems
- The integration testing strategy determines the order in which the subsystems are selected for testing and integration

14. Why Do We Do Integration Testing?
- Unit tests only test the unit in isolation
- Many failures result from faults in the interaction of subsystems
- Often many off-the-shelf components are used that cannot be unit tested
- Without integration testing, the system test will be very time consuming
- Failures that are not discovered in integration testing will be discovered after the system is deployed, and can be very expensive

15. What is Integration Testing?

                        Unit/module test  | Integration test                    | System test
Specification:          Module interface  | Interface specs, module breakdown   | Requirements specification
Visible structure:      Coding details    | Modular structure (architecture)    | (none)
Scaffolding required:   Some              | Often extensive                     |
Looking for faults in:  Modules           | Interactions, compatibility         | System functionality

16. What is Software Integration Testing?
- Testing activities that integrate software components together to form a complete system
- Cost-effective software integration requires an integration test strategy and an integration test set
- Major testing focuses:
  - Interfaces between modules (or components)
  - Integrated functional features
  - Interacting protocols and messages
  - System architectures
- Who performs software integration: developers and test engineers
- What do you need?
  - Integration strategy
  - Integration test environment and test suite
  - Module (or component) specifications
  - Interface and design documents

17. What is a Software Integration Strategy?
- A software test strategy provides the basic strategy and guidelines for test engineers to perform software testing activities in a rational way
- A software integration strategy usually refers to an integration sequence (or order) in which different parts (or components) are integrated together

18. Integration Faults
- Inconsistent interpretation of parameters or values
  - Example: mixed units (pound-force/newton) in the Martian lander
- Violations of value domains, capacity, or size limits
  - Example: buffer overflow
- Side effects on parameters or resources
  - Example: conflict on an (unspecified) temporary file

19. Integration Faults (continued)
- Omitted or misunderstood functionality
  - Example: inconsistent interpretation of web requests
- Nonfunctional properties
  - Example: unanticipated performance issues
- Dynamic mismatches
  - Example: incompatible polymorphic method calls

20. Example: A Memory Leak
- Apache web server, version 2.0.48
- Response to a normal page request on the secure (https) port:

static void ssl_io_filter_disable(ap_filter_t *f) {
    bio_filter_in_ctx_t *inctx = f->ctx;
    inctx->ssl = NULL;
    inctx->filter_ctx->pssl = NULL;
}

- No obvious error, but Apache leaked memory slowly (in normal use) or quickly (if exploited for a DoS attack)

21. Example: A Memory Leak (continued)
- Apache web server, version 2.0.48, with the fix applied:

static void ssl_io_filter_disable(ap_filter_t *f) {
    bio_filter_in_ctx_t *inctx = f->ctx;
    SSL_free(inctx->ssl);   /* the missing call */
    inctx->ssl = NULL;
    inctx->filter_ctx->pssl = NULL;
}

- The missing code is for a structure defined and created elsewhere, accessed through an opaque pointer.

22. Example: A Memory Leak (continued)
- The same fixed code:

static void ssl_io_filter_disable(ap_filter_t *f) {
    bio_filter_in_ctx_t *inctx = f->ctx;
    SSL_free(inctx->ssl);
    inctx->ssl = NULL;
    inctx->filter_ctx->pssl = NULL;
}

- This fault is almost impossible to find with unit testing. (Inspection and some dynamic techniques could have found it.)

23. Integration Test Strategies

24. Maybe You've Heard ...
"Yes, I implemented module A, but I didn't test it thoroughly yet. It will be tested along with module B when that's ready."

25. Translation ...
"Yes, I implemented module A, but I didn't test it thoroughly yet. It will be tested along with module B when that's ready."
Translation: "I didn't think at all about the strategy for testing. I didn't design module A for testability, and I didn't think about the best order to build and test modules A and B."

26. Integration Plan & Test Plan
- The integration test plan drives, and is driven by, the project "build plan"
- It is a key feature of the system architecture and project plan
(Diagram: the system architecture feeds both the build plan and the test plan)

27. Types of Testing
- Unit testing:
  - Individual subsystem
  - Carried out by developers (of components)
  - Goal: confirm that the subsystem is correctly coded and carries out the intended functionality
- Integration testing:
  - Groups of subsystems (collections of classes) and eventually the entire system
  - Carried out by developers
  - Goal: test the interfaces and the interplay among the subsystems

28. Types of Testing (continued)
- System testing:
  - The entire system
  - Carried out by developers (testers!)
  - Goal: determine if the system meets the requirements (functional and global)
  - Functional testing: test of functional requirements
  - Performance testing: test of non-functional requirements
- Acceptance and installation testing:
  - Evaluates the system delivered by the developers
  - Carried out by the client
  - Goal: demonstrate that the system meets customer requirements and is ready to use

29. Drivers and Stubs
- Driver: a program that calls the interface procedures of the module being tested and reports the results
  - A driver simulates a module that calls the module currently being tested
- Stub: a program that has the same interface as a module used by the module being tested, but is simpler
  - A stub simulates a module called by the module currently being tested

30. Drivers and Stubs (continued)
(Diagram: Driver → Module Under Test → Stub, connected by procedure calls and access to global variables)
- The driver and stub should have the same interfaces as the modules they replace
- The driver and stub should be simpler than the modules they replace

31. Stubs and Drivers
- Driver: a component that calls the tested unit
  - Controls the test cases
- Stub: a component the tested unit depends on
  - Partial implementation
  - Returns fake values
(Diagram: Driver → Tested Unit → Stub)
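A minimal sketch of a driver and a stub, borrowing two subsystem names from the spreadsheet example used later in this lecture (Currency Converter and Currency DataBase). The interfaces and values are invented for illustration; a real course assignment would use JUnit rather than a hand-written main.

```java
// The interface the tested unit depends on (invented for this sketch)
interface CurrencyDataBase {
    double rate(String from, String to);
}

// The module under test
class CurrencyConverter {
    private final CurrencyDataBase db;
    CurrencyConverter(CurrencyDataBase db) { this.db = db; }
    double convert(double amount, String from, String to) {
        return amount * db.rate(from, to);
    }
}

// Stub: same interface as the real database, partial implementation, fake value
class CurrencyDataBaseStub implements CurrencyDataBase {
    public double rate(String from, String to) { return 2.0; }
}

// Driver: calls the interface of the module under test and checks the result
public class CurrencyConverterDriver {
    public static void main(String[] args) {
        CurrencyConverter converter =
                new CurrencyConverter(new CurrencyDataBaseStub());
        double result = converter.convert(10.0, "USD", "EUR");
        if (result != 20.0)
            throw new AssertionError("expected 20.0, got " + result);
        System.out.println("driver: test passed");
    }
}
```

Note that the stub keeps the converter testable before the real Currency DataBase exists (top-down integration), while the driver stands in for the layer above it (bottom-up integration).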

32. Example: A Three-Layer Design (Spreadsheet)
- For the following integration testing strategies, we use the call hierarchy of an example design consisting of 3 layers and 7 subsystems:
  - Layer I: SpreadSheetView (A)
  - Layer II: Data Model (B), Calculator (C), Currency Converter (D)
  - Layer III: BinaryFile Storage (E), XMLFile Storage (F), Currency DataBase (G)

33. Integration Testing Strategy
- The entire system is viewed as a collection of subsystems (sets of classes) determined during system and object design
- Assumption: the system decomposition is hierarchical
- The order in which the subsystems are selected for testing and integration determines the testing strategy:
  - Big bang integration (non-incremental)
  - Bottom-up integration
  - Top-down integration
  - Sandwich testing
  - Variations of the above

34. "Big Bang" Integration Test
- An extreme and desperate approach: test only after integrating all modules
- The only excuse, and a bad one: it does not require scaffolding
- Minimum observability, diagnosability, efficacy, and feedback
- High cost of repair
  - Recall: the cost of repairing a fault rises as a function of the time between error and repair

35. Structural vs. Functional Strategies
- Structural orientation: modules are constructed, integrated, and tested based on a hierarchical project structure
  - Top-down, bottom-up, sandwich, backbone
- Functional orientation: modules are integrated according to application characteristics or features
  - Threads, critical modules

36. Top-Down Testing Strategy
- Test the top layer, or the controlling subsystem, first
- Then combine all the subsystems that are called by the tested subsystems and test the resulting collection of subsystems
- Repeat until all subsystems are incorporated into the test
- Test stubs are used to simulate the components of lower layers that have not yet been integrated
- No drivers are needed

37. Top-Down Integration Strategy
- Work from the top level (in terms of the "use" or "include" relation) toward the bottom
- No drivers are required if the program is tested from its top-level interface (e.g., GUI, CLI, web app)

38. Top-Down Integration Strategy (continued)
- Write stubs of called or used modules at each step in construction

39. Top-Down Integration Strategy (continued)
- As modules replace stubs, more functionality is testable

40. Top-Down Integration Strategy (continued)
- ... until the program is complete and all functionality can be tested

41. Bottom-Up Testing Strategy
- The subsystems in the lowest layer of the call hierarchy are tested individually
- Then the subsystems in the next layer up that call the previously tested subsystems are integrated and tested
- This is done repeatedly until all subsystems are included in the testing
- Only test drivers are used to simulate the components of higher layers; no test stubs are needed

42. Bottom-Up Integration Strategy
- Starting at the leaves of the "uses" hierarchy, we never need stubs

43. Bottom-Up Integration Strategy (continued)
- ... but we must construct drivers for each module (as in unit testing) ...

44. Bottom-Up Integration Strategy (continued)
- ... an intermediate module replaces a driver, and needs its own driver ...

45. Bottom-Up Integration Strategy (continued)

46. Bottom-Up Integration Strategy (continued)
- ... so we may have several working subsystems ...

47. Bottom-Up Integration Strategy (continued)
- ... that are eventually integrated into a single system

48. Sandwich Testing Strategy
- Combines the top-down strategy with the bottom-up strategy (parallel testing is possible)
- The system is viewed as having three layers:
  - A target layer in the middle
  - A layer above the target (top layer)
  - A layer below the target (bottom layer)
- Testing converges toward the target layer
- No test stubs or drivers are necessary for the top and bottom layers

49. Sandwich Integration Strategy
- Working from the extremes (top and bottom) toward the center, we may use fewer drivers and stubs

50. Sandwich Integration Strategy (continued)
- Sandwich integration is flexible and adaptable, but complex to plan

51. Thread Integration Strategy
- A "thread" is a portion of several modules that together provide a user-visible program feature

52. Thread Integration Strategy (continued)
- Integrating one thread, then another, and so on, we maximize visibility for the user

53. Thread Integration Strategy (continued)
- As in sandwich integration testing, we can minimize stubs and drivers, but the integration plan may be complex

54. Critical Modules Integration Strategy
- Strategy: start with the riskiest modules
  - Risk assessment is a necessary first step
  - May include technical risks (is X feasible?), process risks (is the schedule for X realistic?), and other risks
- May resemble the thread or sandwich process in its tactics for flexible build order
  - E.g., constructing parts of one module to test functionality in another
- The key point is a risk-oriented process
  - Integration testing as a risk-reduction activity, designed to deliver any bad news as early as possible

55. Continuous Testing
- Continuous build:
  - Build from day one
  - Test from day one
  - Integrate from day one
  - The system is always runnable
- Requires integrated tool support:
  - Continuous build server
  - Automated tests with high coverage
  - Tool-supported refactoring
  - Software configuration management
  - Issue tracking

56. Continuous Testing Strategy
(Diagram: the three-layer spreadsheet example – SpreadSheetView (A); Data Model (B), Calculator (C), Currency Converter (D); BinaryFile Storage (E), XMLFile Storage (F), Currency DataBase (G) – assembled in Scrum-style increments, e.g., cells + addition, then sheet view + file storage)

57. Which Integration Strategy Should You Use?
- Factors to consider:
  - Location of critical parts in the system
  - Availability of hardware
  - Availability of components
  - Scheduling concerns
- Bottom-up approach:
  - Good for object-oriented design methodologies
  - Test driver interfaces must match component interfaces
  - Top-level components are usually important and should not be left until the end of testing
  - Detection of design errors is postponed until the end of testing

58. Which Integration Strategy Should You Use? (continued)
- Top-down approach:
  - Test cases can be defined in terms of the functions examined
  - The correctness of the test stubs must be maintained
  - Writing stubs can be difficult
- Functional strategies require more planning
  - Structural strategies (bottom-up, top-down, sandwich) are simpler
  - But thread and critical-modules testing provide better process visibility, especially in complex systems
- It is possible to combine strategies:
  - Top-down, bottom-up, or sandwich are reasonable for relatively small components and subsystems
  - Combinations of thread and critical-modules integration testing are often preferred for larger subsystems

59. Steps in Integration Testing
1. Based on the integration strategy, select a component to be tested. Unit test all the classes in the component.
2. Put the selected components together; do any preliminary fix-up necessary to make the integration test operational (drivers, stubs).
3. Test functional requirements: define test cases that exercise all use cases with the selected component.
4. Test subsystem decomposition: define test cases that exercise all dependencies.
5. Test non-functional requirements: execute performance tests.
6. Keep records of the test cases and testing activities.
7. Repeat steps 1 to 6 until the full system is tested.
The primary goal of integration testing is to identify failures with the (current) component configuration.

60. Summary
- Integration testing focuses on interactions
  - It must be built on a foundation of thorough unit testing
- Integration faults are often traceable to incomplete or misunderstood interface specifications
  - Prefer prevention to detection, and make detection easier by imposing design constraints
- Strategies are tied to the project build order
  - Order construction, integration, and testing to reduce cost or risk

61. System, Acceptance, and Regression Testing

62. Objectives
- Distinguish system and acceptance testing
  - How and why they differ from each other and from unit and integration testing
- Understand basic approaches for quantitative assessment (reliability, performance, ...)
- Understand the interplay of validation and verification for usability and accessibility
  - How to continuously monitor usability from early design to delivery
- Understand basic regression testing approaches
  - Preventing accidental changes

63. System Testing
- Functional testing: validates functional requirements
- Performance testing: validates non-functional requirements
- Acceptance testing: validates the client's expectations
- Installation testing
- Impact of requirements on system testing: when we are system testing, we are testing all subsystems together
  - The more explicit the requirements, the easier they are to test
  - The quality of the use cases determines the ease of functional testing
  - The quality of the nonfunctional requirements and constraints determines the ease of performance testing

64. Types of Testing
- Unit testing:
  - Individual subsystem
  - Carried out by developers (of components)
  - Goal: confirm that the subsystem is correctly coded and carries out the intended functionality
- Integration testing:
  - Groups of subsystems (collections of classes) and eventually the entire system
  - Carried out by developers
  - Goal: test the interfaces and the interplay among the subsystems

65. Types of Testing (continued)
- System testing:
  - The entire system
  - Carried out by developers (testers!)
  - Goal: determine if the system meets the requirements (functional and global)
  - Functional testing: test of functional requirements
  - Performance testing: test of non-functional requirements
- Acceptance and installation testing:
  - Evaluates the system delivered by the developers
  - Carried out by the client
  - Goal: demonstrate that the system meets customer requirements and is ready to use

66. Functional Testing
- Functional testing finds differences between the functional requirements and the implemented system
  - Essentially the same as black-box testing
- Goal: test the functionality of the system
  - Test cases are designed from the requirements analysis document (better: the user manual) and centered around requirements and key functions (use cases)
  - The system is treated as a black box
  - Unit test cases can be reused, but new test cases have to be developed as well
- Select tests that are relevant to the user and have a high probability of uncovering a failure
  - Use techniques like equivalence tests
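Equivalence testing, mentioned in the last bullet, can be sketched in a few lines. The requirement below is hypothetical (an order quantity must lie between 1 and 1000); the point is that one representative test per equivalence class covers the input space without exhaustively enumerating it.

```java
// Minimal sketch of equivalence-class testing for an invented
// functional requirement: "an order quantity must be between 1 and 1000".
public class EquivalenceDemo {
    // The (black-box) function under test
    static boolean isValidQuantity(int q) {
        return q >= 1 && q <= 1000;
    }

    public static void main(String[] args) {
        // One representative per equivalence class:
        if (isValidQuantity(0))      // invalid class: below range
            throw new AssertionError("0 should be rejected");
        if (!isValidQuantity(500))   // valid class: in-range value
            throw new AssertionError("500 should be accepted");
        if (isValidQuantity(1001))   // invalid class: above range
            throw new AssertionError("1001 should be rejected");
        System.out.println("all equivalence classes covered");
    }
}
```

The tests are derived purely from the stated requirement, not from the implementation, which is what makes this a functional (black-box) technique.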

67. Performance Testing
- Stress testing: checks if the system can respond to many simultaneous requests (maximum number of users, peak demands)
- Volume testing: tests what happens if large amounts of data are handled
- Configuration testing: tests the various software and hardware configurations
- Compatibility testing: tests backward compatibility with existing systems
- Security testing: tries to violate security requirements

68. Performance Testing (continued)
- Timing testing: evaluates response times and the time to perform a function
- Environmental testing: tests tolerances for heat, humidity, motion, portability
- Quality testing: tests reliability, maintainability, and availability of the system
- Recovery testing: tests the system's response to the presence of errors or the loss of data
- Human factors testing: tests the user interface with the user

69. Test Cases for Performance Testing
- Goal: try to violate non-functional requirements; push the (integrated) system to its limits
- Goal: try to break the subsystem; test how the system behaves when overloaded
  - Can bottlenecks be identified? (These are the first candidates for redesign in the next iteration)
- Try unusual orders of execution
  - Call a receive() before a send()
- Check the system's response to large volumes of data
  - If the system is supposed to handle 1000 items, try it with 1001 items
- What is the amount of time spent in different use cases?
  - Are typical cases executed in a timely fashion?
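The "1000 items, try 1001" bullet can be made concrete as a capacity boundary test. The ItemStore class below is invented for illustration; the test checks both that the specified capacity is honored and that exceeding it fails in a controlled, observable way rather than silently corrupting state.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a capacity boundary test: the system is specified to
// handle 1000 items, so we test it with 1001. ItemStore is hypothetical.
public class CapacityTest {
    static class ItemStore {
        static final int CAPACITY = 1000;
        private final List<String> items = new ArrayList<>();
        void add(String item) {
            if (items.size() >= CAPACITY)
                throw new IllegalStateException("store is full");
            items.add(item);
        }
    }

    public static void main(String[] args) {
        ItemStore store = new ItemStore();
        // Within spec: 1000 items must be accepted
        for (int i = 0; i < 1000; i++) store.add("item" + i);
        // Beyond spec: the 1001st item must be rejected cleanly
        try {
            store.add("one too many");
            throw new AssertionError("expected rejection at item 1001");
        } catch (IllegalStateException expected) {
            System.out.println("capacity limit enforced at 1000 items");
        }
    }
}
```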

70. Types of Performance Testing
- Stress testing: stress the limits of the system (maximum number of users, peak demands, extended operation)
- Volume testing: test what happens if large amounts of data are handled
- Configuration testing: test the various software and hardware configurations
- Compatibility testing: test backward compatibility with existing systems
- Timing testing: evaluate response times and the time to perform a function
- Security testing: try to violate security requirements
- Environmental testing: test tolerances for heat, humidity, motion
- Quality testing: test reliability, maintainability, and availability
- Recovery testing: test the system's response to the presence of errors or the loss of data
- Human factors testing: test with end users

71. Types of Acceptance Testing
- Acceptance testing is formal testing conducted to determine whether a system satisfies its acceptance criteria
- There are two categories of acceptance testing:
  - User Acceptance Testing (UAT): conducted by the customer to ensure that the system satisfies the contractual acceptance criteria before being signed off as meeting user needs
  - Business Acceptance Testing (BAT): undertaken within the supplier's development organization to ensure that the system will eventually pass user acceptance testing

72. Types of Acceptance Testing (continued)
- Three major objectives of acceptance testing:
  - Confirm that the system meets the agreed-upon criteria
  - Identify and resolve discrepancies, if there are any
  - Determine the readiness of the system for cut-over to live operations

73. Acceptance Testing
- Goal: demonstrate that the system is ready for operational use
  - The choice of tests is made by the client
  - Many tests can be taken from integration testing
  - Acceptance testing is performed by the client, not by the developer
- The majority of all bugs in software is typically found by the client after the system is in use, not by the developers or testers. Therefore, two kinds of additional tests:
  - Alpha test: the sponsor uses the software at the developer's site. The software is used in a controlled setting, with the developer always ready to fix bugs.
  - Beta test: conducted at the sponsor's site (the developer is not present). The software gets a realistic workout in the target environment. A potential customer might get discouraged.

74. System Testing
- Key characteristics:
  - Comprehensive (the whole system, the whole spec)
  - Based on the specification of observable behavior
    - Verification against a requirements specification, not validation, and not opinions
  - Independent of design and implementation
- Independence: avoid repeating software design errors in system test design

75. What is System Testing?

               System                   Acceptance                Regression
Test for ...   Correctness, completion  Usefulness, satisfaction  Accidental changes
Test by ...    Development test group   Test group with users
Kind:          Verification             Validation

76. Independent V&V
- One strategy for maximizing independence: system (and acceptance) testing performed by a different organization
  - Organizationally isolated from the developers: no pressure to say "ok"
  - Sometimes outsourced to another company or agency
    - Especially for critical systems
    - Outsourcing is for independent judgment, not to save money
    - May be an additional system test, not a replacement for internal V&V
- Not all outsourced testing is IV&V
  - It is not independent if it is controlled by the development organization

77. Achieving Independence Without Changing Staff
- If the development organization controls system testing ...
  - Perfect independence may be unattainable, but we can reduce undue influence
- Develop system test cases early
  - As part of the requirements specification, before major design decisions have been made
  - Agile "test first"
  - Conventional "V model"
  - Critical system testing early in the project

78. Incremental System Testing
- System tests are often used to measure progress
  - The system test suite covers all features and scenarios of use
  - As the project progresses, the system passes more and more system tests
- Assumes a "threaded" incremental build plan: features are exposed at the top level as they are developed

79. Global Properties
- Some system properties are inherently global
  - Performance, latency, reliability, ...
  - Early and incremental testing is still necessary, but provides only estimates
- A major focus of system testing
  - The only opportunity to verify global properties against actual system specifications
  - Especially to find unanticipated effects, e.g., an unexpected performance bottleneck

80. Context-Dependent Properties
- Beyond system-global: some properties depend on the system context and use
  - Example: performance properties depend on environment and configuration
  - Example: privacy depends both on the system and on how it is used
    - A medical records system must protect against unauthorized use, and authorization must be provided only as needed
  - Example: security depends on threat profiles
    - And threats change! Testing is just one part of the approach

81 Establishing an Operational Envelope
When a property (e.g., performance or real-time response) is parameterized by use (requests per second, size of the database, ...), extensive stress testing is required: varying parameters within the envelope, near the bounds, and beyond.
Goal: a well-understood model of how the property varies with the parameter
- How sensitive is the property to the parameter?
- Where is the "edge of the envelope"?
- What can we expect when the envelope is exceeded?
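The parameter sweep described above can be sketched in a few lines. This is a minimal illustration, not a real harness: the `latencyMillis` model is a stand-in assumption for a system whose latency degrades sharply past a capacity knee, and all names and numbers are invented for the example.

```java
// Sketch: sweep a load parameter (requests/sec) and record a response
// metric, to map where the "edge of the envelope" lies.
import java.util.LinkedHashMap;
import java.util.Map;

public class EnvelopeSweep {

    // Hypothetical system model (assumption, not a real measurement):
    // latency grows slowly up to an assumed capacity, then sharply beyond it.
    static double latencyMillis(int requestsPerSecond) {
        int capacity = 1000;
        if (requestsPerSecond < capacity) {
            return 10.0 + 0.01 * requestsPerSecond;
        }
        return 20.0 + 0.5 * (requestsPerSecond - capacity);
    }

    // Sweep within the envelope, near the bound, and beyond it.
    static Map<Integer, Double> sweep(int[] loads) {
        Map<Integer, Double> results = new LinkedHashMap<>();
        for (int load : loads) {
            results.put(load, latencyMillis(load));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<Integer, Double> r = sweep(new int[] {100, 500, 900, 1000, 2000});
        r.forEach((load, lat) ->
            System.out.printf("%5d req/s -> %.1f ms%n", load, lat));
    }
}
```

A real sweep would drive the actual system under test and repeat each point enough times to estimate variance, but the shape of the loop is the same.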

82 Stress Testing
- Often requires extensive simulation of the execution environment
  - With systematic variation: what happens when we push the parameters? What if the number of users or requests is 10 times more, or 1000 times more?
- Often requires more resources (human and machine) than typical test cases
  - Kept separate from regular feature tests; run less often, with more manual control
- Diagnose deviations from expectations
  - Which may include difficult debugging of latent faults!

83 Acceptance Testing
- Estimating dependability: measuring quality, not searching for faults
- Quantitative dependability goals are statistical
  - Reliability, availability, mean time to failure, ...
- Requires valid statistical samples from the operational profile
- Fundamentally different from systematic testing: the systematic techniques discussed previously (specification-based testing, structural testing, model-based testing, et al.) are all designed to make the search for faults as effective as possible. They are intentionally "biased" to take more samples where we think faults might be. Statistical measures of dependability require, instead, unbiased samples from the population of operational behaviors.
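The statistical character of dependability goals can be made concrete with the standard zero-failure sample-size formula: to claim per-demand reliability r with confidence c after observing no failures, one needs at least ln(1 - c) / ln(r) failure-free runs drawn from the operational profile. A minimal sketch (the class name and example numbers are illustrative):

```java
public class ReliabilitySampleSize {

    // Number of failure-free, operationally sampled runs needed to claim
    // per-demand reliability r with statistical confidence c (zero-failure
    // model: n >= ln(1 - c) / ln(r)).
    static long failureFreeRuns(double r, double c) {
        return (long) Math.ceil(Math.log(1.0 - c) / Math.log(r));
    }

    public static void main(String[] args) {
        // Demonstrating 0.999 reliability at 95% confidence: ~3,000 runs.
        System.out.println(failureFreeRuns(0.999, 0.95));
        // Ultra-reliability (0.9999999) needs tens of millions of runs,
        // which is why it can seldom be demonstrated by testing.
        System.out.println(failureFreeRuns(0.9999999, 0.95));
    }
}
```

This also previews the later slide's point that ultra-reliability testing may require years of test execution: the required sample size grows roughly as 1/(1 - r).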

84 Statistical Sampling
We need:
- A valid operational profile (model)
  - Sometimes from an older version of the system
  - Sometimes from the operational environment (e.g., for an embedded controller)
  - Sensitivity testing reveals which parameters are most important and which can be rough guesses
- A clear, precise definition of what is being measured
  - Failure rate? Per session, per hour, per operation?
- Many, many random samples
  - Especially for high reliability measures

85 Is Statistical Testing Worthwhile?
Necessary for:
- Critical systems (safety critical, infrastructure, ...)
But difficult or impossible when:
- The operational profile is unavailable or just a guess
  - Often the case for new functionality involving human interaction
  - But we may factor critical functions out of overall use to obtain a good model of only the critical properties
- The reliability requirement is very high
  - The required sample size (number of test cases) might require years of test execution
  - Ultra-reliability can seldom be demonstrated by testing

86 Process-Based Measures
- Less rigorous than statistical testing; based on similarity with prior projects
- System testing process: expected history of bugs found and resolved
- Alpha, beta testing
  - Alpha testing: real users, controlled environment
  - Beta testing: real users, real (uncontrolled) environment
  - May statistically sample users rather than uses
  - Expected history of bug reports
Note: An early release of half-baked software is not what we mean by alpha and beta testing. Today "alpha" and "beta" are often used informally, but here we use them in their established technical sense for a testing process. An alpha test brings users on-site to use the system in a controlled environment. A beta test provides the software to a controlled sample of users who use the system in their own environment. In both cases, to make any reasonable inference of dependability we need a valid sample of users. Using the history of system testing was discussed in Chapter 20, Planning and Monitoring.

87 UI Testing ("Acceptance")
- Automated UI testing ("automation")
  - Scripts that drive your app and look for failures; a black-box system test
- Manual tests
  - Human beings click through predetermined paths
  - The specific tests must be written down each time
- Ad hoc tests
  - Human beings are "turned loose" on the app to see if they can break it

88 Usability Testing
A usable product:
- is quickly learned
- allows users to work efficiently
- is pleasant to use
Objective criteria:
- Time and number of operations to perform a task
- Frequency of user error
- Plus overall, subjective satisfaction

89 Load Testing
- How many hits/requests should the system be able to handle?
- What should its performance be under those circumstances?
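A minimal load-test driver can answer the first question in miniature: fire many concurrent requests at the system and count how many complete. This sketch uses a trivial stand-in handler (an assumption; a real driver would call the actual service and also measure latency):

```java
// Minimal load-driver sketch: submit N concurrent "requests" to a thread
// pool and report how many succeed. The handler below is a placeholder.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadDriver {

    // Stand-in for the system under test; a real test would issue a
    // network request here and check the response.
    static boolean handleRequest() {
        return true;
    }

    static int run(int totalRequests, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger ok = new AtomicInteger();
        for (int i = 0; i < totalRequests; i++) {
            pool.submit(() -> {
                if (handleRequest()) ok.incrementAndGet();
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok.get();
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 20) + " / 1000 requests succeeded");
    }
}
```

Scaling `totalRequests` and `threads` (10x, 1000x) is exactly the systematic variation the stress-testing slide calls for.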

90 Accessibility Testing
- Check usability by people with disabilities: blind and low vision, deaf, color-blind, ...
- Use accessibility guidelines
  - Direct usability testing with all relevant groups is usually impractical; checking compliance to guidelines is practical and often reveals problems
  - Example: W3C Web Content Accessibility Guidelines
- Parts can be checked automatically, but a manual check is still required
  - e.g., is the "alt" text of an image meaningful?
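The automatable part of the alt-text check above can be sketched with a simple scan: flag `<img>` tags whose `alt` attribute is missing or empty. (A regex is a rough assumption here; a production checker would use a real HTML parser, and judging whether a present alt text is meaningful still requires a human.)

```java
// Sketch: flag <img> tags with a missing or empty alt attribute.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AltTextCheck {
    private static final Pattern IMG =
        Pattern.compile("<img\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern ALT =
        Pattern.compile("alt\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns the offending <img> tags found in the given HTML fragment.
    static List<String> missingAlt(String html) {
        List<String> offenders = new ArrayList<>();
        Matcher m = IMG.matcher(html);
        while (m.find()) {
            String tag = m.group();
            Matcher a = ALT.matcher(tag);
            if (!a.find() || a.group(1).trim().isEmpty()) {
                offenders.add(tag);
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        String page = "<img src=\"logo.png\" alt=\"Company logo\">"
                    + "<img src=\"spacer.gif\">"
                    + "<img src=\"chart.png\" alt=\"\">";
        System.out.println(missingAlt(page).size()
            + " image(s) missing alt text");
    }
}
```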

91 Installation Testing
Before the testing:
- Configure the system
- Attach the proper number and kind of devices
- Establish communication with other systems
The testing:
- Regression tests to verify that the system has been installed properly and works

92 Regression Testing
- "Yesterday it worked, today it doesn't"
- "I was fixing X, and accidentally broke Y"
- "That bug was fixed, but now it's back"
Tests must be re-run after any change:
- Adding new features
- Changing or adapting the software to new conditions
- Fixing other bugs
Regression testing can be a major cost of software maintenance.

93 Basic Problems of Regression Testing
- Maintaining the test suite
  - If I change feature X, how many test cases must be revised because they use feature X?
  - Which test cases should be removed or replaced? Which should be added?
- Cost of re-testing
  - Often proportional to product size, not change size
  - A big problem if testing requires manual effort

94 Test Case Maintenance
- Some maintenance is inevitable
  - If feature X has changed, test cases for feature X will require updating
- Some maintenance should be avoided
  - Example: trivial changes to a user interface or file format should not invalidate large numbers of test cases
- Test suites should be modular! Avoid unnecessary dependence
  - Generating concrete test cases from test case specifications can help
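The last point can be made concrete: keep the test *specification* abstract, and generate the concrete input from it. Then a trivial format change touches only the generator, not every test case. This is an illustrative sketch; the `Spec` fields and CSV format are invented for the example.

```java
// Sketch: abstract test specs plus a generator for concrete inputs.
// If the input format changes (say, CSV to JSON), only toCsvInput
// changes -- the specs themselves are untouched.
import java.util.List;

public class SpecDrivenTests {

    // Abstract test specification: the intent of the test,
    // independent of any concrete input format.
    static class Spec {
        final String item;
        final int quantity;
        final double expectedTotal;
        Spec(String item, int quantity, double expectedTotal) {
            this.item = item;
            this.quantity = quantity;
            this.expectedTotal = expectedTotal;
        }
    }

    // Generator from spec to concrete test input.
    static String toCsvInput(Spec s) {
        return s.item + "," + s.quantity;
    }

    public static void main(String[] args) {
        List<Spec> specs = List.of(
            new Spec("widget", 2, 19.98),
            new Spec("gadget", 1, 5.00));
        for (Spec s : specs) {
            System.out.println(toCsvInput(s)
                + " -> expect total " + s.expectedTotal);
        }
    }
}
```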

95 Obsolete and Redundant Test Cases
- Obsolete: a test case that is no longer valid
  - Tests features that have been modified, substituted, or removed
  - Should be removed from the test suite
- Redundant: a test case that does not differ significantly from others
  - Unlikely to find a fault missed by similar test cases
  - Has some cost in re-execution, and some (maybe more) cost in human effort to maintain
  - May or may not be removed, depending on costs

96 Selecting and Prioritizing Regression Test Cases
- Should we re-run the whole regression test suite? If so, in what order?
- Maybe you don't care: if you can re-run everything automatically over a lunch break, do it
- Sometimes you do care ...
  - Selection matters when test cases are expensive to execute
  - Prioritization matters when a very large test suite cannot be executed every day

97 Code-Based Regression Test Selection
- Observation: a test case can't find a fault in code it doesn't execute
  - In a large system, many parts of the code are untouched by many test cases
- So: only execute test cases that execute changed or new code
Note: Code-based regression test selection tends to be helpful for large software systems with many independent features (e.g., Eclipse with its plug-ins, Microsoft Word with its many tools). Almost every test case executes the core parts of the application, so if the application core is changed, code-based regression test selection degenerates into "retest all".
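The selection rule above reduces to a set intersection: given coverage data mapping each test case to the code units it executes, keep exactly the tests that touch a changed unit. A minimal sketch (test and unit names are illustrative; real coverage data would come from an instrumentation tool):

```java
// Sketch: code-based regression test selection as a coverage lookup.
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RegressionSelector {

    // Select the tests whose covered units intersect the changed units.
    static Set<String> select(Map<String, Set<String>> coverage,
                              Set<String> changedUnits) {
        Set<String> selected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : coverage.entrySet()) {
            if (!Collections.disjoint(e.getValue(), changedUnits)) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> coverage = Map.of(
            "testCheckout", Set.of("Cart", "Payment"),
            "testSearch",   Set.of("Index", "Query"),
            "testLogin",    Set.of("Auth"));

        // Only Payment changed, so only testCheckout must be re-run.
        System.out.println(select(coverage, Set.of("Payment")));
    }
}
```

Note how the degenerate case in the slide falls out of the same code: if every test covers a unit named "Core" and "Core" changes, `select` returns the whole suite.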

98 Specification-Based Regression Test Selection
- Like code-based regression test case selection: pick test cases that test new and changed functionality
- Difference: no guarantee of independence
  - A test case that isn't "for" changed or added feature X might find a bug in feature X anyway
- Typical approach: specification-based prioritization
  - Execute all test cases, but start with those related to changed and added features

99 Prioritized Rotating Selection
Basic idea: execute all test cases eventually, but execute some sooner than others.
Possible priority schemes:
- Round robin: priority to the least-recently-run test cases
- Track record: priority to test cases that have detected faults before
- Structural: priority for executing elements that have not been recently executed
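The first two schemes compose naturally into one ordering: stalest tests first (round robin), with fault-detection history (track record) breaking ties. A minimal sketch, with invented field names and example data:

```java
// Sketch: order tests by staleness, then by historical fault detections.
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class TestPrioritizer {

    static class TestCase {
        final String name;
        final int lastRunDaysAgo;   // larger = staler (round-robin key)
        final int faultsFound;      // historical detections (track record)
        TestCase(String name, int lastRunDaysAgo, int faultsFound) {
            this.name = name;
            this.lastRunDaysAgo = lastRunDaysAgo;
            this.faultsFound = faultsFound;
        }
    }

    static List<String> prioritize(List<TestCase> tests) {
        return tests.stream()
            .sorted(Comparator
                .comparingInt((TestCase t) -> t.lastRunDaysAgo).reversed()
                .thenComparing(Comparator
                    .comparingInt((TestCase t) -> t.faultsFound).reversed()))
            .map(t -> t.name)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<TestCase> suite = List.of(
            new TestCase("testA", 1, 5),
            new TestCase("testB", 7, 0),
            new TestCase("testC", 7, 3));
        // testC and testB are stalest; testC wins the tie on track record.
        System.out.println(prioritize(suite));
    }
}
```

A structural scheme would add a third key, e.g. how recently each covered code element was last exercised, to the same comparator chain.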

100 Summary
- System testing is verification
  - Is the system consistent with the specification?
  - Especially for global properties (performance, reliability)
- Acceptance testing is validation
  - Includes user testing and checks for usability
- Usability and accessibility require both
  - Usability testing establishes objective criteria to verify throughout development
- Regression testing is repeated after each change
  - And after initial delivery, as the software evolves

101 Reading
- Chapter 21.1, 21.2 of the textbook

102 Next Class
Topic: Software quality assurance vs. system testing; statistics and metrics. Review.
Reading:
- Text: Chapters 20, 24
- Articles on the reading list
Assignments and exams:
- Assignment 9: Due May 30
- Final exam: June 1-7
- Take-home exam/paper: June 6 [For SE 433 students only]