ISTQB Advanced Level Syllabi
5. Testing of Software Characteristics
Accessibility testing, accuracy testing, efficiency testing, heuristic evaluation, interoperability testing, maintainability testing, operational acceptance test (OAT), operational profile, portability testing, recoverability testing, reliability growth model, reliability testing, security testing, suitability testing, SUMI, usability testing
While the previous chapter described specific techniques available to the tester, this chapter considers the application of those techniques in evaluating the principal attributes used to describe the quality of software applications or systems.
In this syllabus the quality attributes which may be evaluated by a Test Analyst and Technical Test Analyst are considered in separate sections. The description of quality attributes provided in ISO 9126 is used as a guide to describing the attributes.
An understanding of the various quality attributes is a basic learning objective of all three modules. Depending on the specific quality attribute covered, a deeper understanding is developed in either the test analyst or the technical test analyst module, so that typical risks can be recognized, appropriate testing strategies developed and test cases specified.
Functional testing is focused on "what" the product does. The test basis for functional testing is generally a requirements or specification document, specific domain expertise or implied need. Functional tests vary according to the test level or phase in which they are conducted. For example, a functional test conducted during integration testing will test the functionality of interfacing modules which implement a single defined function. At the system test level, functional tests include testing the functionality of the application as a whole. For systems of systems, functional testing will focus primarily on end to end testing across the integrated systems.
A wide variety of test techniques is employed during functional test (see section 4). Functional testing may be performed by a dedicated tester, a domain expert, or a developer (usually at the component level).
The following quality attributes are considered:
Functional accuracy involves testing the application's adherence to the specified or implied requirements and may also include computational accuracy. Accuracy testing employs many of the test techniques explained in chapter 4.
Suitability testing involves evaluating and validating the appropriateness of a set of functions for its intended specified tasks. This testing can be based on use cases or procedures.
Interoperability testing tests whether a given application can function correctly in all intended target environments (hardware, software, middleware, operating system, etc.). Specifying tests for interoperability requires that combinations of the intended target environments are identified, configured and available to the test team. These environments are then tested using a selection of functional test cases which exercises the various components present in the environment.
Interoperability relates to how different software systems interact with each other. Software with good interoperability characteristics can be integrated easily with a number of other systems without requiring major changes. The number of changes and the effort required to perform those changes may be used as a measure of interoperability.
Testing for software interoperability may, for example, focus on the following design features:
Interoperability testing may be particularly significant for
This form of testing is primarily performed in system integration testing.
Functional security testing (penetration testing) focuses on the ability of the software to prevent unauthorized access, whether accidental or deliberate, to functions and data. User rights, access and privileges are included in this testing. This information should be available in the specifications for the system. Security testing also includes a number of aspects which are more relevant for Technical Test Analysts and are discussed in section 5.3 below.
It is important to understand why users might have difficulty using the proposed software system. To do this it is first necessary to appreciate that the term "user" may apply to a wide range of different types of persons, ranging from IT experts to children or people with disabilities.
Some national institutions (e.g. the British Royal National Institute for the Blind), recommend that web pages are accessible for disabled, blind, partially sighted, mobility impaired, deaf and cognitivelydisabled users. Checking that applications and web sites are usable for the above users, would also improve the usability for everyone else.
Usability testing measures the suitability of the software for its users, and is directed at measuring the following factors with which specified users can achieve specified goals in particular environments or contexts of use:
Attributes that may be measured are:
Usability evaluation has two purposes:
Tester skills should include expertise or knowledge in the following areas:
Performing validation of the actual implementation should be done under conditions as close as possible to those under which the system will be used. This may involve setting up a usability lab with video cameras, mock up offices, review panels, users, etc. so that development staff can observe the effect of the actual system on real people.
Many usability tests may be executed as part of other tests, for example during functional system test. To achieve a consistent approach to the detection and reporting of usability faults in all stages of the lifecycle, usability guidelines may be helpful.
188.8.131.52 Usability Test Specification
Principal techniques for usability testing are:
Inspection evaluation or review
Inspection or review of the specification and designs from a usability perspective that increase the user’s level of involvement can be cost effective in finding problems early.
Heuristic Evaluation (systematic inspection of a user interface design for usability) can be used to find the usability problems in the design so that they can be attended to as part of an iterative design process. This involves having a small set of evaluators examine the interface and judge its compliance with recognized usability principles (the "heuristics").
Validation of the actual implementation
For performing validation of the actual implementation, tests specified for functional system test may be developed as usability test scenarios. These test scenarios measure specific usability attributes, such as speed of learning or operability, rather than functional outcomes.
Test scenarios for usability may be developed to specifically test syntax and semantics.
Techniques used to develop these test scenarios may include:
Test scenarios for usability testing include user instructions, allowance of time for pre and post test interviews for giving instructions and receiving feedback and an agreed protocol for running the sessions. This protocol includes a description of how the test will be carried out, timings, note taking and session logging, and the interview and survey methods to be used.
Surveys and questionnaires
Survey and questionnaire techniques may be applied to gather observations of user behavior with the system in a usability test lab. Standardized and publicly available surveys such as SUMI (Software Usability Measurement Inventory) and WAMMI (Website Analysis and MeasureMent Inventory) permit benchmarking against a database of previous usability measurements. In addition, since SUMI provides concrete measurements of usability, this provides a good opportunity to use them as completion / acceptance criteria.
It is important to consider the accessibility of software to those with particular requirements or restrictions in its use. This includes those with disabilities. It should consider the relevant standards, such as the Web Content Accessibility Guidelines, and legislation, such as Disability Discrimination Acts (UK, Australia) and Section 508 (US).
Quality attributes for Technical Test Analysts focus on "how" the product works, rather than the functional aspects of "what" it does. These tests can take place at any test level, but have particular relevance for:
Component test (especially real time and embedded systems)
System Test and Operational Acceptance Test (OAT)
Frequently, the tests continue to be executed after the software has entered production, often by a separate team or organization. Measurements of quality attributes gathered in pre-production tests may form the basis for Service Level Agreements (SLA) between the supplier and the operator of the software system.
The following quality attributes are considered:
Security testing differs from other forms of domain or technical testing in two significant areas:
Many security vulnerabilities exist where the software not only functions as designed, but also performs extra actions which are not intended. These side-effects represent one of the biggest threats to software security. For example, a media player which correctly plays audio but does so by writing files out to unencrypted temporary storage exhibits a side-effect which may be exploited by software pirates.
The principal concern regarding security is to prevent information from unintended use by unauthorized persons. Security testing attempts to compromise a system’s security policy by assessing a system’s vulnerability to threats, such as
Particular security concerns may be grouped as follows:
It should be noted that improvements which may be made to the security of a system may affect its performance. After making security improvements it is advisable to consider the repetition of performance tests
184.108.40.206 Security Test Specification
The following approach may be used to develop security tests.
An objective of reliability testing is to monitor a statistical measure of software maturity over time and compare this to a desired reliability goal. The measures may take the form of a Mean Time Between Failures (MTBF), Mean Time to Repair (MTTR) or any other form of failure intensity measurement (e.g. number of failures per week of a particular severity). The development of the monitored values over time can be expressed as a Reliability Growth Model.
Reliability testing may take the form of a repeated set of predetermined tests, random tests selected from a pool or test cases generated by a statistical model. As a result, these tests may take a significant time (in the order of days).
Analysis of the software maturity measures over time may be used as exit criteria (e.g. for production release). Specific measures such as MTBF and MTTR may be established as Service Level Agreements and monitored in production.
Software Reliability Engineering and Testing (SRET) is a standard approach for reliability testing.
220.127.116.11 Tests for Robustness
While functional testing may evaluate the software’s tolerance to faults in terms of handling unexpected input values (so-called negative tests), technically oriented tests evaluate a system’s tolerance to faults which occur externally to the application under test. Such faults are typically reported by the operating system (e.g. disk full, process or service not available, file not found, memory not available). Tests of fault tolerance at the system level may be supported by specific tools.
18.104.22.168 Recoverability Testing
Further forms of reliability testing evaluate the software system’s ability to recover from hardware or software failures in a predetermined manner which subsequently allows normal operations to be resumed. Recoverability tests include Failover and Backup & Restore tests.
Failover tests are performed where the consequences of a software failure are so high that specific hardware and/or software measures have been implemented to ensure system operation even in the event of failure. Failover tests may be applicable, for example, where the risk of financial losses is extreme or where critical safety issues exist. Where failures may result from catastrophic events this form of recoverability testing may also be called "disaster recovery" testing.
Typical hardware measures might include load balancing across several processors, clustering servers, processors or disks so that one can immediately take over from another should it fail (e.g. RAID: Redundant Array of Inexpensive Disks). A typical software measure might be the implementation of more than one independent instance of a software system (for example, an aircraft’s flight control system) in so-called redundant dissimilar systems. Redundant systems are typically a combination of software and hardware measures and may be called duplex, triplex or quadruplex systems, depending on the number of independent instances (2, 3 or 4 respectively). The dissimilar aspect for the software is achieved when the same software requirements are provided to two (or more) independent and not connected development teams, with the objective of having the same services provided with different software. This protects the redundant dissimilar systems in that a similar defective input is less likely to have the same result. These measures taken to improve the recoverability of a system may directly influence its reliability as well and may also be considered when performing reliability testing.
Failover testing is designed to explicitly test such systems by simulating failure modes or performing them in a controlled environment. Following failure the failover mechanism is tested to ensure that data is not lost or corrupted and that any agreed service levels are maintained (e.g. function availability, response times). For more information on failover testing, see www.testingstandards.co.uk.
Backup and Restore tests focus on the procedural measures set up to minimize the effects of a failure. Such tests evaluate the procedures (usually documented in a manual) for taking different forms of backup and for restoring that data should a loss or corruption of data take place. Test cases are designed to ensure that critical paths through the procedure are covered. Technical reviews may be performed to "dry-run" these scenarios and validate the manuals against the actual installation procedure. Operational Acceptance Tests (OAT) exercise the scenarios in a production or production like environment to validate their actual use.
Measures for Backup and Restore tests may include the following:
22.214.171.124 Reliability Test Specification
Reliability tests are mostly based on patterns of use (sometimes referred to as "Operational Profiles") and can be performed formally or according to risk. Test data may be generated using random or pseudo-random methods.
The choice of reliability growth curve should be justified and tools can be used to analyze a set of failure data to determine the reliability growth curve that most closely fits the currently available data. Reliability tests may specifically look for memory leaks. The specification of such tests requires that particular memory-intensive actions be executed repeatedly to ensure that reserved memory is correctly released.
The efficiency quality attribute is evaluated by conducting tests focused on time and resource behavior. Efficiency testing relating to time behavior is covered below under the aspects of performance, load, stress and scalability testing.
126.96.36.199 Performance Testing
Performance testing in general may be categorized into different test types according to the nonfunctional requirements in focus. Test types include performance, load, stress and scalability tests.
Specific performance testing focuses on the ability of a component or system to respond to user or system inputs within a specified time and under specified conditions (see also load and stress below). Performance measurements vary according to the objectives of the test. For individual software components performance may be measured according to CPU cycles, while for client-based systems performance may be measured according to the time taken to respond to a particular user request. For systems whose architectures consist of several components (e.g. clients, servers, databases) performance measurements are taken between individual components so that performance "bottlenecks" can be identified.
188.8.131.52 Load Testing
Load testing focuses on the ability of a system to handle increasing levels of anticipated realistic loads resulting from the transaction requests generated by numbers of parallel users. Average response times of users under different scenarios of typical use (operational profiles) can be measured and analyzed. See also [Splaine01]
There are two sub-types of load testing, multi-user (with realistic numbers of users) and volume testing (with large numbers of users). Load testing looks at both response times and network throughput.
184.108.40.206 Stress Testing
Stress testing focuses on the ability of a system to handle peak loads at or beyond maximum capacity. System performance should degrade slowly and predictably without failure as stress levels are increased. In particular, the functional integrity of the system should be tested while the system is under stress in order to find possible faults in functional processing or data inconsistencies.
One possible objective of stress testing is to discover the limits at which the system actually fails so that the "weakest link in the chain" can be determined. Stress testing allows additional components to be added to the system in a timely manner (e.g. memory, CPU capability, database storage).
In spike testing, combinations of conditions which may result in a sudden extreme load being placed on the system are simulated. "Bounce tests" apply several such spikes to the system with periods of low usage between the spikes. These tests will determine how well the system handles changes of loads and whether it is able to claim and release resources as needed. See also [Splaine01].
220.127.116.11 Scalability Testing
Scalability testing focuses on the ability of a system to meet future efficiency requirements, which may be beyond those currently required. The objective of the tests is to judge the system’s ability to grow (e.g. with more users, larger amounts of data stored) without exceeding agreed limits or failing. Once these limits are known, threshold values can be set and monitored in production to provide a warning of impending problems.
18.104.22.168 Test of Resource Utilization
Efficiency tests relating to resource utilization evaluate the usage of system resources (e.g. memory space, disk capacity and network bandwidth). These are compared under both normal loads and stress situations, such as high levels of transaction and data volumes.
For example for real-time embedded systems memory usage (sometimes referred to as a "memory footprint") plays a significant role in performance testing.
22.214.171.124 Efficiency Test Specification
The specification of tests for efficiency test types such as performance, load and stress are based on the definition of operational profiles. These represent distinct forms of user behavior when interacting with an application. There may be several operational profiles for a given application.
The numbers of users per operational profile may be obtained by using monitoring tools (where the actual or comparable application is already available) or by predicting usage. Such predictions may be based on algorithms or provided by the business organization, and are especially important for specifying the operational profile for scalability testing.
Operational profiles are the basis for test cases and are typically generated using test tools. In this case the term "virtual user" is typically used to represent a simulated user within the operational profile.
Maintainability tests in general relate to the ease with which software can be analyzed, changed and tested. Appropriate techniques for maintainability testing include static analysis and checklists.
126.96.36.199 Dynamic Maintainability Testing
Dynamic maintainability testing focuses on the documented procedures developed for maintaining a particular application (e.g. for performing software upgrades). Selections of maintenance scenariosare used as test cases to ensure the required service levels are attainable with the documented procedures.
This form of testing is particularly relevant where the underlying infrastructure is complex, and support procedures may involve multiple departments/organizations. This form of testing may take place as part of Operational Acceptance Testing (OAT). [www.testingstandards.co.uk]
188.8.131.52 Analyzability (corrective maintenance)
This form of maintainability testing focuses on measuring the time taken to diagnose and fix problems identified within a system. A simple measure can be the mean time taken to diagnose and fix an identified fault.
184.108.40.206 Changeability, Stability and Testability (adaptive maintenance)
The maintainability of a system can also be measured in terms of the effort required to make changes to that system (e.g. code changes). Since the effort required is dependent on a number of factors such as software design methodology (e.g. object orientation), coding standards etc., this form of maintainability testing may also be performed by analysis or review. Testability relates specifically to the effort required to test the changes made. Stability relates specifically to the system’s response to change. Systems with low stability exhibit large numbers of "knock-on" problems (also known as "ripple effect") whenever a change is made. [ISO9126] [www.testingstandards.co.uk]
Portability tests in general relate to the ease with which software can be transferred into its intended environment, either initially or from an existing environment. Portability tests include tests for installability, co-existence/compatibility, adaptability and replaceability.
220.127.116.11 Installability Testing
Installability testing is conducted on the software used to install other software on its target environment. This may include, for example, the software developed to install an operating system onto a processor, or an installation "wizard" used to install a product onto a client PC. Typical installability testing objectives include:
Functionality testing is normally conducted after the installation test to detect any faults which may have been introduced by the installation (e.g. incorrect configurations, functions not available). Usability testing is normally conducted in parallel to installability testing (e.g. to validate that users are provided with understandable instructions and feedback/error messages during the installation).
Computer systems which are not related to each other are said to be compatible when they can run in the same environment (e.g. on the same hardware) without affecting each other's behavior (e.g. resource conflicts). Compatibility tests may be performed, for example, where new or upgradedsoftware is rolled-out into environments (e.g. servers) which already contain installed applications.
Compatibility problems may arise where the application is tested in an environment where it is the only installed application (where incompatibility issues are not detectable) and then deployed onto another environment (e.g. production) which also runs other applications.
Typical compatibility testing objectives include:
Compatibility testing is normally performed when system and user acceptance testing have been successfully completed.
18.104.22.168 Adaptability Testing
Adaptability testing tests whether a given application can function correctly in all intended target environments (hardware, software, middleware, operating system, etc.). Specifying tests for adaptability requires that combinations of the intended target environments are identified, configured and available to the test team. These environments are then tested using a selection of functional test cases which exercise the various components present in the environment.
Adaptability may relate to the ability of software to be ported to various specified environments by performing a predefined procedure. Tests may evaluate this procedure.
Adaptability tests may be performed in conjunction with installability tests and are typically followed by functional tests to detect any faults which may have been introduced in adapting the software to a different environment.
22.214.171.124 Replaceability Testing
Replaceability focuses on the ability of software components within a system to be exchanged for others. This may be particularly relevant for systems which use commercial off-the-shelf software (COTS) for specific system components.
Replaceability tests may be performed in parallel to functional integration tests where more than one alternative component is available for integration into the complete system. Replaceability may be evaluated by technical review or inspection, where the emphasis is placed on the clear definition of interfaces to potential replacement components.