Exhaustive (or complete) software testing means executing every statement in the program and every possible path, with every possible combination of data. It soon becomes clear that exhaustive testing is not feasible. This raises two questions: (i) When are we done with testing? (ii) How do we know that we have tested enough? The answers depend on time, cost, customer expectations, quality requirements, and so on. This section explores why exhaustive testing is not possible. Instead, we should concentrate on effective testing, which uses efficient techniques so that the important features are tested within the available resources.
The testing process should be understood as a domain of possible tests (see Fig. 1), from which subsets of tests can be drawn. The domain itself is effectively infinite, which is why we cannot test every possible combination.
Because the set of possible tests is effectively infinite, processing resources and time are never sufficient to run all of them. Complete testing would require an organization to invest far more time than is cost-effective. Therefore, testing must be performed on selected subsets that fit within the constrained resources. Selecting these subsets, rather than attempting the whole domain, is what makes software testing effective. Effectiveness improves further when the subsets are chosen according to the factors that matter in the particular environment.
Domain of Possible Inputs to the Software Is Too Large to Test
Even if we consider input data as the only part of the testing domain, we cannot test every combination of inputs. The domain of input data has four sub-parts: (a) valid inputs, (b) invalid inputs, (c) edited inputs, and (d) race condition inputs (see Fig. 2).
Valid inputs: It may seem that we can test every valid input. Consider a very simple example: adding two two-digit numbers. Each number ranges from -99 to 99 (199 values in total), so the number of test case combinations is $199 \times 199 = 39601$. If we increase the range from two digits to four digits, each number can take 19,999 values and the number of test cases becomes 399,960,001. Most addition programs accept numbers of 8 or 10 digits, or more. How can we test all these combinations of valid inputs? Testing software with valid data is known as positive testing. Positive testing is always performed with the valid range or limits of the test data in view.
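The arithmetic above can be checked with a short sketch. The function name `pair_count` and the rate of one million test executions per second are illustrative assumptions, not figures from the text.

```python
def pair_count(digits: int) -> int:
    """Number of ordered input pairs when adding two signed integers
    that each have at most `digits` decimal digits."""
    values = 2 * (10 ** digits - 1) + 1  # e.g. -99..99 gives 199 values
    return values * values


# Assumed (hypothetical) execution rate, used only to give a sense of scale.
TESTS_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365 * 24 * 3600

for digits in (2, 4, 8, 10):
    pairs = pair_count(digits)
    years = pairs / TESTS_PER_SECOND / SECONDS_PER_YEAR
    print(f"{digits:>2} digits: {pairs:,} pairs, about {years:.3g} years")
```

Under that assumed rate, the two-digit and four-digit cases finish quickly, but the eight-digit case already needs more than a thousand years.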
Invalid inputs: Valid inputs are only one part of the input domain. Invalid inputs must also be tested if the software is to be tested effectively. Testing software with invalid data is known as negative testing. Its purpose is to confirm that the software still behaves properly when it is given an invalid set of data; in other words, negative testing tries to break the software. What matters here is how the program responds when a user feeds it invalid inputs. The set of invalid inputs is also too large to test. In the example of adding two numbers, the following possibilities may occur:
(i) Numbers out of range
(ii) Combination of letters and digits
(iii) Combination of letters only
(iv) Combination of control characters
(v) Combination of any other key on the keyboard.
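As a rough illustration of negative testing, the sketch below feeds one sample from each category above to a hypothetical adder. The function `add_two_digit`, its error messages, and the sample strings are assumptions made for this example. One sample per category shows the idea only and covers a vanishingly small part of the invalid-input domain.

```python
def add_two_digit(a: str, b: str) -> int:
    """Hypothetical adder: accepts integers from -99 to 99 given as text."""
    total = 0
    for text in (a, b):
        try:
            value = int(text)
        except ValueError:
            raise ValueError(f"not an integer: {text!r}") from None
        if not -99 <= value <= 99:
            raise ValueError(f"out of range: {value}")
        total += value
    return total


# One illustrative sample per category of invalid input.
INVALID_SAMPLES = {
    "out of range": "100",
    "letters and digits": "12ab",
    "letters only": "abc",
    "control characters": "\x07\x1b",
    "other keys": "#$%",
}


def is_rejected(text: str) -> bool:
    """Return True if the adder refuses `text` as its first operand."""
    try:
        add_two_digit(text, "1")
    except ValueError:
        return True
    return False


for category, sample in INVALID_SAMPLES.items():
    print(f"{category}: rejected={is_rejected(sample)}")
```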
Edited inputs: If inputs can be edited while they are being provided to the program, many unexpected input events may occur. For example, a user can add many spaces to the input that are not visible, which may cause the program to malfunction. In another example, a user may press a number key, then press the Backspace key repeatedly, and finally, after some time, press another number key and Enter. If the program does not handle this sequence, its input buffer may overflow and the system may crash.
User behavior cannot be predicted. Users can behave in many ways that expose defects the tests did not anticipate. That is why edited inputs cannot be tested completely either.
Race condition inputs: The timing variation between two or more inputs is another issue that limits testing. For example, suppose there are two input events, A and B. According to the design, A precedes B in most cases. However, B can arrive first under rare and restricted conditions. A race condition occurs whenever B precedes A. The program usually fails in this situation because the possibility of B arriving first was not handled, resulting in a race condition bug. There may be many such race conditions in a system, especially in multiprocessing and interactive systems. Race conditions are among the least tested kinds of input.
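The A-before-B scenario can be sketched by replaying every arrival order against a component that silently assumes A comes first. The `Session` class and its events are hypothetical. The sketch simulates orderings deterministically rather than reproducing real timing, which is much harder to control in practice.

```python
from itertools import permutations
from math import factorial


class Session:
    """Hypothetical component that expects event A (open) before B (write)."""

    def __init__(self) -> None:
        self.opened = False
        self.written = []

    def handle(self, event: str) -> None:
        if event == "A":
            self.opened = True
        elif event == "B":
            if not self.opened:
                # The rare ordering the design did not account for.
                raise RuntimeError("B arrived before A")
            self.written.append("data")


def failing_orders(events):
    """Return every arrival order of `events` that makes Session fail."""
    failures = []
    for order in permutations(events):
        session = Session()
        try:
            for event in order:
                session.handle(event)
        except RuntimeError:
            failures.append(order)
    return failures


print(failing_orders(["A", "B"]))  # [('B', 'A')]
print(factorial(10))               # 3628800 arrival orders for 10 events
```

With only two events there are two orders to try, but the number of orders grows factorially with the number of events, which is one reason race conditions are rarely tested thoroughly.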
There Are Too Many Possible Paths Through the Program to Test
A program path can be traced through the code from the start of a program to its termination. Two paths differ if the program executes different statements in each, or executes the same statements in a different order. A tester may think that if all possible control-flow paths through the program are executed, the program can be said to be completely tested. However, there are two flaws in this reasoning.
(i) The number of unique logic paths through a program is too large. Myers [2] demonstrated this with the example shown in Fig. 3: a program of 10 to 20 statements consisting of a DO loop that iterates up to 20 times, with a set of nested IF statements in the body of the loop. The number of paths from point A to point B is approximately $10^{14}$. Testing all of these paths is impractical, as it could take years to complete.
(ii) Complete path testing, even if it could somehow be performed, does not guarantee that the program is free of errors. For example, it does not show that a program matches its specification. If a developer is asked to write a program that sorts in ascending order but mistakenly produces one that sorts in descending order, exhaustive path testing will be of little value. A program may also be incorrect because of missing paths, and exhaustive path testing cannot detect a path that does not exist in the code.
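Both points can be illustrated with a small sketch. For point (i), the count assumes five distinct routes through the nested IF statements on each iteration, the figure usually quoted for Myers's example; the text above does not state this number, so treat it as an assumption. For point (ii), `sort_ascending` is a hypothetical function with a deliberate bug.

```python
def path_count(paths_per_iteration: int, max_iterations: int) -> int:
    """Total paths through a loop that runs 1..max_iterations times,
    assuming each iteration independently takes one of
    `paths_per_iteration` routes through the loop body."""
    return sum(paths_per_iteration ** k for k in range(1, max_iterations + 1))


# Assumption: 5 routes through the loop body, up to 20 iterations.
total = path_count(5, 20)
print(f"{total:,}")  # 119,209,289,550,780, roughly 10^14


def sort_ascending(items):
    """Intended to sort ascending, but mistakenly sorts descending."""
    return sorted(items, reverse=True)


# The only path through sort_ascending is executed here, so path coverage
# is complete, yet the result still violates the specification.
result = sort_ascending([3, 1, 2])
print(result == [1, 2, 3])  # False
```

Executing every path only shows that the code runs; detecting the sorting bug requires comparing the output against what the specification demands.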