TY - GEN
T1 - On the effectiveness of random testing for Android
T2 - 13th ACM/IEEE International Workshop on Automation of Software Test, AST 2018
AU - Patel, Priyam
AU - Srinivasan, Gokul
AU - Rahaman, Sydur
AU - Neamtiu, Iulian
N1 - Publisher Copyright:
© 2018 ACM.
PY - 2018/5/28
Y1 - 2018/5/28
AB - Random testing of Android apps is attractive due to its ease of use and scalability, but its effectiveness has been questioned. Prior studies have shown that Monkey - a simple approach and tool for random testing of Android apps - is surprisingly effective, "beating" much more sophisticated tools by achieving higher coverage. We study how Monkey's parameters affect code coverage (at the class, method, block, and line levels) and set out to answer several research questions centered on improving the effectiveness of Monkey-based random testing in Android and on how it compares with manual exploration. First, we show that random stress testing via Monkey is extremely efficient (85 seconds on average) and effective at crashing apps, including 15 widely used apps that have millions (or even billions) of installs. Second, we vary Monkey's event distribution to change app behavior and measure the resulting coverage. We find that, except for isolated cases, altering Monkey's default event distribution is unlikely to lead to higher coverage. Third, we manually explore 62 apps and compare the resulting coverage; we find that coverage achieved via manual exploration is just 2-3% higher than that achieved via Monkey exploration. Finally, our analysis shows that coarse-grained coverage is highly indicative of fine-grained coverage; hence coarse-grained coverage, which imposes low collection overhead, hits a performance vs. accuracy sweet spot.
KW - code coverage
KW - Google Android
KW - mobile applications
KW - random testing
KW - stress testing
UR - http://www.scopus.com/inward/record.url?scp=85051234181&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051234181&partnerID=8YFLogxK
DO - 10.1145/3194733.3194742
M3 - Conference contribution
AN - SCOPUS:85051234181
T3 - Proceedings - International Conference on Software Engineering
SP - 34
EP - 37
BT - Proceedings - 2018 ACM/IEEE 13th International Workshop on Automation of Software Test, AST 2018
PB - Association for Computing Machinery
Y2 - 28 May 2018 through 29 May 2018
ER -