Stress testing
Stress testing (sometimes called torture testing) is a form of deliberately intense or thorough testing used to determine the stability of a given system, critical infrastructure or entity. It involves testing beyond normal operational capacity, often to a breaking point, in order to observe the results. Reasons can include:
- to determine breaking points or safe usage limits
- to confirm that a mathematical model is accurate enough in predicting breaking points or safe usage limits
- to confirm intended specifications are being met
- to determine modes of failure (how exactly a system fails)
- to test stable operation of a part or system outside standard usage
Reliability engineers often test items under expected stress or even under accelerated stress in order to determine the operating life of the item or to determine modes of failure.[1]
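As an illustration of how accelerated stress relates to operating life, the sketch below computes a temperature acceleration factor with the Arrhenius model, one commonly used model for temperature-driven failure mechanisms. The article does not name a specific model, so the choice of Arrhenius, the function name and the 0.7 eV activation energy are all assumptions made for the example.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of time-to-failure at use temperature to that at stress temperature.

    Assumes a single failure mechanism that follows the Arrhenius model.
    """
    use_k = use_temp_c + 273.15        # convert Celsius to kelvin
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


# Illustrative values only: 0.7 eV activation energy, 55 °C use, 125 °C stress.
factor = arrhenius_acceleration_factor(0.7, 55.0, 125.0)
print(f"1000 h at 125 °C is roughly equivalent to {1000 * factor:.0f} h at 55 °C")
```

Under these assumptions one hour at the stress temperature stands in for many hours of normal use, which is why accelerated tests can estimate operating life in a practical time frame. The estimate is only as good as the assumed activation energy and the assumption that the failure mode does not change at the higher temperature.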
The term "stress" may have a more specific meaning in certain fields, such as materials science, and stress testing may therefore sometimes have a technical meaning; one example is fatigue testing of materials.
Computing
Hardware
Stress testing, in general, should put computer hardware under exaggerated levels of stress in order to ensure stability when used in a normal environment. The stresses applied can include extremes of workload, type of task, memory use, thermal load (heat), clock speed, or voltage. Memory and the CPU are two components that are commonly stress tested in this way.
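The basic idea behind a memory stress test can be sketched as a write-and-verify loop over fixed bit patterns. This is a minimal sketch only: the function name, buffer size and pattern set are assumptions, and a user-space Python program cannot control which physical memory it touches, so it is not a substitute for a dedicated tool such as Memtest86+.

```python
def pattern_test(size_bytes=8 * 1024 * 1024, passes=2):
    """Fill a buffer with fixed bit patterns and check that every byte reads back.

    Returns a list of (pass_index, pattern) pairs that failed verification;
    an empty list means no mismatch was observed.
    """
    patterns = (0x00, 0xFF, 0xAA, 0x55)  # all zeros, all ones, alternating bits
    buffer = bytearray(size_bytes)
    failures = []
    for pass_index in range(passes):
        for pattern in patterns:
            buffer[:] = bytes([pattern]) * size_bytes  # write phase
            if buffer.count(pattern) != size_bytes:     # read-back phase
                failures.append((pass_index, pattern))
    return failures


if __name__ == "__main__":
    bad = pattern_test()
    print("no mismatches observed" if not bad else f"mismatches: {bad}")
```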
There is considerable overlap between stress testing software and benchmarking software, since both seek to assess and measure maximum performance. Of the two, stress testing software aims to test stability by trying to force a system to fail; benchmarking aims to measure and assess the maximum performance possible at a given task or function.
When the operating parameters of a CPU are modified, for example by overclocking, underclocking, overvolting or undervolting, or when its operating temperature changes, it may be necessary to verify whether the new parameters (usually CPU core voltage and frequency) are suitable for heavy CPU loads. This is done by running a CPU-intensive program for extended periods of time, to test whether the computer hangs or crashes. CPU stress testing is also referred to as torture testing. Software that is suitable for torture testing should typically run instructions that utilise the entire chip rather than only a few of its units. Stress testing a CPU over the course of 24 hours at 100% load is, in most cases, sufficient to determine that the CPU will function correctly in normal usage scenarios such as in a desktop computer, where CPU usage typically fluctuates at low levels (50% and under).
Hardware stress testing and stability are subjective and may vary according to how the system will be used. A stress test for a system that runs 24/7 or that will perform error-sensitive tasks such as distributed computing or "folding" projects may differ from one for a system that only needs to run a single game with reasonable reliability. For example, a comprehensive guide on overclocking Sandy Bridge found that:[2]
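The approach described above, running a CPU-intensive program for an extended period and watching for faults, can be sketched as follows. The workload (a chain of SHA-256 hashes), the function names and the parameters are assumptions chosen for illustration. This toy workload exercises only a small part of the chip, so it does not meet the "entire chip" criterion that dedicated torture-test software aims for.

```python
import hashlib
import os
import time
from concurrent.futures import ProcessPoolExecutor


def workload(rounds=20_000):
    """Deterministic CPU-bound task: a chain of SHA-256 hashes."""
    digest = b"stress-test-seed"
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def burn(duration_s):
    """Repeat the workload until the deadline; return (runs, mismatches)."""
    reference = workload()  # computed on the same machine, so itself unverified
    deadline = time.monotonic() + duration_s
    runs = mismatches = 0
    while time.monotonic() < deadline:
        runs += 1
        if workload() != reference:  # a differing result suggests a compute error
            mismatches += 1
    return runs, mismatches


def stress_cpu(duration_s, workers=0):
    """Run burn() in one process per logical CPU (or `workers` if given).

    Returns True if no worker observed a mismatch. A hang or crash of the
    machine, the other failure mode of interest, is not something this
    function can report.
    """
    workers = workers or os.cpu_count() or 1
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(burn, [duration_s] * workers))
    return all(mismatches == 0 for _, mismatches in results)
```

A call such as `stress_cpu(60.0)` should be placed under an `if __name__ == "__main__":` guard so that worker processes can be started safely on every platform. Comparing each result against a reference is what distinguishes this from simply keeping the CPU busy: an unstable overclock may produce wrong answers without crashing.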
Even though in the past IntelBurnTest was just as good, it seems that something in the SB uArch [Sandy Bridge microarchitecture] is more heavily stressed with Prime95 ... IBT really does pull more power [make greater thermal demands]. But ... Prime95 failed first every time, and it failed when IBT would pass. So same as Sandy Bridge, Prime95 is a better stability tester for Sandy Bridge-E than IBT/LinX.
Stability is subjective; some might call stability enough to run their game, other like folders [folding projects] might need something that is just as stable as it was at stock, and ... would need to run Prime95 for at least 12 hours to a day or two to deem that stable ... There are [bench testers] who really don't care for stability like that and will just say if it can [complete] a benchmark it is stable enough. No one is wrong and no one is right. Stability is subjective. [But] 24/7 stability is not subjective.
An engineer at ASUS advised, in a 2012 article on overclocking an Intel X79 system, that it is important to choose testing software carefully in order to obtain useful results:[3]
Unvalidated stress tests are not advised (such as Prime95 or LinX or other comparable applications). For high grade CPU/IMC and System Bus testing Aida64 is recommended along with general applications usage like PC Mark 7. Aida has an advantage as its stability test has been designed for the Sandy Bridge E architecture and test specific functions like AES, AVX and other instruction sets that prime and like synthetics do not touch. As such not only does it load the CPU 100% but will also test other parts of CPU not used under applications like Prime 95. Other applications to consider are SiSoft 2012 or Passmark BurnIn. Be advised validation has not been completed using Prime 95 version 26 and LinX (10.3.7.012) and OCCT 4.1.0 beta 1 but once we have internally tested to ensure at least limited support and operation.
Software commonly used in stress testing
- Aida
- IBM Teleprocessing Network Simulator
- IBM Workload Simulator
- Intel processor diagnostic test
- Intel Burn Test
- LinX (AVX)
- Memtest86+ – memory
- OCCT
- Passmark Burn-in
- Prime95, and derivatives such as HyperPi – CPU/heat
- Siege
- S&M
- Tsung – free software tool
Software
In software testing, a system stress test refers to tests that put a greater emphasis on robustness, availability, and error handling under a heavy load, rather than on what would be considered correct behavior under normal circumstances. In particular, the goals of such tests may be to ensure the software does not crash in conditions of insufficient computational resources (such as memory or disk space), unusually high concurrency, or denial of service attacks.
Examples:
- A web server may be stress tested using scripts, bots, and various denial of service tools to observe the performance of a web site during peak loads. Such tests generally last under an hour, or run until the limit of traffic that the web server can tolerate is found.
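A scripted web server stress test of the kind described in the example above can be sketched as a loop that raises the number of concurrent clients until requests start to fail. Everything here is an assumption made for illustration: the throwaway local server stands in for the system under test, and the function names, concurrency levels and timeout are hypothetical. Such a script should only be pointed at systems the tester is authorised to load.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Opener that ignores proxy settings so local requests go straight to the server.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class QuietHandler(BaseHTTPRequestHandler):
    """Minimal stand-in for the system under test."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def start_local_server():
    """Start the stand-in server on a free local port in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), QuietHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(url, timeout_s):
    """Issue one GET; return (succeeded, latency_in_seconds)."""
    start = time.perf_counter()
    try:
        with OPENER.open(url, timeout=timeout_s) as response:
            response.read()
            ok = response.status == 200
    except Exception:
        ok = False  # timeouts, refused connections and HTTP errors count as failures
    return ok, time.perf_counter() - start


def ramp(url, levels=(1, 4, 16, 64), requests_per_level=50, timeout_s=5.0):
    """Increase concurrency level by level; stop at the first level with failures."""
    report = []
    for concurrency in levels:
        with ThreadPoolExecutor(max_workers=concurrency) as pool:
            results = list(pool.map(lambda _: fetch(url, timeout_s), range(requests_per_level)))
        failures = sum(1 for ok, _ in results if not ok)
        worst_latency = max(latency for _, latency in results)
        report.append((concurrency, failures, worst_latency))
        if failures:
            break
    return report


if __name__ == "__main__":
    server = start_local_server()
    url = f"http://127.0.0.1:{server.server_address[1]}/"
    for concurrency, failures, worst in ramp(url):
        print(f"concurrency={concurrency} failures={failures} worst={worst:.3f}s")
    server.shutdown()
```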
Stress testing may be contrasted with load testing:
- Load testing examines the entire environment and database while measuring response time, whereas stress testing focuses on identified transactions, pushing them to a level at which transactions or systems break.
- During stress testing, if transactions are selectively stressed, the database may not experience much load, but the transactions are heavily stressed. On the other hand, during load testing the database experiences a heavy load, while some transactions may not be stressed.
- System stress testing, also known simply as stress testing, loads the system with concurrent users beyond the level it can handle, so that it breaks at the weakest link within the entire system.
Critical infrastructure
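The "weakest link" idea in the comparison above can be illustrated with a toy simulation in which each tier of a system has a fixed capacity and the simulated user count is raised until one tier is exceeded. The tier names, capacities and function names are hypothetical, and a real system degrades in more complicated ways than a hard limit.

```python
def overloaded_components(capacities, users):
    """Return the components whose capacity is exceeded at this user count."""
    return sorted(name for name, limit in capacities.items() if users > limit)


def find_breaking_point(capacities, step=10, max_users=10_000):
    """Raise the simulated user count until at least one component is overloaded.

    Returns (users, overloaded) at the first failing level, or None if
    nothing breaks up to max_users.
    """
    for users in range(step, max_users + 1, step):
        broken = overloaded_components(capacities, users)
        if broken:
            return users, broken
    return None


# Hypothetical capacities (maximum concurrent users each tier can serve).
CAPACITIES = {"web": 500, "app": 300, "db": 120}
print(find_breaking_point(CAPACITIES))  # (130, ['db'])
```

In this sketch a load test would stay at or below the expected user count and measure response times, while the stress test keeps increasing the load specifically to find which component gives way first.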
Critical infrastructure (CI) such as highways, railways, electric power networks, dams, port facilities, major gas pipelines or oil refineries is exposed to multiple natural and human-induced hazards and stressors, including earthquakes, landslides, floods, tsunamis, wildfires, climate change effects or explosions. These stressors and abrupt events can cause failures and losses, and hence can interrupt essential services for society and the economy.[4] Therefore, CI owners and operators need to identify and quantify the risks associated with the CIs under different stressors, in order to define mitigation strategies[5] and improve the resilience of the CIs.[6][7] Stress tests are advanced and standardised tools for hazard and risk assessment of CIs that cover both low-probability high-consequence (LP-HC) events and so-called extreme or rare events, and that are applied systematically to classes of CI.
Stress testing is the process of assessing the ability of a CI to maintain a certain level of functionality under unfavourable conditions. Stress tests consider LP-HC events, which are not always accounted for in the design and risk assessment procedures commonly adopted by public authorities or industrial stakeholders. A multilevel stress test methodology for CI has been developed in the framework of the European research project STREST,[8] consisting of four phases:[9]
Phase 1: Preassessment, during which the data available on the CI (risk context) and on the phenomena of interest (hazard context) are collected. The goal and objectives, the time frame, the stress test level and the total costs of the stress test are defined.
Phase 2: Assessment, during which the stress test is performed at component and system scope, including fragility[10] and risk[11] analysis of the CIs for the stressors defined in Phase 1. The stress test can result in one of three outcomes: Pass, Partly Pass or Fail, based on the comparison of the quantified risks to acceptable risk exposure levels and a penalty system.
Phase 3: Decision, during which the results of the stress test are analysed according to the goal and objectives defined in Phase 1. Critical events (events that most likely cause the exceedance of a given level of loss) and risk mitigation strategies are identified.
Phase 4: Report, during which the stress test outcome and risk mitigation guidelines based on the findings established in Phase 3 are formulated and presented to the stakeholders.
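The comparison made in Phase 2 can be sketched as a simple grading function. This is a sketch under stated assumptions, not the STREST grading scheme itself: the article says only that quantified risk is compared to acceptable risk exposure levels, so the use of exactly two thresholds (here called `acceptable` and `tolerable`), the function name and the example values are hypothetical, and the penalty system is not modelled.

```python
def grade_stress_test(risk, acceptable, tolerable):
    """Map a quantified risk to Pass, Partly Pass or Fail.

    risk, acceptable and tolerable must share one unit, for example the
    annual probability of exceeding a given loss level.
    """
    if not 0 <= acceptable <= tolerable:
        raise ValueError("expected 0 <= acceptable <= tolerable")
    if risk <= acceptable:
        return "Pass"         # risk is within the acceptable level
    if risk <= tolerable:
        return "Partly Pass"  # above acceptable but not beyond the upper bound
    return "Fail"             # risk exceeds the upper bound


# Hypothetical thresholds expressed as annual exceedance probabilities.
for risk in (5e-5, 5e-4, 5e-3):
    print(risk, grade_stress_test(risk, acceptable=1e-4, tolerable=1e-3))
```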
This stress-testing methodology has been demonstrated on six CIs in Europe at component and system level:[12] an oil refinery and petrochemical plant in Milazzo, Italy; a conceptual alpine earth-fill dam in Switzerland; the Baku–Tbilisi–Ceyhan pipeline in Turkey; part of the Gasunie national gas storage and distribution network in the Netherlands; the port infrastructure of Thessaloniki, Greece; and an industrial district in the region of Tuscany, Italy. The outcome of the stress testing included the definition of critical components and events and of risk mitigation strategies, which were formulated and reported to stakeholders.
See also
- Burn-in
- Destructive testing
- Load and performance test tools
- Black box testing
- Load testing
- Software performance testing
- Scenario analysis
- Simulation
- Software testing
- White box testing
- Technischer Überwachungsverein (TÜV) – product testing and certification
- Concurrency testing using the CHESS model checker
- Jinx automates stress testing by automatically exploring unlikely execution scenarios.
- Highly accelerated life test
References
- Nelson, Wayne B. (2004). Accelerated Testing – Statistical Models, Test Plans, and Data Analysis. New York: John Wiley & Sons. ISBN 0-471-69736-2.
- Sin0822 (2011-12-24). "Sandy Bridge E Overclocking Guide: Walk through, Explanations, and Support for all X79". overclock.net. Retrieved 2 February 2013. (some text condensed)
- Juan Jose Guerrero III - ASUS (2012-03-29). "Intel X79 Motherboard Overclocking Guide". benchmarkreviews.com. Retrieved 2 February 2013.
- Pescaroli, Gianluca; Alexander, David (2016-05-01). "Critical infrastructure, panarchies and the vulnerability paths of cascading disasters". Natural Hazards. 82 (1): 175–192. doi:10.1007/s11069-016-2186-3. ISSN 1573-0840.
- Mignan, A.; Karvounis, D.; Broccardo, M.; Wiemer, S.; Giardini, D. (March 2019). "Including seismic risk mitigation measures into the Levelized Cost Of Electricity in enhanced geothermal systems for optimal siting". Applied Energy. 238: 831–850. doi:10.1016/j.apenergy.2019.01.109.
- Linkov, Igor; Bridges, Todd; Creutzig, Felix; Decker, Jennifer; Fox-Lent, Cate; Kröger, Wolfgang; Lambert, James H.; Levermann, Anders; Montreuil, Benoit; Nathwani, Jatin; Nyer, Raymond (June 2014). "Changing the resilience paradigm". Nature Climate Change. 4 (6): 407–409. Bibcode:2014NatCC...4..407L. doi:10.1038/nclimate2227. ISSN 1758-6798.
- Argyroudis, Sotirios A.; Mitoulis, Stergios A.; Hofer, Lorenzo; Zanini, Mariano Angelo; Tubaldi, Enrico; Frangopol, Dan M. (April 2020). "Resilience assessment framework for critical infrastructure in a multi-hazard environment: Case study on transport assets" (PDF). Science of the Total Environment. 714: 136854. Bibcode:2020ScTEn.714m6854A. doi:10.1016/j.scitotenv.2020.136854. PMID 32018987.
- "STREST-Harmonized approach to stress tests for critical infrastructures against natural hazards. Funded from the European Union's Seventh Framework Programme FP7/2007-2013, under grant agreement no. 603389. Project Coordinator: Domenico Giardini; Project Manager: Arnaud Mignan, ETH Zurich".
- Esposito, Simona; Stojadinović, Božidar; Babič, Anže; Dolšek, Matjaž; Iqbal, Sarfraz; Selva, Jacopo; Broccardo, Marco; Mignan, Arnaud; Giardini, Domenico (2020-03-01). "Risk-Based Multilevel Methodology to Stress Test Critical Infrastructure Systems". Journal of Infrastructure Systems. 26 (1): 04019035. doi:10.1061/(ASCE)IS.1943-555X.0000520.
- Pitilakis, K.; Crowley, H.; Kaynia, A.M., eds. (2014). SYNER-G: Typology Definition and Fragility Functions for Physical Elements at Seismic Risk. Geotechnical, Geological and Earthquake Engineering. 27. Dordrecht: Springer Netherlands. doi:10.1007/978-94-007-7872-6. ISBN 978-94-007-7871-9. S2CID 133078584.
- Pitilakis, K.; Franchin, P.; Khazai, B.; Wenzel, H., eds. (2014). SYNER-G: Systemic Seismic Vulnerability and Risk Assessment of Complex Urban, Utility, Lifeline Systems and Critical Facilities. Geotechnical, Geological and Earthquake Engineering. 31. Dordrecht: Springer Netherlands. doi:10.1007/978-94-017-8835-9. ISBN 978-94-017-8834-2. S2CID 107566163.
- Argyroudis, Sotirios A.; Fotopoulou, Stavroula; Karafagka, Stella; Pitilakis, Kyriazis; Selva, Jacopo; Salzano, Ernesto; Basco, Anna; Crowley, Helen; Rodrigues, Daniela; Matos, José P.; Schleiss, Anton J. (2020). "A risk-based multi-level stress test methodology: application to six critical non-nuclear infrastructures in Europe" (PDF). Natural Hazards. 100 (2): 595–633. doi:10.1007/s11069-019-03828-5. ISSN 1573-0840. S2CID 209432723.