Unspecified behavior

Unspecified behavior is behavior that may vary on different implementations of a programming language. A program can be said to contain unspecified behavior when its source code may produce an executable that exhibits different behavior when compiled on a different compiler, or on the same compiler with different settings, or indeed in different parts of the same executable. While the respective language standards or specifications may impose a range of possible behaviors, the exact behavior depends on the implementation and may not be completely determined upon examination of the program's source code.[1] Unspecified behavior will often not manifest itself in the resulting program's external behavior, but it may sometimes lead to differing outputs or results, potentially causing portability problems.

Definition

To enable compilers to produce optimal code for their respective target platforms, programming language standards do not always impose a certain specific behavior for a given source code construct.[2] Failing to explicitly define the exact behavior of every possible program is not considered an error or weakness in the language specification, and doing so would be infeasible.[1] In the C and C++ languages, such non-portable constructs are generally grouped into three categories: Implementation-defined, unspecified, and undefined behavior.[3]

The exact definition of unspecified behavior varies. In C++, it is defined as "behavior, for a well-formed program construct and correct data, that depends on the implementation."[4] The C++ Standard also notes that the range of possible behaviors is usually provided.[4] Unlike implementation-defined behavior, there is no requirement for the implementation to document its behavior.[4] Similarly, the C Standard defines it as behavior for which the standard "provides two or more possibilities and imposes no further requirements on which is chosen in any instance".[5] Unspecified behavior is different from undefined behavior. The latter is typically a result of an erroneous program construct or data, and no requirements are placed on the translation or execution of such constructs.[6]

Implementation-defined behavior

C and C++ distinguish implementation-defined behavior from unspecified behavior. For implementation-defined behavior, the implementation must choose a particular behavior and document it. An example in C/C++ is the size of integer data types. The choice of behavior must be consistent with the documented behavior within a given execution of the program.

Examples

Order of evaluation of subexpressions

Many programming languages do not specify the order of evaluation of the sub-expressions of a complete expression. This non-determinism can allow optimal implementations for specific platforms e.g. to utilise parallelism. If one or more of the sub-expressions has side effects, then the result of evaluating the full-expression may be different depending on the order of evaluation of the sub-expressions.[1] For example, given

a = f(b) + g(b);

, where f and g both modify b, the result stored in a may be different depending on whether f(b) or g(b) is evaluated first.[1] In the C and C++ languages, this also applies to function arguments. Example:[2]

#include <iostream>
int f() {
  std::cout << "In f\n";
  return 3;
}

int g() {
  std::cout << "In g\n";
  return 4;
}

int sum(int i, int j) {
  return i + j;
}

int main() {
  return sum(f(), g()); 
}

The resulting program will write its two lines of output in an unspecified order.[2] In other languages, such as Java, the order of evaluation of operands and function arguments is explicitly defined.[7]

See also

References

  1. ISO/IEC (2009-05-29). ISO/IEC PDTR 24772.2: Guidance to Avoiding Vulnerabilities in Programming Languages through Language Selection and Use
  2. Becker, Pete (2006-05-16). "Living By the Rules". Dr. Dobb's Journal. Retrieved 26 November 2009.
  3. Henricson, Mats; Nyquist, Erik (1997). Industrial Strength C++. Prentice Hall. ISBN 0-13-120965-5.
  4. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §1.3.13 unspecified behavior [defns.unspecified]
  5. ISO/IEC (1999). ISO/IEC 9899:1999(E): Programming Languages - C §3.4.4 para 1
  6. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §1.3.12 undefined behavior [defns.undefined]
  7. James Gosling, Bill Joy, Guy Steele, and Gilad Bracha (2005). The Java Language Specification, Third Edition. Addison-Wesley. ISBN 0-321-24678-0
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.