Pure function
In computer programming, a pure function is a function that has the following properties:[1][2]
1. Its return value is the same for the same arguments (no variation with local static variables, non-local variables, mutable reference arguments or input streams from I/O devices).
2. Its evaluation has no side effects (no mutation of local static variables, non-local variables, mutable reference arguments or I/O streams).
Thus a pure function is a computational analogue of a mathematical function. Some authors, particularly from the imperative language community, use the term "pure" for all functions that just have the above property 2[3][4] (discussed below).
Examples
Pure functions
The following examples of C++ functions are pure:
- floor, returning the floor of a number;
- max, returning the maximum of two values;
- the function f, defined as
void f() { static std::atomic<unsigned int> x = 0; ++x; }
- Although this code sample looks like it is not pure, it actually is. The value of x can only be observed inside other invocations of f(), and as f() does not communicate the value of x to its environment, it is indistinguishable from the function void f() {} that does nothing. Note that x is std::atomic so that modifications from multiple threads executing f() concurrently do not result in a data race, which has undefined behavior in C and C++.
Pure functions can be implemented in Wolfram Language using # to refer to inputs and & to indicate the end of the function. For example, the following code defines the Factorial function, such that Factorial[n] is equivalent to n!.[5]
Factorial = If[#1 == 1, 1, #1 #0[#1 - 1]] &
#1 refers to the first input, while #0 refers to the function itself, making a pure recursive function simple to implement.
Impure functions
The following C++ functions are impure as they lack the above property 1:
- because of return value variation with a non-local variable
int f() { return x; }
- For the same reason, the C++ library function sin(), for example, is not pure, since its result depends on the IEEE rounding mode, which can be changed at runtime.
- because of return value variation with a mutable reference argument
int f(int* x) { return *x; }
- because of inconsistent defined/undefined behavior:
void f() { static int x = 0; ++x; }
- Overflow of a signed integer is undefined behavior per the C++ specification. Also, if f() is called concurrently, the code exhibits a data race. Pure functions can fail or never return, but they must do so consistently (for the same input). However, f() may or may not fail, depending on whether the upper bound of the allowed signed int value was reached or a data race happened.
The following C++ functions are impure as they lack the above property 2:
- because of mutation of a local static variable
void f() { static int x = 0; ++x; }
- because of mutation of a non-local variable
void f() { ++x; }
- because of mutation of a mutable reference argument
void f(int* x) { ++*x; }
- because of mutation of an output stream
void f() { std::cout << "Hello, world!" << std::endl; }
The following C++ functions are impure as they lack both the above properties 1 and 2:
- because of return value variation with a local static variable and mutation of a local static variable
int f() { static int x = 0; ++x; return x; }
- because of return value variation with an input stream and mutation of an input stream
int f() { int x = 0; std::cin >> x; return x; }
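As a rough illustration of the difference, the sketch below contrasts a function that lacks both properties with a pure counterpart that receives its state as an argument. The names next_id_impure, next_id_pure and demo_difference are hypothetical and chosen only for this example; unsigned arithmetic is used so that wrap-around stays well defined.

```cpp
#include <cassert>

// Impure: the result depends on, and mutates, a local static variable.
unsigned next_id_impure() {
    static unsigned counter = 0;
    ++counter;
    return counter;
}

// Pure counterpart (hypothetical): the state is passed in explicitly,
// so equal arguments always produce equal results and nothing is mutated.
unsigned next_id_pure(unsigned counter) {
    return counter + 1;
}

// Demonstration helper: true when two consecutive calls of the impure
// version disagree while the pure version agrees with itself.
bool demo_difference() {
    unsigned a = next_id_impure();
    unsigned b = next_id_impure();
    return a != b && next_id_pure(0) == next_id_pure(0);
}
```

The caller of next_id_pure becomes responsible for carrying the counter from one call to the next, which is the usual price of making the state explicit.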
I/O in pure functions
I/O is inherently impure: input operations undermine referential transparency, and output operations create side effects. Nevertheless, there is a sense in which a function can perform input or output and still be pure, if the sequence of operations on the relevant I/O devices is modeled explicitly as both an argument and a result, and I/O operations are taken to fail when the input sequence does not describe the operations actually taken since the program began execution.
The second point ensures that the only sequence usable as an argument must change with each I/O action; the first allows different calls to an I/O-performing function to return different results on account of the sequence arguments having changed.[6][7]
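The idea of passing the I/O history in and out can be sketched in C++ as follows. This is a simplified model under stated assumptions, not a real I/O mechanism: IoState, read_int and write_line are hypothetical names, no device is touched, and the rule that an operation fails on a stale history is not modeled.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the I/O history: the inputs still to be consumed
// and the outputs produced so far.
struct IoState {
    std::vector<int> pending_input;
    std::vector<std::string> output_log;
};

// "Reads" one integer: returns the value together with a new state in which
// that input has been consumed. The state is taken by value, so the
// caller's copy is left unchanged.
std::pair<int, IoState> read_int(IoState s) {
    int value = s.pending_input.front();  // precondition: input is non-empty
    s.pending_input.erase(s.pending_input.begin());
    return {value, std::move(s)};
}

// "Writes" a line: returns a new state with the text appended to the log.
IoState write_line(IoState s, const std::string& text) {
    s.output_log.push_back(text);
    return s;
}
```

Two calls to read_int can return different values only because they are given different states; calling it twice with the same state yields the same pair both times.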
The I/O monad is a programming idiom typically used to perform I/O in pure functional languages.
Compiler optimizations
Functions that have just the above property 2 allow for compiler optimization techniques such as common subexpression elimination and loop optimization, similar to arithmetic operators.[3] A C++ example is the length method, returning the size of a string, which depends on the contents of the memory the string points to and therefore lacks the above property 1. Nevertheless, in a single-threaded environment, the following C++ code
std::string s = "Hello, world!";
int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
int l = 0;
for (int i = 0; i < 10; ++i) {
l += s.length() + a[i];
}
can be optimized such that the value of s.length() is computed only once, before the loop.
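The effect of that optimization can be written out by hand. In the sketch below, sum_naive and sum_hoisted are hypothetical names: the first mirrors the loop from the text, the second shows what the transformed loop might look like. Whether a particular compiler actually performs this transformation is not claimed here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The loop as written in the text: s.length() is evaluated on every iteration.
std::size_t sum_naive(const std::string& s, const int (&a)[10]) {
    std::size_t l = 0;
    for (int i = 0; i < 10; ++i) {
        l += s.length() + a[i];
    }
    return l;
}

// Hand-written equivalent of the optimized form: the length is read once,
// which is valid because nothing in the loop modifies s.
std::size_t sum_hoisted(const std::string& s, const int (&a)[10]) {
    const std::size_t len = s.length();
    std::size_t l = 0;
    for (int i = 0; i < 10; ++i) {
        l += len + a[i];
    }
    return l;
}
```

Both functions return the same value for the same arguments; the hoisting is only safe because length has no side effects and the string is not changed while the loop runs.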
In Fortran, the pure keyword can be used to declare a function to be just side-effect free (i.e. have just the above property 2).
Unit testing
Since pure functions have the same return value for the same arguments, they are well suited to unit testing.
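A minimal sketch of such a test, using only assert from the standard library, might look like this. The function clamp_to_range and the test test_clamp_to_range are hypothetical examples, not taken from any particular codebase.

```cpp
#include <cassert>

// Hypothetical pure function under test: clamps v to the range [lo, hi].
int clamp_to_range(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

// The test needs no setup, teardown or mocking: each check is just a
// comparison of a call against its expected value.
void test_clamp_to_range() {
    assert(clamp_to_range(5, 0, 10) == 5);
    assert(clamp_to_range(-3, 0, 10) == 0);
    assert(clamp_to_range(42, 0, 10) == 10);
    // Repeating a call cannot change the outcome.
    assert(clamp_to_range(42, 0, 10) == clamp_to_range(42, 0, 10));
}
```

Because no hidden state is involved, the checks can run in any order, and a failing check can be reproduced from its arguments alone.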
See also
- Compile time function execution: the evaluation of pure functions at compile time
- Deterministic algorithm
- Purely functional data structure
- Lambda calculus
- Side effect (computer science)
- Pure procedure
- Idempotence
- pure keyword in Fortran annotating pure functions
- constexpr keyword in C++ annotating pure functions usable at compile-time
References
- Bartosz Milewski (2013). "Basics of Haskell". School of Haskell. FP Complete. Archived from the original on 2016-10-27. Retrieved 2018-07-13.
Here are the fundamental properties of a pure function: 1. A function returns exactly the same result every time it's called with the same set of arguments. In other words a function has no state, nor can it access any external state. Every time you call it, it behaves like a newborn baby with blank memory and no knowledge of the external world. 2. A function has no side effects. Calling a function once is the same as calling it twice and discarding the result of the first call.
- Brian Lonsdorf (2015). "Professor Frisby's Mostly Adequate Guide to Functional Programming". GitHub. Retrieved 2020-03-20.
A pure function is a function that, given the same input, will always return the same output and does not have any observable side effect.
- "GCC 8.1 Manual". GCC, the GNU Compiler Collection. Free Software Foundation, Inc. 2018. Retrieved 2018-06-28.
- Fortran 95 language features#Pure Procedures
- Wolfram Research (1988). "Slot—Wolfram Language Documentation". reference.wolfram.com. Retrieved 2021-02-01.
- Peyton Jones, Simon L. (2003). Haskell 98 Language and Libraries: The Revised Report (PDF). Cambridge, United Kingdom: Cambridge University Press. p. 95. ISBN 0-521-82614-4. Retrieved 17 July 2014.
- Hanus, Michael. "Curry: An Integrated Functional Logic Language" (PDF). www-ps.informatik.uni-kiel.de. Institut für Informatik, Christian-Albrechts-Universität zu Kiel. p. 33. Archived from the original (PDF) on 25 July 2014. Retrieved 17 July 2014.