User-defined function
A user-defined function (UDF) is a function provided by the user of a program or environment, in a context where the usual assumption is that functions are built into the program or environment.
BASIC language
In some old implementations of the BASIC programming language, user-defined functions are defined using the "DEF FN" syntax. More modern dialects of BASIC are influenced by the structured programming paradigm, where most or all of the code is written as user-defined functions or procedures, and the concept becomes practically redundant.
Databases
In relational database management systems, a user-defined function provides a mechanism for extending the functionality of the database server by adding a function that can be evaluated in query-language (usually SQL) statements. The SQL standard distinguishes between scalar and table functions. A scalar function returns only a single value (or NULL), whereas a table function returns a (relational) table comprising zero or more rows, each row with one or more columns.
User-defined functions in SQL are declared using the CREATE FUNCTION statement. For example, a function that converts Celsius to Fahrenheit might be declared like this:
CREATE FUNCTION dbo.CtoF(Celsius FLOAT)
RETURNS FLOAT
RETURN (Celsius * 1.8) + 32
Once created, a user-defined function may be used in expressions in SQL statements: it can be invoked wherever most intrinsic functions are allowed. This includes SELECT statements, where the function can be applied to data stored in tables in the database. Conceptually, the function is evaluated once per row in such usage. For example, assume a table named Elements, with a row for each known chemical element. The table has a column named BoilingPoint holding the boiling point of that element in degrees Celsius. The query
SELECT Name, CtoF(BoilingPoint)
FROM Elements
would retrieve the name and the boiling point from each row. It invokes the CtoF user-defined function as declared above in order to convert the value in the column to a value in Fahrenheit.
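The per-row evaluation described above can be tried out with a minimal sketch. It swaps in SQLite, driven through Python's standard sqlite3 module, for a full database server, so the function is registered from the host language rather than with CREATE FUNCTION. The three rows (SampleA, SampleB, SampleC) are made-up placeholders, not real element data.

```python
import sqlite3


def c_to_f(celsius):
    """Convert Celsius to Fahrenheit; a NULL input (None) yields NULL."""
    if celsius is None:
        return None
    return celsius * 1.8 + 32


conn = sqlite3.connect(":memory:")

# Register the Python callable as a one-argument SQL function named CtoF.
conn.create_function("CtoF", 1, c_to_f)

conn.execute("CREATE TABLE Elements (Name TEXT, BoilingPoint REAL)")
# Placeholder rows for illustration only (not real boiling points).
conn.executemany(
    "INSERT INTO Elements VALUES (?, ?)",
    [("SampleA", -40.0), ("SampleB", 100.0), ("SampleC", None)],
)

# The function is called once for every row produced by the query.
rows = conn.execute(
    "SELECT Name, CtoF(BoilingPoint) FROM Elements ORDER BY Name"
).fetchall()

for name, fahrenheit in rows:
    print(name, fahrenheit)
```

The row with a NULL boiling point shows the "single value (or NULL)" behaviour of a scalar function: the result column for that row is NULL rather than an error.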
Each user-defined function carries certain properties or characteristics. The SQL standard defines the following properties:
- Language - defines the programming language in which the user-defined function is implemented; examples include SQL, C, C# and Java.
- Parameter style - defines the conventions that are used to pass the function parameters and results between the implementation of the function and the database system (only applicable if language is not SQL).
- Specific name - a name for the function that is unique within the database. The function name itself need not be unique, because functions may be overloaded. Some SQL implementations do not allow overloading and require function names to be unique within a database.
- Determinism - specifies whether the function is deterministic or not. The determinism characteristic has an influence on the query optimizer when compiling a SQL statement.
- SQL-data access - tells the database management system whether the function contains no SQL statements (NO SQL), contains SQL statements but does not access any tables or views (CONTAINS SQL), reads data from tables or views (READS SQL DATA), or actually modifies data in the database (MODIFIES SQL DATA).
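The effect of the determinism property can be sketched by analogy in SQLite, again through Python's sqlite3 module. This is an illustration of the idea, not of the SQL-standard DETERMINISTIC clause: SQLite accepts a user-defined function in an index expression only if it was registered as deterministic. The names ctof_det and ctof_nondet are hypothetical, and the `deterministic` keyword assumes Python 3.8 or later with a reasonably recent SQLite library.

```python
import sqlite3


def c_to_f(celsius):
    """Same input always gives the same output, so it is safe to mark as deterministic."""
    return None if celsius is None else celsius * 1.8 + 32


conn = sqlite3.connect(":memory:")

# Same Python callable, registered twice under hypothetical names:
# once flagged as deterministic, once with the default (not flagged).
conn.create_function("ctof_det", 1, c_to_f, deterministic=True)
conn.create_function("ctof_nondet", 1, c_to_f)

conn.execute("CREATE TABLE Elements (Name TEXT, BoilingPoint REAL)")

# An index stores precomputed results, which is only sound when the
# function is declared to return the same value for the same input.
conn.execute("CREATE INDEX idx_det ON Elements (ctof_det(BoilingPoint))")
det_index_created = True

try:
    conn.execute("CREATE INDEX idx_nondet ON Elements (ctof_nondet(BoilingPoint))")
    nondet_error = None
except sqlite3.OperationalError as exc:
    # SQLite refuses functions not marked deterministic in index expressions.
    nondet_error = str(exc)

print("deterministic index created:", det_index_created)
print("non-deterministic attempt:", nondet_error)
```

The database system cannot verify the claim; marking a function as deterministic when it is not can lead to wrong query results.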
User-defined functions should not be confused with stored procedures. Stored procedures allow the user to group a set of SQL commands. A procedure can accept parameters and execute its SQL statements depending on those parameters. A procedure is not an expression and, thus, cannot be used inside an expression the way a user-defined function can.
Some database management systems allow the creation of user-defined functions in languages other than SQL. Microsoft SQL Server, for example, allows the user to use .NET languages including C# for this purpose. DB2 and Oracle support user-defined functions written in the C or Java programming languages.
SQL Server 2000
There are three types of UDF in Microsoft SQL Server 2000: scalar functions, inline table-valued functions, and multistatement table-valued functions.
Scalar functions return a single data value (not a table) of the type given in the RETURNS clause. Scalar functions can use all scalar data types, with the exception of timestamp and user-defined data types. Inline table-valued functions return the result set of a single SELECT statement. Multistatement table-valued functions return a table that is built up by multiple Transact-SQL statements.
User-defined functions can be invoked from a query like built‑in functions such as OBJECT_ID, LEN, DATEDIFF, or can be executed through an EXECUTE statement like stored procedures.
Performance notes:
1. On Microsoft SQL Server 2000, a table-valued function that "wraps" a view may be much faster than the view itself. The following MyFunction is an example of a "function wrapper" that runs faster than the underlying view MyView:
CREATE FUNCTION MyFunction()
RETURNS @Tbl TABLE
(
    StudentID VARCHAR(255),
    SAS_StudentInstancesID INT,
    Label VARCHAR(255),
    Value MONEY,
    CMN_PersonsID INT
)
AS
BEGIN
    INSERT @Tbl
    (
        StudentID,
        SAS_StudentInstancesID,
        Label,
        Value,
        CMN_PersonsID
    )
    SELECT
        StudentID,
        SAS_StudentInstancesID,
        Label,
        Value,
        CMN_PersonsID
    FROM MyView -- where MyView selects (with joins) the same columns from large table(s)
    RETURN
END
2. On Microsoft SQL Server 2005, the same code gives the opposite result: the view executes faster than the "function wrapper".
User-defined functions are subroutines made of one or more Transact-SQL statements that can be used to encapsulate code for reuse. A user-defined function:
- takes zero or more arguments and evaluates a return value;
- may contain both control-flow and DML statements in its body, similar to a stored procedure;
- may not change any global session state, such as modifying the database or an external resource (a file or the network);
- does not support output parameters;
- requires the DEFAULT keyword to be specified in order to pass the default value of a parameter;
- aborts when an error occurs in it, which in turn aborts the statement that invoked it.
CREATE FUNCTION CubicVolume
-- Input dimensions in centimeters
(
    @CubeLength decimal(4,1),
    @CubeWidth decimal(4,1),
    @CubeHeight decimal(4,1)
)
RETURNS decimal(12,3)
AS
BEGIN
    RETURN(@CubeLength * @CubeWidth * @CubeHeight)
END
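The rule that an error inside a function aborts the invoking statement can be observed, by analogy, in SQLite through Python's sqlite3 module. The sketch below is not SQL Server behaviour. It assumes a hypothetical variant of CubicVolume that rejects negative dimensions, and a made-up Boxes table.

```python
import sqlite3


def cubic_volume(length, width, height):
    """Hypothetical counterpart of CubicVolume that rejects negative input."""
    if min(length, width, height) < 0:
        raise ValueError("dimensions must be non-negative")
    return length * width * height


conn = sqlite3.connect(":memory:")
conn.create_function("CubicVolume", 3, cubic_volume)

conn.execute("CREATE TABLE Boxes (L REAL, W REAL, H REAL)")
# Illustrative rows; the second one makes the function fail.
conn.executemany(
    "INSERT INTO Boxes VALUES (?, ?, ?)",
    [(1.0, 2.0, 3.0), (-1.0, 2.0, 3.0)],
)

# A call with valid arguments behaves like any built-in function.
ok = conn.execute("SELECT CubicVolume(2.0, 3.0, 4.0)").fetchone()[0]

try:
    # The failure on one row aborts the whole SELECT statement.
    conn.execute("SELECT CubicVolume(L, W, H) FROM Boxes").fetchall()
    failure = None
except sqlite3.OperationalError as exc:
    failure = str(exc)

print("valid call:", ok)
print("failing statement:", failure)
```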
The table data type:
- is supported in Microsoft SQL Server 2000;
- behaves like a temporary table used to store results;
- is mostly used to define temporary variables of type table and the return value of a UDF;
- has a scope limited to the function, stored procedure, or batch in which it is defined;
- does not allow assignment between table variables;
- may be used in SELECT, INSERT, UPDATE, and DELETE statements.
The following statements manage UDFs:
- CREATE FUNCTION creates a UDF;
- ALTER FUNCTION changes the characteristics of a UDF;
- DROP FUNCTION removes a UDF.
Apache Hive
In addition to regular user-defined functions (UDFs), Apache Hive defines user-defined aggregate functions (UDAFs) and user-defined table-generating functions (UDTFs).[1] Developers can write their own custom functions in Java.[2]
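The three kinds of function differ mainly in how many rows they consume and produce. The following is a conceptual sketch in plain Python, not Hive's Java API; the function names to_upper, total and split_words are hypothetical and only illustrate the row cardinality of each kind.

```python
from typing import Iterable, Iterator, Tuple


def to_upper(value: str) -> str:
    """UDF-like: one input row in, one value out."""
    return value.upper()


def total(values: Iterable[float]) -> float:
    """UDAF-like: many input rows in, one aggregated value out."""
    result = 0.0
    for value in values:
        result += value
    return result


def split_words(line: str) -> Iterator[Tuple[str]]:
    """UDTF-like: one input row in, zero or more output rows out."""
    for word in line.split():
        yield (word,)


rows = ["hive udf", "table generating function"]

upper_rows = [to_upper(row) for row in rows]                   # 2 rows -> 2 values
word_rows = [out for row in rows for out in split_words(row)]  # 2 rows -> 5 rows
grand_total = total([1.5, 2.5, 4.0])                           # 3 rows -> 1 value

print(upper_rows)
print(word_rows)
print(grand_total)
```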
References
- "LanguageManual UDF - Apache Hive - Apache Software Foundation". 26 June 2015.
- "HivePlugins - Apache Hive - Apache Software Foundation". 26 June 2015.