If you are familiar with programming, you might have heard of a specific paradigm called functional programming. Most modern languages, such as Java and C#, have integrated some functional elements over time. Others, like Scala and Haskell, were built around that very idea of functional programming.
This online course will look at some of the functional features available in C# 3.0, including the concept of anonymous functions, delegates and lambda expressions. But before doing that, we need to understand what functional programming actually means.
If you are interested, keep reading, and we will explore the following topics:
- Anonymous functions
- Delegates
- Anonymous delegates
- Multicast delegates
- Lambda expressions
- Lambda statements
- Expression-bodied members
- Expression trees
- Local functions
- Closures
In a traditional, imperative piece of code, you might use variables to store the values of earlier computations. These variables typically contain two things: either a primitive type (an integer number, a floating point number, a boolean value, …) or a reference to an object (a class that has been instantiated using the keyword new).
The first novelty that comes with functional programming is that variables can also contain (references to) functions. Assigning functions to variables allows them to be passed as parameters to other functions, and even to be returned from them. C# is said to treat functions as first-class citizens, because it gives them the same privileges that all other values have. Namely, you should be able to:
- pass functions as arguments to other functions,
- return functions as the values from other functions,
- assign them to variables.
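The three privileges listed above can be sketched in a few lines. This is only an illustration: the names FirstClassDemo, TimesTwo, ApplyTwice and MakeAdder are hypothetical, and Console.WriteLine stands in for Unity's Debug.Log so that the snippet can run as a plain console program. It relies on Func and on lambda syntax, both of which are covered later in this article.

```csharp
using System;

public static class FirstClassDemo
{
    // A plain method that we will treat as a value.
    public static int TimesTwo(int x) { return x * 2; }

    // Takes a function as an argument and applies it twice.
    public static int ApplyTwice(Func<int, int> f, int value)
    {
        return f(f(value));
    }

    // Returns a function as its result.
    public static Func<int, int> MakeAdder(int amount)
    {
        return x => x + amount;
    }

    public static void Main()
    {
        Func<int, int> f = TimesTwo;             // assigned to a variable
        Console.WriteLine(ApplyTwice(f, 3));     // passed as an argument: 12
        Func<int, int> addTen = MakeAdder(10);   // returned from a function
        Console.WriteLine(addTen(5));            // 15
    }
}
```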
C# has been extending its support for functional programming for years. While massive improvements have been made since its first version, you should expect a somewhat confusing syntax, often worsened by the strong limitations imposed by the compiler.
❓ Functions or Methods?
The term function loosely describes any encapsulated piece of code that can be invoked. In the context of Object-Oriented Programming, all functions which belong to a class are called methods. All methods are functions, but not all functions are methods.
In C++, for instance, functions can be defined outside of classes. This is not the case in C#, where all functions need to reside in a class (which automatically makes them all methods). The terms function and method are often used interchangeably in C#, although the latter is often preferred.
That being said, C# supports local functions, which are (loosely speaking) special functions defined inside the body of another function. Despite the name, local functions themselves are described as “private methods [which are] nested in another member”. Even local functions are methods.
Referencing Methods
In C# all variables have a type. For instance, int x is a variable called x which can contain an integer number. The same applies to functions: if functions can be stored in a variable, what kind of type do we need?
Let’s start with an easy example. Let’s imagine a simple function that takes no parameters and has no return value.
void F () { Debug.Log("Hello Function!"); }
While F() invokes the function (meaning that the code it contains is being executed), the expression F represents a reference to the function. We can use a function name to refer to it. Since functions are first-class citizens in C#, this means that there is a way to assign F to a variable. In this particular case, a variable that can hold F can have the type System.Action:
Action a = F;
Action is not a keyword of C#, but rather a type introduced by .NET. It represents the type of a function with no parameters and no return value.
Now that the variable a contains a reference to F, we can invoke it by simply treating a itself as a function: a().
Delegates
C# has the concept of a delegate, which is a way to describe the type of a method. Action, for instance, is a built-in delegate defined inside .NET that represents all functions with no parameters and no return value. Such a delegate can be defined like this:
delegate void Action ();
Basically, it works by adding the keyword delegate before the template signature of all functions we want to be able to reference. The name used (Action, in this case) becomes the name of the delegate and, consequently, the name of a new type that can be used to reference void functions with no parameters.
It is important to notice that delegates are type definitions, not variable declarations. So, in the same way you cannot define a new enum type inside a method, you cannot define a new delegate type inside a method either.
namespace TestNameSpace
{
    delegate void TestDelegate1 (); // ✔️ Defined in a namespace

    public class TestClass
    {
        public delegate void TestDelegate2 (); // ✔️ Defined inside a class

        public void Start ()
        {
            delegate void TestDelegate3 (); // ❌ Defined inside a method
        }
    }
}
Below is a more complex example that shows a delegate (called IntTest) for all functions that take one integer and return a boolean:
delegate bool IntTest (int x);
When specifying input parameters, delegates need to give an actual name to each one. Those names are not binding, meaning that you can match any method which takes an integer as a parameter, regardless of its name.
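As a minimal sketch of this point, the snippet below assigns two methods to IntTest even though neither of them calls its parameter x. The methods IsEven and IsPositive, the helper Count and the class IntTestDemo are hypothetical names introduced only for this example, and Console.WriteLine is used in place of Debug.Log so that it runs outside Unity.

```csharp
using System;

public delegate bool IntTest(int x);

public static class IntTestDemo
{
    // The parameter names differ from the "x" used in the delegate: only the types matter.
    public static bool IsEven(int number) { return number % 2 == 0; }
    public static bool IsPositive(int value) { return value > 0; }

    // Counts how many values pass the given test.
    public static int Count(int[] values, IntTest test)
    {
        int count = 0;
        foreach (int v in values)
        {
            if (test(v)) count++;
        }
        return count;
    }

    public static void Main()
    {
        int[] data = { -2, -1, 0, 1, 2, 3 };
        Console.WriteLine(Count(data, IsEven));      // 3 (-2, 0 and 2)
        Console.WriteLine(Count(data, IsPositive));  // 3 (1, 2 and 3)
    }
}
```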
📚 Anonymous Delegates
A delegate can be initialised with the name of an existing method. However, this is not the only way to use them. In fact, you can create a function “on the fly” and assign it directly to the delegate:
Action a = delegate () { Debug.Log("Hello World"); };
This syntax is referred to as anonymous delegates, and it essentially creates a private method with no name.
Nowadays this syntax is somewhat discouraged, since C# has introduced a more elegant solution to write inline code, called lambda expressions. Using a lambda expression, the code from the snippet above becomes:
Action a = () => Debug.Log("Hello World");
The next section will also cover lambda expressions and how to use them. So keep reading if you are interested!
📚 Variance in Delegates
One interesting property of delegates is that the return type does not have to match exactly. For instance, if the return type of a delegate is Animal, it will also accept methods that return a Cat. This is known as delegate covariance:
class Animal { ... }
class Cat : Animal { ... }

delegate Animal ReturnAnimal ();

Animal FunctionReturningAnimal () { ... }
Cat FunctionReturningCat () { ... }

void Start ()
{
    ReturnAnimal d1 = FunctionReturningAnimal; // ✔️ Exact match!
    ReturnAnimal d2 = FunctionReturningCat;    // ✔️ Cat ⊂ Animal
}
The same, unfortunately, does not apply to input parameters. If a delegate requires a parameter to be of type Animal, it will not accept a method which takes a Cat. This actually makes sense, because otherwise you might be able to pass another animal (let’s say, a Dog) to a method that can actually operate on Cats only:
delegate void ParameterAnimal (Animal a);

void FunctionParameterAnimal (Animal a) { ... }
void FunctionParameterCat (Cat c) { ... }

void Start ()
{
    ParameterAnimal d1 = FunctionParameterAnimal; // ✔️ Exact match!
    ParameterAnimal d2 = FunctionParameterCat;    // ❌ Cat ⊂ Animal
}
However, C# supports delegate contravariance, which means that if your delegate accepts a Cat, you can actually match a method which takes an Animal. This is safe, because if the function works on all Animals, it is guaranteed to work with Cats as well:
delegate void ParameterCat (Cat c);

void Start ()
{
    ParameterCat d1 = FunctionParameterCat;    // ✔️ Exact match!
    ParameterCat d2 = FunctionParameterAnimal; // ✔️ Animal ⊃ Cat
}
You can read more about covariance (using a more derived type) and contravariance (using a less derived type) in this article titled Using Variance in Delegates (C#).
Built-in Delegates
In C#, there are several built-in delegates that you can use. They cover the majority of cases, so that you do not have to create new delegates every time. Action, for instance, is one such delegate. But there is also a generic variant that matches methods with one parameter and no return type, called Action<T>:
void F (int x) { ... }
void G (float x) { ... }

void Start ()
{
    Action<int> di = F;
    Action<float> df = G;
}
.NET includes 16 generic variants of Action, so that you can use them to match void functions with up to 16 parameters. If you need more (do you???) you will need to create a new custom delegate.
.NET also includes generic delegates for functions which have a return type. Func<TResult>, for instance, represents a function with no parameters and return type TResult. And, as was the case for Action, there are 16 other variants to support input parameters, in the form of Func<T1, T2, ..., TResult>.
There is also Predicate<T>, which is used for boolean functions that take a single parameter of type T.
The most commonly used built-in delegates are:
- Action
- Action<T1, ...>
- Func<TResult>
- Func<T1, ..., TResult>
- Predicate<T>
But there are many others used for specific purposes, such as event handling and thread synchronisation.
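To make the built-in delegates more concrete, here is a small sketch that uses one delegate from each family. The class BuiltInDelegatesDemo and the methods Add and IsLong are hypothetical, and Console.WriteLine replaces Debug.Log so that the code runs as a console program. List<T>.FindAll is a real .NET method that expects a Predicate<T>.

```csharp
using System;
using System.Collections.Generic;

public static class BuiltInDelegatesDemo
{
    public static int Add(int a, int b) { return a + b; }
    public static bool IsLong(string s) { return s.Length > 3; }

    // Func<T1, T2, TResult>: two parameters and a return value.
    public static readonly Func<int, int, int> Sum = Add;

    // Predicate<T>: one parameter, returns a bool.
    public static readonly Predicate<string> LongWord = IsLong;

    public static void Main()
    {
        // Action<T>: one parameter, no return value.
        Action<string> print = Console.WriteLine;

        print(Sum(2, 3).ToString());

        var words = new List<string> { "cat", "horse", "ox", "zebra" };

        // FindAll keeps only the elements for which the predicate returns true.
        List<string> longWords = words.FindAll(LongWord);
        print(string.Join(", ", longWords));
    }
}
```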
⚠️ Delegate Casting
It is important to remember that, although two delegates might represent the same method signature, they are different types and you cannot implicitly cast from one to another:
delegate void DelegateInt (int a);

void F (int a) { ... }

void Start ()
{
    Action<int> d0 = F;
    DelegateInt d1 = F;

    d1 = d0;               // ❌ Different types (no implicit cast)
    d1 = (DelegateInt) d0; // ❌ Different types (no explicit cast)
}
This might seem unfair, but it is important to remember that the same applies with traditional types as well. You cannot cast from one class type to an unrelated one, even if they share the same structure.
However, C# provides a workaround, which requires instantiating a new delegate (yes, delegates are objects!) via its constructor:
d1 = new DelegateInt (d0); // ✔️ A new DelegateInt object is instantiated
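The conversion can be tried out with the sketch below. Besides the constructor, it shows a second option that is assumed here as an equivalent alternative: assigning the Invoke method of the source delegate, which every delegate type exposes. The names DelegateConversionDemo, Store, LastValue, ViaConstructor and ViaInvoke are hypothetical.

```csharp
using System;

public delegate void DelegateInt(int a);

public static class DelegateConversionDemo
{
    public static int LastValue;

    public static void Store(int a) { LastValue = a; }

    // Wraps an Action<int> in a DelegateInt through the delegate constructor.
    public static DelegateInt ViaConstructor(Action<int> source)
    {
        return new DelegateInt(source);
    }

    // Alternative: reference the Invoke method of the source delegate.
    public static DelegateInt ViaInvoke(Action<int> source)
    {
        return source.Invoke;
    }

    public static void Main()
    {
        Action<int> d0 = Store;

        DelegateInt d1 = ViaConstructor(d0);
        d1(1);
        Console.WriteLine(LastValue);

        DelegateInt d2 = ViaInvoke(d0);
        d2(2);
        Console.WriteLine(LastValue);
    }
}
```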
📚 Multicast Delegates
So far we have used delegates as references to methods. More specifically, once initialised one delegate can store the reference to one method. And when invoked, the method it is referencing is invoked. This is known as a singlecast delegate.
However, delegates can actually store a list of references to matching methods. And when the delegate is invoked, all of the functions it references are invoked in sequence. This additional way of using delegates is known as multicast, and it closely resembles the way in which event handlers are added in C# when listening for a specific event.
If you want a singlecast delegate to become a multicast delegate, no change is actually needed. You only need to use the += operator, and the new method will be added to the list. The example below shows how you can add the same method three times, causing the execution of the delegate to call three methods, not just one:
void HelloWorld () { Debug.Log("Hello World"); }

void Start ()
{
    Action multicast = HelloWorld;
    multicast += HelloWorld;
    multicast += HelloWorld;
    multicast();
}
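The order of invocation is easier to see when the methods are different. The sketch below records each call in a list. It also uses the -= operator, which is not discussed above but is the natural counterpart of += and removes the last matching entry from the list. The names MulticastDemo, First, Second, Log and Run are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class MulticastDemo
{
    // Records the order in which the methods are invoked.
    public static readonly List<string> Log = new List<string>();

    public static void First() { Log.Add("first"); }
    public static void Second() { Log.Add("second"); }

    public static void Run()
    {
        Log.Clear();

        Action chain = First;
        chain += Second;
        chain += First;
        chain();          // invokes First, Second, First (in this order)

        chain -= First;   // removes the last occurrence of First
        chain();          // invokes First, Second
    }

    public static void Main()
    {
        Run();
        Console.WriteLine(string.Join(", ", Log));
    }
}
```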
Lambda Expressions
C# 3.0 introduced a new syntax to create anonymous methods, which are very handy in many scenarios. This feature is referred to as lambda expressions, and it gets its name from a concept known in programming as lambda. Based on the context and language, lambdas are also known as (or strongly related to) anonymous functions, blocks or closures.
Effectively, lambdas are a more expressive way to define a function, and are specifically designed to work well in the context of functional programming. This means that lambdas are often used to pass code as parameters to other functions.
Lambdas are effectively methods, and so they can be assigned to any delegate which matches their signature. So let’s see an example of how to initialise an Action delegate in three different ways:
void Start ()
{
    // ✔️ Method → Delegate
    Action a0 = F;

    // ✔️ Anonymous delegate → Delegate
    Action a1 = delegate () { Debug.Log("Hello World!"); };

    // ✔️ Lambda → Delegate
    Action a2 = () => Debug.Log("Hello World!");
}

void F () { Debug.Log("Hello World!"); }
All three methods above are equivalent: the Action delegate will contain a reference to a method which prints the "Hello World!" string to the console. However, the third one is undoubtedly the most compact.
What really defines a lambda function is the lambda operator, => (the same symbol is known as the hashrocket operator in Ruby), which is supposed to be read as “goes to”. So, the lambda expression x => x+1 should be read as “x goes to x+1”. You can read more about this on the relevant => operator (C# reference) page.
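The anatomy of a lambda can be tricky to grasp at first. The round brackets on the left of the operator contain the list of input parameters. Unlike in a delegate declaration, they usually include only names, and not types. This is because the parameter types of a lambda expression are automatically inferred from its context. The right side of the operator contains the expression whose value is returned.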
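The most common shapes of a lambda expression can be sketched as follows. In each case the target delegate type provides the context from which the parameter types are inferred, while the last line shows that types can also be written out explicitly. The names LambdaFormsDemo, Zero, Increment, Multiply and Half are hypothetical.

```csharp
using System;

public static class LambdaFormsDemo
{
    // No parameters: the empty round brackets are required.
    public static readonly Func<int> Zero = () => 0;

    // One parameter: the round brackets can be omitted.
    public static readonly Func<int, int> Increment = x => x + 1;

    // Several parameters: names only, the types come from Func<int, int, int>.
    public static readonly Func<int, int, int> Multiply = (a, b) => a * b;

    // Parameter types can also be stated explicitly.
    public static readonly Func<double, double> Half = (double x) => x / 2;

    public static void Main()
    {
        Console.WriteLine(Zero());
        Console.WriteLine(Increment(41));
        Console.WriteLine(Multiply(6, 7));
        Console.WriteLine(Half(5));
    }
}
```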
❓ Why Lambda?
The term “lambda expression” might seem a bit unusual. The lambda comes from a formal system in mathematical logic called Lambda calculus. It was originally invented before computers, but was nonetheless designed to represent computation based on the concept of functions.
Nowadays the term lambda is used by many languages to indicate either anonymous functions or inline functions. However, each language that implements lambdas does it in a slightly different way and has its own specific language and notation.
📚 LINQ
Lambdas are very helpful because many .NET methods (and the LINQ-related ones in particular) are designed to work with them. LINQ is an extension that provides many useful ways to manipulate data structures without the need for verbose for loops. For instance, the following LINQ query takes an array of integer numbers, and returns only the even ones, sorted:
using System.Linq;
...
int[] array = ...
int[] evenSorted = array
    .Where(i => i % 2 == 0)
    .OrderBy(i => i)
    .ToArray();
This is made possible by the fact that the LINQ extension Where takes a function as a parameter. Without lambda expressions, the simple expression i % 2 == 0 would either need its own function, or need to be declared as an anonymous delegate.
⚠️ Lambda Type Inference
Unlike traditional methods and delegates (whose types are explicitly defined in their signature), the return and parameter types of a lambda are inferred from its context.
In the example below, for instance, the same lambda expression is interpreted in two different ways. In the first line, the parameter x is assumed to be int, while it is float in the second:
Action<int> a0 = x => Debug.Log(x);   // ✔️ "x" inferred as int
Action<float> a1 = x => Debug.Log(x); // ✔️ "x" inferred as float
When there is only one input parameter, the round brackets can also be omitted. They are, however, necessary when there are no parameters, leading to the somewhat cumbersome () => syntax.
In a similar fashion, even the return type is automatically inferred from the context. Also, lambda expressions do not need to use the return keyword: the value of the expression is automatically used as the return value:
Func<float, float> a0 = x => x / 2; // ✔️ Return type inferred as float: float division performed
Func<int, int> a1 = x => x / 2;     // ✔️ Return type inferred as int: int division performed
It is important to notice that, besides the obvious syntactic sugar, lambdas do not really add any new functionality that was not already possible with delegates. But they make the overall syntax much more compact.
Because the parameter and return types are inferred, it is not possible to determine the type of a lambda without any context. This means that it cannot be naively assigned to a var (which automatically infers the type of a variable):
// ❌ Lambda parameter and return types are inferred from context
// If there is no context, it cannot be used!
var v = x => x + 1;

// ✔️ Thanks to the cast, now the compiler knows the type of the lambda
var w = (Func<int,int>) (x => x + 1);
A similar issue arises if we try to assign a method to a var variable.
⚠️ Limitations of Anonymous Functions
When they are used to create anonymous functions, both delegates and lambdas are subject to some limitations that “traditional” methods do not have.
- Anonymous functions cannot be iterators (i.e.: they cannot use yield return);
- Anonymous functions can use generic parameters that have been defined in the method or class in which they are contained. They cannot, however, define new generic parameters (as you can with a “traditional” method).
Statement Lambdas
On top of expression lambdas, the C# notation also supports statement lambdas. These are a way to create anonymous functions which need more than a single expression. They work in a very similar way, but the expression part is replaced with the more traditional body of a function, including the curly brackets and return keyword (if needed).
// Lambda expression
Func<int, int> a0 = x => x + 1;

// Lambda statement
Func<int, int> a1 = x => { return x + 1; };
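Statement lambdas become useful when the body genuinely needs several statements. The sketch below is one possible example: a lambda that clamps a number between 0 and 10. The names StatementLambdaDemo and Clamp, as well as the chosen range, are hypothetical.

```csharp
using System;

public static class StatementLambdaDemo
{
    // A statement lambda with several statements and explicit returns.
    public static readonly Func<int, int> Clamp = x =>
    {
        if (x < 0) return 0;
        if (x > 10) return 10;
        return x;
    };

    public static void Main()
    {
        Console.WriteLine(Clamp(-5));
        Console.WriteLine(Clamp(7));
        Console.WriteLine(Clamp(42));
    }
}
```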
📚 Expression-bodied Members
C# allows methods to be specified with the lambda expression syntax. For instance, the following two declarations are equivalent:
// Traditional method declaration
int Add (int a, int b) { return a + b; }

// Expression-bodied member declaration
int Add (int a, int b) => a + b;
You can learn more about this way of declaring methods on Expression-bodied members (C# programming guide).
📚 ref & out
Both delegates and lambdas support ref and out parameters. In some cases, however, the syntax can get a bit tricky:
delegate void RefOutDelegate (ref int x, out int y);

void Start ()
{
    // ✔️ Correct syntax to use ref and out with anonymous functions
    RefOutDelegate d0 = RefOutFunction;
    RefOutDelegate d1 = delegate (ref int x, out int y) { y = x; };
    RefOutDelegate d2 = (ref int x, out int y) => { y = x; }; // Parameter types must be stated explicitly!

    // ❌ ref and out cannot be used with built-in delegates
    Action<ref int, out int> a = RefOutFunction;
}

void RefOutFunction (ref int x, out int y) { y = x; }
📚 Expression Trees
The more seasoned developers among the readers might be familiar with the treacherous subject of self-modifying code. Given the complexity and potential dangers, the technique has fallen into disuse. However, the ability to execute arbitrary code that can somehow be constructed programmatically is an idea that is still around. Many languages, such as JavaScript and PHP, offer a variant of this functionality through the eval function, which allows arbitrary code to be executed from a string.
C# does not have a direct equivalent of eval, but it offers a similar functionality through the so-called expression trees. An expression tree represents code in a tree-like data structure, where each node is an expression: for example, a method call or a binary operation. Their power resides in the fact that this structure can be manipulated at run-time, de facto allowing for self-modifying code.
Lambda expressions can be automatically converted into expression trees, using the following syntax:
using System.Linq.Expressions;
...
Expression<Func<int, int, int>> expression = (a, b) => a + b;
This does not work with lambda statements (multi-line lambdas).
Expression trees also make it possible to print the code of a lambda function back in a readable format. You can learn more about expression trees here: Expression Trees (C#).
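As a rough sketch of what can be done with an expression tree, the snippet below obtains one from a lambda, builds an equivalent one by hand from its nodes, and compiles both into callable delegates. It only uses standard members of System.Linq.Expressions (Expression.Parameter, Expression.Add, Expression.Lambda and Compile). The names ExpressionTreeDemo, FromLambda and BuildByHand are hypothetical.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // The compiler converts the lambda into an expression tree.
    public static Expression<Func<int, int, int>> FromLambda()
    {
        return (a, b) => a + b;
    }

    // The same tree, assembled node by node at run-time.
    public static Expression<Func<int, int, int>> BuildByHand()
    {
        ParameterExpression a = Expression.Parameter(typeof(int), "a");
        ParameterExpression b = Expression.Parameter(typeof(int), "b");
        BinaryExpression sum = Expression.Add(a, b);
        return Expression.Lambda<Func<int, int, int>>(sum, a, b);
    }

    public static void Main()
    {
        Expression<Func<int, int, int>> expression = FromLambda();

        // Prints a textual representation of the lambda.
        Console.WriteLine(expression);

        // An expression tree is data: it must be compiled before it can be invoked.
        Func<int, int, int> compiled = expression.Compile();
        Console.WriteLine(compiled(2, 3));

        Console.WriteLine(BuildByHand().Compile()(10, 20));
    }
}
```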
Scope and Closures
Closures are one of those topics that (along with monads and other functional-related terms) are often poorly explained or misunderstood. Understanding what closures are in C# and why they are important …is important!
We can think of lambdas (and anonymous functions in general) as pieces of code that we can define and call. It is very common for a lambda to have all of its necessary data passed as a parameter. A typical example is x => x+1, which takes one parameter and operates on that parameter alone. Understanding and implementing those kinds of lambdas is easy.
Generally speaking, the code of most lambda expressions can be “stored” by the compiler inside a private method. This works well for “self-contained” lambdas. But what happens when a lambda tries to access something that is outside its scope?
The answer, unsurprisingly, depends on what exactly they are trying to access. The easiest case is when a lambda is referencing a member of the same class.
The example below (check on SharpLab.io) shows a simple lambda that sets a public member of its class to zero. If we look at the generated C# code through a decompiler, we will see that the lambda has simply been “converted” to a private method (renamed ResetMethod for clarity):
// Original code
public class C
{
    public int x = 0;

    public void Start ()
    {
        Action action = () => { x = 0; };
    }
}

// Compiled (with variables renamed for clarity)
public class C
{
    public int x = 0;

    public void Start ()
    {
        Action action = ResetMethod;
    }

    private void ResetMethod ()
    {
        x = 0;
    }
}
This works even if the lambda is invoked from another class. The original instance will not be garbage collected until all references to the lambdas are released.
Closures
Things get more complicated when a lambda accesses a variable inside the scope of the function in which it is defined. What makes this case much more complicated is the fact that the lambda could potentially be invoked when the function (along with its stack frame) has already ended. This means that the variables it needs to access might have already been destroyed.
To ensure that the lambda stays “callable”, the variables it is accessing must somehow survive the end of the function in which they are defined. C# already provides a mechanism to do this: classes! This means that when a lambda accesses a variable defined inside a function, that variable (along with the lambda code) gets extracted and placed in a newly created class by the compiler. This is what a closure is.
You can see the original code, and its compiled version, below (check on SharpLab.io):
// Original
public class C
{
    public void Start ()
    {
        int x = 0;
        Action action = () => { x = 0; };
    }
}

// Compiled (with variables renamed for clarity)
public class C
{
    private sealed class Closure
    {
        public int x;

        internal void ResetMethod ()
        {
            x = 0;
        }
    }

    public void Start ()
    {
        // Instantiates the closure
        Closure closure = new Closure();
        closure.x = 0;

        // Assigns the delegate
        Action action = closure.ResetMethod;
    }
}
The class Closure provides the environment necessary to run the lambda code safely (i.e.: the variable x, now “upgraded” to a class member). In the Computer Science literature, x is also referred to as a free variable, and it is said to be captured.
If you are interested in learning more about closures in C#, A Simple Explanation of C# Closures is a comprehensive and accessible read.
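A small experiment makes the effect of capturing visible. In the sketch below, the local variable count keeps living after MakeCounter has returned, and each call to MakeCounter gets its own independent copy. The names ClosureDemo and MakeCounter are hypothetical.

```csharp
using System;

public static class ClosureDemo
{
    public static Func<int> MakeCounter()
    {
        // A local variable, captured by the lambda below.
        int count = 0;

        // The lambda outlives this method call, and so does "count".
        return () =>
        {
            count++;
            return count;
        };
    }

    public static void Main()
    {
        Func<int> first = MakeCounter();
        Func<int> second = MakeCounter();

        Console.WriteLine(first());   // 1
        Console.WriteLine(first());   // 2
        Console.WriteLine(second());  // 1 (a separate closure with its own "count")
    }
}
```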
Implementation
To better understand how anonymous functions work, it is helpful to see how lambdas and delegates are actually compiled. There are many online tools that can show you compiled C# code, including SharpLab.io. The code below shows how a simple lambda expression gets compiled (variable names have been changed for clarity):
// Original
public class C
{
    public void Start ()
    {
        Action action = () => { Debug.Log("Hello World!"); };
    }
}

// Compiled (with variables renamed for clarity)
public class C
{
    [Serializable]
    private sealed class Lambda
    {
        public static readonly Lambda Singleton = new Lambda();
        public static Action HelloWorldDelegate;

        internal void HelloWorldMethod ()
        {
            Debug.Log("Hello World!");
        }
    }

    public void Start ()
    {
        // Instantiates the delegate (if it has not been done before)
        if (Lambda.HelloWorldDelegate == null)
            Lambda.HelloWorldDelegate = Lambda.Singleton.HelloWorldMethod;

        // Assigns the delegate
        Action action = Lambda.HelloWorldDelegate;
    }
}
The lambda () => { Debug.Log("Hello World!"); } becomes a method (called HelloWorldMethod) of a new class (called Lambda) which is generated by the compiler (the exact same thing would happen if we had used an anonymous delegate instead). Yes: anonymous functions in C# are actually methods of a new class created ex novo by the compiler.
It is interesting to see that for this to work, an instance of Lambda (referenced in the static member called Singleton) must be created. When used in this way, anonymous functions do cause memory allocation.
If other delegates or lambdas are created in the same class, each one will become a different member of the compiler-generated Lambda class.
📚 Local Functions
C# 7 introduced another concept: local functions. In a nutshell, a local function is a function defined inside another function. Contrary to what one might think, they are actually different from lambda functions. This is because while lambdas are stored in delegates, which are objects, local functions might not be.
public int Function (int a, int b)
{
    return Add(a, b);

    // The parameters are named x and y to avoid clashing with a and b above
    int Add (int x, int y)
    {
        return x + y;
    }
}
If you are planning on using a local function as you normally would with any other method, they can be more efficient since they require no additional memory allocation on the heap. Because local functions are defined at compilation time, they can be safely referenced from anywhere (within their scope). Conversely, a delegate might not have been initialised when it is invoked at runtime, causing an error.
Another important difference with lambdas is that local functions can use the yield keyword. This can be helpful if you want to use them for iterators.
What local functions are actually compiled into depends on how they are used. If a local function is assigned to a delegate, or used as a lambda, it gets automatically compiled in the same way as a lambda.
The C# Programming Guide suggests that for recursive algorithms, local functions might be more appropriate than lambdas. If you want to learn more about local functions, Dissecting the local functions in C# 7 is a very good read.
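The two use cases mentioned above, recursion and iterators, can be sketched as follows. The outer methods validate their arguments and then hand over to a local function, which is a common way of structuring such code but by no means the only one. The names LocalFunctionDemo, Factorial, Go, Evens and Iterate are hypothetical, and the code requires C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

public static class LocalFunctionDemo
{
    // A recursive local function: it can call itself by name.
    public static int Factorial(int n)
    {
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));
        return Go(n);

        int Go(int k)
        {
            return k <= 1 ? 1 : k * Go(k - 1);
        }
    }

    // An iterator local function: unlike a lambda, it can use yield return.
    public static IEnumerable<int> Evens(int limit)
    {
        if (limit < 0) throw new ArgumentOutOfRangeException(nameof(limit));
        return Iterate();

        IEnumerable<int> Iterate()
        {
            for (int i = 0; i <= limit; i += 2)
            {
                yield return i;
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(Factorial(5));
        Console.WriteLine(string.Join(", ", Evens(6)));
    }
}
```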
Conclusion
This article provided a general introduction to the concept of functional programming in C#, exploring the following constructs:
- Anonymous functions
- Delegates
- Anonymous delegates
- Multicast delegates
- Lambda expressions
- Lambda statements
- Expression-bodied members
- Expression trees
- Local functions
- Closures
There is so much more that could be said about functional programming in C#! Future posts will explore some of the most commonly used patterns in functional programming, including a deep dive into one of C#’s most loved/hated extensions: LINQ.