Introduction
The current version of Visual Studio (2013) came with a minor framework update in the form of the .NET Framework 4.5.1. What was missing? An update of the C# language! Not even Roslyn was included, but I think that this is no problem at the moment, since it is quite likely to be included in the next version. Aren't we happy with the range of possibilities that C# delivers? I think we are; the wish for more simply reflects the fact that software is never finished and that one can always improve something.
So what will be in the next version of C#? First of all, we will not see a feature as game-changing as the async/await keywords. Nevertheless, it seems like there will be some quite useful new constructs available. Additionally, the compiler will be even better at doing its job and will help us where possible. Together with some other upcoming features, like RyuJIT, we will not only be more productive, but also get more performance for free.
There is still some work in progress, and some of those features might not make it into the next version, if ever. Also, this article is very speculative about how these features work, so anything in it may turn out to be wrong. I'll try to update the article once the next version has shipped, so that future visitors will not be confused by some bogus information.
Background
C# has evolved from a clone of Java that is worse than the original to an elegant language with a ton of useful features. All in all C# should be the language of choice if the target platform is Windows or if one wishes to program efficiently cross-platform by using Xamarin's Mono.
Currently C# is available in its 5th version, accompanied by powerful concurrency features and a very mature type system. Let's take a short look back at some of the various improvements that have been introduced since the first release of C#:
- C# v2
  - Generics
  - Anonymous methods
  - Partial types
  - Nullable types
- C# v3
  - Query syntax
  - Extension methods
  - Lambda expressions
  - Type inference
  - Anonymous objects
  - Automatic properties
- C# v4
  - Co- / contravariant type parameters
  - DLR integration
  - Named and optional arguments
- C# v5
  - await / async
  - Caller information
With that information fresh in our minds, let's have a look at what features are likely to be included in the next version of the C# programming language.
A list of features
What follows is a list of features that might be included in the upcoming version of C#. The list is not yet set in stone, which is why I placed a probability qualifier at the end of each section. A high probability means that I personally would say it is very likely to be included, whereas a low probability means that I personally have my doubts about this feature.
It should also be noted that the presented example code will not compile on current C# compilers (December 2013). Also, even if a discussed feature is included in the next version, the syntax may have changed, or I may have made a mistake or typo. In any case, just view the examples as a guideline for how code might look in the real world.
Primary constructor
The primary constructor feature is a neat one. Basically this feature is intended to solve the recurring problem of writing code like the following:
class Point
{
    int x;
    int y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}
Argh, this is a lot of code for just saying "this class has no empty default constructor and all input arguments are assigned to fields". The next version of C# might therefore have a primary constructor directly in the class name. This then could look like the following:
class Point(int x, int y) { int x; int y; }
It is also possible that the variables, x and y in this case, are auto-generated by the compiler. This would make the primary constructor even more useful.
class Point(int x, int y) { /* We automatically have fields x and y */ }
This is quite useful for such little code snippets. However, we can use it even better for constructors as shown in the following example. Suppose we have the following abstract base class, which does not have a parameterless constructor:
abstract class Token
{
    int code;

    // A constructor declaration carries no "class" keyword.
    public Token(int code)
    {
        this.code = code;
    }
}
Of course we could simplify this construct by using a primary constructor, but that is not the possibility we are interested in here. Instead, we want to see how the primary constructor feature could simplify the following class definition, which derives from a base class without a parameterless constructor:
class CommaToken : Token { public CommaToken() : base(0) { } }
The constructor represents some redundant code, since we are only creating it for picking the right base constructor. This is not very efficient, so a better syntax for such scenarios would be much appreciated. Let's rewrite it with a possible C# vNext style!
class CommaToken() : Token(0) { }
The problem would be solved quite elegantly with this syntax! Therefore primary constructors could be used for simplifying our classes and for simplifying base constructor calls. We can define primary constructors or call them.
Now the question is: Are there any constraints? And actually there might be one constraint. We can still have other constructors, but if we add those constructors, we need to ensure that the primary constructor actually gets called. Let's see this in an example:
class Point(int x, int y) { public Point() : this(0, 0) { } }
On the other hand: What about partial classes? Obviously only one part is allowed to define a primary constructor and therefore to call a base constructor with arguments. All other partial definitions would not be allowed to do that.
All this should also be valid for a struct type, however, there may be some subtle differences that come along with the unique properties of a structure in C#. This does not change the general thinking behind primary constructors. This feature seems to be very likely to be included as it would reduce some redundant code.
Probability: HIGH
Readonly auto-properties
Auto-properties are really useful. We can use them on many occasions and we should actually prefer them to fields in the first place. There are reasons for this, but we will not discuss them in this context.
The problem with auto-properties is that they are only useful if both a getter and a setter are declared. This means we usually see a lot of code that is similar to the following example:
public int MyProperty { get; set; }
Of course we do not always want to make the setter public. The only thing we can do right now is change the modifier of the set part. This could look as follows:
public int MyProperty { get; private set; }
However, now we are back at a pretty ridiculous property from the class's perspective. Why wrap such a setter in a method, if one does not need the method character? Of course there are good reasons for this. We could easily extend the method body while not having to alter any other code. We could also change the setter to public again or create a protected setter. All this is easily possible with the current construct. Nevertheless, for the purpose that is covered by the expression above, the expression itself seems to add too much overhead.
What we want is a (private) variable to set and a (public) property to get. Of course this has always been possible:
private int value;

public int MyProperty
{
    get { return value; }
}
The problem is that we need to write a lot of code to achieve something trivial especially if we only want to initialize the property! So let's try to come up with a possible solution. Maybe we could directly assign to the property as in the following example:
class Point
{
    int x;
    int y;

    public double Dist { get; } = Math.Sqrt(x * x + y * y);
}
The auto-property now has a kind of initializer, but it will still create a backing field. Of course this feature really starts being useful together with automatic fields initialized by primary constructors. Let's see the two together:
class Point(int x, int y)
{
    public int X { get { return x; } }
    public int Y { get { return y; } }
    public double Dist { get; } = Math.Sqrt(X * X + Y * Y);
}
This syntax seems to be great if we want to ensure that properties only get initialized once, and will never change their value later on. Therefore this is like applying the readonly keyword to a field.
Probability: HIGH
Static type inclusions
The Java programming language is able to do a static type inclusion. Additionally some people familiar with .NET already know that VB is also capable of doing this. Static type inclusion means that the methods defined in a static class (also called procedures and functions) can be invoked, without specifying the name of the static class for each call. So this is like seeing the static class as a namespace. In this sense the static type inclusion is a possibility of using this "namespace".
What kind of scenarios are therefore possible? Let's first think about a class with a high usage of math functions. Usually one would be required to write something like:
class MyClass
{
    /* ... */

    public double ComputeNumber(double x)
    {
        var ae = Math.Exp(-alpha * x);
        var cx = Math.Cos(x);
        var sx = Math.Sin(x);
        var sqrt = Math.Sqrt(ae * sx);
        var square = Math.Pow(ae * cx, 2.0);
        return Math.PI * square / sqrt;
    }
}
Static type inclusions now let us tell the compiler that we want to act as if the functions (in this case Exp, Cos, ...) had been defined in this class, MyClass. How does this look?
using System.Math;

class MyClass
{
    /* ... */

    public double ComputeNumber(double x)
    {
        var ae = Exp(-alpha * x);
        var cx = Cos(x);
        var sx = Sin(x);
        var sqrt = Sqrt(ae * sx);
        var square = Pow(ae * cx, 2.0);
        return PI * square / sqrt;
    }
}
This feature is highly controversial and might lead to strange code and ambiguities. In the end, one should decide from case to case whether it is really worth using this feature. If the inclusion imports functions with names like Do, Create, Perform or Install, then confusion is probably part of the game. If math functions are imported, the decision is quite obvious.
Probability: MEDIUM
Derived property expressions
Earlier we've seen that the properties are being extended with C# vNext. This is great, since properties make C# a very powerful language, that is also fun to write. Properties have been very useful from the start, however, with the integration of automatic properties we got rid of any excuse not to use them. They are just everywhere and this is definitely a good thing.
Nevertheless, people wished to have properties that do not reflect the value of a field, but a computation based on other fields. In UML such a property is called a derived property (usually marked with a slash). Such properties are very useful, since they are crucial for the information hiding principle, so there is no excuse for not using them. After all, even a simple property like Count might be a derived property in many scenarios. This means that the actual value is computed when the property is accessed.
Until today we write such derived properties as follows:
public class Point
{
    double x;
    double y;

    public double Dist
    {
        get { return Math.Sqrt(x * x + y * y); }
    }
}
How can this be even shorter? Well, the C# team obviously thought a lot about this and in my opinion the only two things that could be shortened are:
- The return keyword (obviously we always want to return something)
- The braces (same as with lambda expressions)
In fact the resulting syntax is quite close to the one from lambda expressions. Finally we just need to assign an expression to the property. Let's rewrite the previous example with the new syntax from C# vNext.
public class Point
{
    double x;
    double y;

    public double Dist => Math.Sqrt(x * x + y * y);
}
This syntax uses the fat arrow operator to assign an expression to the public member Dist, which is still what we call a property. All in all, I personally see a tendency to use lambda expressions / the fat arrow operator a lot more often. This trend is a good one, given that many people actually like this kind of syntax.
Probability: HIGH
Method expressions
The property expression is quite nice, but what is the lesson from this language refactoring process? If we can do it with properties, is there another part of the language where this feature might be useful? In fact there is one: Since properties are just methods, why shouldn't we be able to decorate methods the same way?
Well, actually we should. Right now it seems very likely to me that this would be possible as well. Let's consider the following method:
public class Point
{
    double x;
    double y;

    public double GetDist()
    {
        return Math.Sqrt(x * x + y * y);
    }
}
It is no coincidence that the chosen example for this feature is very close to the one given in the property expression section. In fact this is just a simple transformation to illustrate the point being made here. Now let's use the same syntax as before by applying it to the given method.
public class Point
{
    double x;
    double y;

    public double GetDist() => Math.Sqrt(x * x + y * y);
}
Unsurprisingly this is really similar and uses the exact same syntax as before. So one-line methods can now actually look like the following example, which constructs an arbitrary class Rect:
public class Rect
{
    double x;
    double y;
    double width;
    double height;

    public double Left => x;
    public double Top => y;
    public double Right => x + width;
    public double Bottom => y + height;

    public double ComputeArea() => width * height;
}
As before, this might be handy, but it is not that much shorter. Of course this feature is coupled to the general idea of allowing those expressions. If properties won't have them, they will also not be available for methods. On the other hand, even if properties allow such expressions, we cannot be sure that they will be allowed for methods. There are actually reasons against this.
Personally, I like this feature and even though the benefit is minimal, it adds a fresh touch to the language. But every feature that makes C# somewhat more complex and productive will also result in a possible barrier for newcomers.
Probability: MEDIUM
Enumerable params
Parameters are kind of fixed in C#. If we have a method that accepts three parameters, we need to pass three arguments. Of course we could work with default values, however, we will have to specify a name for each parameter. A variadic parameter as in C/C++ is missing.
"Missing?!" you say... Yes, of course there are two exceptions. One is the undocumented feature called __arglist. But this is a mess to work with. Additionally, this feature would not be undocumented if everyone were supposed to use it! So what is the other exception? It is of course the params keyword. How does this work?
The compiler matches the call against a method that has a params argument. This argument has to be the last one. The compiler then identifies all arguments that belong to the params argument, creates an array from them and passes this array. Within the method the params argument behaves like a normal array (as it is a normal array).
Let's illustrate this with some code:
void Method(int first, params int[] others) { /* ... */ }

// Compiler creates Method(1, new int[0]):
Method(1);

// Compiler creates Method(1, new int[1] { 2 }):
Method(1, 2);

// Compiler creates Method(1, new int[2] { 2, 3 }):
Method(1, 2, 3);

// ...
The compiler might also apply implicit conversions when creating this array. Additionally we can have overloads that are more specialized, e.g. for 3 arguments or 1 argument. The method with the params argument is only used if no other overload matches the signature demanded by the caller.
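To make that priority rule concrete, here is a small sketch that compiles with today's compiler. The class OverloadDemo and its Describe methods are made up for this illustration; only the params keyword itself comes from the language.

```csharp
using System;

// Hypothetical demo type; the names are illustrative only.
public static class OverloadDemo
{
    // Specialized overload for exactly one argument.
    public static string Describe(int single)
    {
        return "single";
    }

    // Specialized overload for exactly two arguments.
    public static string Describe(int first, int second)
    {
        return "pair";
    }

    // Fallback: the expanded params form is only chosen when
    // no other overload fits the argument list directly.
    public static string Describe(params int[] values)
    {
        return "params[" + values.Length + "]";
    }
}
```

Calling OverloadDemo.Describe(1) and OverloadDemo.Describe(1, 2) picks the specialized overloads, while Describe() and Describe(1, 2, 3) end up in the params version with an array of length 0 and 3.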
So far so good. But at the end of the day we might not always want to access all values given in the params parameter. Additionally the code has some little quirks, which are sometimes nice, sometimes annoying:
void Method(params object[] p) { /* ... */ }

// Compiler uses the call:
Method(new object[] { 1, 2, 3 });

// Compiler creates Method(new object[] { 1, 2, 3 }):
Method(1, 2, 3);

// Compiler creates Method(new object[] { new object[] { 1, 2, 3 } }):
Method(new object[] { 1, 2, 3 }.AsEnumerable());
The last method call illustrates the problem. Why can't we hand in an enumerable as a variadic parameter? C# vNext now tries to allow this. The feature is based on the exact same syntax as before, however, we need to change the expected object from an array to an enumerable. Of course this is much more flexible and works great together with LINQ.
The transition is then actually from a T[] array to an IEnumerable<T> enumerable.
void Method(params IEnumerable<int> p) { /* ... */ }

// Compiler uses the call:
Method(new int[] { 1, 2, 3 });

// Compiler creates Method(new int[] { 1, 2, 3 }):
Method(1, 2, 3);

// Compiler uses the call:
Method(new int[] { 1, 2, 3 }.AsEnumerable());
This is "backwards" compatible, since every array T[] implements IEnumerable<T>. Obviously this feature is also great since it will allow basically every enumerable to be passed as parameters, e.g. List<T> instances.
var list = new List<int>();
list.Add(1);
list.Add(2);
/* ... */

// Compiler uses the call:
Method(list);
In what scenarios shouldn't we use this feature if it's available? Well, sometimes we are actually interested in the parameter count. In this case one would be forced to iterate over all the elements. Of course, there is a shortcut by calling the Count() extension method, however, the required instructions for the iteration still need to be executed. Hence we could say that if we need a fixed set of elements, then the currently existing method is the way to go; otherwise we should definitely prefer to use the enumerable parameters.
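The cost of counting can be made visible with a small sketch that already works today. CountDemo, Numbers and ElementsProducedByCount are hypothetical names chosen for this illustration; Count() is the regular LINQ extension method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo type; the names are illustrative only.
public static class CountDemo
{
    // Number of elements the iterator below has produced so far.
    public static int Produced;

    // A lazy sequence: elements are generated only while someone iterates.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Counts a lazy sequence and reports how many elements
    // had to be produced just to obtain that count.
    public static int ElementsProducedByCount(int n)
    {
        Produced = 0;
        Numbers(n).Count(); // walks the whole sequence
        return Produced;
    }
}
```

For a lazy sequence like this one every element is generated just to obtain the count. For arrays and other ICollection<T> implementations Count() reads the stored count instead, so the cost depends on what the caller hands in.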
I personally would love to see this feature, since it makes working with a variable amount of parameters more flexible.
Probability: HIGH
Monadic null checking
What the heck is this? Actually I know too many C# developers who are not aware of the null coalescing operator, or ?? in code. And now this! Another operator that does something with null references. null is a great way of introducing bugs, having more if statements than needed and bringing complexity into trivial programs. Nevertheless, we require the notion of an unset reference; otherwise everything would have to fall back to some default value, which would in turn have to be defined.
Now to the point: We still (obviously) have the notion of null references. This will also not go away, at least not in C#. However, sometimes these constructs tend to become very repetitive:
int SomeMethodCallWithDefaultValue()
{
    var a = SomeMethodCall();

    if (a != null)
    {
        var b = a.SomeProperty;

        if (b != null)
        {
            var c = b.SomeOtherProperty;

            if (c != null)
                return c.Value;
        }
    }

    return 0;
}
Ouch! Does this code look familiar? Basically we get the result from some method call. This result has some property, which has some properties etc. Normally we could just write a.b.c.Value, however, if any of the property calls returns a null reference, then we will be in trouble. There are really funny folks out there writing code as follows:
int SomeMethodCallWithDefaultValue()
{
    var a = SomeMethodCall();

    try
    {
        return a.b.c.Value;
    }
    catch (NullReferenceException)
    {
        return 0;
    }
}
Great! This is much shorter to write, but it is more expensive and really stupid code, since it will not only catch null references on a, b and c, but also those thrown within the property calls. Therefore any "real" null reference exception will also be caught, without us noticing that a real bug is happening.
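How such a catch block hides a genuine bug can be sketched as follows. The nested types A, B and C and the deliberately broken Value getter are invented for this illustration.

```csharp
using System;

// Hypothetical demo type; all names are illustrative only.
public static class SwallowDemo
{
    public class C
    {
        // Simulates a genuine bug hidden inside a property getter.
        public int Value
        {
            get
            {
                string notInitialized = null;
                return notInitialized.Length; // throws NullReferenceException
            }
        }
    }

    public class B { public C c = new C(); }
    public class A { public B b = new B(); }

    public static int GetValueOrDefault(A a)
    {
        try
        {
            return a.b.c.Value;
        }
        catch (NullReferenceException)
        {
            // Reached when a, b or c is null,
            // but also for the bug inside the Value getter.
            return 0;
        }
    }
}
```

Both GetValueOrDefault(null) and GetValueOrDefault(new SwallowDemo.A()) return 0, so the caller cannot tell a missing reference apart from the broken getter.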
The conclusion is that we have no other choice than to write lengthy code. Or we make use of the null coalescing operator:
int SomeMethodCallWithDefaultValue()
{
    var a = SomeMethodCall();
    return (((a ?? new A()).b ?? new B()).c ?? new C()).Value;
}
This works, but it is not very efficient. The main problem here is that we are allocating a new object for every reference that is missing. This object is only required for a very short time. If we now call the SomeMethodCallWithDefaultValue method very often, we might get a performance problem, with the GC being required to run too often.
Of course there is a solution: supplying "default" objects. An example of such a default object is EventArgs.Empty. Basically this is also just new EventArgs(), but pre-allocated in a static readonly variable. Our example would thus transform to:
int SomeMethodCallWithDefaultValue()
{
    var a = SomeMethodCall();
    return (((a ?? A.Empty).b ?? B.Empty).c ?? C.Empty).Value;
}
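For completeness, here is a sketch of what the classes behind such an expression might look like. The types A, B and C with their Empty members are assumptions modelled on EventArgs.Empty; nothing here is prescribed by the framework.

```csharp
// Hypothetical demo type; all names are illustrative only.
public static class DefaultObjectDemo
{
    public class C
    {
        // Allocated once and shared by every caller.
        public static readonly C Empty = new C();
        public int Value; // stays 0 for the shared Empty instance
    }

    public class B
    {
        public static readonly B Empty = new B();
        public C c;
    }

    public class A
    {
        public static readonly A Empty = new A();
        public B b;
    }

    // No allocation happens here, no matter where the chain is broken.
    public static int GetValueOrDefault(A a)
    {
        return (((a ?? A.Empty).b ?? B.Empty).c ?? C.Empty).Value;
    }
}
```

One caveat of this sketch: the Empty instances are shared and mutable, so real code would rather make such types immutable.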
Now the question is: Why do we have to do this, when these classes could automatically return the default instance instead of a null reference? We'll probably never know... Back to the original question: How can we achieve the same as in the previous code, but without creating new instances or using sample instances? The answer is of course given by the title of this section: monadic null checking!
C# vNext could introduce a new dot operator: ?. (a question mark followed by a dot). Obviously this could create a chain, which would be broken once a null reference is detected. In such a case we obviously require a default value. Otherwise we will return the result of the chain.
What could this chain look like in C# code? Let's see in terms of our previous example:
int SomeMethodCallWithDefaultValue() { var a = SomeMethodCall(); return a?.b?.c?.Value ?? 0; }
This looks a little bit like the ternary conditional operator. It does not appear to be very complicated, but it does a nice job of hiding the complexity from us. Obviously the result has to be able to hold null (i.e. Nullable<T> structs or classes in general) for this to work. Otherwise the usage of the null coalescing operator would not make much sense. After all, what would the outcome be if we broke the chain but did not supply a default value via the null coalescing operator?
Even though this feature again seems very useful, there are some very important questions that need to be discussed. Why shouldn't value types be included? If value types are included, should those be cast to nullable? Of course the chain cannot break on value types, since they will never be null, however, if the last (i.e. returned) value is a value type, then we cannot use the null coalescing operator, as mentioned earlier.
Nevertheless some treatment for structures is required. Let's now see the code with a possible solution in action:
int SomeMethodCallWithDefaultValue()
{
    var a = SomeMethodCall();
    var c = a?.b?.c;
    return c != null ? c.Value : 0;
}
This solution avoids value types in general. Therefore we cannot use it as previously shown. Nevertheless by using the monadic null checking, we could reduce the complexity of the method. Additionally the whole thing looks a lot more reader-friendly.
Let's see another possible solution:
int? SomeMethodCallWithDefaultValue()
{
    var a = SomeMethodCall();
    return a?.b?.c?.Value ?? 0;
}
Obviously this introduces an implicit cast. So what kind of solution is better? Both have their flaws, but also their positive sides. Since a decision is required, all options have to be thoroughly considered. I think such a feature would be really nice to have, but the questions that I can think of have to be answered. Let's see if such a feature makes it into the next version, and if so, how the C# team solved it.
Probability: MEDIUM
Constructor type inference
Generic methods are super awesome in C#. What makes them so awesome? Well, those methods have what is called type inference. In fact if we look at the following code we will see what is meant by that:
void Read<T>(T obj) { /* ... */ }

// ...
string a = "Hi";
int b = 5;
Read(a);           // T is resolved as string
Read(b);           // T is resolved as int
Read<double>(5.0); // Explicit, T is double
So by just calling the method the compiler is able to detect the type parameter. This only works if all type parameters are directly or indirectly used for parameters. The following method cannot be used with type inference:
T Write<T>() { /* ... */ }
There is no way for the compiler to resolve T just by looking at the method call (which will in this case always look like Write()). Of course one could argue that inferring this type is possible in theory, if one considers expressions like int a = Write(). However, this is currently not supported by the C# compiler.
This type inference with methods goes even one step further. Let's consider the following bunch of methods:
void DoStuff() { }
void DoStuff<T>(T obj) { }
void DoStuff(int a) { }
Obviously we can have overloads of a generic method. If we do not specify a parameter, then the first method is taken. In case of a single parameter, the choice depends on the type of the parameter. In case of an integer, the last method will be taken. In any other case we'll end up with the generic method.
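These rules can be checked with current compilers. The following sketch gives the three overloads return values, so that the chosen overload becomes visible; the class name ResolutionDemo and the returned strings are made up for this purpose.

```csharp
// Hypothetical demo type; the names are illustrative only.
public static class ResolutionDemo
{
    public static string DoStuff()
    {
        return "no parameter";
    }

    public static string DoStuff<T>(T obj)
    {
        return "generic, T = " + typeof(T).Name;
    }

    public static string DoStuff(int a)
    {
        return "non-generic int";
    }
}
```

DoStuff(5) ends up in the non-generic overload, while DoStuff("hi") is resolved with T being String. Note that "integer" has to be taken literally here: for a short argument the generic version is an exact match with T being Int16 and therefore wins over the int overload, which would need a conversion.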
Actually we can have a similar case with classes. Let's say we have the following kind of classes:
class MyClass { } class MyClass<T> { }
This is quite similar to the case with methods. It even gets more similar if we think in terms of constructors. A constructor is a method that we get for free. It is automatically invoked once a new object has been allocated. Additionally constructors control if it is possible to construct an object, and what parameters are required to instantiate the object. To trigger the invocation of a constructor we need to instantiate the object with the new keyword.
With the two classes from above this would be possible by using the following instructions:
var nongeneric = new MyClass(); var generic = new MyClass<int>();
Now the constructed case is similar to the T Write<T>() one. Here it is definitely not possible to infer the type parameters from the instruction without explicitly naming them. But what about the following case:
class MyClass { }

class MyClass<T>
{
    // public, so that the constructor can be called from outside the class
    public MyClass(T obj) { }
}
Here we replaced the implicitly given standard constructor of the generic version with a constructor that takes one argument. The argument has the same type as the generic type parameter of the class.
The question that we could ask now: Is the following call possible?
var nongeneric = new MyClass(); int parameter = 5; var generic = new MyClass(parameter);
Obviously the answer is no, at least until now. But why? The reasoning behind this is the possible ambiguity between various constructors. Let's change our example again:
class MyClass
{
    public MyClass(int p) { }
}

class MyClass<T>
{
    public MyClass(T obj) { }
}
In this case we have two possible problems:
- In case of passing an integer parameter, we cannot take the generic one for sure. Explicit methods had a higher priority before, and in order to stay consistent with that we also need to prefer explicit constructors in this case.
- In other cases we do not know if the parameter is actually being passed in by mistake or if the generic version should really be able to handle this.
As already mentioned, older C# versions had a very elegant solution to this problem: avoid it by not allowing constructor type inference. C# vNext finally wants to include constructor type inference. What about our two problems?
- As with methods, explicit constructors will be preferred. This means non-generic beats generic. In such cases we still need to express the type arguments.
- If a generic constructor matches the signature, then the generic constructor will be taken. We are responsible for avoiding mistakes by not using var, or by having a look at the type being inferred by the compiler.
All in all I think that this feature is really useful. This means that instead of writing, e.g.
var pair = new KeyValuePair<int, string>(0, "zero");
we can write the same statement as follows.
var pair = new KeyValuePair(0, "zero");
This feature is very likely to be introduced, since it has been missing for a while. On the other hand, one should not forget why it has been missing. Being careful with constructor type inference is therefore an obligation.
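Until then, the usual workaround is to route the construction through a generic static method, because method type inference already exists. The helper PairFactory below is a made-up example of that pattern; the framework's own Tuple.Create follows the same idea.

```csharp
using System.Collections.Generic;

// Hypothetical helper; the name PairFactory is illustrative only.
public static class PairFactory
{
    // TKey and TValue are inferred from the arguments, so the caller
    // does not have to spell out the type arguments.
    public static KeyValuePair<TKey, TValue> Create<TKey, TValue>(TKey key, TValue value)
    {
        return new KeyValuePair<TKey, TValue>(key, value);
    }
}
```

With this helper, var pair = PairFactory.Create(0, "zero"); yields a KeyValuePair<int, string> without naming the type arguments.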
Probability: HIGH
Out parameter inference
C# offers two ways of passing values by reference. One is by using ref parameters, the other is by using the out keyword. I personally like having two keywords for practically the same thing (even though the compiler imposes some constraints on out parameters, which really make sense). After all, in C one was required to introduce some kind of convention on reference / return parameter names. Otherwise other programmers couldn't know how the parameter was being used by the routine.
One thing that some people would like to have is a way of declaring a variable directly in the call of a method that has out parameters. This looks kind of weird, but it could be useful in some situations:
// Calling bool int.TryParse(string input, out int value);
if (int.TryParse(someInput, out var a))
{
    // ...
}
Let's compare this with the way we would write it today:
var a = 0; // alternatively: int a;

if (int.TryParse(someInput, out a))
{
    // ...
}
How is this then used? Well, there are some open questions. One certainly is whether this is just a temporary container that cannot be accessed after the initialization. The other one is: in which scope does the variable live? Obviously the scope that is used by the body of the if-statement is not the right one, since the method invocation is required for going into that scope. On the other hand, the parent scope does not seem right either; after all we might have a scenario where this method is never called.
Such a scenario is easy to construct:
if (someCondition)
{
    // ...
}
else if (int.TryParse(someInput, out var a))
{
    // ...
}
In this case the method is called if the condition is not fulfilled. The question now is: Is the variable a available after the whole conditional block?
Those questions seem to be far away from real-world applications, but in fact they are not. There are really some open, unanswered questions with this syntax, which is why I doubt we will see it coming in the next version, unless all these questions can be answered without leaving ambiguities.
Probability: LOW
Points of Interest
As with all language features, people will discuss some of them more than others. Also, depending on how controversial a feature is, some people will like or dislike it more than others. However, it should be noted that every one of these language features is just syntactic sugar. The next version is fully backwards-compatible and therefore only lets us be more productive.
Personally I look most forward to primary constructors and monadic null checking. The additional generic type inference is quite nice as well. Enumerable parameters are long overdue and give us more flexibility. I also really like binary literals, since this is something that is somewhat easy to implement, but not available in most other languages. The increased possibilities with properties are nice, but nothing that I would consider absolutely required.
References
The only official information about C# vNext has been made available by Mads Torgersen with his talk at NDC 2013. Since then a lot of participants published some of the features shown in his talk. These are the references I used for this article:
- Mads Torgersen's talk at NDC 2013
- http://damieng.com/blog/2013/12/09/probable-c-6-0-features-illustrated
- http://adamralph.com/2013/12/06/ndc-diary-day-3/
The image is taken from http://www.brothers-brick.com/2006/06/09/star-trek-the-next-generation-minifigs/.