LINQ Tutorial Part 1

As a languages guy, I always find it fascinating to watch a language evolve. It was interesting to see features that had tended to be more popular in academic languages making it into C# 2. The prime example of this was generics, a form of parameterised types. Parameterised types had been around in the academic world for quite a while, and so have many of the additions that have made it into C# 3. The advantage of this is that they have been researched well and their semantics are well understood. But before the academic in me takes over, let’s get practical and dig into how C# 3 will make programming more productive and more fun.

Types

This article is nominally about variables, but really it’s about types. If you’ve never ventured outside the C#/Java/C/C++ world, you might assume that a variable declaration always looks something like:
string question = "What is the meaning of life, the universe and everything?";
int answer = 42;
That is, you always specify the type of the variable (int or string in this case), the name of the variable and then (either on the same line or later on) you initialise it to some value. Since you always specify the type of the variable when you declare it, the types of variables are always known. This means that if you try and assign, say, an integer to a variable of type string, the compiler has enough type information from the declarations to detect this and complain. A variable only ever has one type (though it may hold values that are subtypes of that type) and this type is known at compile time. This is called static typing.

Other people will have looked at languages such as Perl, Python and Ruby and seen that when you declare a variable you don’t have to (or just can’t) specify a type for the variable. Instead, a variable carries a "tag" around with it that specifies the type of the value that it contains. When you assign a value to a variable at runtime, the tag is set with the type of that value. The tag is checked (again, at runtime) before operations are performed on a variable to make sure that they are safe. This is called dynamic typing.

Here is an example of some code that will compile and run in a dynamically typed language, but not in a statically typed one.
a = "something"
if (complex condition that is always true)
a = 84
b = a / 2
Here we have used the variable a as a string, then an integer. In a dynamically typed language that is fine - by the time we do the division, the variable holds an integer rather than a string, so the division succeeds. In a statically typed language, this would fail to compile, since the compiler cannot work out at compile time that the condition will always be true at runtime. It is a contrived example, but shows the heart of the issue.

Static typing clearly has the advantage of no runtime checks (though if you write casts into your program some of them will translate to a dynamic type check). That means you get higher performance and don’t get type errors at runtime, unless you’ve explicitly written something into your program that may generate them, in which case you’ve probably thought about what you’re doing.
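
To make the remark about casts concrete, here is a minimal sketch of the earlier dynamically typed example squeezed into C# by declaring the variable as object. The class and method names (CastDemo, Halve, HalveFails) are made up for illustration. The cast to int is exactly the kind of runtime check you have explicitly written into your program.

```csharp
using System;

public static class CastDemo
{
    // Halves a boxed value. The cast to int is checked at runtime,
    // because the compiler only knows that value is an object.
    public static int Halve(object value)
    {
        return (int)value / 2;
    }

    // Returns true if the cast inside Halve fails at runtime for this value.
    public static bool HalveFails(object value)
    {
        try
        {
            Halve(value);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object a = "something";
        Console.WriteLine(HalveFails(a)); // True: a string cannot be unboxed to int

        a = 84;
        Console.WriteLine(Halve(a));      // 42
    }
}
```

Without the cast, the division would not compile at all, which is the static type checker doing its job.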

I’ve dragged you through this whole detour on static vs. dynamic typing because I want you to understand what I mean when I say that C# 3 is statically typed, just as all previous versions have been.

Look! No Types!

Here’s some new C# 3 syntax for declaring the same variables we did earlier.
var question = "What is the meaning of life, the universe and everything?";
var answer = 42;
At first glance, it appears that you’re declaring a variable of type “var”. Actually, “var” is not a type, but rather a new keyword that means, “I want to declare a variable, but I’m too lazy to write out its type”. For cases where the type is "Dictionary>" or similar, this saves quite a bit of clutter in the code:
// Before:
Dictionary> Coeffs = new Dictionary>();
// After:
var Coeffs = new Dictionary>();
I’ve stated that C# is statically typed, while at the same time showing that you can now declare a variable without specifying its type. What’s going on? The answer is type inference. A type inference algorithm in the compiler analyses your program and works out the type for you; you could see it as, conceptually at least, going through your program and for each occurrence of the keyword "var" figuring out the type of that variable and putting it in place of "var". Therefore, what gets compiled is equivalent to an explicitly annotated program (that is, one where you specify the types of all variables).
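
One way to convince yourself of this equivalence is to ask the compiler which type it picked. The sketch below assumes a small helper of my own, StaticTypeOf, which uses generic type inference to expose the compile-time type of its argument. The type arguments given to Dictionary are hypothetical, chosen only so that the example compiles.

```csharp
using System;
using System.Collections.Generic;

public static class VarDemo
{
    // Reports the compile-time type of an expression: T is fixed by the
    // compiler from the static type of the argument, not its runtime type.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

    public static void Main()
    {
        // Hypothetical type arguments, for illustration only.
        Dictionary<string, List<double>> explicitCoeffs = new Dictionary<string, List<double>>();
        var inferredCoeffs = new Dictionary<string, List<double>>();

        // Both variables have exactly the same static type.
        Console.WriteLine(StaticTypeOf(explicitCoeffs) == StaticTypeOf(inferredCoeffs)); // True

        // The variable is still statically typed. Uncommenting the next line
        // gives a compile-time error, because a string is not a dictionary.
        // inferredCoeffs = "oops";
    }
}
```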

If you are thinking of Visual Basic and its Variant types and wondering if there is a similarity here, there isn't. Variants used dynamic typing rather than inferring the type statically, and therefore carried a runtime performance penalty. Using "var" in C# 3 does not. Your program will run just as quickly as if you had written the type in yourself.

Time To Play

Enough theory - it’s about time we wrote some code to explore the idea. To do this, you’re going to need a copy of the C# 3 preview compiler, which is also known as the LINQ preview (where LINQ is the name of one of the most exciting new features in C# 3, and the .NET platform in general). You can download it through the C# future versions page:

http://msdn2.microsoft.com/en-us/vcsharp/aa336745.aspx

Close Visual Studio if you have it open and install the LINQ technology preview. The next time you enter Visual Studio, you’ll find a new type of project on the New Project menu.

[Figure: Visual Studio New Project window with LINQ project option.]

We’ll just look at console applications in this article. Create a new LINQ Console Application and then open the Program.cs source file. At this point everything should look pretty familiar.

To demonstrate type inference at work, we’ll write a program that declares several variables using var and then prints their types. Change the Main method to read as follows.
static void Main(string[] args)
{
    var Name = "Jonathan";
    var Answer = 42;
    var Prog = new Program();
    // The element type (string) is an assumption.
    var Friends = new List<string>();

    // Print the types of the variables.
    Console.WriteLine("Name is of type " + Name.GetType().Name);
    Console.WriteLine("Answer is of type " + Answer.GetType().Name);
    Console.WriteLine("Prog is of type " + Prog.GetType().Name);
    Console.WriteLine("Friends is of type " + Friends.GetType().Name);

    // Stop console disappearing.
    Console.ReadKey();
}
The output of this program will be:
Name is of type String
Answer is of type Int32
Prog is of type Program
Friends is of type List`1
Notice that basic types are inferred (the first two lines), types of objects are inferred (the third line) and generic types are inferred too (the last line). You can put any expression with a single type to the right of the assignment. For example the following program:
var A = 28;
var B = 14;
var Result = A + B;
Console.WriteLine("Result is " + Result);
Console.WriteLine("Result is of type " + Result.GetType().Name);
Will produce the output:
Result is 42
Result is of type Int32
Here, the types of A and B are inferred as we have seen in the previous example. The type of Result is then inferred by considering the type of value that the plus operator returns when applied to operands of those two types. This means the inferred types of A and B are used, along with knowledge of the behavior of the plus operator, to determine a type for Result.
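
The same reasoning applies when the operands have different types. The following sketch uses a helper of my own, StaticTypeOf (not part of any library), to report the compile-time type of each result. The variable names are illustrative.

```csharp
using System;

public static class OperatorDemo
{
    // Reports the compile-time type of an expression via generic inference.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

    public static void Main()
    {
        var a = 28;            // inferred as int
        var b = 14.5;          // inferred as double
        var sum = a + b;       // int + double: a is converted, the result is double
        var half = a / 2;      // int / int: integer division, the result is int
        var label = "a=" + a;  // string + int: concatenation, the result is string

        Console.WriteLine(StaticTypeOf(sum).Name);   // Double
        Console.WriteLine(StaticTypeOf(half).Name);  // Int32
        Console.WriteLine(StaticTypeOf(label).Name); // String
    }
}
```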

Subtleties And Corner Cases

There are some subtle issues that you should be aware of when using implicitly typed local variables. First of all, you must initialize the variable when you declare it. This will fail to compile:
// Must be initialized at the point of declaration.
var DoItLater;
Console.WriteLine("We'll do it later...");
DoItLater = "Done it!";
Also, initializing with null is not allowed; the following example will also fail to compile.
// Must not be initialized to null.
var TryItNull = null;
Console.WriteLine("We'll try it null...");
TryItNull = "Tried it!";
A less contrived example where this may hit you is:
// This woulda been nice to be able to do.
var SetConditionally;
if (condition)
    SetConditionally = 500;
else
    SetConditionally = 600;
Which doesn’t work. The rule is that there must be an expression (other than null) assigned to an implicitly typed local variable for the type to be inferred. Note that if we re-wrote the previous example to:
var SetConditionally = condition ? 500 : 600;
Then it would work, as there is an expression to the right with a type that can be determined. The following example will not work:
var SetConditionally = condition ? 500 : "badger";
Because there is no type that can be inferred (if you think about it, there is no type that you could write in place of var to make this compile anyway). Type inference understands subtyping too. If you write two classes:
class Parent
{
}
class Child : Parent
{
}
Then try the following program:
var Condition = true;
var SetConditionally = Condition ? new Parent() : new Child();
Console.WriteLine("SetConditionally is of type " + SetConditionally.GetType().Name);
This will compile and produce the output:
SetConditionally is of type Parent
Note that if you were to replace var with Parent in this case, the program would compile and run; if you replaced it with Child then it would not. That seems vaguely clever, but unfortunately you don’t have to try hard to find a type that could be safely inferred that, at least in the technology preview, will not be. Consider defining the following interface and two classes that implement it:
interface Animal
{
}
class Dog : Animal
{
}
class Cat : Animal
{
}
We might expect the compiler to infer the type Animal for the variable Pet in the following program (GetType() itself would still report the runtime type, Cat):
var PreferCats = true;
var Pet = PreferCats ? new Cat() : new Dog();
Console.WriteLine("Pet is of type " + Pet.GetType().Name);
Unfortunately, we get a compile time error:
Type of conditional expression cannot be determined because there
is no implicit conversion between 'CSharp3_Examples.Cat' and
'CSharp3_Examples.Dog'
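The way to resolve the problem while still using implicit typing is to insert casts to the common type (strictly, casting either branch is enough, since the other branch then converts implicitly):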
var PreferCats = true;
var Pet = PreferCats ? (Animal) new Cat() : (Animal) new Dog();
Console.WriteLine("Pet is of type " + Pet.GetType().Name);
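
One thing to keep in mind with these experiments is that GetType() reports the runtime type of the object, which is not necessarily the type that was inferred for the variable. The sketch below separates the two, using a helper of my own (StaticTypeOf) that exposes the compile-time type through generic type inference. The classes mirror the ones above, and the condition is deliberately false so that the two answers differ.

```csharp
using System;

public interface Animal { }
public class Dog : Animal { }
public class Cat : Animal { }
public class Parent { }
public class Child : Parent { }

public static class ConditionalDemo
{
    // Reports the compile-time type of an expression via generic inference.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

    public static void Main()
    {
        var condition = false;

        // Child converts implicitly to Parent, so the expression has type Parent.
        var setConditionally = condition ? new Parent() : new Child();
        Console.WriteLine(StaticTypeOf(setConditionally).Name); // Parent
        Console.WriteLine(setConditionally.GetType().Name);     // Child

        // A single cast is enough: Dog then converts implicitly to Animal.
        var pet = condition ? (Animal) new Cat() : new Dog();
        Console.WriteLine(StaticTypeOf(pet).Name); // Animal
        Console.WriteLine(pet.GetType().Name);     // Dog
    }
}
```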


Implicitly Typed Arrays

So far we have been looking at implicit typing of variables holding a single value. However, implicit typing also extends to arrays. If you initialize the array (that is, specify its elements when you declare it), you need not mention the type of the elements at all; it will be inferred from the types of the elements you supply. For example, the following code:
var Primes = new [] { 2, 3, 5, 7, 11, 13, 17, 19 };
Console.WriteLine("Primes is of type " + Primes.GetType().Name);
Will produce the following output:
Primes is of type Int32[]
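Note that if there are any elements whose type cannot be inferred, you will get a compile time error. Similarly, if there are elements of different types in the array and the type inferencer cannot find a single type that fits them all, you will get a compile time error; a null among int elements is one such case, since null has no conversion to int. (The released C# 3 compiler does accept null next to reference-type elements, so new [] { "a", null } is a string[].) That is, neither of the following examples will compile: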
var OneNull = new [] { 1, 4, 9, 16, null, 36 };
var Different = new [] { "String", 123 };
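
For contrast, here is a sketch of some array initializers that do compile with the released C# 3 compiler and later, because a single element type fits every element. ElementTypeName is a helper of my own for printing the result, and the array contents are illustrative.

```csharp
using System;

public static class ArrayDemo
{
    // Returns the name of an array's element type, e.g. "Int32" for int[].
    public static string ElementTypeName(Array array)
    {
        return array.GetType().GetElementType().Name;
    }

    public static void Main()
    {
        var primes = new[] { 2, 3, 5, 7 };          // int[]
        var mixed = new[] { 1, 2.5, 3 };            // double[]: int converts implicitly to double
        var names = new[] { "Ada", null, "Grace" }; // string[]: null is fine for a reference type

        // With no common type, state the element type explicitly.
        var different = new object[] { "String", 123 };

        Console.WriteLine(ElementTypeName(primes));    // Int32
        Console.WriteLine(ElementTypeName(mixed));     // Double
        Console.WriteLine(ElementTypeName(names));     // String
        Console.WriteLine(ElementTypeName(different)); // Object
    }
}
```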


Conclusion

Having the full name of a type, particularly a parameterized one, appearing twice on the same line has been one of my annoyances with C# and Java. The statement:
Dictionary> Example = new Dictionary>();
Can now be shortened to:
var Example = new Dictionary>();
I consider that an improvement, both for the developer writing the code and for those reading it.

I know some people will consider this to be yet another feature moving C# away from being the simple language they desire. However, since variable declarations are written time and time again, this is hardly adding another feature that will only be used occasionally (an argument that you could apply to, for example, nullable types added in C# 2.0). The fact that it extends to arrays is a good thing.

On the other hand, some people may be disappointed with some of the restrictions with regard to type inferencing, and wish that it could be smarter. It is important to realize that type inference is not even possible for all type systems (more formally, it is not decidable – you can’t write a program that will be able to compute a type in all cases). The designers needed to come up with an implementable and understandable type inferencing scheme that would not restrict them too much with where they want to take future versions of the language.

Anyway, that’s all for this time. Next time we’ll be looking at extension methods (which allow more abstraction and code re-use) and lambda expressions (which will enable us to do higher order programming a lot less verbosely). And finally, if you're wondering if type inference is just about saving you typing a few characters in type annotations, the answer is no: you'll be meeting it again in the remaining parts of this series.
