LINQ Tutorial Part 3

This is the third of a series of articles exploring new features in C# 3. The first two have covered:
  • Implicitly typed variables and arrays
  • Extension methods and lambda expressions
To follow this part you should certainly have read the first one, but no material from the second part is required.

In this article we will look at object and collection initializers, which provide a neater syntax for setting the initial values of an object's properties and the initial contents of a collection. We will then move on to anonymous types, which you need to know about to understand LINQ.

Object Initializers

It is fairly common in C# code to see an object instantiated using the "new" keyword and then have its fields and/or properties set. Until C# 3.0, this could only be done by instantiating the object, storing it in a variable and then assigning to the various properties one statement at a time. In C# 3.0, object initializers make this possible within a single expression.

Suppose we want to instantiate a 5 year old male monkey called Norbert and add it to the jungle. In previous versions of C# we would have written:
Monkey NewCreation = new Monkey();
NewCreation.Name = "Norbert";
NewCreation.Sex = SexEnum.Male;
NewCreation.Age = 5;
Jungle.Add(NewCreation);
A few things are frustrating here. First, we have to mention the variable, NewCreation, each time. Second, we may not really need the variable NewCreation at all - we just want to add a monkey to the collection. Finally, the code would read better if we could put the Add ahead of the monkey creation, so that the purpose of creating the new monkey is visible straight away.

Object initializers allow us to set the initial values of fields or properties of an object as part of the new statement. For example, we can re-write the above like this:
Monkey NewCreation = new Monkey() {
Name = "Norbert",
Sex = SexEnum.Male,
Age = 5
};
Jungle.Add(NewCreation);
Here we have added a set of curly braces at the end of the "new" expression. Inside them, we can do assignments to the fields and properties without having to write the name of the object that is being referred to. Note the use of commas between the assignments rather than semicolons.

The fact that we don't have to name the object we are initializing - that is, setting the fields/properties of - means we can do a further refactoring:
Jungle.Add(new Monkey() {
Name = "Norbert",
Sex = SexEnum.Male,
Age = 5
});
Now the intermediate variable is gone. Finally, if there are no parameters to pass to the constructor, we are permitted to save ourselves two more characters and remove the brackets after the type name:
Jungle.Add(new Monkey {
Name = "Norbert",
Sex = SexEnum.Male,
Age = 5
});
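To try the two styles side by side, here is a minimal self-contained sketch. The Monkey and SexEnum definitions, and the use of a List<Monkey> as the jungle, are assumptions made for illustration; the article does not show how these types are declared.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical definitions; the article does not show these types.
public enum SexEnum { Male, Female }

public class Monkey
{
    public string Name { get; set; }
    public SexEnum Sex { get; set; }
    public int Age { get; set; }
}

public static class ObjectInitializerDemo
{
    // Builds a jungle (assumed here to be a List<Monkey>) using both styles.
    public static List<Monkey> BuildJungle()
    {
        var jungle = new List<Monkey>();

        // C# 2.0 style: a temporary variable and one statement per property.
        Monkey temp = new Monkey();
        temp.Name = "Norbert";
        temp.Sex = SexEnum.Male;
        temp.Age = 5;
        jungle.Add(temp);

        // C# 3.0 style: a single expression with no temporary variable.
        jungle.Add(new Monkey { Name = "Norbert", Sex = SexEnum.Male, Age = 5 });

        return jungle;
    }

    public static void Main()
    {
        foreach (Monkey m in BuildJungle())
            Console.WriteLine("{0}, {1}, {2}", m.Name, m.Sex, m.Age);
    }
}
```

Both monkeys end up with the same property values; only the amount of ceremony differs.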


Initializing Nested Objects

Our Monkey class may have, as one of its members, a field that holds an instance of the Tail class. In this case, there are two possibilities. One is that the class does not instantiate the Tail for us. In this case, we can use the new keyword to instantiate it and set its properties - basically, just nesting what we already know.
Jungle.Add(new Monkey {
Name = "Norbert",
Sex = SexEnum.Male,
Age = 5,
Tail = new Tail { Length = 50 }
});
The other possibility is that the class does instantiate the Tail and we just need to set some of its properties. In this case we can omit not only the "new" keyword but also the name of the class, since that can be worked out by the compiler.
Jungle.Add(new Monkey {
Name = "Norbert",
Sex = SexEnum.Male,
Age = 5,
Tail = { Length = 50 }
});
You can nest as deeply as you wish, but be careful not to harm readability. Good use of whitespace can help on that front.
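The difference between the two nested forms matters at runtime. The sketch below uses two hypothetical classes, MonkeyWithTail and MonkeyWithoutTail (names invented for this example), to show that the form without "new" only sets properties on an existing Tail, and so fails if the Tail was never created.

```csharp
using System;

public class Tail
{
    public int Length { get; set; }
}

// Hypothetical monkey that creates its own Tail in the constructor.
public class MonkeyWithTail
{
    public string Name { get; set; }
    public Tail Tail { get; private set; }
    public MonkeyWithTail() { Tail = new Tail(); }
}

// Hypothetical monkey that leaves Tail as null until something assigns it.
public class MonkeyWithoutTail
{
    public string Name { get; set; }
    public Tail Tail { get; set; }
}

public static class NestedInitializerDemo
{
    public static int ExistingTailLength()
    {
        // No "new": Length is set on the Tail the constructor already created.
        var m = new MonkeyWithTail { Name = "Norbert", Tail = { Length = 50 } };
        return m.Tail.Length;
    }

    public static bool MissingTailThrows()
    {
        try
        {
            // Tail is null here, so setting Length on it fails at runtime.
            new MonkeyWithoutTail { Name = "Norbert", Tail = { Length = 50 } };
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ExistingTailLength());
        Console.WriteLine(MissingTailThrows());
    }
}
```

If you are unsure whether a class creates the nested object for you, the form with "new" is the safer choice.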

Collection Initializers

Collections can contain many values. Sometimes you will create a collection and then immediately add some values to it. Just as object initializer syntax made a common use case neater for objects, collection initializer syntax makes one neater for collections.

Again, let's take an example. Notice that I am already using the new C# 3 "var" keyword.
// Assumes Monkey, Tiger and Panda share a base class named Animal.
var Jungle = new List<Animal>();
Jungle.Add(new Monkey());
Jungle.Add(new Tiger());
Jungle.Add(new Panda());
Using a collection initializer, we can write this as:
var Jungle = new List<Animal>
{ new Monkey(), new Tiger(), new Panda() };


There are some rules concerning the use of collection initializers. First, if you are writing your own collections and want them to work with collection initializer syntax, they must implement the IEnumerable interface and provide an accessible Add method; the compiler turns each element in the braces into a call to Add. Second, every element must be implicitly convertible to the parameter type of that Add method, which in practice means the elements must all be of the same type (or more precisely, they must all have an implicit coercion to a single type). You can observe implicit coercion taking place whenever you assign a value of one type to a variable of another without the need to insert a cast.
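As a sketch of what a custom collection needs, the hypothetical Jungle class below (invented for this example, and holding plain strings to keep it short) works with collection initializer syntax in the released C# 3.0 compiler because it implements IEnumerable and has a public Add method. The Dictionary example shows the nested-brace form used when Add takes more than one argument.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: implements IEnumerable and exposes a public Add method.
public class Jungle : IEnumerable
{
    private readonly List<string> _animals = new List<string>();

    public void Add(string animal) { _animals.Add(animal); }

    public int Count { get { return _animals.Count; } }

    public IEnumerator GetEnumerator() { return _animals.GetEnumerator(); }
}

public static class CollectionInitializerDemo
{
    public static Jungle BuildJungle()
    {
        // Each element in the braces becomes one call to Add.
        return new Jungle { "Monkey", "Tiger", "Panda" };
    }

    public static Dictionary<string, int> BuildAges()
    {
        // Dictionary's Add takes two arguments, so each element is its own brace list.
        return new Dictionary<string, int> { { "Norbert", 5 }, { "Tigger", 3 } };
    }

    public static void Main()
    {
        foreach (object animal in BuildJungle())
            Console.WriteLine(animal);
        Console.WriteLine(BuildAges()["Norbert"]);
    }
}
```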

Initializer Performance

Shorter code doesn't always mean a performance improvement at runtime. In this case, the new object initializer syntax will almost certainly compile down to the same IL instructions as if you had not used it. You might save a tiny amount of memory due to not having to allocate space for the local variable. However, the compiler should have been able to optimize that away anyway. In short, expect equivalent performance: no better and no worse.

Anonymous Types

Anonymous simply means "without a name", and you can safely read the word "type" as "class" in this case. That is, in this section we are going to discuss the idea of classes without names.

In C# 2.0 we saw the introduction of anonymous methods. One of the consequences of a method having no name is that we had to take a reference to it - stored in a delegate type - right away, so we had some way to refer to it. The analogy with anonymous classes is that we are required to instantiate them right away. Therefore, the construct for creating an anonymous class also instantiates that class.

In C# 3.0, anonymous classes are greatly limited compared to standard classes. They can only inherit from object, and their only members are private fields, each with a matching read-only property, plus compiler-generated overrides of Equals, GetHashCode and ToString.

With all of these things in mind, let's see how we declare and instantiate an anonymous type.
var MyProduct = new {
Name = "Vacuum Cleaner",
Price = 94.99,
Description = "Really sucks! Have your carpets clean in no time."
};
There are a couple of things to notice here. First is that we do not have a name for the class. Therefore, there is no type that we can write before the name of the variable when declaring it. What we can do, however, is to write "var", which leaves the compiler to work out the type for us. While the types are anonymous as far as we should care, the compiler and runtime actually do have some way of identifying them.

The second thing to notice is that we have used the "new" keyword but without specifying a type name. Instead, we have placed something after it that looks just like the object initializers we were looking at a few moments ago. This is not a coincidence: we actually are initializing the object created by new. The question is, where is the definition of the class?

The class is created by looking at the initializer. For each name assigned to inside the initializer (Name, Price and Description in this case), a private field is created along with a read-only property, and the initial values are passed in through a constructor. In this case, the class might look roughly like this:
class __NO_NAME__ {
private readonly string _Name;
private readonly double _Price;
private readonly string _Description;

// The compiler passes the initializer values to a constructor.
public __NO_NAME__(string Name, double Price, string Description) {
_Name = Name;
_Price = Price;
_Description = Description;
}

public string Name {
get { return _Name; }
}
public double Price {
get { return _Price; }
}
public string Description {
get { return _Description; }
}
}
Note that the types of the fields are worked out by looking at what is being assigned to the property. Therefore, you are not allowed to assign a null value. It is the same type inference process that we have seen time and time again in C# 3.0.

Since anonymous classes are just classes and instances of them are just objects, you can do all of the things you'd expect to be able to with them, from simple things like accessing their properties through to more complicated things such as reflection.
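A short sketch of both ends of that range follows: plain property access and a reflective listing of the property names. The class and method names (AnonymousTypeDemo, PropertyNames) are invented for this example.

```csharp
using System;
using System.Reflection;

public static class AnonymousTypeDemo
{
    // Returns the property names of an anonymous type, discovered via reflection.
    public static string[] PropertyNames()
    {
        var product = new { Name = "Vacuum Cleaner", Price = 94.99 };

        PropertyInfo[] props = product.GetType().GetProperties();
        string[] names = new string[props.Length];
        for (int i = 0; i < props.Length; i++)
            names[i] = props[i].Name;
        return names;
    }

    public static void Main()
    {
        var product = new { Name = "Vacuum Cleaner", Price = 94.99 };

        // Ordinary, statically typed property access.
        Console.WriteLine(product.Name);
        Console.WriteLine(product.Price);

        // Reflection works just as it does for any named class.
        foreach (string name in PropertyNames())
            Console.WriteLine(name);
    }
}
```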

Type Equivalence

Type equivalence involves determining if two values are of the same type. In this case, we are concerned with type equivalence of objects instantiated from anonymous classes.

This comes up in practice when assignment is considered. Let's take an example.
var x = new {
Real = 5.4,
Complex = 2.8
};
var y = new {
Real = 1.9,
Complex = 5.3
};
x = y;
Remember from the first part of the series that C# 3.0 is statically typed. That means that the variables x and y both have and retain a given type. Therefore, if the assignment is to work then y has to be of the same type as x (we don't have to consider subtyping here, since anonymous classes always inherit from object).

Two anonymous types created in the same assembly will be considered equivalent if all of the following are true:
  • They have the same number of fields
  • They have fields of the same name declared in the same order
  • The types of each of the fields are identical
In the previous example, this is the case. However, any of the following changes to the anonymous type that was instantiated to give y will result in the types not being equivalent and the assignment resulting in a compile time error.
// Not equivalent due to an extra field.
var y = new {
Real = 1.9,
Complex = 5.3,
Conjugated = -5.3
};
// Not equivalent - fields in a different order.
var y = new {
Complex = 5.3,
Real = 1.9
};
// Not equivalent; different types (int != double)
var y = new {
Real = 2,
Complex = 4
};
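The equivalence rules can also be observed at runtime by comparing the types the compiler generated. This is a minimal sketch; the class and method names are invented for the example.

```csharp
using System;

public static class TypeEquivalenceDemo
{
    // Same names, same order, same types: the compiler reuses one class.
    public static bool SameShape()
    {
        var x = new { Real = 5.4, Complex = 2.8 };
        var y = new { Real = 1.9, Complex = 5.3 };

        bool sameType = x.GetType() == y.GetType();
        x = y; // Compiles, because x and y have the same static type.
        return sameType;
    }

    // Same names and types but a different order: two distinct classes.
    public static bool DifferentOrder()
    {
        var x = new { Real = 5.4, Complex = 2.8 };
        var z = new { Complex = 5.3, Real = 1.9 };

        // x = z; // Would be a compile-time error.
        return x.GetType() == z.GetType();
    }

    public static void Main()
    {
        Console.WriteLine(SameShape());
        Console.WriteLine(DifferentOrder());
    }
}
```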


Projections

There is one final feature of anonymous types to point out, and until you see LINQ this is going to feel a little obscure. So far we have declared a field in an anonymous type by specifying its name and initializing it to a value. There are two other ways.

The first is to simply write the name of an already declared variable. The name of the variable will be taken as the name of the field, and the value it holds will be used to initialize the field. Using this, you could rewrite:
var x = new {
Real = 5.4,
Complex = 2.8
};
As:
var Real = 5.4;
var Complex = 2.8;
var x = new {
Real,
Complex
};
In this case it complicates the code, but it's worth being aware of. There is a variation on this where instead of naming a variable, you access a member of an existing object. The field takes the name and value of the member.

Imagine we have a class called Customer that represents all of the details of a customer, but we just want an anonymous type that contains the name and email address. We can write the following:
Customer c = GetCustomer(1764);
var EmailRecord = new { c.Name, c.Email };
This is equivalent to:
Customer c = GetCustomer(1764);
var EmailRecord = new {
Name = c.Name,
Email = c.Email
};
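One way to convince yourself that the two forms really are equivalent is to check that they produce the same anonymous type. In the sketch below the Customer class and its sample values are assumptions made for illustration, and the customer is constructed directly rather than fetched with GetCustomer.

```csharp
using System;

// Hypothetical Customer class; the article does not show its definition.
public class Customer
{
    public string Name { get; set; }
    public string Email { get; set; }
    public string Address { get; set; }
}

public static class ProjectionDemo
{
    public static bool FormsAreSameType()
    {
        Customer c = new Customer
        {
            Name = "Ann",
            Email = "ann@example.com",
            Address = "1 High Street"
        };

        // Short form: field names are taken from the members being read.
        var shortForm = new { c.Name, c.Email };

        // Long form: field names are written out explicitly.
        var longForm = new { Name = c.Name, Email = c.Email };

        return shortForm.GetType() == longForm.GetType();
    }

    public static void Main()
    {
        Console.WriteLine(FormsAreSameType());
    }
}
```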


Conclusion

Object and collection initializers are partly handy syntactic shortcuts. The real power in them is the ability to instantiate and set up an object with a single expression. This can save us from introducing an extra temporary variable, which can lead to neater code.

Anonymous types probably feel a little strange at the moment. While the other language features we have seen have immediate and obvious practical uses, the use of anonymous types may be a little harder to see.

All will become much clearer in the next and final part of the series, when we look at LINQ. LINQ is built using all of the primitives we have learnt about so far, so if you have read and understood this and the previous parts of the series, you are ready to understand not just how to use LINQ, but how it actually works.
