If the data source for the query has tens of thousands of items, does the new syntax automatically build an index behind the scenes? If not, is there a way to specify one? If not, is this being planned?
It is probably not reasonable to expect the query facility to optimize by sorting and/or creating a dictionary for the collections involved, automatically.
However, if the programmer is wise enough to build a dictionary as one of the collections, or to sort the collections beforehand, shouldn't there be some way for the query to take this into account (perhaps via explicit C# language in the query)?
For example, if I have two collections that I wish to join, and I would normally program that by sorting the collections first, then doing a merge, I would want the query language to do the join (merge) as I would have programmed it, given that I have explicitly written the sort statements before the query statement. Similarly, if one collection was a dictionary whose key was the join field, I would want to be able to specify the join in the query language by insisting that it use the dictionary key, and not loop through the dictionary examining each member.
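To make the first case concrete, here is a rough sketch of the kind of hand-written merge I have in mind. The names `MergeJoinExample` and `MergeJoin` are made up for illustration, and the sketch assumes both lists are already sorted ascending by the join key and that keys are unique within each list.

```csharp
using System;
using System.Collections.Generic;

public static class MergeJoinExample
{
    // Hypothetical helper: inner join of two lists by walking them in step.
    // Assumes both inputs are pre-sorted ascending by key, with unique keys.
    public static IEnumerable<(TLeft Left, TRight Right)> MergeJoin<TLeft, TRight, TKey>(
        IReadOnlyList<TLeft> left,
        IReadOnlyList<TRight> right,
        Func<TLeft, TKey> leftKey,
        Func<TRight, TKey> rightKey)
        where TKey : IComparable<TKey>
    {
        int i = 0, j = 0;
        while (i < left.Count && j < right.Count)
        {
            int cmp = leftKey(left[i]).CompareTo(rightKey(right[j]));
            if (cmp == 0)
            {
                // Keys match: emit the pair and advance both sides.
                yield return (left[i], right[j]);
                i++;
                j++;
            }
            else if (cmp < 0)
            {
                i++; // Left key is smaller, so it has no partner on the right.
            }
            else
            {
                j++; // Right key is smaller, so it has no partner on the left.
            }
        }
    }
}
```

Each list is traversed once, so after the sorts the join itself costs time proportional to the combined length of the two lists.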
Without this facility, the query language cannot be used on internal data efficiently, in my judgment. It is likely to lead to programmer frustration because the query language cannot express what she wants, or to much inefficient code when the programmer takes the path of least resistance and lets the query language implement its default behaviour.
As an aside: I think the query language can be a great addition to C#, and lead to much more concise (and therefore understandable) code.
You can certainly build your own dictionary and use the dictionary's indexer inside your query to do explicit lookups that are fast. The standard query operators are designed to work on IEnumerable<T>, which is merely a sequence and conveys no other information about itself, not even its length, so it's not generally optimizable. We are looking into building a LINQ API for large in-memory structures that would define its own operators, which would invoke a simple query processor built to recognize indexes in collections via common interfaces.
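As a minimal sketch of the dictionary approach described above: the `Customer` and `Order` types and the `DictionaryLookupExample` class are hypothetical and exist only for this illustration. The dictionary is built once, up front, and the query uses its indexer instead of a `join` clause.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types used only for this illustration.
public record Customer(int Id, string Name);
public record Order(int OrderId, int CustomerId);

public static class DictionaryLookupExample
{
    public static List<string> Describe(
        IEnumerable<Order> orders, IEnumerable<Customer> customers)
    {
        // Build the "index" explicitly, once, keyed on the join field.
        Dictionary<int, Customer> byId = customers.ToDictionary(c => c.Id);

        // The query does a hashed lookup per order rather than
        // scanning the customer collection for each one.
        var query =
            from o in orders
            where byId.ContainsKey(o.CustomerId)
            select $"{o.OrderId}: {byId[o.CustomerId].Name}";

        return query.ToList();
    }
}
```

Orders whose customer is missing from the dictionary are simply skipped, which gives inner-join behaviour. If the dictionary already exists for other reasons, the query reuses it at no extra cost.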
If the data source is a relational database, then the database's indexes will be used in the normal manner. As for other data sources, I don't know. I doubt that there's any automatic indexing, but I imagine that some data sources that implement the Query Expression Pattern, as per the C# 3 spec, may index their data.
Indexing data
VBZero
Bart Elia
John Bristowe