Language-Integrated Query
Cell Enumeration #
With the cell selectors, we can select the cells of a given type in the local
memory storage via IEnumerable<CellType> or IEnumerable<CellType_Accessor>. An
IEnumerable<T> is nothing more than a container that can pump out elements one
after another. This interface exposes only basic enumeration capabilities. It
does not provide an indexer, so we cannot take an element by specifying a
subscript. There are no rewind facilities either, so making a U-turn to revisit
an element is impossible.
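The forward-only behaviour described above can be illustrated with plain .NET types. This is a minimal sketch, not GE code: `EnumerationSketch`, `Cells` and `Drain` are hypothetical names, and `Cells` merely stands in for a cell selector.

```csharp
using System.Collections.Generic;

public static class EnumerationSketch
{
    // Hypothetical stand-in for a cell selector: yields values one at a time.
    public static IEnumerable<int> Cells()
    {
        yield return 10;
        yield return 20;
        yield return 30;
    }

    // Walks the sequence with the raw enumerator to show that the only
    // operations used are "advance" (MoveNext) and "read" (Current).
    public static List<int> Drain(IEnumerable<int> source)
    {
        var seen = new List<int>();
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                seen.Add(e.Current);
            }
        }
        // Note: source[1] would not compile, because IEnumerable<T> has no indexer.
        return seen;
    }
}
```

Calling `EnumerationSketch.Drain(EnumerationSketch.Cells())` visits each element exactly once, in order. To see an earlier element again, a new enumeration has to be started from the beginning.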
Enumerable Collection Operators #
Custom logic can be performed on the cells when iterating through them. .NET
provides a set of static methods for querying enumerable collections. For a
complete list of query methods, refer to the documentation for the Enumerable
class.
With the extension methods provided by System.Linq.Enumerable, we can use the
cell selectors to manipulate data in an elegant manner. Instead of writing data
processing logic in a foreach loop, we can use the query interfaces to extract
and aggregate information in a declarative way. For example, instead of writing:
var sum = 0;
foreach (var n in Global.LocalStorage.Node_Selector())
    sum += n.val;
We can simply write:
var sum = Global.LocalStorage.Node_Selector().Sum(n=>n.val);
Or:
var sum = Global.LocalStorage.Node_Selector().Select(n=>n.val).Sum();
The code eliminates the need for intermediate states (e.g., the sum variable
in this example) and spares us some implementation details. In GE, certain query
optimizations can be done automatically by the query execution engine to
leverage the indexes defined in TSL. Specifically, the execution engine inspects
the filters, extracts the substring queries, and dispatches them to the proper
substring query interfaces generated by the TSL compiler.
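A few more operators from System.Linq.Enumerable are sketched below. The example is self-contained and does not use GE: `NodeObject`, `OperatorSketch` and `NodeSelector` are hypothetical stand-ins for a TSL-generated cell type and its selector, so the index-based optimizations mentioned above do not apply to it.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a TSL-generated cell type with a `val` field.
public sealed class NodeObject
{
    public long CellID;
    public int val;
}

public static class OperatorSketch
{
    // Hypothetical stand-in for Global.LocalStorage.Node_Selector().
    public static IEnumerable<NodeObject> NodeSelector()
    {
        yield return new NodeObject { CellID = 1, val = 3 };
        yield return new NodeObject { CellID = 2, val = 8 };
        yield return new NodeObject { CellID = 3, val = 5 };
    }

    // Sum of all values, with no intermediate accumulator in user code.
    public static int Total() => NodeSelector().Sum(n => n.val);

    // Number of nodes whose value exceeds the threshold.
    public static int CountAbove(int threshold) =>
        NodeSelector().Count(n => n.val > threshold);

    // Largest value among all nodes.
    public static int Largest() => NodeSelector().Max(n => n.val);
}
```

With the three sample nodes, `Total()` yields 16, `CountAbove(4)` yields 2, and `Largest()` yields 8.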
Language-Integrated Query (LINQ) #
LINQ provides a convenient way of querying a data collection. The expressive
power of LINQ is equivalent to that of the extension methods provided by the
System.Linq.Enumerable class, only more convenient to use. The following
example demonstrates LINQ in GE versus its imperative equivalent:
/*========================== LINQ version ==============================*/
var result = from node in Global.LocalStorage.Node_Accessor_Selector()
             where node.color == Color.Red && node.degree > 5
             select node.CellID.Value;
/*========================== Imperative version ========================*/
var result = Global.LocalStorage.Node_Accessor_Selector()
             .Where( node => node.color == Color.Red && node.degree > 5 )
             .Select( node => node.CellID.Value );
Both versions will be translated to the same binary code; the elements in the
LINQ expression will eventually be mapped to the imperative interfaces provided
by the System.Linq.Enumerable class. With LINQ, however, we can write cleaner
code. For example, if we try to write an imperative equivalent for the following
LINQ expression, a nested lambda expression must be used.
var positive_feedbacks = from user in Global.LocalStorage.User_Accessor_Selector()
                         from comment in user.comments
                         where comment.rating == Rating.Excellent
                         select new
                         {
                             uid = user.CellID,
                             pid = comment.ProductID
                         };
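For comparison, one possible imperative form of the query above is sketched here with a nested lambda inside SelectMany. The sketch runs on plain in-memory objects, not on GE accessors: `UserObject`, `CommentObject`, `Rating` and `FeedbackSketch` are hypothetical stand-ins, and a named tuple replaces the anonymous type so the result can be returned from a method.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the TSL-generated types used in the query.
public enum Rating { Poor, Good, Excellent }

public sealed class CommentObject
{
    public long ProductID;
    public Rating rating;
}

public sealed class UserObject
{
    public long CellID;
    public List<CommentObject> comments = new List<CommentObject>();
}

public static class FeedbackSketch
{
    public static List<(long uid, long pid)> PositiveFeedbacks(IEnumerable<UserObject> users)
    {
        return users
            // Outer lambda: one sub-sequence per user.
            .SelectMany(user => user.comments
                // Inner lambdas: filter and project that user's comments.
                // `user` is captured from the outer lambda.
                .Where(comment => comment.rating == Rating.Excellent)
                .Select(comment => (uid: user.CellID, pid: comment.ProductID)))
            .ToList();
    }
}
```

The inner lambdas have to close over `user` from the outer one, which is the nesting that the LINQ expression hides behind its two `from` clauses.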
Parallel LINQ (PLINQ) #
PLINQ is a parallel implementation of LINQ. It runs the query on multiple processors simultaneously whenever possible. Calling AsParallel() on a selector turns it into a parallel enumerable container that works with PLINQ.
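As a minimal illustration of AsParallel(), the sketch below runs a filter and an aggregation through PLINQ on an ordinary sequence of integers. `PlinqSketch` and `SumOfEvenValues` are hypothetical names, and the input stands in for values read from cell objects; no GE selector is involved.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PlinqSketch
{
    // Sums the even values of the input. AsParallel() turns the sequence
    // into a ParallelQuery<int>, so Where and Sum resolve to their PLINQ
    // counterparts and may run on several cores.
    public static int SumOfEvenValues(IEnumerable<int> values)
    {
        return values
            .AsParallel()
            .Where(v => v % 2 == 0)
            .Sum();
    }
}
```

For example, `PlinqSketch.SumOfEvenValues(Enumerable.Range(1, 100))` returns 2550, the same value the sequential query would produce. Only the execution strategy changes.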
Limitations #
There is a limitation of IEnumerable<T>: IDisposable elements are not disposed
during the enumeration. However, disposing a cell accessor after use is
crucial: an undisposed cell accessor will result in the target cell being
locked permanently.
This has led to the design decision to actively dispose a cell accessor as soon as the user code finishes using it in the enumeration loop. As a result, capturing the value or reference of an accessor during an enumeration and storing it somewhere for later use is not allowed. The reference will be destroyed and the value will be invalidated immediately after the enumeration loop. Any operation on the stored value or reference will cause data corruption or a system crash. This is the root cause of the following limitations:
The Select operator cannot return cell accessors, because the accessors are disposed as soon as the loop is done.
LINQ operators that cache elements, such as join and group by, are not supported.
PLINQ caches some elements and distributes them to multiple cores, so it will not work with cell accessors. It does work with cell object selectors, though.
Although an enumeration operation does not lock the whole local storage, it does take the trunk-level locks. Compound LINQ selectors with join operations are not supported, because the inner loop will try to obtain the trunk lock that has been taken by the outer one.
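The accessor lifetime rule behind these limitations can be modelled with plain .NET types. The sketch below is an assumption-laden toy, not GE's implementation: `FakeAccessor`, `AccessorLifetimeSketch` and `Selector` are hypothetical, and disposal here only sets a flag, whereas a real accessor releases a lock on the cell.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a cell accessor: usable only until disposed.
public sealed class FakeAccessor : IDisposable
{
    private readonly long id;
    private readonly int degree;

    public bool Disposed { get; private set; }

    public FakeAccessor(long id, int degree)
    {
        this.id = id;
        this.degree = degree;
    }

    public long CellID { get { EnsureAlive(); return id; } }
    public int Degree { get { EnsureAlive(); return degree; } }

    public void Dispose() { Disposed = true; }

    private void EnsureAlive()
    {
        if (Disposed) throw new ObjectDisposedException(nameof(FakeAccessor));
    }
}

public static class AccessorLifetimeSketch
{
    // Hypothetical selector: each accessor is disposed as soon as the
    // consumer asks for the next element (or finishes the enumeration).
    public static IEnumerable<FakeAccessor> Selector()
    {
        for (int i = 1; i <= 3; i++)
        {
            using (var accessor = new FakeAccessor(i, i * 2))
            {
                yield return accessor;
            }
        }
    }

    // Safe pattern: copy plain values out while the accessor is still alive.
    public static List<long> SafeIds() =>
        Selector().Where(a => a.Degree > 2).Select(a => a.CellID).ToList();

    // Unsafe pattern: the stored accessors are already disposed when the
    // list is returned, so reading any of their fields throws.
    public static List<FakeAccessor> CapturedAccessors() => Selector().ToList();
}
```

In this model `SafeIds()` returns the IDs 2 and 3, while every element of `CapturedAccessors()` reports `Disposed == true`. The practical rule is the same as in the text: project accessors to plain values inside the query and never let the accessors themselves escape it.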