Microsoft .NET Framework 3.5 introduced the Language Integrated Query (LINQ) feature. LINQ provides a set of query operators to filter, join and perform other data-related operations on various types of data sources, such as lists, dictionaries and datasets.
In this article we perform a simple filter operation on a collection of data stored in a list, first with a traditional foreach loop and then with a LINQ query, and compare the timings.
To test the two methods, we prepare a sample list filled with 10,000,000 random numbers.
Numeric Collection
In the first case we consider a list of random integers, which is filtered to find out how many of the generated numbers are greater than 1,000,000. We implement the filter with a simple foreach loop and with a LINQ query, as shown below.
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        List<int> sample = GenerateSample();
        int count = 0;
        DateTime start;

        /* Using Foreach Loop */
        start = DateTime.Now;
        foreach (int number in sample)
            if (number > 1000000)
                count++;
        Console.WriteLine("Using Foreach Loop: " +
            (DateTime.Now - start).TotalMilliseconds + " ms");

        /* Using Linq Query */
        start = DateTime.Now;
        count = (from number in sample where number > 1000000
                 select number).Count();
        Console.WriteLine("Using Linq Query: " +
            (DateTime.Now - start).TotalMilliseconds + " ms");

        Console.ReadKey();
    }

    /* Builds the test list of 10,000,000 random non-negative integers */
    static List<int> GenerateSample()
    {
        List<int> sample = new List<int>();
        Random random = new Random();
        for (int i = 0; i < 10000000; i++)
            sample.Add(random.Next());
        return sample;
    }
}
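The query expression in the listing above is shorthand: the compiler translates it into calls to the standard query operators, so each element is tested through a delegate call rather than an inline comparison. A minimal sketch of the equivalent forms follows, using a small hard-coded list. The class name QuerySyntaxDemo, the method names and the sample values are made up for illustration and are not part of the original sample.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class QuerySyntaxDemo
{
    // Query syntax, in the same shape as the article's listing.
    public static int CountWithQuery(List<int> numbers, int threshold)
    {
        return (from number in numbers
                where number > threshold
                select number).Count();
    }

    // Method syntax: the form the query expression above is translated into.
    public static int CountWithWhere(List<int> numbers, int threshold)
    {
        return numbers.Where(number => number > threshold).Count();
    }

    // Count also accepts the predicate directly, without a separate Where call.
    public static int CountWithPredicate(List<int> numbers, int threshold)
    {
        return numbers.Count(number => number > threshold);
    }

    public static void Main()
    {
        // Hypothetical sample values: two of them exceed 1,000,000.
        var numbers = new List<int> { 5, 2000000, 999999, 1000001, 42 };
        Console.WriteLine(CountWithQuery(numbers, 1000000));     // 2
        Console.WriteLine(CountWithWhere(numbers, 1000000));     // 2
        Console.WriteLine(CountWithPredicate(numbers, 1000000)); // 2
    }
}
```

All three methods return the same count; they differ only in how the filter is spelled.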
The result for numeric collection:
Using Foreach Loop: 125.0504 ms
Using Linq Query: 547.0955 ms
String Collection
To test a string collection we take the same kind of random number list in string format and count the strings that start with "10".
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        List<string> sample = GenerateSample();
        int count = 0;
        DateTime start;

        /* Using Foreach Loop */
        start = DateTime.Now;
        foreach (string number in sample)
            if (number.StartsWith("10"))
                count++;
        Console.WriteLine("Using Foreach Loop: " +
            (DateTime.Now - start).TotalMilliseconds + " ms");

        /* Using Linq Query */
        start = DateTime.Now;
        count = (from number in sample where number.StartsWith("10")
                 select number).Count();
        Console.WriteLine("Using Linq Query: " +
            (DateTime.Now - start).TotalMilliseconds + " ms");

        Console.ReadKey();
    }

    /* Builds the test list of 10,000,000 random integers as strings */
    static List<string> GenerateSample()
    {
        List<string> sample = new List<string>();
        Random random = new Random();
        for (int i = 0; i < 10000000; i++)
            sample.Add(random.Next().ToString());
        return sample;
    }
}
The result for string collection:
Using Foreach Loop: 2688.5836 ms
Using Linq Query: 2235.2759 ms
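Both string timings include the cost of StartsWith itself. The single-argument overload performs a culture-sensitive comparison, which is typically slower than an ordinal one, so that cost may mask the difference between the loop and the query. The sketch below shows the same filter with the StringComparison.Ordinal overload. The class name, method names and sample strings are hypothetical, and no timings are claimed for it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OrdinalPrefixDemo
{
    // foreach version using an ordinal (character-by-character) prefix test.
    public static int CountWithLoop(List<string> sample, string prefix)
    {
        int count = 0;
        foreach (string text in sample)
            if (text.StartsWith(prefix, StringComparison.Ordinal))
                count++;
        return count;
    }

    // LINQ version using the same ordinal prefix test.
    public static int CountWithLinq(List<string> sample, string prefix)
    {
        return sample.Count(text => text.StartsWith(prefix, StringComparison.Ordinal));
    }

    public static void Main()
    {
        // Hypothetical sample: three of these strings start with "10".
        var sample = new List<string> { "1012345", "2101", "10", "1", "100000" };
        Console.WriteLine(CountWithLoop(sample, "10")); // 3
        Console.WriteLine(CountWithLinq(sample, "10")); // 3
    }
}
```

For the digit-only strings used in this test, the ordinal and culture-sensitive overloads should select the same items, so swapping the overload changes only the cost of each comparison.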
Conclusion
From the above results it is clear that for filtering integer or other numeric collections the traditional foreach loop is faster: about 125 ms against 547 ms, more than four times quicker in this run.
However, for filtering string collections with string operations such as StartsWith() or Substring(), LINQ came out ahead, though by a smaller margin: about 2235 ms against 2689 ms, roughly 17% faster (only StartsWith() was measured here).
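The listings in this article time a single pass of each method with DateTime.Now, so the first call also includes JIT compilation and any background activity on the machine. One way to reduce that noise is to warm up once and average several runs timed with Stopwatch. The sketch below does this for the numeric filter. The class RepeatedTiming, its helper methods, the fixed seed, the run count of 5 and the smaller list size are assumptions chosen for illustration, not the method used to produce the figures above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

static class RepeatedTiming
{
    // Runs the action once untimed as a warm-up, then returns the mean
    // elapsed time in milliseconds over the requested number of timed runs.
    public static double AverageMilliseconds(Action action, int runs)
    {
        if (runs <= 0) throw new ArgumentOutOfRangeException("runs");
        action(); // warm-up so JIT compilation is not measured
        var watch = new Stopwatch();
        double total = 0;
        for (int i = 0; i < runs; i++)
        {
            watch.Reset();
            watch.Start();
            action();
            watch.Stop();
            total += watch.Elapsed.TotalMilliseconds;
        }
        return total / runs;
    }

    public static int CountWithLoop(List<int> sample, int threshold)
    {
        int count = 0;
        foreach (int number in sample)
            if (number > threshold)
                count++;
        return count;
    }

    public static int CountWithLinq(List<int> sample, int threshold)
    {
        return (from number in sample where number > threshold
                select number).Count();
    }

    public static void Main()
    {
        // Assumption: fixed seed and 1,000,000 items to keep the sketch quick.
        var random = new Random(12345);
        var sample = new List<int>(1000000);
        for (int i = 0; i < 1000000; i++)
            sample.Add(random.Next());

        int loopCount = 0, linqCount = 0;
        double loopMs = AverageMilliseconds(() => loopCount = CountWithLoop(sample, 1000000), 5);
        double linqMs = AverageMilliseconds(() => linqCount = CountWithLinq(sample, 1000000), 5);

        Console.WriteLine("Foreach loop: " + loopMs + " ms (count " + loopCount + ")");
        Console.WriteLine("Linq query:   " + linqMs + " ms (count " + linqCount + ")");
    }
}
```

Printing the counts alongside the timings confirms that both methods selected the same number of items.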
Very nice article. However, I wonder if you could expound on your test methodology. Did you run the tests several times and take averages? If not, could it be that another process on your machine happened to request resources during one stage or another of your testing? I plan to use your sample code to run some tests myself but it would definitely improve the relevance of your article if you would provide answers to these questions. Also, what about multiple conditions in the Numeric example, such as value between 1,000,000 and 2,000,000 instead of merely greater than 1,000,000?
Thanks for the thought provoking article as well as the sample code!
I wonder if you pre-compiled your linq query if the results would have been the same….