Hello,
I have an application with about 3000 regexps, each compiled using the form
regexp = new Regex(expression, RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
When I try to match a string against the 3000 regexps for the first time, it takes about 2 minutes on a big machine!
The second time, the match goes much faster and takes only a few milliseconds.
What is going on?

Slow regexp for first time match
sabitha
OK, I have removed RegexOptions.Compiled, and you know what?
I am happy.
Many thanks.
Anna Ahn
Many thanks to everybody.
As I understand it, the RegexOptions.Compiled option is the fastest one I can use.
That is what I do, but the regexp is not compiled in the Regex constructor; it is only compiled
on the very first match, and I want a good response time on the first match.
The solution I found is to make a fake match immediately after building the regexp.
The code is:
regexp = new Regex(expression, RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
regexp.Match("###"); // forces the actual compilation; really slow for 3000 expressions
Does anybody know if the PCRE library is usable from C#?
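The fake-match idea above can be sketched as a small startup helper. This is only an illustration, not code from the original application: the names RegexWarmup and BuildAndWarm are hypothetical, and running the dummy matches with Parallel.ForEach (which assumes .NET 4 or later) is my own assumption about one way to spread the first-match cost over several cores. Whether it shortens the total time in practice would have to be measured.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

static class RegexWarmup
{
    // Hypothetical helper: builds every regex, then forces the slow first
    // match at startup so later matches respond quickly.
    public static List<Regex> BuildAndWarm(IEnumerable<string> patterns, RegexOptions options)
    {
        var regexes = new List<Regex>();
        foreach (string pattern in patterns)
        {
            regexes.Add(new Regex(pattern, options));
        }

        // Regex instances are safe to use from several threads, so the
        // dummy matches can run in parallel (assumption: .NET 4 or later).
        Parallel.ForEach(regexes, regex => { regex.Match("###"); });
        return regexes;
    }

    static void Main()
    {
        // Illustrative patterns only; the real application loads about 3000.
        string[] patterns = { "foo\\d+", "^bar", "ba[zr]$" };
        RegexOptions options = RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant;

        List<Regex> regexes = BuildAndWarm(patterns, options);
        Console.WriteLine("{0} regexps warmed up", regexes.Count);
        Console.WriteLine(regexes[0].IsMatch("FOO42"));
    }
}
```

The warm-up does not remove the compilation cost; it only moves it to a moment where a delay is acceptable.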
_mubashir
Again,
I tried this:
regexp = new Regex(expression, /* RegexOptions.Compiled | */RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
//regexp.Match("###"); // forces the actual compilation; really slow for 3000 expressions
Removing the Compiled option is faster.
I don't know why this is better than the compiled expression:
the compile time is shorter, and the match is not that slow.
Alexander Mossin
I tried the following benchmark:
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
using System.IO;

namespace TestRegexp
{
    class Program
    {
        static void Main(string[] args)
        {
            // First run: interpreted regexps. Second run: RegexOptions.Compiled.
            Test(RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
            Test(RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
            Console.ReadLine();
        }

        private static void Test(RegexOptions opts)
        {
            List<Regex> regexps = new List<Regex>();
            string line;
            // regexp.txt holds one expression per line.
            FileStream file = new FileStream("..\\..\\regexp.txt", FileMode.Open, FileAccess.Read);
            TextReader reader = new StreamReader(file);
            TimeSpan compileTime = new TimeSpan();
            TimeSpan firstMatchTime = new TimeSpan();
            TimeSpan secondMatchTime = new TimeSpan();

            // Time the construction of each Regex.
            while ((line = reader.ReadLine()) != null)
            {
                DateTime begin = DateTime.Now;
                Regex reg = new Regex(line, opts);
                DateTime end = DateTime.Now;
                compileTime += end - begin;
                regexps.Add(reg);
            }

            // Time the first match of each Regex.
            foreach (Regex reg in regexps)
            {
                DateTime begin = DateTime.Now;
                reg.Match("###");
                DateTime end = DateTime.Now;
                firstMatchTime += end - begin;
            }

            // Time the second match of each Regex.
            foreach (Regex reg in regexps)
            {
                DateTime begin = DateTime.Now;
                reg.Match("###");
                DateTime end = DateTime.Now;
                secondMatchTime += end - begin;
            }

            Console.WriteLine(" compile time {0} s first match {1} s second match {2} s for {3} expr",
                compileTime.TotalSeconds, firstMatchTime.TotalSeconds, secondMatchTime.TotalSeconds, regexps.Count);
        }
    }
}
Results (first line without RegexOptions.Compiled, second line with it):
compile time 0,156253 s first match 0,0156253 s second match 0 s for 3010 expr
compile time 3,0625588 s first match 135,8463582 s second match 0,0156253 s for 3010 expr
I think I found a bug.
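A side note on the measurement itself: the smallest values above (0,0156253 s) look like a single tick of the system clock, which suggests DateTime.Now is too coarse for the short timings. A minimal sketch of the same kind of measurement with System.Diagnostics.Stopwatch could look like this. The RegexTiming class, its Time helper, and the sample pattern are hypothetical and not part of the benchmark above.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class RegexTiming
{
    // Hypothetical helper: measures one action with the high-resolution Stopwatch.
    public static TimeSpan Time(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }

    static void Main()
    {
        // Illustrative pattern only; the benchmark above reads its patterns from regexp.txt.
        Regex regex = new Regex("a+b", RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

        TimeSpan first = Time(() => regex.Match("###"));
        TimeSpan second = Time(() => regex.Match("###"));

        Console.WriteLine("first match {0} ms, second match {1} ms",
            first.TotalMilliseconds, second.TotalMilliseconds);
    }
}
```

This does not change the conclusion of the benchmark; it only gives finer-grained numbers for the fast cases.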
nidhig83
RashiQ
AZ_2005
Ahmad_Jafari
You compile it before you need it if you're going to use it tons and tons of times. If you're only going to use that regex a few times, compiling it is most likely not worth it. I've never compiled a regex. If I have a regex whose pattern stays the same, I create it once and then use that Regex instance over and over. I've seldom noticed a performance problem, and I've run actual benchmarks that had to parse 500-1k+ patterns across 20-30 very complex regex patterns. I've honestly never used the Compiled option, but I do know that if you do use it, the compilation is best done before you actually want to use the regex.
Hope that helps.
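The "create it once and reuse the instance" approach described above could be sketched like this. The RegexCache class and its Get method are hypothetical names for illustration, and the dictionary-plus-lock design is just one simple way to do it, not something from the original posts. The regexes are deliberately built without RegexOptions.Compiled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexCache
{
    private static readonly Dictionary<string, Regex> Cache = new Dictionary<string, Regex>();
    private static readonly object Gate = new object();

    // Hypothetical helper: returns the same interpreted Regex instance
    // for a given pattern every time it is requested.
    public static Regex Get(string pattern)
    {
        lock (Gate)
        {
            Regex regex;
            if (!Cache.TryGetValue(pattern, out regex))
            {
                // No RegexOptions.Compiled here: construction stays cheap.
                regex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
                Cache[pattern] = regex;
            }
            return regex;
        }
    }

    static void Main()
    {
        Regex a = Get("ab+c");
        Regex b = Get("ab+c");
        Console.WriteLine(object.ReferenceEquals(a, b)); // same instance is reused
        Console.WriteLine(a.IsMatch("xxABBCxx"));
    }
}
```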
Alex cai
The first time you create the regular expression, the expression is compiled, and that takes a considerable amount of time. After that you have a compiled version of the regex, so the second execution does not have to compile again and runs very fast.
As far as I can remember from what I read, one drawback of using RegexOptions.Compiled is that the compiled resource is not released until the program exits.
Edit: I found the link to that again: http://msdn2.microsoft.com/en-us/8zbs0h2f.aspx