I have to parse data containing blocks that are not always present. Here's an example:
<block id="aaa"> aaa data </block>
<block id="bbb"> bbb data <block > xxx data </block> </block>
<block id="ccc"> ccc data </block>
I want to capture each block (aaa, bbb and ccc) in a corresponding variable. The problem is that some blocks can be omitted.
I tried this regex:
"(?s)\s*" & _
"(?:<block id=""aaa"">\s*(?<aaa>.*)\s*</block>)?\s*" & _
"(?:<block id=""bbb"">\s*(?<bbb>.*)\s*</block>)?\s*" & _
"(?:<block id=""ccc"">\s*(?<ccc>.*)\s*</block>)?\s*"
But it doesn't work. The first (aaa) block consumes the other blocks.
How can I change that?

Parsing optional blocks with RegEx
Hydra20010
Thanks. That really helped.
P.S. I was parsing <div>-based HTML pages.
I want to get:
<aaa>= aaa data
<bbb>= bbb data <block > xxx data </block>
<ccc>= ccc data
sbogollu
"?" is 1 or 0 times (greedy, so 1 is preferred); "??" is 0 or 1 times (lazy, so 0 is preferred). I suggest you read up on regular expressions. A document I have always found very useful is perlre, although there are probably lots of other manuals and tutorials that are more specific to .NET regexes.
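To make the greedy/lazy difference concrete, here is a small sketch using the .NET Regex class. The module name and the sample inputs are made up for illustration; only the quantifier behaviour is the point.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuantifierDemo
    Sub Main()
        ' Optional group: "?" prefers one occurrence, "??" prefers zero.
        Console.WriteLine("[" & Regex.Match("ab", "(a)?").Value & "]")   ' prints [a]
        Console.WriteLine("[" & Regex.Match("ab", "(a)??").Value & "]")  ' prints []

        ' The same applies to ".*" (greedy) versus ".*?" (lazy).
        Dim input As String = "<b>x</b><b>y</b>"
        ' Greedy: runs to the LAST </b>, swallowing the second element.
        Console.WriteLine(Regex.Match(input, "<b>(.*)</b>").Groups(1).Value)   ' prints x</b><b>y
        ' Lazy: stops at the FIRST </b>.
        Console.WriteLine(Regex.Match(input, "<b>(.*?)</b>").Groups(1).Value)  ' prints x
    End Sub
End Module
```

The greedy ".*" case is what makes the first block swallow the following ones: it extends to the last closing tag it can find.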
I didn't notice that your input data contained a nested block. Dealing with those can be quite tricky when using regular expressions... Have a look at this topic to see how to deal with nested elements.
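One possible way to handle the nested block in .NET is a balancing group definition, a .NET-specific regex feature. The sketch below is only an illustration under a few assumptions: the helper name GetBlock and the group names "body" and "open" are invented here, each block is looked up separately by its id (so a missing block simply yields Nothing), and the tags are assumed to look exactly like those in the sample data.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NestedBlockDemo
    ' Hypothetical helper: returns the trimmed content of the block with the
    ' given id, or Nothing if the block is absent.
    Function GetBlock(ByVal input As String, ByVal id As String) As String
        ' Inside the body, every nested <block ...> pushes onto the "open"
        ' stack and every nested </block> pops it. A </block> seen while the
        ' stack is empty cannot be popped, so the loop stops there and that
        ' tag is matched as the closing tag of the outer block.
        Dim pattern As String = _
            "<block id=""" & Regex.Escape(id) & """>" & _
            "(?<body>(?:" & _
            "(?<open><block\b[^>]*>)" & _
            "|(?<-open></block>)" & _
            "|(?!</?block\b)." & _
            ")*)" & _
            "(?(open)(?!))" & _
            "</block>"
        ' Singleline lets "." match newlines, like (?s) in the question.
        Dim m As Match = Regex.Match(input, pattern, RegexOptions.Singleline)
        Return If(m.Success, m.Groups("body").Value.Trim(), Nothing)
    End Function

    Sub Main()
        Dim data As String = _
            "<block id=""aaa""> aaa data </block>" & Environment.NewLine & _
            "<block id=""bbb""> bbb data <block > xxx data </block> </block>" & Environment.NewLine & _
            "<block id=""ccc""> ccc data </block>"

        For Each id As String In New String() {"aaa", "bbb", "ccc", "zzz"}
            Dim body As String = GetBlock(data, id)
            Console.WriteLine(id & " = " & If(body, "(missing)"))
        Next
        ' Prints:
        ' aaa = aaa data
        ' bbb = bbb data <block > xxx data </block>
        ' ccc = ccc data
        ' zzz = (missing)
    End Sub
End Module
```

Matching each block on its own avoids the chain of optional groups entirely, so there is no need to decide between greedy and lazy "?" for omitted blocks.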
parsec
"(?s)\s*" & _
"(?:<block id=""aaa"">\s*(?<aaa>.*?)\s*</block>)?\s*" & _
"(?:<block id=""bbb"">\s*(?<bbb>.*?)\s*</block>)?\s*" & _
"(?:<block id=""ccc"">\s*(?<ccc>.*?)\s*</block>)?\s*"
This helps, but only partially. It results in:
<aaa>="aaa data"
<bbb>="bbb data <block > xxx data"
<ccc>=""
Is there any "greedy ?" which would mean 0 or 1 times, but with 1 preferred?