My situation: I'm new to Spirit, I have to use VC6 and am thus using Spirit 1.6.4.
I have a line that looks like this:
//The Description;DESCRIPTION;;
I want to put the text DESCRIPTION in a string if the line starts with //The Description;.
I have something that works, but it doesn't look that elegant to me:
vector<char> vDescription; // std::string doesn't work due to missing ::clear() in VC6's STL implementation
if(parse(chars,
// Begin grammar
(
as_lower_d["//the description;"]
>> (+~ch_p(';'))[assign(vDescription)]
),
// End grammar
space_p).hit)
{
const string desc(vDescription.begin(), vDescription.end());
}
I would much rather assign all printable characters up to the next ';', but the following doesn't work because parse(...).hit == false:
parse(chars,
// Begin grammar
(
as_lower_d["//the description;"]
>> (+print_p)[assign(vDescription)]
>> ';'
),
// End grammar
space_p).hit)
How do I make it hit?
-
You're not getting a hit because ';' is matched by print_p. Try this:
parse(chars,
// Begin grammar
(
as_lower_d["//the description;"]
>> (+(print_p - ';'))[assign(vDescription)]
>> ';'
),
// End grammar
space_p).hit)
mxp: Thanks, I will try this tomorrow. It looks like there is a fundamental misunderstanding on my side then. I assumed the parser would try to match things if possible and not be so lazy... Do you know what the term for this behaviour is?
Fred Larson: I think the term is "greedy". See http://www.boost.org/doc/libs/1_35_0/libs/spirit/doc/faq.html#greedy_rd
-
You might try using confix_p:
confix_p(as_lower_d["//the description;"], (+print_p)[assign(vDescription)], ch_p(';'))
It should be equivalent to Fred's response.
The reason your code fails is because print_p is greedy. The +print_p parser will consume characters until it encounters the end of the input or a non-printable character. Semicolon is printable, so print_p claims it. Your input gets exhausted, the variable is assigned, and the match fails: there's nothing left for the last semicolon of your parser to match.
Fred's answer constructs a new parser, (print_p - ';'), which matches everything print_p does, except for semicolons. "Match everything except X, and then match X" is a common pattern, so confix_p is provided as a shortcut for constructing that kind of parser. The documentation suggests using it for parsing C- or Pascal-style comments, but that's not required.
For your code to work, Spirit would need to recognize that the greedy print_p matched too much and then backtrack to allow matching less. But although Spirit will backtrack, it won't backtrack to the "middle" of what a sub-parser would otherwise greedily match. It will backtrack to the next "choice point," but your grammar doesn't have any. See Exhaustive backtracking and greedy RD in the Spirit documentation.
mxp: Thanks, this works, too and looks even better. I have a small correction though: To get only the text _in_between_ confix_p's opening and closing, the [assign()] action has to be put behind the print_p instead of behind confix_p().
Rob Kennedy: Ah, you're right. I skimmed over the documentation too fast. It looks wrong at first, but the parser fixes it to do the right thing.
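To make the greedy behaviour discussed in this thread concrete, here is a minimal sketch in plain standard C++. It is a hand-rolled model, not Spirit code: the names plus_print, plus_print_except_semicolon, semicolon and body_then_semicolon are made up for illustration and only imitate, under that assumption, what +print_p, +(print_p - ';'), ch_p(';') and the >> sequence do on this kind of input.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hand-rolled model, NOT Spirit: every function tries to match at `pos`,
// advances `pos` on success and returns whether it hit.

// Imitates +print_p: greedily consumes every printable character, ';' included.
bool plus_print(const std::string& in, std::size_t& pos)
{
    const std::size_t start = pos;
    while (pos < in.size()
           && std::isprint(static_cast<unsigned char>(in[pos])))
        ++pos;
    return pos > start; // '+' needs at least one character
}

// Imitates +(print_p - ';'): the same loop, but it stops in front of a ';'.
bool plus_print_except_semicolon(const std::string& in, std::size_t& pos)
{
    const std::size_t start = pos;
    while (pos < in.size() && in[pos] != ';'
           && std::isprint(static_cast<unsigned char>(in[pos])))
        ++pos;
    return pos > start;
}

// Imitates ch_p(';').
bool semicolon(const std::string& in, std::size_t& pos)
{
    if (pos < in.size() && in[pos] == ';') { ++pos; return true; }
    return false;
}

// Imitates `body[assign(captured)] >> ';'`. The sequence never goes back
// into `body` to make it match less, and the "action" has already run by
// the time the trailing ';' is tried.
bool body_then_semicolon(bool (*body)(const std::string&, std::size_t&),
                         const std::string& in, std::string& captured)
{
    std::size_t pos = 0;
    if (!body(in, pos))
        return false;
    captured = in.substr(0, pos);
    return semicolon(in, pos);
}
```

For the input DESCRIPTION;; the call body_then_semicolon(plus_print, ...) returns false and leaves both semicolons in captured, whereas body_then_semicolon(plus_print_except_semicolon, ...) returns true and captures only DESCRIPTION.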