http://www.vckbase.com/document/viewdoc/?id=1138
GRETA matches roughly 7 times faster than the Boost (http://www.boost.org) regular expression library, and more than 10 times faster than ATL7's CAtlRegExp.
GRETA regular expression template class library
2. Abbreviation matching
    #include "stdafx.h"
    #include <stdio.h>
    #include <atlrx.h>

    int main(int argc, char* argv[])
    {
        CAtlRegExp<> reUrl;
        // five match groups: scheme, authority, path, query, fragment
        REParseError status = reUrl.Parse(
            "({[^:/?#]+}:)?(//{[^/?#]*})?{[^?#]*}(?{[^#]*})?(#{.*})?" );
        if (REPARSE_ERROR_OK != status)
        {
            // Unexpected error.
            return 0;
        }

        CAtlREMatchContext<> mcUrl;
        if (!reUrl.Match(
                "http://search.microsoft.com/us/Search.asp?qu=atl&boolean=ALL#results",
                &mcUrl))
        {
            // Unexpected error.
            return 0;
        }

        for (UINT nGroupIndex = 0; nGroupIndex < mcUrl.m_uNumGroups;
             ++nGroupIndex)
        {
            const CAtlREMatchContext<>::RECHAR* szStart = 0;
            const CAtlREMatchContext<>::RECHAR* szEnd = 0;
            mcUrl.GetMatch(nGroupIndex, &szStart, &szEnd);

            // The matched range is not null-terminated, so print it
            // with an explicit length ("%.*s" expects an int).
            ptrdiff_t nLength = szEnd - szStart;
            printf("%u: \"%.*s\"\n", nGroupIndex,
                   static_cast<int>(nLength), szStart);
        }

        return 0;
    }

Output:

    0: "http"
    1: "search.microsoft.com"
    2: "/us/Search.asp"
    3: "qu=atl&boolean=ALL"
    4: "results"

Match() returns its result through the CAtlREMatchContext object that its second parameter, pContext, points to. The match and its related information are stored in that object, so the caller reads the result through CAtlREMatchContext's methods and members. CAtlREMatchContext exposes the result through the m_uNumGroups member and the GetMatch() method: m_uNumGroups is the number of groups that matched, and GetMatch() takes a group index and returns the pStart and pEnd pointers that delimit the matched text. With these two pointers the caller can easily extract the match.

GRETA

To search or replace, first explicitly initialize an rpattern object with a string that describes the pattern, then pass the string to be matched to one of rpattern's member functions, such as match() or substitute(). If the match()/substitute() call fails, the function returns false; if it succeeds, the function returns true and the match_results object holds the result. For example:

    #include <iostream>
    #include <string>
    #include "regexpr2.h"
    using namespace std;
    using namespace regex;

    int main()
    {
        match_results results;
        string str( "The book cost $12.34" );

        // Match a dollar sign followed by one or more digits,
        // optionally followed by a period and two more digits.
        // The double-escapes are necessary to satisfy the compiler.
        rpattern pat( "\\$(\\d+)(\\.(\\d\\d))?" );

        match_results::backref_type br = pat.match( str, results );
        if( br.matched ) {
            cout << "match success!" << endl;
            cout << "price: " << br << endl;
        } else {
            cout << "match failed!" << endl;
        }
        return 0;
    }

The program outputs:

    match success!
    price: $12.34

See the GRETA documentation for the details of the rpattern object and for how to customize the search strategy for better performance.

Boost

    namespace boost{
    template <class charT,
              class traits = regex_traits<charT>,
              class Allocator = std::allocator<charT> >
    class basic_regex;

    typedef basic_regex<char> regex;
    typedef basic_regex<wchar_t> wregex;
    }

The Boost Regex library ships with extensive documentation and even better examples. Two of the sample programs, for instance, need only a small amount of code to convert a C++ file to syntax-highlighted HTML. The following example splits a string into tokens.

    #include <iterator>
    #include <list>
    #include <string>
    #include <boost/regex.hpp>

    // Moves the whitespace-separated tokens of s into l and returns
    // how many were found.
    unsigned tokenise(std::list<std::string>& l, std::string& s)
    {
        return boost::regex_split(std::back_inserter(l), s);
    }

    #include <iostream>
    using namespace std;

    #if defined(BOOST_MSVC) || (defined(__BORLANDC__) && (__BORLANDC__ == 0x550))
    // problem with std::getline under MSVC6sp3
    istream& getline(istream& is, std::string& s)
    {
        s.erase();
        char c;
        // Stop at the end of the line or when the stream runs out.
        while(is.get(c) && c != '\n')
            s.append(1, c);
        return is;
    }
    #endif

    int main(int argc, char* [])
    {
        string s;
        list<string> l;
        do{
            if(argc == 1)
            {
                cout << "Enter text to split (or \"quit\" to exit): ";
                // Also leave the loop when input ends.
                if(!getline(cin, s) || s == "quit") break;
            }
            else
                s = "This is a string of tokens";
            unsigned result = tokenise(l, s);
            cout << result << " tokens found" << endl;
            cout << "The remaining text is: \"" << s << "\"" << endl;
            while(l.size())
            {
                s = *(l.begin());
                l.pop_front();
                cout << s << endl;
            }
        }while(argc == 1);
        return 0;
    }
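For readers without GRETA, ATL, or Boost at hand, the price match and the token split can also be tried with the standard `<regex>` header (C++11 and later). The sketch below substitutes `std::regex` for the libraries discussed in this article; it is not any of those libraries' own API. The helper names `match_price` and `split_tokens` are made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the first dollar amount found in `text`
// (such as "$12.34"), or an empty string when there is none.
std::string match_price(const std::string& text)
{
    // Same pattern as the GRETA example. A raw string literal removes
    // the need for the double escapes.
    static const std::regex price(R"(\$(\d+)(\.(\d\d))?)");
    std::smatch m;
    if (std::regex_search(text, m, price))
        return m.str(0);   // group 0 is the whole match
    return std::string();
}

// Hypothetical helper: splits `text` on runs of whitespace.
std::vector<std::string> split_tokens(const std::string& text)
{
    static const std::regex ws(R"(\s+)");
    std::vector<std::string> tokens;
    // Submatch index -1 selects the text between separator matches.
    std::sregex_token_iterator it(text.begin(), text.end(), ws, -1);
    std::sregex_token_iterator end;
    for (; it != end; ++it)
    {
        // Leading whitespace yields one empty token; skip it.
        if (it->length() > 0)
            tokens.push_back(it->str());
    }
    return tokens;
}
```

Under these assumptions, `match_price("The book cost $12.34")` yields `$12.34`, and `split_tokens("This is a string of tokens")` yields six tokens. Both helpers leave the input string unchanged.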