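  • Positive and negative lookahead, and positive and negative lookbehind, in JS regular expressions (the Chinese terms for these are not an exact translation)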

    ?:  is for non capturing group
    ?=  is for positive look ahead
    ?!  is for negative look ahead
    ?<= is for positive look behind
    ?<! is for negative look behind

    Please check here: http://www.regular-expressions.info/lookaround.html for a very good tutorial and examples on lookahead in regular expressions.


    -----------------------------------------------------------------------------


    https://stackoverflow.com/questions/10804732/what-is-the-difference-between-and-in-regex






    The difference between ?= and ?! is that the former requires the given expression to match and the latter requires it to not match. For example a(?=b) will match the "a" in "ab", but not the "a" in "ac". Whereas a(?!b) will match the "a" in "ac", but not the "a" in "ab".

    The difference between ?: and ?= is that ?= excludes the expression from the entire match while ?: just doesn't create a capturing group. So for example a(?:b) will match the "ab" in "abc", while a(?=b) will only match the "a" in "abc". a(b) would match the "ab" in "abc" and create a capture containing the "b".
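
    The three kinds of parentheses described above can be compared side by side. This is a minimal sketch in JavaScript; the variable names are illustrative only:

```javascript
const text = "abc";

// Non-capturing group: "b" is part of the match, but no capture is created.
const nonCapturing = text.match(/a(?:b)/);
console.log(nonCapturing[0], nonCapturing.length); // ab 1

// Positive lookahead: "b" must follow, but it is not part of the match.
const lookahead = text.match(/a(?=b)/);
console.log(lookahead[0], lookahead.length); // a 1

// Capturing group: "b" is part of the match and is also captured.
const capturing = text.match(/a(b)/);
console.log(capturing[0], capturing[1]); // ab b

// Negative lookahead: matches "a" only when "b" does not follow.
console.log(/a(?!b)/.test("ab"), /a(?!b)/.test("ac")); // false true
```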



    _______________________________________________________________

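    Positive lookahead and negative lookahead in JS regular expressions

    Anyone who has not used these constructs probably finds the concept a little hard to grasp, so the examples below show how each one is used.

    1. ?:pattern: non-capturing match:
    It takes part in matching, is a non-capturing match, and its text does appear in the matched result. For example, windows(?:2000|NT|98) is equivalent to windows2000|windowsNT|windows98.
    It is simply a more compact way of writing the | alternatives. Unlike an ordinary (...|...) group, it is not returned as a submatch:

    Example 1:
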
    var reg1=/windows(?:2000|NT|98)/i
    var reg2=/windows(2000|NT|98)/i
    var str='windows2000'
    
    str.match(reg1) // ["windows2000", index: 0, input: "windows2000"]
    str.match(reg2) // ["windows2000", "2000", index: 0, input: "windows2000"]
    reg1.test(str)    //true
    reg2.test(str)    //true

    Note that the result returned for the first regex contains no submatch.

    2. ?=pattern: positive lookahead:

    It takes part in matching but is a non-capturing check: its text does not appear in the matched string.
    Example:

    var reg=/windows(?=2000|NT|98)/i
    var str='windows2000'
    var str2='windows xp'
    str.match(reg) // ["windows", index: 0, input: "windows2000"]
    str2.match(reg)    //null

    Here the engine:
    1. Matches windows; if that fails, the result is null.
    2. Checks whether one of 2000|NT|98 follows. If so, it returns windows; otherwise the result is null.
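
    The second step only checks the following text and does not consume it. A short sketch that makes this visible (the second pattern, with \d+ appended, is an assumption added for illustration):

```javascript
const subject = "windows2000";

// The lookahead checks for "2000" without consuming it ...
const checked = subject.match(/windows(?=2000|NT|98)/i);
console.log(checked[0], checked[0].length); // windows 7

// ... so the same characters can still be matched by what comes next.
const consumed = subject.match(/windows(?=2000|NT|98)\d+/i);
console.log(consumed[0]); // windows2000
```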

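    3. ?!pattern: negative lookahead:
    It matches at any position where the text that follows does not match pattern. It is also non-capturing, and its text does not appear in the matched string.
    Example: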

    var reg=/windows(?!2000|NT|98)/i
    var str='windows2000'
    var str2='windows xp'
    str.match(reg) // null
    str2.match(reg)    //["windows", index: 0, input: "windows xp"]

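    As you can see, this is exactly the opposite of the positive lookahead above.

    In the example above, ?! is directly preceded by literal text. It can also be preceded by a metacharacter (here a quantifier), as in the next example: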

    var reg=/windows*(?!2000|NT|98)/i    
    var str='windows2000'
    var str2='windows xp'
    str.match(reg) // ["window", index: 0, input: "windows2000"]
    str2.match(reg)    //["windows", index: 0, input: "windows xp"]

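    In a regular expression, * means "match the preceding subexpression zero or more times". For str, s* first takes the s, but the lookahead then sees 2000 and fails. The engine backtracks so that s* matches zero characters; at that position the following text is s2000, which does not start with 2000, NT or 98, so the lookahead passes and the result is window. For str2, nothing forbidden follows windows, so the whole word matches.
    Note: this last example can be hard to follow at first; writing a few more expressions of your own makes it clearer.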
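
    The backtracking can be observed directly. In this sketch, the stricter pattern (an assumption added for illustration, not part of the original post) also rejects a following s, which closes the position the engine falls back to:

```javascript
const loose = /windows*(?!2000|NT|98)/i;
console.log("windows2000".match(loose)[0]); // window

// Also rejecting a following "s" blocks the match where s* has given up its "s".
const strict = /windows*(?!s|2000|NT|98)/i;
console.log("windows2000".match(strict)); // null
console.log("windows xp".match(strict)[0]); // windows
```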

    Original personal blog post; when reposting, please credit the source: https://www.cnblogs.com/xyyt
     
     
     

    https://javascript.info/regexp-lookahead-lookbehind

     

     

    ————————————————————————————————————

    Lookahead and lookbehind

     

    Sometimes we need to find only those matches for a pattern that are followed or preceded by another pattern.

    There’s a special syntax for that, called “lookahead” and “lookbehind”, together referred to as “lookaround”.

    To start, let’s find the price in a string like 1 turkey costs 30€. That is: a number, followed by the € sign.

    Lookahead

    The syntax is: X(?=Y), it means "look for X, but match only if followed by Y". There may be any pattern instead of X and Y.

    For an integer number followed by €, the regexp will be \d+(?=€):

     
    let str = "1 turkey costs 30€";
    
    alert( str.match(/\d+(?=€)/) ); // 30, the number 1 is ignored, as it's not followed by €

    Please note: the lookahead is merely a test; the contents of the parentheses (?=...) are not included in the result 30.

    When we look for X(?=Y), the regular expression engine finds X and then checks if there’s Y immediately after it. If it’s not so, then the potential match is skipped, and the search continues.
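
    To see the "skip and continue" behaviour, the search can be run with the g flag over a longer string. A small sketch; the sample sentence is made up for illustration:

```javascript
const sentence = "1 turkey costs 30€, 2 ducks cost 45€";

// Every run of digits is a candidate; only those directly followed by € survive.
const prices = [...sentence.matchAll(/\d+(?=€)/g)].map((m) => `${m[0]}@${m.index}`);
console.log(prices); // [ '30@15', '45@33' ]
```

    The quantities 1 and 2 are found by \d+ but discarded, because the lookahead fails right after them.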

    More complex tests are possible, e.g. X(?=Y)(?=Z) means:

    1. Find X.
    2. Check if Y is immediately after X (skip if isn’t).
    3. Check if Z is also immediately after X (skip if isn’t).
    4. If both tests passed, then the X is a match, otherwise continue searching.

    In other words, such pattern means that we’re looking for X followed by Y and Z at the same time.

    That’s only possible if patterns Y and Z aren’t mutually exclusive.

    For example, \d+(?=\s)(?=.*30) looks for \d+ only if it’s followed by a space, and there’s 30 somewhere after it:

     
    let str = "1 turkey costs 30€";
    
    alert( str.match(/\d+(?=\s)(?=.*30)/) ); // 1

    In our string that exactly matches the number 1.
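
    For contrast, here is a sketch of what happens when the two conditions are mutually exclusive. The second pattern is an illustrative assumption, not taken from the original text:

```javascript
const s1 = "1 turkey costs 30€";

// Compatible conditions: a space right after, and "30" somewhere later.
console.log(s1.match(/\d+(?=\s)(?=.*30)/)[0]); // 1

// Mutually exclusive conditions: the next character cannot be both a space and €.
console.log(s1.match(/\d+(?=\s)(?=€)/)); // null
```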

    Negative lookahead

    Let’s say that we want a quantity instead, not a price from the same string. That’s a number \d+, NOT followed by €.

    For that, a negative lookahead can be applied.

    The syntax is: X(?!Y), it means "search X, but only if not followed by Y".

     
    let str = "2 turkeys cost 60€";
    
    alert( str.match(/\d+(?!€)/) ); // 2 (the price is skipped)
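
    One thing to watch for: once the g flag is added, a negative lookahead after \d+ can match part of a number, because the engine backtracks inside 60. A sketch of the effect and one possible remedy (the second pattern is an assumption, not from the original):

```javascript
const order = "2 turkeys cost 60€";

// With the g flag the engine backtracks inside "60" and reports the "6".
console.log(order.match(/\d+(?!€)/g)); // [ '2', '6' ]

// Rejecting a following digit as well keeps each number whole.
console.log(order.match(/\d+(?![\d€])/g)); // [ '2' ]
```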

    Lookbehind

    Lookahead allows us to add a condition for “what follows”.

    Lookbehind is similar, but it looks behind. That is, it allows to match a pattern only if there’s something before it.

    The syntax is:

    • Positive lookbehind: (?<=Y)X, matches X, but only if there’s Y before it.
    • Negative lookbehind: (?<!Y)X, matches X, but only if there’s no Y before it.

    For example, let’s change the price to US dollars. The dollar sign is usually before the number, so to look for $30 we’ll use (?<=\$)\d+, an amount preceded by $:

     
    let str = "1 turkey costs $30";
    
    // the dollar sign is escaped: \$
    alert( str.match(/(?<=\$)\d+/) ); // 30 (skipped the sole number)

    And if we need the quantity, a number not preceded by $, then we can use a negative lookbehind (?<!\$)\d+:

     
    let str = "2 turkeys cost $60";
    
    alert( str.match(/(?<!\$)\d+/) ); // 2 (skipped the price)
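
    Lookbehind was added to JavaScript in ES2018, and engines without it reject such a pattern with a SyntaxError. The sketch below shows one way to detect support at run time and fall back to a capturing group. The helper name supportsLookbehind is hypothetical, and the RegExp constructor is used so that the script itself still parses on older engines:

```javascript
// Hypothetical helper: returns true when the engine can compile a lookbehind.
function supportsLookbehind() {
  try {
    new RegExp("(?<=\\$)\\d+");
    return true;
  } catch (err) {
    return false; // engines without lookbehind throw a SyntaxError here
  }
}

const priceText = "1 turkey costs $30";
let amount;
if (supportsLookbehind()) {
  amount = priceText.match(new RegExp("(?<=\\$)\\d+"))[0];
} else {
  // Fallback without lookbehind: capture the digits after a literal "$".
  amount = priceText.match(/\$(\d+)/)[1];
}
console.log(amount); // 30
```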

    Capturing groups

    Generally, the contents inside lookaround parentheses do not become a part of the result.

    E.g. in the pattern \d+(?=€), the € sign doesn’t get captured as a part of the match. That’s natural: we look for a number \d+, while (?=€) is just a test that it should be followed by €.

    But in some situations we might want to capture the lookaround expression as well, or a part of it. That’s possible. Just wrap that part into additional parentheses.

    In the example below the currency sign (€|kr) is captured, along with the amount:

     
    let str = "1 turkey costs 30€";
    let regexp = /\d+(?=(€|kr))/; // extra parentheses around €|kr
    
    alert( str.match(regexp) ); // 30, €

    And here’s the same for lookbehind:

     
    let str = "1 turkey costs $30";
    let regexp = /(?<=(\$|£))\d+/;
    
    alert( str.match(regexp) ); // 30, $
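
    The extra parentheses can also be a named group (another ES2018 feature), which makes the captured part easier to read back. A sketch; the group name currency and the sample string are illustrative choices:

```javascript
const offer = "1 turkey costs 30kr";

// The named group sits inside the lookahead: it is captured but not consumed.
const found = offer.match(/\d+(?=(?<currency>€|kr))/);
console.log(found[0]);              // 30
console.log(found.groups.currency); // kr
```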

    Summary

    Lookahead and lookbehind (commonly referred to as “lookaround”) are useful when we’d like to match something depending on the context before/after it.

    For simple regexps we can do a similar thing manually. That is: match everything, in any context, and then filter by context in a loop.

    Remember, str.match (without flag g) and str.matchAll (always) return matches as arrays with index property, so we know where exactly in the text it is, and can check the context.

    But generally lookaround is more convenient.
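
    For comparison, the manual approach described above might look like this. It is a sketch that uses str.matchAll and the index property to inspect the character after each match:

```javascript
const line = "1 turkey costs 30€";

// Match every number in any context, then check the context by hand.
const manual = [];
for (const m of line.matchAll(/\d+/g)) {
  const next = line[m.index + m[0].length]; // character right after the match
  if (next === "€") manual.push(m[0]);
}
console.log(manual); // [ '30' ]
```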

    Lookaround types:

    Pattern   Type                 Matches
    X(?=Y)    Positive lookahead   X if followed by Y
    X(?!Y)    Negative lookahead   X if not followed by Y
    (?<=Y)X   Positive lookbehind  X if after Y
    (?<!Y)X   Negative lookbehind  X if not after Y
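
    All four types can be tried on one string. In this sketch the sample string is made up, and the \b word boundaries in the negative patterns are an added assumption that stops matches from starting or ending in the middle of a number:

```javascript
const amounts = "$10 20€ 30";

console.log(amounts.match(/\d+(?=€)/g));       // [ '20' ]
console.log(amounts.match(/\b\d+\b(?!€)/g));   // [ '10', '30' ]
console.log(amounts.match(/(?<=\$)\d+/g));     // [ '10' ]
console.log(amounts.match(/(?<!\$)\b\d+\b/g)); // [ '20', '30' ]
```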

    Tasks

     

    There’s a string of integer numbers.

    Create a regexp that looks for only non-negative ones (zero is allowed).

    An example of use:

    let regexp = /your regexp/g;
    
    let str = "0 12 -5 123 -18";
    
    alert( str.match(regexp) ); // 0, 12, 123

    The regexp for an integer number is \d+.

    We can exclude negatives by prepending it with the negative lookbehind: (?<!-)\d+.

    Although, if we try it now, we may notice one more “extra” result:

     
    let regexp = /(?<!-)\d+/g;
    
    let str = "0 12 -5 123 -18";
    
    console.log( str.match(regexp) ); // 0, 12, 123, 8

    As you can see, it matches 8, from -18. To exclude it, we need to ensure that the regexp starts matching a number not from the middle of another (non-matching) number.

    We can do it by specifying another negative lookbehind: (?<!-)(?<!\d)\d+. Now (?<!\d) ensures that a match does not start after another digit, just what we need.

    We can also join them into a single lookbehind here:

     
    let regexp = /(?<![-\d])\d+/g;
    
    let str = "0 12 -5 123 -18";
    
    alert( str.match(regexp) ); // 0, 12, 123
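
    Where lookbehind is unavailable, the same result can be approximated by consuming the preceding character and reading the number from a capturing group. This is an alternative sketch, not the solution given above:

```javascript
const numbers = "0 12 -5 123 -18";

// Group 1 consumes the preceding character (or the string start); group 2 is the number.
const nonNegative = [...numbers.matchAll(/(^|[^-\d])(\d+)/g)].map((m) => m[2]);
console.log(nonNegative); // [ '0', '12', '123' ]
```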
     

    We have a string with an HTML Document.

    Write a regular expression that inserts <h1>Hello</h1> immediately after <body> tag. The tag may have attributes.

    For instance:

    let regexp = /your regular expression/;
    
    let str = `
    <html>
      <body style="height: 200px">
      ...
      </body>
    </html>
    `;
    
    str = str.replace(regexp, `<h1>Hello</h1>`);

    After that the value of str should be:

    <html>
      <body style="height: 200px"><h1>Hello</h1>
      ...
      </body>
    </html>

    In order to insert after the <body> tag, we must first find it. We can use the regular expression pattern <body.*> for that.

    In this task we don’t need to modify the <body> tag. We only need to add the text after it.

    Here’s how we can do it:

     
    let str = '...<body style="...">...';
    str = str.replace(/<body.*>/, '$&<h1>Hello</h1>');
    
    alert(str); // ...<body style="..."><h1>Hello</h1>...

    In the replacement string $& means the match itself, that is, the part of the source text that corresponds to <body.*>. It gets replaced by itself plus <h1>Hello</h1>.
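
    The same replacement can be run against the multi-line document from the task. The sketch also shows an equivalent replacer function, which is an addition for comparison:

```javascript
const page = `
<html>
  <body style="height: 200px">
  ...
  </body>
</html>
`;

// "$&" in the replacement string stands for the matched text itself.
const viaPattern = page.replace(/<body.*>/, "$&<h1>Hello</h1>");

// Equivalent form: a replacer function receives the match as its first argument.
const viaFunction = page.replace(/<body.*>/, (match) => match + "<h1>Hello</h1>");

console.log(viaPattern === viaFunction); // true
console.log(viaPattern.includes('<body style="height: 200px"><h1>Hello</h1>')); // true
```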

    An alternative is to use lookbehind:

     
    let str = '...<body style="...">...';
    str = str.replace(/(?<=<body.*>)/, `<h1>Hello</h1>`);
    
    alert(str); // ...<body style="..."><h1>Hello</h1>...

    As you can see, there’s only a lookbehind part in this regexp.

    It works like this:

    • At every position in the text.
    • Check if it’s preceded by <body.*>.
    • If it’s so then we have the match.

    The tag <body.*> won’t be returned. The result of this regexp is literally an empty string, but it matches only at positions preceded by <body.*>.

    So it replaces the “empty line”, preceded by <body.*>, with <h1>Hello</h1>. That’s the insertion after <body>.

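    P.S. Regexp flags such as s and i can also be useful: /<body.*?>/si. The s flag makes the dot . match a newline character, and the i flag makes <body> also match <BODY> case-insensitively. With s, the quantifier should be lazy (.*?) so that the match stops at the first > rather than the last one in the document.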
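
    A caveat when adding the s flag: a greedy .* can then run across newlines to the last > in the document. The sketch below compares a lazy and a greedy quantifier on a made-up document whose tag is upper-case and split over two lines:

```javascript
const doc = '<html>\n<BODY\n  class="main">\n<p>text</p>\n</BODY>\n</html>';

// Lazy .*? with s and i: stops at the first ">" after "<BODY", even across a newline.
const lazy = doc.replace(/<body.*?>/si, "$&<h1>Hello</h1>");
console.log(lazy.includes('class="main"><h1>Hello</h1>')); // true

// Greedy .* with s: runs on to the last ">" in the whole document.
const greedy = doc.replace(/<body.*>/si, "$&<h1>Hello</h1>");
console.log(greedy.endsWith("</html><h1>Hello</h1>")); // true
```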

  • Original article: https://www.cnblogs.com/oxspirt/p/13648017.html