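Example:
<?php
// Send the header before any output (the style block below counts as output)
header('Content-Type: text/html; charset=utf-8');
?>
<style>
.word-0{ background-color: yellow; }
.word-1{ border:1px solid red; }
</style>
<?php
/* Highlight words in a web page */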
$body = '
<p>I like pickles and hrrring.</p>
<a href="pickle.php"><img width="200" src="pickle.png">A pickle pic</a>
I have herringbone-patterned toaster cozy.
<herring>Herring is not a real HTML element!</herring>
';
$words = array('pickle', 'herring');
$replacements = array();
foreach($words as $i => $word) {
$replacements[] = "<span class='word-$i'>$word</span>";
}
// Split the page into chunks,
// delimited by anything that looks like an HTML element
$parts = preg_split('{(<(?:"[^"]*"|\'[^\']*\'|[^\'">])*>)}', $body, -1, PREG_SPLIT_DELIM_CAPTURE);
//var_dump($parts);
/*
array (size=15)
0 => string '
' (length=2)
1 => string '<p>' (length=3)
2 => string 'I like pickles and hrrring.' (length=27)
3 => string '</p>' (length=4)
4 => string '
' (length=2)
5 => string '<a href="pickle.php">' (length=21)
6 => string '' (length=0)
7 => string '<img width="200" src="pickle.png">' (length=34)
8 => string 'A pickle pic' (length=12)
9 => string '</a>' (length=4)
10 => string '
I have herringbone-patterned toaster cozy.
' (length=46)
11 => string '<herring>' (length=9)
12 => string 'Herring is not a real HTML element!' (length=35)
13 => string '</herring>' (length=10)
14 => string '
' (length=2)
*/
foreach($parts as $i => $part) {
// Skip this part if it is an HTML element
if(isset($part[0]) && ($part[0] == '<')) { continue; }
// Wrap the words in <span> elements
$parts[$i] = str_replace($words, $replacements, $part);
}
$body = implode('', $parts);
echo $body;
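Explanation:
The regular expression passed to preg_split() matches HTML tags:
<(?:"[^"]*"|'[^']*'|[^'">])*>
It can be read as follows:
<            // opening angle bracket
(?:          // any number of:
  "[^"]*"    //   a double-quoted string
  |          //   or
  '[^']*'    //   a single-quoted string
  |          //   or
  [^'">]     //   any character other than a single quote, a double quote, or >
)*
>            // closing angle bracket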
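To see the pattern in isolation, here is a minimal sketch that runs the same preg_split() call on a short string. The sample input $html is made up for illustration. It contains a > inside a quoted attribute value, to show that the quoted-string alternatives keep such a tag in one piece.

```php
<?php
// Hypothetical sample input: note the ">" inside the title attribute.
$html = 'a <b title="x > y">bold</b> c';

// Same pattern as in the example, written as a single-quoted PHP string
// so that only the single quotes need escaping.
$pattern = '{(<(?:"[^"]*"|\'[^\']*\'|[^\'">])*>)}';

// PREG_SPLIT_DELIM_CAPTURE keeps the captured tags in the result,
// so text chunks and tags alternate.
$parts = preg_split($pattern, $html, -1, PREG_SPLIT_DELIM_CAPTURE);

print_r($parts);
// The chunks are: 'a ', '<b title="x > y">', 'bold', '</b>', ' c'
```

Because the tags stay in the array, implode('', $parts) restores the original string exactly, which is what lets the example modify only the text chunks.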
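However, the str_replace() version above does not highlight the last "Herring", because its first letter is capitalized and str_replace() is case-sensitive. To make the replacement case-insensitive, replace str_replace() with preg_replace():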
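<?php
// Send the header before any output (the style block below counts as output)
header('Content-Type: text/html; charset=utf-8');
?>
<style>
.word-0{ background-color: yellow; }
.word-1{ border:1px solid red; }
</style>
<?php
/* Highlight words in a web page */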
$body = '
<p>I like pickles and hrrring.</p>
<a href="pickle.php"><img width="200" src="pickle.png">A pickle pic</a>
I have herringbone-patterned toaster cozy.
<herring>Herring is not a real HTML element!</herring>
';
$words = array('pickle', 'herring');
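$patterns = array();
$replacements = array();
foreach($words as $i => $word) {
// preg_quote() puts a backslash in front of every character in its argument that is part of regular expression syntax:
// . \ + * ? [ ^ ] $ ( ) { } = ! < > | : - (and # as of PHP 7.3). The second argument also escapes the pattern delimiter.
$patterns[] = '/'.preg_quote($word, '/').'/i';
// \\0 produces the backreference \0 (the whole match); a bare \0 inside double quotes would be a NUL byte
$replacements[] = "<span class='word-$i'>\\0</span>";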
}
// Split the page into chunks,
// delimited by anything that looks like an HTML element
$parts = preg_split('{(<(?:"[^"]*"|\'[^\']*\'|[^\'">])*>)}', $body, -1, PREG_SPLIT_DELIM_CAPTURE);
//var_dump($parts);
/*
array (size=15)
0 => string '
' (length=2)
1 => string '<p>' (length=3)
2 => string 'I like pickles and hrrring.' (length=27)
3 => string '</p>' (length=4)
4 => string '
' (length=2)
5 => string '<a href="pickle.php">' (length=21)
6 => string '' (length=0)
7 => string '<img width="200" src="pickle.png">' (length=34)
8 => string 'A pickle pic' (length=12)
9 => string '</a>' (length=4)
10 => string '
I have herringbone-patterned toaster cozy.
' (length=46)
11 => string '<herring>' (length=9)
12 => string 'Herring is not a real HTML element!' (length=35)
13 => string '</herring>' (length=10)
14 => string '
' (length=2)
*/
foreach($parts as $i => $part) {
// Skip this part if it is an HTML element
if(isset($part[0]) && ($part[0] == '<')) { continue; }
// Wrap the words in <span> elements
$parts[$i] = preg_replace($patterns, $replacements, $part);
}
$body = implode('', $parts);
echo $body;
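The two details that are easiest to get wrong in the preg_replace() version are the quoting of the search word and the backreference in the replacement string. The following is a small sketch of both, using a made-up search term ('c++') and a made-up <b> wrapper that are not part of the example above.

```php
<?php
// Hypothetical search term containing regex metacharacters.
$word = 'c++';

// preg_quote() escapes the metacharacters; the second argument also
// escapes the delimiter used around the pattern.
$pattern = '/' . preg_quote($word, '/') . '/i';

// In a single-quoted string, \0 reaches preg_replace() unchanged and
// acts as a backreference to the whole match. In a double-quoted
// string it must be written \\0, because "\0" is a NUL byte.
$replacement = '<b>\0</b>';

$result = preg_replace($pattern, $replacement, 'I like C++ and c++.');
echo $result; // I like <b>C++</b> and <b>c++</b>.
```

Since the replacement reuses the matched text instead of the search word, the original capitalization is preserved in the output.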
References:
PHP Cookbook, 3rd Edition
Mastering Regular Expressions, 3rd Edition