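A recent project required checking users' signatures and replies for sensitive words. I found an extension that does this well, so I'm sharing it here.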
https://github.com/FireLustre/php-dfa-sensitive
Install it with Composer:
composer require lustre/php-dfa-sensitive
Then create a Services directory under app and add SensitiveWords.php:
<?php

namespace App\Services;

use DfaFilter\SensitiveHelper;

class SensitiveWords
{
    // Shared SensitiveHelper instance, built on first use
    protected static $handle = null;

    // Private constructor and clone: the class is only used statically
    private function __construct()
    {
    }

    private function __clone()
    {
    }

    /**
     * Get the shared instance, loading the word lists on the first call
     */
    public static function getInstance($word_path = [])
    {
        if (!self::$handle) {
            // Default sensitive-word dictionaries
            $default_path = [
                storage_path('dict/bk.txt'),
                storage_path('dict/fd.txt'),
                storage_path('dict/ms.txt'),
                storage_path('dict/qt.txt'),
                storage_path('dict/sq.txt'),
                storage_path('dict/tf.txt'),
            ];
            $paths = array_merge($default_path, $word_path);

            self::$handle = SensitiveHelper::init();
            if (!empty($paths)) {
                foreach ($paths as $path) {
                    self::$handle->setTreeByFile($path);
                }
            }
        }

        return self::$handle;
    }

    /**
     * Check the content for sensitive words
     */
    public static function isLegal($content)
    {
        return self::getInstance()->islegal($content);
    }

    /**
     * Replace sensitive words in the content
     */
    public static function replace($content, $replace_char = '', $repeat = false, $match_type = 1)
    {
        return self::getInstance()->replace($content, $replace_char, $repeat, $match_type);
    }

    /**
     * Wrap sensitive words in the given start and end tags
     */
    public static function mark($content, $start_tag, $end_tag, $match_type = 1)
    {
        return self::getInstance()->mark($content, $start_tag, $end_tag, $match_type);
    }

    /**
     * Get the sensitive words found in the content
     */
    public static function getBadWord($content, $match_type = 1, $word_num = 0)
    {
        return self::getInstance()->getBadWord($content, $match_type, $word_num);
    }
}
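The "DFA" in the package name stands for deterministic finite automaton, and the setTreeByFile() call suggests the word lists are loaded into a tree. To give a feel for that kind of matching, here is a minimal trie sketch. It is not the library's code: buildTrie() and findWords() are hypothetical names made up for this illustration, and the longest-match behavior is my own choice.

```php
<?php
// Illustrative sketch only; NOT the implementation used by php-dfa-sensitive.

/**
 * Build a nested-array trie from a word list.
 * Each node maps one character to a child node; the key 'end' marks a full word.
 */
function buildTrie(array $words): array
{
    $root = [];
    foreach ($words as $word) {
        // Split into UTF-8 characters so multi-byte words work
        $chars = preg_split('//u', trim($word), -1, PREG_SPLIT_NO_EMPTY) ?: [];
        if ($chars === []) {
            continue; // skip blank entries
        }
        $node = &$root;
        foreach ($chars as $char) {
            if (!isset($node[$char])) {
                $node[$char] = [];
            }
            $node = &$node[$char]; // descend into the child node
        }
        $node['end'] = true;
        unset($node); // drop the reference before the next word
    }
    return $root;
}

/**
 * Scan the text once from each start position and collect the longest
 * dictionary word beginning there.
 */
function findWords(array $trie, string $text): array
{
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $count = count($chars);
    $found = [];
    for ($start = 0; $start < $count; $start++) {
        $node = $trie;
        $matchLen = 0;
        for ($i = $start; $i < $count && isset($node[$chars[$i]]); $i++) {
            $node = $node[$chars[$i]];
            if (isset($node['end'])) {
                $matchLen = $i - $start + 1; // longest match so far
            }
        }
        if ($matchLen > 0) {
            $found[] = implode('', array_slice($chars, $start, $matchLen));
            $start += $matchLen - 1; // continue after the matched word
        }
    }
    return $found;
}

// Example: a tiny in-memory word list instead of dictionary files
$trie = buildTrie(['bad', 'badword', '敏感']);
$found = findWords($trie, 'a badword and 敏感词 bad');
```

The point of the tree is that each position in the text is checked by walking characters one at a time, so the cost per position depends on the length of the longest word rather than on how many words the dictionaries contain.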
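With that in place, we can call SensitiveWords::getBadWord() anywhere in the project to get the sensitive words found in a piece of text; an empty result means none were found.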
$bad_word = SensitiveWords::getBadWord($content);
if (!empty($bad_word)) {
    // Report the first sensitive word that was found
    throw new Exception('Contains sensitive word: ' . current($bad_word));
}
Create a dict directory under storage to hold the sensitive-word dictionaries (bk.txt and so on). I downloaded these word lists from the internet.
Download link:
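Since getInstance() hands every path straight to setTreeByFile(), it can help to confirm the dictionary files are actually in place before the first check runs. The helper below is a rough sketch for that; missingDictFiles() is a hypothetical name and not part of php-dfa-sensitive, and how the library itself reacts to a missing file is not covered here.

```php
<?php
// Hypothetical helper, not part of php-dfa-sensitive.

/**
 * Return the paths that are not regular, readable files.
 */
function missingDictFiles(array $paths): array
{
    return array_values(array_filter($paths, function ($path) {
        return !is_file($path) || !is_readable($path);
    }));
}
```

In a Laravel project it could be called with the same storage_path('dict/...') list used in getInstance(), for example from a test or a deployment check, failing early if the returned array is not empty.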
https://download.csdn.net/download/jkko123/12066066