zoukankan      html  css  js  c++  java
  • MySQL状态变量Aborted_connects与Aborted_clients浅析

     

    关于MySQL的状态变量Aborted_clients & Aborted_connects分别代表的意义,以及哪些情况或因素会导致这些状态变量变化呢?下文通过实验测试来验证一下,首先我们来看看状态变量的描述:

     

    Aborted Connect

     Aborted Connect表示尝试连接到MySQL服务器失败的次数。这个状态变量可以结合host_cache表和其错误日志一起来分析问题。 引起这个状态变量激增的原因如下:

        1、 客户端没有权限但是尝试访问MySQL数据库。

        2、 客户端输入的密码有误。

        3、 A connection packet does not contain the right information.

         4、超过连接时间限制,主要是这个系统变量connect_timeout控制(mysql默认是10s,基本上,除非网络环境极端不好,一般不会超时。)

    官方解释如下:

    If a client is unable even to connect, the server increments the Aborted_connects status variable. Unsuccessful connection attempts can occur for the following reasons:

    ·         A client attempts to access a database but has no privileges for it.

    ·         A client uses an incorrect password.

    ·         A connection packet does not contain the right information.

    ·         It takes more than connect_timeout seconds to obtain a connect packet. See Section 5.1.7, “Server System Variables”.

     

    Aborted Clients

    Aborted Clients表示由于客户端没有正确关闭连接而中止的连接数。官方解释如下:

       The number of connections that were aborted because the client died without closing the connection properly. SeeSection B.5.2.10, “Communication Errors and Aborted Connections

    Aborted Clients增大的时候意味着有客户端成功建立连接,但是由于某些原因断开连接或者被终止了,这种情况一般发生在网络不稳定的环境中。主要的可能性有:

       1、 客户端程序在退出之前未调用mysql_close()正确关闭MySQL连接。

       2、 客户端休眠的时间超过了系统变量wait_timeout和interactive_timeout的值,导致连接被MySQL进程终止

       3、 客户端程序在数据传输过程中突然结束

     

    官方文档B.5.2.10 Communication Errors and Aborted Connections的介绍如下:

     

    If a client successfully connects but later disconnects improperly or is terminated, the server increments theAborted_clients status variable, and logs an Aborted connection message to the error log. The cause can be any of the following:

    ·         The client program did not call mysql_close() before exiting.

    ·         The client had been sleeping more than wait_timeout or interactive_timeout seconds without issuing any requests to the server. See Section 5.1.7, “Server System Variables”.

    ·         The client program ended abruptly in the middle of a data transfer.

     

     

    Other reasons for problems with aborted connections or aborted clients:

    ·         The max_allowed_packet variable value is too small or queries require more memory than you have allocated formysqld. See Section B.5.2.9, “Packet Too Large”.

    ·         Use of Ethernet protocol with Linux, both half and full duplex. Some Linux Ethernet drivers have this bug. You should test for this bug by transferring a huge file using FTP between the client and server machines. If a transfer goes in burst-pause-burst-pause mode, you are experiencing a Linux duplex syndrome. Switch the duplex mode for both your network card and hub/switch to either full duplex or to half duplex and test the results to determine the best setting.

    ·         A problem with the thread library that causes interrupts on reads.

    ·         Badly configured TCP/IP.

    ·         Faulty Ethernets, hubs, switches, cables, and so forth. This can be diagnosed properly only by replacing hardware.

     

    如上介绍所示,有很多因素引起这些状态变量的值变化,那么我们来一个个分析、演示一下吧。首先,我们来测试一下导致Aborted Connect状态变量增加的可能因素

     

    1、 客户端没有权限但是尝试访问MySQL数据库。

    其实这里所说的没有权限,个人理解是:客户端使用没有授权的账号访问数据库 。打个比方,你尝试用账号kkk访问MySQL数据库,其实你也不知道数据库是否存在这个用户,实际上不存在这个用户。

    实验对比测试前,先将状态变量清零。

     

    mysql> flush status;
    Query OK, 0 rows affected (0.01 sec)
    mysql> show status like 'Abort%';
    +------------------+-------+
    | Variable_name    | Value |
    +------------------+-------+
    | Aborted_clients  | 0     |
    | Aborted_connects | 0     |
    +------------------+-------+
    2 rows in set (0.01 sec)
     
    mysql> 
    mysql> select host,user from mysql.user;
    +-------------------------------+-----------+
    | host                          | user      |
    +-------------------------------+-----------+
    | %                             | mydba     |
    | %                             | root      |
    | %                             | test      |
    | 127.0.0.1                     | root      |
    | 192.168.%                     | mydbadmin |
    | 192.168.103.18,192.168.103,22 | LimitIP   |
    | ::1                           | root      |
    | db-server.localdomain         | root      |
    | localhost                     | backuser  |
    | localhost                     | root      |
    +-------------------------------+-----------+

     

     

    在本机的SecureCRT的另外一个窗口,使用不存在的账号kkk访问MySQL后,你会发现状态变量Aborted_connects变为1了。

     

    [root@DB-Server ~]# mysql -u kkk -p

    Enter password:

    ERROR 1045 (28000): Access denied for user 'kkk'@'localhost' (using password: YES)

     

    也有可能,这个账号本身存在,但是只允许特定IP地址才能访问,实际环境中,可能是有人在进行尝试暴力破解。可能性非常多。我们来测试一下限制IP访问的情况

     

    mysql> grant all on MyDB.* to mydbadmin@'10.20.%' identified by '123456';
    Query OK, 0 rows affected (0.01 sec)
     
    mysql> flush privileges;
    Query OK, 0 rows affected (0.00 sec)
     
    mysql>  show status like 'Abort%';
    +------------------+-------+
    | Variable_name    | Value |
    +------------------+-------+
    | Aborted_clients  | 0     |
    | Aborted_connects | 0     |
    +------------------+-------+
    2 rows in set (0.00 sec)

    如上所示,创建一个mydbadmin的行号,只允许10.20段的IP访问,然后我们从192.168段的IP访问MySQL数据库

     

     

     

    # mysql -h 10.20.57.24 -u mydbadmin -p

    Enter password:

    ERROR 1045 (28000): Access denied for user 'mydbadmin'@'192.168.7.208' (using password: YES)

     

    此时,状态变量Aborted_connects就变为1了。

     

     

     

     

     

    2、 客户端输入的密码有误或者根本就是尝试各个密码。(A client uses an incorrect password)

     

    如下所示,使用test账号访问MySQL数据,但是输入了一个错误密码

     

    [root@DB-Server ~]# mysql -u test -p

    Enter password:

    ERROR 1045 (28000): Access denied for user 'test'@'localhost' (using password: YES)

    [root@DB-Server ~]#

     

    你检查状态变量Aborted_connects就会发现状态变量Aborted_connects变为2了。

     

    mysql>  show status like 'Abort%';
    +------------------+-------+
    | Variable_name    | Value |
    +------------------+-------+
    | Aborted_clients  | 0     |
    | Aborted_connects | 2     |
    +------------------+-------+
    2 rows in set (0.00 sec)

     

     

    3 A connection packet does not contain the right information.

     

     

    这个比较容易构造,可以对MySQL的端口进行端口测试(ping 端口),因为psping的包不包含正确的信息(right information),测试之前,先将状态变量清空。

     

     

    mysql> flush status;
     Query OK, 0 rows affected (0.00 sec)
    mysql> show status like 'abort%';
    +------------------+-------+
    | Variable_name    | Value |
    +------------------+-------+
    | Aborted_clients  | 0     |
    | Aborted_connects | 0     |
    +------------------+-------+
    2 rows in set (0.00 sec)

     

     

    在客户端对MySQL服务所在的主机进行端口连通性验证(psping)

     

     

    如上所示,psping测试后,Aborted_connects变成了5,如果继续进行psping测试,那么这个状态变量就会继续增长。

     

    另外,如果超过max_connect_error的限制后,某一个客户端持续访问MySQL,这个是否会引起状态变量Aborted_connects变化呢,实验测试的答案是不会。有兴趣的可以验证一下,很奇怪,网上有不少文章都说如果连接数满了,也会导致Aborted_connects状态变量增加,实际上这个是不会引起状态变量Aborted_connects变化的。

     

      4、 超过连接时间限制,主要是这个参数connect_timeout控制(mysql默认是10s,基本上,除非网络环境极端不好,一般不会超时。)

    首先在一台MySQL数据库服务器上执行下面命令,我们用Linux下的netem与tc命令模拟构造出复杂环境下的网络传输延时案例,延时11秒。

    # tc qdisc add dev eth0 root netem delay 11000ms

    在另外一台MySQL服务器ping这台MySLQ服务器,如下所示,你会看到网络时延为11秒

    # ping 10.20.57.24

    PING 10.20.57.24 (10.20.57.24) 56(84) bytes of data.

    64 bytes from 10.20.57.24: icmp_seq=1 ttl=61 time=11001 ms

    64 bytes from 10.20.57.24: icmp_seq=2 ttl=61 time=11001 ms

    64 bytes from 10.20.57.24: icmp_seq=3 ttl=61 time=11001 ms

    64 bytes from 10.20.57.24: icmp_seq=4 ttl=61 time=11001 ms

    64 bytes from 10.20.57.24: icmp_seq=5 ttl=61 time=11001 ms

    此时访问MySQL数据库,由于网络时延为11秒,超出了系统变量connect_timeout的10秒,就会出现下面错误,此时状态变量Aborted_connects的值变化!

    # mysql -h 10.20.57.24 -u test -p

    Enter password:

    ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 0

     

    那么如何区分状态变量Aborted Connect是那个引起的呢? 单从状态变量本身是无法区分的,但是可以结合performance_schema.host_cache来稍微做判别、甄别。

     

    COUNT_NAMEINFO_PERMANENT_ERRORS   IP到主机名称DNS解析期间的永久性错误数。

    COUNT_AUTHENTICATION_ERRORS       验证失败导致的错误数量

    SUM_CONNECT_ERRORS:         被视为 阻塞 的连接错误的数量 (根据max_connect_errors系统变量进行评估)。只有协议握手错误才会被计数,只有通过验证(HOST_VALIDATED = YES)的主机才会被计数

     

     

    1、 客户端没有权限但是尝试访问MySQL数据库。

        每次都会引起COUNT_AUTHENTICATION_ERRORS增1 ,第一次会引起COUNT_NAMEINFO_PERMANENT_ERRORS也增1

     

    2、 客户端输入的密码有误                   

        每次都会引起COUNT_AUTHENTICATION_ERRORS增1 ,第一次会引起COUNT_NAMEINFO_PERMANENT_ERRORS也增1

     

    其实对于与1和2,两者无法判别,最简单有效的将系统变量log_warnings设置为2,然后分析、查看错误日志信息:

     

    mysql> set global log_warnings=2;
    Query OK, 0 rows affected (0.00 sec)
    mysql> 

     

     

    那么此时1和2都会记录到错误日志里面去,然后你就可以通过分析错误日志,结合状态变量Aborted Connect来分析, 如下测试案例所示:

     

    2018-06-20 22:44:16 18026 [Warning] IP address '192.168.xxx.xxx' could not be resolved: Name or service not known

    2018-06-20 22:44:16 18026 [Warning] Access denied for user 'kkkk'@'192.168.xxx.xxx' (using password: YES)

    2018-06-20 22:45:18 18026 [Warning] Access denied for user 'test'@'192.168.xxx.xxx' (using password: YES)

     

    3、 A connection packet does not contain the right information 

    每次引起COUNT_HANDSHAKE_ERRORS增1,

    每次引起SUM_CONNECT_ERRORS增1

     

    C:Users>psping 10.20.57.24:3306
    
    
    PsPing v2.10 - PsPing - ping, latency, bandwidth measurement utility
    Copyright (C) 2012-2016 Mark Russinovich
    Sysinternals - www.sysinternals.com
     
    TCP connect to 10.20.57.24:3306:
     
    5 iterations (warmup 1) ping test:
     
    Connecting to 10.20.57.24:3306 (warmup): from 192.168.103.34:55327: 1.93ms
    Connecting to 10.20.57.24:3306: from 192.168.103.34:55328: 10.08ms
    Connecting to 10.20.57.24:3306: from 192.168.103.34:55329: 3.35ms
    Connecting to 10.20.57.24:3306: from 192.168.103.34:55330: 3.71ms
    Connecting to 10.20.57.24:3306: from 192.168.103.34:55331: 2.32ms
     
     
     
    TCP connect statistics for 10.20.57.24:3306:
      Sent = 4, Received = 4, Lost = 0 (0% loss),
      Minimum = 2.32ms, Maximum = 10.08ms, Average = 4.87ms

      4、 超过连接时间限制

     

    如果是超时引起,那么就会出现下面状况:

     

        每次引起SUM_CONNECT_ERRORS增1,

        每次引起COUNT_HANDSHAKE_ERRORS增1

        第一次会引起COUNT_NAMEINFO_PERMANENT_ERRORS增1

     

    注意: 3与4不会写入错误日志,3与4的区别可以通过COUNT_NAMEINFO_PERMANENT_ERRORS的值来区别。

     

     

     

    下面我们来实验测试一下状态变量Aborted Clients的变化因素,

     1、 客户端程序在退出之前未调用mysql_close()正确关闭MySQL连接。

    在实验前,使用flush status清理一下状态变量

    mysql> flush status;
    Query OK, 0 rows affected (0.00 sec)

    mysql> show status like 'Abort%';
    +------------------+-------+
    | Variable_name    | Value |
    +------------------+-------+
    | Aborted_clients  | 0     |
    | Aborted_connects | 0     |
    +------------------+-------+
    2 rows in set (0.00 sec)

    mysql>

     

    写一个简单的Python测试脚本python_mysql.py,如下所示,将关闭数据库连接的地方dbcon.close注释掉, 

    import mysql.connector

    try:
     
    dbcon=mysql.connector.connect(
     host='127.0.0.1',
     user='root' ,
    passwd='xxxxxxx',
     database='information_schema'
    )

     cursor= dbcon.cursor()
    sql_tex='select count(*) from MyDB.test'
     cursor.execute(sql_tex)
    dtlist= cursor.fetchall()
     print dtlist
    except mysql.connector.Error as e:

      print('operation the sql fail!{0}'.format(e))
     
    finally:

      cursor.close;
    # dbcon.close;

     

    然后执行一下脚本,检查状态变量Aborted_clients,然后发现状态变量Aborted_clients的值增1了。

     

    [root@DB-Server kerry]# python python_mysql.py

    [(99999,)]

     

    2、 客户端休眠的时间超过了系统变量wait_timeout和interactive_timeout的值,导致连接被MySQL进程终止

     

    mysql> show global variables like 'interactive_timeout';
    +---------------------+-------+
    | Variable_name       | Value |
    +---------------------+-------+
    | interactive_timeout | 28800 |
    +---------------------+-------+
    1 row in set (0.00 sec)

    mysql> show global variables like 'wait_timeout';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | wait_timeout  | 28800 |
    +---------------+-------+
    1 row in set (0.00 sec)

    mysql>

     

    将全局系统变量interactive_timeout 和wait_timeout设置为4秒

     

    mysql> set global interactive_timeout=4;
    Query OK, 0 rows affected (0.00 sec)

    mysql> set global wait_timeout=4;
    Query OK, 0 rows affected (0.00 sec)

    mysql> show status like 'Abort%';
    +------------------+-------+
    | Variable_name    | Value |
    +------------------+-------+
    | Aborted_clients  | 0     |
    | Aborted_connects | 0     |
    +------------------+-------+
    2 rows in set (0.00 sec)

     

    然后在客户端连接到MySQL数据库,不做任何操作,过来4秒后,你去操作就会出现错误ERROR 2013 (HY000): Lost connection to MySQL server during query


    # mysql -h 10.20.57.24 -u test -p
    Enter password:
    Welcome to the MySQL monitor.  Commands end with ; or g.
    Your MySQL connection id is 43
    Server version: 5.6.20-enterprise-commercial-advanced-log MySQL Enterprise Server - Advanced Edition (Commercial)

    Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.

    Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.

    mysql> select current_user();
    ERROR 2013 (HY000): Lost connection to MySQL server during query
    mysql>

     

    在MySQL服务器你就会看到状态变量Aborted_clients变为1了。

     

    mysql> show status like 'Abort%';

    +------------------+-------+

    | Variable_name    | Value |

    +------------------+-------+

    | Aborted_clients  | 1     |

    | Aborted_connects | 0     |

    +------------------+-------+

    2 rows in set (0.00 sec


    还有其他一些原因(客户端异常中断或查询超出了max_allowed_packet值)由于不方便构造,在此略过。另外,其实我们还可以通过tcpdump抓包工具来追踪分析。下面举个例子(这里

    简单介绍一下tcpdump,后续文章再做展开分析)

    在MySQL服务器使用tcpdump抓包

    [root@DB-Server ~]# tcpdump -i eth0  port 3306  -s 1500 -w tcpdump.log

    然后在另外一台MySQL服务器,使用不存在的账号或错误的密码访问MySQL数据库

    # mysql -h 10.20.57.24 -u kkk  -p

    Enter password:

    ERROR 1045 (28000): Access denied for user 'kkk'@'192.168.7.208' (using password: YES)

    # mysql -h 10.20.57.24 -u test -p

    Enter password:

    ERROR 1045 (28000): Access denied for user 'test'@'192.168.7.208' (using password: YES)

    [root@GETLNX28 ~]#

     

    执行完命令后,你可以使用CTRL + C结束抓包分析,然后查看分析。如下截图所示:

     

    [root@DB-Server ~]# tcpdump -i eth0  port 3306  -s 1500 -w tcpdump.log

    tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 1500 bytes

    28 packets captured

    28 packets received by filter

    0 packets dropped by kernel

    [root@DB-Server ~]# strings tcpdump.log

     

     

     ....截图....

     

     

     

     

     

     

     

     

    参考资料:

     

    https://dev.mysql.com/doc/refman/8.0/en/communication-errors.html

    http://www.olivierdoucet.info/blog/2012/05/08/customer-case-finding-cause-of-max_user_connections/

  • 相关阅读:
    Access操作必须使用一个可更新的查询
    SAP资料学习好地方
    Access关键词大全
    WPF零散笔记
    WPF:如何实现单实例的应用程序(Single Instance)
    WPF应用程序启动显示图片资源
    Drawable、Bitmap、Canvas和Paint的关系以及部分使用方法
    C#中一种可调用的异常处理方法
    easyui datagrid 点击列表头排序出现错乱的原因
    easyui datagrid 没数据时显示滚动条的解决方法
  • 原文地址:https://www.cnblogs.com/DataArt/p/10237360.html
Copyright © 2011-2022 走看看