zoukankan      html  css  js  c++  java
  • cvm母机宕机重启后数据库修复

    下午正在开周会,然后收到短信,说是X.X.X.X的机器ping不通了,一轮测试过后,发现是某台数据库服务器挂了,先不急着重启,问下tencent客服。。。

    乖乖的好家伙,母机的主板坏了。。。。一个小时候,母机起来了,看下数据库起来么。。

    [root@VM_145_57_tlinux ~]# mysql -uroot -p1234
    ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)
    [root@VM_145_57_tlinux ~]# mysql -uroot -p    
    Enter password: 
    ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)
    [root@VM_145_57_tlinux ~]# ps -ef |grep mysql
    root      2016     1  0 17:50 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/data/mysql/var --socket=/tmp/mysql.sock --pid-file=/data/mysql/mysqld/mysqld.pid --basedir=/usr --user=mysql
    mysql     2177  2016 98 17:50 ?        00:11:21 /usr/libexec/mysqld --basedir=/usr --datadir=/data/mysql/var --user=mysql --log-error=/data/mysql/mysqld/mysqld.log --pid-file=/data/mysql/mysqld/mysqld.pid --socket=/tmp/mysql.sock --port=3306
    root     13643 13364  0 18:02 pts/0    00:00:00 grep mysql
    [root@VM_145_57_tlinux ~]# mysql -uroot -p1234 -S /tmp/mysql.sock
    ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)
    [root@VM_145_57_tlinux ~]# ls /tmp/mys^C
    [root@VM_145_57_tlinux ~]# /etc/init.d/mysqld restart
    Stopping mysqld:                                           [  OK  ]
    Starting mysqld:                                           [  OK  ]
    [root@VM_145_57_tlinux ~]# ps -ef | msyql
    -bash: msyql: command not found
    [root@VM_145_57_tlinux ~]# ps -ef |grep mysql
    root      2016     1  0 17:50 ?        00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/data/mysql/var --socket=/tmp/mysql.sock --pid-file=/data/mysql/mysqld/mysqld.pid --basedir=/usr --user=mysql
    mysql     2177  2016 99 17:50 ?        00:13:45 /usr/libexec/mysqld --basedir=/usr --datadir=/data/mysql/var --user=mysql --log-error=/data/mysql/mysqld/mysqld.log --pid-file=/data/mysql/mysqld/mysqld.pid --socket=/tmp/mysql.sock --port=3306
    root     13942     1  0 18:03 pts/0    00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/data/mysql/var --socket=/tmp/mysql.sock --pid-file=/data/mysql/mysqld/mysqld.pid --basedir=/usr --user=mysql
    mysql    14100 13942  0 18:03 pts/0    00:00:00 /usr/libexec/mysqld --basedir=/usr --datadir=/data/mysql/var --user=mysql --log-error=/data/mysql/mysqld/mysqld.log --pid-file=/data/mysql/mysqld/mysqld.pid --socket=/tmp/mysql.sock --port=3306
    root     14908 13364  0 18:04 pts/0    00:00:00 grep mysql

    这下亮了,出现了double进程,没发解释啊。。。【下面这步觉得error】

    [root@VM_145_57_tlinux ~]# kill -9 2016
    [root@VM_145_57_tlinux ~]# kill -9 2177
    [root@VM_145_57_tlinux ~]# ps -ef |grep mysql
    root     13942     1  0 18:03 pts/0    00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/data/mysql/var --socket=/tmp/mysql.sock --pid-file=/data/mysql/mysqld/mysqld.pid --basedir=/usr --user=mysql
    mysql    14100 13942  0 18:03 pts/0    00:00:00 /usr/libexec/mysqld --basedir=/usr --datadir=/data/mysql/var --user=mysql --log-error=/data/mysql/mysqld/mysqld.log --pid-file=/data/mysql/mysqld/mysqld.pid --socket=/tmp/mysql.sock --port=3306
    root     15240 13364  0 18:05 pts/0    00:00:00 grep mysql

    之前起来的进程被我kill -9之后,发现能连上数据库了,可是innodb存储引擎没起来。

    mysql> show engines;
    +------------+---------+-----------------------------------------------------------+--------------+------+------------+
    | Engine     | Support | Comment                                                   | Transactions | XA   | Savepoints |
    +------------+---------+-----------------------------------------------------------+--------------+------+------------+
    | MRG_MYISAM | YES     | Collection of identical MyISAM tables                     | NO           | NO   | NO         |
    | CSV        | YES     | CSV storage engine                                        | NO           | NO   | NO         |
    | MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance    | NO           | NO   | NO         |
    | MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables | NO           | NO   | NO         |
    +------------+---------+-----------------------------------------------------------+--------------+------+------------+
    4 rows in set (0.00 sec)

    似乎能看出问题了,进程起来了,socket没有建立,存储引擎没有启动-->innodb正在后台线程操作!!查下errorlog一探究竟:

    150126 17:45:52 mysqld_safe Starting mysqld daemon with databases from /data/mysql/var
    150126 17:45:53  InnoDB: Initializing buffer pool, size = 28.0G
    150126 17:45:55  InnoDB: Completed initialization of buffer pool
    InnoDB: Log scan progressed past the checkpoint lsn 1808 1368235506
    150126 17:45:55  InnoDB: Database was not shut down normally!
    InnoDB: Starting crash recovery.
    InnoDB: Reading tablespace information from the .ibd files...
    InnoDB: Restoring possible half-written data pages from the doublewrite
    InnoDB: buffer...
    InnoDB: Doing recovery: scanned up to log sequence number 1808 1373477888

    事实上,那个还没创建socket的进程是正在执行double write的回滚工作,继续往下翻页:

    InnoDB: Doing recovery: scanned up to log sequence number 1808 3523058688
    150126 18:03:02 mysqld_safe Starting mysqld daemon with databases from /data/mysql/var
    150126 18:03:03  InnoDB: Initializing buffer pool, size = 28.0G
    150126 18:03:03  InnoDB: Error: cannot allocate 30064787456 bytes of
    InnoDB: memory with malloc! Total allocated memory
    InnoDB: by InnoDB 37927312 bytes. Operating system errno: 12
    InnoDB: Check if you should increase the swap file or
    InnoDB: ulimits of your operating system.
    InnoDB: On FreeBSD check you have compiled the OS with
    InnoDB: a big enough maximum process size.
    InnoDB: Note that in most 32-bit computers the process
    InnoDB: memory space is limited to 2 GB or 4 GB.
    InnoDB: We keep retrying the allocation for 60 seconds...
    InnoDB: Doing recovery: scanned up to log sequence number 1808 3528301568

    这个是我在innodb引擎正在执行recovery的时候强行启动mysql的报错,提示内存不足。。继续往下翻页:

    InnoDB: Doing recovery: scanned up to log sequence number 1808 3606944768
    150126 18:04:03InnoDB: Fatal error: cannot allocate the memory for the buffer pool
    150126 18:04:03 [ERROR] Plugin 'InnoDB' init function returned error.
    150126 18:04:03 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
    150126 18:04:03 [Note] Event Scheduler: Loaded 0 events
    150126 18:04:03 [Note] /usr/libexec/mysqld: ready for connections.
    Version: '5.1.61'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution
    InnoDB: Doing recovery: scanned up to log sequence number 1808 3612187648
    [此处省略若干行] InnoDB: Doing recovery: scanned up to log sequence number 1808 3711802368 150126 18:16:28 [Note] /usr/libexec/mysqld: Normal shutdown 150126 18:16:28 [Note] Event Scheduler: Purging the queue. 0 events 150126 18:16:30 [Note] /usr/libexec/mysqld: Shutdown complete 150126 18:16:30 mysqld_safe mysqld from pid file /data/mysql/mysqld/mysqld.pid ended 150126 18:16:30 mysqld_safe Starting mysqld daemon with databases from /data/mysql/var 150126 18:16:30 InnoDB: Initializing buffer pool, size = 28.0G 150126 18:16:32 InnoDB: Completed initialization of buffer pool InnoDB: Log scan progressed past the checkpoint lsn 1808 1368235506 [kill -9的结果] 150126 18:16:32 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... InnoDB: Doing recovery: scanned up to log sequence number 1808 1373477888

    看到上面的错误,才恍然大悟,前面的操作是有多危险,要不是mysql的recovery不那么强悍的话,恐怕数据就被我这样弄没了。。。好悬

    InnoDB: Doing recovery: scanned up to log sequence number 1808 4183661568
    InnoDB: Doing recovery: scanned up to log sequence number 1808 4185407155
    150126 18:38:51  InnoDB: Starting an apply batch of log records to the database...
    InnoDB: Progress in percents: 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
    InnoDB: Apply batch completed
    150126 18:40:59  InnoDB: Started; log sequence number 1808 4185407155
    150126 18:40:59 [Note] Event Scheduler: Loaded 0 events
    150126 18:40:59 [Note] /usr/libexec/mysqld: ready for connections.
    Version: '5.1.61'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution

    回滚完成,happy启动服务。

    下面来讲下innodb_log_file_size在innodb异常时回滚机制:

     

    下面我们参考 http://www.cnblogs.com/zuoxingyu/archive/2012/10/25/2738864.html 大牛的博客,算一下innodb_log_file_size到底多大为最合适:【作为一个粗略的规则,你可以让这个日志足够大到能容纳最多一小时左右的日志】

    mysql> pager grep sequence
    PAGER set to 'grep sequence'
    mysql> show engine innodb statusG select sleep(60); show engine innodb statusG
    Log sequence number 1818 411580730
    1 row in set (0.00 sec)
    
    1 row in set (1 min 0.00 sec)
    
    Log sequence number 1818 446445181
    1 row in set (0.00 sec)

    (411580730-446445181)/1024/1024/60=1999M,向上取整2G,由于默认有2个日志文件,所以按照当前高峰期计算,设为1G为最佳值。

  • 相关阅读:
    使用RestTemplate上传文件到远程接口
    设计模式(五)之适配器模式
    设计模式(四)之装饰者模式
    设计模式(三)之模板方法模式
    设计模式(二)之责任链模式
    BUG-jQuery提交表单submit方法-TypeError: e[h] is not a function
    数据类型--集合 set
    数据类型--字典 dic
    字符 str 串需要记住的语法
    数据类型--列表 list
  • 原文地址:https://www.cnblogs.com/forilen/p/4253386.html
Copyright © 2011-2022 走看看