zoukankan      html  css  js  c++  java
  • PostgreSQL恢复误删数据

      在Oracle中;删除表或者误删表记录;有个闪回特性,不需要停机操作,可以完美找回记录。当然也有一些其他的恢复工具;例如odu工具,gdul工具。都可以找回数据。而PostgreSQL目前没有闪回特性。如何在不停机情况下恢复误删数据。还好是有完整的热备份。

      本文描述的方法是:利用热备份在另一台服务器进行数据恢复;再导入正式环境;这样不影响数据库操作。这方法也适用在Oracle恢复。必须满足几个条件

    1. 有完整的基础数据文件备份和归档文件备份.所以备份是很重要的。
    2. 有一台装好同款Postgres软件的服务器

    实例模拟讲解

      过程模拟误删表tbl_lottu_drop后;后续进行dml/ddl操作;表明正式数据库还是进行正常工作。在另外一台数据库基于数据库PITR恢复。恢复表tbl_lottu_drop的数据。

    • Postgres201 : 线上数据库服务器
    • Postgres202 : 操作服务器

    1. 创建一个有效的备份

    postgres=# select pg_start_backup(now()::text); 
     pg_start_backup 
    -----------------
     0/F000060
    (1 row)
    
    [postgres@Postgres201 ~]$ rsync -acvz -L --exclude "pg_xlog" --exclude "pg_log" $PGDATA /data/backup/20180428
    
    postgres=# select pg_stop_backup(); 
    NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
     pg_stop_backup 
    ----------------
     0/F000168
    (1 row)

    2. 模拟误操作

      2.1 创建一个需要恢复对象表tbl_lottu_drop。并插入1000记录。也保证数据从数据缓存写入磁盘中。

    lottu=> create table tbl_lottu_drop (id int);
    CREATE TABLE
    lottu=> insert into tbl_lottu_drop select generate_series(1,1000);  
    INSERT 0 1000
    lottu=> c lottu postgres
    You are now connected to database "lottu" as user "postgres".

      2.2 这个获取一个时间:用于后面基于数据库PITR恢复(当然现实操作后只能记住一个大概的时间;还往往是不准;可能记住的时间是误操作之后。后面有讲解如何获取需要恢复到那个时间点)

    lottu=# select now();
                  now              
    -------------------------------
     2018-04-28 20:47:31.617808+08
    (1 row)
    lottu=# checkpoint;
    CHECKPOINT
    lottu=# select pg_xlogfile_name(pg_switch_xlog());
         pg_xlogfile_name     
    --------------------------
     000000030000000000000010
    (1 row)

      2.3 进行drop表

    lottu=# drop table tbl_lottu_drop;
    DROP TABLE

      2.4 后续进行dml/ddl操作;表明正式数据库还是进行正常工作

    lottu=# create table tbl_lottu_log (id int);
    CREATE TABLE
    lottu=# insert into  tbl_lottu_log values (1),(2);
    INSERT 0 2
    lottu=# checkpoint;
    CHECKPOINT
    lottu=# select pg_xlogfile_name(pg_switch_xlog());
         pg_xlogfile_name     
    --------------------------
     000000030000000000000011
    (1 row)

    3. 恢复操作

      3.1 将备份拷贝到Postgres202数据库上

    [postgres@Postgres201 20180428]$ cd /data/backup/20180428
    [postgres@Postgres201 20180428]$ ll
    total 4
    drwx------. 18 postgres postgres 4096 Apr 28 20:42 data
    [postgres@Postgres201 20180428]$ rsync -acvz -L data postgres@192.168.1.202:/data/postgres

      3.2 删除不必要的文件

    [postgres@Postgres202 data]$ cd $PGDATA
    [postgres@Postgres202 data]$ rm backup_label.old postmaster.pid tablespace_map.old

      3.3 还原备份表空间软链接

    [postgres@Postgres202 data]$ cat tablespace_map 
    16385 /data/pg_data/lottu
    [postgres@Postgres202 data]$ mkdir -p /data/pg_data
    [postgres@Postgres202 data]$ cd pg_tblspc/
    [postgres@Postgres202 pg_tblspc]$ mv 16385/  /data/pg_data/lottu
    [postgres@Postgres202 pg_tblspc]$ ln -s /data/pg_data/lottu ./16385
    [postgres@Postgres202 pg_tblspc]$ ll
    total 0
    lrwxrwxrwx. 1 postgres postgres 19 Apr 28 23:12 16385 -> /data/pg_data/lottu

      3.4 将wal日志拷贝到Postgres202数据库上pg_xlog目录下;从哪个日志开始拷贝?

    [postgres@Postgres202 data]$ mkdir -p pg_xlog/archive_status
    [postgres@Postgres202 data]$ cat backup_label 
    START WAL LOCATION: 0/F000060 (file 00000003000000000000000F)
    CHECKPOINT LOCATION: 0/F000098
    BACKUP METHOD: pg_start_backup
    BACKUP FROM: master
    START TIME: 2018-04-28 20:42:15 CST
    LABEL: 2018-04-28 20:42:13.244358+08

      查看backup_label;知道00000003000000000000000F开始到正在写入的wal日志。

    [postgres@Postgres202 pg_xlog]$ ll
    total 65540
    -rw-------. 1 postgres postgres 16777216 Apr 28 20:42 00000003000000000000000F
    -rw-------. 1 postgres postgres      313 Apr 28 20:42 00000003000000000000000F.00000060.backup
    -rw-------. 1 postgres postgres 16777216 Apr 28 20:48 000000030000000000000010
    -rw-------. 1 postgres postgres 16777216 Apr 28 20:50 000000030000000000000011
    -rw-------. 1 postgres postgres 16777216 Apr 28 20:55 000000030000000000000012

      3.5 编辑recovery.conf文件

    [postgres@Postgres202 data]$ vi recovery.conf 
    
    restore_command = 'cp /data/arch/%f %p'            # e.g. 'cp /mnt/server/archivedir/%f %p'
    recovery_target_time = '2018-04-28 20:47:31.617808+08'
    recovery_target_inclusive = false
    recovery_target_timeline = 'latest'

      3.6 启动数据库;并验证数据

    [postgres@Postgres202 data]$ pg_start
    server starting
    [postgres@Postgres202 data]$ ps -ef | grep postgres
    root      1098  1083  0 22:32 pts/0    00:00:00 su - postgres
    postgres  1099  1098  0 22:32 pts/0    00:00:00 -bash
    root      1210  1195  0 22:55 pts/1    00:00:00 su - postgres
    postgres  1211  1210  0 22:55 pts/1    00:00:00 -bash
    postgres  1442     1  1 23:16 pts/0    00:00:00 /opt/pgsql96/bin/postgres
    postgres  1450  1442  0 23:16 ?        00:00:00 postgres: checkpointer process   
    postgres  1451  1442  0 23:16 ?        00:00:00 postgres: writer process   
    postgres  1459  1442  0 23:16 ?        00:00:00 postgres: wal writer process   
    postgres  1460  1442  0 23:16 ?        00:00:00 postgres: autovacuum launcher process   
    postgres  1461  1442  0 23:16 ?        00:00:00 postgres: archiver process   last was 00000005.history
    postgres  1462  1442  0 23:16 ?        00:00:00 postgres: stats collector process   
    postgres  1464  1099  0 23:16 pts/0    00:00:00 ps -ef
    postgres  1465  1099  0 23:16 pts/0    00:00:00 grep postgres
    [postgres@Postgres202 data]$ psql
    psql (9.6.0)
    Type "help" for help.
    
    postgres=# c lottu lottu
    You are now connected to database "lottu" as user "lottu".
    lottu=> dt
                List of relations
     Schema |      Name      | Type  | Owner 
    --------+----------------+-------+-------
     public | pitr_test      | table | lottu
     public | tbl_lottu_drop | table | lottu
     
     lottu=> select count(1) from tbl_lottu_drop;
     count 
    -------
      1000
    (1 row)

      从这看数据是恢复了;copy到线上数据库操作略。

    延伸点

    下面讲解下如何找到误操作的时间。即recovery_target_time = '2018-04-28 20:47:31.617808+08'的时间点。上文是前面已经获取的;

      1. 用pg_xlogdump解析这段日志。

    [postgres@Postgres201 pg_xlog]$ pg_xlogdump -b 00000003000000000000000F 000000030000000000000012 > lottu.log
    pg_xlogdump: FATAL:  error in WAL record at 0/12000648: invalid record length at 0/12000680: wanted 24, got 0

      2. 从lottu.log中可以找到这段日志

    rmgr: Transaction len (rec/tot):      8/    34, tx:       1689, lsn: 0/100244A0, prev 0/10024460, desc: COMMIT 2018-04-28 20:45:49.736013 CST
    rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/100244C8, prev 0/100244A0, desc: RUNNING_XACTS nextXid 1690 latestCompletedXid 1689 oldestRunningXid 1690
    rmgr: Heap        len (rec/tot):      3/  3130, tx:       1690, lsn: 0/10024500, prev 0/100244C8, desc: INSERT off 9
        blkref #0: rel 16385/16386/2619 fork main blk 15 (FPW); hole: offset: 60, length: 5116
    
    rmgr: Btree       len (rec/tot):      2/  7793, tx:       1690, lsn: 0/10025140, prev 0/10024500, desc: INSERT_LEAF off 385
        blkref #0: rel 16385/16386/2696 fork main blk 1 (FPW); hole: offset: 1564, length: 452
    
    rmgr: Heap        len (rec/tot):      2/   184, tx:       1690, lsn: 0/10026FD0, prev 0/10025140, desc: INPLACE off 16
        blkref #0: rel 16385/16386/1259 fork main blk 0
    rmgr: Transaction len (rec/tot):     88/   114, tx:       1690, lsn: 0/10027088, prev 0/10026FD0, desc: COMMIT 2018-04-28 20:46:37.718442 CST; inval msgs: catcache 49 catcache 45 catcache 44 relcache 32784
    rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/10027100, prev 0/10027088, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
    rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/10027138, prev 0/10027100, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
    rmgr: XLOG        len (rec/tot):     80/   106, tx:          0, lsn: 0/10027170, prev 0/10027138, desc: CHECKPOINT_ONLINE redo 0/10027138; tli 3; prev tli 3; fpw true; xid 0:1691; oid 40976; multi 1; offset 0; oldest xid 1668 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 1691; online
    rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/100271E0, prev 0/10027170, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
    rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/10027218, prev 0/100271E0, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
    rmgr: XLOG        len (rec/tot):     80/   106, tx:          0, lsn: 0/10027250, prev 0/10027218, desc: CHECKPOINT_ONLINE redo 0/10027218; tli 3; prev tli 3; fpw true; xid 0:1691; oid 40976; multi 1; offset 0; oldest xid 1668 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 1691; online
    rmgr: XLOG        len (rec/tot):      0/    24, tx:          0, lsn: 0/100272C0, prev 0/10027250, desc: SWITCH 
    rmgr: Standby     len (rec/tot):     24/    50, tx:          0, lsn: 0/11000028, prev 0/100272C0, desc: RUNNING_XACTS nextXid 1691 latestCompletedXid 1690 oldestRunningXid 1691
    rmgr: Standby     len (rec/tot):     16/    42, tx:       1691, lsn: 0/11000060, prev 0/11000028, desc: LOCK xid 1691 db 16386 rel 32784 
    rmgr: Heap        len (rec/tot):      8/  2963, tx:       1691, lsn: 0/11000090, prev 0/11000060, desc: DELETE off 16 KEYS_UPDATED 
        blkref #0: rel 16385/16386/1247 fork main blk 8 (FPW); hole: offset: 88, length: 5288

    根据“32784”日志可以看到是表tbl_lottu_drop在2018-04-28 20:46:37.718442插入1000条记录(所以恢复时间点选2018-04-28 20:47:31.617808+08没毛病);即也是在事务id为1690操作的。并在事务id为1691进行删除操作。

    所以上面的recovery.conf 也可以改写为:

    restore_command = 'cp /data/arch/%f %p'            # e.g. 'cp /mnt/server/archivedir/%f %p'
    recovery_target_xid = '1690' 
    recovery_target_inclusive = false    
    recovery_target_timeline = 'latest'
  • 相关阅读:
    tree related problems (update continuously)
    [单元測试]_[VC2010使用gtest单元測试入门]
    generate alphanumeric serial number
    UiAutomator源代码分析之获取控件信息
    Fckeditor常见漏洞的挖掘与利用整理汇总
    AsyncAwait
    .Net异步编程知多少
    ParameterizedThreadStart task
    AsyncAwait 学习
    8天玩转并行开发——第八天 用VS性能向导解剖你的程序
  • 原文地址:https://www.cnblogs.com/lottu/p/8968892.html
Copyright © 2011-2022 走看看