Unless marked as a repost, all posts on this blog are original. Please credit the source when reposting.
Link to this article: http://blog.chinaunix.net/blog/post/id/5766124.html
1. Reproducing the problem:
(1) Create the test table
root@localhost [test]> show create table student_info\G
*************************** 1. row ***************************
Table: student_info
Create Table: CREATE TABLE `student_info` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(20) NOT NULL,
`city` char(10) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.02 sec)
root@localhost [test]> select * from student_info;
+----+----------+-----------+
| id | name | city |
+----+----------+-----------+
| 1 | zhangsan | beijing |
| 2 | lisi | shanghai |
| 3 | xiaoming | guangzhou |
| 4 | yangli | chengdu |
| 5 | liuyang | tianjing |
+----+----------+-----------+
(2) Trigger the problem
On the slave, run:
delete from test.student_info where id=4;
Then run the same statement on the master:
delete from test.student_info where id=4;
The error now appears on the slave:
root@localhost [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.10.94
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysqlbin.000002
Read_Master_Log_Pos: 2095
Relay_Log_File: relay.000002
Relay_Log_Pos: 2022
Relay_Master_Log_File: mysqlbin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table test.student_info; Can't find record in 'student_info', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysqlbin.000002, end_log_pos 2064
Skip_Counter: 0
Exec_Master_Log_Pos: 1811
Relay_Log_Space: 2503
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table test.student_info; Can't find record in 'student_info', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysqlbin.000002, end_log_pos 2064
Replicate_Ignore_Server_Ids:
Master_Server_Id: 23306
Master_UUID: a236bacb-30af-11e7-b2e6-08002788f50a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 170504 18:20:52
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: a236bacb-30af-11e7-b2e6-08002788f50a:1-7
Executed_Gtid_Set: 9e3853f8-30af-11e7-a9c7-080027357b98:1,
a236bacb-30af-11e7-b2e6-08002788f50a:1-6
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
The output above contains the error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table test.student_info; Can't find record in 'student_info', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysqlbin.000002, end_log_pos 2064
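The slave's SQL thread has stopped (Slave_SQL_Running: No), so replicated changes are no longer applied. When the row with id 4 was deleted on the master, the resulting Delete_rows event reached the slave, which could not find that row because it had already been deleted there, so the event failed with error 1032.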
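As a rough sketch of how this state could be detected from a script, the shell functions below read SHOW SLAVE STATUS\G text on standard input and report whether the SQL thread is stopped, and with which error number. The function names status_field and check_sql_thread are made up for this example, and the parsing assumes the "Field: value" layout shown above.

```shell
#!/bin/sh
# Print the value of one field from `SHOW SLAVE STATUS\G` text read on stdin.
# Usage: status_field FIELD_NAME < status.txt   (hypothetical helper)
status_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)             # drop the left padding mysql adds
            prefix = key ":"
            # The field name must start the line, so "Slave_SQL_Running"
            # does not match "Slave_SQL_Running_State".
            if (!found && index(line, prefix) == 1) {
                value = substr(line, length(prefix) + 1)
                sub(/^[ \t]+/, "", value)
                print value
                found = 1
            }
        }'
}

# Print a one-line diagnosis; return 1 if the SQL thread is not running.
check_sql_thread() {
    slave_status=$(cat)
    running=$(printf '%s\n' "$slave_status" | status_field Slave_SQL_Running)
    errno=$(printf '%s\n' "$slave_status" | status_field Last_SQL_Errno)
    if [ "$running" = "Yes" ]; then
        echo "SQL thread running"
        return 0
    fi
    echo "SQL thread stopped, Last_SQL_Errno=$errno"
    return 1
}
```

One possible way to feed it, assuming a mysql client that can log in to the slave, is: mysql -e 'SHOW SLAVE STATUS\G' | check_sql_thread. For the status shown above it would print "SQL thread stopped, Last_SQL_Errno=1032".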
2. Fixing it:
(1) Stop the slave and change the slave execution mode:
root@localhost [(none)]> stop slave;
Query OK, 0 rows affected (0.00 sec)
root@localhost [(none)]> set global slave_exec_mode='IDEMPOTENT';
Query OK, 0 rows affected (0.00 sec)
root@localhost [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)
root@localhost [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.10.94
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysqlbin.000002
Read_Master_Log_Pos: 2095
Relay_Log_File: relay.000003
Relay_Log_Pos: 451
Relay_Master_Log_File: mysqlbin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 2095
Relay_Log_Space: 2800
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 23306
Master_UUID: a236bacb-30af-11e7-b2e6-08002788f50a
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: a236bacb-30af-11e7-b2e6-08002788f50a:1-7
Executed_Gtid_Set: 9e3853f8-30af-11e7-a9c7-080027357b98:1,
a236bacb-30af-11e7-b2e6-08002788f50a:1-7
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
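The error is gone: in IDEMPOTENT mode the slave suppresses key-not-found and duplicate-key errors when applying row events, so replication continues and other tables are unaffected. The data in student_info may still differ between master and slave, which is why the repair and verification steps below are needed.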
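To confirm from a script that the slave really has caught up, something like the following sketch could be used. It reads SHOW SLAVE STATUS\G text on standard input and succeeds only when both threads are running, no SQL error is recorded, and the executed position equals the position read from the master. The function name slave_caught_up is an invention for this example, and the check is only as reliable as the "Field: value" layout it assumes.

```shell
#!/bin/sh
# Exit status 0 when the slave has applied everything it has received.
# Reads `SHOW SLAVE STATUS\G` text on stdin (hypothetical helper).
slave_caught_up() {
    awk '
        {
            line = $0
            sub(/^[ \t]+/, "", line)             # drop left padding
            sep = index(line, ":")
            if (sep == 0) next                   # not a "Field: value" line
            key = substr(line, 1, sep - 1)
            value = substr(line, sep + 1)
            sub(/^[ \t]+/, "", value)
            if (!(key in field)) field[key] = value   # keep the first occurrence
        }
        END {
            ok = (field["Slave_IO_Running"] == "Yes" &&
                  field["Slave_SQL_Running"] == "Yes" &&
                  field["Last_SQL_Errno"] == "0" &&
                  field["Master_Log_File"] == field["Relay_Master_Log_File"] &&
                  field["Read_Master_Log_Pos"] == field["Exec_Master_Log_Pos"])
            exit (ok ? 0 : 1)
        }'
}
```

A possible invocation, assuming a mysql client that can log in to the slave: mysql -e 'SHOW SLAVE STATUS\G' | slave_caught_up && echo "slave caught up".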
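(2) Repair
Here pt-online-schema-change is used to repair the table. It is run on the master with a no-op alter (engine=innodb), which copies every row into a new table and swaps it in; assuming row-based replication, the copied rows are replicated as well, so the slave's table is rebuilt from the master's data.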
[root@master01 ~]# pt-online-schema-change -h localhost -u root -P 3306 --alter "engine=innodb" D=test,t=student_info --charset=utf8 --execute --nocheck-replication-filter
Cannot connect to A=utf8,D=test,P=3306,h=192.168.10.84,u=root
No slaves found. See --recursion-method if host master01 has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `test`.`student_info`...
Creating new table...
Created new table test._student_info_new OK.
Altering new table...
Altered `test`.`_student_info_new` OK.
2017-05-04T18:27:31 Creating triggers...
2017-05-04T18:27:31 Created triggers OK.
2017-05-04T18:27:31 Copying approximately 5 rows...
Cannot connect to A=utf8,D=test,P=3306,h=192.168.10.84,u=root
2017-05-04T18:27:31 Copied rows OK.
2017-05-04T18:27:31 Analyzing new table...
2017-05-04T18:27:31 Swapping tables...
2017-05-04T18:27:31 Swapped original and new tables OK.
2017-05-04T18:27:31 Dropping old table...
2017-05-04T18:27:32 Dropped old table `test`.`_student_info_old` OK.
2017-05-04T18:27:32 Dropping triggers...
2017-05-04T18:27:32 Dropped triggers OK.
Successfully altered `test`.`student_info`.
The repair is complete.
(3) Verify the data of the affected tables
[root@master01 ~]# pt-table-checksum --nocheck-replication-filters --no-check-binlog-format --recursion-method=processlist --replicate=test.checksum --databases=test h=localhost,u=root,P=3306
Cannot connect to P=3306,h=192.168.10.84,u=root
Diffs cannot be detected because no slaves were found. Please read the --recursion-method documentation for information.
TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
05-04T18:38:29 0 0 4 1 0 0.061 test.student_info
05-04T18:38:29 0 0 0 1 0 0.030 test.t1
root@localhost [test]> select * from test.checksum where this_crc <> master_crc or this_cnt <> master_cnt\G
Empty set (0.00 sec)
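When pt-table-checksum is able to reach the slaves (it was not in the run above, so its DIFFS column could not be filled in and the checksum table had to be queried by hand), its report can also be scanned automatically. The sketch below reads the report on standard input and prints every table whose ERRORS or DIFFS column is non-zero. The function name checksum_problems is hypothetical, and columns are located by the header names shown above instead of by fixed positions.

```shell
#!/bin/sh
# Print the tables that pt-table-checksum reported with errors or diffs.
# Reads the tool's report on stdin (hypothetical helper).
checksum_problems() {
    awk '
        $1 == "TS" {
            # Header row: remember the position of every column by name.
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("ERRORS" in col) && ("DIFFS" in col) && ("TABLE" in col) {
            if (NF < col["TABLE"]) next          # not a data row
            if ($(col["ERRORS"]) + 0 != 0 || $(col["DIFFS"]) + 0 != 0)
                print $(col["TABLE"])
        }'
}
```

The report could be piped straight into checksum_problems; no output means no table was flagged. For the report shown above, where both tables have 0 in ERRORS and DIFFS, it prints nothing.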
(4) Restore the slave execution mode
root@localhost [(none)]> set global slave_exec_mode='STRICT';
Query OK, 0 rows affected (0.00 sec)
root@localhost [(none)]> select * from test.student_info;
+----+----------+-----------+
| id | name | city |
+----+----------+-----------+
| 1 | zhangsan | beijing |
| 2 | lisi | shanghai |
| 3 | xiaoming | guangzhou |
| 5 | liuyang | tianjing |
| 8 | yangli | chengdu |
+----+----------+-----------+
5 rows in set (0.00 sec)
---The end