Normally we run with binlog_format=row, which sidesteps the pitfalls of non-deterministic functions and the like: every row the primary modifies is recorded in the binlog, so replication replay is basically problem-free. The catch is the sheer volume of log this produces. A simple statement like insert into t2 select * from t2 records the entire table's data; unless the table is tiny, that generates a huge amount of binlog, which wastes space and also gets in the way of analyzing the binlog. So is there a parameter that records the original SQL as well? There sure is, my friend.
When binlog_rows_query_log_events is enabled, each executed statement is logged with its original SQL text in addition to the modified rows (you can see the SQL directly during replication), so we no longer need to wade through piles of row events.
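Enabling it is straightforward. It is a dynamic system variable, so no restart is needed; add it to my.cnf as well if you want it to survive restarts:

```sql
-- enable at runtime
set global binlog_rows_query_log_events = on;
-- confirm
show global variables like 'binlog_rows_query_log_events';
```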
![[MySQL] How to parse an ultra-dense binlog of 1GB per minute?](https://img.php.cn/upload/article/001/503/042/176415090450823.jpg)
A new problem follows: how do we find our SQL in that pile of output? Easy: `mysqlbinlog -vvv mysql-bin.xxxx | grep -iE "^# (delete|update|insert)"` does the trick. True, mysqlbinlog uses a temporary directory while decoding and may fill it up, but that is also easy to fix; point it at a bigger one: `export TMPDIR=/data`. Ha, problem solved.
![[MySQL] How to parse an ultra-dense binlog of 1GB per minute?](https://img.php.cn/upload/article/001/503/042/176415090441537.jpg)
But what if the binlog is really dense, say 1GB per minute, full of insert into ... select statements, and you need to analyze the transaction logic of only part of it? Decoding it directly may produce tens of GB of text, and grepping through tens of GB is very slow. Worse, you will probably need to analyze several files, which costs a lot of time, all while keeping an eye on the temp space so it doesn't trigger alerts.
When analyzing this kind of dense binlog, is there a better way than mysqlbinlog?
Our actual requirement is to extract only the business SQL, the ROWS_QUERY_LOG_EVENT, from the binlog. There doesn't seem to be a ready-made tool that does just this, so let's write one ourselves. First, a quick recap of the binlog format: a binlog is a sequence of events, and each event consists of a 19-byte event_header followed by an event_body, as shown below:
![[MySQL] How to parse an ultra-dense binlog of 1GB per minute?](https://img.php.cn/upload/article/001/503/042/176415090564000.jpg)
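To make that header layout concrete, here is a minimal sketch (using the same `struct` format string as the full script below) that reads just the first event header after the 4-byte magic; the binlog path is simply the test file used later in this post:

```python
import struct

# Header fields, little-endian, 19 bytes total:
#   timestamp(4) event_type(1) server_id(4) event_size(4) log_pos(4) flags(2)
with open('/data/mysql_3314/mysqllog/binlog/m3314.000106', 'rb') as f:
    assert f.read(4) == b'\xfebin'          # binlog magic number
    timestamp, event_type, server_id, event_size, log_pos, flags = \
        struct.unpack('<LBLLLH', f.read(19))
    print(f'type={event_type} size={event_size} next_pos={log_pos}')
```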
Each event type has its own event_body layout. The ROWS_QUERY_LOG_EVENT we care about looks like this:
![[MySQL] How to parse an ultra-dense binlog of 1GB per minute?](https://img.php.cn/upload/article/001/503/042/176415090574925.jpg)
The size field is a fixed 1 byte recording the length of the business SQL; anything beyond what one byte can hold is dropped, i.e. the stored value is the length modulo 255.
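So for any statement longer than 255 bytes that length byte is useless on its own. A minimal sketch of the consequence, and of how the script below therefore slices the body (assuming CRC32 event checksums are on, the usual default):

```python
def rows_query_sql(body, has_checksum=True):
    # body = ROWS_QUERY_LOG_EVENT bytes after the 19-byte event header.
    # body[0] is the 1-byte size field; per the modulo-255 behaviour above,
    # a 300-byte statement would store 300 % 255 = 45, so we ignore it and
    # read to the end of the event, stripping the trailing 4-byte CRC32
    # when checksums are enabled.
    end = -4 if has_checksum else None
    return body[1:end].decode()
```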
Now for the script. Since our scenario is fairly specific, we won't make it general-purpose: no --start-position, --start-datetime, --table-include style options. The goal is something you can bang out by hand; add those yourself if you're interested. Here it is:
<code class="python">#!/usr/bin/env python3# write by ddcw @https://github.com/ddcw# 解析binlog中 QUERY_EVENT和ROWS_QUERY_LOG_EVENT, 也就是开启参数binlog_rows_query_log_events的就能解析# 简单解析, 先不支持时间过滤,指定POS等import datetimeimport structimport sysdef format_timestamp(t):return datetime.datetime.fromtimestamp(t).strftime('%Y-%m-%d %H:%M:%S')def main():filename = sys.argv[1]with open(filename,'rb') as f:checksum_alg = Falseif f.read(4) != b'\xfebin':f.tell(0,0) # relay logwhile True:bevent_header = f.read(19)if len(bevent_header) != 19:breaktimestamp, event_type, server_id, event_size, log_pos, flags = struct.unpack('<LBLLLH',bevent_header)msg = f'# time:{format_timestamp(timestamp)} server_id:{server_id} event_type:{event_type} event_size:{event_size} log_pos:{log_pos}'#bevent_body = f.read(event_size-19)#continueif event_type == 15: # FORMAT_DESCRIPTION_EVENTbinlog_version, = struct.unpack('<H',f.read(2))mysql_version_id = f.read(50).decode().strip()create_timestamp, = struct.unpack('<L',f.read(4))event_header_length, = struct.unpack('<B',f.read(1))if mysql_version_id[:2] == '5.': # 5.xevent_post_header_len = f.read(38)elif mysql_version_id[:4] == '8.4.': # 8.4event_post_header_len = f.read(43)elif mysql_version_id[:2] == '8.': # 8.0event_post_header_len = f.read(41)checksum_alg = True if struct.unpack('<B',f.read(1))[0] else 0if checksum_alg:f.seek(4,1)print(f'{msg} create_time {format_timestamp(create_timestamp)} mysql_version:{mysql_version_id} create_time:{format_timestamp(create_timestamp)}')elif event_type == 2: # QUERY_EVENT DDLdata = f.read(event_size-19)thread_id,query_exec_time,db_len,error_code,status_vars_len = struct.unpack('<LLBHH',data[:13])dbname = data[13+status_vars_len:][:db_len].decode()ddl = data[13+status_vars_len+db_len+1:-4 if checksum_alg else -1].decode()l = ''if ddl != 'BEGIN':print(f'{msg} thread_id:{thread_id} query_exec_time:{query_exec_time}{"USE "+dbname+";"+l if db_len>0 else ""}{ddl}</p><p>')elif event_type == 3: # STOP_EVENT 文件结束了breakelif event_type == 33: # GTID_LOG_EVENT beginf.seek(event_size-19,1)print(f'{msg}BEGIN;')elif event_type == 29: # ROWS_QUERY_LOG_EVENT querydata = f.read(event_size-19)print(f'{msg}{data[1:-4 if checksum_alg else -1].decode()};')elif event_type == 16: # XID_EVENT commitf.seek(event_size-19,1)print(f'{msg}COMMIT;</p><p>')else: # 剩余的事务全部跳过f.seek(event_size-19,1)if __name__ == '__main__':main()</code>然后我们来测试下效果:
<code class="sql">-- 删除存在的表,可选drop table if exists db1.t20251120_rows_query;-- 刷新下日志, 方便后续校验flush binary logs;-- 建表create table db1.t20251120_rows_query(id int primary key auto_increment, name varchar(200));-- 准备时间insert into db1.t20251120_rows_query(name) values('ddcw');-- 多加几条,比如来个10来遍insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;-- ....-- 可以再删除几条,看看效果delete from db1.t20251120_rows_query limit 10;-- 看下日志叫啥select @@log_bin_basename;show master status;</code>![[MYSQL] 1分钟1GB binlog的超密集型日志怎么解析?](https://img.php.cn/upload/article/001/503/042/176415090511041.jpg)
Now we can run our script and check the result:
<code class="txt">17:07:23 [root@ddcw21 ei]#python3 get_sql_by_rows_query_log_event.py /data/mysql_3314/mysqllog/binlog/m3314.000106 # time:2025-11-20 17:06:38 server_id:866003314 event_type:15 event_size:122 log_pos:126 create_time 1970-01-01 08:00:00 mysql_version:8.0.28 create_time:1970-01-01 08:00:00# time:2025-11-20 17:06:38 server_id:866003314 event_type:33 event_size:79 log_pos:276BEGIN;# time:2025-11-20 17:06:38 server_id:866003314 event_type:2 event_size:177 log_pos:453 thread_id:10 query_exec_time:0USE db1;create table db1.t20251120_rows_query(id int primary key auto_increment, name varchar(200))# time:2025-11-20 17:06:38 server_id:866003314 event_type:33 event_size:79 log_pos:532BEGIN;# time:2025-11-20 17:06:38 server_id:866003314 event_type:29 event_size:81 log_pos:687insert into db1.t20251120_rows_query(name) values('ddcw');# time:2025-11-20 17:06:38 server_id:866003314 event_type:16 event_size:31 log_pos:835COMMIT;# time:2025-11-20 17:06:38 server_id:866003314 event_type:33 event_size:79 log_pos:914BEGIN;# time:2025-11-20 17:06:38 server_id:866003314 event_type:29 event_size:108 log_pos:1096insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:38 server_id:866003314 event_type:16 event_size:31 log_pos:1244COMMIT;# time:2025-11-20 17:06:39 server_id:866003314 event_type:33 event_size:79 log_pos:1323BEGIN;# time:2025-11-20 17:06:39 server_id:866003314 event_type:29 event_size:108 log_pos:1505insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:39 server_id:866003314 event_type:16 event_size:31 log_pos:1664COMMIT;# time:2025-11-20 17:06:40 server_id:866003314 event_type:33 event_size:79 log_pos:1743BEGIN;# time:2025-11-20 17:06:40 server_id:866003314 event_type:29 event_size:108 log_pos:1925insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:40 server_id:866003314 event_type:16 event_size:31 log_pos:2106COMMIT;# time:2025-11-20 17:06:41 server_id:866003314 event_type:33 event_size:79 log_pos:2185BEGIN;# time:2025-11-20 17:06:41 server_id:866003314 event_type:29 event_size:108 log_pos:2367insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:41 server_id:866003314 event_type:16 event_size:31 log_pos:2592COMMIT;# time:2025-11-20 17:06:41 server_id:866003314 event_type:33 event_size:79 log_pos:2671BEGIN;# time:2025-11-20 17:06:41 server_id:866003314 event_type:29 event_size:108 log_pos:2853insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:41 server_id:866003314 event_type:16 event_size:31 log_pos:3166COMMIT;# time:2025-11-20 17:06:42 server_id:866003314 event_type:33 event_size:79 log_pos:3245BEGIN;# time:2025-11-20 17:06:42 server_id:866003314 event_type:29 event_size:108 log_pos:3427insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:42 server_id:866003314 event_type:16 event_size:31 log_pos:3916COMMIT;# time:2025-11-20 17:06:42 server_id:866003314 event_type:33 event_size:79 log_pos:3995BEGIN;# time:2025-11-20 17:06:42 server_id:866003314 event_type:29 event_size:108 log_pos:4177insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:42 server_id:866003314 event_type:16 event_size:31 log_pos:5018COMMIT;# time:2025-11-20 17:06:43 server_id:866003314 event_type:33 event_size:79 
log_pos:5097BEGIN;# time:2025-11-20 17:06:43 server_id:866003314 event_type:29 event_size:108 log_pos:5279insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:43 server_id:866003314 event_type:16 event_size:31 log_pos:6824COMMIT;# time:2025-11-20 17:06:43 server_id:866003314 event_type:33 event_size:79 log_pos:6903BEGIN;# time:2025-11-20 17:06:43 server_id:866003314 event_type:29 event_size:108 log_pos:7085insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:43 server_id:866003314 event_type:16 event_size:31 log_pos:10038COMMIT;# time:2025-11-20 17:06:44 server_id:866003314 event_type:33 event_size:79 log_pos:10117BEGIN;# time:2025-11-20 17:06:44 server_id:866003314 event_type:29 event_size:108 log_pos:10299insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:44 server_id:866003314 event_type:16 event_size:31 log_pos:16068COMMIT;# time:2025-11-20 17:06:44 server_id:866003314 event_type:33 event_size:79 log_pos:16147BEGIN;# time:2025-11-20 17:06:44 server_id:866003314 event_type:29 event_size:108 log_pos:16329insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:44 server_id:866003314 event_type:16 event_size:31 log_pos:27765COMMIT;# time:2025-11-20 17:06:45 server_id:866003314 event_type:33 event_size:79 log_pos:27844BEGIN;# time:2025-11-20 17:06:45 server_id:866003314 event_type:29 event_size:108 log_pos:28026insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:45 server_id:866003314 event_type:16 event_size:31 log_pos:50761COMMIT;# time:2025-11-20 17:06:45 server_id:866003314 event_type:33 event_size:79 log_pos:50840BEGIN;# time:2025-11-20 17:06:45 server_id:866003314 event_type:29 event_size:108 log_pos:51022insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:45 server_id:866003314 event_type:16 event_size:31 log_pos:96390COMMIT;# time:2025-11-20 17:06:46 server_id:866003314 event_type:33 event_size:80 log_pos:96470BEGIN;# time:2025-11-20 17:06:46 server_id:866003314 event_type:29 event_size:108 log_pos:96652insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:46 server_id:866003314 event_type:16 event_size:31 log_pos:187286COMMIT;# time:2025-11-20 17:06:46 server_id:866003314 event_type:33 event_size:80 log_pos:187366BEGIN;# time:2025-11-20 17:06:46 server_id:866003314 event_type:29 event_size:108 log_pos:187548insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:46 server_id:866003314 event_type:16 event_size:31 log_pos:368679COMMIT;# time:2025-11-20 17:06:47 server_id:866003314 event_type:33 event_size:80 log_pos:368759BEGIN;# time:2025-11-20 17:06:47 server_id:866003314 event_type:29 event_size:108 log_pos:368941insert into db1.t20251120_rows_query(name) select name from db1.t20251120_rows_query;# time:2025-11-20 17:06:47 server_id:866003314 event_type:16 event_size:31 log_pos:731066COMMIT;# time:2025-11-20 17:06:51 server_id:866003314 event_type:33 event_size:79 log_pos:731145BEGIN;# time:2025-11-20 17:06:51 server_id:866003314 event_type:29 event_size:69 log_pos:731288delete from db1.t20251120_rows_query limit 10;# time:2025-11-20 17:06:51 server_id:866003314 event_type:16 event_size:31 log_pos:731535COMMIT;</code>效果很满意, 下次还会回购
It looks like exactly what we wanted: the piles of row data are gone and only the business SQL we need remains. Parsing is fast, resource usage is negligible (unless there really is a huge amount of business SQL), and there is no tmpdir to worry about. All that's left is the analysis, and even that can be scripted, as the sketch below shows.
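For example, here is a hypothetical follow-up filter (the name count_sql.py and the normalization rules are my own assumptions, adjust as needed) that reads the extractor's output and ranks the most frequent statement patterns:

```python
#!/usr/bin/env python3
# count_sql.py (hypothetical): read the extractor's output on stdin
# and rank the most frequent statement patterns.
import re
import sys
from collections import Counter

counter = Counter()
for line in sys.stdin:
    line = line.strip()
    if not line or line.startswith('#'):   # skip metadata lines and blanks
        continue
    # replace string and numeric literals so identical statements group together
    pattern = re.sub(r"'[^']*'", '?', line.lower())
    pattern = re.sub(r'\b\d+\b', '?', pattern)
    counter[pattern] += 1

for sql, n in counter.most_common(10):
    print(f'{n}\t{sql}')
```

Run it as `python3 get_sql_by_rows_query_log_event.py m3314.000106 | python3 count_sql.py` to see which statements dominate the file.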
This requirement wasn't complicated: we only needed to view the business SQL recorded in the binlog, so a quick script could cover it. The prerequisite is being familiar with the binlog structure, which is to say that solid fundamentals matter!
![[MySQL] How to parse an ultra-dense binlog of 1GB per minute?](https://img.php.cn/upload/article/001/503/042/176415090591952.jpg)
Reference: https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html