1. Log: /data/clustrix/log/query.log
Records slow queries, failed queries, DDL, and similar events. Each node writes its own file.
Each entry in the query.log is categorized as one of these types. Specific logging for each query type is controlled by the global or session variable indicated.
Query Type | Description |
---|---|
ALTER CLUSTER | Changes made to your cluster via the ALTER CLUSTER command are always logged to the query.log automatically. This logging is not controlled by a global variable. |
BAD | The query reads more rows than necessary to return the expected results. This may indicate a bad plan or missing index. Logging of BAD queries is not enabled by default (session_log_bad_queries). |
DDL | The query is DDL (i.e. schema change such as CREATE, DROP, ALTER), or a SET GLOBAL or SESSION command. All DDL queries are initially logged by default (session_log_ddl). |
SLOW | Query execution time exceeded the threshold specified by the variable session_log_slow_threshold_ms. |
SQLERR | These database errors are things such as syntax errors, timeout notifications, and permission issues. All SQLERR queries will be logged by default (session_log_error_queries). |
These are the variables that control query and user logging. The defaults shown are generally acceptable for most installations.
Name | Description | Default Value | Session Variable |
---|---|---|---|
session_log_bad_queries | Log BAD queries to the query.log | false | |
session_log_ddl | Log DDL statements to query.log | true | |
session_log_error_queries | Log ERROR statements to query.log | true | |
session_log_slow_queries | Log SLOW statements to query.log | true | |
session_log_slow_threshold_ms | Query duration threshold in milliseconds before logging this query | 10000 | |
session_log_users | Log LOGIN/LOGOUT to user.log | false | |
MySQL [(none)]> show variables like '%session_log%';
+--------------------------------+--------+
| Variable_name | Value |
+--------------------------------+--------+
| session_log_bad_queries | false |
| session_log_bad_read_ratio | 100 |
| session_log_bad_read_threshold | 4000 |
| session_log_ddl | true |
| session_log_error_queries | true |
| session_log_slow_queries | true |
| session_log_slow_threshold_ms | 100000 |
| session_log_users | false |
+--------------------------------+--------+
8 rows in set (0.01 sec)
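Note that the cluster output above does not match the documented default for every variable. A minimal Python sketch of how such a comparison could be automated is shown here. The helper names (`parse_client_table`, `differing_from_defaults`) are hypothetical, and the inputs are simply the default table and the `show variables` output copied from this section.

```python
# Documented defaults, copied from the variable table in this section.
DOCUMENTED_DEFAULTS = {
    "session_log_bad_queries": "false",
    "session_log_ddl": "true",
    "session_log_error_queries": "true",
    "session_log_slow_queries": "true",
    "session_log_slow_threshold_ms": "10000",
    "session_log_users": "false",
}

# Output of `show variables like '%session_log%'` as printed by the mysql client.
SHOW_VARIABLES_OUTPUT = """\
+--------------------------------+--------+
| Variable_name                  | Value  |
+--------------------------------+--------+
| session_log_bad_queries        | false  |
| session_log_bad_read_ratio     | 100    |
| session_log_bad_read_threshold | 4000   |
| session_log_ddl                | true   |
| session_log_error_queries      | true   |
| session_log_slow_queries       | true   |
| session_log_slow_threshold_ms  | 100000 |
| session_log_users              | false  |
+--------------------------------+--------+
"""


def parse_client_table(text):
    """Parse a two-column mysql client table into a {name: value} dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Variable_name":
            continue  # skip the header row and anything malformed
        values[cells[0]] = cells[1]
    return values


def differing_from_defaults(current, defaults):
    """Return {name: (current, default)} for variables that differ."""
    return {
        name: (current[name], default)
        for name, default in defaults.items()
        if name in current and current[name] != default
    }


current = parse_client_table(SHOW_VARIABLES_OUTPUT)
diffs = differing_from_defaults(current, DOCUMENTED_DEFAULTS)
for name, (value, default) in diffs.items():
    print(f"{name}: current={value}, documented default={default}")
```

On this data the only variable reported is session_log_slow_threshold_ms, which is set to 100000 on the cluster while the table lists 10000.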
Change the slow-query threshold so that every statement running longer than 1 second (1000 ms) is logged:
MySQL [(none)]> set global session_log_slow_threshold_ms=1000;
Query OK, 0 rows affected (0.01 sec)
MySQL [(none)]> show global variables like 'session_log_slow_threshold_ms';
+-------------------------------+-------+
| Variable_name | Value |
+-------------------------------+-------+
| session_log_slow_threshold_ms | 1000 |
+-------------------------------+-------+
View the slow-query log (operations from all nodes in the cluster):
MySQL [(none)]> select * from system.tail_query_log order by timestamp desc limit 10;
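2. Log: /data/clustrix/log/user.log
Tracks user logins and logouts. Each node's file records only that node's sessions; for a cluster-wide view, query the system.tail_user_log table.
User login/logout logging is disabled by default: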
MySQL [system]> show variables like 'session_log_users';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| session_log_users | false |
+-------------------+-------+
1 row in set (0.01 sec)
MySQL [system]> set global session_log_users=1;
Query OK, 0 rows affected (0.51 sec)
MySQL [system]> show variables like 'session_log_users';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| session_log_users | true |
+-------------------+-------+
View user login/logout activity across the cluster:
MySQL [system]> select * from system.tail_user_log;
+--------+----------------------------+------+----------+---------+---------------------------------------------------------------------------------------------------------------+----------+
| nodeid | timestamp | num | hostname | command | message | repeated |
+--------+----------------------------+------+----------+---------+---------------------------------------------------------------------------------------------------------------+----------+
| 1 | 2019-11-26 07:22:23.396171 | 0 | nid | NULL | 1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:16385 db="system" user=root@localhost LOGOUT | 0 |
| 1 | 2019-11-26 07:22:24.371812 | 1 | nid | NULL | 1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:17409 db=#undef user=root@localhost LOGIN | 0 |
| 1 | 2019-11-26 07:22:54.216004 | 2 | nid | NULL | 1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:18433 db=#undef user=root@localhost LOGIN | 0 |
| 1 | 2019-11-26 07:23:05.066316 | 3 | nid | NULL | 1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:18433 db=#undef user=root@localhost LOGOUT | 0 |
| 2 | 2019-11-26 07:23:07.523621 | 0 | nid | NULL | 2 ip-10-1-3-242.cn-northwest-1.compute.internal clxnode: USER SID:6146 db="system" user=root@localhost LOGOUT | 0 |
| 2 | 2019-11-26 07:23:08.049621 | 1 | nid | NULL | 2 ip-10-1-3-242.cn-northwest-1.compute.internal clxnode: USER SID:7170 db=#undef user=root@localhost LOGIN | 0 |
+--------+----------------------------+------+----------+---------+---------------------------------------------------------------------------------------------------------------+----------+
6 rows in set (0.00 sec)
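The message column packs several fields into one string. The following Python sketch shows one way to split it apart and find sessions that have logged in but not yet logged out. The regular expression is inferred only from the sample rows above, so other message shapes may exist that it does not cover. The function names are hypothetical.

```python
import re

# Pattern inferred from the sample rows above (an assumption, not a documented format).
MESSAGE_RE = re.compile(
    r"^(?P<nodeid>\d+)\s+(?P<host>\S+)\s+clxnode:\s+USER\s+"
    r"SID:(?P<sid>\d+)\s+db=(?P<db>\S+)\s+user=(?P<user>\S+)\s+"
    r"(?P<event>LOGIN|LOGOUT)$"
)


def parse_user_log_message(message):
    """Split one tail_user_log message into a dict, or return None if it does not match."""
    match = MESSAGE_RE.match(message.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # db is either a quoted name such as "system" or the literal #undef.
    fields["db"] = None if fields["db"] == "#undef" else fields["db"].strip('"')
    fields["nodeid"] = int(fields["nodeid"])
    fields["sid"] = int(fields["sid"])
    return fields


def open_sessions(messages):
    """Return the (nodeid, sid) pairs with a LOGIN and no later LOGOUT.

    Assumes the messages are already sorted by timestamp.
    """
    still_open = set()
    for message in messages:
        fields = parse_user_log_message(message)
        if fields is None:
            continue
        key = (fields["nodeid"], fields["sid"])
        if fields["event"] == "LOGIN":
            still_open.add(key)
        else:
            still_open.discard(key)
    return still_open


# Three messages copied from the node 1 rows above.
SAMPLE_MESSAGES = [
    "1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:17409 db=#undef user=root@localhost LOGIN",
    "1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:18433 db=#undef user=root@localhost LOGIN",
    "1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:18433 db=#undef user=root@localhost LOGOUT",
]

first = parse_user_log_message(SAMPLE_MESSAGES[0])
print(first)
print(open_sessions(SAMPLE_MESSAGES))
```

In practice the messages would come from `select message from system.tail_user_log order by timestamp`, fetched with whatever MySQL client library is already in use.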
3. Rebalancer configuration
Check recent rebalancer activity:
sql> select * from system.rebalancer_activity_log order by started desc limit 10;
Speed up rebalancing:
sql> set global rebalancer_rebalance_task_limit = 8;
sql> set global rebalancer_vdev_task_limit = 4;
sql> set global task_rebalancer_rebalance_distribution_interval_ms = 5000;
sql> set global task_rebalancer_rebalance_interval_ms = 5000;
If the load becomes too high, revert the settings to their defaults:
sql> set global rebalancer_rebalance_task_limit = default;
sql> set global rebalancer_vdev_task_limit = default;
sql> set global task_rebalancer_rebalance_distribution_interval_ms = default;
sql> set global task_rebalancer_rebalance_interval_ms = default;
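Because the tuning and the rollback touch the same four variables, it can help to generate both sets of statements from a single table so they never drift apart. The sketch below does this in Python. The function names are hypothetical, and the values are the ones used above rather than recommendations.

```python
# Variable names and tuned values copied from the statements above.
REBALANCER_TUNING = {
    "rebalancer_rebalance_task_limit": 8,
    "rebalancer_vdev_task_limit": 4,
    "task_rebalancer_rebalance_distribution_interval_ms": 5000,
    "task_rebalancer_rebalance_interval_ms": 5000,
}


def _check_name(name):
    """Reject anything that is not a plain identifier before building SQL text."""
    if not name.isidentifier():
        raise ValueError(f"unexpected variable name: {name!r}")


def tuning_statements(settings):
    """Build the SET GLOBAL statements that apply the tuned values."""
    statements = []
    for name, value in settings.items():
        _check_name(name)
        statements.append(f"set global {name} = {int(value)};")
    return statements


def revert_statements(settings):
    """Build the SET GLOBAL statements that restore each variable's default."""
    statements = []
    for name in settings:
        _check_name(name)
        statements.append(f"set global {name} = default;")
    return statements


for statement in tuning_statements(REBALANCER_TUNING):
    print(statement)
for statement in revert_statements(REBALANCER_TUNING):
    print(statement)
```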
4. Find which settings are not at their default values
sql> select * from system.global_variable_definitions where current_value != default_value;
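5. Monitoring SQL
List the 3 cached statements that have consumed the most CPU, ranked by total execution time (exec_ms):
sql> SELECT nodeid, exec_count, exec_ms, exec_ms/exec_count AS avg_ms, left(statement,100)
     FROM system.qpc_queries
     ORDER BY exec_ms DESC
     LIMIT 3;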
List the 100 most frequently executed statements in the past 24 hours:
sql> SELECT query_key, min(rank), max(rank), database, left(statement,100),
            sum(exec_count) AS calc_exec_count,
            round(avg(avg_rows_read)) AS calc_avg_rows_read,
            round(avg(avg_exec_ms)) AS calc_avg_exec_ms
     FROM clustrix_statd.qpc_history
     WHERE timestamp BETWEEN (now() - interval 24 hour) AND now()
       AND database != 'clustrix_statd'
     GROUP BY query_key
     ORDER BY calc_exec_count DESC, calc_avg_rows_read DESC
     LIMIT 100;
List the 100 statements that read the most rows on average in the past 24 hours:
sql> SELECT query_key, min(rank), max(rank), database, left(statement,100),
            sum(exec_count) AS calc_exec_count,
            round(avg(avg_rows_read)) AS calc_avg_rows_read,
            round(avg(avg_exec_ms)) AS calc_avg_exec_ms
     FROM clustrix_statd.qpc_history
     WHERE timestamp BETWEEN (now() - interval 24 hour) AND now()
       AND database != 'clustrix_statd'
     GROUP BY query_key
     ORDER BY calc_avg_rows_read DESC, calc_exec_count DESC
     LIMIT 100;
List the 100 statements with the longest average execution time in the past 24 hours:
sql> SELECT query_key, min(rank), max(rank), database, left(statement,100),
            sum(exec_count) AS calc_exec_count,
            round(avg(avg_rows_read)) AS calc_avg_rows_read,
            round(avg(avg_exec_ms)) AS calc_avg_exec_ms
     FROM clustrix_statd.qpc_history
     WHERE timestamp BETWEEN (now() - interval 24 hour) AND now()
       AND database != 'clustrix_statd'
     GROUP BY query_key
     ORDER BY calc_avg_exec_ms DESC, calc_exec_count DESC
     LIMIT 100;
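The three qpc_history reports above differ only in their ORDER BY clause. As a rough sketch, the query text could be generated from one template so the shared part stays in a single place. The report names (`most_frequent`, `most_rows_read`, `slowest`) and the function `build_qpc_history_query` are hypothetical; the SQL itself is the same as in the statements above.

```python
# ORDER BY clauses taken from the three reports above; the keys are made-up labels.
ORDERINGS = {
    "most_frequent": "calc_exec_count DESC, calc_avg_rows_read DESC",
    "most_rows_read": "calc_avg_rows_read DESC, calc_exec_count DESC",
    "slowest": "calc_avg_exec_ms DESC, calc_exec_count DESC",
}

QUERY_TEMPLATE = (
    "SELECT query_key, min(rank), max(rank), database, left(statement,100), "
    "sum(exec_count) AS calc_exec_count, "
    "round(avg(avg_rows_read)) AS calc_avg_rows_read, "
    "round(avg(avg_exec_ms)) AS calc_avg_exec_ms "
    "FROM clustrix_statd.qpc_history "
    "WHERE timestamp BETWEEN (now() - interval {hours} hour) AND now() "
    "AND database != 'clustrix_statd' "
    "GROUP BY query_key "
    "ORDER BY {order_by} LIMIT {limit};"
)


def build_qpc_history_query(report, hours=24, limit=100):
    """Return the SQL text for one of the known reports."""
    if report not in ORDERINGS:
        raise ValueError(f"unknown report: {report!r}")
    for label, number in (("hours", hours), ("limit", limit)):
        # Only positive integers are interpolated into the SQL text.
        if not isinstance(number, int) or number <= 0:
            raise ValueError(f"{label} must be a positive integer")
    return QUERY_TEMPLATE.format(
        hours=hours, order_by=ORDERINGS[report], limit=limit
    )


print(build_qpc_history_query("slowest"))
```

Restricting `report` to a fixed set of keys keeps arbitrary text out of the ORDER BY clause, which matters if the report name ever comes from user input.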
6. Tables in the system database
MySQL [(none)]> use system;
Database changed
MySQL [system]> show tables;
+-----------------------------------+
| Tables_in_system |
+-----------------------------------+
| activity |
| alerts_intervals |
| alerts_messages |
| alerts_parameters |
| alerts_subscriptions |
| alter_progress |
| autoinc_sequences |
| backups |
| backup_masters |
| backup_status |
| backup_tables |
| barriers |
| base_allocators |
| bigc_state |
| binlogs |
| binlog_commits |
| binlog_commits_segments |
| binlog_ignore_databases |
| binlog_ignore_tables |
| binlog_log_databases |
| binlog_log_tables |
| binlog_segments |
| bm_latch_waits |
| bm_stats |
| broadcast_nodes |
| check_constraints |
| cluster_session_stats |
| cluster_session_variables |
| columns |
| constraints |
| containers |
| container_stats |
| container_truncates |
| container_type_codes |
| cpm_history |
| cpm_info |
| cpu_activity |
| cpu_allocations |
| cpu_load |
| databases |
| debugpoints |
| deferred_foreign_keys |
| deferred_foreign_key_columns |
| definers |
| device_containers |
| device_containers2 |
| device_space_stats |
| disks |
| disk_activity |
| disk_paths |
| dlog_stats |
| dropped_binlogs |
| engines |
| error_codes |
| event_map |
| failpoints |
| fibers |
| flow_control_channels |
| flow_control_peers |
| foreign_keys |
| foreign_key_columns |
| global_stats |
| global_variables |
| global_variables_ignored |
| global_variable_definitions |
| groups |
| gtm_accepter |
| gtm_coord |
| gtm_coord_invocations |
| gtm_ddl |
| gtm_ddl_server |
| gtm_poisoned_invoc |
| gtm_poisoned_trx |
| gtm_repick_accepters |
| gtm_resolver |
| gtm_send_queues |
| hash_distribution_map |
| heaps |
| imported_index_pds |
| index_sizes |
| index_stats |
| init_graph_stats |
| internal_routines |
| internode_latency |
| invocations |
| irp_queues |
| key_caches |
| layercons |
| layers |
| layer_merges |
| license |
| load |
| lockman |
| lockman_holders |
| lockman_victims |
| lpdcache |
| lpds |
| ltm_transactions |
| mdstat |
| media_scanners |
| membership |
| memory_table_replicas |
| missing_pds |
| missing_pd_columns |
| missing_pd_details |
| mounts |
| mvcc_waiters |
| mysql_binlogs |
| mysql_binlog_index |
| mysql_binlog_segments |
| mysql_binlog_stats |
| mysql_binlog_trims |
| mysql_character_sets |
| mysql_collations |
| mysql_db_replication_policy |
| mysql_error_codes |
| mysql_indexed_binlogs |
| mysql_master_status |
| mysql_registered_slaves |
| mysql_repconfig |
| mysql_repslave_svars |
| mysql_repstate |
| mysql_repstate_until |
| mysql_sessions |
| mysql_slave_connection_status |
| mysql_slave_db_replication_policy |
| mysql_slave_driver_status |
| mysql_slave_log_updates |
| mysql_slave_rewrite_db |
| mysql_slave_skip_errors |
| mysql_slave_stats |
| mysql_slave_status |
| mysql_slave_variables |
| mysql_table_replication_policy |
| named_locks |
| networking |
| network_activity |
| nodeinfo |
| nodes |
| objects |
| partitioned_hash_distributions |
| partitions |
| partition_endpoints |
| partition_functions |
| pdcache_raw |
| pdms |
| pds |
| pd_groups |
| pd_group_columns |
| pending_invites |
| periodic_tasks |
| poisoned_barriers |
| privileges |
| problem_nodes |
| processlist |
| processlistfull |
| proc_cpu |
| proc_cpu_rates |
| proc_diskstats |
| proc_diskstat_rates |
| proc_interrupts |
| proc_interrupt_rates |
| proc_meminfo |
| proc_net_dev |
| proc_net_dev_rates |
| proc_softirqs |
| proc_softirq_rates |
| profiled_invocations |
| profiled_plans |
| profiled_statements |
| profiled_til |
| profiled_transactions |
| profiling |
| program_cache |
| protection_log |
| ps |
| public_keys |
| qpc_lru |
| qpc_plans |
| qpc_queries |
| queues |
| queue_readers |
| queue_replays |
| queue_replay_streams |
| queue_replay_waiters |
| queue_status |
| range_hash_distributions |
| rebalancer_activity_log |
| rebalancer_activity_targets |
| rebalancer_copies |
| rebalancer_copy_activity |
| rebalancer_copy_work_queue |
| rebalancer_hash_distributions |
| rebalancer_queued_activity |
| rebalancer_redistributes |
| rebalancer_replicas |
| rebalancer_representations |
| rebalancer_scheduled_replicas |
| rebalancer_slices |
| rebalancer_splits |
| rebalancer_started_activity |
| rebalancer_summary |
| rebalancer_vdevs |
| redistributes |
| relations |
| relation_alters |
| relation_alter_columns |
| relation_build_status |
| relation_sizes |
| replicas |
| replicated_containers |
| replication_checkpoint |
| replication_master_status |
| replica_copies |
| replica_sizes |
| replica_status_codes |
| representations |
| representation_builds |
| representation_columns |
| representation_sizes |
| representation_stats |
| restore_database_builds |
| restore_object_builds |
| ril_cache_p |
| ril_stats |
| routines |
| routine_parameters |
| sequences |
| sequence_state |
| sessions |
| session_call_stacks |
| session_containers |
| session_local_variables |
| session_pdcache |
| session_row_stats |
| sighandlers |
| sighandlers_json |
| signals |
| skiplists |
| slave_row_stats |
| slave_slices |
| slave_slice_readers |
| slave_slice_relations |
| slave_slice_writers |
| slices |
| slice_splits |
| softfailed_devices |
| softfailed_nodes |
| softfailing_containers |
| sqlstates |
| stats |
| strmaps |
| suspended_closures |
| system_users |
| sys_block_devs |
| tables_and_views |
| table_pdms |
| table_replicas |
| table_sizes |
| table_slices |
| tail_clustrix_log |
| tail_nanny_log |
| tail_query_log |
| tail_user_log |
| tail_webui_log |
| tasks |
| task_placement |
| tcp |
| temporary_tables |
| time_zones |
| tracepoints |
| transactions |
| triggers |
| trigger_event |
| trigger_orientation |
| trigger_timing |
| trxstate |
| trxstate_stats |
| underprotected_slices |
| users |
| user_accessible_databases |
| user_accessible_tables |
| user_accessible_triggers |
| user_accessible_users |
| user_acl |
| user_routine_acl |
| vdev_io |
| vdev_stat |
| version_history |
| views |
| virtual_relations |
| virtual_views |
| vmstats |
| wals |
| wal_windows |
+-----------------------------------+
295 rows in set (0.14 sec)