Career Transition: 21 Tips for SQL Performance Optimization (Part 2)
11、When executing a delete or update statement, add a limit clause or delete in batches in a loop

1、Reduce the cost of writing the wrong SQL
 
Emptying a table is no small matter: one shaky hand and it is all gone ("drop the database and run away"). If you add a limit, a mistaken delete loses only part of the data, which you can quickly recover from the binlog.
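A minimal sketch of the idea (table name and condition are assumptions):

```sql
-- Even if the WHERE condition is written wrong, LIMIT caps the damage
-- at 1000 rows, which can then be restored from the binlog.
DELETE FROM user WHERE status = 'inactive' LIMIT 1000;
```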
 
2、SQL efficiency is likely to be higher
 
With limit 1 added to the SQL, the query returns as soon as the first row matches; without the limit, the scan continues over the rest of the table.
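A sketch, assuming a `user` table where `username` has no unique index:

```sql
-- With LIMIT 1, MySQL stops scanning at the first match; without it,
-- the scan continues to the end of the table or index range.
SELECT id FROM user WHERE username = 'SQL' LIMIT 1;
```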
 
3、Avoid long transactions
 
When the delete executes, if age has an index, MySQL adds write locks and gap locks to all the rows involved, and every matched row is locked; if the number of deleted rows is large, the related business becomes directly unusable.
 
4、If the data volume is large, it is easy to max out the CPU
 
If you delete a large amount of data without limiting the number of records, it is easy to drive the CPU to 100%, and the deletion gets slower and slower as it runs.
 
5、Lock table
 
Deleting too much data at once may lock the table and produce a "lock wait timeout exceeded" error, so it is recommended to operate in batches.
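The batch approach can be sketched as follows (table, condition, and batch size are assumptions; the loop itself lives in application code or a stored procedure):

```sql
-- Delete at most 1000 rows per statement, so each transaction stays
-- short and holds few locks. Repeat until no rows are affected.
DELETE FROM user WHERE age > 60 LIMIT 1000;
-- ROW_COUNT() returns the rows affected by the last statement;
-- when it reaches 0, the loop can stop.
SELECT ROW_COUNT();
```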
 
12、UNION operator
 
UNION removes duplicate records after combining the tables, so the combined result set is sorted to eliminate duplicates before the results are returned.
 
In practice, most applications do not generate duplicate records; the most common case is a UNION of an active table and its history table.
 
select username, tel from user
union
select departmentname, tel from department
 
At runtime this SQL first fetches the results from both tables, then sorts them in sort space to remove duplicate records, and finally returns the result set. If the tables are large, the sort may spill to disk.
 
Recommended approach: use the UNION ALL operator instead of UNION, because UNION ALL simply concatenates the two result sets and returns them, with no deduplication.
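The same kind of query rewritten with UNION ALL (a sketch, assuming both selects return matching columns):

```sql
-- UNION ALL concatenates the two result sets without the dedup sort.
select username, tel from user
union all
select departmentname, tel from department;
```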
 
13、 Batch insertion performance improvement
 
1、Inserting one row per statement
INSERT INTO user (id,username) VALUES(1,'SQL');

INSERT INTO user (id,username) VALUES(2,'SUCCESS');
 
2、Batch insert
INSERT INTO user (id,username) VALUES(1,'SQL'),(2,'SUCCESS');
 
3、Reason
By default, every SQL statement carries its own transaction control, so each insert pays for a transaction start and a transaction commit, while a batch insert pays for only one of each. At small volumes the difference is barely visible, but once the data reaches a certain size the improvement is significant.
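When rows must be inserted one at a time, wrapping them in one explicit transaction gets much of the same benefit (a sketch on the same table):

```sql
START TRANSACTION;
INSERT INTO user (id, username) VALUES (3, 'FIRST');
INSERT INTO user (id, username) VALUES (4, 'SECOND');
COMMIT;  -- one commit for all inserts instead of one per statement
```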
 
14、Not too many table joins, not too many indexes; generally keep both within 5
 
1、Table joins should not be too many; generally keep them within 5
 
1.      The more tables you join, the greater the compilation time and overhead.
2.      Each join generates a temporary table in memory.
3.      Breaking a big join into several smaller statements also improves readability.
4.      If you have to join many tables to get the data, it usually means the design is bad.
5.      The Alibaba development specification suggests no more than three tables in a multi-table join.
 
2、Indexes should not be too many; generally keep them within 5
 
1.      More indexes are not always better: they improve query efficiency but reduce insert and update efficiency.
2.      An index can be understood as a table in its own right: it stores data, and that data takes up space.
3.      Index data is kept sorted, and sorting takes time.
4.      An insert or update may rebuild the index; if the data set is huge, the rebuild re-sorts the records, so adding an index needs careful, case-by-case consideration.
5.      The number of indexes on a table should preferably not exceed 5; if there are more, consider whether each one really needs to exist.
 
15、Avoid using built-in functions on index columns

1、Counter example
SELECT * FROM user WHERE DATE_ADD(birthday, INTERVAL 7 DAY) >= NOW();

2、Positive example
SELECT * FROM user WHERE birthday >= DATE_SUB(NOW(), INTERVAL 7 DAY);
 
3、The reason
Using a built-in function on the indexed column invalidates the index; keeping the column bare and moving the function to the constant side preserves it.
 
16、Combined indexes
When sorting, follow the column order of the combined index, even if only one of the indexed columns needs sorting; otherwise sorting performance will suffer.
 
create index idx_deptid_position_createtime on user(deptid, position, createtime);
select username, tel from user where deptid = 1 and position = 'java developer' order by deptid, position, createtime desc;
 
In fact the query only needs the records matching deptid = 1 and position = 'java developer', sorted in descending order by createtime; but writing it as just order by createtime desc gives poor performance.
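For contrast, the slower form described above looks like this (same assumed table and index):

```sql
-- The ORDER BY skips the leading index columns, so MySQL may have to
-- filesort the matching rows instead of reading them in index order.
select username, tel from user
where deptid = 1 and position = 'java developer'
order by createtime desc;
```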
 
17、 composite index leftmost feature
 
1、create a composite index
ALTER TABLE employee ADD INDEX idx_name_salary (name,salary)
 
2、The leftmost column is present, so even a partial match satisfies the leftmost principle and the composite index takes effect
SELECT * FROM employee WHERE NAME='Nezha Programming'
 
3、The leftmost field is absent, so the leftmost principle is not satisfied and the index is invalid
SELECT * FROM employee WHERE salary=5000
 
4、The composite index is fully used; the columns appear in left-to-right order (name, salary), and the index takes effect
SELECT * FROM employee WHERE NAME='Nezha Programming' AND salary=5000
 
5、Although the written order violates the leftmost rule, MySQL's optimizer reorders the conditions underneath, so the index still takes effect
SELECT * FROM employee WHERE salary=5000 AND NAME='Nezha Programming'
 
6、Reason
A composite index is also called a joint index. Creating a joint index such as (k1,k2,k3) is equivalent to creating the three indexes (k1), (k1,k2), and (k1,k2,k3); this is the leftmost-match principle.
 
If a query does not satisfy the leftmost principle, the joint index will generally fail to be used.
 
18、Optimize like statements
For fuzzy queries, programmers like to reach for like, but like can easily invalidate your index.
 
1、counter example
 
select * from citys where name like '%Dalian' (index not used)
select * from citys where name like '%Dalian%' (index not used)
 
2、Positive example
select * from citys where name like 'Dalian%' (index used)
 
3、Reason
• First, try to avoid fuzzy queries altogether. If you must use one, avoid a fully fuzzy pattern and prefer a right-fuzzy pattern, i.e. like '...%', which can use the index.
• A left-fuzzy like '%...' cannot use the index directly, but it can be turned into like '...%' by storing the reversed value and indexing that (reverse + function index).
• A fully fuzzy query cannot be optimized; if it is truly unavoidable, use a search engine instead.
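The reverse trick can be sketched with a stored generated column (MySQL 5.7+; the table, column, and index names are assumptions):

```sql
-- Keep a reversed copy of the name and index it.
ALTER TABLE citys
    ADD COLUMN name_rev VARCHAR(100)
        GENERATED ALWAYS AS (REVERSE(name)) STORED,
    ADD INDEX idx_name_rev (name_rev);

-- The left-fuzzy query  name LIKE '%Dalian'  becomes a right-fuzzy
-- query on the reversed column, which can use idx_name_rev.
SELECT * FROM citys WHERE name_rev LIKE CONCAT(REVERSE('Dalian'), '%');
```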
 
19、Use explain to analyze your SQL execution plan
1、type
 
1.      system: the table has only one row; basically never seen.
2.      const: the table has at most one matching row; commonly triggered when querying by primary key.
3.      eq_ref: one row is read from this table for each combination of rows from the previous tables; apart from const, this is probably the best join type.
4.      ref: for each combination of rows from the preceding tables, all rows with matching index values are read from this table.
5.      range: only rows in a given range are retrieved, using an index to select them. range applies when an indexed column is compared with constants using the =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN or IN operators.
6.      index: the same as ALL, except that only the index tree is scanned. This is usually faster than ALL because the index file is usually smaller than the data file.
7.      all: a full table scan.
8.      Performance ranking: system > const > eq_ref > ref > range > index > all.
9.      In practical SQL optimization, the goal is usually to reach at least the ref or range level.
 
2、Extra common keywords
• Using index: the information is obtained from the index tree alone, with no need to look up rows back in the table.
• Using where: a WHERE clause restricts which rows are matched against the next table or sent to the client. Unless you specifically intend to fetch or examine all rows of the table, something may be wrong with the query if Extra is not Using where while the join type is ALL or index. A lookup back to the table is required.
• Using temporary: MySQL needs to create a temporary table to hold the result, typically when the query contains GROUP BY and ORDER BY clauses that list the columns differently.
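A minimal usage sketch against the `employee` table from tip 17:

```sql
-- Prepend EXPLAIN to a query to see its plan instead of running it.
EXPLAIN SELECT * FROM employee WHERE name = 'Nezha Programming';
-- Inspect the `type` column (aim for ref/range or better) and the
-- `Extra` column (Using index means no lookup back to the table).
```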
 
20、Some other optimizations
1、When designing tables, add appropriate comments to all tables and fields.
 
2、Keep the SQL writing style consistent: use the same case for keywords, and use indentation.
 
3、Before modifying or deleting important data, make a backup first.
 
4、In many cases, using exists instead of in is a good choice.
 
5、For the fields after where, watch out for implicit conversion of their data types.
 
Index not used:
SELECT * FROM user WHERE NAME=110
 
(1) Without the single quotes, this compares a string column with a number, so the types do not match.
(2) MySQL performs an implicit type conversion, turning the strings into numbers before comparing, which prevents the index from being used.
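The fixed version simply quotes the literal (same assumed table):

```sql
-- The types now match, so no implicit conversion happens and the
-- index on name can be used.
SELECT * FROM user WHERE name = '110';
```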
 
6、Try to define all columns as NOT NULL
 
NOT NULL columns save space: a NULL-able column needs an extra byte as a flag to record whether the value is NULL.
NULL-able columns also require attention to null-handling problems during calculation and comparison.
 
7、Use a soft-delete (logical delete) design
 
8、Unify the character set of databases and tables, using UTF8 wherever possible

(1) It avoids garbled-character problems.
(2) It avoids index failures caused by comparisons that must convert between different character sets.
 
9、select count(*) from table

A count like this, with no conditions, causes a full table scan; it has no business meaning and must be eliminated.
 
10、Avoid expression operations on fields in where

(1) When the SQL is parsed, a field wrapped in an expression leads to a full table scan.
(2) With the field left bare, without an expression, the index takes effect.
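A sketch of both forms, assuming a `user` table with an indexed `age` column:

```sql
-- Counter example: the expression wraps the indexed column, forcing a full scan.
SELECT * FROM user WHERE age + 1 = 30;

-- Positive example: move the arithmetic to the constant side; the index applies.
SELECT * FROM user WHERE age = 30 - 1;
```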
 
11、About temporary tables

(1) Avoid frequently creating and deleting temporary tables, to reduce the consumption of system table resources.
(2) When creating a temporary table, if a large amount of data is inserted in one go, you can use select into instead of create table, to avoid generating a large number of logs.
(3) If the amount of data is not large, then to ease the pressure on the system tables, create the table first and insert afterwards.
(4) If temporary tables are used, be sure to explicitly delete all of them at the end of the procedure: truncate table first, then drop table, to avoid locking the system tables for a long time.
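The cleanup order in point (4) can be sketched as follows (the temporary table name is an assumption):

```sql
TRUNCATE TABLE tmp_report;  -- free the data pages first
DROP TABLE tmp_report;      -- then remove the table definition
```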
 
12、Indexes are not suitable for columns with a lot of duplicate data, such as gender; columns used for sorting, however, should be indexed.
 
13、Keep the fields filtered by distinct to a minimum

1. A statement with distinct occupies more CPU time than the same statement without it.
2. When many fields are queried with distinct, the database engine must compare the rows to filter out the duplicates.
3. This comparing and filtering consumes system resources such as CPU time.
 
14、Try to avoid large transaction operations, to improve the system's concurrency.
 
15、All tables must use the InnoDB storage engine

InnoDB supports transactions, supports row-level locking, and recovers better, and it performs better under high concurrency. So unless there is a special requirement InnoDB cannot meet (such as column storage or spatial data storage), all tables must use the InnoDB storage engine.
 
16、Try to avoid using cursors

Cursor efficiency is poor; if a cursor operation touches more than 10,000 rows, you should consider rewriting it.
 