I'm one of the fans of open source Database Management System (DBMS), called MySQL. The main reasons why I like MySQL are: free, easy to use, stable and powerful. Some people will doubt about The latter twos, but indeed MySQL is stable and powerful. I use MySQL a lot in company, not because my company cannot afford commercial DBMS, but simply because MySQL is really a fast and dependable solution, as my department often assigned to do projects that have tight deadline.
But still MySQL, yet powerful, is not like commercial DBMS like Oracle. MySQL needed to be used properly if you want to get maximum performance (although commercial DBMS too :p no I mean in simplest things). Take Indexes for example. In MySQL too we have indexes, but maybe some people haven't notice how MySQL handles indexes.
To start learning about MySQL indexes, let's see below query
select * from car where color='red'
If you have around 100 records in car table, it's fine not to have index. But how about if you have around millions of records, or maybe billions?. Index is a must. General rule in database world is that you should have index on column which you have query state that column in where condition. Like above example, you should index on color column, moreover if you do the query a lot, and the table itself has lot of records. The other rule we should have index on column, is the column which involves in join query (mostly, in foreign key to other table).
select * from car a inner join company b on a.id_company=b.id
Then, you should have index on id_company column too.
Wait wait wait...
What you mean by 'should have index on id_company column too' ?? It means, we should have two indexes, one for color column, and the other one for id_company.
Simple like that :D ? Of course no, there someother considerations on MySQL indexes. There will be also compound index, that is index in more than one column. Please note that is different the previous case (two single indexes vs compound index)
- Two single indexes == We have two indexes, that is one for column A, and the other for column B (like the previous case)
- One compound index == We have one index, that is applied to two or more columns, let's say column A and column B
Got the differences??
Let say you have query:
1. select * from car where color='red'
2. select * from car where color='red' and id_company=14
3. select * from car where color='red' and id_company=14 and production_year=2000
A. Indexed column(s) : [color]
Let's say you have index only on column color.
Then the index will apply on three queries above, but only part applied to Query 2 and Query 3. Why? Because the indexed column is only color. Thus, in Query 2 case, MySQL still has to do full table scan (after applying index color) to find records that satisfy where id_company=14 statement
Conclusion for Indexed column A:
Query 1 --> full index use
Query 2 --> part index use
Query 3 --> part index use
B. Indexed column(s) : [color, id_company] (compound index)
Same as above analogy, the index will be fully used in Query 1 and Query 2. But for Query 3, the index will only partly use to find records, as column production_year is not indexed.
There maybe will come a question regarding whether Query 1 will benefit from index color and id_company. There answer is yes, because the query will use the first part of compound index, that column color to find the records
Conclusion for Indexed column B:
Query 1 --> full index use
Query 2 --> full index use
Query 3 --> part index use
C. Indexed column(s) : [color], [id_company] (two single indexes)
For Query 1, surely this case will benefit from the two single indexes. But how about Query 2? Yes it will also benefit, but not as good as B. Indexed column case. Why? Because the MySQL can only use one kind of index per query. As for case C, we have two indexes, then only one index will be used. This is the one of the limitations for MySQL DBMS. The same rule apply to Query 3.
Conclusion for Indexed column C:
Query 1 --> full index use
Query 2 --> part index use
Query 3 --> part index use
D. Indexed column(s) : [color, id_company, production_year] (compound indexes)
This index Case, is the best solution for the queries stated above. All queries will utilize the index that Case D has.
Conclusion for Indexed column D:
Query 1 --> full index use
Query 2 --> full index use
Query 3 --> full index use
Fiuhhh, so far we have Case D as the best solution. But how about we have another query to challenge??
4. select * from car where production_year=2000
The answer? If you only have compund index on [color, id_company, production_year], then to solve query, MySQL needs to do full table scan!!! Why?? Because production_year is the last part of the compound index (the order does matter!!!). So MySQL cannot use the index. It's different from Query 1 : select * from car where color='red'. The color is the first part of compound index, and it can utilized. So the solution?? You will need additional index on column production_year. So, will need to have indexes like this
- [color, id_company, production_year]
- [production_year]
Index on modifier function used in where
The last lesson (cieh :p) that we need to learn from this session, is that MySQL is not smart enough to apply the index on modifier function used in where statement.
Let's say the car table has another column, called sale_date. The sale_date is datetime SQL type. You have single index on column sale_date. . Then you have SQL query like below
- select * from car where year(sale_date)='2006'
Will the sale_date index be used? The answer is no!!
MySQL is not smart enough to do that. The solution? Never use the modifier function on where statement if you want to make use of index. So, change the SQL query into
- select * from car where sale_date between '2006-01-01' and '2006-12-31'
Then the index will be fully utilized :)