DB performance troubleshooting workflow using Percona PMM
Some time you may face with the situation when your DB became slow. You may notice it during developing your system. Maybe database on staging worked fine, and after deploy to prod you realised that is works not so good as you expected. Or even maybe you didn’t make any changes and the performance of your DB started degrading. In any case you probably would want to get to know the reason, to improve performance and to make your app or infrastructure better. And PMM can help you to investigate this situation. It gives you very informative visualisation of your databases state and tools for more deep analysis of how they are working.
Let’s try to troubleshoot performance issues, using Percona Monitoring and Management. There are two good entry points to start your investigation.
Step 1. Check your resource limits
They are hardware resources and network limits. First of all take a look at Network IO dashboard on your home dashboards page.
If you see that it is very close to your network throughput that means you should expand the your channel. The next thing is to pay attention to CPU and Memory usage. You may see the whole dashboard, and if you discovered that almost all resources are utilised, you may get some more details about the load time and correlate it with some events in your system.
The same with Memory Limits. Another common limit is a disk speed. If you see that read or write is close to disk limit, probably you should change it for faster one.
Step 2. Explore your DB work
If you just build your app and faced with resources limitation, the only way for you is to tune your hardware. But what if your current system performance began degrade without any changes from your side? Or maybe everything is OK with resources limits, but your database is still slow? In this case lets deep dive into database metrics.
First of all take a look on the database connections dashboard. If you see significant increase of connections try to realise if these are new customers, or maybe some bugs in your system. One more important metric is the number of queries per second. If you see that its number is much higher than usual, compare it with the number of connections. Increasing on connections number logically will increase the number of queries. But if you see that the number of queries per second is too high with the same number of connections, you should explore your system to find the reason of such behaviour.
Increasing of customers number is a good sign for business, and if you see that your database fails to cope with it, it means that it is time to scale your system. But increasing of connections and queries numbers can also indicate that you have some bugs in your system, like unclosed connection, duplicated queries etc.
Step 3. Deep dive into queries
It is good if you have solved performance issues an didn’t get to this step. But what if your resources are fine, connections and queries are also in normal range, but you still face with performance issues. That is a time to deep dive into exploring of queries. PMM gives you good tool to do this — Query Analytics
On this dashboard you may see queries, ranked by Load. Load is a metric which defined as Average Active Queries which is a mix of query time execution and the query count. And you can see that possibly some too heavy queries executed too many times. Inspect them, try to understand if you can optimise query expressions or maybe the frequency of their execution. Try to specify query parameters and use received data more effectively.
To conclude, we may say that PMM can’t solve your database performance issues. But it gives you very powerful tools to determine issues reasons. Using it you may find the weakest places of your system and improve them the most effectively and economically.