As it was a Friday, things got a little slow. Maybe it was intentional, or maybe it’s just how my brain operates. I do know it is the wrong mindset. I shouldn’t be looking forward to the weekend. If I do, it means I’m not enjoying my work. Either way, I shall talk about my day, what I did and what I learnt.
First of all, all of our SIT issues were cleared during the week. So my plan for the day was to reply to the product support cases that I had opened, attaching the relevant logs and screenshots, and also to clear some of my open tickets in JIRA.
The first thing I did was to bundle up the logs and screenshots that I had identified the day before as proof of the issues I was describing. Then I uploaded them as a zip file to the support site. Between 10am and 10.30am, two of my colleagues arrived. They were mostly free, so I got one of them to help me fix the bug logged in JIRA. It was a quick and simple fix: just add an additional event to serve as one of the triggers for the business rule. He did that and tested it. It worked, so the ticket was closed.
In the meantime, I shifted my focus to another support case, which was about the very slow performance of the application when loading accesses. Based on the comments provided by the support person, I attempted to tune the application servers. This mostly involved tweaking the connection pools for both the database and JMS messaging. I tried several combinations, but after several server restarts the applications were still performing badly.
Then it was lunch. Rain was pouring down too. We went for lunch at around 1.30pm.
After lunch, I came back and decided to tweak the Admin Tasks, which represent actions that users can perform in the identity application. Both the Product Support team and the documentation on the official site mention that Admin Tasks load extremely slowly when multiple tabs are enabled. Each tab basically represents a screen or a set of events (think event-driven). So the best way to optimize was to remove the unnecessary tabs. And why was it slow? Because some of the tabs are related to access and roles. Every time a task is loaded, the application checks against the database whether the current user is authorized to perform the actions related to each of these tabs. Removing the tabs, especially the ones that are not needed, removes the need for those checks. Anyone who has worked with databases and applications long enough will know that repeated database access is slow.
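To make the reasoning concrete, here is a minimal sketch of why fewer tabs means fewer database round trips. It is not the product's actual code: `FakeAuthStore`, `load_task`, and the tab names are all hypothetical, and the "database" is just an in-memory set with a query counter. The only assumption carried over from the above is that loading a task triggers one authorization lookup per configured tab.

```python
class FakeAuthStore:
    """Hypothetical stand-in for the database; counts authorization lookups."""

    def __init__(self, grants):
        self.grants = set(grants)  # (user, tab) pairs that are allowed
        self.queries = 0           # number of simulated database round trips

    def is_authorized(self, user, tab):
        self.queries += 1
        return (user, tab) in self.grants


def load_task(store, user, tabs):
    """Return the tabs the user may see, doing one lookup per configured tab."""
    return [tab for tab in tabs if store.is_authorized(user, tab)]


# Hypothetical tab names, for illustration only.
all_tabs = ["Profile", "Access Roles", "Provisioning Roles",
            "Admin Roles", "Groups", "Delegation"]
needed_tabs = ["Profile", "Groups"]
grants = [("alice", "Profile"), ("alice", "Groups")]

before = FakeAuthStore(grants)
load_task(before, "alice", all_tabs)

after = FakeAuthStore(grants)
load_task(after, "alice", needed_tabs)

print(before.queries, after.queries)  # 6 lookups versus 2
```

The user sees the same two tabs in both cases, but the trimmed task does a third of the lookups. In the real application each lookup is presumably far more expensive than a set membership test, which is why removing tabs can matter.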
Before long, it was 3pm and another colleague joined us, so there were now four of us in the room. This third colleague was there to troubleshoot database performance with regard to report generation. He suspected CPU throttling. From time to time over the past few months, when he ran database queries, he noticed that CPU usage never quite went beyond 30%. Sometimes it just worked fine and the report was generated. As this was causing problems when generating the Access and User Account reports, a decision was made to work with the customer’s System Admin and Database Admin teams so that they could monitor the server while he did the testing. This time, for some reason, there were no signs of CPU throttling. Most of us had been there when the reports were being generated over the past few months, and there had indeed been signs of throttling. So we suspected that the customer’s team had done a silent fix. Reports were now being generated smoothly. With no further proof of the throttling, there was nothing else he or we could do about it.
Back to what I was doing: removing tabs from the Admin Tasks. At first, I removed a few irrelevant tabs from a few of the Admin Tasks that we use to implement our solution in the identity manager. Then I restarted the connector. The restart was slow, so I spent most of my time waiting. Once the connector had started, I began testing by loading the Access module with a test user. Every time I did that, I used the stopwatch on my phone to time the loading process. By 5.20, it was clear that the earlier tuning I had done on the application server might be causing the performance issue, so I started reducing the number of database connections in the initial pool as well as the JMS pool size. Then I went on to remove all the unnecessary tabs from the remaining Admin Tasks used in our solution.
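The phone stopwatch did the job, but the same measurement could be scripted. Below is a rough sketch of a timing helper, assuming the page load can be triggered from code. The `load_access_module` function is a hypothetical placeholder (here it just sleeps briefly), not anything the product provides.

```python
import statistics
import time


def time_runs(action, runs=5):
    """Call action() several times and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of timings to min, median and max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def load_access_module():
    """Hypothetical placeholder for loading the Access module as a test user."""
    time.sleep(0.01)


timings = time_runs(load_access_module, runs=3)
print(summarize(timings))
```

Taking several runs and looking at the median makes it less likely that a single slow or fast load, for example right after a restart, skews the comparison between two configurations.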
By 5.30, two of my colleagues decided to leave as they had nothing else to do. As for me, I continued to tweak, and by around 6.00pm I had managed to reduce the load time by 20 seconds. It wasn’t much considering that the overall load time was still beyond a minute. But that was still some improvement, so I decided to call it a day. So did the other colleague, who had switched to working on his other project after the testing with the database server yielded no proof of CPU throttling.
So, what’s the lesson learnt? There is one takeaway. Performance tuning is tedious and painful. You have to know what you are looking for and how the different components interact with each other; only then can you know where to start looking and what tuning to do. It also requires meticulous timekeeping and recording of the steps taken, so that you can reproduce the problem and know how to revert to, and restart from, some kind of logical checkpoint.
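As a sketch of what that record keeping could look like in practice, here is a small tuning log that notes every change with its measured load time and can roll settings back to a named checkpoint. Everything in it is an assumption for illustration: the `TuningLog` class, the setting names, and the numbers are made up, not values from the actual servers.

```python
from dataclasses import dataclass, field


@dataclass
class TuningStep:
    setting: str
    old_value: object
    new_value: object
    load_seconds: float  # load time measured after the change


@dataclass
class TuningLog:
    settings: dict
    steps: list = field(default_factory=list)
    checkpoints: dict = field(default_factory=dict)

    def checkpoint(self, name):
        """Remember the current settings under a name."""
        self.checkpoints[name] = dict(self.settings)

    def apply(self, setting, new_value, load_seconds):
        """Change one setting and record what changed and how it performed."""
        old_value = self.settings.get(setting)
        self.settings[setting] = new_value
        self.steps.append(TuningStep(setting, old_value, new_value, load_seconds))

    def restore(self, name):
        """Go back to a checkpoint; the step history is kept for reference."""
        self.settings = dict(self.checkpoints[name])


# Hypothetical setting names and timings.
log = TuningLog({"db_pool_initial": 10, "jms_pool": 5})
log.checkpoint("baseline")
log.apply("db_pool_initial", 50, load_seconds=95.0)
log.apply("jms_pool", 20, load_seconds=97.0)
log.restore("baseline")  # the larger pools did not help, so start over

print(log.settings)
print(len(log.steps))
```

Changing one setting per step is what makes the log useful: if the load time moves, there is only one change that could have caused it.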
Here I conclude my journal for today.