Oracle performance tuning – an update

A tutorial is now available to perform Oracle performance tuning of applications and SQL statements. This tutorial has been expanded to include case studies, which will go a long way in better understanding of concepts explained.

Link to the PDF tutorial here: Tuning.pdf

This current posting is an extension to the existing post on performance tuning, which you can still refer to – for more resources on the topic.

Tuning
Tuning

Share

Oracle: Snapshot too old?

Berlin Wall
Berlin Wall
Okay, so you have received the Oracle error ORA-01555 Snapshot Too Old and have no clue how to go about resolving it? This post is made for you then. (The first time an application developer has written about this rather than a DBA.)

First, why does this occur? When you run a query, Oracle retains that data in a “snapshot”. The underlying tables in that query might continue to get changed, but you will see the data as it was when you executed the query. You can keep moving back and forth (rows) within the snapshot using the cursor. However, as you might expect: Oracle cannot hold that snapshot for ever. For how long it retains the snapshot is defined via the UNDO_RETENTION parameter.

So one way to solve this problem might be to increase the limit defined by this parameter. However, that is not always the best solution.

This problem normally occurs when a process opens a cursor (by running the query), and processes each row one by one. For example, let’s assume the process runs a query that returns 10000 rows. Processing each row takes, on average, 10 seconds. It goes on to the next row after processing the previous. Hence the total processing of all these rows will take around 28 hours. If your UNDO_RETENTION is defined as 10 hours, this process will fail on the snapshot too old error.

One of the best ways to solve this problem is to execute performance tuning on this process. This should be carried out specifically on the part of the processes that runs within the query in question, and should be targeted at reducing the time it takes to process one row. For example, if we can get our processing time down to 3 seconds, we will be done within about 8.5 hours, which is below our current setting for UNDO_RETENTION. In most cases, this can actually be done. (Read more here and here.)

A second way to solve the problem is to use a temporary table. For example, suppose you want to analyse all open purchase orders. From the table containing POs, pull the ones that are open, and put them into the temporary table. Since the temporary table is being used only by your process, Oracle will not have to hold the “snapshot” for you. Again the main driver query is the candidate for putting into temporary table. This will also make your process faster overall if it’s based on a not-so-small subset of a large table.

However, a third solution is also possible. For our problem we had a process that had to run for days and days, rather than doing something and finishing. So obviously, we got this error.

To solve the problem, we exited the loop after every n rows, and then reentered it. For example, if the pseudocode looked as below prior to the fix:


select something from somewhere;
while (rows) {
  do process
} 

We changed it as below:


hasAtleastOneRow = True;
while (hasAtleastOneRow) {
  hasAtleastOneRow = False;
  select something from somewhere where rownum<n;
  while (rows) {
    do process
    hasAtleastOneRow = True;
  }
} 

Note that the SELECT statement must have a mechanism to prevent picking up rows that have already been processed earlier. This could be a flag-condition or ‘check’ another table. For example:

select po_id from po_table where po_status='O' and rownum<100
and not exists(select 1 from po_temp where po_table.po_id = po_temp.po_id)

As part of the 'do process' then, we should insert into po_temp.

How do we select the value of 'n'? You will have to do some timing and hit-and-try here. Try to keep the highest value that is guaranteed to take lower processing time compared to the undo retention window.

Share

Performance tuning tips

Tuning

Today I will share with you a couple of tips on process performance tuning: rewriting your code to be faster.

This, for a change (contrasted with my previous posts on performance tuning), has nothing to do with Oracle or SQL: you can use these tips in any language.

When your code has been identified as having a performance issue, the first task is to go through the code with a fine toothed com from a performance perspective. Is there something that you can immediately notice and change?

After that is done, one should look at caching as one optimizing method. Caching is used in multiple domains, from web-browsing to microprocessor memory and there is no reason why your code should not benefit from it. As an example, an application had some logic to map department IDs: given a source department, it had to determine a target department. However, several different database tables needed to be consulted in a chain mode, some of which allowed ranges (e.g. for input department 1234A to 999B, have the output department as 526C). All this took substantial time, per department input. To solve the problem, we created a cache table: when the process looked up one department, it added that to the cache table (along with the corresponding output department). The next time it encountered the same input department ID, it would just pick up the value from the cache. How long to retain the cache is an important parameter, and your own requirements will need to decide that.

This can also be implemented using a Hash table (or a LoadLookup as some languages call it), rather than using a database table.

The second tip is moving the decision making earlier in the flow. For example, with one of the reports, certain rows were selected for processing in the beginning. Each row was thereafter processed one by one. At that time, part of the logic was to check some row fields and determine that this particular row did not need to be processed. This check was moved upwards, during the time of initial selection. As a result, the overall processing time went down.

Please go apply these two tips in your projects and let me know your feedback.

Share

Top 10 considerations when preparing a software test plan

testing

Click on images to enlarge

-> Test the parts of the application that have changed since the last cycle / go live

This part of the test plan is very obvious: test the changes to the application. Each change needs to be tested individually if possible, or as groups if the number of changes is large, and is known by the name regression testing.

For example, if you added a new field called ‘maximum pay by date’ to the voucher batch interface, then you could test the interface for this – having both data with this date entered, and with this date set to blank.

There is nothing more to this one – its normally the facet of testing that does receive the due focus during testing.

-> Test sampled parts of the application that have NOT changed

Now we come to something that does NOT receive the due focus. The parts of the application that remained unchanged. No, you do not have to test ALL if it. If you can test all of the application (especially with automated tools, as discussed below) – nothing like it. However, at least test 10-15% of functionality that has not changed.

For example – as discussed above – if you changed the voucher batch interface, then you can test the online voucher entry. Under the online voucher entry, test at least one scenario that has not changed.

The rule of the thumb is that if in a module having 100 test cases, 40 have changed – then test those 40 that have changed, and test 6-10 of those 60 that have not changed.

-> Look at it from the end users perspective: do one full cycle end to end

Next to include in the plan is something you can call integration testing: if your application is about users entering vouchers and getting paid – perform this cycle as a user would do. Many times we IT folks test only our application – the one we are developing and forget the rest of the glue technology. It falls into the category where we want to do it, yet are lazy at – so we find some short-cuts.

Once I was asked to carry out testing for a reconciliation report that had already been tested by the developers. I uploaded the same input twice, which ended up showing double on the final report. It turned out that the developer had missed this because he tested only on the basis of data that already existed in the system, and did not upload any new vouchers.

-> Stress testing

stress testing

Stress testing should again be a very critical part of your test plan. How many users are expected to use the application? during normal hours? during peak hours? Plan for all such scenarios. Design the business process that would take place if the application does fail – the idea should be that the user’s work doesn’t get halted.

There are stress testing tools available both free and commercial that you can use to simulate users.

In one of my projects, a web application that was created for 800 users, failed under a load of 35. Increasing the number of processors, or the number of server boxes is not a guaranteed way of handling load on the application: the application has to be designed to support the load from the ground up, and tested suitably.

-> Performance testing

performance testing

How long does a file take to get processed? How long does the user expect it to take? How long it takes for the screen to open/save?

The user expectation part is sometimes ignored. Please go ask the users of your application now what their expectation is – or it might already be too late in terms of coding.

The developers might think if a process runs for one hour its good enough. However, the users might be needing to run it six times a day during the closing period. Hence one hour might not be fast enough. In such a scenario we had to run four parallel instances of a process to achieve the user specified timing.

-> Concurrency testing

Can two different instances of the new process run together? The panel you just created: can it be used by two persons at the same time? Does it cause deadlocks at the database level if 100 instances of the process are run together?

Can two different versions of the application exist on the same machine?

These are the kind of questions that you ask yourself while working on the ‘concurrency’ aspect of test plan/execution.

A team of developers once needed to clone a process, and create slightly different functionality. However, it turned out that when both the processes were run together, 1 times in 10, one of the processes would fail. This was noted after go live 🙂 Turned out the cause was incorrect use of the shared temporary tables by one of the processes.

If you are interested in Deadlocks technically please read my posting: “Oracle Deadlocks: the What & the How“.

-> Unit test before Integration testing

Our laziness at work again: we ‘trust’ our work and want to move directly to integration testing. Partially, the waterfall model of software development is also to blame here.

99% of the times, after the developer moves directly to integration testing – the very first test case for the application fails, and the developer comes back to the unit testing phase. 🙂

Unit testing is a very critical part of your test plan – if you do it right, you will find hundreds of issues that will otherwise never get detected. Even not during integration testing.

Build ‘driver modules’ to iterate through all the ‘ifs and whiles’ that have been coded. Try out all avenues control can flow through.

-> Create test history

Creation of a test history is as important as doing the testing. Being able to, at a later date, answer such questions as: ‘what are cases we tested?’, ‘what are the problems we found?‘ etc is very helpful. Showing a clean slate (a ‘pass’ on all test cases) at the end of all our test iterations is not so helpful. In short, record the problems found, even though they may get corrected later on.

-> Automated testing

Automated testing solutions can be a big help. It does not mean that all testing be delegated to the automated testing mechanism: but it can definitely be an add-on to your manual testing.
In changing the order entry functionality, use it to enter 1000 different orders. There are several solutions available (use google) that will record the user actions, and will repeat those actions later with different data.
At a very simple level, AutoIt is a great tool for automated data entry, and is free (GPL). Its very flexible and has a great library of functions built into its scripting language. I use it all the time, and not just for testing!

-> Code review

While we focus on all these great ways of testing let us not forget our tried and tested workhorse: code review. Being humans, we are tempted to feel that by doing better testing (being easier to do) we can offset the need for a good code review, but there are hundreds of reasons to do code review.
There may be some program flows designed for rare situations which may never get tested. Code review in such a case will contribute ideas for such test cases. Documentation may not be in sync with the code, with the potential to make future changes difficult. There may be code improvements possible: for example, replacing an ‘if condition’ with a more specific check.

There are other things, depending on your scope you may also want to include them:

-> Knowledge transfer/competence testing

-> Backup & recovery testing

All the best, post your comments here.

Share

When NOT to normalise the database

database
When talking of Database Normalisation, textbooks often talk of BCNF, fifth and higher normal forms. However, in practice (in large software/ERPs) I have rarely noticed normalisation beyond Third Normal form. In fact, there is a certain degree of redundancy that is desirable.

While doing database design, I believe there are two critical aspects that should be kept in mind but I see ignored in a lot of common software.

The first is the time aspect of data. First – an example from finance. Consider a company having multicurrency invoicing. The tables can be designed as:

INVOICE: InvoiceID, ..., Currency, BaseCurrency, TransactionDate, ...
CONVERSIONS: FromCurrency, ToCurrency, EffectiveDate, RateMultiplier

This is a design having no redundancy. On the basis of the three fields in the INVOICE relation, we can always find out the latest row from the CONVERSIONS table having EffectiveDate less than TransactionDate. Hence we can determine the RateMultiplier.

Consider another design:

INVOICE: InvoiceID, …, Currency, BaseCurrency, TransactionDate, RateMultiplier, …
CONVERSIONS: FromCurrency, ToCurrency, EffectiveDate, RateMultiplier

Here, the system determines the value of the RateMultiplier at the time of invoice creation and records it permanently within the INVOICE table itself. To me this would be more mature design. Why? Because a lot of data in the INVOICE table would actually depend on the RateMultiplier: for example the VAT details. If on 1-JAN-2009 we know that the exchange rate is 1.1. However, on 3-JAN-2009 we come to know that the rate was incorrectly recorded. Someone changes the CONVERSIONS table to reflect the new exchange rate, of 1.2. All the details in the INVOICE table for the invoices created between 1-JAN and 3-JAN become inconsistent since the BaseCurrency is now inconsistent with the RateMultiplier.

Now consider an example from HR appraisal systems. A table stores what stage an appraisal process is at for a particular employee. This is then used to decide what access he has.

STAGE_CURRENT: EmpID, Stage

Note that this has no Date, or Year field. An employee is trying to see records for the previous year appraisals, yet is unable to see some of the data, because current appraisal process is still in initial stage.

The next problem is that of storage of “under calculation” fields. For example, consider the training department maintains the scores of each student trained. The test administered is of 100 marks, but has a weightage 40. Proposed design:

SCORES: CandidateID, TestID, Score, Flag

At the time of recording, the Flag is set to N. Thereafter a process runs that multiplies the score by 0.4 and sets the Flag to Y.

In my opinion a better design would be to retain both the scores even though the pre-weightage score is not relevant to the business process, because a process can also terminate in between due to erroneous data being supplied. Hence if the process ends after setting the flag to Y, and before changing the score; or in reverse order: after changing the score and before setting the flag then we end up with inconsistent data. Improved design:

Scores: CandidateID, TestID, Score, WeightedScore

At the time of recording, Score is entered and WeightedScore is set to zero. Thereafter a process runs that multiplies the Score by 0.4 and stores the value in WeightedScore.

The central idea is to retain all information permanently so that even if the process fails, we know what data existed.

Share