Topher's Top Ten (Logical) Tuning Tips

This isn't comprehensive, and it's just my opinions, but it may help those new to databases or sysbase. Hey, even the gurus may find a tip or two. Anyway, here you go ....

  1. Know Your Data (and how it's accessed)

    The most important thing you can do to tune a database is to have an understanding of what's in it, how it's arranged, and how people are going to look at it. One oddball querry, if done frequently or at the wrong time, can really throw things out of whack. The best way to do that is:

  2. Build an i/o profile

    Simply put, measure *every* querry in the system in terms of i/o's. Use best, worst and average cases (if applicable). Weight the i/o's in terms of frequency and criticality. Then (and only then) do you have the information you need to make informed tuning decisions.

  3. Select indexes based on cost

    Every index has a cost and a benefit in i/o's. When you select an index, measure it's impact on the entire i/o profile. Try to minimize the weighted, aggregate number of the profile. Keep in mind that you can add additional columns to the end of an index without affecting the usefullness of the first few columns. Although the write i/o's and index height is affected, the buy may be worth it if you can either double-use the index or get index coverage for some querries.

  4. Know what the database is doing

    Don't assume that the way you wrote a query is the way it will execute. When you build your i/o profile, use the "set showplan on" and "set statistics io on" options to get the information you need. Find out what the database is *really* doing, not what you think it should be doing.

  5. Re-write problem querries

    Querries which contain complex joins or are just plain nasty should be re-written. There are a number of ways to rewrite them. Some of my favorites include:

  6. Encapsulate, encapsulate, encapsulate

    Stored procedures and triggers are the most golden gifts from the sybase gods. Encapsulation lets you separate, tune, redesign and even re-engineer your database separate from the rest of your application. It brings things up to a functional (business) view and allows you to think in terms of getting x,y and z given a,b and c. This allows you to:

  7. Denormalize and only do things once

    Try to get the things you need in one shot. Have it all there waiting for you instead of having to hunt for it. The one to always do when denormalizing is to base it on cost as validated by retesting your io profile. The two things to NEVER do when denormalizing are: a) never denormalize a database that hasn't yet been normalized b) never denormalize an unencapsulated database

    If a querry requires calculations, perform them when the data is put in, not when it is read out (if at all possible). This can be done with triggers for semi-static calculations. For frequently updated calculations, use a daemon that operates within your level of accuracy/precision. EVERYTHING should be cost based. Look for ways to avoid doing repetative things. Lists are a great example. If you have a bunch o' tables that are joined and reformated to create a nice looking pick list, then use triggers which do the join/reformat of a single row when it is inserted or updated. Another great example is complex sort keys. If a sort key is complex or calculated, try to create another column which has the sort key premade so that the optimizer can use it.

  8. Understand your optimizer

    There's some really good stuff out and about which explains the optimizer and SARG's and OR's and IN's and stuff. Find it and grasp it. My favorite optimizer tip? Perform calculations against literals, not columns. For instance, to get stuff 90 days overdue, instead of:

    datediff(dd,lastpay,getdate()) > 90

    do:

    lastpay < dateadd(dd,getdate(), -90)

  9. Do a contention diagram (understand your lock manager)

    From your io profile and querry plans, identify potential points of resource contention. Are you always going after the last couple pages of a table? Is a given table mostly being accessed by two diametrically opposed querries? Are querries competing for different types of locks? Is a querry grabbing a lot of pages for a small amount of data? Am I getting starved, or am I starving others? What is my blocking/deadlock potential? What things need to be accessed simultaneously.

  10. Build small tranactions

    Transactions should be small, atomic and fast. Transactions hold locks and block other processes. If you have a long transaction, especially one which requires communication back and forth with the front-end (e.g. committing multiple inserts), put them into temp table and then build a transaction only around the portion which moves them to the permenant table. Another technique to use is the "optimistic transaction," which does a great deal of transaction stuff outside the transaction boundry and then validates it in the transaction (being optimistic that nothing changed). If the validation fails, then the whole process is re-done. An example:

    while @@rowcount = 0 begin

    select @appl = min(appliance) from merchandise where type = "washer" and available = 1

    begin tran update merchandise set blah, blah, available = 0 where appliance = @appl and available = 1 commit tran

    end

    What this buys you is that the transaction is small for the most common case (nobody else grabbed it). It makes the exception take much more time, so it only works if you know that the common case far outweighs the exception. In this example, assume the common case occurs 90% of the time. Built in a strict transaction, if it took 1 second to do the transaction, it would take a 100 seconds to do a hundred of them. But, in the optomistic transaction, if the common case takes 0.3 seconds and the exception takes 2.5 seconds, it would only take 52 seconds to do a hundred. That's a 48% savings! This is especially true if you need to divide querries with temp tables.

Topher's Top Ten (Physical) Tuning Tips

  1. Start with a logically tuned system

    (see 1-10 above)

  2. Watch your data Cache

    Use "dbcc memusage" to watch what's going on in cache. What's there, what's not, when it's there and how much room is taking up. You live and die on your cache hits. If stuff is moving back and forth in and out of cache, then you have nasty contention and your getting a lot of physical io. This can be caused by: a) not enough cache b) big table scans

  3. Use your contention diagram

    Physically spread to avoid contention and maximize concurrence.

  4. Believe in controllers

    Spreading things over multiple physical devices is a good thing, but the key factor is disk controllers. Multiple spindles on a single controller doesn't buy you a lot.

  5. Use segments (but only where you need to)

    Using segments allows you to put and keep things on a given physical device. Don't abuse it! One or two well placed segments can buy you more than lots of overlapping ones. Separating data and indexes for concurrency is good, as is spreading contention or *large* concurrent tables. However, you need to have good space and growth estimates before segmenting, because a full segment will bring you to a grinding, screeching, soul-gouging halt.

  6. Multiple Databases

    Multiple smaller databases does a number of things. First of all, it speeds up backup/restore times (for a database, not an entire system). More importantly, it gives you multiple sets of syslogs. Syslogs is your single most important point of resource contention in a database. The only way to spread it is by using multiple databases.

  7. Consider the effect of mirrors

    Mirrors can be more than just security devices. Although they increase write times, they can reduce read times up to 50% (theoretically). For read intensive applications, mirrors can be a good deal. For write intensive applications they cost performance.

  8. Understand and evaluate the effectiveness of New Technology

    RAID, solid state devices, battery backup disk caches, etc. could all be performance boosters. However, make sure they don't come back to bite you. A solid state device has a minimal impact on a read intensive/high cache hit application; even worse, a solid state device which gets full (ouch!).

  9. Montior your system

    Usage profiles may change, or be different in production. You may have to re-weight i/o profiles or contention maps and, as a result, re-divide data. In evaluation and testing, watch *everything* and try to correllate it to application functions. If you have built in instrumentation (e.g. function/begin time/end time logging), turn it on regularly and re-evaluate.

  10. Do ad-hoc querries elsewhere

    Don't do ad-hoc querries on an OLTP system. First of all, it's an unpredictable cost drain. Second of all, you can't balance the reports since the data keeps changing. For the "reports" which need to be run on the OLTP system, define them (as many as needed) and include them in the i/o profile and contention maps.

OK, maybe there are some pearls of wisdom in here and maybe there aren't. It's possible that most of this is stuff everybody knows, but hey ....

If you liked this, or have suggestions or questions, drop me a line. If you didn't like it, post replies to alt.flame.

Topher
---
Christopher "Topher" White   |  "The problem with justice in this country 
McCaw Cellular Communications|    is that there are too many hung juries
Kirkland, Washington         |     and not enough hung defendants."
Standard disclaimers apply   |          -- Dennis Miller