SYBASE ADMINISTRATION WHITE PAPER v2.0
Copyright © 1996 By Edward M. Barlow and SQL Technologies, inc.
All Rights Reserved
-------------------------------------------------------------------------------------------------------------------------------------------
THIS DOCUMENT IS A WORK FOR COMMENT.
This document is a work for comment and I welcome any comments, notes, and assistance. This document is hereby released to the public for free. You are welcome to redistribute and copy it so long as you do not alter or remove any of the copyright notices. You are welcome to use any sections in any of your own documents without the copyright notice so long as you do not directly resell those documents (i.e. you can copy it directly into your failover manual but do not copy it directly into your book). The intent this paper is to support new administrators and to give experienced administrators a better idea of how to manage systems.
If you have material to contribute, I am happy to give credit. Specifically, I have been told that I gloss over some system 10/11 issues (how to use roles etc...).
It is suggested that all users look at my free stored procedure library. This library augments the Sybase procedures and can be picked up at the following web site: http://www.tiac.net/users/sqltech/
One Cheap Plug: The author does Sybase consulting in the New York area and is available for a variety of different types of consulting contracts. I welcome you to contact me if you have a need for Sybase consulting, especially if you are just beginning a project or if you are setting up a data center. I have very good contacts throughout the New York contracting world and if I am not the best person for you, I can point you in the right direction. I have also recently produced a full featured Sybase administration tool named SQL Administrator. This tool provides automation of many of the routine procedures that administrators need every day. SQL Administrator solves many of the problems that I point out in this document. Information on SQL Administrator is attached at the end of this document. I welcome direct inquiries regarding this tool.
Special thanks to Pablo Sanchez at SGI who gave me many ideas for this document and who maintains the Sybase FAQ, a document I suggest you read. Thanx to John Lee who sent me some good ideas.
Cheers
Ed Barlow
sqltech@tiac.netebarlow@pipeline.com
212 366 1137
SUMMARY
Sybase is an increasingly popular database. The Sybase company sells a variety of products that can be used to manage data. Management of the Sybase product line is not, however, simple. Administrators spend considerable time dealing with policy and procedural issues. A lack of good administrative utilities has severely hampered the abilities of systems administrators to proactively manage their environments and has led to active administrator involvement only when problems arise. The lack of good policies and the perpetual "crisis" management has led to the general perception of low quality in the support arena. To deal with the lack of tools, administrators have developed a variety of little ad hoc utilities that they use to manage their systems. Every site has their own tool kit of administrative utilities that are used for anything from dumping the database to adding logins. This document is designed to help systems managers and systems administrators manage their UNIX / Sybase systems environment by recommending policies, procedures, and utilities necessary to manage a Sybase environment. It is to be hoped that, something in this document will make your servers more stable and, if you are an administrator, will allow you avoid a few frantic 3 AM phone calls.
At a high level, the way to break the cycle of perpetual crisis management is for administrators to be responsible for performance and responsible for quality. To handle these responsibilities, it is necessary to define a set of procedures and facilities that will ensure the quality of the systems environment. It is also necessary to develop or purchase tools that will allow the administrator to actually be responsible. Historically, due to the lack of tools provided with the server, the only metric used to measure quality has been "system availability" (or lack of it) during critical business operations. This is clearly insufficient.
At a low level, there are several distinct areas which administrators must deal with. Administrators must develop contingency plans for disasters (backups, recovery, failover). Administrators must develop methods to manage their users (id administration, turnovers, phone escalation lists). Administrators must also ensure system security. Finally, administrators are responsible for developing procedures that ensures system availability and that optimize and tune the system so it runs at its best.
Sybase provides a large amount of information on the state of its servers (errorlogs, system tables...), but that information can not be found in any one location unless one collects that information themselves. The fact that information is so spread out makes it very difficult to understand what is happening, and makes it difficult to productively manage problems. With the standard tools, an administrator can not even know which of his servers are up and which are not! As UNIX has few conventions, it is likely that each machine the administrator deals with will have different tools, disk layouts, and even could be running different versions of Sybase. The first key to administration is to define and enforce standards. The next key is to be able to collect the distributed information and to be able to analyze it. This collection should be done by software. For example, the Sybase error log for each server should be collected in real time so new errors can be analyzed by administrators when they occur. Without this type of software, most administrators never look at their error logs unless there is a problem. Performance information should be also be collected periodically and there should be facilities to analyze server performance. The third key to administration is to define procedures. Changes to the system and the resolution of problems should be documented. If nobody knows that an error occurred, there can be no evaluation or analysis to make sure the problem does not happen again. If there is no procedure for users to request login id’s, administrators can not be accountable for who has access to the data. The last key to administration is to be proactive, not passive. The best way never to have a problem is to set the systems up solidly and to fix problems before they occur.
This document will sketch a view of the Sybase Client Server architecture and will then attempt to deal with the issues that all administrators will face as they attempt to manage their systems. Specifically, this paper will deal with routine administration, disaster planning, support, how to set up new systems, how to define standards, change management, performance optimization, security, and backups.
SYSTEMS ARCHITECTURE DEFINITION
Systems Architecture Description
The Following diagram schematically describes the components of a typical UNIX / Sybase computing architecture:

In this architecture, workstations, such as Apples, PC’s, or Sun SPARCstations, are known as clients. Servers are dedicated computers that provide other computers and workstations on the network with access to shared resources like files and databases, or services like printers, modems, and mail. Users access servers transparently through front end software that they run on their workstations or PC’s.
Major system services reside on database, application, and file servers. Database servers manage database engines, like Sybase. Application servers manage specific applications or sets of applications. File servers support generic applications like word processing and maintain user directories. The other servers provide support services for these servers. Name servers are used for host lookup and network security, and time servers are used to maintain the accuracy of system clocks. Compute servers are used for compute intensive operations such as financial optimizations. The final component of the architecture are the Network Gateways, which connect the network with networks of different types (e.g. IBM SNA). Any physical machine may serve multiple server functions if performance permits.
A simple version of this architecture consists of only clients (PC’s) and machines that run SQL Servers (Sybase on UNIX or NT or Microsoft SQL Server on NT).
Components of The Networked Solution
Database Servers
It is good practice to make Sybase servers as uncomplicated as possible. The systems should be simple because you want them to be optimized for reliability. They should be virtually identical to your standard (plain vanilla) UNIX platforms except for modifications necessary to run the database well. Production database servers should run only database engines (no application programs) and the administrative tasks (monitoring, backups) necessary to keep the database up and running. There may be cases where it can make sense to run applications on your servers, but your instincts should forbid this.
Database servers will typically be similar to the standard UNIX configuration except that:
All database servers that you own should be configured identically except for external disk drives and a few configuration files. This allows failed machines to be quickly replaced with an identical machine.
Compute Servers / Application Servers
Compute servers are used to perform numerically intensive computations such as statistical analysis or optimizations. By dedicating a server to these tasks, the load on local servers is reduced. Applications servers are used to provide higher powered CPU capabilities to applications that require better performance than can be provided by the local workstation.
File Servers
File servers are used to store home directories, commercial software, and common system software (electronic mail, print queues). There is an ongoing debate whether Sybase servers should mount their Sybase directory from a file server.
Gateway Servers
Systems rarely exist in a vacuum. Data for your systems often exists in other databases (e.g. DB 2) or on other networks, and some networks (e.g. IBM SNA) require gateway servers to manage the data transfer. Gateway servers are typically only used when there is a requirement for timely transfer of data (e.g. nightly batch transfer is not acceptable) because the maintenance of a Gateway is much more difficult than the use of file transfer (ftp).
Name Servers
Name servers are used by the network operating system to store a centralized list of host names and I.P. (internet protocol) network addresses. These machines are maintained by a central department. Applications use name servers to look up address based on names so users need not remember network addresses.
Time Servers
Any environment with a large number of clocks (each machine on the network contains an internal clock) is subject to clock "drift". Unless appropriate precautions are taken, the time stamps on these machines become unreliable. Time servers store an accurate clock and can be used to set the clocks on other systems accurately. Be wary of system clock errors. PC clocks can easily be more than 24 hours off (mine is right now), and nobody ever really checks server clocks. I recieved mail last week from someone whose clock was 1 Year off! This can lead to interesting situations where updates to a row occur earlier than the actual insert of that row. Clients should rely on the Sybase getdate() function in preference to their own clock functions to timestamp transactions so least all clients will then have the same clock error.
Workstations / Clients
Workstations are "dataless" machines on the network. This means that they do not contain any volatile data such as user directories and data files. They will contain an operating system (Windows, UNIX) and temporary space. Within a work group, all workstations should be identically configured and should therefore be totally interchangeable. This allows failed workstations to be swapped out with no loss of data or time. Additionally, when a user's workstation is down, that user may continue to work on any other available workstation belonging to the same work group. Note that workstations are typically not backed up.
Application software distribution is a problem that must be worked out in the client server world. While there is some performance penalty (especially on slow networks) to running the application from a network server, release control becomes very difficult if the application software is located on client workstations. It is recommended that new releases of application software be stored on file servers. As an alternative, write simple batch script that starts the application after testing whether the software on the file server is the same as the local software. This script should copy that software locally if a new version is on the file server.
Data Center Support
Server machines should be located in secure locations like data centers. Data centers protect you from having hardware stolen or tampered with, offer an UPS power source, and usually have operations staff to perform tasks like changing tapes. Data centers access should be only for administrative personnel (i.e. they should not be accessible by the cleaning staff). Locating the administrators near the actual hardware helps in new server installations and upgrades, but is not necessary. It is suggested that a virtual console service be set up so administrators can remotely log in to the servers on the console. A virtual console service is very useful when you are at home and type halt instead of reboot.
Data center operators should be trained to reboot machines and to change and store tapes. Written procedures for these actions should be provided. This assists your abilities to provide phone support from home.
Corporate Network Services
Dial In Services
Secure dial in capabilities should be offered to allow support personnel to access systems on a 24x7 basis. The security of this service is of great concern, as this is the point of easiest break in to the network. There are several products (SecureID...) that can be used to provide adequate security on this link.
Internet Gateway
Each administrator should regularly read information provided on the internet and should have access to the web. There are several web sites that offer documentation and support on the net.
The following web sites are recommended:
http://www.tiac.net/users/sqltech/
http://www.sybase.com:80/
http://sybase.pnl.gov:2080/Sybase/.Sybase.html
http://www.powersoft.com
The following newsgroups are recommended:
comp.soft-sys.powerbuilder
comp.databases.sybase
The Sybase FAQ is currently maintained by Pablo Sanchez at the following web address
http://reality.sgi.com/employees/pablo_corp/Sybase_FAQ/index.html
Electronic Mail Services
Electronic mail is a primary method of notifying administrators of errors. A naming schema should be developed for consistent mailings of errors (i.e. do not code the actual recipients' names anywhere). I suggest the setup of a central server with .forward files that will rebroadcast errors to the appropriate administrator. Names could be set up using a scheme like servername_sybase and servername_UNIX.
ROUTINE ADMINISTRATION
Define Administrators Tasks
One of the biggest problems with administration is its ad-hoc nature. There are routine tasks that should be performed daily, but the boring and repetitive nature of these tasks leads to a situation where these tasks are not performed. The repetitive nature of these tasks means that software to automate them is of great interest. Administrators should have a daily, weekly, and monthly checklist of things to do on each of their servers. Note that almost all of the items on this list can be done by software. The following is a sample checklist of things to do:
ROUTINELY (every 5-15 minutes)
NIGHTLY
WEEKLY
MONTHLY
PERIODIC
Automation
Sybase provides a database engine and almost no tools to manage that engine. Rest assured that most of the tasks mentioned above can (and should) be performed automatically by software. This is a clear example of automation reducing the costs of managing your environment. Many administrators have utilities and tools that they have written that automatically perform the functions mentioned above.
ID Administration
Significant administrator time is spent adding and deleting users. The manual nature of the process as defined by Sybase is error prone and expensive. It is recommended that procedures for this be tightened up in multi server environments. Users typically can be defined as falling into one of several categories, and setup of these users should be script driven, not manual (i.e. create an add_developer or add_production_user script to add users). This script should perform the sp_addlogin, sp_adduser, and sp_defaultdb scripts for all servers that are releveant.
Routine Audits
Systems should be audited by other administrators on a routine basis. This kind of auditing ensures that standards are uniformly enforced, and guarantees that the environment is stable.
DISASTER PLANNING
Many systems have moved to the Sybase environment and Sybase is considered to be a reliable production-quality database product. Fundamentally, however, the client server environment has many components, and is more susceptible to failure than is the Mainframe environment. The failure of any component (Client, LAN, Servers, Reference Data, Application) can prevent users from performing their jobs. While every effort will been made to minimize the risks to the environment, the "owners" of the application must assume the business risks of system failure. This section is intended to inform users of the type and magnitude of risks that are faced in the UNIX & Sybase environment and to provide information necessary to the administrators to manage these risks.
Most infrastructure problems can be solved quickly by the infrastructure support team(s). Many of the more common problems (disk failure...) are automatically handled by the systems in such a way that users will be unaware that any problem has occurred. Most of the remaining problems can be solved by rebooting the servers. Problems that require a system reboot can be reasonably common (once every one to two months) and can cause up to 30 minutes of downtime if they occur during a normal business day. The response time for this is mostly related to getting in touch with the appropriate systems administrator although it might take a while to bring up Sybase if a large number of transactions have not been checkpointed, a condition that should only arise due to poor application design. Note that all times listed in this document relate to normal business day operation - if an administrator is not on site, problem resolution will take longer. Less common will be problems that the administrators can not fix with a reboot (corruption, application error...). Most likely, the problem will be minor, and the administrator will recommend that the system limp along until after the application deadlines, at which point the system will be repaired. There will, however, be times when the system is corrupted to the point where it can no longer be used. I can not give an estimate of the frequency of this type of problem. Unrecoverable corruption is a very rare case, but most long time administrators have seen it occur on old (more buggy) versions of the SQL server. The procedures documented herein are an outgrowth of the problems with these earlier versions. Sybase claims that post 4.9.2 versions are stable, and I have not seen unrecoverable corruption with these versions in normal operation. If unrecoverable corruption occurs, there are two methods of recovery - backups and a contingency server setup.
Backups are the normal method of recovery for some databases, but the time required to load larger databases may preclude usage of this method under production conditions. Use of backups in production conditions requires rolling transaction logs forward to an appropriate point prior to the corruption. 5-10 minutes of USER work will be lost if we failover to the contingency server (the time since the last transaction log dump).
A more complicated failover method is to use contingency hardware . A normal contingency setup involves a second server that will contain a version of the server that is 5-10 minutes behind the primary. If a major corruption occurs, the administrator will reboot the contingency server with the same network I.P. address and socket as the primary and users will continue to work. 5-10 minutes of User work will be lost if you failover to the contingency server. Please note that, in the worst case scenario, it is possible that the corruption would have propagated from the primary to the backup server (it depends on what caused the problem). There are several different failover configurations that I have seen in production.
Tools
There are several unix backup products that can help here. One product, RoadRunner, can perform raw partition as well as file system backups. This helps in a contingency situation where the sa’s can just unload the tape on the contingency machine and the raw devices will get created. Then you just need to load database from the most recent copy of the dump. The load database step is needed because there may be data in cache during the backup process, however, in theory, if the site can have the sql server down before doing this type of backup, you will not need to reload.
Contingency
Contingency is defined as the ability to recover from a massive or catastrophic failure within the environment (such as a bomb, massive power failure etc.). Contingency is generally defined as involving "off site" facilities. In the event that no contingency plan is defined, it should be noted that most database servers are fairly portable and can be moved to whatever location the users have been relocated to (provided the systems still exist). Additionally, the administrator should have a "gold copy" tape for each machine, so the system can be reloaded from tape. In the event of a massive or catastrophic disaster, significant downtime should be expected.
Large scale disasters are beyond the scope of this document. The following situations must be dealt with on a company wide basis.
· Massive Data Center Failure or Major Power Outage
· Backbone Network Outage - main communication links between locations are down
Time To Recover From Disasters
There are two factors that impact the time to recover from problems. The first factor is the time that it takes to contact the "correct" support individual. The second factor is the time for that individual to correct the problem. The time taken to contact the correct support individual is determined by the procedures that have been instituted and can be minimized if given appropriate attention by management. Appropriate escalation and contact lists must be discussed with support groups on an application by application basis.
The following problems are within the scope of Sybase administration:
· Problem: LAN Failure
LAN failures typically can be cleared up quickly once identified, although these are often the most difficult problems to isolate. A LAN failure would impact only users on the given LAN, although a Server Ring failure would prevent any users from accessing the database. It is possible to dual route Sybase servers (i.e. have a server listen on multiple interfaces). This is generally done for routing reasons (some users are on Ethernet and some on Token Ring) and not for disaster administration reasons. If an application might need to connect to several network addresses, all should be predefined for the clients. Redefining connections on the clients is a complex procedure and should not be performed after a disaster strikes.
· Problem: Serious Hardware Failure
While many of the system components are redundant (disks, disk controllers, network interface cards), many are not (mother boards, memory...). Generally, when there is a problem of this type, the administrator will not have the tools to quickly isolate the problem, so it is generally advisable to have spare hardware in place to manage this type of disaster. Note that if the configuration is thought out in advance, it is possible to use another system that is normally used (like a development server) as your spare system. In the event of hardware failure, the systems administrator will bring down the system and will bring up the fail over system with the same I.P. address as the original primary. Clients will then be able to log in to the system with no changes to their workstation. Most vendors offer hardware contracts for their servers and disks that will allow replacement of any system component within 24 hours.
Booting with the same I.P. address is important if the clients are not unix based (ie. Windows...). These clients have I.P. entries hardcoded in their ini files, and are not easily modified to handle failover situations.
· Problem: Single Hard Disk Failure
Hard disks are historically the most unreliable component of computer systems because they have many moving parts. On production servers, Sybase is set up to protect user data by mirroring all data so it is stored on more than one drive. In the event of a drive failure, Sybase will continue running with no downtime experienced. The failed component must be replaced at a later date. Refer to the later secion on RAID for other ways to handle disk failures.
· Problem: Failure of Disk Controllers, Failure of Network Interface
Most systems have a surplus of disk controllers and network interface cards because they are useful to improve performance. If one of these fails, the administrator will need to re configure the system software but this can be done fairly quickly depending on the problem. This type of problem will be extremely rare.
· Problem: Serious Database Problem
It is possible for data to be corrupted due to previously undiscovered Sybase software bugs or other unforeseen circumstances. Often a simple reboot of the server will clear up the problem (this usually takes a few minutes but actual recovery time is based on the number of uncommitted transactions). In the event that the problem can not be solved within a short time, fail over should be implemented.
· Problem: Application Problem
Applications can also encounter serious problems with the consistency of their data. Each application group is responsible for investigating their potential for problems and for instituting appropriate recovery procedures.
Failover Configurations
This section describes method of disaster recovery that involve varying degrees of redundancy of hardware. Note that a disaster that is significant enough to warrant fail over to a backup system is an extremely rare event. Most Sybase systems have NO fail over system. The configurations being proposed are solutions where there is a high cost a disaster. There are several levels of this type of redundancy as follows:
A second system is defined to be used in the event of a failure. No data is kept on internal disks besides the OS, so the new system can be swapped in easily. The second system should look somewhat like the primary, although it need not have all the horsepower. In a disaster situation, the physical machines are swapped and the systems rebooted.
It is possible to share disks between systems using SCSI A/B switches. With two A/B switches, a system can be designed to easily switch between primary and secondary (you just halt both machines, switch both A/B switches, and reboot). The secondary is then the primary. For this configuration to work, it is essential that no data be maintained on internal disks. This does not protect against database or application corruption.

This configuration uses a second system that is identical to the primary for a failover system.
UNIX Failover
There are several products on the market that manage the UNIX end of things and that will automatically failover if the system detects heartbeat trouble. If this type of solution is desired, it is advisable to purchase a product instead of developing your own software. The systems that I know good things about are Qualix Firstwatch and Heartbeat. Openvision should be avoided (can I say that) as it is overly complex. This solution should be analyzed carefully before installation as a Sybase solution.
Simple Sybase Solution
A Sybase specific solution involves using some mechanism to copy dumps and transaction logs to the second server and then to loading them into a second server. This second server is only a few minutes behind the primary and can be failed over fairly easily. A reboot and manual change of the hosts I.P. address is necessary.
Each Sybase production server will periodically (every 5-10 minutes) dump their transaction logs to disk and then copy (ftp) them to their fail over machine which will load the transaction logs. This configuration is typically (henceforth known as athe warm standby. A hot backup is the setup where one machine automatically goes live when there is a problem on the primary. In the warm standby situation, hot backup) which will load the transaction logs. n Nightly dumps will be transferred using the same mechanism. As the fail over system will only load transaction logs, it will be lightly loaded.
In the event of systems failure, the warm standbyhot backup is rebooted with the same I.P. address of the primary system and clients will simply need to reconnect to their application to access data. Note that because the fail over system is rebooted with the new I.P. address, no changes need be made to clients for during the fail over scenario. It is expected that the administrator will repair the primary production server and will schedule downtime for maintenance after a disaster is encountered.
This configuration will result in 5-10 minutes of data being lost if the fail over server is activated. The application team is responsible for recovering this data.
Be aware that if the administrator does not create databases with the same dbid’s on both primary and secondary cross database joins will not work correctly.
Be also aware that if the logins do not have the same suids the users permissions in the backup can differ from the primary. Login administration can be a difficult process. It is recommended that a tool be used to keep suids for given logins consistent across machines.
The backup system need not have disk mirroring.
For large systems, the backup system can be used to run dbcc’s as it will be lightly loaded.
Saving Money
If multiple systems exist that you wish to mirror in this way, it may not be necessary to purchase an exact mirror of every system. As the secondary system will be lightly loaded, it is possible that this system serve to backup multiple systems. You could thus have one system serve as a backup for multiple primary systems. If more than one primary has gone down (this can not happen in real life - but someone will ask the question), the secondary system will serve as all the primaries that have failed - it will just be a bit slow. Use of one secondary to manage multiple primaries is a good cost saving move, but is probably impractical due to business reasons.

This configuration increases the work required of the administrator if a fail over occurs. The backup server should still load data from the other production servers, but it obviously can not do this on its old I.P. address unless it is configured with multiple I.P. addresses.
SUPPORT
Appropriate Contact & Escalation Lists
It is very important that users know at all times how to contact support personnel. Escalation procedures in case the support personnel can not be contacted should also be well defined. To handle vacation coverage, automated messaging systems (voice mail) or help lines should be set up by each individual administrator. It is essential that all support staff should have dial in capabilities from their homes. Since few administrators do not have PC’s, this is generally not a problem. Administrator availability and responsiveness is very important. On one of my contracts, the perception that administrators were not available on weekends during a critical development phase (actually they did not have dial in access so it did take over an hour to get a response - they needed to drive in to solve any problems), led a key developer to hack the sa password on one of the development systems to allow him to reboot the server as needed. He was caught and fired, a stupid loss that set the whole project back over a month (what a dumb situation...).
Pagers
Pagers are generally the preferred method of contacting administrators. I have found pager codes are very useful. After the caller types in his phone number, he hits the ‘*’ key and then a code to indicate the message’s severity. There are quite a variety of Alpha numeric pagers on the market too. Many companies have set up modem based systems where monitoring jobs can send alpha messages to the adminsitrators pagers. A sample of a paging system for contact of administrators follows:
|
Code |
When To Use |
Response Time |
|
*1 |
Serious Emergency. Production down |
ASAP. Administrator will get of his train at the next stop and will call back. |
|
*2 |
Emergency |
10 Minute Response. Administrator will call back when they next can. |
|
*3 |
Call Back |
1 Hour Response. Administrator will call back when convenient or when their meeting ends |
|
*4 |
Just Saying Hi |
Call back whenever |
Scheduled Downtime
All production applications should have procedures to deal with scheduled downtime. It is recommended that voice messaging phone lists and / or electronic mail lists be set up each application, and that a single individual be identified from the application team as the prime point of contact for problems. In the event of system bugs, scheduled downtime, or any other important message, the phone/mail lists can be used to easily distribute the appropriate information.
Outage Reports
Any time a production server outage disrupts user activity, an outage report should be filed with the administrator's manager as well as with the appropriate business contact. I have usually used electronic mail. This outage report should describe the outage and should describe corrective actions. This report should be filed within 24 hours of the outage.
SERVER SETUP
Server setup is important, because once a system has been installed, it is very difficult to change it. As noted elsewhere, the most important aspect of setting up a server is that it looks like all other servers.
One very good method to install new machines is to create a "vanilla" machine (which probably sits on somebody’s desk or in a lab). When a new machine is desired, the machine is opened and a second internal drive installed. The administrator then copies the whole UNIX configuration to that second internal disk drive that is then installed in the new machine. The new machine is never booted from CD. This saves you the cost of extra CD drives, guarantees consistency, and allows quick swapping of the reconfigured spare drive in to any machines that fail. This works well for small server configurations, but large multi processor systems must still be configured by hand.
Another excellent method for guaranteeing consistency is to have a single administrator responsible for configuring ALL systems in the company. That individual will quickly learn his job well and it will save time (and money) as well as forcing all systems to be identical.
The following are guidelines to consistently setting up your systems.
How To Purchase A Server
Hardware price / performance today have fallen to the point where it matters little what type hardware you purchase for your systems. The systems are ALL good. Low end ($20,000 excluding disks) server machines can easily handle a 100 user decision support application or a 50 user OLTP application. The key to purchasing your server is consistency with other servers you have. Leveraging your backup software is more cost effective than purchasing a cheaper machine and then spending days rewriting your software.
The first step to purchasing a new system is to establish the application characteristics. Is the application read oriented or write oriented. What are the applications busy hours. What will its typical user base be. Once these are all defined, find a machine that will suit the needs of your application in terms of power. As mentioned above, the actual hardware is of limited importance unless you are talking about a very large application.
Once the system hardware is identified, you need to decide how much disk and RAM you will need. UNIX swap space should be configured to at least 2x RAM. Generally the internal UNIX disk will be sufficient to manage the UNIX end of the application. Many administrators do not like to purchase an internal disk because of recovery issues. These administrators buy at least a GB disk for the operating system.
Estimate the size of the application database. Add about 30 MB for system databases. Add the size of tempdb. All databases except tempdb need to be mirrored and should have space allocated for database dumps. I generally like tempdb to be at least 100 MB.
|
Usage |
Multiplier |
Example |
Total |
|
SYBASE SYSTEM DB’s |
3x (mirrored, dump) |
30 |
90 |
|
SYBASE tempdb |
1x |
100 |
100 |
|
Sybase Application |
3x (mirrored,dump) |
100 |
300 |
Typically, it is a good idea to have significant spare hard disk capacity in your application databases (50%), because users generally prefer to deal with issues like space once instead of every six months. Purchasing disk drives from the hardware vendor is generally NOT a cost effective solution. It is suggested that a reputable disk subsystem vendor (like Box-Hill) be contacted in addition to the hardware vendors. Hardware vendors disks tend to be old technology and tend to be overpriced.
Simple space analysis is only one part of the story as far as purchasing disks. As each database will require several devices which (which will be mirrored), it is important not to run out of disk partitions for your disks. Under system 10, you will need about 10 disk partitions just to run a basic server (master, sybsecurity, sybsystemprocs, data, log - everything mirrored) and will probably need an additional partition for tempdb. While tempdb can be placed on master (and often is), there are some minor backup and recovery issues here (dealing with the time to run them) and buildmaster is slower on a 300 MB device than on a 40 MB device. Be aware that with the large disks available today, it is possible to have adequate disk space, but not have an adequate number of available raw partitions.
All UNIX servers that I have seen have at least 64 MB of RAM. Sybase does, however run fine on systems with only 32 MB. Large applications should purchase 128 or 256 MB RAM. Additional RAM will help because it gives the server the ability to cache more of the database which will speed IO throughput.. Most smaller servers have their own tape drive, although larger systems can have dedicated external tape stackers. A CD player is NOT necessary for these systems even though all software is distributed on CD. Software installation is a very rare process, and should be done out of a central server (except for OS distribution). It is sufficient to use a common CD drive that is located in a secure location. It is useful to track who has this CD drive, as it will not be available when you most need it. You can never have too many tapes. Tapes are very expensive, and they should be stored in a secure location. The minute a new server is delivered, the administrator should mark it with an indelible magic marker. This prevents systems from "vanishing" into thin air. One final note, it is unnecessary to purchase a 19" Color monitor for a console. A cheap green WYSE terminal will do. It is the console, and it does not need to run X windows.
The most important (and time consuming) aspects of installing a new system are the acquisition of space (and power) in the data center and networking (cabling). These tend to be overlooked items and you should do this well in advance of the server arrival. It is also important to acquire the I.P. address of your server before the system arrives.
One useful tool when you install new servers is some form of label maker. This is very useful for labeling machine names and for marking disk drives. This can be very useful in disaster recovery.
Production vs. UAT vs. Development
There should be separate production and development servers, each on their own machine. When financial constraints dictate, separate SQL servers on the same machine can be acceptable, but this solution should only be used as a last resort. Proper system setup involves totally separating production and development / test environments.
In larger environments, the UAT (staging, quality assurance, acceptance - whatever you wish to call it) environment should logically reflect the production environment as closely as possible. This system need not be mirrored as is production, and need not have the same horsepower in it's CPU. This means that you could use a SparcStation 20 as a UAT server for a SparcCenter 1000 with limited impact. Mirroring and CPU horsepower are generally transparent to the application.
How to Set Up UNIX
If you intend to install more than a few UNIX servers, it is important to write down any options used to set up the systems. This way, all your systems will be installed consistently. If one system has /usr/games, they all should. The directory structure of your UNIX servers should also be predefined and consistent. Under SunOS 4.1.3, it is recommended that /var, /usr, and /tmp be on their own disk partitions. Any local information should also be on its own partition (/home or whatever), and it is generally good practice to put the directory in which you will put Sybase dumps (/dumps) on its own device. It is also recommended that UNIX swap space be at least twice the RAMRam for each machine. Depending on the number of external disk drives, it is generally NOT a good idea to put much other than the operating system on your internal disk drives (do not buy internal drives if you do not need to).
UNIX Directory Structure
It is recommended that a standard file system be used for all Sybase servers so administrators can know where tools and critical files are in the case of a disaster. One way to do this is to create a Tool Server that is automounted in an appropriate location. This tools server would be a file server that contains all the nice utility programs that administrators would like to have on their system, but would contain nothing essential to the functioning of the server so the unavailability of this tool server will not break your Sybase server. If a separate tool server is not deemed a good idea, a staging area could be set up on a central server, and tools copied to the Sybase servers. A standard file system allows administrators to fill in for each other and makes support easier and more cost effective. This is both a technical and business issue. As an example, this directory tree should contain directories:
|
DIRECTORY |
PURPOSE |
|
.../logdumps |
A directory to place Sybase transaction log dumps. This should be located near the .../dumps directory. |
|
.../dumps |
A directory to place Sybase dumps should be local to each machine. This should be located near the .../logdumps directory. |
|
.../Sybase |
The Sybase directory should be either placed on each system or automounted. No user or administrator files should be placed in this directory.. Nothing besides the original sybase software (plus the stuff in the install directory) should reside here. |
|
.../tools/security |
Security Auditing tools. Directory should be read only. |
|
.../dbcc |
Directory for dbcc output |
|
.../tools/Sybase |
Set of tools useful for Sybase monitoring. Example tools should include perl, gzip, top, ksh, cops, ntp, gcc, sybbooks... |
|
../errorlogs |
Error Logs |
|
.../audits |
A directory for system audit reports should be rw mounted on each system. |
Note that .../ indicate that the location of the above directories is unimportant. The existence of the directories in a consistent location is important, but the actual location is not.
An example file system is as follows:

UNIX File Permissions
It is important that the Sybase directory on the UNIX system have no writable files by group other. It is important that all Sybase raw partitions (and UNIX devices if these are used) be rwx by the Sybase account only. There are several methods to break Sybase security in (Pre System 10) if this is not the case.
Common UNIX Files
A set of common UNIX files is very useful for your Sybase servers.
printcap: It is nice to have a few printers common to all UNIX machines. This allows you to print errorlogs from any Sybase server you have access to.
interfaces: The Sybase interfaces file should have all necessary servers in it. A master interfaces file would allow this transparently. Be aware, some construe an interfaces file with too many servers in it to be a security hole.
Startup
The Sybase server and all necessary processes should start when the system is booted. Under SunOS 4.1.3, this is managed from the /etc/rc.local file.
Cron
The UNIX Cron scheduler is usually used to manage nightly backups and transaction log dumps if no other scheduler is available. If this is the case, it is recommended that two shell programs be written: one for the nightly backups and dbcc’s, the other for the transaction log dumps. These programs should not be writable by other than the owner. Any jobs written to run out of cron do not have their environment pre-set, and variables like $SYBASE should be set explicitly.
Disk Mirroring
Some form of disk mirroring is essential in all production environments. In smaller Sybase environments, using Sybase’s software mirroring is adequate, but software mirroring has a performance impact, and hardware mirroring should be used for any systems in which I/O performance is a bottleneck. As a note, the timing numbers that I have seen indicate that mirroring does NOT speed read access as the manual says it does. The availability of cheap hardware mirroring (RAID ) subsystems makes use of RAID a very nice thing if you have the budget. Mirroring in hardware is much faster than mirroring in software. Disk Striping is something that should also be implemented as it spreads disk io across disks with little effort. Disk striping can be implemented using by a product like Veritas or Disk Array Plus. Use of Sybase segments to stripe disks can offer significant performance benefits, but they truly can become a maintenance headache and use of them has been phased out in most large Sybase shops. If you are using Sybase segments, be sure never to map multiple segments to the same device. If you map multiple segments to a device, you will be unable to tell how full the segments are, and you will run out of space eventually.
Note that the 4.9 manual stated that disk mirroring would speed disk access. This has been acknowleged as a documentation bug by sybase.
RAID
Raid is the way to tune most modern Sybase systems (especially if you have money). The following are my notes on Raid. I dont know much about how to compare RAID hardware, but the following are general concepts.. Please do significant testing on different RAID vendors. The following are my notes on raid:
1) Raid 5 may not always be the best. Just saw some numbers that showed 1 + 3 to be better than 5.
2) Do not be fooled by the packaging, RAID technology needs good underlying disk and controller technology. Devices like the sparc storage array use very old technology disks and therefore are not fast RAID arrays (although recent numbers bely this).
3) Raid enhances performance (obviously), but your raid box can add single points of failure. Some raid boxes have 1 power supply for the whole chasisbox. What happens if it goes. How many controllers do you have? What kind of connection to the box? Are the disks hot swapable? Think your hardware through.
4) Striping across disks is key to performance out of sybase. Just using hardware mirroring does not do the same thing. If you cant stripe, dont raid. A good stripe manager is a good plan.
5) Stripe sizes matter a *LOT*. Talk with your vendor and have them suggest some values (16K is the value for SUN suggested). You probably need to tune here. Look at your disk montoring software.
6) I am not sure how to compare "raid boxes". I think that if you have a good number of controllers (i suggest 1 controller per 1-2 disks), you should just be able to simply compare disk seek times... but i am probably wrong here. 10 controllers with 30x1GB disks with striping would probably be better than 10 controllers & 10x3GB disks.
7) I believe that RAID and striping make disk based sybase tuning irrelevant. This means that there is no need to spread data, log, and indexes out across devices... This is a big deal to us sybase guys. Think about it though... If you are striping, then your average disk usage will be fairly similar between disks. There will be no hot spots. If you have no hot spots, you dont need to tune, especially if there are a lot of users.
8) Monitor your raid configurations in production to be sure that there are no hot spots. Hot spots should not exist and point #7 is only relevant if there are no hot spots.
9) I think the best way to lay out your RAID disks is in 3 sections. Section 1 is data you wish mirrored and is spread over half your controllers. Section 2 is the mirror of section 1 and is spread over the other half. The final section is your unmirrored data, which is spread over all your controllers. This is ok, assuming that access to the unmirrored data is not correlated to the mirrored data.
10) Do keep at least 2 sybase devices (data and log). You do not do this for performance (see point 7), but because you want a good logical layout in case you ever want to move to other servers or hardware. A rumor that I recently heard was that sybase performs better when data is spread over several devices instead of just one. I dont know why this would be the case (a disk address is a disk address) but if you have some information here, I would appreciate it.
Sybase Devices
Raw Partitions Vs UNIX Files (and ramdisks)
Production Sybase devices should be always placed on raw partitions, rather than on mounted file systems due to recovery issues. UNIX files are write buffered while raw devices are not, which means that if the power cuts out, it is possible that information is not written to disk and is therefore lost if Sybase is using UNIX Files. tempdb can be placed on mounted file systems (which are faster than raw partitions) or on a ramdisk because it does not matter if data is lost there. [ NOTE: this does not apply to OS like AIX where ASYNC I/O is applied to both block and character devices.]
On Sun boxes, UNIX file writes are done using "synced" IO so heavily used OLTP systems where the Unix file buffer is not sized big enough will have their buffers fill up and basically halt further writes since the operation is syncronous.
Under 4.9 and System 10 (but not with System 11), Sybase internally queues I/O requests in order of device number (vdevno). This means that the master device (by default vdevno 0) gets serviced first. If you create a separate device for tempdb it should have a low device number (like 3, which is the first one available in system 10). It might also pay to spread tempdb over multiple devices (untested), which will reduce possible contention. This is only of interest if tempdb is heavily used .
Location Of Devices
It is important to spread data and log for each database across multiple disks and controllers. The log device logs all changes to the database and so will be in direct IO contention with changes to the actual data. This means that if you have data and log on the same disk, the disk will write the log and then write the data. If they are on separate disks, the operations will be performed in parallel. Note that simply placing information across disks is not enough, and the administrator must also analyze the disk configuration with respect to the disk controllers. While disk controllers have a buffer so they are not truly serial (as are disks), this buffer can fill up at which point the disk controller would be the bottleneck.
A very simple method to lay out devices on disks is to use a simple round robin approach. Larger systems should probably purchase a disk / volume manager to spread data evenly. The round robin approach goes something like this: start with master and go through all the other devices needed (and their mirrors). Simply assign these devices to disks in order. Creativity is useful here. Personally, I prefer systems with an even number of disks because they are simple to understand (even if they are slightly sub optimal performers). Mirrors for devices on disk A are always placed on disk B. The following is an example of a round robin with 3 databases and 4 disks. Disk 1 is always mirrored to disk 3 and 2 is always mirrored to 4.
|
USE |
DISK |
MIRROR |
|
master , model, sybsystemprocs |
disk 1 |
disk 3 |
|
tempdb |
disk 2 |
N.A. |
|
database 1 data |
disk 1 |
disk 3 |
|
database 1 log |
disk 2 |
disk 4 |
|
database 2 data |
disk 1 |
disk 3 |
|
database 2 log |
disk 2 |
disk 4 |
|
database 3 data |
disk 1 |
disk 3 |
|
database 3 log |
disk 2 |
disk 4 |
Warning: once devices are set up, it is difficult to move them around. Extra partitions on these disks would be used to store UNIX disk dumps.
Mirroring
All production devices (including master) should be mirrored (except the tempdb device). The server should be started with the -r option so that the master mirror is damaged, the server will start (it will not if you do not use -r). One problem with mirroring is the fact that failure of a mirror is not automatically recognized by administrators. Be sure to monitor your errorlogs or system tables periodically to detect broken mirrors.
Tempdb
Tempdb need never be mirrored because it is always recreated on server restart. Mirroring tempdb would just slow it down. As tempdb is recreated on startup, tempdb should be placed on a UNIX (cooked) file system instead of on a raw partition for performance reasons. This works fine. As a test, take your server and halt it. Truncate the UNIX file that contains tempdb. Restart.... tempdb is recreated.
Many administrators remove the 1st 2 MB of tempdb from the master device (this is by no means essential and most administrators do not). If you leave the first 2MB of tempdb on master, you will not see much of the performance gain from placement of tempdb on a ramdisk or on a UNIX file system because Sybase will use the 2MB of tempdb in master before using the remainder of tempdb. The Sybase installation procedure will put the 2 MB of tempdb on the master device by default. tempdb is a heavily used database (used in all joins). If tempdb and master are both on the same disk, you could have disk contention. On the other hand, the master device (until System 11) is slightly faster than other devices.
If you have a big budget, many large systems will have tempdb placed on faster media (ramdisk). A ramdisk provides a fairly significant overall system performance boost if tempdb is placed on it. As a rule of thumb, I expect about 20% performance gain from placement of tempdb on a ramdisk (this was the result of one application I comparison tested).
Databases Over 2 GB
Backup and recovery becomes very complicated with databases over 2 GB in size because you can not dump these databases to file systems. In 4.9 or earlier systems it is good practice to avoid databases larger than 2GBIT. IS GOOD PRACTICE TO TRY AND AVOID DATABASES LARGER THAN 2 GB. Any Sybase Databases that are larger than 2 GB must be backed up directly to tape (in pre system 10 systems) or dumped to striped disk devices (post system 10). The time it will take to recover any database this size will severely impact your ability to restore in any disaster situation.
There are also problems with the time it takes to DBCC your system once you exceed a few GB in size. Generally it will not be possible to dbcc everything every night in this case. One solution is to dbcc different databases every night. If you have system 10, Yyou can use dbcc tablealloc and dbcc indexalloc instead of dbcc checkalloc to check individual tables within your database on a table by table basis. This is analogous to the relationship of dbcc checktable to dbcc checkdb. You could schedule different tables on different nights so that over a few days you check the whole database.
One method to solve the very large database problem is to split out the larger tables into their own databases and use views to access data in these databases. Thus if your database has 5 tables each 1 GB in size, you would create 6 databases. The main database would be small and contain the 5 views. The other databases would contain the data (one table per database). This solution works well in some cases, but it can lead to space usage, permission, and recovery issues that the administrator should think through. One of the many databases can fill up unless growth is well planned. Any recovery of a database can lead to referential integrity issues that must be resolved.
Other Sybase Setup
Be sure to reset the memory and user connections variables in the Sybase server using sp_configure. Forgetting to reset these parameters will not help your performance.
Tempdb is set by default to an absurdly small size. Tempdb should be increased immediately on server startup.
I highly recommend that all administrators install the Extended Sybase Stored Procedure library. It is free and can be downloaded from: http://www.tiac.net/users/sqltech/
Generally, log space should be between 20 and 25% of data space. All databases should separate data and log. Each Sybase database should have two dump devices predefined with a consistent name. One will be for log dumps <database>_logdump and one for full dumps <database>_dbdump. A test of these dump devices is essential when they are set up. Invariably, the path will be typed incorrectly or Sybase will not be able to write into this directory and there is no way to tell from casual observance that this happened.
Networking Setup
Consistent use of broadcast and netmasks is important to having your systems function well.
Never install a server on a new I.P. address unless you are SURE that no other system is using that address. After you check with the appropriate individuals, ping it to be sure.
I have found it good practice to have a networking person or really competent UNIX administrator check my work on the network side. UNIX networking is one of the more cryptic aspects of system configuration (have you ever successfully mucked with uucp... yuch...)
NAMING STANDARDS
Sybase supports object names up to 30 characters long so there is no particular need to use abbreviations.
UNIX
Hostname
Systems are typically named either according to some standard or according to some theme (planets, cars, states...). The naming standards I have seen usually involve concatenating business unit, location, role, and a number. For example, the name slnydev1 could represent the New York sales development #1 server and slsfprod2 could represent the San Francisco sales production #2 server. A standard of this sort is usually only used in larger installations because most administrators prefer names that are fun so long as the number of servers is small. A final standard that can be used in small shops is simply to name the system after what it is doing (sales, dev1). System names are in small letters.
Aliasing of hostnames can be used to have multiple names for a given system. The internal name is only used by scripts on that machine, so if your server is named cheqprod1 (Chicago production equity server), the users might simply refer to it as equity1. Remember that most users of Sybase applications never see the system name except in their PC setup (which an administrator will probably do) so the name is of limited importance.
If a failover configuration is available it is important to be able to differentiate between the primary (active) system and the secondary (failover) system. Names will vary depending on the failover software used. If your software guarantees that a certain name will always be primary, use something like sysa and sysa_backup. If you cant guarantee this, do not name the servers in this manner. It is very confusing when the server saturn_b is primary instead of saturn_a, and can lead to errors.
Kernel
A recommended naming convention for kernel’s is to concatenate keywords with the plus sign. For example, SYBASE+DBE or SYBASE.
Sybase Setup
Server Names
Naming requirements of Servers are similar to names of systems. Server names should be in capital letters. One server machines, the Server should be the same name as the System (in capital letters).
All servers should have their name set so @@servername works.
Sybase Disk Devices
Sybase disks should have names that help administrators. Note that it is difficult to change disk device names once they have been set. If the administrator can guarantee disk stability (i.e. usually), names like sd4b or c1d1t2s3_log are great. The other standard is to name the devices with names like datadevice and logdevice. The purpose here is to help the administrator spread data and logs between disk devices easily.
Devices placed on UNIX files should never be placed in the Sybase home directory. This directory should contain only Sybase provided files.
Sybase Dump Devices
There are two types of database dump devices: full and transaction log. The naming standard that I have seen commonly used is <database>_dbdump for full dump devices and <database>_logdump for transaction log devices. I recommend creation of both of these devices for each database created even if they will not be used. If the devices are created, it is an easy matter to have a script that autoconfigures itself to dump all databases (except model and tempdb) every night and another that autoconfigures itself to dump all transaction logs of databases that have truncate log on checkpoint off.
The dump devices should point to files that the administrator can find easily. One standard that works well is to have a directory /dumps (in the root directory). This directory can be a symbolic link to whichever disk you wish the disks to be placed on. In that directory should be two directories for each server. One will be named <SERVER>_dumps and the other <SERVER>_logs.
Login Names
My personal favorite naming standard for logins is <first letter of first name><last name>. This seems intuitively better than names having the opposite order and is clearly superior to names that have no intrinsic relation to the actual user (admin01, mars). Another naming convention is <lastname>_<firstname>. This convention is a bit wordy but does alleviates some confusion.
All accounts should be named for the primary user. Use of group accounts violates a basic tenant of security (be able to identify who does something). While it is work to add a large number of users, in the long run you will be glad you did it.
While much has been discussed about cases where multiple users have the same name (two John Smiths...), these cases are actually quite rare in real life. If such a situation arises, simply change one or both of the names by adding a number to it (i.e. jsmith2).
Database Names
Database names should usually be obvious based on the type of data in that database. As with all Sybase names, 30 characters are allowed, and long names are preferred. Database names should be in lower case and should not include the word database or the initials db. One note, sa should never be the only dba for a database. This not only implies that the sa password is widely available, but that you will never be able to delete that database because you will forget why you set it up and will never be able to contact anybody to remind you (it happens). The use of a group account (salesdbo...) as dbo will lead to the same problem (who should you contact in a year when the database finally runs out of space...
Sybase DDL
The following suggestions are made regarding Sybase DDL names.
Table
Tables should be named with the common English noun that represents a row of data. Table names should be singular and lower case (i.e. use person instead of people). The table name should not contain the words file or table. Many organizations come up with a list of synonyms for keywords (i.e.. use yr for year and qty for quantity). Ending the name of the table with _lu is a good idea when dealing with lookup tables (ie. color_lu, type_lu).
Trigger
Triggers should be named <tablename>_<action>_trigger, where <action> represents ins, del or upd.
Views
View names should be defined exactly like table names, but should be prefixed by a v_.
Stored Procedures
Stored procedure names should start with a prefix (like ap_ for application procedure, pp_ for Powerbuilder procedure, sp_ for system procedure...) that allows broad categorization of the procedures. I have found names with prefixes easier to read than those with suffixes (like _proc). Plurals can be used in procedure names if it is relevant. For example, get_row and get_rows mean different things intuitively. One recommended grammar is <prefix>_<verb>_<adjective>_<table> that leads to procedures like ap_get_all_orders, ap_upd_instruction, and ap_get_tall_people.
It is useful to break up procedures that are used by front ends like Powerbuilder from those with a more general purpose. Stored procedures that are used closely with front ends typically belong to one screen only and will be changed when that screen changes. If the developer knows the procedure is not used elsewhere, the change will not impact the environment as severely. Add the screen name to the grammar for front end stored procedures. A good grammar here is <prefix>_<screen>_<adjective>_<noun>, which will lead to names like pp_ord_get_ord, which may look a bit redundant, but actually is quite clear in real life (a Powerbuilder procedure that gets orders and is called from the orders screen).
Columns
Columns should be named with the common English noun representing the data element. The units of the element should be discernible from the name if necessary. Column names should never be reused for another purpose within the same database (this is very confusing) and all columns that have the same name should have the same definition (char(10) not null) to prevent errors when joining data. It should be assumed that any columns named xxx can be directly compared with any other columns named xxx.
Other DDL
User datatypes should be in lower case and have a tp_ prefix.
User rules should be in lower case and have a rl_ prefix.
User defaults should be in lower case and have a df_ prefix.
CHANGE MANAGEMENT
Systems Control
The only way to explain this is that developers MUST not have access to production systems. This is a cardinal sin. This means that developers have their own system, and production users have their own system. If a unit test or staging system is desired, it should fall under the realm of the developers.
The DDL for each server should be saved separately. This becomes more complicated if a system that extracts DDL directly from the modeling tool (Erwin...) into the database is used. It is recommended that DDL be exported to files in all production and staging systems.
Source code control is a big issue with database systems. It becomes much more complicated when multiple versions of the same database are placed in different locations. The only way to not get your database layouts fouled up is to develop a plan for application upgrades to each database. It is recommended that the development database always be in "sync" with the most current version of the model. The test and production databases should be upgraded only when developers request a turnover..
Use of a source code control system is essential in multi user development (especially front end development). Many tools such as Powerbuilder have source code control built in (Bravo!). Source code control for the DDL (tables and stored procedures) needs to be standardized, but it is not essential that a formal tool be used as it is rare that multiple developers will work on either the model or on a single procedure. Usually, the author of each procedure will be in charge of making changes to that procedure, and a rigorous procedure is not necessary. Courtesy, commenting changes, and making nightly backups of the source directories is essential to an ad hoc source code control system.
Documentation of procedures is one of the overlooked concepts. I have had good luck with a word document template in which a brief summary of each procedure (name, author, what it does, parameters...) is rigorously maintained. Of course, the DDL files need also be updated, so many people try to define an automated mechanism to document their procedures. These mechanisms are rarely used and I have found that NOTHING beats a nicely formatted (bold, multiple fonts) manual page for readability. As a consultant, I ALWAYS turn over quality documentation... It is worth the time to do the documenting twice.
One final note, a representative of the users and all developers should be notified of turnovers. Lack of developer communication has resulted in numerous problems that should never have occurred (i.e. developer a does not know that the code has changed and is trying to fix a problem). Electronic mail regarding the fact that the turnover occurred and mentioning exactly what changed is important.
Control of Clients - Software Distribution Issues
It is important to manage client software distribution. This can be a thorny issue when you have hundreds of clients for an application. It is suggested that large applications set up some automatic software distribution system. For performance reasons, clients generally run applications off of the local hard drive. Some method of checking the applications LAN server and copying new software if there has been an upgrade is generally advisable and will save you from a version control nightmare.
Use of Views To Aide Change Management
One good method of managing versions is through the use of views. Basically, each table will be represented by a view that selects all columns from that table. The view for the table xxx would be named v_xxx_01. When an underlying table is changed, all the views are changed so that they function identically. For example if xxx had the column y and z, the initial view would be select y,z from xxx and would be named v_xxx_01. If column p was added and column z changed name to t, a new view v_xxx_02 would be added (select x,t,p from xxx) and v_xxx_01 would be changed (select y,z=t from xxx). Application logic would not need to change at all.
PERFORMANCE TUNING
Hardware Tuning
Generally tuning systems for optimal performance is a simple matter. If you spend more money, you get more performance. There are, however, methods of cost effectively tuning for performance. In order of importance to overall system throughput, I would:
Sybase Tuning
The first trick to tuning your system is to identify the essential characteristics of that system and to identify potential bottlenecks. Consider if your system is read or write intensive. You can often add indexing to speed things up on read intensive tables. The same is NOT true for write intensive tables. The next thing to do is to analyze your system to decide where to tune. Tuning an IO subsystem is only useful if your system is IO limited. It is best to decide which
Spend some time optimizing disk device layout. It is an easy way to increase your performance.
Avoid cursor processing unless necessary. It is generally faster to process batches than row by row because that is how the system is tuned. Loops are especially slow.
Avoid long indexes or unindexed tables that have more than a few rows. If there are any null fields in an indexes add 5 characters to the index length plus one per field that allows nulls. Generally null fields in an index are not desirable due to data distribution issues.
Make sure your statistics are updated with information that reflects peak business usage. If you have a table that starts each day with 0 rows and ends each day with 2000 rows, the statistics should be updated somewhere between 1000 and 2000 rows.
Tables that are constantly updated or inserted into can be come very fragmented. Periodically rebuilding indexes to these tables can save space and increase performance. This is very important when the index’s first field is either a datetime or an identity field (ie. you are always adding to the end of the table)
Make sure that "transactions" do not linger in the database longer than is necessary. Any application using Sybase transactions should perform as fast as possible. Transactions hold locks, which can block other users and slow the system dramatically.
Triggers should only be used in rare circumstances (specific RI checks, auditing). It is preferable to restrict access to the database to stored procedures instead of using triggers. Remember, triggers are NOT compiled, and are therefore slow. They are also counter intuitive (something happens when you are not looking) and make debugging a nightmare. Finally, triggers are by definition a transaction so long, complex, triggers can easily cause deadlocking. When deadlocking is a serious problem, the first thing I look at is trigger design.
Use showplan and statistics io as much as possible on slow queries. Look for table scans and deferred updates (tmp tables being used).
The optimizer can only handle 4 tables at a time (at least untilunder Pre system 110, but I do not think this has changed). This means that a 5 table join is performed as search for a gooda four table join, with the results being joined "somehow" to the remaining tables. The optimizer often will scan tables when you are over the 5 table limit and it is to the 5th table. As the optimizer will use a temporary table, it is preferable for you to explicitly create the temporary table.
SECURITY
Three levels of security are required to prevent unauthorized access to applications and data. The first level prevents unauthorized access to the network. The second level prevents normal users from viewing data that they should not have access to, and prevents them from damaging the systems. The final level deals with knowledgeable users (like an administrator) who know how to circumvent procedures. Be aware that the Client Server environment is intrinsically insecure and knowledgeable hackers can break security (and will be able to do so until some network encryption like kerberos is widely available). All you can do is be aware of the risks and watch for odd occurrences.
Unauthorized access to the network is prevented by restricting dial in access to the network. There are a variety of products that secure the network.
Attacks by individuals who have access to the premises are much harder to protect against. Password security is a good front line method of securing data on systems basic security procedures should foil most unauthorized access attempts, provided users follow normal security procedures such as logging out at night, using passwords that are hard to crack, locking their terminals when they are unattended for periods of time, and not writing their passwords where anybody can find them. Any system that someone might wish to break into must have additional security customized on an application by application basis. Sybase security is very flexible, and sensitive parts of the database can be restricted to a given set of users at the database, table, or even column level. Setting up a Sybase security system (with groups, grants and revokes) can prevent damage by normal users who have managed to circumvent your front line defenses.
Both UNIX and Sybase have the concept of a trusted user who can perform maintenance and who can fix the system if it breaks. Normally access to these "trusted accounts" is restricted to a small group of individuals who act as administrators. The UNIX / Sybase "trusted accounts" are as follows:
|
Account Name |
Account Description |
Frequency |
|
root |
UNIX administrator. Required for maintenance |
1 per machine |
|
Sybase sa |
Sybase administrator. Required for maintenance |
1 per Sybase server |
|
dba |
database administrator. Required to grant database access |
1+ per database |
UNIX Security
The UNIX end of your system should be really secure because most production Sybase servers do not permit logins by normal users. This being the case, your security is all but guaranteed. It is also good practice to run some UNIX Security tool (e.g. cops...) nightly. These tools are widely available and are free. At a minimum, they can detect break-ins after the fact.
The following is a list of additional recommendations that should be considered to keep your system secure:
Sybase Security
The very nature of Sybase makes for a large number of potential security problems. Many administrators neglect security and allow login id = password as a default. If there is ever a problem, sloppy procedures can land you in trouble. Basic precautions are important.
Using System 10 Roles
System 10 has offered an interesting concept in the use of roles. Basically, you can set up permissions so that different user id’s can do different things, and Sybase provides no way of finding out which roles have been granted to whom (a real good security plan)
[ NEED INFORMATION HERE ] except for some unacceptable ones. The claimed advantage here is accountability, but unless you are auditing everything (using sybases brain-dead auditing facilities) it is not real in most environments.
Generally, administrators will have the three special roles (sa_role, oper_role, sso_role) granted to them. I guess the sso_role is appropriate if you site has a separate security officer that adds or drops users or if you have special software that does this. I have yet to see it in practice. The oper_role is more useful for general environments, where operators might need to get in to the system, but generally systems are set up so cron jobs do the dumps automatically without human intervention. The bottom line is that I dont see roles as that useful until Sybase comes out with a "user definable" role, where you can control exactly the type of permission you want a user to have.
Application Security
Application security is a complex issue. Each application should develop a set of internal audits that guarantee that nobody is misusing the system. The effort expended here should be proportional to the damage that could be caused due to a problem. The other important thing to remember here is that Sybase does NOT allow users to change their passwords unless they have a tool like isql (which is probably not desired). If users should be able to change them, the application must code up the front end for this.
How to force users to set their passwords
Some applications start all users with the same password (which need not be known by the users). When the application logs in for the first time, it will test for that password and, if they match, will ask the user to input a new password. This allows users to set their own passwords and allows administrators to deactivate accounts that have not been used.
In conjunction with the above idea, the application could track when users set their passwords. This would allow the application to manage expiration of passwords and to force users to reset their passwords in a safe, secure manner.
How to Preclude Use Of Other Front End Tools (isql)
One common idea is to have the user's password be something different from the server password (programatically encrypt it in the front end). The method of encryption need not be complex, so long as the users do not find out (i.e. you could just change the case of the password...). This will ensure that users can log in to the server through the application, but will make it virtually impossible to log in using another front end like isql.
What To Do In Addition To Sybase Security
Applications should provide their own level of security. For examples, if the application relies on stored procedures to make sure the "books balance", a little checking procedure could be run every night to make sure that nothing "funny" happened during the day (i.e. a user hacked in some data). This is not really the administrator's department, but as the chief system "cop", it would not hurt to make this kind of suggestion to the application developers. It really is good practice to have a debug_application stored procedure. I generally offer a debug option on my applications menu’s so users can find if things are wrong before calling me.
How To Give Different Users Different Looks And Feels
It is easy to develop a security table that is read by the front end. This table could define which menu items each user has access to and which functions the user can use. It is generally good practice to allow setting and disabling of each menu item if you are going to allow any control at all. There are two ways to guarantee that users have rows in this application security table. One way is to place an insert and delete trigger on sysusers and sysalternates (this does work and I have done it) to manage that table. When sp_adduser or sp_addalias is run, a permissions row is added. You could also write a stored procedure that is run when the application is started. This procedure will return the appropriate permission information, adding rows based on the defaults if none exist.
Forcing The Users To Log Out At Night
Users will leave their application logged in to the system when they go home at night. This is a big security hole (to say the least). It is easy to use the applications idle timer (Powerbuilder for example has one) to log user out if inactive for too long. Sybase provides no server mechanism to disconnect clients (except for a reboot which is not necessarily desirable), so if a client mechanism is available, use it.
Security Mailing Lists
Christopher Klaus of Internet Security Systems, Inc. has produced a nice comprehensive FAQ of security mailing lists. These security mailing lists are important tools to network administrators, network security officers, security consultants, and anyone who needs to keep abreast of the most current security information available.
To get the newest updates of Security files check the following services:
mail info@iss.net with "send index" in message
http://iss.net/
ftp iss.net /pub/
I have not copied his information in here but highly recommend retrieval of this list. One good mailing list to subscribe to is the Best of Security mailing list. To join, send e-mail to best-of-security-request@suburbia.net and, in the text of your message (not the subject line), write:
subscribe best-of-security
To unsubscribe to mailinglists managed by majordomo (like the above one), send a message of ‘unsubscribe xxx’ to the same address you subscribed to.
BACKUPS
Every night a set of scripts is run on each server that will perform routine administrative. These scripts are generally known as "operations scripts". These scripts are responsible for running backups and for running a program known as DBCC (database consistency check) on each database to check for data integrity problems. There are several issues that deal with backups, most notably the fact that Sybase does not provide you with them.
The following actions are generally performed by Operations Scripts as part of normal Nightly Procedures:
Dump Database
Object Level Dumps (optional)
Dump Transaction (usually runs all day on 5-15 minute interval)
UNIX Tape Backup
Update Statistics
Rebuilding Indexes
Recompile Stored Procedures
DBCC & Filter Output
Server Reboots
Failover System Management (loads of full dumps and log dumps)
Browse Sybase Errorlogs
Backup and Recovery Procedures
Generally, Operations Scripts are shell scripts and are run out of cron, the UNIX scheduler. Many larger shops have, however, purchased scheduling packages that are significantly more robust than cron. Recently, I have found several shops that have written the majority of their scripts in the perl language (with sybperl extensions). Perl has many features that shell does not, and is not a bad choice for your scripting language.
The Operations Scripts are generally composed of several modules. One module should dump transaction logs. Another should manage full database dumps to disk and manage runs of the dbcc program. If necessary due to failover setup, A third module should ftp files from the production systems to any fail over systems. A fourth module should provide a method to load log dumps into this backup server (this is difficult to do by hand). A final module should be a driver module that runs a configurable set of Nightly processing scripts to occur in the appropriate order.
The output of the results of the nightly jobs should be reviewed by operators (either print the output or copy it to a common directory). Any problem discovered while the nightly jobs are running should alert the appropriate error notification system. When an alert is trapped, the operations staff will contact the appropriate systems administrator who will resolve the problem.
Sequencing the operations scripts is important. It is important to ensure that your backups occur during the required window of processing time. This may be difficult on large database systems. Dbcc’s take the longest of any of the processing steps (unless you are performing extensive bcp’s).
Processing Steps
There are several ways to perform backups, but the best is probably to dumping each Sybase database to a UNIX file. A UNIX file provides a high level of recoverability because it is readily available and therefore the files can be loaded much faster than tape (you usually have to find the tape). It is recommended to have 2 full dumps on disk at any point in time (in case one is bad). Dump files should be stored with a datestamp tacked on to the end (i.e. name the file xdb_dump_Jan_11_1990_11:33PM instead of xdb_dump).
Dump files are large. There is no reason why a dump file should not be compressed. Regular UNIX compression does a good job. GNU provides a free compression program (gzip) that is better and faster. Be aware that compression of files requires that the file system have significant extra space.
Once all databases are dumped to tape, dbcc’s should be run against each database. It is recommended that dbcc checkdb, dbcc checkcatalog, and dbcc checkalloc be run against each database every night. After the dbcc’s are run, their output should be filtered and if any errors occurred, an error notification system should be called warn operators.
DBCC will result in spurious warning messages when a database is not in single user mode. There are two methods for dealing with spurious messages. Method one is to rerun the dbcc command and then diff the output using the UNIX comm command. Method two is to look for particular messages and then just rerun a checktable on the tables that are shown to be corrupted by the first pass.
Any special processing (update statistics, dbcc object level dumps...) should then be performed.
A program should then scan the Sybase error log and mail any new messages to the administrator.
After the dbcc’s finish, all UNIX file systems (which include the Sybase dumps) should be backed up to off-line media (tape).
In 4.9 and earlier, it was only possible to perform a single operation per server at once (i.e. you could not dbcc multiple databases at once). DOES SYSTEM 10 CHANGE THIS??? Under System 10, multiple dbcc’s can be run at once.
Another procedure should be in place to automatically dump transaction logs for any databases that need it (this can be autodetected in the database). Like database dumps, transaction log dumps should be saved with the datestamp tacked on to the end. Be sure to name the dumps in a manner that facilitates loading the log dumps in order. Failure to load them in order will result in the server rejecting the load, an error that is almost impossible to comprehend or fix.
Tapes
Tapes should be labeled clearly and should be stored in a secure tape library (do not scrimp here if you are doing it yourself). It is important to define a tape procedure. It is also important to be aware of the number of times your tapes can be written to before they fail.
How long to store your tapes is a matter of business policy. Because tapes do go bad, it is recommended that older tapes be used for permanent storage. An example of a tape policy is the following: Nightly tapes should be kept for at least a week. End of week tapes will be kept for 6 months. End of Month tapes will be kept on site for 6 months and then stored permanently off site. Your procedures will vary.
A few more notes on tapes. If you are using the UNIX dump and load command, be aware that these commands are not intuitive (to say the least) and it is probably good to have written procedures on how to work the loads. I have found that the data center operations staff can usually mount a tape within 30 minutes in an emergency, but this time should be included in any documents relating to recovery that you might write. Random checking of your tapes for readability is also a good idea. I have encountered at least one situation where the tape that was to be loaded was blank. When you need your tape backup, you should be sure it is there.
Issues With Backup Procedures
If an application consists of multiple databases, remember it is possible to have dumps from different times yielding referential integrity problems (i.e. the app database refers to stock ABC and the stock database has no such record in it). This can happen if your dumps occur at the same time any load is occurring. It is important to understand the application requirements, and to be sure no loads are being performed during the database dump window.
Administrator Responsibilities
Familiarize yourself with the procedures required to recover from disasters through the use of backups. Be sure you understand the procedures for recovering a lost master device or a lost master database. Before you can say that you can recover you should be confident of the following:
One good idea here is to make backups of your main master tables using bcp each night.
OTHER
Killing SQL Server Processes
The normal way to shut down a SQL server is to issue a shutdown command. If any user processes are running, the shutdown will wait until they are done, and you will need to issue a shutdown with no_wait. This will usually stop the server. If it does not, you should issue a UNIX kill -15. Kill -15 is a special kill that the server will interpret as a shutdown with no_wait internally. Only if the kill -15 does not succeed should the administrator consider issuing a Kill -9.
Sometimes when the Unix Kill command is issued, the dataserver will leave stray UNIX dataserver engine processes running on the machine. These processes hold on to their shared memory and can cause the UNIX operating system to swap. To find engines that belong to no server, simply look for engines owned by /etc/init (process id 1). These should be killed.
The reason to shutdown a SQL server using the shutdown command is straightforward. The shutdown command checkpoints transactions in all databases. This simplifies the servers restart because it will not need to sort through the transaction logs of these databases. A manual checkpoint will do the same thing.
How to Break Sybase
There is a simple way to break Sybase that all administrators should be aware of. Log in to the server and type "begin tran". Leave yourself logged in. Voila. Eventually the transaction logs for the database in which you reside will fill up because the dump transaction command can only clear the logs up to the point of the first uncommitted begin transaction. If you see transaction logs filling up, long running transactions are one of the first things you should check for.
This implies several things about how applications should work with Sybase. No application should hold transactions open for long periods of time. One option to check for is the use of the autocommit option in Powerbuilder. All Powerbuilder applications should set their Autocommit option to TRUE so transactions are not held open. If they don’t, a user logging in to Sybase through Powerbuilder and then doing NOTHING will cause transaction logs to fill up.
ISQL
Isql is a pretty basic tool and has many shortcomings but it has one wonderful advantage. It is free. Here are two things to know about isql when used in scripts.
To use isql in a shell script try using input redirection as follows:
isql -Ulogin -Ppassword <<EOF
use xxx
go
sp_who
go
EOF
To prevent the password from being displayed from isql (it is not in all versions - recent versions have fixed this security hole).
isql -Ulogin <<EOF
password
use xxx
go
sp_who
go
EOF
STORED PROCEDURE LIBRARY
I highly suggest administrators pick up my free stored procedure library. It enhances the procedures that Sybase provides and is VERY useful. It has been published several times through comp.databases.sybase. It can be downloaded from the following web site:
http://sybase.pnl.gov:2080/Sybase/barlowwww.tiac.net/users/sqltech
FINALLY (A PLUG FOR MY PRODUCT)
I know it cheapens the rest of the document (is this internet guilt?) but I wrote the document primarily because I felt like it and I am giving it away for free so I feel entitled... I have spent the past year (full time) developing a Sybase shareware utilitySQL Administrator, a PC and UNIX Sybase Administration tool. This is the first tool that I have seen that was written by an administrator for administrators. It beats the pants off anything else on the market (ie. it does what you want). IWe have really been stopped cold on the sales end though (is there really no distribution mechanism at all for Sybase tools?) and so I am tacking on a brief description heredistributing it as shareware off of my internet site.. Check it out!
The software is available for all versions of Sybase. It is currently available for SunOS and Solaris, but the UNIX agents can easily be ported to any hardware with a recompile and a minimal shell script testing (easy). I have additional product documentation that I will send if you send me a fax # and I can send you a sales package by snail mail if you are interestedDetails on: http://www.tiac.net/users/sqltech.
SQL ToolkitAdministrator
SQL Administrator is a full featured tool that assists Sybase administrators in all aspects of their work. Sybase administration relies on significant manual intervention to discover and correct problems, which leads to costs not scaling well. The cost to administer 20 servers is twice the cost to administer 10 servers. SQL Administrator allows automation of many of the manual tasks that are currently performed, and provides administrators with the information they need to make their environment more stable and perform better. This is a key design philosophy. SQL Administrator does NOT help you type "create database", which seems to be the point of all the other "so called" Sybase administration packages on the market. I never had a problem typing my own commands and do not see the point of these packages. SQL Administrator WILL give you the information you need to administer your systems in the form you need it.
Our UNIX agent software collects information from the master database, the database system tables, the Sybase, nix, and dbcc errorlogs, and discovers basics about your unix configuration into a "master master" database. Administrators then access this data from a nice Powerbuilder (as of 1/96 PC only - X version available shortly). We provide many reports and analysis tools that allow you to easily understand what is going on in your environment. Additionally, SQL Administrator provides a bunch of utilities that you are probably developing yourself to