The first step in creating a database is creating a plan that serves both as a guide to be used when implementing the database and as a functional specification for the database after it has been implemented. The complexity and detail of a database design is dictated by the complexity and size of the database application and also the user population.
The nature and complexity of a database application, and also the process of planning it, can vary significantly. A database can be relatively simple and designed for use by a single person, or it can be large and complex and designed, for example, to handle all the banking transactions for thousands of clients. In the first case, the database design may be slightly more than a few notes on some scratch paper. In the latter case, the design may be a formal document hundreds of pages long that contains every possible detail about the database.
In planning the database, regardless of its size and complexity, use the following basic steps:
- Gather information.
- Identify the objects.
- Model the objects.
- Identify the types of information for each objects
- Identify the relationships between objects
Gathering Information
Before creating a database, you must have a good understanding of the job the database is expected to perform. If the database is to replace a paper-based or manually performed information system, the existing system will give you most of the information you need. It is important to interview everyone involved in the system to find out what they do and what they need from the database. It is also important to identify what they want the new system to do, as well as to identify the problems, limitations, and bottlenecks of any existing system. Collect copies of customer statements, inventory lists, management reports, and any other documents that are part of the existing system, because these will be useful to you in designing the database and the interfaces.
Identifying the Objects
During the process of gathering information, you must identify the key objects or entities that will be managed by the database. The object can be a tangible thing, such as a person or a product, or it can be a more intangible item, such as a business transaction, a department in a company, or a payroll period. There are usually a few primary objects, and after these are identified, the related items become apparent. Each distinct item in your database should have a corresponding table.
The primary object in the pubs sample database included with Microsoft® SQL Server™ 2000 is a book. The objects related to books within this company's business are the authors who write the books, the publishers who manufacture the books, the stores which sell them, and the sales transactions performed with the stores. Each of these objects is a table in the database.
Modeling the Objects
As the objects in the system are identified, it is important to record them in a way that represents the system visually. You can use your database model as a reference during implementation of the database.
For this purpose, database developers use tools that range in technical complexity from pencils and scratch paper to word processing or spreadsheet programs, and even to software programs specifically dedicated to the job of data modeling for database designs. Whatever tool you decide to use, it is important that you keep it up-to-date.
SQL Server Enterprise Manager includes visual design tools such as the Database Designer that can be used to design and create objects in the database. For more information see,
Identifying the Types of Information for Each Object
After the primary objects in the database have been identified as candidates for tables, the next step is to identify the types of information that must be stored for each object. These are the columns in the object's table. The columns in a database table contain a few common types of information:
- Raw data columns
These columns store tangible pieces of information, such as names, determined by a source external to the database.
- Categorical columns
These columns classify or group the data and store a limited selection of data such as true/false, married/single, VP/Director/Group Manager, and so on.
- Identifier columns
These columns provide a mechanism to identify each item stored in the table. These columns often have id or number in their names (for example, employee_id, invoice_number, and publisher_id). The identifier column is the primary component used by both users and internal database processing for gaining access to a row of data in the table. Sometimes the object has a tangible form of ID used in the table (for example, a social security number), but in most situations you can define the table so that a reliable, artificial ID can be created for the row.
- Relational or referential columns
These columns establish a link between information in one table and related information in another table. For example, a table that tracks sales transactions will commonly have a link to the customers table so that the complete customer information can be associated with the sales transaction.
|