Every SQL Server application ultimately depends on its databases. For data-centric applications, understanding how SQL Server stores data and how best to lay out a database is essential for good performance. SQL Server 2000 distinguishes between system databases and user databases. System databases hold server configuration and database settings, and they support features such as SQL Server Agent and replication. Every SQL Server installation includes four system databases: master, msdb, model, and tempdb. User databases are created to store and manage application data.
A SQL Server database consists of data files and log files. Data files hold the actual database content; log files hold the transaction log, which records every data modification and is what makes recovery possible. Data files are organized into 8KB pages. A row cannot span pages, so apart from text, ntext, and image data, which are stored separately, a single record is limited to roughly 8KB (8,060 bytes in SQL Server 2000). Space in a data file is allocated to tables and indexes in 64KB units called extents, each made up of eight contiguous 8KB pages. To save space, SQL Server uses mixed extents for small objects: a single mixed extent can be shared by up to eight tables or indexes that each need less than 64KB. The transaction log is not divided into pages; it stores its information sequentially as a series of log records.
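The page and extent arithmetic can be made concrete with a rough sizing sketch. The function name and the sample row size are hypothetical. The 96-byte page header, the 2-byte row offset entry, and the 8,060-byte row limit are assumptions based on the figures SQL Server's documentation gives for estimating table size. The sketch also assumes fixed-size rows and completely full pages, which real tables rarely achieve.

```python
import math

PAGE_SIZE = 8 * 1024      # one 8KB page
PAGES_PER_EXTENT = 8      # eight pages form one 64KB extent
PAGE_HEADER = 96          # assumed bytes reserved for the page header
ROW_OFFSET = 2            # assumed bytes per row in the row offset array
MAX_ROW_SIZE = 8060       # assumed largest row that fits on one page


def estimate_storage(row_count, row_size):
    """Return (rows_per_page, pages, extents) for fixed-size rows."""
    if row_size <= 0 or row_count < 0:
        raise ValueError("row_size must be positive and row_count non-negative")
    if row_size > MAX_ROW_SIZE:
        raise ValueError("a row cannot span pages")
    usable = PAGE_SIZE - PAGE_HEADER                    # 8096 bytes per page
    rows_per_page = usable // (row_size + ROW_OFFSET)   # whole rows only
    pages = math.ceil(row_count / rows_per_page)
    extents = math.ceil(pages / PAGES_PER_EXTENT)
    return rows_per_page, pages, extents


# One million hypothetical 200-byte rows.
print(estimate_storage(1_000_000, 200))  # (40, 25000, 3125)
```

Under these assumptions the table needs 3,125 extents, or about 195MB. A figure like this is only a starting point for sizing the initial data file.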
A SQL Server database can use multiple data and log files, both to accommodate growth and to spread files across different physical disk systems. Data files, including files on separate disks, can be grouped into a logical unit called a filegroup. Tables, indexes, and large-object data such as text and image columns can be assigned to different filegroups, which can improve performance by spreading I/O across disks. Data and log files can also be set to grow automatically, so that they expand as storage demands increase.
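To get a feel for how automatic growth behaves, the following sketch counts how many percentage-based growth steps a file needs to reach a target size. It is a simplified model, not a description of SQL Server's internal logic. The function name, the sizes, and the 10 percent increment are assumptions chosen for illustration.

```python
def growth_events(initial_mb, target_mb, growth_percent=10):
    """Count percentage-based growth steps needed to reach target_mb.

    Returns (events, final_size_mb). All parameters are hypothetical.
    """
    if initial_mb <= 0 or growth_percent <= 0:
        raise ValueError("initial size and growth percent must be positive")
    size = float(initial_mb)
    events = 0
    while size < target_mb:
        size += size * growth_percent / 100   # grow by a share of current size
        events += 1
    return events, size


events, final_size = growth_events(100, 200)
print(events)  # 8
```

Doubling a 100MB file takes eight separate growth steps in this model. That is one reason to size files generously up front and treat automatic growth as a safety net rather than the main way of allocating space.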
Designing an effective physical database structure starts with the logical database model, which means identifying the key data entities in your application. In an order-entry system, for instance, the entities might include customers, orders, products, and suppliers. Next, analyze the read and write activity for each entity. Once you understand these access patterns, you can decide which entities to place in which filegroups, for example by keeping heavily accessed entities on separate disks. A small database may perform well with a single data file, but larger systems usually benefit from multiple data files distributed across physical disks, for both performance and scalability.
Physical layout design also includes choosing the right RAID level for each disk configuration. RAID can improve both performance and fault tolerance. Performance comes from striping, which distributes data across multiple disks so that they can service requests in parallel. Fault tolerance comes from mirroring or from parity. Mirroring keeps a duplicate copy of the data on a separate disk. Parity stores computed parity bits alongside the data, from which the contents of a failed disk can be reconstructed. With distributed parity, the parity information is spread across all the disks in the array rather than held on one dedicated disk, and the array can recover from the failure of a single disk.
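Parity-based recovery is easiest to see with a small example. The sketch below uses XOR parity, the scheme commonly used by parity RAID levels, on a single stripe of three data blocks. The block contents and function name are hypothetical, and a real controller works on much larger blocks and rotates the parity position from stripe to stripe.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    result = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# One stripe: three data blocks, each on its own (hypothetical) disk.
data_blocks = [b"ORDERS01", b"CUSTOM02", b"PRODUC03"]
parity = xor_blocks(data_blocks)          # stored on a fourth disk

# Simulate losing the second disk: XOR the survivors with the parity block.
surviving = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(surviving)
print(rebuilt == data_blocks[1])  # True
```

Because XOR is its own inverse, any one missing block can be rebuilt from the remaining blocks and the parity. If two blocks in the same stripe are lost, a single parity block is not enough to recover them.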
Once the physical database structure is aligned with the logical model, database application design can begin. This typically involves creating an entity-relationship (ER) diagram that details the entities, their attributes (columns), the properties of each column, and the relationships between tables. Normalization is a key part of this process. Its three most commonly applied rules (the first, second, and third normal forms) aim to reduce data redundancy, minimize dependencies, and produce a more accurate and manageable data structure.
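The redundancy that normalization removes can be illustrated with plain data structures. In this sketch, a flat list of order rows that repeats customer details is split into a customers table and an orders table linked by a key. All names and sample values are hypothetical. The example covers only this one step, not the full set of normal forms.

```python
# Flat rows: customer name and city are repeated on every order.
flat_orders = [
    {"order_id": 1, "customer_name": "Acme", "customer_city": "Oslo", "product": "Widget"},
    {"order_id": 2, "customer_name": "Acme", "customer_city": "Oslo", "product": "Gadget"},
    {"order_id": 3, "customer_name": "Birch", "customer_city": "Lima", "product": "Widget"},
]


def normalize(rows):
    """Split flat order rows into (customers, orders) linked by customer_id."""
    customer_ids = {}   # (name, city) -> generated customer_id
    orders = []
    for row in rows:
        key = (row["customer_name"], row["customer_city"])
        # Assign the next id the first time a customer is seen.
        customer_id = customer_ids.setdefault(key, len(customer_ids) + 1)
        orders.append({
            "order_id": row["order_id"],
            "customer_id": customer_id,
            "product": row["product"],
        })
    customers = [
        {"customer_id": cid, "name": name, "city": city}
        for (name, city), cid in customer_ids.items()
    ]
    return customers, orders


customers, orders = normalize(flat_orders)
print(len(customers), len(orders))  # 2 3
```

After the split, each customer's city is stored once, so a change of address touches a single row instead of every order that customer has placed.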