Microsoft Access Tables: Primary Key Tips
and Techniques
by Luke Chung, FMS President
Most databases, whether it's Microsoft Access, SQL Server, Oracle, etc.,
let you create tables and specify any field or fields as the primary key.
Unfortunately, this makes it easy to create primary keys that are not
optimal which can cause performance problems and even worse, architectural
problems that become expensive to fix later.
Background
Every table in a relational database should be keyed. Records in a
relational database can be sorted and physically stored in any order, but
the key field (or fields) define uniqueness for that record. This makes it
easy to link this table to data in other tables. The primary key is also
important for secondary indexes.
Secondary indexes can be one or more fields that are defined to speed up
searches by maintaining a pre-sorted list. For instance, a customer table
may have a secondary index on Zip Code to allow quicker searches on zip code
values.
Primary Key Design Tips
Here are some tips for creating primary keys. As with all rules, there
are specific situations where one may deviate from them. Additionally, if
your tables are small, adjusting the primary key may have no perceivable
impact, so these tips may not matter. However, for larger tables and to
support scalability over time, these tips can have a huge impact. The hope
is to know these reasons and why you may intentionally deviate from these
best practices.
Primary Key Fields Come First
The primary key should be the first field (or fields) in your table
design. While most databases allow you to define primary keys on any field
in your table, the common convention and what the next developer will
expect, is the primary key field(s) coming first.
Use a Meaningless Primary Key
The values in a table's primary key should never change over time. For
instance, a numeric customer ID can point to a customer, and regardless of
whether they change their name, address, phone number, etc., we can track
the data properly without updating more than one table.
If a primary key value has meaning, it can change which requires updating
the tables that reference this value. That may be automatic through
cascading referential integrity, but it adds overhead, slows performance,
and may create locking conflicts as the data is updated.
Therefore, it's critical to avoid using things like people's names as
primary keys. Fields with meaning can also result in duplicates, which is a
problem when the primary key is supposed to define each record uniquely.
Fortunately, it's easy to define a unique primary key. An auto generated
value such as an Identity column in SQL Server or an AutoNumber field in
Microsoft Access does this. It doesn't matter if you delete a record and the
key values are no longer consecutive. The primary key should have no meaning
other than the permanent tag to that record.
Exception
A reasonable exception to this is geographic lists such as Countries,
States, and Zip Codes, where the name is keyed rather than using a separate
ID. These data are relatively stable and small.
Only Use One Numeric Field as the Primary Key
Uniquely tagging a record can be done with a number (long integer).
There's no need to have more than one field defining its uniqueness. Text
fields require more bytes than numeric fields, so using a number saves
considerable space.
Making the primary key as small as possible reduces the memory required
to open the primary key when links or searches are performed.
Impact on Secondary Indexes
Primary key size also impacts secondary indexes. A secondary index
contains the sorted information on the field or fields in the index, plus a
reference back to the primary key. Searching on the secondary index is fast
because it uses a presorted list, but still points back to the record using
the primary key to retrieve all of the record's other fields.
The smaller the primary key, the smaller the secondary index, and the
faster the links and searches are. If a table has lots of secondary indexes
and lots of records, the difference can be very significant if you are not
using a single numeric field as your primary key.
Don't use Primary Keys to Prevent Duplicate Records
Sometimes people justify composite (multi-field) primary keys to
avoid duplicates. This is the wrong approach. To prevent duplicates,
create a secondary index for those fields and define it as unique. That
lets the secondary index handle that job while the single field primary
index retains its efficiencies.
Additional Resources
Our
Total Access Analyzer
add-in performs detailed analysis of your MS Access
database objects and detects tables with non-optimal primary keys, field
type inconsistencies, plus hundreds of others tips
to improve your database designs.
|