Database Management System| Information Systems

Database Management System| Information Systems

We have covered a good deal of information regarding methodology. Why (when) would the rigor of a methodology be useful? Select one step in the Logical Data Model from the Connolly text that you feel would “always” be useful and explain why?

A web search for relevant articles and information could be employed to assist you in your response. Your response should be a reply to the original message.

300 Words Minimum
Chapter Objectives

In this chapter you will learn:In this chapter you will learn:

• The purpose of a design methodology.

• The three main phases of database design: conceptual, logical,

and physical design.

• How to decompose the scope of the design into specific views of

the enterprise.

• How to use ER modeling to build a local conceptual data model

based on the information given in a view of the enterprise.

• How to validate the resultant conceptual data model to ensure

that it is a true and accurate representation of a view of the

enterprise.

• How to document the process of conceptual database design.

• End-users play an integral role throughout the process of

conceptual database design.

In Chapter 10Chapter 10 we described the main stages of the database system

development lifecycle, one of which is database designdatabase design. This stage

starts only after a complete analysis of the enterprise’s requirements

has been undertaken.

In this chapter, and Chapters 17Chapters 17–1919, we describe a methodology for

455

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_019.xhtml#eid27876
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37780
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_029.xhtml#eid41343
the database design stage of the database system development

lifecycle for relational databases. The methodology is presented as a

step-by-step guide to the three main phases of database design,

namely: conceptual, logical, and physical design (see Figure 10.1Figure 10.1).

The main aim of each phase is as follows:

• Conceptual database designConceptual database design—to build the conceptual

representation of the database, which includes identification of

the important entities, relationships, and attributes.

• Logical database designLogical database design—to translate the conceptual

representation to the logical structure of the database, which

includes designing the relations.

• Physical database designPhysical database design—to decide how the logical structure

is to be physically implemented (as base relations) in the target

DBMS.

Structure of this Chapter

In Section 16.1Section 16.1 we define what a database design methodology is

and review the three phases of database design. In Section 16.2Section 16.2 we

provide an overview of the methodology and briefly describe the

main activities associated with each design phase. In Section 16.3Section 16.3

we focus on the methodology for conceptual database design and

present a detailed description of the steps required to build a

conceptual data model. We use the ER modeling technique described

in Chapters 12Chapters 12 and 1313 to create the conceptual data model. The

conceptual data model described in this chapter is the starting point

for the next phase of database design described in the following

455

456

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_019.xhtml#eid28003
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36569
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36642
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36811
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31274
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_022.xhtml#eid32613
chapter.

In Chapter 17Chapter 17 we focus on the methodology for logical database

design for the relational model and present a detailed description of

the steps required to convert a conceptual data model into a logical

data model. This chapter also includes an optional step that describes

how to merge two or more logical data models into a single logical

data model, for those using the view integration approach (see

Section 10.5Section 10.5) to manage the design of a database with multiple user

views. The logical data model described in Chapter 17Chapter 17 is the starting

point for the final phase of database design described in the following

two chapters.

In Chapters 18Chapters 18 and 1919 we complete the database design

methodology by presenting a detailed description of the steps

associated with the production of the physical database design for

relational DBMSs. This part of the methodology illustrates that the

development of the logical data model alone is insufficient to

guarantee the optimum implementation of a database system. For

example, we may have to consider modifying the logical model to

achieve acceptable levels of performance.

Appendix DAppendix D presents a summary of the database design

methodology for those readers who are already familiar with

database design and simply require an overview of the main steps.

Throughout the methodology the terms “entity” and “relationship”

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37780
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_019.xhtml#eid28150
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37780
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_028.xhtml#eid39671
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_029.xhtml#eid41343
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_054.xhtml#eid85549
are used in place of “entity type” and “relationship type” where the

meaning is obvious; “type” is generally added only to avoid

ambiguity. In this chapter we mostly use examples from the

StaffClient user views of the DreamHome case study documented in

Section 11.4Section 11.4 and Appendix AAppendix A.

16.1 Introduction to the Database Design Methodology

Before presenting the methodology, we discuss what a design

methodology represents and describe the three phases of database

design. Finally, we present guidelines for achieving success in

database design.

16.1.1 What Is a Design Methodology?

Design methodology

A structured approach that uses procedures, techniques, tools, and

documentation aids to support and facilitate the process of design.

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_020.xhtml#eid30021
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_051.xhtml#eid84853
A design methodology consists of phases each containing a number of

steps that guide the designer in the techniques appropriate at each

stage of the project. A design methodology also helps the designer to

plan, manage, control, and evaluate database development projects.

Furthermore, it is a structured approach for analyzing and modeling a

set of requirements for a database in a standardized and organized

manner.

16.1.2 Conceptual, Logical, and Physical Database Design

In presenting this database design methodology, the design process is

divided into three main phases: conceptual, logical, and physical

database design.

Conceptual database design

The process of constructing a model of the data used in an enterprise,

independent of all physical considerations.

The conceptual database design phase begins with the creation of a

conceptual data model of the enterprise that is entirely independent

of implementation details such as the target DBMS, application

programs, programming languages, hardware platform, performance

issues, or any other physical considerations.

456

457

Logical database design

The process of constructing a model of the data used in an enterprise

based on a specific data model, but independent of a particular DBMS

and other physical considerations.

The logical database design phase maps the conceptual data model on

to a logical model, which is influenced by the data model for the target

database (for example, the relational model). The logical data model is

a source of information for the physical design phase, providing the

physical database designer with a vehicle for making trade-offs that

are very important to the design of an efficient database.

Physical database design

The process of producing a description of the implementation of the

database on secondary storage; it describes the base relations, file

organizations, and indexes used to achieve efficient access to the

data, and any associated integrity constraints and security measures.

The physical database design phase allows the designer to make

decisions on how the database is to be implemented. Therefore,

physical design is tailored to a specific DBMS. There is feedback

between physical and logical design, because decisions taken during

physical design for improving performance may affect the logical data

model.

16.1.3 Critical Success Factors in Database Design

The following guidelines are often critical to the success of database

design:

• Work interactively with the users as much as possible.

• Follow a structured methodology throughout the data modeling

process.

• Employ a data-driven approach.

• Incorporate structural and integrity considerations into the data

models.

• Combine conceptualization, normalization, and transaction

validation techniques into the data modeling methodology.

• Use diagrams to represent as much of the data models as possible.

• Use a Database Design Language (DBDL) to represent additional

data semantics that cannot easily be represented in a diagram.

• Build a data dictionary to supplement the data model diagrams

and the DBDL.

• Be willing to repeat steps.

These factors are built into the methodology we present for database

design.

16.2 Overview of the Database Design Methodology

In this section, we present an overview of the database design

457

458

methodology. The steps in the methodology are as follows.

Conceptual database design

Step 1 Build conceptual data model

Step 1.1 Identify entity types

Step 1.2 Identify relationship types

Step 1.3 Identify and associate attributes with entity or

relationship types

Step 1.4 Determine attribute domains

Step 1.5 lDetermine candidate, primary, and alternate key

attributes

Step 1.6 Consider use of enhanced modeling concepts (optional

step)

Step 1.7 Check model for redundancy

Step 1.8 Validate conceptual data model against user

transactions

Step 1.9 Review conceptual data model with user

Logical database design for the relational model

Step 2 Build logical data model

Step 2.1 Derive relations for logical data model

Step 2.2 Validate relations using normalization

Step 2.3 Validate relations against user transactions

Step 2.4 Check integrity constraints

Step 2.5 Review logical data model with user

Step 2.6 Merge logical data models into global model (optional

step)

Step 2.7 Check for future growth

Physical database design for relational databases

Step 3 Translate logical data model for target DBMS

Step 3.1 Design base relations

Step 3.2 Design representation of derived data

Step 3.3 Design general constraints

Step 4 Design file organizations and indexes

Step 4.1 Analyze transactions

Step 4.2 Choose file organizations

Step 4.3 Choose indexes

Step 4.4 Estimate disk space requirements

Step 5 Design user views

Step 6 Design security mechanisms

Step 7 Consider the introduction of controlled redundancy

Step 8 Monitor and tune the operational system

This methodology can be used to design relatively simple to highly

complex database systems. Just as the database design stage of the

database systems development lifecycle (see Section 10.6Section 10.6) has three

phases—conceptual, logical, and physical design—so too has the

methodology. Step 1Step 1 creates a conceptual database design, Step 2Step 2

creates a logical database design, and Steps 3Steps 3 to 88 create a physical

database design. Depending on the complexity of the database system

being built, some of the steps may be omitted. For example, Step 2.6Step 2.6

of the methodology is not required for database systems with a single

user view or database systems with multiple user views being

managed using the centralized approach (see Section 10.5Section 10.5). For this

reason, we refer to the creation of a single conceptual data model only

in Step 1Step 1 and single logical data model only in Step 2Step 2. However, if the

database designer is using the view integration approach (see SectionSection

10.510.5) to manage user views for a database system, then Steps 1Steps 1 and 22

458

459

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_019.xhtml#eid28243
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36817
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37834
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_028.xhtml#eid39813
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_029.xhtml#eid41917
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid38623
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_019.xhtml#eid28150
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36817
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37834
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_019.xhtml#eid28150
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36817
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37834
may be repeated as necessary to create the required number of

models, which are then merged in Step 2.6Step 2.6.

In Chapter 10Chapter 10, we introduced the term “local conceptual data model”

or “local logical data model” to refer to the modeling of one or more,

but not all, user views of a database system and the term “global

logical data model” to refer to the modeling of all user views of a

database system. However, the methodology is presented using the

more general terms “conceptual data model” and “logical data model”

with the exception of the optional Step 2.6Step 2.6, which necessitates the use

of the terms local logical data model and global logical data model, as

it is this step that describes the tasks necessary to merge separate local

logical data models to produce a global logical data model.

An important aspect of any design methodology is to ensure that the

models produced are repeatedly validated so that they continue to be

an accurate representation of the part of the enterprise being

modeled. In this methodology the data models are validated in various

ways such as by using normalization (Step 2.2Step 2.2), by ensuring the

critical transactions are supported (Steps 1.8Steps 1.8 and 2.32.3), and by

involving the users as much as possible (Steps 1.9Steps 1.9 and 2.52.5).

The logical model created at the end of Step 2Step 2 is then used as the

source of information for physical database design described in StepsSteps

33 to 88. Again, depending on the complexity of the database systems

being designed and/or the functionality of the target DBMS, some of

the steps of physical database design may be omitted. For example,

Step 4.2Step 4.2 may not be applicable for certain PC-based DBMSs. The steps

of physical database design are described in detail in Chapters 18Chapters 18

and 1919.

459

460

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid38623
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_019.xhtml#eid27876
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid38623
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid38385
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid37552
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid38428
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid37612
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid38593
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37834
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_028.xhtml#eid39813
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_029.xhtml#eid41917
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_028.xhtml#eid40488
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_028.xhtml#eid39671
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_029.xhtml#eid41343
Database design is an iterative process that has a starting point and an

almost endless procession of refinements. Although the steps of the

methodology are presented here as a procedural process, it must be

emphasized that this does not imply that it should be performed in

this manner. It is likely that knowledge gained in one step may alter

decisions made in a previous step. Similarly, it may be useful to look

briefly at a later step to help with an earlier step. Therefore, the

methodology should act as a framework to help guide the designer

through database design effectively.

To illustrate the database design methodology we use the DreamHome

case study. The DreamHome database has several user views

(Director, Manager, Supervisor, Assistant, and Client) that are

managed using a combination of the centralized and view integration

approaches (see Section 11.4Section 11.4). Applying the centralized approach

resulted in the identification of two collections of user views called

StaffClient user views and Branch user views. The user views

represented by each collection are as follows:

• StaffClient user viewsStaffClient user views—representing Supervisor, Assistant, and

Client user views;

• Branch user viewsBranch user views—representing Director and Manager user

views.

In this chapter, which describes Step 1Step 1 of the methodology, we use the

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_020.xhtml#eid30021
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36817
StaffClient user views to illustrate the building of a conceptual data

model, and then in the following chapter, which describes Step 2Step 2, we

describe how this model is translated into a logical data model. As the

StaffClient user views represent only a subset of all the user views of

the DreamHome database, it is more correct to refer to the data

models as local data models. However, as stated earlier when we

described the methodology and the worked examples, for simplicity

we use the terms conceptual data model and logical data model until

the optional Step 2.6Step 2.6, which describes the integration of the local

logical data models for the StaffClient user views and the Branch user

views.

16.3 Conceptual Database Design Methodology

This section provides a step-by-step guide for conceptual database

design.

Step 1: Build Conceptual Data Model

Objective

To build a conceptual data model of the data requirements of the

enterprise.

The first step in conceptual database design is to build one (or more)

conceptual data models of the data requirements of the enterprise. A

conceptual data model comprises:

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37834
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid38623
• entity types;

• relationship types;

• attributes and attribute domains;

• primary keys and alternate keys;

• integrity constraints.

The conceptual data model is supported by documentation, including

ER diagrams and a data dictionary, which is produced throughout the

development of the model. We detail the types of supporting

documentation that may be produced as we go through the various

steps. The tasks involved in Step 1Step 1 are:

Step 1.1 Identify entity types

Step 1.2 Identify relationship types

Step 1.3 Identify and associate attributes with entity or relationship

types

Step 1.4 Determine attribute domains

Step 1.5 Determine candidate, primary, and alternate key attributes

Step 1.6 Consider use of enhanced modeling concepts (optional step)

Step 1.7 Check model for redundancy

Step 1.8 Validate conceptual data model against user transactions

Step 1.9 Review conceptual data model with user

Step 1.1: Identify entity types

Objective

To identify the required entity types.

The first step in building a conceptual data model is to determine and

460

461

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36817
define the main objects that the users are interested in. These objects

are the entity types for the model (see Section 12.1Section 12.1). One method of

identifying entities is to examine the users’ requirements

specification. From this specification, we identify nouns or noun

phrases that are mentioned (for example, staff number, staff name,

property number, property address, rent, number of rooms). We also

look for major objects, such as people, places, or concepts of interest,

excluding those nouns that are merely qualities of other objects. For

example, we could group staff number and staff name with an object

or entity called Staff and group property number, property address,

rent, and number of rooms with an entity called PropertyForRent.

An alternative way of identifying entities is to look for objects that

have an existence in their own right. For example, Staff is an entity

because staff exist whether or not we know their names, positions,

and dates of birth. If possible, the users should assist with this activity.

It is sometimes difficult to identify entities because of the way they are

presented in the users’ requirements specification. Users often talk in

terms of examples or analogies. Instead of talking about staff in

general, users may mention people’s names. In some cases, users talk

in terms of job roles, particularly when people or organizations are

involved. These roles may be job titles or responsibilities, such as

Director, Manager, Supervisor, or Assistant.

To confuse matters further, users frequently use synonyms and

homonyms. Two words are synonyms when they have the same

meaning, for example, “branch” and “office.” Homonyms occur when

the same word can have different meanings depending on the context.

For example, the word “program” has several alternative meanings

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31333
such as a course of study, a series of events, a plan of work, and an

item on the television.

It is not always obvious whether a particular object is an entity, a

relationship, or an attribute. For example, how would we classify

marriage? In fact, depending on the actual requirements, we could

classify marriage as any or all of these. Design is subjective, and

different designers may produce different, but equally valid,

interpretations. The activity therefore relies, to a certain extent, on

judgement and experience. Database designers must take a very

selective view of the world and categorize the things that they observe

within the context of the enterprise. Thus, there may be no unique set

of entity types deducible from a given requirements specification.

However, successive iterations of the design process should lead to the

choice of entities that are at least adequate for the system required.

For the StaffClient user views of DreamHome, we identify the

following entities:

Staff PropertyForRent

PrivateOwner BusinessOwner

Client Preference

461

462

Lease

Document entity types

As entity types are identified, assign them names that are meaningful

and obvious to the user. Record the names and descriptions of entities

in a data dictionary. If possible, document the expected number of

occurrences of each entity. If an entity is known by different names,

the names are referred to as synonyms or aliases, which are also

recorded in the data dictionary. Figure 16.1Figure 16.1 shows an extract from

the data dictionary that documents the entities for the StaffClient user

views of DreamHome.

Step 1.2: Identify relationship types

Objective

To identify the important relationships that exist between the entity

types.

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36951
Figure 16.1 Extract from the data dictionary for the StaffClient user views of DreamHome showing a

description of entities.

Having identified the entities, the next step is to identify all the

relationships that exist between these entities (see Section 12.2Section 12.2).

When we identify entities, one method

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31377
is to look for nouns in the users’ requirements specification. Again, we

can use the grammar of the requirements specification to identify

relationships. Typically, relationships are indicated by verbs or verbal

expressions. For example:

• Staff Manages PropertyForRent

• PrivateOwner Owns PropertyForRent

• PropertyForRent AssociatedWith Lease

The fact that the requirements specification records these

relationships suggests that they are important to the enterprise and

should be included in the model.

We are interested only in required relationships between entities. In

the previous examples, we identified the Staff Manages

PropertyForRent and the PrivateOwner Owns PropertyForRent

relationships. We may also be inclined to include a relationship

between Staff and PrivateOwner (for example, Staff Assists

PrivateOwner). However, although this is a possible relationship, from

the requirements specification, it is not a relationship that we are

interested in modeling.

In most instances, the relationships are binary; in other words, the

relationships exist between exactly two entity types. However, we

should be careful to look out for complex relationships that may

involve more than two entity types (see Section 12.2.1Section 12.2.1) and

recursive relationships that involve only one entity type (see SectionSection

12.2.212.2.2).

462

463

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31466
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31543
Great care must be taken to ensure that all the relationships that are

either explicit or implicit in the users’ requirements specification are

detected. In principle, it should be possible to check each pair of entity

types for a potential relationship between them, but this would be a

daunting task for a large system comprising hundreds of entity types.

On the other hand, it is unwise not to perform some such check, and

the responsibility is often left to the analyst/designer. However,

missing relationships should become apparent when we validate the

model against the transactions that are to be supported (Step 1.8Step 1.8).

Use Entity–Relationship (ER) diagrams

It is often easier to visualize a complex system rather than decipher

long textual descriptions of a users’ requirements specification. We

use ER diagrams to represent entities and how they relate to one

another more easily. Throughout the database design phase, we

recommend that ER diagrams be used whenever necessary to help

build up a picture of the part of the enterprise that we are modeling.

In this book, we use UML, but other notations perform a similar

function (see Appendix CAppendix C).

Determine the multiplicity constraints of relationship types

Having identified the relationships to model, we next determine the

multiplicity of each relationship (see Section 12.6Section 12.6). If specific values

for the multiplicity are known, or even upper or lower limits,

document these values as well.

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid37552
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_053.xhtml#eid85491
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31881
Multiplicity constraints are used to check and maintain data quality.

These constraints are assertions about entity occurrences that can be

applied when the database is updated to determine whether the

updates violate the stated rules of the enterprise. A model that

includes multiplicity constraints more explicitly represents the

semantics of the relationships and results in a better representation of

the data requirements of the enterprise.

Figure 16.2 First-cut ER diagram showing entity and relationship types for the StaffClient user views of

DreamHome.

Check for fan and chasm traps

Having identified the necessary relationships, check that each

relationship in the ER model is a true representation of the “real

world,” and that fan or chasm traps have not been created

inadvertently (see Section 12.7Section 12.7).

463

464

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid32262
Figure 16.2Figure 16.2 shows the first-cut ER diagram for the StaffClient user

views of the DreamHome case study.

Document relationship types

As relationship types are identified, assign them names that are

meaningful and obvious to the user. Also record relationship

descriptions and the multiplicity constraints in the data dictionary.

Figure 16.3Figure 16.3 shows an extract from the data dictionary that

documents the relationships for the StaffClient user views of

DreamHome.

Step 1.3: Identify and associate attributes with entity or relationship types

Objective

To associate attributes with appropriate entity or relationship types.

The next step in the methodology is to identify the types of facts about

the entities and relationships that we have chosen to be represented

in the database. In a

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid37007
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid37043
similar way to identifying entities, we look for nouns or noun phrases

in the users’ requirements specification. The attributes can be

identified where the noun or noun phrase is a property, quality,

identifier, or characteristic of one of these entities or relationships

(see Section 12.3Section 12.3).

Figure 16.3 Extract from the data dictionary for the StaffClient user views of DreamHome, showing a

description of relationships.

By far the easiest thing to do when we have identified an entity (x) or

a relationship (y) in the requirements specification is to ask “What

information are we required to hold on x or y?” The answer to this

question should be described in the specification. However, in some

cases it may be necessary to ask the users to clarify the requirements.

Unfortunately, they may give answers to this question that also

contain other concepts, so the users’ responses must be carefully

considered.

Simple/composite attributes

It is important to note whether an attribute is simple or composite (see

Section 12.3.1Section 12.3.1). Composite attributes are made up of simple

464

465

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31591
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31634
attributes. For example, the address attribute can be simple and hold

all the details of an address as a single value, such as “115 Dumbarton

Road, Glasgow, G11 6YG.” However, the address attribute may also

represent a composite attribute, made up of simple attributes that

hold the address details as separate values in the attributes street

(“115 Dumbarton Road”), city (“Glasgow”), and postcode (“G11 6YG”).

The option to represent address details as a simple or composite

attribute is determined by the users’ requirements. If the user does

not need to access the separate components of an address, we

represent the address attribute as a simple attribute. On the other

hand, if the user does need to access the individual components of an

address, we represent the address attribute as being composite, made

up of the required simple attributes.

In this step, it is important that we identify all simple attributes to be

represented in the conceptual data model including those attributes

that make up a composite attribute.

Single/multi-valued attributes

In addition to being simple or composite, an attribute can also be

single-valued or multi-valued (see Section 12.3.2Section 12.3.2). Most attributes

encountered will be single-valued, but occasionally a multi-valued

attribute may be encountered; that is, an attribute that holds multiple

values for a single entity occurrence. For example, we may identify

the attribute telNo (the telephone number) of the Client entity as a

multivalued attribute.

On the other hand, client telephone numbers may have been

identified as a separate entity from Client. This is an alternative, and

465

466

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31664
equally valid, way to model this. As you will see in Step 2.1Step 2.1, multi-

valued attributes are mapped to relations anyway, so both approaches

produce the same end result.

Derived attributes

Attributes whose values are based on the values of other attributes

are known as derived attributes (see Section 12.3.3Section 12.3.3). Examples of

derived attributes include:

• the age of a member of staff;

• the number of properties that a member of staff manages;

• the rental deposit (calculated as twice the monthly rent).

Often, these attributes are not represented in the conceptual data

model. However, sometimes the value of the attribute or attributes on

which the derived attribute is based may be deleted or modified. In

this case, the derived attribute must be shown in the data model to

avoid this potential loss of information. However, if a derived

attribute is shown in the model, we must indicate that it is derived.

The representation of derived attributes will be considered during

physical database design. Depending on how an attribute is used, new

values for a derived attribute may be calculated each time it is

accessed or when the value(s) it is derived from changes. However,

this issue is not the concern of conceptual database design, and is

discussed in more detail in Step 3.2Step 3.2 in Chapter 18Chapter 18.

Potential problems

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37897
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31687
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_028.xhtml#eid39916
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_028.xhtml#eid39671
When identifying the entities, relationships, and attributes for the

user views, it is not uncommon for it to become apparent that one or

more entities, relationships, or attributes have been omitted from the

original selection. In this case, return to the previous steps, document

the new entities, relationships, or attributes, and reexamine any

associated relationships.

As there are generally many more attributes than entities and

relationships, it may be useful to first produce a list of all attributes

given in the users’ requirements specification. As an attribute is

associated with a particular entity or relationship, remove the

attribute from the list. In this way, we ensure that an attribute is

associated with only one entity or relationship type and, when the list

is empty, that all attributes are associated with some entity or

relationship type.

We must also be aware of cases where attributes appear to be

associated with more than one entity or relationship type, as this can

indicate the following:

(1) We have identified several entities that can be represented as a

single entity. For example, we may have identified entities

Assistant and Supervisor both with the attributes staffNo (the staff

number), name, sex, and DOB (date of birth), which can be

represented as a single entity called Staff with the attributes

staffNo (the staff number), name, sex, DOB, and position (with values

Assistant or Supervisor). On the other hand, it may be that these

entities share many attributes but there are also attributes or

relationships that are unique to each entity. In this case, we must

decide whether we want to generalize the entities into a single

entity such as Staff, or leave them as specialized entities

representing distinct staff

roles. The consideration of whether to specialize or generalize

entities was discussed in Chapter 13Chapter 13 and is addressed in more

detail in Step 1.6Step 1.6.

(2) We have identified a relationship between entity types. In this

case, we must associate the attribute with only one entity, the

parent entity, and ensure that the relationship was previously

identified in Step 1.2Step 1.2. If this is not the case, the documentation

should be updated with details of the newly identified

relationship. For example, we may have identified the entities

Staff and PropertyForRent with the following attributes:

Staff staffNo, name, position, sex, DOB

PropertyForRent propertyNo, street, city, postcode,

type, rooms, rent, managerName

The presence of the managerName attribute in PropertyForRent is

intended to represent the relationship Staff Manages

PropertyForRent. In this case, the managerName attribute should be

omitted from PropertyForRent and the relationship Manages should

be added to the model.

DreamHome attributes for entities

466

467

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_022.xhtml#eid32613
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid37399
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid36940
For the StaffClient user views of DreamHome, we identify and

associate attributes with entities as follows:

Staff staffNo, name (composite: fName, IName),

position, sex, DOB

PropertyForRent propertyNo, address (composite: street,

city, postcode), type, rooms, rent

PrivateOwner ownerNo, name (composite: fName, IName),

address, telNo

BusinessOwner ownerNo, bName, bType, address, telNo,

contactName

Client clientNo, name (composite: fName,

IName), telNo, eMail

Preference prefType, maxRent

Lease leaseNo, paymentMethod, deposit (derived

as PropertyForRent.rent*2), depositPaid,

rentStart, rentFinish, duration (derived

as rentFinish – rentStart)

DreamHome attributes for relationships

Some attributes should not be associated with entities, but instead

should be associated with relationships. For the StaffClient user views

of DreamHome, we identify and associate attributes with

relationships, as follows:

Views viewDate, comment

Document attributes

As attributes are identified, assign them names that are meaningful to

the user. Record the following information for each attribute:

• attribute name and description;

• data type and length;

• any aliases that the attribute is known by;

• whether the attribute is composite and, if so, the simple attributes

that make up the composite attribute;

• whether the attribute is multi-valued;

• whether the attribute is derived and, if so, how it is to be

computed;

• any default value for the attribute.

467

468

Figure 16.4 Extract from the data dictionary for the StaffClient user views of DreamHome showing a

description of attributes.

Figure 16.4Figure 16.4 shows an extract from the data dictionary that

documents the attributes for the StaffClient user views of

DreamHome.

Step 1.4: Determine attribute domains

Objective

To determine domains for the attributes in the conceptual data

model.

The objective of this step is to determine domains for all the attributes

in the model (see Section 12.3Section 12.3). A domaindomain is a pool of values from

which one or more attributes draw their values. For example, we may

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid37276
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31591
define:

• the attribute domain of valid staff numbers (staffNo) as being a

five-character variable-length string, with the first two characters

as letters and the next one to three characters as digits in the

range 1–999;

• the possible values for the sex attribute of the Staff entity as being

either “M” or “F.” The domain of this attribute is a single character

string consisting of the values “M” or “F.”

A fully developed data model specifies the domains for each attribute

and includes:

• allowable set of values for the attribute;

• sizes and formats of the attribute.

Further information can be specified for a domain, such as the

allowable operations on an attribute, and which attributes can be

compared with other attributes or used in combination with other

attributes. However, implementing these characteristics of attribute

domains in a DBMS is still the subject of research.

Document attribute domains

As attribute domains are identified, record their names and

characteristics in the data dictionary. Update the data dictionary

entries for attributes to record their domain in place of the data type

and length information.

STEP 1.5: Determine candidate, primary, and alternate key attributes

Objective

To identify the candidate key(s) for each entity type and, if there is

more than one candidate key, to choose one to be the primary key and

the others as alternate keys.

This step is concerned with identifying the candidate key(s) for an

entity and then selecting one to be the primary key (see SectionSection

12.3.412.3.4). A candidate keycandidate key is a minimal set of attributes of an entity

that uniquely identifies each occurrence of that entity. We may

identify more than one candidate key, in which case we must choose

one to be the primary keyprimary key; the remaining candidate keys are called

alternate keysalternate keys.

468

469

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31717
People’s names generally do not make good candidate keys. For

example, we may think that a suitable candidate key for the Staff

entity would be the composite attribute name, the member of staff’s

name. However, it is possible for two people with the same name to

join DreamHome, which would clearly invalidate the choice of name as

a candidate key. We could make a similar argument for the names of

property owners. In such cases, rather than coming up with

combinations of attributes that may provide uniqueness, it may be

better to use an existing attribute that would always ensure

uniqueness, such as the staffNo attribute for the Staff entity and the

ownerNo attribute for the PrivateOwner entity, or define a new attribute

that would provide uniqueness.

When choosing a primary key from among the candidate keys, use the

following guidelines to help make the selection:

• the candidate key with the minimal set of attributes;

• the candidate key that is least likely to have its values changed;

• the candidate key with fewest characters (for those with textual

attribute(s));

• the candidate key with smallest maximum value (for those with

numerical attribute(s));

• the candidate key that is easiest to use from the users’ point of

view.

In the process of identifying primary keys, note whether an entity is

strong or weak. If we are able to assign a primary key to an entity, the

entity is referred to as being strong. On the other hand, if we are

unable to identify a primary key for an entity, the entity is referred to

as being weak (see Section 12.4Section 12.4). The primary key of a weak entity

can be identified only when we map the weak entity and its

relationship with its owner entity to a relation through the placement

of a foreign key in that relation. The process of mapping entities and

their relationships to relations is described in Step 2.1Step 2.1, and therefore

the identification of primary keys for weak entities cannot take place

until that step.

DreamHome primary keys

The primary keys for the StaffClient user views of DreamHome are

shown in Figure 16.5Figure 16.5. Note that the Preference entity is a weak entity

and, as identified previously, the Views relationship has two attributes,

viewDate and comment. 469

470

epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_021.xhtml#eid31795
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_027.xhtml#eid37897
epub://k27oy2ap24fbg1fg0erl.1.vbk/OPS/loc_026.xhtml#eid37389
Figure 16.5 ER diagram for the StaffClient user views of DreamHome with primary keys added.

Document primary and alternate keys

Record the identification of primary and any alternate keys in the data

dictionary.

STEP 1.6: Consider use of enhanced modeling concepts (optional step)

Objective

To consider the use of enhanced modeling concepts, such as

specialization/generalization, aggregation, and composition.

In this step, we have the option to continue the development of the ER

model using the enhanced modeling concepts discussed in ChapterChapter

1313, namely specialization/generalization, aggregation, and

composition. If we select the specialization approach, we attempt to

highlight differences between entities by defining one or more

subclassessubclasses of a superclasssuperclass entity. If we select the generalization

approach, we attempt to identify common features between entities to

define a generalizing superclass entity. We may use aggregation to

represent a ‘has-a’ or ‘is-part-of’ relationship between entity types,

where one represents the ‘whole’ and the other the ‘part’. We may use

composition (a special type of aggregation) to represent an association

between entity types where there is a strong ownership and

coincidental lifetime between the ‘whole’ and the ‘part’.

For the StaffClient user views of DreamHome, we choose to generalize

the two entities PrivateOwner and BusinessOwner to create a superclass

Owner that contains the common attributes ownerNo, address, and telNo.

The relationship that the Owner superclass has with its subclasses is

mandatory and disjoint, denoted as {Mandatory, Or}; each member of

the Owner superclass must be a member of one of the subclasses, but

Order from us and get better grades. We are the service you have been looking for.