In this assignment you will develop a data model from a file of un-normalized data. Later we will load and normalize the data.
The data is a simplified extraction of the UNESCO World Heritage Sites registry. The un-normalized data is provided as both a spreadsheet and a CSF file:
The columns in the data are as follows:
name,description,justification,year,longitude,latitude,
area_hectares,category,states,region,iso
You are to design a database model that represents this flat data across multiple tables using "third-normal form" - which basically means that columns that have vertical duplication, such as region need to be placed in their own table and linked into the main table using a foreign key.
category states region iso
Cultural Afghanistan Asia and the Pacific af
Cultural Afghanistan Asia and the Pacific af
Cultural Albania Europe and North America al
Cultural Albania Europe and North America al
Cultural Algeria Arab States dz
Mixed Algeria Arab States dz
Cultural Algeria Arab States dz
Cultural Algeria Arab States dz
You will model diagram that describes the tables, one-to-many relationships, and foreign keys sufficient to represent this data efficiently with no vertical duplication. This assignment does not need any many-to-many relationships.
Name the first table site
, use singular names for all of the table
names. Use the exact name of the column for the model field names and
foreign key names. Even though the data labels one column as states
,
name your table state
and the foreign key state_id
.
(1) Create a picture of the tables, fields, and relationships. Make sure all the columns in the original data are represented in the diagram. You can use an online diagramming tool like https://dbdiagram.io/home or you can draw a picture on paper, take a phot and turn in an image.
(2) Develop all of the CREATE TABLE
statements needed to construct the tables following
the conventions for field naming used in the lectures. (a) Primary keys should be id
,
(b) logical keys should be UNIQUE NOT NULL
, and (c) foreign keys should be named table_id
based
on the name of the destination table. You should make sure your create statements work
by running then in PostgreSQL.