Forum Moderators: open

Message Too Old, No Replies

Design a database of geographical locations

How to design the database of all the places in a country

         

coolclu3

8:00 am on Dec 29, 2008 (gmt 0)

10+ Year Member



Hi all,
I am trying to design a database to store information on 'places'. Naturally I would have

TABLE 'place' (
id,
name,
longitude,
latitude,
country,
state,
street name,
...
)

However, I realized that in a way, a state is also a 'place', so i thought of having a hierarchical place, where the table 'place' has one more column 'parent_id' and a column called 'relationship_with_parent'. So every place will have a parent place.

For example, the Google's headquater

Google Inc.
1600 Amphitheatre Parkway
Mountain View, CA 94043

would, possibly, be decomposed into 5 rows (with made up ids) as follows:
(id, name, parent_id, relationship_with_parent)
row1: 1, Google Headquater, 2, number 1600
row2: 2, Amphitheatre Parkway, 3, inside
row3: 3, Mountain View, 4, postcode 94043
row4: 4, California, 5, state of
row5: 5, US, 6, country of
row6: 6, America, none, top_level

As you can see, the 'relationship_with_parent' is very very roughly defined.

Can you, webmasters, please point out for me what would be the pros/cons of this design? Can this DB scale?

Thanks in advance
P/S: Is this somewhat similar to the schema for Google maps' DB, at all?

LifeinAsia

4:34 pm on Dec 29, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



The main problem I see with your second structure is that you have to parse out what each row of information is.

I would basically keep your original "place" table structure, but add a ParentID field to it. I have a separate table for states, which holds different information.

piatkow

7:48 pm on Dec 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Start by creating a properly normalised data model and the design will fall out of it naturally.

phranque

1:47 am on Dec 30, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



you should start with the canonical definition(s) you will require for a "place".
if a place is a "point on a map", you probably want to define that as latitude and longitude with the supporting textual descriptions such as the address, or "highest point in florida", etc.
but "place" can also be defined as a line, a boundary or an area, such as "Route 66" or "the shore of lake erie" or "los angeles county".

coolclu3

7:09 am on Dec 30, 2008 (gmt 0)

10+ Year Member



Thanks guys,
@phrangque: A place, in my case, is an "area", therefore including "points" on a map. So it could be a state, could be a small village, or it could be a specific location like Google's headquater mentioned above.
@piatkow: My 2nd design would be the most normalized. But, you know, the more normalized it is, the slower the SELECT will be
@LifeinAsia: I'm not sure what you mean by the first sentence. You mean, by looking at one row alone, you can't figure out what sort of place it is?
If so, that's probably why the 1st design is preffered. But the DB will grow huge, and somewhat redundant.
For example, take the houses numbered from 1 to 100 on one street as an example. All of those 100 rows will contain exactly the information on country, state ...
I know hard disks are cheap but is this gonna cause a problem in this case?
Imagine, I want to store all the information of all the spots in the US. I don't know, but it could be say 1 billion rows for the 1st design, and a lot more number of rows for the 2nd design (but with less columns)

And I'm not sure which would be the best option for a scalable DB in this case?

lavazza

8:54 am on Dec 30, 2008 (gmt 0)

10+ Year Member



For example, take the houses numbered from 1 to 100 on one street as an example. All of those 100 rows will contain exactly the information on country, state ...

I humbly suggest that you read up on normalisation and then heed piatkow's advice about creating a properly normalised data model, with emphasis on the term model [google.com]

coolclu3

9:46 am on Dec 30, 2008 (gmt 0)

10+ Year Member



Hello lavazz,
Actually, I have read about normalization. But I still don't know to which degree my db should be normalized or denormalized.

I am only extremely concerned with performance. One of the SELECT queries I might do, for example, is to find out the details of all the houses number 35 in California (in any street)

That's why I am asking here. Anyone please be kind enough to roughly suggest a schema?

(I hope i'm not so confusing..)

lavazza

10:13 am on Dec 30, 2008 (gmt 0)

10+ Year Member



Reading up on third normal form might help

Please note: db design is NOT a trivial task

This does not mean it's 'rocket science' - but there IS a learning curve

Once you know the rules, you can break them (denormalise) for speed... but... I have a hunch that - even with a gazillion records, the benefits of denormalisation may well be not only negligible but also negated by increased maintenance costs (labour and time)

Whatever... learn the rules first, or hire someone who knows 'em already

:)

LifeinAsia

4:19 pm on Dec 30, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



All of those 100 rows will contain exactly the information on country, state ...

There's nothing wrong with that per se.
It would be better to normalize it better like I said and have a States table and move the Country field to that table.
One of the SELECT queries I might do, for example, is to find out the details of all the houses number 35 in California (in any street)

If you're going to do a lot of queries like that (although I'm not sure why you would want to), then it would probably make sense to separate the number and street name into different fields in your table.

coolclu3

1:49 am on Jan 2, 2009 (gmt 0)

10+ Year Member



I will have a separate State table. So this is a kinda a hybrid between the 2 designs.

Thanks guys & happy new year!