Which is the better of the two? - (deprecated) XML Development forum at WebmasterWorld - WebmasterWorld

Which is the better of the two?

I have two ways of presenting an XML file. Which is better?

Demaestro

7:16 pm on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

So way one is:

<data>
<row phone_number="999-9999" city="Leduc" first_name="Viki" last_name="Vale" address2="" area_code="780" stage_name="Bat Babe" cast_member_id="1" address="8020 sparrow drive" province_id="1" transaction_reference_code="fgkjfgjfk" password="tester" email="#*$!X@example.com" />

<row phone_number="999-9999" city="Leduc" first_name="Viki" last_name="Vale" address2="" area_code="780" stage_name="Bat Babe" cast_member_id="1" address="8020 sparrow drive" province_id="1" transaction_reference_code="fgkjfgjfk" password="tester" email="#*$!X@example.com" />
</data>

The second way:

<data>
<row>
<phone_number>999-9999</phone_number>
<city>Leduc</city>
<first_name>Viki</first_name>
<last_name>Vale</last_name>
<address2/>
<area_code>780</area_code>
<stage_name>Bat Babe</stage_name>
<cast_member_id>1</cast_member_id>
<address>8020 sparrow drive</address>
<province_id>1</province_id>
<transaction_reference_code>fgkjfgjfk</transaction_reference_code>
</row>

<row>
<phone_number>999-9999</phone_number>
<city>Leduc</city>
<first_name>Viki</first_name>
<last_name>Vale</last_name>
<address2/>
<area_code>780</area_code>
<stage_name>Bat Babe</stage_name>
<cast_member_id>1</cast_member_id>
<address>8020 sparrow drive</address>
<province_id>1</province_id>
<transaction_reference_code>fgkjfgjfk</transaction_reference_code>
</row>

</data>

httpwebwitch

7:25 pm on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

I'd go with the second one

Demaestro

7:30 pm on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

Yeah, the more I look at other examples, the more I think the second is the way to go as well.

Thanks WebWitch..

Any other input is welcome.

cmarshall

7:57 pm on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

I agree.

Attributes are a bit more of a pain to work with (especially in XSLT and Schema) than elements.

fside

3:04 pm on Mar 14, 2008 (gmt 0)

10+ Year Member

> So way one is: <

If it's just holding the data, I think either would be fine. For processing, again, it's not complicated to use the "@" symbol and read attribute values. Perhaps one way to think of it is as a database, and a hierarchical one at that. So the attributes are attributes. The element is your record. You don't want to repeat your attributes unnecessarily, but rather use a pointer or relationship to transactions. So I might have the personal info in attributes, but anything that goes with multiple transactions would be inferior/child records. Phone number as well, because you might want home, office, cell, and emergency contact. It's just like a database, or like 'flattening' a relational db into its hierarchy. Addresses too: home, office, summer home, divorced husband, kid off at college. Maybe that would be too much. But that would be my suggestion. Think of it as a db.
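As a rough illustration of the point about "@", here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The trimmed-down rows reuse values from the original post; the variable names are made up for the example, and this is only one of several ways to read either layout.

```python
import xml.etree.ElementTree as ET

# Way one: values stored as attributes on <row> (trimmed to three fields).
attr_doc = ET.fromstring(
    '<data><row city="Leduc" first_name="Viki" area_code="780" /></data>'
)
# Way two: the same values stored as child elements.
elem_doc = ET.fromstring(
    "<data><row><city>Leduc</city><first_name>Viki</first_name>"
    "<area_code>780</area_code></row></data>"
)

# Attribute access: Element.get(), or an "@" predicate to filter rows.
city_from_attr = attr_doc.find("row").get("city")
rows_in_leduc = attr_doc.findall("row[@city='Leduc']")

# Child-element access: findtext() with a simple path.
city_from_elem = elem_doc.findtext("row/city")

print(city_from_attr, city_from_elem, len(rows_in_leduc))
```

Either form takes about one line to read, which fits the view that for plain storage the choice is mostly a matter of taste.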

cmarshall

3:20 pm on Mar 14, 2008 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Where you have problems is in mixing the two [webmasterworld.com].

I think elements are more flexible for XSLT and Schema. I don't know performance ramifications.

httpwebwitch

2:28 am on Mar 17, 2008 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Ooooh, yes. Heed that advice from cmarshall: if you mix the two, you may as well forget about using schema for validation.

I usually make the attribute vs childNode choice like this:

You can only use an attribute if there is one and only one possible value. I will always use an attribute if the attribute acts as an identifier.
for instance:
<elem id="123">
<webpage url="http:blahblahblah">
<book pagecount="432">
<person SIN="888999222333">

I'll use attributes when the data point is really closely bound semantically to the element, as in these examples.

But if there's a possibility that any of these data points would be plural, you must use a child node. And if the data point is not semantically tightly bound to the element, then I prefer to use a child node.
You mustn't, for instance, do this:
<person hobby="scuba" hobby="spherical trigonometry" hobby="oil painting" />

You must do this:
<person>
<hobby>scuba</hobby>
<hobby>spherical trigonometry</hobby>
<hobby>oil painting</hobby>
</person>
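To see why the repeated attribute is not an option, here is a small Python sketch (standard library only) that feeds both forms to a parser. The XML specification requires attribute names to be unique within a tag, so a conforming parser rejects the first form outright; the variable names here are just for illustration.

```python
import xml.etree.ElementTree as ET

# Repeated attribute name: not well-formed XML.
bad = '<person hobby="scuba" hobby="oil painting" />'
# Repeated child elements: perfectly fine.
good = (
    "<person><hobby>scuba</hobby><hobby>spherical trigonometry</hobby>"
    "<hobby>oil painting</hobby></person>"
)

try:
    ET.fromstring(bad)
    duplicate_rejected = False
except ET.ParseError:
    # The parser refuses the document because "hobby" appears twice.
    duplicate_rejected = True

# Plural values come back as an ordinary list of child elements.
hobbies = [h.text for h in ET.fromstring(good).findall("hobby")]
print(duplicate_rejected, hobbies)
```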

I favour using child Nodes in most situations, except, as noted, when the attribute is a unique identifier of some sort.

The choice is aesthetic, personal, and somewhat arbitrary. I don't think speed or optimization are a factor... I've never noticed any performance issues one way or another.

Another reason I like using childNodes? I use a "pretty print" formatter on XML which automatically indents nodes, and that makes it easier to scan your eyes over the data and see the rhythmic repetition of similar nodes.
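The post does not say which formatter is meant, so as a stand-in here is a minimal Python sketch using xml.dom.minidom from the standard library. It pretty-prints a trimmed row in both layouts so the difference in how they indent can be compared; the sample strings and variable names are assumptions for the example.

```python
from xml.dom import minidom

# The same trimmed record in both layouts.
as_elements = (
    "<data><row><city>Leduc</city><first_name>Viki</first_name></row></data>"
)
as_attributes = '<data><row city="Leduc" first_name="Viki" /></data>'

# toprettyxml() puts each element on its own indented line.
pretty_elements = minidom.parseString(as_elements).toprettyxml(indent="  ")
pretty_attributes = minidom.parseString(as_attributes).toprettyxml(indent="  ")

print(pretty_elements)
print(pretty_attributes)
```

With child elements, each field lands on its own indented line. With attributes, the whole record stays on one long line, because the formatter only breaks between elements.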

cmarshall

2:39 am on Mar 17, 2008 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

I use a "pretty print" formatter on XML which automatically indents nodes, and that makes it easier to scan your eyes over the data and see the rhythmic repetition of similar nodes

That's a pretty damn good reason, IMNSHO.

Demaestro

4:47 pm on Mar 17, 2008 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

I am finding the Util I use for recursively inserting the data into a table likes the second more as well.

WebWitch, I like your criteria for when to use an attribute. Static values seem to belong there.

<ingredient unit="cups">
Flour
</ingredient>

fside

9:16 pm on Mar 17, 2008 (gmt 0)

10+ Year Member

"Static values seem to belong there"

You mean column headings. Attribute names. I think what belongs as attributes are - attributes. Where you don't have simple data, or have long nested html, xml, etc. strings, then it's best to have these in inferior (child) elements. Any time you run into the limits of trying to hold a string within double quotes, it's a bow to the limitation, not the ought. Of course one can escape and unescape, and so on. But it might seem too much. As I said above, I think it's best to think of it as one would any db. But with that exception. So if only one address will ever be used, those can be attributes. If multiple addresses, it's best to think of that as a dependent table, and each set of addresses separately as a group of attributes for separate elements. Etc.
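One possible reading of this "dependent table" idea, sketched in Python with the standard library: the identifier stays an attribute on the record, while anything that can repeat becomes a child element carrying its own attributes. The phone element, its type attribute, the second phone number, and the note element are all hypothetical names and values invented for this illustration; they do not appear in the original data.

```python
import xml.etree.ElementTree as ET

# Single-valued identifier stays on the record as an attribute.
row = ET.Element("row", {"cast_member_id": "1"})
ET.SubElement(row, "first_name").text = "Viki"

# Repeatable data becomes child "records", each with its own attribute
# (hypothetical <phone type="..."> layout, made-up second number).
for kind, number in [("home", "999-9999"), ("cell", "999-0000")]:
    phone = ET.SubElement(row, "phone", {"type": kind})
    phone.text = number

# Free text with quotes and markup characters sits more comfortably in an
# element; the serializer escapes & and < without any manual work.
note_text = 'Prefers "Bat Babe" <on stage> & in print'
ET.SubElement(row, "note").text = note_text

xml_text = ET.tostring(row, encoding="unicode")
round_trip = ET.fromstring(xml_text)
phones = {p.get("type"): p.text for p in round_trip.findall("phone")}
print(xml_text)
print(phones)
```

Adding an office or emergency number later means adding one more child element, with no change to the record's attributes.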