Database Forum / General DB Topics / DB Theory / April 2004

newby (very) question on XML DB theory

ccc31807 - 16 Mar 2004 14:27 GMT
I have an assignment to survey XML database technologies in about
three weeks. I've looked closely at eXcite, reviewed XQuery and XPath
(and others) at w3.org, and essentially have tried to do all the due
diligence I can.

I *know* that I will get a question like this, "Where can I go to read
up on the theory of XML databases?" If I cite internet articles, I
will be told, "If it's not in print, it's worthless."

Question: does anyone suggest a source of XML DB theory, preferably in
journals of a professional and academic nature? IEEE and ACM are not a
problem.

Besides, I need to do a fast read on this, so I'm looking for some
substantial stuff that's not overly dense or highly theoretical.

Please reply to this message rather than privately. And accept my
gratitude in advance for all those who reply.

Thanks, Charles Carter.
Eric Kaun - 16 Mar 2004 17:35 GMT
> I have an assignment to survey XML database technologies in about
> three weeks. I've looked closely at eXcite, reviewed XQuery and XPath
[quoted text clipped - 8 lines]
> journals of a professional and academic nature? IEEE and ACM are not a
> problem.

The most lucid explanation of XML "data theory" is an article called
"The Essence of XML", by Philip Wadler, formerly of functional/monad "fame".

In addition, look for W3C articles on the XML Infoset. Finally, look for
articles by Don Chamberlin about XQuery. I think XQuery is a mess, but
those explain it fairly well.

> Besides, I need to do a fast read on this, so I'm looking for some
> substantial stuff that's not overly dense or highly theoretical.

So you're looking for theory that's not highly theoretical? Hmmm...

> Please reply to this message rather than privately. And accept my
> gratitude in advance for all those who reply.

- Eric
Mikito Harakiri - 16 Mar 2004 18:38 GMT
> I have an assignment to survey XML database technologies in about
> three weeks. I've looked closely at eXcite, reviewed XQuery and XPath
[quoted text clipped - 14 lines]
> Please reply to this message rather than privately. And accept my
> gratitude in advance for all those who reply.

Check out V. Vianu's review in SIGMOD Record:

http://www.acm.org/sigmod/record/issues/0306/D2-DBP.VictorVianu.pdf
ccc31807 - 30 Mar 2004 19:20 GMT
Thanks much for these responses. The nature of the project has
changed, but my assignment hasn't, and I've found several articles in
SIGMOD that were very helpful.

Some very smart people have told me that there can be no such thing as
a native XML database, but when pressed, they acknowledge that XML is
a superior way to mark up data. Because of the time factor, searches
will be done using program modules, but I'm still going ahead with the
presentation.

CC
Eric Kaun - 30 Mar 2004 21:26 GMT
> Thanks much for these responses. The nature of the project has
> changed, but my assignment hasn't, and I've found several articles in
[quoted text clipped - 5 lines]
> will be done using program modules, but I'm still going ahead with the
> presentation.

What is the meaning of "mark up", and why would one want to do it? What is
the difference between "marking up" and actually defining your data in some
rigorous way?  I assume you're talking about marking up TEXT, rather than
data.
ccc31807 - 08 Apr 2004 19:20 GMT
> What is the meaning of "mark up", and why would one want to do it? What is
> the difference between "marking up" and actually defining your data in some
> rigorous way?  I assume you're talking about marking up TEXT, rather than
> data.

In this case, the data is all text (ASCII or ISO-8859-1). The data
will mostly be searched. That is, it will be updated very seldom. The
results of searches need to be displayed via HTTP. In this case, it
just makes sense to create a little application that can search the
documents using XPath, and use CSS to display the resulting data.

We can assume that each record will look something like this:

<course no="CPSC101" credit="3" hours="3" lab="4">
  <title>Introduction to Programming Logic</title>
  <prerequisite>none</prerequisite>
  <description>Whatever the description is ... </description>
</course>
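
Editor's sketch of the "little application" described above: a minimal XPath search over records of this shape, using Python's standard library. The <catalog> wrapper element, the second course record, and the function name find_courses are assumptions added for illustration; they are not part of the original post.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the <catalog> root and the second <course>
# are invented for this example; the first record is from the post.
CATALOG = """
<catalog>
  <course no="CPSC101" credit="3" hours="3" lab="4">
    <title>Introduction to Programming Logic</title>
    <prerequisite>none</prerequisite>
    <description>Whatever the description is ... </description>
  </course>
  <course no="CPSC201" credit="4" hours="3" lab="2">
    <title>Data Structures</title>
    <prerequisite>CPSC101</prerequisite>
    <description>Hypothetical second record.</description>
  </course>
</catalog>
"""


def find_courses(xml_text, credit):
    """Return (course number, title) pairs for courses with the given credit."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a limited XPath subset, including
    # attribute-equality predicates such as [@credit='3'].
    matches = root.findall(f".//course[@credit='{credit}']")
    return [(c.get("no"), c.findtext("title")) for c in matches]


if __name__ == "__main__":
    for number, title in find_courses(CATALOG, "3"):
        print(number, title)
```

The (number, title) pairs could then be written into whatever HTML the HTTP front end serves and styled with CSS; that part is left out here.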

CC
Eric Kaun - 08 Apr 2004 20:31 GMT
> > What is the meaning of "mark up", and why would one want to do it? What is
> > the difference between "marking up" and actually defining your data in some
[quoted text clipped - 14 lines]
>    <description>Whatever the description is ... </description>
> </course>

Of course you can do it that way, although frequently-updated data aren't
the only data which benefit from a relational database. If you already have
it in that format, won't need to share the data with other programs, don't
need much efficiency, won't be doing ad hoc queries, etc. etc., then using
XPath on text in that format is the quickest way to achieve HTML output.
ccc31807 - 09 Apr 2004 14:26 GMT
Basically, we have a lot of data in documents with a highly irregular
structure that we need to make available over networks. Historically,
this evolved from printed text documents, to a spreadsheet, to a flat
file database, to a real database. The question is: can we do this
without the overhead of the RDBMS? My job was not to answer yes or no,
but to point the decision makers in the direction of the information.

We're fairly well versed in DB stuff but totally ignorant as to XML
database possibilities.

CC

> Of course you can do it that way, although frequently-updated data aren't
> the only data which benefit from a relational database. If you already have
> it in that format, won't need to share the data with other programs, don't
> need much efficiency, won't be doing ad hoc queries, etc. etc., then using
> XPath on text in that format is the quickest way to achieve HTML output.
Akmal B. Chaudhri - 09 Apr 2004 16:06 GMT
> Basically, we have a lot of data in documents with a highly irregular
> structure that we need to make available over networks. Historically,
[quoted text clipped - 5 lines]
> We're fairly well versed in DB stuff but totally ignorant as to XML
> database possibilities.

O.K. Two very good sites with resources on XML and Databases and XML
Databases, etc. [1], [2]. Several books also available. Try a search at
Amazon on XML Databases, for example, and you should get a couple of hits,
e.g. [3], [4], [5].

HTH

akmal

[1] http://www.rpbourret.com/xml/
[2] http://xml.coverpages.org/xmlAndDatabases.html
[3] Professional XML Databases (if you plan to use an RDBMS)
[4] XML Data Management (this is one I helped edit)
[5] Designing XML Databases (if you plan to use an RDBMS)
Eric Kaun - 12 Apr 2004 17:06 GMT
> Basically, we have a lot of data in documents with a highly irregular
> structure that we need to make available over networks. Historically,
[quoted text clipped - 5 lines]
> We're fairly well versed in DB stuff but totally ignorant as to XML
> database possibilities.

I'd recommend looking at the basics: www.dbdebunk.com will give you a very
critical view of XML, and will point you to the writings of Chris Date and
Fabian Pascal. Using an XML database assumes you'll only ever want to spit
out XML in the same way you stored it; you sacrifice the meaning of the data
to the god of presentation. Stick with a relational DBMS, and try some
different ones. Is raw speed your main requirement? Do clients always want
the data only in one format?

I'm sure others can point you to XML database information, but rest assured
it's got a very feeble foundation.

- Eric
Dawn M. Wolthuis - 12 Apr 2004 17:47 GMT
> > Basically, we have a lot of data in documents with a highly irregular
> > structure that we need to make available over networks. Historically,
[quoted text clipped - 11 lines]
> out XML in the same way you stored it; you sacrifice the meaning of the data
> to the god of presentation.

That is true of some databases that persist the documents as documents
without the ability to query the values stored within.  But you can have
both -- easy presentation of data the way that it will most likely benefit
the user and the ability to query the data stored in those documents
"simply" by storing the data in nested (graph/tree) structures.  A query
tool that has been available under many different names for almost (but not
quite) 40 years (!!!) is the query language associated with PICK (such as
UniQuery, English, Retrieve, Access, AQL, jQL, and many more).

> Stick with a relational DBMS, and try some
> different ones. Is raw speed your main requirement? Do clients always want
> the data only in one format?

I'm still thinking that is not your best bang for the buck.  I'm hoping the
XML databases get to the point where their query language is as easy as PICK
and have hope for them yet (in spite of those tags they drag along with
every value, ugh!)

> I'm sure others can point you to XML database information, but rest assured
> it's got a very feeble foundation.

Xindice (I think that is apache.org), Berkeley DB-XML at sleepycat.com and if
you check the w3c.org site it will point you to others, I'm pretty sure.
Good luck!  --dawn
Jan Hidders - 13 Apr 2004 21:18 GMT
> I'm still thinking that is not your best bang for the buck.  I'm hoping the
> XML databases get to the point where their query language is as easy as PICK
> and have hope for them yet (in spite of those tags they drag along with
> every value, ugh!)

You think XQuery is too difficult? By the way, what makes you think that
in XQuery / XML databases you have to drag tags along with values?

-- Jan Hidders
Dawn M. Wolthuis - 14 Apr 2004 06:26 GMT
> > I'm still thinking that is not your best bang for the buck.  I'm hoping the
> > XML databases get to the point where their query language is as easy as PICK
> > and have hope for them yet (in spite of those tags they drag along with
> > every value, ugh!)
>
> You think XQuery is too difficult?

I think it is a language for IT professionals and would like to see a
standard "end-user query language."  I'll admit that I haven't done enough
work with XQuery to see just how simple one could make queries against a
database by defining functions, virtual data, and "views" of the data so
that the user need not think in terms of navigating. I realize that GUI's
can be put "on top" of it as it is, but such a GUI would presumably be
proprietary rather than a standard end-user tool for database access.  Users
who can ask their database now with a 40-year-old language by typing
LIST COURSES WITH INSTRUCTOR_LAST_NAME LIKE "VAN..."
(even though the instructor last name is not stored in the courses "file")
might not find the comparable XQuery statement an advance in query
languages.  I'm optimistic that after I dig into it further it will become
clearer to me how XQuery will really BE an advance.  I DEFINITELY like the
fact that it reads non-1NF data.

> By the way, what makes you think that
> in XQuery / XML databases you have to drag tags along with values?

I don't think you need to do so for XQuery, but I did think that was the
case with XML databases, or at least that your input to and output from the
databases using "read" and "writes" would retrieve XML documents.  I haven't
coded anything to use an XML database, so I could be completely wrong on
that -- feel free to enlighten me.  Thanks.  --dawn
Laconic2 - 14 Apr 2004 13:15 GMT
Why does it have to be a language?

Why can't it just be some kind of point-and-click, drag-and-drop,
pick-from-the-menu, graphics-oriented package like PowerPlay? Once you have
the data loaded, slicing and dicing it any way you want, and packaging and
presenting it, is child's play.

Isn't that the whole idea?  To turn data query into an arcade game?
Dawn M. Wolthuis - 14 Apr 2004 15:55 GMT
> Why does it have to be a language?
>
[quoted text clipped - 4 lines]
>
> Isn't that the whole idea?  To turn data query into an arcade game?

Yes, but have you seen any industry standards for query GUI's?  --dawn
Laconic2 - 14 Apr 2004 16:27 GMT
> Yes, but have you seen any industry standards for query GUI's?  --dawn

No, and that's the whole point.  If you share persistent data, and you don't
have standards, you get an unrecognizable mess.

If you impose rigid standards on data as presented to the user, you
prevent the user from "doing whatever the user wants".

I wish there could be a user interface that would make the data "look like"
PICK data to you, while the real underlying data is actually in 1NF, and
maybe more. Or how about a PICK with an SQL interface?
Dawn M. Wolthuis - 14 Apr 2004 16:35 GMT
> > Yes, but have you seen any industry standards for query GUI's?  --dawn
>
[quoted text clipped - 7 lines]
> PICK data to you,  while the real underlying data is actually in 1NF,  and
> maybe more.  Or how about a PICK with an SQL interface?

I've "done" both and I'm thankful that with "XML data model" languages (no
need to correct me on that) on the horizon, there will be no need to take
the non-1NF data and translate it to 1NF for any purposes.  I have a few
issues with various web services standards (so don't get me started) but I
sure appreciate that we don't have to 1NF the data anywhere in the
process.  --dawn
Eric Kaun - 14 Apr 2004 21:17 GMT
> I've "done" both and I'm thankful that with "XML data model" languages (no
> need to correct me on that) on the horizon, there will be no need to take
> the non-1NF data and translate it to 1NF for any purposes.  I have a few
> issues with various web services standards (so don't get me started) but I
> sure appreciate that we don't have to 1NF the data anywhere in the
> process.  --dawn

It seems a huge price to pay for avoiding 1NF... don't throw the baby out
with the bathwater.
Anthony W. Youngman - 28 Apr 2004 18:06 GMT
>> I've "done" both and I'm thankful that with "XML data model" languages (no
>> need to correct me on that) on the horizon, there will be no need to take
[quoted text clipped - 5 lines]
>It seems a huge price to pay for avoiding 1NF... don't throw the baby out
>with the bathwater.

1NF is a huge price to pay for having normalised data ... if you've got
any sense you'll throw the cuckoo out the nest :-)

Cheers,
Wol
Anthony W. Youngman - wol at thewolery dot demon dot co dot uk
HEX wondered how much he should tell the Wizards. He felt it would not be a
good idea to burden them with too much input. Hex always thought of his reports
as Lies-to-People.
The Science of Discworld : (c) Terry Pratchett 1999

Anthony W. Youngman - 28 Apr 2004 18:04 GMT
>> Yes, but have you seen any industry standards for query GUI's?  --dawn
>
[quoted text clipped - 7 lines]
>PICK data to you,  while the real underlying data is actually in 1NF,  and
>maybe more.  Or how about a PICK with an SQL interface?

In other words, any modern Pick?

AFAIK they pretty much all work fine with SQL - it's just that their
native query language is easier to use (if you're querying, not
updating).

Cheers,
Wol
Jan Hidders - 14 Apr 2004 21:55 GMT
>>You think XQuery is too difficult?
>
[quoted text clipped - 7 lines]
> who can ask their database now with a 40-year-old language by typing
> LIST COURSES WITH INSTRUCTOR_LAST_NAME LIKE "VAN..."

//Courses[starts-with(.//@instructor_last_name, "VAN")]

I've taught this stuff to CS students and non-CS students. I have no
idea why you think this would be too difficult for the latter. I do,
however, have an idea why a non-declarative query language that requires
programming if queries get a little bit more difficult would be
problematic for them.

Not that I would claim that XQuery is as simple as can be. Far from that.
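
Editor's sketch for readers who want to try the filter expressed by the path expression above without an XQuery engine. Python's standard xml.etree.ElementTree implements only a subset of XPath and has no starts-with() function, so the predicate is applied in plain Python instead. The sample document, the <Section> element, and the function name are hypothetical; the any() test is also a simplification of how XPath or XQuery would treat a course with several matching attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: only the element name Courses and the attribute
# instructor_last_name come from the thread; the rest is invented.
DOC = """
<Catalog>
  <Courses no="CPSC101">
    <Section instructor_last_name="VAN DYKE"/>
  </Courses>
  <Courses no="CPSC201">
    <Section instructor_last_name="SMITH"/>
  </Courses>
  <Courses no="CPSC301">
    <Section instructor_last_name="VANCE"/>
  </Courses>
</Catalog>
"""


def courses_with_instructor_prefix(xml_text, prefix):
    """Approximate //Courses[starts-with(.//@instructor_last_name, prefix)]."""
    root = ET.fromstring(xml_text)
    hits = []
    for course in root.iter("Courses"):
        # course.iter() visits the element itself and all descendants,
        # mirroring the descendant-or-self step in .//@instructor_last_name.
        names = (el.get("instructor_last_name") for el in course.iter())
        if any(n is not None and n.startswith(prefix) for n in names):
            hits.append(course.get("no"))
    return hits


if __name__ == "__main__":
    print(courses_with_instructor_prefix(DOC, "VAN"))
```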

>>By the way, what makes you think that
>>in XQuery / XML databases you have to drag tags along with values?
>
> I don't think you need to do so for XQuery, but I did think that was the
> case with XML databases, or at least that your input to and output from the
> databases using "read" and "writes" would retrieve XML documents.

?? Obviously you have to indicate which documents you are querying. What
does that have to do with dragging tags along?

-- Jan Hidders
Dawn M. Wolthuis - 14 Apr 2004 22:42 GMT
<snip>

From:

>> LIST COURSES WITH INSTRUCTOR_LAST_NAME LIKE "VAN..."

to:

> //Courses[starts-with(.//@instructor_last_name, "VAN")]

That's how far we have moved in 40 years?  It doesn't look like it is even
in a forward direction, does it?  Read each out loud.  Are you pleased with
that?  Look at where hardware has come during those same 40 years!
Cheers!  --dawn

<snip>
Jan Hidders - 14 Apr 2004 23:24 GMT
> <snip>
>
[quoted text clipped - 7 lines]
>
> That's how far we have moved in 40 years?

No, it's just a single XQuery expression that shows that a certain query
is not as complicated in XQuery as you implied it was. Why you or anyone
would think that one such example describes completely and in all
aspects "how far we have moved" is beyond me and strikes me as a bit
simplistic.

-- Jan Hidders
Dawn M. Wolthuis - 15 Apr 2004 00:21 GMT
> > <snip>
> >
[quoted text clipped - 13 lines]
> aspects "how far we have moved" is beyond me and strikes me as a bit
> simplistic.

We are not yet on the same wavelength on this one, Jan.  Please understand
that I'm excited about the possibilities with XQuery more than your average
SQL-kinda-guy.  I have not used XQuery yet, but I have read much of "XQuery
from the Experts" and have verified some of the aspects I care about, such
as use of a 2-valued logic.  I'm not concerned about the complexity of the
language "for me" but am lamenting the loss that I can see exemplified (not
proven) in what you see above.  This is the change from the language written
as GIRLS in the mid-60's to XQuery 40 years later, both working against very
similar data models.  XQuery has many features that GIRLS (the PICK query
language) lacks and will be, on the whole, a move forward.

But, I don't think this is just one of those "I like my tools because I know
them" issues -- I think that everyone here in Iowa could see what I mean by
reading those statements -- can you see it and understand at least a little
bit my lament on this?  Couldn't we retain the human-language-like nature of
both PICK and to a lesser extent, SQL, and still accomplish the goals of
XQuery?  Must it look and read like the language of a computer, even though
we know it is?
--dawn
Jan Hidders - 15 Apr 2004 20:08 GMT
> But, I don't think this is just one of those "I like my tools because I know
> them" issues -- I think that everyone here in Iowa could see what I mean by
> reading those statements -- can you see it and understand at least a little
> bit my lament on this?

Absolutely.

> Couldn't we retain the human-language-like nature of
> both PICK and to a lesser extent, SQL, and still accomplish the goals of
> XQuery?

No. But I'm quite sure that for a very limited subset one could come up
with an elegant and more natural language interface.

-- Jan Hidders
Mikito Harakiri - 14 Apr 2004 22:59 GMT
> //Courses[starts-with(.//@instructor_last_name, "VAN")]
>
[quoted text clipped - 3 lines]
> programming if queries get a little bit more difficult would be
> problematic for them.

May I suggest that XQuery is more complex than SQL? Because it's less
pure:-?

Take outer join, for example. As purity is broken (by allowing nulls into
result set) selection and outer join operation don't commute anymore. It
took me some time (with usenet help) to realize that

select * from t1 left join t2 on t1.id=t2.id and t2.id=2

is different from

select * from t1 left join t2 on t1.id=t2.id where t2.id=2

for example. This is never an issue with ordinary joins and selections.
Simplicity of the underlying algebra is the key for the query language
success. (Hmm, what about SQL?)
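
Editor's sketch of the difference between the two queries above, run against an in-memory SQLite database. The table contents (ids 1 and 2 in both tables), the added ORDER BY, and the function name are assumptions chosen to make the effect visible; other SQL engines should behave the same way for this case, but only SQLite is exercised here.

```python
import sqlite3


def compare_left_join_filters():
    """Run both left-join variants on tiny hypothetical tables t1 and t2."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE t1 (id INTEGER);
        CREATE TABLE t2 (id INTEGER);
        INSERT INTO t1 VALUES (1), (2);
        INSERT INTO t2 VALUES (1), (2);
        """
    )
    # Filter inside ON: every t1 row survives; unmatched rows get NULLs.
    on_rows = conn.execute(
        "select * from t1 left join t2 on t1.id=t2.id and t2.id=2 "
        "order by t1.id"
    ).fetchall()
    # Filter in WHERE: applied after the join, so NULL-padded rows are dropped.
    where_rows = conn.execute(
        "select * from t1 left join t2 on t1.id=t2.id where t2.id=2 "
        "order by t1.id"
    ).fetchall()
    conn.close()
    return on_rows, where_rows


if __name__ == "__main__":
    on_rows, where_rows = compare_left_join_filters()
    print("filter in ON:   ", on_rows)
    print("filter in WHERE:", where_rows)
```

With this data the first query keeps the t1 row with id 1 and pads the t2 column with NULL, while the second query discards that row because NULL = 2 is not true.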
Jan Hidders - 15 Apr 2004 00:08 GMT
>>//Courses[starts-with(.//@instructor_last_name, "VAN")]
>>
[quoted text clipped - 5 lines]
>  
> May I suggest that XQuery is more complex than SQL?

You may. :-) Seriously though, who has claimed otherwise?

> Take outer join, for example. As purity is broken (by allowing nulls into
> result set) selection and outer join operation don't commute anymore. It
[quoted text clipped - 7 lines]
>
> for example. This is never an issue with ordinary joins and selections.

I'm not sure that is a good example because null values (for XML:
undefined attributes or missing elements) are actually dealt with in
XQuery in a very clear and consistent manner. But if your point is that
XML is a much more complicated data model and therefore its query
language in general a much more complicated query language, then, yes, I
would certainly agree.

> Simplicity of the underlying algebra is the key for the query language
> success. (Hmm, what about SQL?)

Actually the original definitions of semistructured data models were
really quite simple and elegant, and the corresponding languages based
on solid theory. But then XML came along and the whole thing got messy.
Sounds familiar, no?

-- Jan Hidders
Timothy J. Bruce - 15 Apr 2004 19:47 GMT
Mr. Carter:

You have been sent on a fool's errand.  
There is no such animal as an XMLDBMS.  While it is true that anything
which stores data could be called a database, and thus you may indeed
have an XML database by virtue of having any XML file, data existing
on its own in a vacuum is useless w/o a _formal_ theory and definition
(hint: it consists of one sentence and three bullets).  While there
are RDBMS and SQLDBMS systems I have yet to see an XMLDBMS.

Maybe you should make one,
Timothy J. Bruce
uniblab@hotmail.com
</RANT>
 