
Database Forum / General DB Topics / DB Theory / March 2004

Testing Various Data Models?

Dawn M. Wolthuis - 09 Mar 2004 18:12 GMT
If we were to find a sponsor for a competition where the winner would be a
named data model and the primary criteria would be related to the cost to
an institution choosing to use that data model (including costs if there is
a lack of quality or breach of security or difficulty in maintaining it,
...), what should that competition include?

I would very much like to see such a competition because my intuition,
without sufficient scientifically gathered empirical data, is that the
relational model ought to be retired.  I realize there are logical arguments
for doing so, but even my own logical arguments don't prove to me that any
other model (particularly those modeled on mathematical graphs or di-graphs)
is any better, only that each has merits and demerits.  I would like some
real data - how could we collect such?

Thanks.  --dawn
Mikito Harakiri - 09 Mar 2004 18:29 GMT
> I would very much like to see such a competition because my intuition,
> without sufficient scientifically gathered empirical data, is that the
> relational model ought to be retired.

Did you receive ACM Turing award recently? Because, this kind of statement
can be taken seriously only as a part of ACM Turing award speech.
Dawn M. Wolthuis - 09 Mar 2004 18:50 GMT
> > I would very much like to see such a competition because my intuition,
> > without sufficient scientifically gathered empirical data, is that the
> > relational model ought to be retired.
>
> Did you receive ACM Turing award recently? Because, this kind of statement
> can be taken seriously only as a part of ACM Turing award speech.

Being one of the few with "women's intuition" who speak up on this list ;-)
I'll add to it that I really, truly did dream that a person was dead and
awoke to hear that they were.  So, even if you don't take my intuition
seriously, I do.  smiles.  --dawn
P.S.  Even if someone has a hunch with no backing whatsoever (which is not
the case here), it is a good idea to find a way to test out such a hunch,
right?
Mikito Harakiri - 09 Mar 2004 19:11 GMT
> P.S.  Even if someone has a hunch with no backing whatsoever (which is not
> the case here), it is a good idea to find a way to test out such a hunch,
> right?

For hunch-backed starting joining things might help.
Dawn M. Wolthuis - 09 Mar 2004 20:09 GMT
> > P.S.  Even if someone has a hunch with no backing whatsoever (which is not
> > the case here), it is a good idea to find a way to test out such a hunch,
> > right?
>
> For hunch-backed starting joining things might help.

The "hunch-backed" is clever, but what in the world does the rest of this
mean?  How would you propose testing out various data model theories for
their usefulness in software applications?  --dawn
Christopher Browne - 09 Mar 2004 21:13 GMT
>> > P.S.  Even if someone has a hunch with no backing whatsoever (which is
> not
[quoted text clipped - 7 lines]
> mean?  How would you propose testing out various data model theories for
> their usefulness in software applications?  --dawn

_You're_ the one who was proposing the grand announcement.

Responsibility for having a strategy to evaluate the merits of
alternatives therefore rests on you, not on anyone else.
Signature

(format nil "~S@~S" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/internet.html
If a cow laughed, would milk come out its nose?

Dawn M. Wolthuis - 09 Mar 2004 23:17 GMT
<snip>
> _You're_ the one who was proposing the grand announcement.
>
> Responsibility for having a strategy to evaluate the merits of
> alternatives therefore rests on you, not on anyone else.

Could you point me to the empirical data that proves that the relational
model is better for someone using it than any other?  I am unaware of any
previous tests, but would be happy to find out that there is such evidence.

Also, I'm quite certain that I am not the only one reading this news group
that believes there are other non-relational models that hold significant
advantages to companies employing them.  So, no matter what model you think
might be a good one, whether relational or not, how would you go about
proving it to those who might wish to employ it?  This isn't my issue -- it
is an industry issue.

--dawn
Christopher Browne - 10 Mar 2004 04:10 GMT
After takin a swig o' Arrakan spice grog, "Dawn M. Wolthuis" <dwolt@tincat-group.com> belched out:
>> Martha Stewart called it a Good Thing when "Dawn M. Wolthuis"
> <dwolt@tincat-group.com> wrote:
[quoted text clipped - 8 lines]
> am unaware of any previous tests, but would be happy to find out
> that there is such evidence.

No, that's your responsibility.

> Also, I'm quite certain that I am not the only one reading this news
> group that believes there are other non-relational models that hold
> significant advantages to companies employing them.  So, no matter
> what model you think might be a good one, whether relational or not,
> how would you go about proving it to those who might wish to employ
> it?  This isn't my issue -- it is an industry issue.

You keep on claiming such a belief; if it be true, then you should
surely be able to locate quantifiable reports to prove your belief.

It's YOUR claim; proof is YOUR responsibility.  Establishing metrics
to support your claims is YOUR responsibility.
Signature

(reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/sgml.html
The way to a man's heart is through the left ventricle.

Dawn M. Wolthuis - 10 Mar 2004 04:20 GMT
> After takin a swig o' Arrakan spice grog, "Dawn M. Wolthuis" <dwolt@tincat-group.com> belched out:
> >> Martha Stewart called it a Good Thing when "Dawn M. Wolthuis"
[quoted text clipped - 24 lines]
> It's YOUR claim; proof is YOUR responsibility.  Establishing metrics
> to support your claims is YOUR responsibility.

You mean just like the relational theorists have proven that the
implementations of the model provide a better bang for the buck for those
who employ them than any other models?  I have never seen any such facts,
have you?  --dawn
Archibald Tuttle - 10 Mar 2004 11:41 GMT
>You mean just like the relational theorists have proven that the
>implementations of the model provide a better bang for the buck for those
>who employ them than any other models?  I have never seen any such facts,
>have you?  --dawn

Let's see: my half-tera operational store sits on an RDBMS in a physical schema
implemented from a relational model; the decision support system aggregates
data extracted from that schema into a different schema (star), which was
_physically_ implemented in a different way to enhance performance.

Does it work well? Yes.
Did it cost much? Yes.
Does it provide value for money? Yes.
Would other multi-valued, hierarchic, object technologies have provided
a better ROI? I don't think so.

These are not proofs of course, but at least they are facts for me.

Mr. Tuttle
Marshall Spight - 10 Mar 2004 04:29 GMT
> I would very much like to see such a competition because my intuition,
> without sufficient scientifically gathered empirical data, is that the
> relational model ought to be retired.

That's kind of like saying set theory ought to be retired.

I really think the intuition you're having has nothing to do
with the data model per se. I think it has everything to do
with the kinds of tools and integration you are used to.
These tend not to be so great when mixing, say, C++ or
Java with Oracle or MySQL.

Marshall
Dawn M. Wolthuis - 10 Mar 2004 04:39 GMT
> > I would very much like to see such a competition because my intuition,
> > without sufficient scientifically gathered empirical data, is that the
> > relational model ought to be retired.
>
> That's kind of like saying set theory ought to be retired.

Yes, what I should have said was that it was implementations of the
relational model that should be retired.  However, a theory about mass
murderers that isn't the best at solving such crimes is not the theory I
would want to employ.  So, I think it is not a huge stretch to suggest we
retire the relational model for the purpose of modeling data to be stored
and retrieved in databases.

smiles.  --dawn
Eric Kaun - 10 Mar 2004 13:36 GMT
> > "Dawn M. Wolthuis" <dwolt@tincat-group.com> wrote in message
> news:c2l1dq$2q3$1@news.netins.net...
[quoted text clipped - 7 lines]
> Yes, what I should have said was that it was implementations of the
> relational model that should be retired.

Well... there's only one, and it's a little young for retirement.

> However, a theory about mass
> murderers that isn't the best at solving such crimes is not the theory I
> would want to employ.

Computing depends on many theories, just as criminal investigation depends
on forensics (which incorporates biology, chemistry, physics in general,
fluid dynamics, etc. etc. etc.), psychology, sociology, etc. etc. etc. It's
a complex discipline which relies on the stability and predictability of
underlying science to work properly.

Now, if Zippy the Wonder Psychic comes along and is able to solve crime
after crime using dowsing rods and crystal balls, that doesn't really speak
to the repeatability of the "process." If Zippy (and no, I'm not comparing
Zippy to you) arranges a competition of psychics against legitimate
investigators, and the psychics happen to win, that still suggests very
little. There needs to be an element of fundamental scientific truth
underlying such things. Even investigators with good hunches need to
substantiate their guesses, and they use science for that.

So this competition you propose would be very difficult to orchestrate in a
way that properly measures anything, and you'd still need to somehow explain
a success in terms of... well, something basic and understood like math and
science.

> So, I think it is not a huge stretch to suggest we
> retire the relational model for the purpose of modeling data to be stored
> and retrieved in databases.

Many have suggested it. All are wrong, for reasons spelled out many places,
even in these newsgroups. The transformations done in computing are for the
most part straightforward, at least in business information systems, and
demand a solid foundation, not a practice founded on inexplicable theory and
needless complexity.

As an aside, another reading suggestion for all: Problem Frames, by Michael
Jackson (the S/W engineer, not the pop star or the beer expert). An
excellent application of domains and predicates to problem analysis rather
than data modeling.

- Eric
Marshall Spight - 10 Mar 2004 22:48 GMT
> Yes, what I should have said was that it was implementations of the
> relational model that should be retired.

I'm on board if we can start with MySQL.

> However, a theory about mass
> murderers that isn't the best at solving such crimes is not the theory I
> would want to employ.

Why are all the examples lately about people with empty eye sockets
or mass murderers or whatever? I regret that I was ever bored by the
parts/suppliers example database. At least it never made me queasy.

Marshall

PS. Sorry for being totally non-responsive; I'm a little woozy from all
the medicines I'm taking for this cold.
Christopher Browne - 11 Mar 2004 00:52 GMT
In the last exciting episode, "Marshall Spight" <mspight@dnai.com> wrote:

>> Yes, what I should have said was that it was implementations of the
>> relational model that should be retired.
>
> I'm on board if we can start with MySQL.

I'm not sure the implementors ever seriously claimed it was one.  They
bill it as "The World's Most Popular Open Source Database," and seem
as uninterested in discussion of things "relational" as any of the
other vendors of databases that are only nominally associated with the
relational model.

And in any case, their next project looks likely to be "MySQL is
obsolete - long live MaxDB!"

> PS. Sorry for being totally non-responsive; I'm a little woozy from
> all the medicines I'm taking for this cold.

I quite understand that...
Signature

let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;;
http://www3.sympatico.ca/cbbrowne/rdbms.html
'Mounting' is used for three things: 'getting up' onto horses,
'hooking' hard disks into file systems, and, well, 'mounting'
during sex.

Dan - 10 Mar 2004 07:14 GMT
> If we were to find a sponsor for a competition where the winner would be a
> named data model and the primary criteria would be related to the cost to
> an institution choosing to use that data model (including costs if there is
> a lack of quality or breach of security or difficulty in maintaining it,
> ...), what should that competition include?

Five things I would like to see included, at the very least:

1.  Costs and resources incurred on applications as a result of changes of
internal database organization, in turn due to evolving business models and
requirements.  An empirical measurement of reengineering costs for both
database and any and all affected applications - for both small and large
changes.  This would include the costs of adding new applications to share
the same data with different access requirements and having competing or
ancillary data quality constraints.

2.  Cost comparisons in providing for direct ad-hoc access of the data model
to users, including implicit requirements of whether the data model is with
or without requiring intervention and high-cost assistance through
programmers.  An empirical analysis of the cost of being able to ask any
relevant question over varying levels of complexity.

3.  Costs associated with integrating data structures across remote systems,
including the presentation of integrated views across distributed model
elements, though not necessarily semantically, structurally, or
intensionally analogous.  Costs could be associated with ease of creating a
distributed system view (i.e. view creation versus hard-coding integrating
middleware).

4.  Capabilities and costs associated with defining, maintaining, and
presenting information from model elements based on ternary keys and higher;
returned with information associated with data elements from referenced data
structures (three if ternary, etc.).

5.  Costs, resources, and damage associated with the ability or lack of
ability of the model to maintain data integrity, especially in cases where
redundancy exists or cannot be avoided because of the intricacies or
limitations of the model itself.
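
One way the five cost categories above could be tallied per entry is sketched here in Python. The category names, the equal default weights, and the `Entry` structure are illustrative assumptions, not part of any agreed competition format; real measurements would have to come from the competition itself.

```python
from dataclasses import dataclass, field

# Hypothetical labels paraphrasing the five criteria in the list above.
CATEGORIES = (
    "schema_change",   # 1. reengineering cost after requirements change
    "ad_hoc_access",   # 2. cost of answering ad hoc questions
    "integration",     # 3. cost of integrating remote data structures
    "compound_keys",   # 4. cost of handling ternary-and-higher keys
    "integrity",       # 5. cost or damage from integrity failures
)


@dataclass
class Entry:
    team: str
    data_model: str
    costs: dict = field(default_factory=dict)  # category -> measured cost


def total_cost(entry, weights=None):
    """Weighted sum of the measured costs; lower is better."""
    weights = weights or {c: 1.0 for c in CATEGORIES}
    missing = [c for c in CATEGORIES if c not in entry.costs]
    if missing:
        raise ValueError(f"{entry.team} has no measurement for {missing}")
    return sum(weights[c] * entry.costs[c] for c in CATEGORIES)


def rank(entries, weights=None):
    """Entries ordered from cheapest to most expensive."""
    return sorted(entries, key=lambda e: total_cost(e, weights))
```

Refusing to score an entry with a missing category is a deliberate choice in this sketch: a team that skips a criterion should not look cheaper than one that measured it.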

Smiles.

- Dan

> Thanks.  --dawn
Dawn M. Wolthuis - 10 Mar 2004 14:33 GMT
> > If we were to find a sponsor for a competition where the winner would be a
> > named data model and the primary criteria would be related to the cost to
[quoted text clipped - 4 lines]
> >
> Five things I would like to see included, at the very least:

Thank you, thank you for actually addressing the question, Dan!

> 1.  Costs and resources incurred on applications as a result of changes of
> internal database organization, in turn due to evolving business models and
[quoted text clipped - 3 lines]
> the same data with different access requirements and having competing or
> ancillary data quality constraints.

If we had an initial set of requirements, a single change to those
requirements prior to the implementation and three changes after the initial
setup, would that be sufficient for such a competition?  One of the
post-implementation requirements changes would be the one related to your
last statement.

> 2.  Cost comparisons in providing for direct ad-hoc access of the data model
> to users, including implicit requirements of whether the data model is with
> or without requiring intervention and high-cost assistance through
> programmers.  An empirical analysis of the cost of being able to ask any
> relevant question over varying levels of complexity.

After the initial implementations, an independent group would then do ad hoc
queries of various types and we would determine the skillsets required to
complete each query.  However, any that require training of the end-users or
where experience really counts might need some pre-trained users???

> 3.  Costs associated with integrating data structures across remote systems,
> including the presentation of integrated views across distributed model
> elements, though not necessarily semantically, structurally, or
> intensionally analogous.  Costs could be associated with ease of creating a
> distributed system view (i.e. view creation versus hard-coding integrating
> middleware).

This one is difficult to determine what to measure.  We could view the
entire collection of implementations as a distributed set of data, but then
if implementation A is better (by whatever measures we include) at
aggregating data across all disparate data sources, what does that say about
data model or even database implementation A?  What if database A is not
involved in the solution any more than database B?  It would say that team
A's overall solution "won" in this part of the competition, but it might say
nothing about data model A, right?

> 4.  Capabilities and costs associated with defining, maintaining, and
> presenting information from model elements based on ternary keys and higher;
> returned with information associated with data elements from referenced data
> structures (three if ternary, etc.).

These are quite implementation-specific criteria, but if all n
implementations of a specific model "lose" to all of another model, then I
guess it does relate to the usefulness of the model, as performance in
general does.
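
The "all n implementations of one model lose to all of another" test can be stated precisely. The following is a minimal Python sketch, assuming each implementation ends up with a single cost figure where lower is better; the `scores` layout and the function name are hypothetical.

```python
def dominates(scores, model_a, model_b):
    """True if every implementation of model_a costs strictly less than
    every implementation of model_b.

    scores maps an implementation name to a (data_model, cost) pair.
    """
    costs_a = [cost for model, cost in scores.values() if model == model_a]
    costs_b = [cost for model, cost in scores.values() if model == model_b]
    if not costs_a or not costs_b:
        raise ValueError("both models need at least one implementation")
    # Comparing the worst of A with the best of B covers every pairing.
    return max(costs_a) < min(costs_b)
```

If the cost ranges of the two models overlap at all, the function returns False in both directions, which matches the point above: only a clean sweep says something about the model rather than about one team's implementation.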

> 5.  Costs, resources, and damage associated with the ability or lack of
> ability of the model to maintain data integrity, especially in cases where
> redundancy exists or cannot be avoided because of the intricacies or
> limitations of the model itself.

If the solution developed by any team yields a lack of data integrity
according to the constraints indicated in the requirements, that would
surely detract from that team's score.  It is quite likely that all teams
will end up developing rather tight solutions in this regard, however, don't
you think?

Would a "pet store" application be sufficient as a basis for the
competition, or is there something more rigorous that would yield a more
convincingly serious competition?

Perhaps having 5 databases that are considered to be RDBMS's and 5 that are
not based on the relational model would be a good start?  For the RDBMS's --
Oracle, DB2, SQL Server, and which others?  Informix, Sybase, MySQL?  For
non-relational, we could have a MUMPS implementation -- Cache, a PICK
implementation such as UniVerse, an OO implementation -- which one? an
XML-DB such as Berkeley XML-DB or ? an implementation tied to a particular
vertical market such as that with metadata.com ?  an implementation such as
the e4graph just announced here ... a hierarchical such as IMS, others?

> Smiles.
Yup.  --dawn
Mike Nicewarner - 13 Mar 2004 17:44 GMT
I've participated in data modeling competitions before, both organizing and
participating.
Realize that data modeling is more of an iterative process than a concrete
science.  As the Zippy discussion thread pointed out, just because one
modeler would win doesn't necessarily mean that the modeling tool/style was
inherently better than the others.
I'd be curious to hear more about what you are talking about, and I would
love to see your reasoning (not intuition) for stating that the relational
model should be retired.

Signature

Mike Nicewarner [TeamSybase]
http://www.datamodel.org
mike@nospam!datamodel.org
Sybase product enhancement requests:
http://www.isug.com/cgi-bin/ISUG2/submit_enhancement

> If we were to find a sponsor for a competition where the winner would be a
> named data model and the primary criteria would be related to the cost to
[quoted text clipped - 11 lines]
>
> Thanks.  --dawn
Bob Badour - 14 Mar 2004 02:16 GMT
> I've participated in data modeling competitions before, both organizing and
> participating.
[quoted text clipped - 5 lines]
> love to see your reasoning (not intuition) for stating that the relational
> model should be retired.

If you expect reason from Dawn, you will be disappointed.

> > If we were to find a sponsor for a competition where the winner would be a
> > named data model and the primary criteria would be related to the cost to
[quoted text clipped - 14 lines]
> >
> > Thanks.  --dawn
Mike Nicewarner - 14 Mar 2004 16:20 GMT
Well, I don't know Dawn yet, so I'll make up my own mind as time goes by.

Signature

Mike Nicewarner [TeamSybase]
http://www.datamodel.org
mike@nospam!datamodel.org
Sybase product enhancement requests:
http://www.isug.com/cgi-bin/ISUG2/submit_enhancement

> > I've participated in data modeling competitions before, both organizing
> and
[quoted text clipped - 32 lines]
> > >
> > > Thanks.  --dawn
Bob Badour - 14 Mar 2004 21:31 GMT
> Well, I don't know Dawn yet, so I'll make up my own mind as time goes by.

You can get all you need to confirm the facts from google groups right now.

> > > I've participated in data modeling competitions before, both organizing
> > and
[quoted text clipped - 38 lines]
> > > >
> > > > Thanks.  --dawn
Dawn M. Wolthuis - 19 Mar 2004 04:07 GMT
> I've participated in data modeling competitions before, both organizing and
> participating.
[quoted text clipped - 5 lines]
> love to see your reasoning (not intuition) for stating that the relational
> model should be retired.

playing usenet catchup while on the road...

Yes, I recognize that this is similar to any bake-off competition
where it very well could be that the best recipe doesn't win.
Perhaps:
1) This particular instance of using the recipe is less than the norm
2) The judges were looking for qualities that do not translate into
getting the best recipe, but merely the best of some attribute (they
might be big on texture, for example)
3) "Best" is too subjective or cannot be generalized from this one
example.

Also, even if the judges do pick "the best" that doesn't necessarily
lead to better industry acceptance -- if the judges choose a snake
dish as the best, that will not get everyone to start eating snake,
for example.  But a film that wins the Oscar for best picture is
likely to have more people check it out and form their own opinion.

It isn't easy to figure out how to get empirical data comparing
implementations of data models, much less the models themselves, but I
think it is important to pursue.  We have read about the mathematics
of the relational model, but where is the scientific evidence of its
usefulness?

It is not that the relational model gives us nothing, but a model is
simply a metaphor and the relational metaphor "seems" not nearly as
rich as the web metaphor, for example (the web being better modeled as
a di-graph).  Relational theorists typically consider "navigation" and
hierarchies as bad but such concepts are apparently not bad enough to
keep the web of data that forms the www from evolving as a highly
useful huge distributed data repository.

Many of my concerns with the relational model could be issues with the
implementations of the relational model because of the brittle nature
of many applications built on current RDBMS's.  That is where I have
no empirical data but many snippets of anecdotal evidence.  Intuition
is not the same as randomly choosing an opinion.  That is why I'm
trying to get some facts -- because after years of experiencing a
variety of information systems techniques & tools, both my gut and my
pocketbook tell me that the PICK approach (which is old, I realize,
but as a data model it is revived with XML) has a lot to offer the
industry.

Relational proponents often give the impression that their model is
beyond reproach because it is based on mathematics.  I might believe
in a triune God, but I sure wouldn't claim that the reason is because
we can come up with a mathematical model/metaphor for such a theory.
Other theories can have mathematical models too.  I checked around in
the Pick world and Henry Eggers told me that his guess was that if
anyone had worked on putting a mathematical model to PICK it would
have been him and he had done very little in that regard.  So, I've
done a little work on that, simply to show that "relational" is not
the only approach that has been implemented for which there is a
mathematical model.

Additionally, relational theorists will often agree that other models
could also be based on mathematics, but they make the not-at-all
mathematical statement that it is somehow clear to them that relations
are the simplest construct with which to model data so it is clearly
the best.  [I'm a bit tired right now so the following might be a poor
analogy -- but are the unitarians obviously right since they can
"model God" with a geometric point where the trinitarians model God
with a triangle?]  When looking at a proposition or predicate, it is
more likely that a person would model it with a tree (remember
diagramming sentences?) than with relations.  Is that relevant?  I'm
not sure, but I think so.
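
To make the tree-versus-relations contrast concrete, here is a toy Python sketch that holds the same facts both ways: as one nested record (roughly how a multivalue or XML store might present them) and as two flat relations. The order/line-item example and every field name are invented for illustration; nothing here describes how PICK or any RDBMS actually stores data.

```python
def flatten(order):
    """Nested record -> two flat relations (orders, order_lines)."""
    orders = [(order["order_id"], order["customer"])]
    lines = [
        (order["order_id"], item["part"], item["qty"])
        for item in order["items"]
    ]
    return orders, lines


def nest(orders, lines):
    """Flat relations -> list of nested records (inverse of flatten)."""
    result = []
    for order_id, customer in orders:
        items = [
            {"part": part, "qty": qty}
            for line_order_id, part, qty in lines
            if line_order_id == order_id
        ]
        result.append(
            {"order_id": order_id, "customer": customer, "items": items}
        )
    return result
```

Because `nest(*flatten(order))` gives back the original record, the two forms carry the same information in this small case. Which form is cheaper to build on and maintain is the empirical question the proposed competition is meant to answer.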

So, yes, I, too, want something more than intuition for my claim that
it is time to retire (or at least significantly enhance) the
relational model as it seems to have an undeserved king-of-the-hill
reputation.  Any insights from others are definitely appreciated.
Cheers!  --dawn
 