Relational and multivalue databases

Eric Kaun - 18 Feb 2004 01:55 GMT
This letter (monograph?) is in reply to Dawn Wolthuis, who received what
she seems to have considered a short and unhelpful series of responses
from Fabian Pascal between October 9 and November 12, 2002. This was
also motivated by postings on the comp.databases.theory newsgroup,
specifically the “foundations of relational theory?” thread which began
on September 26, 2003 and accumulated over 500 responses before petering
out on November 7, 2003. The length of my response to all this indicates
the depth of my fascination with the entire discussion.

For those who would flame me for opening old wounds and awakening
sleeping dogs, I apologize, but I believe this discussion still has
merit. I can appreciate Fabian Pascal's apparent crankiness, given that
he's been answering the same questions and debating the same issues for
over 20 years now. I do see reason to resurrect this debate (pedagogy),
but it needn't be the same people who do it generation after generation.

And kudos to Dawn for always being willing to ask questions, explain
herself, and to remain cheerful in the face of insults.

* Dawn wrote: “...find that developers are so much more productive when
working with a MultiValue database”
and, on her MultiValue flashcard: “Typical relational databases cannot
match the productivity of the MultiValue database...”

(I'm assuming by “database” you mean “DBMS,” and there are no RDBMSs
yet, excepting the allegedly excellent Dataphor, which I've yet to use.)

It sounds as if the MV development environment (IDE, reporting tools,
etc.) is to credit for productivity gains. I'm far more concerned with
(provable) correctness than productivity (not that the two are
unrelated, unless you're committed to a “code-and-fix” cycle), but have
always found that a logically solid foundation (of which “conceptual
integrity” is only a diluted description) will enhance productivity – in
the short term as well as the long term, though I would also posit that
there's an exponential benefit with the size and complexity of the
system, lifetime of the product, and turnover within the team.

* Dawn wrote, on her example MultiValue flashcard: “They [relational
databases] would also need features such as variable length data
structures, untyped elements, user-defined vocabularies and custom
functions specified as metadata.”

“Variable length data structures” are allowed by the relational model,
which places no restrictions on types (aka domains). It's lousy
“relational implementations” like SQL, and even lousy SQL
implementations (pick any of them), that fail the relational model
utterly by offering only a crippled type system, and until recently,
either nonexistent or highly proprietary type definition facilities. The
type of your attribute determines whether length is significant; for
example, wouldn't you want a U.S. Social Security number to be
restricted to a maximum of 9 digits (as well as a minimum of 9 digits)?
Wouldn't you want a UPC code to be limited to 1-5-5-1 digits (if I
remember correctly), and also to guarantee that the final check digit is
a valid checksum?
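
As a minimal sketch of what "defining the type once" could look like, here is the idea in Python rather than in any DBMS. The class names `SSN` and `UPC` and the helper `upc_check_digit` are hypothetical, and the sketch assumes the 1-5-5-1 layout means a 12-digit UPC-A code with the standard check digit (odd positions weighted 3, even positions weighted 1).

```python
from dataclasses import dataclass


def upc_check_digit(first_eleven: str) -> int:
    """Compute the UPC-A check digit for the first 11 digits."""
    digits = [int(ch) for ch in first_eleven]
    # Positions count from 1: odd positions are weighted 3, even positions 1.
    total = 3 * sum(digits[0::2]) + sum(digits[1::2])
    return (10 - total % 10) % 10


def _all_digits(text: str) -> bool:
    """True only for non-empty strings made of ASCII digits."""
    return text.isascii() and text.isdigit()


@dataclass(frozen=True)
class SSN:
    """Hypothetical type: a U.S. Social Security number, exactly 9 digits."""
    value: str

    def __post_init__(self):
        if len(self.value) != 9 or not _all_digits(self.value):
            raise ValueError(f"not a 9-digit SSN: {self.value!r}")


@dataclass(frozen=True)
class UPC:
    """Hypothetical type: a 12-digit UPC-A code with a valid check digit."""
    value: str

    def __post_init__(self):
        if len(self.value) != 12 or not _all_digits(self.value):
            raise ValueError(f"not a 12-digit UPC: {self.value!r}")
        if upc_check_digit(self.value[:11]) != int(self.value[11]):
            raise ValueError(f"bad UPC check digit: {self.value!r}")
```

With these definitions, `SSN("ABCDEFGHI")` and `UPC("036000291453")` both raise `ValueError`, while `UPC("036000291452")` is accepted. The rule lives in one place instead of in every program that touches the value.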

Current SQL implementations have DATE, INTEGER, and others. “Data
structure lengths” violate the relational model in at least two ways:
not allowing users to define a type (including operations), and exposing
physical implementation details (of built-in types).

* Dawn wrote: “Also, what is the theory that leads to strong typing and
fixed lengths that are found in many relational databases?”

The databases you referenced are not relational. Fixed lengths I
addressed. As to strong typing, you'll have to be more specific... but
if you like, you're free (even in SQL!) to define every attribute as a
general-use type (e.g. a BLOB / CLOB / string), and then manipulate it
as you like in individual application programs. Every program will have
to be aware of how to properly address characters or bytes within the
general type (since if you've bothered to define a field it typically
has some meaning), and the DBMS will be unable to restrict what's placed
in that “data element.” This loses a great deal, since declaring types
to an RDBMS allows the definition of your “type rules” in one place, and
disallows operations that would corrupt your database (for example,
placing “ABCDEFGHI” into the aforementioned Social Security Number
attribute).
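
A rough illustration of the DBMS doing the rejecting, using Python's standard `sqlite3` module. A SQL `CHECK` constraint is only a stand-in for a real user-defined type, and the `employee` table and the `try_insert` helper are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the rule for an SSN is declared once, in the schema.
conn.execute(
    """
    CREATE TABLE employee (
        ssn  TEXT PRIMARY KEY
             CHECK (length(ssn) = 9 AND ssn NOT GLOB '*[^0-9]*'),
        name TEXT NOT NULL
    )
    """
)


def try_insert(ssn: str, name: str) -> bool:
    """Return True if the DBMS accepted the row, False if it rejected it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (ssn, name))
        return True
    except sqlite3.IntegrityError:
        return False


accepted_good = try_insert("123456789", "Alice")
accepted_bad = try_insert("ABCDEFGHI", "Bob")  # violates the CHECK constraint
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Here `accepted_good` is `True`, `accepted_bad` is `False`, and only one row is stored. No application program had to know the rule for the bad value to be kept out.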

* Dawn wrote: “They [Don Nelson and Richard Pick] based the way the data
was specified (which I'm terming the data model, but that might not be
the right use of the term) on how it was to be queried.”

By “specifying data,” I'm assuming you're referring to a combination of
type definition, relation definition (including normalization), and
constraints.

The problem is that we're most of us not psychic, and any database I've
ever developed (or inherited) has outgrown its originating application,
which means that in designing a database, you have to look at the
business domain and not just the queries being asked for right now,
which won't necessarily predict future query needs. Unless, of course,
your application is so awkward that the users have no desire to extend
it, or the company's automation needs are so limited that they have no
need to extend it.

Relational shines in its ability to model data of all sorts in a minimal
and egalitarian fashion which is closely-aligned with logic, so that you
don't repeat yourself (and have to write extra code to synchronize the
redundant elements). This applies to every aspect of the relational
model and its attendant (though orthogonal) techniques: normalization,
type definition, constraints. All of these are designed to allow you to
state a required predicate once and be done with it. Predicates underlie
the relational model at every turn; it's the single driving concept,
rooted in logic and directly applicable to all data. This is the concept
that most people miss, as it's not stressed nearly enough (by enough
people). It's a not-entirely-unexpected nicety that specification
languages like Z and VDM use predicates as well; hence the practical
value of relational in executing queries augments its ability to model
high-level specification abstractions.

The root problem is that the “entity-relationship” view of data is
horribly limited; relationships often require (or acquire) enough
additional attributes and “business logic” to be reasonably viewed as
entities in their own right, and at that point you'll find you've been
treating a first-class citizen as a second-class one, and in addition
marring your syntax with arbitrary (and unnecessary) path expressions.
In “An Introduction to Database Systems (8th Edition),” C.J. Date gives
a good example of this phenomenon: that of marriages. While you could
regard a marriage as a simple “relationship” between two “entities,”
queries like “How many marriages took place between 1972 and 1974 in Old
St. Luke's Episcopal Church?” obviously “view” a marriage as an entity –
and it has attributes such as date, time, venue, etc. Line items on an
order, for example, could just be viewed as a relationship between an
order and a stock item, right?
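
To make the marriage example concrete, here is a small sketch using Python's standard `sqlite3` module. The table layout and the sample rows are invented for illustration and are not taken from Date's book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical relation: "SPOUSE_A married SPOUSE_B on WED_ON at VENUE".
conn.execute(
    """
    CREATE TABLE marriage (
        spouse_a TEXT NOT NULL,
        spouse_b TEXT NOT NULL,
        wed_on   TEXT NOT NULL,  -- ISO date text, so string order is date order
        venue    TEXT NOT NULL,
        PRIMARY KEY (spouse_a, spouse_b, wed_on)
    )
    """
)
conn.executemany(
    "INSERT INTO marriage VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Bob", "1972-06-10", "Old St. Luke's Episcopal Church"),
        ("Cy", "Dee", "1973-09-01", "Old St. Luke's Episcopal Church"),
        ("Eve", "Fay", "1973-09-01", "City Hall"),
        ("Gus", "Hal", "1980-01-15", "Old St. Luke's Episcopal Church"),
    ],
)

# The query treats a marriage as a thing with its own attributes.
(count,) = conn.execute(
    """
    SELECT COUNT(*) FROM marriage
    WHERE venue = ?
      AND wed_on BETWEEN '1972-01-01' AND '1974-12-31'
    """,
    ("Old St. Luke's Episcopal Church",),
).fetchone()
```

With the sample rows above, `count` is 2. Nothing in the query needs to know whether the designer once thought of a marriage as an "entity" or a "relationship."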

Don't think of entities and their relationships; those shift. Predicates
are far more stable, and treating them as individual concepts will do
much to enhance their use and reuse.

* Dawn wrote: “...multivalues crop up in the way people talk and think.”

Perhaps, but that seems a highly subjective statement predicated on the
fact that you know (and enjoy) the concept to begin with. I could just
as easily state that people talk and think in terms of predicates, and
that despite being an object-oriented programmer, I find the use of
predicates far more useful and “roll them up” into objects only as late
as humanly possible. I've found that business people talk much more in
terms of raw “rules” - which correlate much more directly to predicates
than anything else. Using objects (as in OOA) can be useful to flush
(flesh?) out additional concepts, but are not the lingua franca of
business, despite what OO pundits will tell you.

Besides, people often talk and think in terms that are various
combinations of profane, vague, contradictory, unrealistic, nonsensical,
wishful, etc... that doesn't mean our logic machines have to work that way!

* Dawn wrote: “It [MultiValued platforms] is old, yet could be revived
as it provides an amazingly productive environment, perhaps because it
is so forgiving and because it resembles XML to quite an extent.”

I would take resemblance to XML to be a damning attribute until
demonstrated otherwise... and as far as productivity, I'll assume you're
right; that doesn't say much about the model itself, at least not to me.
I personally want my DBMS to be as unforgiving as possible, as long as
it's unforgiving of someone trying to violate its rules! The rules
(relations, types, constraints) are how I state what the data “means.”
That's how I protect my company's data from corruption (e.g. from having
its representation or encoding mangled into a “value” that violates the
data's meaning).

“Firewalling” is something every good programmer does, even within his
or her own code. It helps prevent mistakes, it helps protect us from the
logical (though sometimes undesirable and unpredictable) effects of
combining multiple pieces of code, and best of all, it expresses our
intentions. All software is description, and explicit is best.

* Dawn wrote: “It appears there are lower initial costs and lower
ongoing costs for companies using the MV platform over a more standard
RDBMS (Oracle, SQL Server, etc.)”

For initial costs, I can't say; in any event, they're dwarfed by ongoing
costs for a system of any degree of importance (and barring license
price gouging, which I've seen). Ongoing costs are tricky to measure;
however, the cost of data inconsistencies (violation of rules) is likely
to dwarf productivity losses. Regarding productivity, this is nearly
impossible to meaningfully compare, but I've been able to develop in 1
day (actually during the meeting where we were discussing the
requirements for it!) an application consisting of 20 tables, a dozen
screens, and several reports, against an MS Access “DBMS” which was
later ported (in very little time) to SQL Server. I don't think this is
either impressive or unusual (it occupied me during a dull meeting), but
the constraints were never a hindrance; in fact, a properly relational
language would have allowed me to express many more “business rules”
directly in the database, rather than in my code, thus saving me time
and potential errors.

It really isn't very time-consuming to add SQL tables, given even
crude database design tools.

* Dawn wrote: “Of what are we [those who would use XML] ignorant? Of some
theory or practical advice?”

Possibly the theory, or possibly of the practical value of the theory,
or possibly neither (I can't make a blanket statement of ignorance, but
do believe that XML is a huge leap backward for this industry, and will
hasten the decline of the reputation of programmers everywhere once
light is shined on the naked Emperor).

Read chapters 3-10 in Date's “Intro to DB Systems (8th edition).” Then
read additional chapters to see how the pure theory translates into
direct benefits including such “pragmatic” topics as database
distribution, concurrency and recovery, type systems (!), and even code
generation (actually declarative programming, but code generation is one
implementation that's more marketing-friendly).

* Dawn wrote: “It is easy to understand why it is so common when we see
that it [a 1-many relationship] is also a generalization of the
relationship between a relation and an element in the relation.”

First, see the dangers of dividing relations (predicates) into
“entities” and “relationships.” And keep in mind the excellent analogy
from Hugh Darwen: “Types are to relations as nouns are to sentences.”
Types (domains / attributes, sort of) are the things we can speak about;
relations are what we can say about them. Since you refer to language
all too frequently in referencing MultiValue, the noun/sentence
distinction should strike a chord.

* Dawn used an example of people owning cars and bikes, and how in
MultiValue the cars and bikes could be multi-valued attributes of the
person.

This might be OK as long as the attribute is one on which you're not
doing a computation (in which case its cardinality can't be easily
increased or decreased), and as long as you never need to track
additional information about those attribute values. How would you
store, for example, the model and year of the car or the color of the
bike? Additional attributes – and if so, how do you correlate those with
the original car and bike attributes? If they're separate attributes,
then what does it mean if I have 3 bike “values” and 2 color “values”?

And the relational model “makes sense” of this sort of “relationship”
perfectly well; in addition, it allows you to formulate many more
constraints about the data, and these constraints are an important
aspect of “business rules” that you would otherwise need to program
(perhaps repeatedly).
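
The correlation problem can be sketched in a few lines of Python. The record layout and the sample values are hypothetical; the point is only the contrast between two independent lists and one relation whose tuples each state a complete fact.

```python
# Multivalue-style record: two independent lists hanging off one person.
person_mv = {
    "name": "Pat",
    "bikes": ["Trek", "Schwinn", "Bianchi"],
    "bike_colors": ["red", "blue"],  # 3 bikes, 2 colors: which bike is which?
}

# Pairing by position silently drops the unmatched bike.
paired = list(zip(person_mv["bikes"], person_mv["bike_colors"]))

# Relational-style: each tuple asserts "PERSON owns BIKE, whose color is COLOR".
bike_ownership = {
    ("Pat", "Trek", "red"),
    ("Pat", "Schwinn", "blue"),
    ("Pat", "Bianchi", "green"),
}


def bikes_of(person):
    """All bikes owned by the given person."""
    return {bike for (owner, bike, _color) in bike_ownership if owner == person}


def color_of(person, bike):
    """The color recorded for one specific bike, or None if it is not owned."""
    for owner, owned_bike, color in bike_ownership:
        if (owner, owned_bike) == (person, bike):
            return color
    return None
```

In the first form the mismatch between three bikes and two colors is representable and goes unnoticed. In the second, a bike without a color simply cannot be written down as a tuple of that relation.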

Hierarchies are much more uncommon, and much less useful, than most
people seem to acknowledge. Even the cliché “org chart,” while
hierarchical, fails to capture the matrix reporting which occurs in most
organizations.

* Dawn wrote: “...the cascading deletes for referential integrity are a
no-brainer...”

They are in relational too, and even in SQL.
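
For what it's worth, here is a minimal sketch of a declared cascading delete in SQL, run through Python's standard `sqlite3` module. The `orders` and `line_item` tables are made up for the example, and note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE line_item (
        order_id INTEGER NOT NULL
                 REFERENCES orders(order_id) ON DELETE CASCADE,
        item     TEXT NOT NULL,
        PRIMARY KEY (order_id, item)
    )
    """
)

conn.executemany("INSERT INTO orders VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO line_item VALUES (?, ?)",
    [(1, "bolt"), (1, "nut"), (2, "washer")],
)

# Deleting the order removes its line items; no application code is involved.
conn.execute("DELETE FROM orders WHERE order_id = 1")
remaining = conn.execute("SELECT order_id, item FROM line_item").fetchall()
```

After the delete, `remaining` holds only the line item for order 2.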

* Dawn wrote: “...switching cardinality of a data element in a
maintenance phase (or post-design) of a project is very easy to do,
breaking very little.”

Only if the data element isn't involved in actual logic (e.g. the text
examples like car names I usually see in Pick examples). If the data
element is numeric, does making it a list mean that reports now have to
sum the elements in the list? Average them? If I'm only displaying them,
why even a list? Why not a string?
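
A small Python sketch of why a numeric element is different. The record layout and function names are hypothetical, and the silent list repetition shown is a Python quirk; other languages would fail in other ways, but some decision is forced either way.

```python
def line_total(record):
    """Report logic written when 'price' was a single number."""
    return record["price"] * record["qty"]


before = {"item": "widget", "price": 5, "qty": 3}
after = {"item": "widget", "price": [5, 6], "qty": 3}  # price became multivalued

total_before = line_total(before)  # ordinary arithmetic
total_after = line_total(after)    # list repetition, not arithmetic


# Once 'price' is a list, someone has to decide what the report means.
def line_total_sum(record):
    return sum(record["price"]) * record["qty"]


def line_total_avg(record):
    prices = record["price"]
    return sum(prices) / len(prices) * record["qty"]
```

Here `total_before` is 15, while `total_after` is the list `[5, 6, 5, 6, 5, 6]`. The sum and average versions give 33 and 16.5, and nothing in the data says which one the business intended.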

And, for the record, the relational model supports lists (or relations!)
as types as well. It just isn't typically a good idea, unless it's an
attribute with which you do nothing but display on reports and screens.

* Dawn wrote: “The database tables could still be stored as in an RDBMS,
but show themselves to the designers and developers in this more
intuitive format.”

Intuition is a poor basis for logical and data decisions, as it's
subjective and prone to change. In particular, a database consisting of
many relations can be used to generate an uncountable (?) number of
hierarchies, depending on which (and how many) joins are done, and in
what order, and on what attributes. Which one you choose depends on your
immediate needs, but see above regarding the durability of “query
needs”; they're prone to change, so why bother with a hierarchy in the
first place when it expresses only one thing you're going to need to do?
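
A sketch of the point about hierarchies, in plain Python. The `owns` relation and the `nest` helper are invented for illustration: one flat relation yields a different hierarchy for each attribute you choose to nest on, and none of them is privileged.

```python
from collections import defaultdict

# Flat relation: "PERSON owns VEHICLE, which is of kind KIND".
owns = [
    ("Pat", "Trek", "bike"),
    ("Pat", "Civic", "car"),
    ("Sam", "Civic", "car"),
]


def nest(rows, key_index):
    """Group rows into a one-level hierarchy keyed on the chosen attribute."""
    tree = defaultdict(list)
    for row in rows:
        rest = row[:key_index] + row[key_index + 1:]
        tree[row[key_index]].append(rest)
    return dict(tree)


by_person = nest(owns, 0)   # people at the top, their vehicles underneath
by_vehicle = nest(owns, 1)  # vehicles at the top, their owners underneath
by_kind = nest(owns, 2)     # kinds at the top, ownerships underneath
```

Each result answers one family of questions conveniently and the others awkwardly. Storing the flat relation keeps all of them equally available.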

* Dawn wrote: “...a column constraint specifying the maximum length of
an element value might be easy for the computer to do, but doesn't make
all that much sense when it comes to all of the integrity constraints
one might put on a field.”

Agreed. The “type constraint” you mention isn't one; it's because the
non-relational SQL doesn't offer real type definition, and exposes
unnecessary physical implementation details, that such things exist at
all. That ain't relational. So how do MultiValue DBMSs enforce types;
for example, that an SS# has to be 9 digits? Does the DBMS know or
enforce this? Does it know or enforce any restriction at all on types?
While unnecessary bounding is a problem, infinite bounding (allowing any
values in any data element) is also a problem, as I'm sure you can see.

* Dawn wrote: “I suspect there are times when having a constraint coded
into an RDBMS increases the maintenance work to an extent that is not
cost-justifiable, particularly when the constraint is one that is prone
to change.”

Constraints are or aren't; they're seldom prone to change. Changes like
that usually indicate either an overly-specific constraint (which a good
RDBMS would allow to be generalized) or shallow analysis. If relational
systems (all 1 of them!) had a consistent catalog (which represents, in
relational form, the structure of the database), then you could possibly
add constraints to the catalog itself (since they're just relations
too!), which would further enforce requirements. This is beyond me, but
I theorize that it could both be done and be very expressive, not
implicit in the code of many programs across the business.

* Dawn wrote: “Simplicity: All data are strings with additional type
specifications for external purposes only. Not only simple to specify
but to maintain!”

This places the burden of maintenance in every program that uses your
file/table; every application has to know enough about that data element
not to violate how you intend it to be used. Because those types are not
in the database, you'll need a document of some sort to describe to each
programmer how to use the data! Or, at least, you have to hope your data
element names are suggestive enough to describe to people how to use it.
You risk much with this sort of assumption; I don't think there's
anything simple about it. You're simply delaying paying for the
“simplicity,” but that loan accumulates interest.

Again, if the data is only displayed, this might be acceptable. But type
systems can be very rich, and that richness is extremely useful, and
practical. It lets you (again) say something of critical business value,
once and only once.

* Dawn wrote: “No need to logically retrieve multiple tables when
concerning yourself with a single proposition, simply because the
proposition has conjunctive clauses.”

The hierarchies that MultiValue allows you to establish aren't single
propositions, unless you restrict yourself to propositions about the
“top-level entity” in your file. Attributes in a relation are
conjunctive clauses (e.g. “the employee has id ID and has name NAME and
was born on BIRTHDATE”). If you “nest” the employee's dependents, for
example, you're making a number of additional assertions, and merely
glossing over the fact that they're separate – furthermore, this
approach becomes intractable if one of those dependents becomes an employee.
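
A rough Python sketch of the dependent-becomes-employee problem. All names, identifiers, and dates are made up, and the dictionaries only approximate a nested record on one side and separate relations on the other.

```python
# Nested (multivalue-style): dependents are stored inside the employee record.
nested = {
    "E1": {"name": "Ann", "dependents": [{"name": "Ben", "born": "1985-03-02"}]},
}

# Ben is hired. His facts now live in two places that nothing keeps in sync.
nested["E2"] = {"name": "Ben", "born": "1985-03-02", "dependents": []}
nested["E2"]["born"] = "1985-03-20"  # a correction applied to one copy only

copies_of_ben_birthdate = {
    nested["E1"]["dependents"][0]["born"],
    nested["E2"]["born"],
}

# Separate relations: one assertion per fact.
person = {"P1": ("Ann", "1960-07-14"), "P2": ("Ben", "1985-03-02")}
employee = {"P1"}              # "PERSON is an employee"
dependent_of = {("P2", "P1")}  # "PERSON is a dependent of EMPLOYEE"

employee.add("P2")                    # hiring Ben is one new tuple, nothing copied
person["P2"] = ("Ben", "1985-03-20")  # the correction has exactly one place to go
```

After the correction, `copies_of_ben_birthdate` contains two different dates in the nested form, while the separate relations still hold a single birth date for Ben.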

If you read about “antipatterns,” you might know one called Big Ball of
Mud. The idea is that applications are best factored into multiple
pieces, so no one piece becomes too complex. Propositions are similar –
why lump them together unless you have to?

* Dawn wrote: “...but the only meat I can find is related to practical
issues regarding integrity constraints – not enough to write off the
entire MultiValue platform nor XML.”

On the contrary: integrity constraints, and types, define not just the
spine but the skeleton and muscle of your application. It's only flaccid
relational-scented SQL implementations that have diluted this fact. I
hope for great things (including application generation) from the
relational implementation in Dataphor (and, I hope, soon many others).
To repeat what Mr. Pascal said: “It [XML] was invented by people who
know nothing about data management – text publishers.” Not a solid
foundation for logical machines like programs.

* Dawn wrote: “...instead of a programmer coding some procedural code (!)
specific to a certain circumstance, the logic is declared in the
database, but designated as a local constraint where the next developer
working with the database might not want/need to apply it?”

Can you give an example? I suspect your logic isn't sufficiently
general, and that the Alphora folks could volunteer some specifics on
their D4 language's capabilities in this arena.

- Eric

Dawn M. Wolthuis - 18 Feb 2004 17:28 GMT
Wow -- thanks, Eric!  I appreciate your responses.

I'm sure you'll understand if it takes me a little time to address your
points.  Also, you are responding to a dialog from 2002 when I was starting
to delve further into researching why the practical experience I had did not
align with the database theory I had learned.  I've learned a bit more in
the past year (so I'm still capable of it ;-) and will accept several of
your points without need to respond.

But, I have become more, rather than less, convinced that the relational
model is not the most useful so I could still be classified as a SQL
detractor (which many relational theorists are too) as well as a relational
theory detractor (not that it isn't good as a theory, but that it doesn't
yield productive development environments). So it is useful for me to try
out my objections on interested parties who can correct my logic or agree
that a particular topic is a matter of taste or the requirements being
addressed.

Along with the fact that I'm probably more interested in theories that yield
developer productivity than those which yield a more academic goal, the
biggest areas of disagreement I have relate to
1.  modeling propositions as relations rather than functions that represent
graphs (mathematically)
2.  referential integrity as well as type constraints and where these should
be specified and enforced.

I'm sure there are other areas I can respond to as well.  Thanks for your
interest in my questions and your thoughtful responses.  As you can see, my
attempts to dialog with trees or dogs end up with me talkin' to myself
and/or trying to determine whether I really am stupid or just ignorant ;-)

Cheers! --dawn

Eric Kaun - 18 Feb 2004 22:13 GMT
> Wow -- thanks, Eric!  I appreciate your responses.
>
[quoted text clipped - 4 lines]
> the past year (so I'm still capable of it ;-) and will accept several of
> your points without need to respond.

Not a problem, I know it's a huge posting.

> Along with the fact that I'm probably more interested in theories that yield
> developer productivity than those which yield a more academic goal,

I'll posit that at least in shops where I've worked, provable correctness is
far more than an academic goal. In fact, the lack of such is the biggest
productivity deterrent I see, hands down. Its handmaidens are closure and
referential transparency, to which relational caters nicely.

> the biggest areas of disagreement I have relate to
> 1.  modeling propositions as relations rather than functions that represent
> graphs (mathematically)

Please explain and/or give an example of such a function.

> 2.  referential integrity as well as type constraints and where these should
> be specified and enforced.

This one I'll probably argue about more - suffice it to say that declaring
them once and generating the necessary enforcement (e.g. on the client) will
enable higher productivity and ensurable correctness. Where do you propose
specifying and enforcing them?

- erk

Bob Badour - 18 Feb 2004 18:40 GMT
> This letter (monograph?) is in reply to Dawn Wolthius, who received what
> she seems to have considered a short and unhelpful series of responses
[quoted text clipped - 11 lines]
> over 20 years now. I do see reason to resurrect this debate (pedagogy),
> but it needn't be the same people who do it generation after generation.

Good luck. I predict you will discover an inexhaustible supply of vociferous
ignorami.

> And kudos to Dawn for always being willing to ask questions, explain
> herself, and to remain cheerful in the face of insults.

I predict you will quickly learn not to encourage the vociferous ignorami.

> * Dawn wrote: “...find that developers are so much more productive when
> working with a MultiValue database”
[quoted text clipped - 6 lines]
> It sounds as if the MV development environment (IDE, reporting tools,
> etc.) is to credit for productivity gains.

What gains? Her alleged productivity advantage is nothing but a myth. She is
like a person who claims: "After making very careful measurements, we have
determined that horses are more productive than cars. A man can ride a horse
ten miles in far less time than he can push a car a similar distance."

> * Dawn wrote, on her example MultiValue flashcard: “They [relational
> databases] would also need features such as variable length data
[quoted text clipped - 3 lines]
> “Variable length data structures” are allowed by the relational model,
> which places no restrictions on types (aka domains).

As you notice, her allegation or assumption is false, which renders her
entire point meaningless.

> * Dawn wrote: “Also, what is the theory that leads to strong typing and
> fixed lengths that are found in many relational databases?”
>
> The databases you referenced are not relational.

Again, it suffices to note that Dawn is ignorant and is burning a straw man.

> * Dawn wrote: “They [Don Nelson and Richard Pick] based the way the data
> was specified (which I'm terming the data model, but that might not be
[quoted text clipped - 3 lines]
> type definition, relation definition (including normalization), and
> constraints.

She refers to exposing every physical implementation detail to the most
casual of users and to tying applications to specific physical artifacts.
Only the profoundly ignorant can consider such a feature advantageous or
productive compared to logical and physical independence.

> * Dawn wrote: “...multivalues crop up in the way people talk and think.”
>
> Perhaps,

I disagree. Sets crop up in the way people talk and think. Multivalues crop
up in the way the cognitively damaged or mentally injured talk and think.

> * Dawn wrote: “It [MultiValued platforms] is old, yet could be revived
> as it provides an amazingly productive environment, perhaps because it
> is so forgiving and because it resembles XML to quite an extent.”
>
> I would take resemblance to XML to be a damning attribute until
> demonstrated otherwise...

It suffices to note Dawn's profound ignorance of the Great Debate happening
nearly 30 years ago and that the debate proved pick sucks.

> * Dawn wrote: “It appears there are lower initial costs and lower
> ongoing costs for companies using the MV platform over a more standard
> RDBMS (Oracle, SQL Server, etc.)”
>
> For initial costs, I can't say

Again, it only appears that way to the intellectually crippled. Her quackery
is no different from the homeopaths who think water remembers. I highly
recommend _How We Know What Isn't So_ by Thomas Gilovich ISBN: 0029117062.

Otherwise, it suffices to note that Dawn is an ignorant quack.

> Dawn wrote: “Of what are we [those who would use XML] ignorant? Of some
> theory or practical advice?”
>
> Possibly the theory, or possibly of the practical value of the theory,

Every principle of sound data management and of sound application design.

> * Dawn wrote: “It is easy to understand why it is so common when we see
> that it [a 1-many relationship] is also a generalization of the
[quoted text clipped - 7 lines]
> all too frequently in referencing MultiValue, the noun/sentence
> distinction should strike a chord.

She is a vociferous ignoramus with an axe to grind. She is impervious to
reason and logic.

> * Dawn used an example of people owning cars and bikes, and how in
> MultiValue the cars and bikes could be multi-valued attributes of the
> person.
>
> This might be OK

No, it's not. Search on "red blue car"

> * Dawn wrote: “...the cascading deletes for referential integrity are a
> no-brainer...”
>
> They are in relational too, and even in SQL.

Triggered operations of any kind, in fact.

> * Dawn wrote: “...switching cardinality of a data element in a
> maintenance phase (or post-design) of a project is very easy to do,
> breaking very little.”
>
> Only if the data element isn't involved in actual logic (e.g. the text
> examples like car names I usually see in Pick examples).

Search on "red blue car". Her assertion is fatuous and demonstrates her
unwillingness to acknowledge the serious flaws in the model she espouses.
Like every vociferous ignoramus, Dawn lives in a fantasy world of denial.

> * Dawn wrote: “The database tables could still be stored as in an RDBMS,
> but show themselves to the designers and developers in this more
> intuitive format.”
>
> Intuition is a poor basis for logical and data decisions

Any usability expert will tell you that intuition is complex and
unpredictable. Any statement to the above effect requires careful empiricism
for backing. In fact, this is true of any complex result of biological
origin.

Frankly, Dawn is simply an ignorant making an absurd claim.

> * Dawn wrote: “...a column constraint specifying the maximum length of
> an element value might be easy for the computer to do, but doesn't make
[quoted text clipped - 3 lines]
> Agreed. The “type constraint” you mention isn't one; it's because the
> non-relational SQL doesn't offer real type definition

It suffices to note that Dawn is an ignorant tilting at the windmills of her
imagination.

> * Dawn wrote: “Simplicity: All data are strings with additional type
> specifications for external purposes only. Not only simple to specify
> but to maintain!”
>
> This places the burden of maintenance in every program that uses your
> file/table;

It also exposes her deceit regarding productivity. By ignoring integrity and
by discounting the cost of corruption, she pretends--in her own mind--that
she can increase productivity as if the only measure that counts is the time
until the first compilation that reports no errors instead of the time until
the system runs correctly.

> * Dawn wrote: “No need to logically retrieve multiple tables when
> concerning yourself with a single proposition, simply because the
> proposition has conjunctive clauses.”
>
> The hierarchies that MultiValue allows you to establish aren't single
> propositions

It suffices to note that Dawn is an ignorant who fails to comprehend even
what she, herself, writes.

> * Dawn wrote: “...but the only meat I can find is related to practical
> issues regarding integrity constraints – not enough to write off the
> entire MultiValue platform nor XML.”
>
> On the contrary: integrity constraints, and types, define not just the
> spine but the skeleton and muscle of your application.

Again, it suffices to note that Dawn is an ignorant making absurd,
nonsensical statements.

> Dawn wrote: “...instead of a programmer coding some procedural code (!)
> specific to a certain circumstance, the logic is declared in the
> database, but designated as a local constraint where the next developer
> working with the database might not want/need to apply it?”
>
> Can you give an example?

If the data require the constraint, it doesn't matter whether the next
application programmer who comes along finds the constraint convenient. The
constraint keeps the fool from corrupting the data.

If requirements have changed, then it is much more productive to reflect
that change in one central location, ie. the database, than in every
application. If the change is such that it will require changing
applications, it makes sense to centralise the error-detection logic to
prevent costly mistakes in one application from damaging all the others.

Dawn is a chronic ignoramus whose nonsense does not warrant a reply.

Dawn M. Wolthuis - 18 Feb 2004 22:23 GMT
OK, I'll bite, but only for the purpose of entertaining myself and others
who find this amusing.

<snip>
<snip>
> Good luck. I predict you will discover an inexhaustible supply of
> vociferous ignorami.

I am surely ignorant, but not so ignorant that I believe I have nothing to
learn from the perspectives of others and from asking questions when I'm
perplexed, Bob.  My ignorance doesn't compare to someone who thinks they
have all of the answers, however.

> > And kudos to Dawn for always being willing to ask questions, explain
> > herself, and to remain cheerful in the face of insults.
>
> I predict you will quickly learn not to encourage the vociferous ignorami.

Employing sticks and stones rather than logical arguments is a long-standing
technique in the bag of tricks used to dislodge women, among others, in
business situations.  It is a sub-category of the wider collection of
intimidation techniques and is rarely as effective as some of the more
subtle approaches.  You should have learned long ago not to be a bully.

> > * Dawn wrote: "...find that developers are so much more productive when
> > working with a MultiValue database"
[quoted text clipped - 11 lines]
> determined that horses are more productive than cars. A man can ride a horse
> ten miles in far less time than he can push a car a similar distance."

I will be the very first to state that I do not have enough empirical data
to support this -- it is what I have found in my experience and when I
compare with others, there are many such anecdotes.  That is not conclusive.
Given that most (all?) benchmarks for databases these days require that the
database be SQL-based, it isn't even easy to get comparisons on what should
be somewhat straightforward to measure between the implementations of
relational theory and that of PICK.

There are some facts that perhaps we could measure at some point related to
the total number of software developers required to write and also to
support systems with similar functionality, but whoever loses will make
arguments about the differences in functionality, claiming that the
additional resources required are related to an equivalent gain for the
business.  So, how would you propose testing out a hypothesis, such as mine,
that non-1NF implementations that are not based on the relational model,
such as PICK, provide a bigger bang for the buck for the entity paying the
bills than do the current implementations that call themselves RDBMS's?

> > * Dawn wrote, on her example MultiValue flashcard: "They [relational
> > databases] would also need features such as variable length data
[quoted text clipped - 6 lines]
> As you notice, her allegation or assumption is false, which renders her
> entire point meaningless.

You can say that I'm wrong, but you have given no proof.

> > * Dawn wrote: "Also, what is the theory that leads to strong typing and
> > fixed lengths that are found in many relational databases?"
> >
> > The databases you referenced are not relational.
>
> Again, it suffices to note that Dawn is ignorant and is burning a straw man.

I've already agreed that I am not all-knowing, but I'm addressing both the
theory and implementations of RDBMS -- to what straw man are you referring?
Even if RDBMS's do not require fixed length fields, for example, the
percentage of variable length fields is quite low, I suspect.  I do not have hands-on
experience with more than ten applications implemented in an RDBMS, however,
so perhaps I'm wrong.  If my assumptions are incorrect, I'm more than happy
to be corrected.

> > * Dawn wrote: "They [Don Nelson and Richard Pick] based the way the data
> > was specified (which I'm terming the data model, but that might not be
[quoted text clipped - 8 lines]
> Only the profoundly ignorant can consider such a feature advantageous or
> productive compared to logical and physical independence.

As I type this, the person on CNN just said "that's comically pompous".  It
rolled off his tongue so well, I'll use it here.  Is it even mildly
interesting that IBM is pushing their Informix users to DB2, but is
retaining their U2 users in U2?  Why?  Dollars.  So, I'm apparently not the
only "profoundly ignorant" person out there.

> > * Dawn wrote: "...multivalues crop up in the way people talk and think."
> >
> > Perhaps,
>
> I disagree. Sets crop up in the way people talk and think. Multivalues crop
> up in the way the cognitively damaged or mentally injured talk and think.

Is it time for me to say "Shut up, Bob" yet?

> > * Dawn wrote: "It [MultiValued platforms] is old, yet could be revived
> > as it provides an amazingly productive environment, perhaps because it
[quoted text clipped - 5 lines]
> It suffices to note Dawn's profound ignorance of the Great Debate happening
> nearly 30 years ago and that the debate proved pick sucks.

I have studied both the history of PICK and the history of SQL and RDBMS's
to varying degrees.  I'm aware of the Bachman/Codd debates as well as
challenges to SQL by QBE, for example.  I also know that Pick and Codd had
no fondness for the thinking of the other.  I am unaware of any proof that
"pick sucks" however -- please enlighten me (and IBM, for that matter).

> > * Dawn wrote: "It appears there are lower initial costs and lower
> > ongoing costs for companies using the MV platform over a more standard
[quoted text clipped - 5 lines]
> is no different from the homeopaths who think water remembers. I highly
> recommend _How We Know What Isn't So_ by Thomas Gilovich ISBN: 0029117062.

You have probably figured out that in spite of being appalled by your lack
of basic manners in such a discourse, I am amused at being called
"intellectually crippled" and the like.  I'm thinking that if I'm now stupid
(or perhaps always was) then maybe now I can be good looking (I sort of
figured I had to choose one and I was told I was smart more often than
beautiful, so ...).

> Otherwise, it suffices to note that Dawn is an ignorant quack.

Ignorant, yes -- a "quack" -- nope, guess again.

<snip>

> She is a vociferous ignoramus with an axe to grind. She is impervious to
> reason and logic.

I have no axe to grind, nor financial investment to protect in this regard.
I'm curious and rational and would like to get a better understanding
related to relational and non-relational databases -- both theory and
implementation.

I would like to hear one statement of reason or logic, along with the axioms
from which you think it arises, that you believe I disagree with.  I don't
think I'm incapable of following reason.  I'm teaching two weeks of Calculus
in a few weeks to fill in for a paternity leave and haven't taught it in 20
years.  However, I can still prove theorems related to continuity using
either the analyst's tools of epsilon-delta proofs or the logician's
non-standard analysis tools that include definitions of infinitesimals.  I
suspect that few illogical people could do this.  I might have some brain
blips in that peri-menopause state, but if you pass me some specific reason
or logic to which you believe I am impervious, I will attempt to understand
it.

> > * Dawn used an example of people owning cars and bikes, and how in
> > MultiValue the cars and bikes could be multi-valued attributes of the
[quoted text clipped - 3 lines]
>
> No, it's not. Search on "red blue car"

Gotta admit I miss your point here, Bob.

It is typically considerably easier to query non-1NF structures than to use
SQL on anything.  Here's a common type of query:

LIST STUDENTS WITH EVERY MAJOR NOT EQUAL "MATH"

Think this easy query through in your typical RDBMS (SQL) implementation.
This is not an isolated case.
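
To make the intended meaning of that query concrete, here is a minimal Python sketch. It assumes each student record carries a list of majors; the record layout and the names `students` and `every_major_not_equal` are hypothetical, not taken from any Pick system. It also reads EVERY as universal quantification, so a student with no majors qualifies vacuously; an actual MultiValue implementation may treat the empty case differently.

```python
# Hypothetical in-memory stand-in for a STUDENTS file whose MAJOR
# attribute is multivalued (a list per student).
students = {
    "1001": {"name": "Joan Doe", "majors": ["MATH", "PHIL"]},
    "1002": {"name": "Sam Roe", "majors": ["PHIL"]},
    "1003": {"name": "Lee Poe", "majors": []},
}


def every_major_not_equal(records, subject):
    """Return the sorted IDs of students whose every major differs from `subject`."""
    return sorted(
        sid
        for sid, rec in records.items()
        # all() over an empty list is True, so students without majors pass.
        if all(major != subject for major in rec["majors"])
    )


result = every_major_not_equal(students, "MATH")
print(result)
```

Whether the student with no majors should appear is exactly the kind of question the terse form leaves implicit.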

And it isn't just single fields that can be nested, but fields can be
grouped together and nested as a "nested function" or "nested relation" (if
you prefer).

<snip>
> > Intuition is a poor basis for logical and data decisions

But a good basis for a hypothesis

<snip>
> It also exposes her deceit regarding productivity. By ignoring integrity and
> by discounting the cost of corruption, she pretends--in her own mind--that
> she can increase productivity as if the only measure that counts is the time
> until the first compilation that reports no errors instead of the time until
> the system runs correctly.

Finally, I almost stopped reading, but here you actually have some meat.  I
disagree with your statement and will address this when responding to Eric

<snip>
> Dawn is a chronic ignoramus whose nonsense does not warrant a reply.

and yet you just can't help yourself, can you, Bob?
and this time I gave you the satisfaction of a reply too, which might not
have been the better part of wisdom, but I felt like it.  So, have a good
day and don't forget to smile a little.

--dawn
Eric Kaun - 19 Feb 2004 13:50 GMT
> OK, I'll bite, but only for the purpose of entertaining myself and others
> who find this amusing.

*sigh* And I thought I was going to be able to resurrect the debate sans
flaming. Oh well.

> It is typically considerably easier to query non-1NF structures than to use
> SQL on anything.  Here's a common type of query:
[quoted text clipped - 3 lines]
> Think this easy query through in your typical RDBMS (SQL) implementation.
> This is not an isolated case.

So how would this "look" in Pick? Are your assumptions that a student has
multiple majors? Once I see what you're getting at, I (we?) can respond with
relational counterparts. This doesn't look especially troubling, even for
SQL. I'm assuming (until I hear otherwise) that you'd have a Student
relation, a Major relation, and a StudentHasMajor relation. Depending on how
you "knew" it was "Math", you could omit the Major relation from your query.
Join Student to StudentHasMajor, limit based on major, and project over
student ID (or whatever).

In Pick I'm guessing you'd say you have just 1 file, with a list of majors
as an attribute. But consider the following:
1. To what would you attach, for example, requirements for a major? Surely a
major is more than a piece of text?
2. To what would you attach, for example, the date that the student picked
up (or completed!) a major?
3. Do I really need to loop through every student in the file to determine
how many math majors there are?

And many such others.

> And it isn't just single fields that can be nested, but fields can be
> grouped together and nested as a "nested function" or "nested relation" (if
> you prefer).

Can you explain further? What does the grouping mean, and how does a
function figure into this?

- Eric
Bob Badour - 19 Feb 2004 16:14 GMT
> > OK, I'll bite, but only for the purpose of entertaining myself and others
> > who find this amusing.
[quoted text clipped - 12 lines]
>
> So how would this "look" in Pick?

That was Pick. What Dawn omits is the same query might ask different
questions depending on the physical file structure. See "red blue car"

In D:

STUDENTS WHERE ( MAJOR WHERE MAJOR = 'MATH' ) = TABLE_DUM

or:

STUDENTS WHERE NOT ( 'MATH' IN MAJOR )

In SQL:

SELECT *
FROM STUDENTS
WHERE 'MATH' != ALL (
 SELECT SUBJECT
 FROM MAJOR
 WHERE STUDENTS.ID = MAJOR.STUDENT_ID
)
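
For anyone who wants to try the SQL version, here is a rough sketch using Python's built-in `sqlite3` module. The sample rows and the NAME column are invented for illustration. SQLite, as far as I know, does not accept the quantified `!= ALL` comparison, so the sketch swaps in a `NOT EXISTS` subquery, which gives the same answer as long as SUBJECT is never NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENTS (ID INTEGER PRIMARY KEY, NAME TEXT NOT NULL);
    CREATE TABLE MAJOR (
        STUDENT_ID INTEGER NOT NULL REFERENCES STUDENTS(ID),
        SUBJECT TEXT NOT NULL,
        PRIMARY KEY (STUDENT_ID, SUBJECT)
    );
    -- Invented sample data: student 3 has no majors at all.
    INSERT INTO STUDENTS VALUES (1, 'Joan Doe'), (2, 'Sam Roe'), (3, 'Lee Poe');
    INSERT INTO MAJOR VALUES (1, 'MATH'), (1, 'PHIL'), (2, 'PHIL');
""")

# Students for whom no MAJOR row names MATH (NOT EXISTS stands in for != ALL).
rows = conn.execute("""
    SELECT ID, NAME
    FROM STUDENTS
    WHERE NOT EXISTS (
        SELECT 1
        FROM MAJOR
        WHERE MAJOR.STUDENT_ID = STUDENTS.ID
          AND MAJOR.SUBJECT = 'MATH'
    )
    ORDER BY ID
""").fetchall()
print(rows)
```

The student with no MAJOR rows is returned here as well, which matches what `!= ALL` over an empty subquery result would do.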

Dawn is a vociferous ignoramus. First, she ignores how easy the 'challenge'
is in a truly relational language. Second, she demonstrates profound
ignorance of the importance of precision and explicitness. Terse is not
necessarily good. Especially if terse leads to a reduction in expressiveness
as it does in Pick or if it obfuscates meaning as it does in Pick or if it
increases the likelihood of difficult to discover errors as it does in Pick.

For your own benefit, I suggest you assume all Pick users--much like all
crack users--have been cognitively damaged by their use. Thus far, I have
seen no evidence of any Pick user who has survived unscathed.

> Are your assumptions that a student has
> multiple majors?

The WITH keyword anticipates either multiple values or multiple pointers to
a file of majors.

> In Pick I'm guessing you'd say you have just 1 file, with a list of majors
> as an attribute.

Not necessarily. One might. One might have pointers to a file of majors with
the join hard-coded in the dictionary for all users. Users have no ability
to express joins unless hard-coded in the dictionary. Alternatively, one
might have multiple mv attributes where values might or might not be
associated by physical order.
Dawn M. Wolthuis - 19 Feb 2004 17:24 GMT
<snip>
> > It is typically considerably easier to query non-1NF structures than to use
[quoted text clipped - 13 lines]
> Join Student to StudentHasMajor, limit based on major, and project over
> student ID (or whatever).

First note that the corresponding SQL statement is not complicated for a SQL
coder, but notice how English-like the PICK counterpart sounds.  Compare:

LIST STUDENTS WITH EVERY MAJOR NOT EQUAL "MATH"

with the SQL corresponding statement that Bob wrote for this:

SELECT *
FROM STUDENTS
WHERE 'MATH' != ALL (
 SELECT SUBJECT
 FROM MAJOR
 WHERE STUDENTS.ID = MAJOR.STUDENT_ID
)

Think of a "file" as a function that maps an identifier to a set of
attributes. For example, the STUDENTS function could be

STUDENT(identifier)={string-of-attribute-data-with-delimiters}

This string of data could be:
Joan<field-delimiter>
Doe<field-delimiter>
6165551234<value-delimiter>7615552222<field-delimiter>
MATH<sub-value-delimiter>2002<value-delimiter>PHIL<sub-value-delimiter>2003<field-delimiter>

There are also
<record-delimiter>
<file-delimiter>

and one can define delimiters to any desired level, with typically at least
this many included in the packaged functions for the database implementation

Then associate this with vocabulary functions such as

FirstName(STUDENTS, identifier) = string-field-in-location-1
SecondaryPhoneNumber(STUDENTS, identifier) = string-value-in-field-3-value-2

So it is the vocabulary functions that make the queries very easy for users.
A vocabulary entry can contain many other types of functions, including
those that reach to any other files within the system.

So, the details about a major, for example, would be in a subject file such
as MAJORS, which would also be a function

MAJORS(identifier) = {string}

Then a vocabulary entry could be defined for students such as

MajorRequirement(STUDENTS, identifier) = MAJORS(StudentMajor, field n)

So, everything is defined in terms of functions including stored data and
any other vocabulary (for stored or virtual fields)
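
The file-as-function idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the delimiter characters follow the conventional Pick marks (character codes 254, 253 and 252), the student ID "1001" is made up, and `extract`, `first_name`, `secondary_phone_number` and `majors` are hypothetical helper names rather than part of any MultiValue product.

```python
# Assumed delimiter characters: attribute (field) mark, value mark, subvalue mark.
AM, VM, SVM = chr(254), chr(253), chr(252)

# The example record from the post, built with the assumed delimiters.
record = AM.join([
    "Joan",
    "Doe",
    VM.join(["6165551234", "7615552222"]),
    VM.join([SVM.join(["MATH", "2002"]), SVM.join(["PHIL", "2003"])]),
])

# A "file" maps an identifier to one delimited string.
STUDENTS = {"1001": record}


def extract(file, key, field, value=None, subvalue=None):
    """1-based lookup of a field, optionally narrowed to a value and subvalue."""
    part = file[key].split(AM)[field - 1]
    if value is not None:
        part = part.split(VM)[value - 1]
        if subvalue is not None:
            part = part.split(SVM)[subvalue - 1]
    return part


# Vocabulary-style functions defined in terms of positions in the record.
def first_name(file, key):
    return extract(file, key, 1)


def secondary_phone_number(file, key):
    return extract(file, key, 3, 2)


def majors(file, key):
    # First subvalue of each value in field 4 is taken to be the major code.
    return [v.split(SVM)[0] for v in extract(file, key, 4).split(VM)]


print(first_name(STUDENTS, "1001"), secondary_phone_number(STUDENTS, "1001"))
print(majors(STUDENTS, "1001"))
```

Note that the meaning of each position (field 3 value 2 is the secondary phone, field 4 subvalue 2 is a year) lives entirely in the vocabulary functions, not in the stored string.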

> In Pick I'm guessing you'd say you have just 1 file, with a list of majors
> as an attribute. But consider the following:
[quoted text clipped - 14 lines]
> Can you explain further? What does the grouping mean, and how does a
> function figure into this?

I think the above example shows this.  Please let me know if I should
clarify anything here.  Thanks.  --dawn

> - Eric
Mikito Harakiri - 19 Feb 2004 18:10 GMT
> This string of data could be:
> Joan<field-delimiter>
> Doe<field-delimiter>
> 6165551234<value-delimiter>7615552222<field-delimiter>
> MATH<sub-value-delimiter>2002<value-delimiter>PHIL<sub-value-delimiter>2003<field-delimiter>

How about:

<sub-value-delimiter>2002<sub-sub-value-delimiter>Jan<sub-sub-sub-value-delimiter>25

Am I expert Pick programmer already?
Dawn M. Wolthuis - 19 Feb 2004 18:38 GMT
> > This string of data could be:
> > Joan<field-delimiter>
> > Doe<field-delimiter>
> > 6165551234<value-delimiter>7615552222<field-delimiter>
> > MATH<sub-value-delimiter>2002<value-delimiter>PHIL<sub-value-delimiter>2003<field-delimiter>
>
> How about:
>
> <sub-value-delimiter>2002<sub-sub-value-delimiter>Jan<sub-sub-sub-value-delimiter>25
>
> Am I expert Pick programmer already?

You got it -- simple, right?  It's just a function that represents a graph.
We could call it a web!  But on the outside to the user, it is a vocabulary
with functions (also part of the vocabulary).

How do you decide whether info is a sub-table of an existing table or should
go in its own table?  If the information is functionally dependent.  My
phone numbers are information that one might think of as being in the
relationship between me and the telecom industry.  But my phone numbers have
no meaning apart from me -- sure they are phone numbers that exist in the
world, but the point of capturing the data is to capture information about a
person.  So, don't stick them in some other function (aka file) -- put them
with the person, even if there are more than one of them.

Make sense?  --dawn
Mikito Harakiri - 19 Feb 2004 20:37 GMT
> > <sub-value-delimiter>2002<sub-sub-value-delimiter>Jan<sub-sub-sub-value-delimiter>25
> >
[quoted text clipped - 3 lines]
> We could call it a web!  But on the outside to the user, it is a vocabulary
> with functions (also part of the vocabulary).

We must hire somebody in this group to do subtitles for humor impaired.
Bob Badour - 19 Feb 2004 20:46 GMT
> > > <sub-value-delimiter>2002<sub-sub-value-delimiter>Jan<sub-sub-sub-value-delimiter>25
> > >
[quoted text clipped - 9 lines]
>
>[It was a joke dumbass!]

How can everything be a vocabulary? I thought everything was an object! Does
this mean vocabulary is synonymous with object?

[Pick folks are as flaky and stupid as object folks.]

[Closed captioning for the mentally impaired brought to you by the letter D
and the number 9.]
Mikito Harakiri - 20 Feb 2004 21:47 GMT
> > > > <sub-value-delimiter>2002<sub-sub-value-delimiter>Jan<sub-sub-sub-value-delimiter>25
> > > >
[quoted text clipped - 17 lines]
> [Closed captioning for the mentally impaired brought to you by the letter D
> and the number 9.]

Let me express myself a little more politely. Yes, 1NF could be considered
a not quite sound theoretical basis. But, what is your idea? Files and
delimiters? Any database researcher would stop reading your manuscript right
there.
Dawn M. Wolthuis - 21 Feb 2004 00:17 GMT
> <snip>
> Let me express myself a little more politely. Yes, 1NF could be considered
> a not quite sound theoretical basis. But, what is your idea? Files and
> delimiters? Any database researcher would stop reading your manuscript right
> there.

Yes, it is simply a key-value -- that is function(key) = value type of data
storage.  In fact, it makes more sense to call it a file structure than a
database.  When I first saw PICK (as a manager at a new place of employment)
I remarked "that is NOT a database"!  I don't care what we call it, but I
wouldn't want to go back to the DBMS's that my teams and I had worked with
in the past.  It would simply not be stewardly (in terms of dollars and
people time) to do so.  But I'll keep working at determining whether this is
a fluke or whether there really is a good logical reason for non-relational
data storage to be advisable in more instances than, perhaps, a relational
model is appropriate.

Thanks for engaging.  I'm still a student and not trying to sound like I
have all the answers -- just a bunch of questions after considerable
research and experience.  Cheers!  --dawn
Mikito Harakiri - 21 Feb 2004 01:13 GMT
> Yes, it is simply a key-value -- that is function(key) = value type of data
> storage.  In fact, it makes more sense to call it a file structure than a
[quoted text clipped - 6 lines]
> data storage to be advisable in more instances than, perhaps, a relational
> model is appropriate.

Dawn,

Relational is not about the cost. It is not about storage either. Some folks
say it's about data management, but I would disagree. It's just a high level
programming model. Unless you demonstrate some innovative Pick methods in
that area, you'll have a hard time finding good listeners here on cdt.

Speaking about cheap solutions, there are open source databases...
Dawn M. Wolthuis - 21 Feb 2004 01:37 GMT
> > Yes, it is simply a key-value -- that is function(key) = value type of data
[quoted text clipped - 19 lines]
>
> Speaking about cheap solutions, there are open source databases...

Mikito -- I might not have been clear about the theory side on this.  Here
is my dilemma -- I have studied database theory and I have used many
databases and data storage approaches.  It seemed to me that there are many
folks who have been taught or who believe that the relational data model
actually leads to a better solution in data quality, database maintenance
over time, etc.  That is, it seemed from my studies that a company would be
the best steward of their financial resources if they were to employ a
relational database.

However, my experience tells me that the implementations of the relational
model "seem to" (I admit I have no concrete proof of this) be more costly,
without corresponding benefits, to the corporate owner.

If the relational model is not intended to yield a better solution, when
taking into consideration all factors, than a non-relational model, then I
could care less about it -- would you still care about it then?

So, let's look at the big picture of requirements for an application that
includes data storage -- if we look at the overall cost of ownership
(including data quality, ongoing support costs etc) of an RDBMS is it lower
or higher than the implementations of other models such as PICK?  My
hypothesis is that it is more expensive (often considerably more) to employ
an RDBMS.  Is this irrelevant?  I don't think so -- I think it tosses into
question what the purpose of the theory is in the first place.

So, what is our goal in having a good theory of how to store and retrieve
data?  --dawn
Mikito Harakiri - 21 Feb 2004 02:07 GMT
> Mikito -- I might not have been clear about the theory side on this.  Here
> is my dilemma -- I have studied database theory and I have used many
[quoted text clipped - 4 lines]
> the best steward of their financial resources if they were to employ a
> relational database.

I have been on nonrelational database implementation side as well. But
nobody on this group could care less what my relational experience is, let
alone nonrelational.

> However, my experience tells me that the implementations of the relational
> model "seem to" (I admit I have no concrete proof of this) be more costly,
> without corresponding benefits, to the corporate owner.

Let managers worry about the cost, and, as technical people, let us be
fascinated with technology.

> If the relational model is not intended to yield a better solution, when
> taking into consideration all factors, than a non-relational model, then I
[quoted text clipped - 10 lines]
> So, what is our goal in having a good theory of how to store and retrieve
> data?  --dawn

The purpose of the theory is leading industry to high-tech solutions, rather
than surrendering to the chaos of ad hoc approaches.
Dawn M. Wolthuis - 21 Feb 2004 02:39 GMT
> > Mikito -- I might not have been clear about the theory side on this.  Here
> > is my dilemma -- I have studied database theory and I have used many
[quoted text clipped - 17 lines]
> Let managers worry about the cost, and, as technical people, let us be
> fascinated with technology.

I was once told that there are people who think of cars as fascinating
machines and those who think of cars as (perhaps fascinating) machines that
get us from one place to another.  This person then said that we call the
first group "men" and the second group "women".  I certainly disagree with
that as a blanket statement, but I suspect there could be empirical data to
suggest something similar related to computers.

I have no fascination for technology outside of how it provides improvements
for people or creation.  I do have a love of mathematics for mathematics
sake, however.  And if that is what is going on with discussions of
relational theory, then continue the game, by all means.  There is no need
to have implementations of the theory, however, unless it is useful.  Surely
implementations of relational databases are useful and have been accepted by
IT professionals.  However, some of the reasons they are employed have to do
with the false notion that there is something about relational theory that
is "more mathematical" or more orthodox or pure.  Companies then employ
these beasts thinking they are more cost-effective.

So, if we are tinkering with cars in our garage, count me out -- I just
don't care.  If we are telling people that cars are better when they use
more gas, and if people then believe these claims and buy bigger and
"better" cars, then I do care and I feel a need to speak up and say it ain't
so.

> > If the relational model is not intended to yield a better solution, when
> > taking into consideration all factors, than a non-relational model, then I
[quoted text clipped - 15 lines]
> The purpose of the theory is leading industry to high-tech solutions, rather
> than surrendering to the chaos of ad hoc approaches.

The purpose of tinkering with our car is to have really cool tires, bumpers,
etc?  I sure hope not.  I think Codd really thought he was coming up with a
better theory that would translate into something good in database
implementations.  There are good things about the relational model and its
implementations and there are failures as well.  There is nothing anointing
the relational database model from above as being the best approach to
managing data in a software application.

So, testing out various database theories and finding the pros and cons of
each as it relates to the actual USE of products that attempt to implement a
theory seems like what one might want to do with a discussion related to
database theory.  A theory related to serial killers that doesn't actually
help us find serial killers is just not interesting to me.  Make sense?
Thanks.  --dawn
Mikito Harakiri - 21 Feb 2004 03:05 GMT
> I was once told that there are people who think of cars as fascinating
> machines and those who think of cars as (perhaps fascinating) machines that
> get us from one place to another.  This person then said that we call the
> first group "men" and the second group "women".

;-)

> I certainly disagree with
> that as a blanket statement, but I suspect there could be empirical data to
> suggest something similar related to computers.

You didn't have to spoil the effect of the previous paragraph with this
sentence.

> I have no fascination for technology outside of how it provides improvements
> for people or creation.  I do have a love of mathematics for mathematics
[quoted text clipped - 6 lines]
> is "more mathematical" or more orthodox or pure.  Companies then employ
> these beasts thinking they are more cost-effective.

No, people who tinker with things sometimes come up with very effective
solutions. Hint: L. Torvalds. [Subtitles: I don't mean they should necessarily
be role models, either]

> So, if we are tinkering with cars in our garage, count me out -- I just
> don't care.  If we are telling people that cars are better when they use
> more gas, and if people then believe these claims and buy bigger and
> "better" cars, then I do care and I feel a need to speak up and say it ain't
> so.

No, if people tell you they want to save the world, that most often is
b**t.

> The purpose of tinkering with our car is to have really cool tires, bumpers,
> etc?  I sure hope not.  I think Codd really thought he was coming up with a
[quoted text clipped - 8 lines]
> theory seems like what one might want to do with a discussion related to
> database theory.

No, you don't have to spend your lifetime at customer site in order to be
useful. You can't possibly have all the time in the universe to check out all
the crank theories out there.
Dawn M. Wolthuis - 21 Feb 2004 03:18 GMT
> > I was once told that there are people who think of cars as fascinating
> > machines and those who think of cars as (perhaps fascinating) machines
[quoted text clipped - 11 lines]
> You didn't have to spoil the effect of the previous paragraph with this
> sentence.

Sorry to tease you with a little humor and then toss you into some
pseudo-feminist garbage, Mikito -- my favorite table tennis technique is
side to side.  ;-)  I won't comment on any of my other games.
smiles.  --dawn
Eric Kaun - 23 Feb 2004 12:29 GMT
> However, my experience tells me that the implementations of the relational
> model "seem to" (I admit I have no concrete proof of this) be more costly,
> without corresponding benefits, to the corporate owner.

Fair enough. I've had (and continue to have) the opposite experience, for
example:
- poorly-normalized databases causing data integrity problems and query
nightmares
- "normalizable" business logic stuffed into baroque procedural code
- new business requirements forcing database reorganization because of poor
decisions which were easy to see as such even at the time they were done

Relational, done right, would yield practical value. SQL, done as
"relationally as possible", yields some practical value.

Obviously, then, our mileage varies...

> So, what is our goal in having a good theory of how to store and retrieve
> data?  --dawn

Theory is fun for its own sake, but in the case of relational is also
intended to help people at a practical level. Codd developed it in direct
response to many, many problems with network and hierarchic databases.

- erk
Dawn M. Wolthuis - 19 Feb 2004 21:45 GMT
> > > <sub-value-delimiter>2002<sub-sub-value-delimiter>Jan<sub-sub-sub-value-delimiter>25
> > >
[quoted text clipped - 7 lines]
>
> We must hire somebody in this group to do subtitles for humor impaired.

So you are unable to see the humor in my responses, Mikito?  Rest assured
that I can tell when I'm responding to tongue-in-cheek responses.
smiles. --dawn
Eric Kaun - 19 Feb 2004 22:01 GMT
> "Mikito Harakiri" <mikharakiri@iahu.com> wrote in message

> You got it -- simple, right?  It's just a function that represents a graph.
> We could call it a web!  But on the outside to the user, it is a vocabulary
> with functions (also part of the vocabulary).

Isn't it a tree or hierarchy rather than a graph?

> How do you decide whether info is a sub-table of an existing table or should
> go in its own table?  If the information is functionally dependent.

Normalization is based entirely on this same principle, so this should be an
interesting point of comparison.

> My phone numbers are information that one might think of as being in the
> relationship between me and the telecom industry.  But my phone numbers have
> no meaning apart from me -- sure they are phone numbers that exist in the
> world, but the point of capturing the data is to capture information about a
> person.

The phone numbers could have a great deal of meaning, and nesting them
limits you unnecessarily. I can think of much better examples from my
experience, but this one is more obvious and accessible. If I wanted to
keep, for example, a phone record for each employee (what outside numbers
they called), I'd then have to move the data, right? Since they're "only"
attributes, the moment I want to enrich them (i.e. I think of any other
predicate that involves them, other than "Joe Blow has phone number
123-456-7890"), I need to change my model.

Functional dependency as you've illustrated it (not that espoused in
relational theory / first-order predicate logic) is in the eye of the
beholder, and in any system I've ever worked on, that changes. Different
people see different parts of the DB, and different people consider
different predicates ("entities", if you must) to be central. The local
telecomm guy might want an application to look at phone usage, and to enable
multiple people to share a line - then the number itself (actually a
connection, not necessarily a number) becomes "central."

In relational, your model needn't change. And you haven't spent much time
setting it up right, in my opinion - we're talking about a few extra tables
for a huge flexibility advantage.
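
As a rough illustration of that point, here is a small sketch using Python's built-in `sqlite3` module. All table and column names (`person`, `phone`, `person_phone`, `outside_call`) and the sample rows are made up for the example. The idea is that when phone numbers live in their own table, a later requirement such as logging outside calls, or letting several people share a line, only adds tables and rows; nothing already stored has to move.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Initial model: people, phone numbers, and which person uses which number.
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE phone (number TEXT PRIMARY KEY);
    CREATE TABLE person_phone (
        person_id INTEGER NOT NULL REFERENCES person(id),
        number TEXT NOT NULL REFERENCES phone(number),
        PRIMARY KEY (person_id, number)
    );
    INSERT INTO person VALUES (1, 'Joe Blow'), (2, 'Jane Doe');
    INSERT INTO phone VALUES ('123-456-7890');
    -- Two people sharing one line needs no schema change.
    INSERT INTO person_phone VALUES (1, '123-456-7890'), (2, '123-456-7890');
""")

# Later requirement: record outside calls per line. Only a new table is added.
conn.executescript("""
    CREATE TABLE outside_call (
        number TEXT NOT NULL REFERENCES phone(number),
        dialed TEXT NOT NULL,
        placed_at TEXT NOT NULL
    );
    INSERT INTO outside_call
    VALUES ('123-456-7890', '555-000-1111', '2004-02-19T10:00');
""")

# The "telecom" view: start from the number, find who shares it and its calls.
sharers = [row[0] for row in conn.execute(
    "SELECT p.name FROM person p "
    "JOIN person_phone pp ON pp.person_id = p.id "
    "WHERE pp.number = ? ORDER BY p.name",
    ("123-456-7890",),
)]
calls = conn.execute(
    "SELECT COUNT(*) FROM outside_call WHERE number = ?",
    ("123-456-7890",),
).fetchone()[0]
print(sharers, calls)
```

Neither the person-centred nor the number-centred query is privileged by the storage layout, which is the flexibility being claimed.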

> Make sense?  --dawn

Not yet...

- erk
Dawn M. Wolthuis - 19 Feb 2004 23:55 GMT
> > "Mikito Harakiri" <mikharakiri@iahu.com> wrote in message
>
[quoted text clipped - 5 lines]
>
> Isn't it a tree or hierarchy rather than a graph?

A tree is a special case of a graph.  Yes, this can be modeled as a tree
graph or it can be modeled as a graph that is not a tree, with or without
cycles, once the links are made more explicit (and they are just functions
and represent connections from one node to another).  It is a di-graph
(directed graph).

> > How do you decide whether info is a sub-table of an existing table or should
> > go in its own table?  If the information is functionally dependent.
>
> Normalization is based entirely on this same principle, so this should be an
> interesting point of comparison.

Yes, indeed.  In fact, data normalization without 1NF works from my
perspective -- it is 1NF that is very flawed.  Because the definition of each
subsequent normal form requires that the data be in 1NF, however, the flaw has
a domino effect, corrupting the whole lot of these rules.

> > My phone numbers are information that one might think of as being in the
> > relationship between me and the telecom industry.  But my phone numbers
[quoted text clipped - 12 lines]
> predicate that involves them, other than "Joe Blow has phone number
> 123-456-7890"), I need to change my model.

Oddly enough, this is where the model excels.  First of all, in your
example, outgoing calls (rather than "phone numbers") are quite a different
matter and could very well warrant a separate table, without changing
anything about the phone numbers (and e-mail addresses, for that matter)
that refer to this person.

Additionally, it is changes to the data that are very easy in PICK.  If you
need to change the cardinality of any field (column-ish) in any file
(table-ish), you just do it.  Sometimes nothing at all need changing to
correspond, but often an input screen needs a field attribute changed along
with the field attribute in the file.  Sure it would be better if these were
in synch, and I'm sure some tools have that, but it is not standard in
implementations.  Anyway, if you have a report that asks for a person and
their phone numbers when the phone number was single-valued (which it was in
many systems in the 70's and 80's) and then you permit multiple values for
the phone number, the report prints out the new phone numbers too, without
any changes.
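A rough illustration of the behavior described above, written in Python rather than in any PICK retrieval language. The record layout and the helper names (values, report_line) are hypothetical; the only assumption is that the report treats every field as a list of values, so a single value is simply a list of one.

```python
def values(field):
    """Treat every field as a list of values; a bare value is a list of one."""
    return field if isinstance(field, list) else [field]

def report_line(record):
    """Format one 'person and their phone numbers' report line."""
    return record["name"] + ": " + ", ".join(values(record["phone"]))

# The same report code handles the field before and after it becomes multivalued.
old_style = {"name": "Joan Doe", "phone": "6165551234"}
new_style = {"name": "Joan Doe", "phone": ["6165551234", "7615552222"]}

print(report_line(old_style))
print(report_line(new_style))
```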

> Functional dependency as you've illustrated it (not that espoused in
> relational theory / first-order predicate logic) is in the eye of the
[quoted text clipped - 4 lines]
> multiple people to share a line - then the number itself (actually a
> connection, not necessarily a number) becomes "central."

Exactly!  This idea of democracy of data is silly - data has meaning.  If it
didn't, then there would be no attributes for entities -- everything would
be a top level entity.  If the phone company system is added to the database
then a link between their phone numbers and the people who have them could
be made without harming anything at all (add a link to the person web page
to navigate you to the info for the telco about that phone number and a link
set to the telco system that shows all people associated with a phone number
if desired)

> In relational, your model needn't change. And you haven't spent much time
> setting it up right, in my opinion - we're talking about a few extra tables
[quoted text clipped - 5 lines]
>
> - erk

Well, did that help or not? --dawn
Marshall Spight - 21 Feb 2004 16:40 GMT
> Additionally, it is changes to the data that are very easy in PICK.  If you
> need to change the cardinality of any field (column-ish) in any file
[quoted text clipped - 7 lines]
> the phone number, the report prints out the new phone numbers too, without
> any changes.

It strikes me that this paragraph describes features of Pick's GUI builder
integration. Now, GUI builder integration is a good thing, but I don't
believe it has anything to do with the qualities of the data model.

I think we should take care to separate our discussions of data models
and application integration issues.

Marshall
Dawn M. Wolthuis - 21 Feb 2004 16:53 GMT
> > Additionally, it is changes to the data that are very easy in PICK.  If you
> > need to change the cardinality of any field (column-ish) in any file
[quoted text clipped - 14 lines]
> I think we should take care to separate our discussions of data models
> and application integration issues.

Nope -- nothing to do with any GUI, but it does have to do with a database
retrieval language -- not SQL.  To the extent that SQL discussions have
something to do with relational database modeling, the query language that
has gone by so many names that it is unknown by any (started out as GIRLS in
the 60's and now includes JQL for jBASE, UniQuery for UniData, Retrieval for
UniVerse, etc) is a way of talking about the data model underlying PICK.
Make sense?

--dawn
Bob Badour - 21 Feb 2004 17:16 GMT
> > Additionally, it is changes to the data that are very easy in PICK.  If you
> > need to change the cardinality of any field (column-ish) in any file
[quoted text clipped - 14 lines]
> I think we should take care to separate our discussions of data models
> and application integration issues.

The really astounding thing about Dawn's paragraph above is the "you just do
it" part. As I explained at length and ad nauseam in the "red blue car"
thread, when you just do it, you just change the meaning of existing
queries.
Dawn M. Wolthuis - 21 Feb 2004 17:35 GMT
> > "Dawn M. Wolthuis" <dwolt@tincat-group.com> wrote in message
> news:c13ieq$27a$1@news.netins.net...
[quoted text clipped - 29 lines]
> thread, when you just do it, you just change the meaning of existing
> queries.

Bob -- I'll admit I do not understand your red blue car issue.  Could you
spell it out for me?  I would appreciate it.  thanks.  --dawn
Dave Rolsky - 22 Feb 2004 06:50 GMT
> How do you decide whether info is a sub-table of an existing table or should
> go in its own table?  If the information is functionally dependent.  My
[quoted text clipped - 4 lines]
> person.  So, don't stick them in some other function (aka file) -- put them
> with the person, even if there are more than one of them.

So you've never encountered two people who share the same phone number?

-dave

/*=======================
House Absolute Consulting
www.houseabsolute.com
=======================*/
Dawn M. Wolthuis - 22 Feb 2004 15:05 GMT
> > How do you decide whether info is a sub-table of an existing table or should
> > go in its own table?  If the information is functionally dependent.  My
[quoted text clipped - 8 lines]
>
> -dave

Sure, but what do you want to do, Dave -- get a random identifier for each
phone number and then, since people have more than one, have a link table that
links people with each of the keys to their phone numbers?  I guess that
might make sense to someone in the RDBMS world, but step back a minute and
look at that -- the not-terribly-technical term "silly" comes to my mind.

smiles.  --dawn
Marshall Spight - 23 Feb 2004 00:44 GMT
> Sure, but what do you want to do, Dave -- get a random identifier for each
> phone number and then, since people have more than one, have a link table that
> links people with each of the keys to their phone numbers?  I guess that
> might make sense to someone in the RDBMS world, but step back a minute and
> look at that -- the not-terribly-technical term "silly" comes to my mind.

That argument may carry some weight when the data type involved, a phone
number, is approximately the same size as a foreign key. For example,
one could imagine using the phone number itself as the key to the
phone numbers table (it is unique, after all) and at that point, there's
no value to the association table any more.

But that argument stops working as soon as the amount of information
grows a bit. Consider even something as simple as addresses. They
aren't all that much more complicated than a phone number: line1,
line2, city, state, zip. Repeating that in the person record for each
person at the house isn't efficient, and it makes correcting typographic
errors more error-prone. (One can imagine each person in the house
having the same address but with the street spelled differently.)

For my personal experience, the day someone explained association
tables to me was the first day that I began to think that the database
world might really have something interesting to say. (This was some
time ago, but I still remember the exact moment of realization.) It
is a significant achievement, and I know of no other system that has
something that handles many:many relationships as well. Certainly
no OO language, for all the emphasis on container classes, has ever
handled them as well.
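For readers who have not met association tables, here is a small sketch of the address case using Python's sqlite3 module. The schema, names, and sample data are invented for illustration. Each address is stored once, the person_address table carries the many:many relationship, and a typographic fix is a single update that every resident sees.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE person  (person_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY,
                          line1 TEXT NOT NULL, city TEXT NOT NULL,
                          state TEXT NOT NULL, zip  TEXT NOT NULL);
    -- The association table: one row per (person, address) pairing.
    CREATE TABLE person_address (
        person_id  INTEGER NOT NULL REFERENCES person(person_id),
        address_id INTEGER NOT NULL REFERENCES address(address_id),
        PRIMARY KEY (person_id, address_id)
    );
    INSERT INTO person  VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO address VALUES (10, '12 Mian St', 'Ames', 'IA', '50010');
    INSERT INTO person_address VALUES (1, 10), (2, 10);
""")

# The street name was mistyped; it is corrected in exactly one place.
db.execute("UPDATE address SET line1 = '12 Main St' WHERE address_id = 10")

def addresses_of(name):
    """Person -> addresses."""
    rows = db.execute(
        "SELECT a.line1 FROM person p "
        "JOIN person_address pa ON pa.person_id = p.person_id "
        "JOIN address a ON a.address_id = pa.address_id "
        "WHERE p.name = ? ORDER BY a.line1",
        (name,),
    )
    return [line1 for (line1,) in rows]

def residents_of(address_id):
    """Address -> people, answered from the same association table."""
    rows = db.execute(
        "SELECT p.name FROM person p "
        "JOIN person_address pa ON pa.person_id = p.person_id "
        "WHERE pa.address_id = ? ORDER BY p.name",
        (address_id,),
    )
    return [name for (name,) in rows]

print(addresses_of("Ben"), residents_of(10))
```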

Marshall
Bob Badour - 23 Feb 2004 02:41 GMT
> > Sure, but what do you want to do, Dave -- get a random identifier for each
> > phone number and then, since people have more than one, have a link table that
[quoted text clipped - 4 lines]
> That argument may carry some weight when the data type involved, a phone
> number, is approximately the same size as a foreign key.

As you point out, the suggestion to have a surrogate for a simple, familiar,
stable candidate key is simply fatuous. But what can one expect from someone
like Dawn?
Dawn M. Wolthuis - 23 Feb 2004 03:19 GMT
> > Sure, but what do you want to do, Dave -- get a random identifier for each
> > phone number and then, since people have more than one, have a link table that
[quoted text clipped - 24 lines]
> no OO language, for all the emphasis on container classes, has ever
> handled them as well.

Yes, Marshall, you are right that I answered the specific instance and not a
generalization.  When it comes to addresses, they are many-to-many with
people, while 1-1 with places (if we can make the assumption that each place
has one and only one address).  I know the language I'm about to use doesn't
play well with the RDBMS theorists, but I would think of my entities as
people, places, and things.  Addresses go with a "Places" entity, perhaps
named "Addresses".

So, we have people and we have addresses with a M-M relationship.  In a PICK
model, the implementation would likely include a PEOPLE function and an
ADDRESSES function (called "files").  The PEOPLE function would map to a
list of ADDRESSES identifiers.  Two files with a link between them.

Often the implementation would include return-links to improve performance
if there would be queries not just about what addresses a person has, but
also who lives at a particular address.  Then extensions could be made to
the vocabulary of each function so that queries like this would be the norm:

LIST PEOPLE FULL-ADDRESSES

LIST ADDRESSES FULL-NAMES

This is a different approach to creating a VIEW with COLUMNS from each TABLE
against which the user would query.  In the case of the RDBMS, we then come
up with a new vocabulary for an entity, while often retaining the names of
the attributes.  PICK lets the user of the database think of each
file/function as a portal into the database, increasing the language for
that file/function but not altering the name of the file unless there is
some purpose to doing so.  Each "view" in PICK looks through the eyes of one
of the implemented functions (files).

This approach is much more like the web, where there are documents with
links to other documents, which might then have return links to all docs
that point to them.  Part of the charm is in the simplicity of the approach.

--dawn
Dave Rolsky - 23 Feb 2004 05:39 GMT
> Yes, Marshall, you are right that I answered the specific instance and not a
> generalization.  When it comes to addresses, they are many-to-many with
[quoted text clipped - 17 lines]
>
> LIST ADDRESSES FULL-NAMES

So basically what you're saying is you'd write custom code (if I read the
word "function" correctly here) to do what a relational database would
let you do with its query language?

And you'd do that _every_ time you had to express an M-M relationship?

And if you _don't_ implement these performance improving return links, the
database cannot optimize that query?

> This approach is much more like the web, where there are documents with
> links to other documents, which might then have return links to all docs
> that point to them.  Part of the charm is in the simplicity of the approach.

I don't see the charm in having to hand-code the same thing over and over
again.

I'd rather declaratively say "this is my data, these are the relationships
between them", and have the DBMS take care of the optimization.

Granted, today's SQL databases don't handle this perfectly, but they're
not all that bad at it either.

-dave

/*=======================
House Absolute Consulting
www.houseabsolute.com
=======================*/
Dawn M. Wolthuis - 23 Feb 2004 13:48 GMT
> > Yes, Marshall, you are right that I answered the specific instance and not a
> > generalization.  When it comes to addresses, they are many-to-many with
[quoted text clipped - 21 lines]
> word "function" correctly here) to do what a relational database would
> let you do with its query language?

Sorry -- I should have defined my terms -- a mathematical function is a
relation that has only one value for each element in the domain.  I'm
modeling the data in functions (which are then necessarily relations) and I
use the word "functions" for two reasons:
1) because if I were to use the term "relations" then I would be corrected
on the modeling into these relations because I do not use all of "relational
database theory" and implementations of this "functional model" are not
RDBMS's.
2) because I would rather work with functional databases than relational
databases ;-)
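The definition above can be made concrete with a short sketch. The data and the function name is_function are invented for illustration. A relation stored as (key, value) pairs is a function only when no key maps to two different values; mapping each person to a set of phone numbers turns the same information into a function.

```python
def is_function(pairs):
    """Return True if no domain element is paired with two different values."""
    seen = {}
    for key, value in pairs:
        if key in seen and seen[key] != value:
            return False
        seen[key] = value
    return True

# One row per (person, phone): a relation, but not a function of person.
person_phone = {("dawn", "616-555-1234"), ("dawn", "761-555-2222")}

# The same facts with a set-valued result: exactly one value per person.
person_phones = {("dawn", frozenset({"616-555-1234", "761-555-2222"}))}

print(is_function(person_phone), is_function(person_phones))
```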

> And you'd do that _every_ time you had to express an M-M relationship?
>
> And if you _don't_ implement these performance improving return links, the
> database cannot optimize that query?

Oddly enough, having a database that does query optimization was not
something I ever heard of until RDBMS's came about -- OF COURSE databases
optimize the processing of the query -- that's one of their jobs, right?
However, no matter how optimized queries are, stored data is faster to query
than derived data.  Since the vocabulary for an entity is the combination
of words for stored data and words for derived data, having a vocabulary
element for the list of people who have this address, for example, is going
to yield query results faster if the data is located there (in that same
stored record) than if the data needs to be found and then derived from
stored data in many other records.

> > This approach is much more like the web, where there are documents with
> > links to other documents, which might then have return links to all docs
> > that point to them.  Part of the charm is in the simplicity of the approach.
>
> I don't see the charm in having to hand-code the same thing over and over
> again.

Definitely agree -- but that is not the case here.

> I'd rather declaratively say "this is my data, these are the relationships
> between them", and have the DBMS take care of the optimization.
> Granted, today's SQL databases don't handle this perfectly, but they're
> not all that bad at it either.

Each has its pros and cons, just as other models do.  When I read books that
talk about databases, they seem to indicate that relational databases are
what's good and non-relational have somehow been proven to be the bad, old
stuff.  With OO, XML, and OLAP discussions added to some text books, this is
starting to move a bit -- I'm just trying to help it move faster.
Relational databases are neither mathematically nor empirically proven to be
superior to anything, as best I can tell.  They are simply one approach and
not, in my opinion, the best that we, as an industry, can offer.
smiles.  --dawn
Marshall Spight - 25 Feb 2004 03:53 GMT
> So, we have people and we have addresses with a M-M relationship.  In a PICK
> model, the implementation would likely include a PEOPLE function and an
> ADDRESSES function (called "files").

If I understand correctly, your use of the word "function" here
means a mapping from one set to another. In the PEOPLE case,
it would be from something analogous to a primary key, to
all the other attributes of a person. Yes?

Doesn't this mean that a Pick, uh, "file" is the same thing as
an SQL relation with the restriction of only having one
unique attribute? Again, I'm not sure I understand, but it
sounds like another difference is that (no distinction is made |
it's easy to change) in deciding if an attribute is single valued
or set-valued. Does that mean that every attribute has zero-or-more
values?

>  The PEOPLE function would map to a
> list of ADDRESSES identifiers.

Among other things, one assumes?

> Two files with a link between them.
>
[quoted text clipped - 6 lines]
>
> LIST ADDRESSES FULL-NAMES

Okay, but this introduces all kinds of opportunities for inconsistent
data. Is there anything keeping the two sets of data in sync? Is it
automatic or manual?
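To make the concern concrete, here is a small sketch with plain Python dictionaries standing in for the two files. The record layout, the identifiers, and the helper names (derive_return_links, out_of_sync) are hypothetical, and nothing here claims to show how any PICK implementation maintains its links. It only shows that stored return links can disagree with the forward links unless something re-derives or checks them.

```python
# Forward links: each person record lists its address identifiers.
people = {
    "P1": {"name": "Ann", "addresses": ["A1", "A2"]},
    "P2": {"name": "Ben", "addresses": ["A1"]},
}

# Stored return links: each address record lists its people.
# A1 was never updated when Ben moved in, so it is stale.
addresses = {
    "A1": {"line1": "12 Main St", "people": ["P1"]},
    "A2": {"line1": "9 Elm Ave", "people": ["P1"]},
}

def derive_return_links(people):
    """Compute address -> set of person ids from the forward links alone."""
    links = {}
    for person_id, record in people.items():
        for address_id in record["addresses"]:
            links.setdefault(address_id, set()).add(person_id)
    return links

def out_of_sync(people, addresses):
    """List the address ids whose stored return links differ from the derived ones."""
    derived = derive_return_links(people)
    return sorted(
        address_id
        for address_id in addresses.keys() | derived.keys()
        if set(addresses.get(address_id, {}).get("people", []))
        != derived.get(address_id, set())
    )

print(out_of_sync(people, addresses))
```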

> This is a different approach to creating a VIEW with COLUMNS from each TABLE
> against which the user would query.  In the case of the RDBMS, we then come
[quoted text clipped - 8 lines]
> links to other documents, which might then have return links to all docs
> that point to them.  Part of the charm is in the simplicity of the approach.

I note that the web is a disaster from a data management point of view.

Marshall
Bob Badour - 19 Feb 2004 19:40 GMT
> > This string of data could be:
> > Joan<field-delimiter>
> > Doe<field-delimiter>
> > 6165551234<value-delimiter>7615552222<field-delimiter>
> > MATH<sub-value-delimiter>2002<value-delimiter>PHIL<sub-value-delimiter>2003<field-delimiter>
>
> How about:
> <sub-value-delimiter>2002<sub-sub-value-delimiter>Jan<sub-sub-sub-value-delimiter>25
>
> Am I expert Pick programmer already?

I don't know. Is your nose bleeding yet?
Dawn M. Wolthuis - 19 Feb 2004 20:15 GMT
> > > This string of data could be:
> > > Joan<field-delimiter>
> > > Doe<field-delimiter>
> > > 6165551234<value-delimiter>7615552222<field-delimiter>
> > > MATH<sub-value-delimiter>2002<value-delimiter>PHIL<sub-value-delimiter>2003<field-delimiter>
> >
> > How about:
> > <sub-value-delimiter>2002<sub-sub-value-delimiter>Jan<sub-sub-sub-value-delimiter>25
> >
> > Am I expert Pick programmer already?
>
> I don't know. Is your nose bleeding yet?

Maybe if there were no relational-zealot-bullies we wouldn't have to live
with bloody noses.

Seriously, Bob -- do you know what you are talking about?  Have you worked with
any implementations of the Nelson-Pick model?  If so, which one(s)?  Have
you studied the model?  Your responses to Pick all seem so emotionally
charged and not based on logic -- what is the basis for your claims?

I've worked with hierarchical, network, relational dbms's as well as various
file systems along with PICK.  There are pros and cons to each of the
environments I've worked with, but the PICK advantage is the "big bang for
the buck" advantage.  The core of the implementations that are out there
today are quite dated in this distributed computing world (so I wouldn't
call it state of the art), but the data model is definitely making a
comeback, and for good reason.  --dawn
Eric Kaun - 23 Feb 2004 12:39 GMT
> [...]
> Think of a "file" as a function that maps an identifier to a set of
[quoted text clipped - 6 lines]
> Doe<field-delimiter>
> 6165551234<value-delimiter>7615552222<field-delimiter>
> MATH<sub-value-delimiter>2002<value-delimiter>PHIL<sub-value-delimiter>2003<field-delimiter>
>
[quoted text clipped - 25 lines]
> So, everything is defined in terms of functions including stored data and
> any other vocabulary (for stored or virtual fields)

But the functions are only naming things, right? They don't really establish
any typing, nor are they computational functions (or can they be?). It's
just to establish that field 3 is phone number? In any event, should you
choose to, relational can do exactly the same thing - establish a PHONELIST
type, and write SECONDARY_PHONE as a function over it. Or even a LIST type,
and SECOND_ELEMENT on it, if you care to.
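As a sketch of what "field 3 is phone number" and a SECONDARY_PHONE function could look like over the delimited string quoted above: the code below is Python, not PICK BASIC or any relational language. The delimiter characters are an assumption (the quoted example names the delimiters but not their values), and parse_record and secondary_phone are hypothetical helper names.

```python
# Assumed delimiter characters; the quoted example only names them.
FIELD, VALUE, SUBVALUE = "\xfe", "\xfd", "\xfc"

def parse_record(raw):
    """Split a delimited record into fields -> values -> sub-values."""
    fields = raw.split(FIELD)
    if fields and fields[-1] == "":
        fields.pop()  # the example string ends with a field delimiter
    return [[value.split(SUBVALUE) for value in field.split(VALUE)]
            for field in fields]

# The example record from the quoted text, rebuilt with the assumed delimiters.
raw = FIELD.join([
    "Joan",
    "Doe",
    "6165551234" + VALUE + "7615552222",
    "MATH" + SUBVALUE + "2002" + VALUE + "PHIL" + SUBVALUE + "2003",
]) + FIELD

record = parse_record(raw)

def secondary_phone(record):
    """Second value of field 3 (index 2), or None if there is only one phone."""
    phones = record[2]
    return phones[1][0] if len(phones) > 1 else None

print(secondary_phone(record))
```

Nothing in this sketch checks that a phone number looks like a phone number, which is the typing question raised above: the positions are named, but the values stay untyped strings.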

- erk
Dawn M. Wolthuis - 23 Feb 2004 13:54 GMT
> > [...]
> > Think of a "file" as a function that maps an identifier to a set of
[quoted text clipped - 6 lines]
> > Doe<field-delimiter>
> > 6165551234<value-delimiter>7615552222<field-delimiter>
> > MATH<sub-value-delimiter>2002<value-delimiter>PHIL<sub-value-delimiter>2003<field-delimiter>
> >
[quoted text clipped - 34 lines]
> any typing, nor are they computational functions (or can they be?). It's
> just to establish that field 3 is phone number?

These are actually able to define any derived data from any files across the
system.  So, they do linking, computation, etc (by use of subroutines
employing procedural code -- feel free to groan, but ...).

> In any event, should you
> choose to, relational can do exactly the same thing - establish a PHONELIST
> type, and write SECONDARY_PHONE as a function over it. Or even a LIST type,
> and SECOND_ELEMENT on it, if you care to.

I know that PICK can emulate a relational database (at least to the extent
that it can be SQL-compliant) and I know that relational db's can employ
user-defined functions or stored procedures, or whatever, to add to the
language.  However, in implementations like Oracle, a stored procedure still
doesn't return a list of values to my knowledge (SQL Server does have that
capability).  I recall a few years ago someone saying that PICK can pretend
to be relational, but relational cannot pretend to be PICK.  That might not
be the case now with some of the additional types added to some rdbms's for
non-simple values.
Cheers!  --dawn
Paul - 18 Feb 2004 22:59 GMT
bbadour@golden.net says...

> > The databases you referenced are not relational.

> Again, it suffices to note that Dawn is ignorant and is burning a straw man.

Just a quibble, but doesn't one demolish a straw man?

Paul...

plinehan  y_a_h_o_o  and d_o_t  com
C++ Builder 5 SP1, Interbase 6.0.1.6 IBX 5.04 W2K Pro
Please do not top-post.

"XML avoids the fundamental question of what we should do,
by focusing entirely on how we should do it."

quote from http://www.metatorial.com 

Christopher Browne - 19 Feb 2004 03:29 GMT
> bbadour@golden.net says...
>
[quoted text clipped - 3 lines]
>
> Just a quibble, but doesn't one demolish a straw man?

He's presumably mixing metaphors intentionally, and actually I'd quite
enjoy demolishing a straw man by setting fire to it; they certainly
are vulnerable to fire :-).

(format nil "~S@~S" "cbbrowne" "cbbrowne.com")
http://cbbrowne.com/info/languages.html
I hate wet paper bags.

rkc - 19 Feb 2004 14:35 GMT
> bbadour@golden.net says...
>
[quoted text clipped - 3 lines]
>
> Just a quibble, but doesn't one demolish a straw man?

"They tore my legs off and they threw them over there."
"Then they took my chest out and they threw it over there."

The Scarecrow - The Wizard of Oz
Marshall Spight - 21 Feb 2004 16:42 GMT
> "They tore my legs off and they threw them over there."
> "Then they took my chest out and they threw it over there."

"That's you, all over."

Marshall