Stored fields ordered left to right
Dawn M. Wolthuis - 27 Dec 2003 02:37 GMT Marshall agreed with me on something, so I'll dare post one of the many questions I have related to relational database theory.
In Date's June 2003 paper entitled "What First Normal Form Really Means" he asks questions about MultiValue systems, which he (and no one else I've read) abbreviates as "MVS". I am preparing answers to these questions. Date writes "...MVS fields are ordered left to right (and so MVS files are certainly not relations, and the system is certainly not relational)."
I've been puzzled by this for quite a while, just figuring that relational theorists have this wrong. But the writings seem so sure of this. I had thought that the relations in relational database theory were mathematical relations, but I am beginning to think that might not be the case. My masters in mathematics was quite some time ago, so I hauled out some books and googled a bit and everything I find that is mathematics, rather than database theory, indicates what I thought about relations -- a relation is a set of ordered tuples -- right? What am I missing here? An element of a relation would be of the form (a1, a2, ... an) where a1 is an element of S1 etc. They ARE ORDERED LEFT TO RIGHT. Am I misunderstanding something or is there some other mathematical definition of "relation" that is the one on which relational database theory is based?
Thanks in advance for your help. --dawn
Joe "Nuke Me Xemu" Foster - 27 Dec 2003 03:36 GMT > Marshall agreed with me on something, so I'll dare post one of the many > questions I have related to relational database theory. [quoted text clipped - 17 lines] > there some other mathematical definition of "relation" that is the one on > which relational database theory is based? A tuple is a set of ordered pairs of the form (attribute, value). Defining an ordering for the attributes would be superfluous.
URL:http://en2.wikipedia.org/wiki/Ordered_pair
You have much to unlearn, Grasshopper.
-- Joe Foster <mailto:jlfoster%40znet.com> On the cans? <http://www.xenu.net/> WARNING: I cannot be held responsible for the above because my cats have apparently learned to type. They're coming to take me away, ha ha!
Dawn M. Wolthuis - 27 Dec 2003 04:17 GMT > > Marshall agreed with me on something, so I'll dare post one of the many > > questions I have related to relational database theory. [quoted text clipped - 22 lines] > > URL:http://en2.wikipedia.org/wiki/Ordered_pair So a tuple is NOT ordered? Why not? Why even call an unordered set a tuple? Tuples imply ordering, right? They are elements of Set1 x Set2 x Set3 x ... Setn. It is fine with me if you want those sets to be sets of ordered pairs -- they can be sets of whatever, but the relation is then a set of tuples (s1, ... sn) where s1 is an element of S1 (and in your def, that means it would be an ordered pair).
But a RELATION itself is a set of ORDERED TUPLES -- RIGHT? Else please point me to a MATHEMATICS definition that allows for relations that are not ordered. I'm not finding any such definitions.
> You have much to unlearn, Grasshopper. Ditto, methinks. Smiles. --dawn
> -- > Joe Foster <mailto:jlfoster%40znet.com> On the cans? <http://www.xenu.net/> > WARNING: I cannot be held responsible for the above They're coming to > because my cats have apparently learned to type. take me away, ha ha! Jerry Gitomer - 27 Dec 2003 04:58 GMT >>"Dawn M. Wolthuis" <dwolt@tincat-group.com> wrote in message > > <news:bsira5$jlt$1@news.netins.net>... <snip>
>>>Date writes "...MVS fields are ordered left to right (and so MVS files > > are > >>>certainly not relations, and the system is certainly not relational)." <big snip>
Allow me to play the role of the fool jumping in where angels fear to tread.....
Two points which may clarify RDBMS implementation (as opposed to theory).
1. The relationships are imposed externally to the data in the form of indexes and/or foreign keys. The data itself is unordered. If I remember correctly Codd specifically stated that the data was not ordered. From an implementation point of view (circa 1970 when the largest mainframes weren't much faster than my Palm IIIc) this allowed significantly better performance when adding rows that should logically be anyplace other than the end of the table.
2. Within a table row the physical order of the columns as stored on disk need not conform to the logical order of the columns as specified in the CREATE TABLE statement. Again looking at the mainframe computers of 1970 when all but the very largest had less than 16MB of RAM, the highest capacity disk drives only had 33MB of storage and there were arcane rules about floats and integers starting on word boundaries while short integers and strings could start on any byte in a word it became desirable to store all of the floats and integers at the beginning of the physical disk record and the shorts and strings after them in order not to waste any space in either memory or on disk.
HTH
Dawn M. Wolthuis - 27 Dec 2003 15:40 GMT > >>"Dawn M. Wolthuis" <dwolt@tincat-group.com> wrote in message > > [quoted text clipped - 39 lines] > > HTH Yes, this is most helpful. This is PRECISELY my understanding -- that deciding to remove the ordering from relational tuples is an implementation issue and not about the logical theory of relations.
I work with relations that are mathematical relations and are therefore ordered tuples. The model behind XML documents is also one of ordered tuples. So, if you hear of folks who might sometimes spout that their database model is "more relational" than RDBMS's it sometimes is due to this particular issue.
Based on this, it sounds like a response to Date that says that mathematical relations are ORDERED and not unordered tuples so that this particular point is irrelevant (and, in fact, wrong) would be an accurate response, right?
Thanks a bunch! --dawn
Jonathan Leffler - 28 Dec 2003 03:26 GMT >>>"Joe "Nuke Me Xemu" Foster" <joe@bftsi0.UUCP> wrote: >><snip> [quoted text clipped - 32 lines] > this particular point is irrelevant (and, in fact, wrong) would be > an accurate response, right? It depends on the premises from which you work.
One of the documented differences between mathematical relations and relations used in database theory is precisely this one - that the elements in a tuple of a mathematical relation are ordered (usually ordered pairs, in fact) but database theory uses unordered tuples, where each element logically consists of the combination attribute name, attribute type and attribute value. Of course, in a system without inheritance to complicate matters, the attribute type associated with a given attribute name is the same for all tuples in the relation (but the converse is not true). The difference between ordered mathematical relations and the unordered database equivalent is clearly stated in Codd's original (1970) paper, incidentally:
Accordingly, we propose that users deal, not with relations that are domain-ordered, but with relationships which are their domain-unordered counterparts.
[Note the implied distinction between what users see and what the system manages, too.]
Many practical systems store each record (physical analogue of a tuple) with the fields (the physical analogue of an attribute) stored in the same order, which makes it easier to locate a given field within a given record. And many systems make life still easier by storing the data for a given field in a constant width, so it is trivially possible to pre-calculate the offset into the record for a given attribute value.
To get back to the question - if you change the premises on which your version of relational theory is based to state that your tuples are indeed ordered, then of course Date's statements no longer apply. The theory about which he is making statements states that tuples are unordered. Both are valid sets of premises, but they are different sets of premises, and statements made about one are not valid for the other.
As to which set of premises is better - that is a separate discussion. I strongly suspect there are a rather large number of issues that have to be resolved when you use ordered tuples rather than unordered tuples. Most notably, A JOIN B is not the same as B JOIN A under the ordered scheme - with consequences that need to be considered very carefully.
And, reverting to the final question again - no, Date's comments are neither irrelevant nor wrong in the system about which the comments were made.
 Signature Jonathan Leffler #include <disclaimer.h> Email: jleffler@earthlink.net, jleffler@us.ibm.com Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/
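The point about A JOIN B versus B JOIN A can be illustrated with a small sketch. Everything here is an assumption made for illustration: the helper names `join_ordered` and `join_named`, the toy data, and the choice to model a named tuple as a frozenset of (attribute, value) pairs. It is not taken from Codd's or Date's formal definitions.

```python
# Illustrative sketch only; helper names and data are hypothetical.

def join_ordered(r, s):
    """Join positional tuples on their first component.
    The left operand's values come first in each result tuple."""
    return {a + b[1:] for a in r for b in s if a[0] == b[0]}

def join_named(r, s):
    """Natural join of tuples modeled as frozensets of (attribute, value) pairs.
    Two tuples combine when they agree on every shared attribute name."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

# The same facts in both encodings.
emp_ordered = {(1, "Ann")}
dept_ordered = {(1, "Sales")}
emp_named = {frozenset({("id", 1), ("name", "Ann")})}
dept_named = {frozenset({("id", 1), ("dept", "Sales")})}

ordered_ab = join_ordered(emp_ordered, dept_ordered)  # {(1, "Ann", "Sales")}
ordered_ba = join_ordered(dept_ordered, emp_ordered)  # {(1, "Sales", "Ann")}
named_ab = join_named(emp_named, dept_named)
named_ba = join_named(dept_named, emp_named)
```

In this toy encoding the positional join yields differently ordered tuples depending on operand order, while the attribute-named join yields the same set either way.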
Dawn M. Wolthuis - 28 Dec 2003 03:41 GMT <big snip>
> One of the documented differences between mathematical relations and > relations used in database theory is precisely this one - that the > elements in a tuple of a mathematical relation are ordered (usually > ordered pairs, in fact) but database theory uses unordered tuples, Be careful not to equate Codd's database theory with all database theory. I work with alternative database theories. The primary one that I work with has, in the logical model, the concept of ordered tuples, aka mathematical relations. In fact, they are relations which are mathematical functions as well. It appears to me that relational database theory works with relations that are not, by definition, mathematical relations while at least one database theory that actually uses mathematical relations is often accused by relational theorists of not being relational. Am I the only one intrigued by this?
> where each element logically consists of the combination attribute > name, attribute type and attribute value. Of course, in a system [quoted text clipped - 26 lines] > sets of premises, and statements made about one are not valid for the > other. I agree, so I'm trying to get some common language. Taking mathematical vocabulary and then changing it (and in this particular case changing relations to require that they be unordered when mathematically they are ordered !!) is more than likely to cause problems when discussing database theories (as I have seen time and again, which is why I'm trying to align my language with that of relational theorists EXCEPT THAT I AM NOT WILLING TO SACRIFICE ACCURATE MATHEMATICS, if we can avoid that)
> As to which set of premises is better - that is a separate discussion. > I strongly suspect there are a rather large number of issues that > have to be resolved when you use ordered tuples rather than unordered > tuples. Most notably, A JOIN B is not the same a B JOIN A under the > ordered scheme - with consequences that need to be considered very > carefully. Yes, you are absolutely right. In fact, the model I work with is more of a di-graph of data and instead of joins, it uses data trees (much like the web). This is accomplished with functions as well. Instead of using two different concepts -- sets/relations combined with operators, the model I work with uses functions in both cases. All work with computers can be seen in terms of functions that map one "object" to another, in fact. Both functions and objects can be stored in memory, disk or wherever. [sorry, I'm digressing]
> And, reverting to the final question again - no, Date's comments are > neither irrelevant nor wrong in the system about which the comments > were made. I can see that he is not using the mathematical term "relation" but a definition that has evolved from the work of Codd and has redefined relation so that it means something different in database theory. Don't you hate it when that happens!!!??? Now I need to have a term that stands for database theory that is based on the mathematical definition of relation and distinguish that from what is currently called relational database theory that does not match the mathematical definition of relation. UGH! Please help. --dawn
Adrian Kubala - 09 Jan 2004 21:51 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> Yes, you are absolutely right. In fact, the model I work with is more of a > di-graph of data and instead of joins, it uses data trees (much like the > web). This is accomplished with functions as well. Instead of using two > different concepts -- sets/relations combined with operators, the model I > work with uses functions in both cases. Functions are one-way mappings. Many relationships in the world work both ways. It certainly seems useful to distinguish these two kinds of relationships, which relations + functions does but functions alone does not.
Dawn M. Wolthuis - 10 Jan 2004 03:18 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > Yes, you are absolutely right. In fact, the model I work with is more of a [quoted text clipped - 7 lines] > relationships, which relations + functions does but functions alone does > not. This must be another issue of definitions because there are no functions in mathematics that are not relations. Functions are a particular type of relation. Without looking up Codd's redefinition of a function, if he has one, but just sticking to pure mathematical definitions, a function is necessarily a relation, so your statement could not possibly be true in that realm. In areas other than mathematics, I'm sure that people can redefine these concepts any way they wish.
But are we in agreement that using mathematics terminology and working from the standpoint of mathematics, all functions are, by definition, also relations? Thanks. --dawn
Adrian Kubala - 10 Jan 2004 22:59 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
>> Functions are one-way mappings. Many relationships in the world work >> both ways. It certainly seems useful to distinguish these two kinds of [quoted text clipped - 4 lines] > functions in mathematics that are not relations. Functions are a > particular type of relation. I had assumed you were talking about representation. There is clearly some difference between, i.e. the function y = x and the relation {<0,0>, <1,1>, <2,2>, ...}. For example, it would be impossible to enumerate that relation explicitly in memory. On the other hand, some functions are such that if you express them implicitly as equations it is harder to solve for some variables than others. That's why I say both representations have merit, but that for the kinds of relations represented in a database it's usually simplest to express them explicitly.
Since you were not talking about relations vs functions in terms of their representation, I don't understand your original point. All functions are relations but not all relations are functions, therefore a function-only database is strictly less expressive than a fully-relational one with no benefits.
On the other hand, if a database allowed you to describe *some* relations as functions and took advantage of algebraic reasoning when creating derived relations from these functions, that would be really neat. But then you'd basically have a kind of Prolog, right?
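The representation distinction drawn in this post (a function given by a rule versus a relation listed out as pairs) might be sketched as follows. The names `identity`, `identity_pairs`, and `lives_at`, and the sample data, are hypothetical and chosen only for illustration.

```python
from itertools import count, islice

def identity(x):
    """Implicit representation of y = x: a rule, stored in constant space."""
    return x

def identity_pairs():
    """Explicit representation of the same relation {<0,0>, <1,1>, ...}
    over the naturals. It can only be produced lazily; the full
    enumeration never ends, so it cannot be materialized in memory."""
    return ((n, n) for n in count())

# Only a finite prefix of the explicit form can ever be held in memory.
first_three = list(islice(identity_pairs(), 3))

# A finite, explicitly stored relation can be read in either direction
# with the same data, which a one-way rule does not give for free.
lives_at = {("ann", "oslo"), ("bob", "oslo"), ("ann", "rome")}
places_of_ann = {place for (who, place) in lives_at if who == "ann"}
people_in_oslo = {who for (who, place) in lives_at if place == "oslo"}
```

The sketch says nothing about which representation a given database product uses; it only shows why both forms are worth distinguishing.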
Dawn M. Wolthuis - 11 Jan 2004 00:41 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > >> Functions are one-way mappings. Many relationships in the world work [quoted text clipped - 26 lines] > creating derived relations from these functions, that would be really > neat. But then you'd basically have a kind of Prolog, right? With this discussion, I've been focussed on one specific issue, where the database model I am using has been taken to task for not employing relations. I have no problem stating that it does not 100% follow a relational database model; however, this one point -- that it does not employ relations -- is entirely false. Everything in the model is a function, and all functions are, by definition, relations. Chris Date asks questions about the MultiValue database model in his papers on 1NF this summer and he and Pascal are likely correct that nothing they have read is precise enough to respond to. The argument again crops up that the Nelson-Pick/MultiValue model is not based on relations and I intend to state that because it is based on functions and functions are, by definition, relations, that it is most definitely based on relations. Other arguments that it does not abide by every relational database rule are likely accurate, but this indictment is not.
So, that's my point -- the MultiValue model is based on the mathematical definition of relations, and should not be accused of not being "relational" in that regard, even if it is not based on Codd's relational database model. Cheers! --dawn
Adrian Kubala - 11 Jan 2004 02:30 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> With this discussion, I've been focussed on one specific issue, where the > database model I am using has been taken to task for not employing > relations. I have no problem stating that it does not 100% follow a > relational database model, however, this one point -- that it does not > employ relations is entirely false. That's like calling a black and white camera a "color camera" because black and white are colors.
Dawn M. Wolthuis - 11 Jan 2004 03:45 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > With this discussion, I've been focussed on one specific issue, where the [quoted text clipped - 5 lines] > That's like calling a black and white camera a "color camera" because > black and white are colors. Most certainly not. First of all, black and white are not typically considered colors by color professionals, I believe.
I will be the first to say that the Nelson-Pick model does not meet the criteria of the relational database model. But it is absolutely the case (if you accept my analysis that it is based on functions, I think you will agree) that it is a mathematically relational model, right? I'm trying to use terminology that would be agreeable to all and using mathematical terms in order to ensure precision seems like the best place to go for definitions of mathematical terms.
However, I don't want to use emotive language. I do want to be true to Codd's interest in being precise. If using mathematical language that has been co-opted by various "sides" in some "debate" will trip people up, then I need to come up with new language that will not trip people's emotional buttons.
Given that, should I completely avoid the word "relation" when referring to the mathematics of data modeling? I have already decided to set the word "domain" aside since it is a completely abused term. Relations in mathematics are quite consistently defined (as are domains, but, ah well), so if people are willing to put on mathematics "hats" and accept mathematical definitions, I think I can be precise without redefining these terms.
But I am curious -- if I prove that a database model is based on mathematical relations, and from that perspective is a "relational model", when both you and I would agree (even though IBM's marketing material does not) that it is not based on THE relational database model as specified by Codd, so that databases that implement this model should not be called RDBMS's (for example) -- is that likely to cause relational theorists to bit-flip (and switch from logical thinking to emotional reactions) and disregard anything else that is said?
I'm trying to choose my terminology wisely, but if I call a five a five and it trips emotions, well then what's a girl ta do, eh? smiles. --dawn
Adrian Kubala - 11 Jan 2004 22:05 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
>> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: >> > With this discussion, I've been focussed on one specific issue, [quoted text clipped - 13 lines] > (if you accept my analysis that it is based on functions, I think you will > agree) that it is a mathematically relational model, right? It does not allow you to express *any* relation, therefore it is not relational, in exactly the same way that a camera which only lets you take pictures of pink is not a color camera. If your model let you express mathematical relations instead of "Codd relations", then I would agree with you, but since there are mathematical relations which your model cannot express, it's wrong to call it relational.
Especially since it seems your intent in doing so is to imply that it is just as expressive as the relational model, when in fact it is strictly less expressive. (Not to imply any value judgement, but simply to convey the fact that there are relations which your model cannot possibly express but any relational model can, whereas any function that your model can express, any relational model can as well.)
Dawn M. Wolthuis - 12 Jan 2004 03:06 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > [quoted text clipped - 22 lines] > agree with you, but since there are mathematical relations which your > model cannot express, it's wrong to call it relational. Every relation where each tuple has a unique identifier or candidate key can also be represented as a function. Can you give an example of a set of propositions that can be modeled as relations but not as functions? These functions are not limited to 1NF and, as such, typical propositions can be modeled much more handily than in a relational model. But if you can produce an example that can be expressed using relations and not functions or that is even easier to work with when represented as relations rather than functions, I am VERY interested.
> Especially since it seems your intent in doing so is to imply that it is > just as expressive as the relational model, when in fact it is strictly > less expressive. (Not to imply any value judgement, but simply to convey > the fact that there are relations which your model cannot possibly > express but any relational model can, whereas any function that your > model can express, any relational model can as well.) Yes, you are right that I intend to state that it is just as expressive, but no, it does not simply follow from the statement that it uses functions that it is more expressive. I will try not to jump steps like that. I think you are incorrect in assuming that functions are less expressive for data modeling purposes than relations, but if I am wrong about that, I definitely would like to be corrected. Thanks for your help. --dawn
Adrian Kubala - 12 Jan 2004 06:35 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> Every relation where each tuple has a unique identifier or candidate key can > also be represented as a function. Can you give an example of a set of > propositions that can be modeled as relations but not as functions? {<1, 1> <1, 2> <2, 1>}
Dawn M. Wolthuis - 12 Jan 2004 10:28 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > Every relation where each tuple has a unique identifier or candidate key can > > also be represented as a function. Can you give an example of a set of > > propositions that can be modeled as relations but not as functions? > > {<1, 1> <1, 2> <2, 1>} Yeah, thanks Adrian, but I've got the mathematics part down on the difference between a function and a relation. I can graph such. But when it comes to what we once called data processing using a computer, is it feasible to have a pure relational implementation (no, I'm not confusing implementations with the model -- stick with me) using a database without actually ending up with a function? In the relational model, how do you implement this without either a) an identifier tagged on by the database which could be kept out of the model or b) a candidate key?
The function this would correspond to in the model I work with is one where either the designer would implement it with a random or sequential key, such as:
ID: 1 INFO: <1, 1>
ID: 2 INFO: <1, 2>
ID: 3 INFO: <2, 1>
Or perhaps:
ID: 1 INFO-A: 1 INFO-B: 1
ID: 2 INFO-A: 1 INFO-B: 2
ID: 3 INFO-A: 2 INFO-B: 1
You can see the function outright in the above design. Or the designer could add it without such a key, by making the entire tuple a "candidate key" by implementing it as
ID:<1,1> ID:<1,2> ID:<2,1>
In this design, the function maps each of these IDs to the null set.
Whether one chooses to reflect that it is actually a function that gets implemented in a database or opts to leave out that information and treat the function logically as a relation (which it also is) by removing any unique identifier or candidate keys from the implementation (can you really implement this in an RDBMS without any candidate keys?), another model could just as accurately reflect the fact that this is a function within the model.
I might have said that poorly -- did that make sense? Thanks a bunch -- I truly do want to understand and be corrected if I have this wrong. --dawn
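The encodings discussed in this post can be checked mechanically. The sketch below is only an illustration: the helper `is_function` and the variable names are hypothetical, a Python dict stands in for a keyed file, and an empty frozenset stands in for the "null set".

```python
# Adrian's example relation: 1 is paired with both 1 and 2.
relation = {(1, 1), (1, 2), (2, 1)}

def is_function(pairs):
    """True if no first component is paired with two different second components."""
    seen = {}
    for x, y in pairs:
        if seen.setdefault(x, y) != y:
            return False
    return True

# First design: a surrogate ID keys each tuple, so ID -> tuple is a function.
by_id = {1: (1, 1), 2: (1, 2), 3: (2, 1)}

# Last design: the whole tuple is the key and maps to the empty set.
by_whole_tuple = {t: frozenset() for t in relation}

relation_is_function = is_function(relation)    # False
by_id_is_function = is_function(by_id.items())  # True
```

Both keyed encodings recover exactly the original set of tuples, which is the sense in which they are equally expressive for this example. Whether the empty-set range carries any meaning is the modeling question the posters go on to debate.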
Adrian Kubala - 12 Jan 2004 22:31 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> You can see the function outright in the above design. Or the designer > could add it without such a key, by making the entire tuple a [quoted text clipped - 5 lines] > > In this design, the function maps each of these IDs to the null set. In that case, I will grant you that this approach is equally expressive, but I think that calling this a "function" is an abuse of the term. The range of this function, and therefore the function itself, in its function-ness, is logically meaningless -- you're actually using the domain as a way to encode a list of tuples, a relation, in an obscure way.
A modelling tool should deliver a straightforward mapping between the parts of the model and the parts of the system we're modeling. With relations it is easy to make such a mapping by treating relations as logical predicates: "so-and-so LIVES AT such-and-such-a-place". I don't see how such a mapping would work in general for functions. In our above example, you can see that the null range of the function corresponds to nothing in our actual system, it is just there to satisfy the formalism. That's a bad sign.
> Whether one chooses to reflect that it is actually a function that > gets implemented in a database or opts to leave out that information [quoted text clipped - 3 lines] > candidate keys?), another model could just as accurately reflect the > fact that this is a function within the model. So my complaint is that calling relations functions is a bad idea. I think that if you tried to translate relational formalisms into terms of equivalent "functions from tuples to the null set" the result would be more complicated, harder to understand and apply. Which is why mathematicians created the concept of relations to begin with, so I'm surprised you disagree.
Dawn M. Wolthuis - 13 Jan 2004 03:34 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > You can see the function outright in the above design. Or the designer [quoted text clipped - 37 lines] > mathematicians created the concept of relations to begin with, so I'm > surprised you disagree. I understand your point that a mapping from a bunch of "keys" to empty sets sounds like a tacky component of a model. However, NULLS in this environment can be modeled as null sets and play prominently in the model.
More importantly, I would argue that calling functions relations removes some of the specificity, making the model more abstract than is necessary. If you go back to my list of "input, output, processing, and storage" your basic components are objects and functions -- subjects and predicates -- input to a function, the processing of the function, and the output of the function, which may or may not be stored. That's the simple version for modeling a computer.
We have object languages and function languages, as well as languages that walk you through operations.
Functions are one type of object and objects are the input of and output from functions. What needs to be stored for the purpose of retrieval are objects and functions (handled as objects). From these functions & objects, one can perceive various types of functions as graphs, including trees & di-graphs. We can also apply predicate logic to the functions, which are also predicates where instances are propositions.
Introducing the mathematical concept of a relation really adds nothing to the mix that cannot be done more simply by viewing everything as an object or a function. The model I'm working with includes functions on functions that are not present in the relational operators. These include at least a) navigation by way of a link -- analogous to what is done when one navigates from a web page clicking on a link and getting to another web page (by the way, notice that web pages are all functions -- they have a URL mapping to a page) and b) unnesting and nesting of data, which TTM refers to as GROUP and UNGROUP, IIRC.
I'm still working this through, so I don't claim to have all of the answers for what is the right way to do things -- I'm working on describing one way that has been in production for more than 30 years, with many systems that are 20 years old still hanging around and moving forward. This model accounts for an exceedingly quiet multi-billion dollar industry of players who have not (yet?) succumbed to the relational model. The more I learn, the more convinced I am that they are right not to move to 1NF, at the very least, and that is only one of the advantages they currently enjoy.
Thanks for helping me make sure that I'm not illogical in the details, even if my conclusions seem off the mark given the details presented to date. --dawn
Adrian Kubala - 13 Jan 2004 07:09 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> More importantly, I would argue that calling functions relations removes > some of the specificity, making the model more abstract than is necessary. I'm starting to slightly get a hint of what you mean, and I think we have covered the logical bases and it sounds like it boils down to a religious disagreement along very similar lines to imperative versus functional programmers, so I'm not going to pursue it more. Personally I like abstraction and I find I make better and easier-for-me-to-understand models with the right abstract tools but I know not everybody feels that way. I also think people would be better off if they all agreed with me, but who doesn't feel that way? :)
Marshall Spight - 13 Jan 2004 05:30 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > You can see the function outright in the above design. Or the designer [quoted text clipped - 9 lines] > In that case, I will grant you that this approach is equally expressive, > but I think that calling this a "function" is an abuse of the term. Au contraire; it fits the definition of function perfectly: a mapping from a set of values to another set of values. In this case, the second set of values happens to be the empty set.
> The > range of this function, and therefore the function itself, in it's > function-ness, is logically meaningless -- you're actually using the > domain as a way to encode a list of tuples, a relation, in an obscure > way. But Dawn didn't pick the example; you did. If the example is wonky, why not pick a real example, and Dawn can show you how that, too, is a function, provided you limit yourself to a single key?
Nothing in the definition of function prevents the range of a function from being the empty set. (Nothing prevents the domain from being the empty set, either. Neat, eh? Nullology is the most fun part of relational theory.)
> A modelling tool should deliver a straightforward mapping between the > parts of the model and the parts of the system we're modeling. With > relations it is easy to make such a mapping by treating relations as > logical predicates: "so-and-so LIVES AT such-and-such-a-place". I don't > see how such a mapping would work in general for functions. It's the *exact* *same* *mapping.* The only difference is that you are limited to a single key.
> In our above > example, you can see that the null range of the function corresponds to > nothing in our actual system, it is just there to satisfy the formalism. > That's a bad sign. Again, you picked the example. You happened to pick an example in which every attribute was included in the key. Is this a "bad sign?" No; that's a totally fine situation to have. Consider: your example could be a many-to-many table for some other table, and each of the two columns could be foreign keys. The key would be a composite key over the two columns. Bad sign? Not at all. It's actually not even that uncommon.
Marshall
Adrian Kubala - 13 Jan 2004 07:56 GMT Marshall Spight <mspight@dnai.com> schrieb:
>> A modelling tool should deliver a straightforward mapping between the >> parts of the model and the parts of the system we're modeling. With [quoted text clipped - 4 lines] > It's the *exact* *same* *mapping.* The only difference is that you > are limited to a single key. It can't be the same mapping, as proven by example; the 3-part predicate "person LIVES AT place DURING time period" has an obvious mapping to the columns of a relation. To which parts of this predicate do the DOMAIN and RANGE of a function correspond?
Dawn M. Wolthuis - 13 Jan 2004 14:13 GMT > Marshall Spight <mspight@dnai.com> schrieb: > >> A modelling tool should deliver a straightforward mapping between the [quoted text clipped - 10 lines] > columns of a relation. To which parts of this predicate do the DOMAIN > and RANGE of a function correspond? A predicate would look like this:
person IDENTIFIED BY id LIVES AT place DURING time period
I'll name the function using the plural rather than singular (for reasons I won't go into and of course there are differences of opinion on this) and the function is
PEOPLE(id) = { (place, time period) }
Marshall noted that this requires a single value as a candidate key. Functions can, however, be from a set that includes tuples as well, so even though it is correct that it is a single value for the key, that value could be a tuple.
With this model there are fewer tables; in particular, relationship tables (as in ERD relationships) are typically lists (nested tables) within an entity file. When it does make sense to have a separate relationship table, the domain would be a set of tuples from the domains of the entities:
STUDENT-SCHEDULE(student,term) = { (courses, days of week, start time, end time)* }
using the XML notation of * to indicate a repeating group, i.e. nested table.
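A minimal sketch of the tuple-keyed function above, written in Python. The student ids, term code, and course rows are made-up illustrations, not data from this thread, and a dictionary is only one possible way to picture the idea:

```python
# Hypothetical data: every id, term, and course below is illustrative only.
# The key is a (student, term) tuple; the value is a nested table (list of rows).
student_schedule = {
    ("s001", "2004SP"): [
        ("MATH101", "MWF", "09:00", "09:50"),
        ("CS150", "TR", "13:00", "14:15"),
    ],
    ("s002", "2004SP"): [],  # registered for the term, no courses yet
}


def courses_for(student, term):
    """Apply the function: one composite key in, course names from the nested table out."""
    rows = student_schedule.get((student, term), [])
    return [course for (course, _days, _start, _end) in rows]
```

The key is still a single value, as noted above; that value just happens to be a pair.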
Thank you gentlemen, I think I have the information I need on this one. Cheers! --dawn
Adrian Kubala - 13 Jan 2004 20:25 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
>> It can't be the same mapping, as proven by example; the 3-part predicate >> "person LIVES AT place DURING time period" has an obvious mapping to the [quoted text clipped - 10 lines] > > PEOPLE(id) = { (place, time period) } That's wrong because one person can live at many different places at different times. So you need to map each person to a list of tuples -- in which case you have hidden the information necessary to make queries like "who has ever lived at this place" -- or you need to use a null range, in which case the range of the function is modelling nothing and is useless baggage.
The point is that in most interesting relations there is no obvious way to split the predicate into TWO parts where one is more important than the other; you have N parts which are all equally-important.
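To make the two shapes being compared concrete, here is a small Python sketch. The people, places, and periods are invented for illustration. It stores the same facts once as a flat set of triples and once as a function keyed on person, then asks "who has ever lived at this place" of both:

```python
# Hypothetical facts for the predicate "person LIVES AT place DURING period".
lives_at = {
    ("Hope", "Ames", "1999-2001"),
    ("Hope", "Boston", "2002-2003"),
    ("Shanna", "Ames", "2000-2003"),
}


def places_of(person):
    """Query by person against the flat relation."""
    return {place for (p, place, _period) in lives_at if p == person}


def residents_of(place):
    """Query by place against the flat relation: same shape as places_of."""
    return {p for (p, pl, _period) in lives_at if pl == place}


# The same facts as a function keyed on person, with a nested list as the value.
by_person = {}
for person, place, period in lives_at:
    by_person.setdefault(person, []).append((place, period))


def residents_of_nested(place):
    """Query by place against the nested form: must look inside every list."""
    return {
        person
        for person, rows in by_person.items()
        if any(pl == place for (pl, _period) in rows)
    }
```

In this sketch both forms return the same answer; what differs is that the flat form treats every attribute alike, while the nested form privileges the key and makes other access paths reach into the lists.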
I still haven't heard (or I have overlooked) your explanation of why you think that whatever rationale led mathematicians to invent relations as a separate concept from functions does not apply just as well to databases. They're too abstract?
Dawn M. Wolthuis - 14 Jan 2004 00:03 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > >> It can't be the same mapping, as proven by example; the 3-part predicate [quoted text clipped - 18 lines] > null range, in which case the range of the function is modelling nothing > and is useless baggage. I didn't take into account an entire specification. A typical implementation of one person living at different places, not taking into account past addresses might be modeled as:
PEOPLE(id) = { (place, start date, end date)* }
where date is month/day only if it is an annual date and is month/day/year otherwise, or as
PEOPLE(id) = { (place, start date, end date, annual start, annual end)* }
No matter how you cut it, you are mapping something to something else.
> The point is that in most interesting relations there is no obvious way > to split the predicate into TWO parts where one is more important than [quoted text clipped - 4 lines] > a separate concept from functions does not apply just as well to > databases? They're too abstract? It is not a matter of being too abstract but the fact that with computers we are just talking about applying functions (processing) to objects. If we want to go into depth on objects that are relations and are not functions for some particular application, it is fine to define an object that is a relation, but I have never seen any need for that with retrievable stored data since by nature of it being accessible, it almost always (can you think of cases where this is not true?) can be easily represented as a function operating on a key or reference id of some sort to retrieve the data.
There are many types of propositions that are difficult to encode in an RDBMS and easy to encode in implementations of the Nelson-Pick model. These include, but are not limited to, multivalued attributes. How often do those implementing a relational model have to think hard about whether to pull out an attribute into a separate table because it is possible or likely that in the future one might want to have more than one of 'these' for this entity? Even today, with cell phones having been out for over a decade, I've had someone tell me on the phone that their system permitted one phone number for home and one for business, but didn't have a spot for an additional phone number. This is highly unlikely in a system where cardinality of attribute values can be changed with the blink of an eye from 1 to a variable (not fixed number) "many".
So, where I have seen no instances that cannot be modeled equally well with functions as with relations, I have seen many times when a flexible function is a much better way to model the propositions.
Take some example propositions:
Hope has a cat named Geneva and a dog named Rugby. Shanna has no pets, but did have a dog named Monte who died in 2002.
Given only these statements, I might immediately come up with something like this:
a function named PEOPLE and the assignment of an arbitrary (or sequentially assigned) id for each person:
PEOPLE("12345") = { "Hope", { ("cat", "Geneva", NULL), ("dog", "Rugby", NULL) } }
PEOPLE("12346") = { "Shanna", { ("dog", "Monte", "2002") } }
I've modeled this with a single function and I think I can even leave out telling you the metadata (although I wouldn't typically) because this so closely models the way we speak and think about these propositions. Now take these same propositions and model them with a relational model. This would require at least two relations, and the act of splitting up these propositions into separate propositions adds to the complexity for both the developer and, unfortunately, typically also for the end-user trying to do queries, without any noticeable gains.
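For readers who want to see the two shapes side by side, the following Python sketch holds the PEOPLE function above as a dictionary and derives two flat relations from it. The relation names `person` and `pet`, the helper `dog_owners`, and the use of `None` for NULL are assumptions made for this illustration:

```python
# Nested form, following the PEOPLE function above (None stands in for NULL).
people = {
    "12345": ("Hope", [("cat", "Geneva", None), ("dog", "Rugby", None)]),
    "12346": ("Shanna", [("dog", "Monte", "2002")]),
}

# The same facts split into two flat relations (names are hypothetical).
person = {(pid, name) for pid, (name, _pets) in people.items()}
pet = {
    (pid, species, pet_name, died)
    for pid, (_name, pets) in people.items()
    for (species, pet_name, died) in pets
}


def dog_owners():
    """Join person and pet on the id to find everyone who has or had a dog."""
    return {
        name
        for (pid, name) in person
        for (owner, species, _pet_name, _died) in pet
        if owner == pid and species == "dog"
    }
```

Nothing here settles which form is preferable; it only shows that the nested form needs one structure where the flat form needs two plus a join.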
Or am I wrong? Thanks. --dawn
Dawn M. Wolthuis - 14 Jan 2004 04:06 GMT Goodness Gracious -- way too many typos in that previous post of mine since I did it hastily and didn't proof it before sending. My apologies, but I trust that you can make your way through it. In case you prefer to read a cleaner version, I've cleaned it up below. --dawn
> > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > >> It can't be the same mapping, as proven by example; the 3-part [quoted text clipped - 22 lines] > > null range, in which case the range of the function is modelling nothing > > and is useless baggage. The following is "cleaned up"
> I didn't take into account an entire specification. A typical > implementation of one person living at different places, not taking into [quoted text clipped - 17 lines] > > It is not a matter of being too abstract but the fact that with computers we
> are just talking about applying functions (processing) to objects. If we > want to go into depth on objects that are relations and are not functions, [quoted text clipped - 7 lines] > an RDBMS and easy to encode in implementations of the Nelson-Pick model. > These include, but are not limited to, multivalued attributes. How often do
> those implementing a relational model have to think hard about whether to > pull out an attribute into a separate table because it is possible or likely that in the future
> one might want to have more than one of 'these' attributes for this entity.
> Even > today, with cell phones having been out for over a decade, I've had someone > tell me on the phone that their system permitted one phone number for home > and one for business, but didn't have a spot for an additional phone number. > This is highly unlikely in a system where cardinality of attribute values > can be changed with the blink of an eye from 1 to a variable number "many" (not fixed
> number). > [quoted text clipped - 19 lines] > telling you the metadata (although I wouldn't typically) because this so > closely models the way we speak and think about these propositions.
> Now > take these same propositions and model them with a relational model. This > would require at least two relations. The act of splitting up these > propositions each into two separate propositions adds to the complexity for both
> the developer and, unfortunately, typically for the end-user trying to do > queries. This complexity is introduced without sufficient gains, in my opinion.
> Or am I wrong? Thanks. --dawn
Adrian Kubala - 14 Jan 2004 06:53 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> So, where I have seen no instances that cannot be modeled equally well > with functions as with relations, I have seen many times when a [quoted text clipped - 13 lines] > NULL)} } > PEOPLE("12346") = ( "Shanna", { ("dog", "Monte", "2002") } } This is not modeling, because all you've done is associate some lists with each person, without any formal way to reason about what the lists MEAN. I could just as well "model" the first proposition as:
CATS("Geneva") = { {("owner", "Hope")} } etc.
or even
PEOPLE("12345") = { "has a cat named Geneva and a dog named Rugby" }
Any extra flexibility you get is by delegating more of the semantics/interpretation to the clients of the database.
Dawn M. Wolthuis - 14 Jan 2004 17:42 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > So, where I have seen no instances that cannot be modeled equally well [quoted text clipped - 18 lines] > with each person, without any formal way to reason about what the lists > MEAN. I could just as well "model" the first proposition as: Are you saying it is not modeling because I did not show the logical steps I took to arrive at this or is it that modeling with a function is necessarily not a model or what? I didn't search for a UML to text conversion utility ;-) but I could show this as an object that is a function, has methods, etc. What would it take for this to be accepted as a model? --dawn
> CATS("Geneva") = { {("owner", "Hope")} } etc. > or even > PEOPLE("12345") = { "has a cat named Geneva and a dog named Rugby" } > > Any extra flexibility you get is by delegating more of the > semantics/interpretation to the clients of the database.
Adrian Kubala - 14 Jan 2004 23:36 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
>> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: >> > Hope has a cat named Geneva and a dog named Rugby. [quoted text clipped - 14 lines] > steps I took to arrive at this or is it that modeling with a function > is necessarily not a model or what? You are not modeling with functions, you are modeling with lists. I can tell because there was no mention of lists in the original proposition, and lists are not required to describe functions, but nevertheless lists have snuck into your model. On the other hand, the above would be a perfectly good function-based model of "Person 12345 has the list (hope, (cat, geneva, null), (dog, rugby, null))".
I say it's not modeling because I don't believe you have a general theory for how propositions (in this case) can be mapped to and from lists in a general way, with a useful algebra on lists which preserves the truth values of propositions. It is not enough to provide a post-hoc rationalization for why you chose these particular lists for these particular examples. But if you do have such a theory I am excited to hear it.
Dawn M. Wolthuis - 15 Jan 2004 00:14 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > >> Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: [quoted text clipped - 22 lines] > perfectly good function-based model of "Person 12345 has the list (hope, > (cat, geneva, null), (dog, rugby, null))". My function is named PERSON and it maps the string "12345" to a set of strings or string tuples. I can use more precision in the future to ensure that you can see that this function provides a model for a plausible implementation.
> I say it's not modeling because I don't believe you have a general > theory for how propositions (in this case) can be mapped to and from [quoted text clipped - 3 lines] > particular examples. But if you do have such a theory I am excited to > hear it. That is one of the things I have been working on, starting with studying how the many developers who have worked with this model since 1965 have "post-hoc" been doing this and pulling out common, repeatable processes that are used. Although there is no written document on how to do this (to my knowledge) there also was no such specification for "secretaries" who set up filing systems in days before computers -- and yet the job gets done and the solutions seem to be quite flexible in meeting the needs of a company often for many years (many instances of > 20 years of such databases). It does seem that experience makes a difference in how well the developer (or secretary) has been able to set up such solutions. I'm trying to capture the knowledge of expert developers who use the Nelson-Pick model as I expect such information will also help with the development of XML documents (which use a model that is very similar).
--dawn
Adrian Kubala - 15 Jan 2004 01:41 GMT Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb:
> "Adrian Kubala" <adrian@sixfingeredman.net> wrote in message >> You are not modeling with functions, you are modeling with lists. I can [quoted text clipped - 8 lines] > that you can see that this function provides a model for a plausible > implementation. I must have failed to communicate my point clearly, because I'm not contesting whether the function maps a string to a set, but whether it accurately models the predicate it claims to.
>> I say it's not modeling because I don't believe you have a general >> theory for how propositions (in this case) can be mapped to and from [quoted text clipped - 8 lines] > "post-hoc" been doing this and pulling out common, repeatable processes that > are used. I understand this kind of experience, and its lack is not what I'm complaining about. If you are programming in OO, it takes experience to decide which objects to include in your model. If you are creating a relational database, it takes experience to decide which predicates to include. In both these systems, once you have done that, there already exists a theory telling you what it MEANS to be an object or predicate, and how you can reason about them in a formal way.
> Although there is no written document on how to do this (to my > knowledge) there also was no such specification for "secretaries" who > set up filing systems in days before computers -- and yet the job gets > done and the solutions seem to be quite flexible in meeting the needs > of a company often for many years (many instances of > 20 years of > such databases). Just because something gets the job done doesn't mean it's a good model... the Babylonians, for example, could predict solar eclipses accurately but they did not understand that the earth orbits the sun and the moon the earth.
Bob Badour - 15 Jan 2004 02:09 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > So, where I have seen no instances that cannot be modeled equally well [quoted text clipped - 25 lines] > Any extra flexibility you get is by delegating more of the > semantics/interpretation to the clients of the database. Since the "lists" are unnamed, one wonders how one does anything with them. I see no flexibility in Dawn's idiocy. Pick lacks flexibility. It is a chained straightjacket with an anchor attached.
Dawn M. Wolthuis - 15 Jan 2004 02:24 GMT > > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > > So, where I have seen no instances that cannot be modeled equally well [quoted text clipped - 31 lines] > I see no flexibility in Dawn's idiocy. Pick lacks flexibility. It is a > chained straightjacket with an anchor attached. You might take note that I stated that I was leaving out the metadata, although I don't usually do that. The attributes are also unnamed in my message, but not in a full-blown model, just like the attributes that are associated with each other are also named as a tuple in the actual model.
Just because you see no flexibility and continue to let me know regularly that I am an idiot (even though your statements likely tell readers more about you than about me -- you might consider seeing a shrink to find out why you feel a need to belittle me and others you have never met, in this fashion), I see no logic in your counterargument here into which I can sink my teeth. I do gather, however, from your comments that you have some experience with the Nelson-Pick model or an actual PICK implementation. If so, please let me know what in your experience has shown inflexibility and if not, please let me know what information you have that leads you to believe that PICK is a "straightjacket"....
Thanks. --dawn
Bob Badour - 12 Jan 2004 22:07 GMT > Dawn M. Wolthuis <dwolt@tincat-group.com> schrieb: > > [quoted text clipped - 29 lines] > express but any relational model can, whereas any function that your > model can express, any relational model can as well.) Even if it could express them, it expresses them with greater complexity and less facility.
Bob Badour - 28 Dec 2003 09:08 GMT > >>>"Joe "Nuke Me Xemu" Foster" <joe@bftsi0.UUCP> wrote: > >><snip> [quoted text clipped - 80 lines] > neither irrelevant nor wrong in the system about which the comments > were made. Mathematical relations do not rely on attribute order. The physical representation of mathematical relations using written symbols on planar surfaces conventionally uses attribute order for succinctness.
Marshall Spight - 28 Dec 2003 22:54 GMT > Mathematical relations do not rely on attribute order. The physical > representation of mathematical relations using written symbols on planar > surfaces conventionally uses attribute order for succinctness. This doesn't seem correct to me.
Consider the union of two sets of ordered pairs:
{ (1,2) } union { (2,1) }
The union of those sets is a set with two elements, yes? How are we distinguishing between (1,2) and (2,1) if not by order? Is there some hidden information?
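The question can be tried out directly. In this Python sketch the positional pairs are ordinary tuples, and the by-name version represents each tuple as a set of (attribute, value) pairs; the attribute names `x` and `y` are assumptions introduced only for the illustration:

```python
# Positional: Python tuples are ordered, so (1, 2) and (2, 1) are distinct.
positional = {(1, 2)} | {(2, 1)}

# By name: each tuple is an unordered set of (attribute, value) pairs.
t1 = frozenset({("x", 1), ("y", 2)})
t2 = frozenset({("y", 2), ("x", 1)})  # same pairs, listed in a different order
t3 = frozenset({("x", 2), ("y", 1)})  # different values for the same attributes

named = {t1} | {t2} | {t3}
```

Here `positional` has two elements, `t1` and `t2` compare equal, and `named` also has two elements: the distinguishing information is carried by position in one case and by attribute name in the other.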
Marshall
Joe \ - 28 Dec 2003 23:35 GMT > > Mathematical relations do not rely on attribute order. The physical > > representation of mathematical relations using written symbols on planar [quoted text clipped - 11 lines] > are we distinguishing between (1,2) and (2,1) if not by order? > Is there some hidden information? What if we assume these n-tuples aren't really ordered, but instead, the attribute names just so happen to be successive ordinal numbers, which we just so happen to omit for brevity in Usenet postings? >=)
-- Joe Foster <mailto:jlfoster%40znet.com> DC8s in Spaace: <http://www.xenu.net/> WARNING: I cannot be held responsible for the above because my cats have apparently learned to type. They're coming to take me away, ha ha!
Bob Badour - 29 Dec 2003 05:25 GMT > > Mathematical relations do not rely on attribute order. The physical > > representation of mathematical relations using written symbols on planar [quoted text clipped - 10 lines] > The union of those sets is a set with two elements, yes? How > are we distinguishing between (1,2) and (2,1) if not by order? The order is implicit in the physical representation.
> Is there some hidden information? Yes. The semantics of the pairs. Why is the first one { (1,2) } and not { (2,1) } ?
The order in which we choose to list the dimensions is entirely arbitrary.
Joe \ - 27 Dec 2003 06:02 GMT > > A tuple is a set of ordered pairs of the form (attribute, value). > > Defining an ordering for the attributes would be superfluous. [quoted text clipped - 11 lines] > point me to a MATHEMATICS definition that allows for relations that are not > ordered. I'm not finding any such definitions. Yes, a "relation" built from ordered n-tuples would be quite broken! Consider an ordered 4-tuple, (a, b, c, d), or, { {a}, {a, b}, {a, b, c}, {a, b, c, d} }. What would happen if any two attributes were *equal*? A set of ordered pairs, OTOH, wouldn't lose information if any or all of its ordered pairs collapsed to sets containing just a single element. Even if {{{a}, {a, a0}}, {{b}, {b, b0}}, {{c}, {c, c0}}, {{d}, {d, d0}}} collapses to just { {a}, {b}, {c}, {d} } you could still figure out what was what. It's a good thing relational tuples aren't ordered n-tuples!
-- Joe Foster <mailto:jlfoster%40znet.com> Sign the Check! <http://www.xenu.net/> WARNING: I cannot be held responsible for the above because my cats have apparently learned to type. They're coming to take me away, ha ha!
Marshall Spight - 27 Dec 2003 15:32 GMT > I've been puzzled by this for quite a while, just figuring that relational > theorists have this wrong. But the writings seem so sure of this. I had [quoted text clipped - 4 lines] > database theory, indicates what I thought about relations -- a relation is a > set of ordered tuples -- right? What am I missing here? I'll give you my perspective, but I don't know how much it'll help.
The question is, how does one distinguish the attributes in the relation? There are two choices: numerically/positionally, or by name. That is, one either has a mapping from 1, 2, ... n to attribute, or one has a mapping from name1, name2, ... namen to attribute. To me, it's not all that great a difference. It's just a question of what the application is.
If one is writing a page of equations, the convenience of using positional identification is high. One is likely working with a single relation at a time, and it is relatively simple to keep the ordering straight in the author's and the reader's head.
If one is part of a team building an enormous software system, then by-name is the better way to go, because of the mnemonic value of the names. There are likely a lot of relations with a lot of attributes.
It doesn't affect the semantics of relations or relational operators; it just affects how attributes are identified.
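As a small illustration of the two ways of identifying attributes, Python's `collections.namedtuple` allows both on the same value. The type name `Order` and its fields are made up for this sketch:

```python
from collections import namedtuple

# Hypothetical row type: the names below are illustrative only.
Order = namedtuple("Order", ["customer", "discount"])

row = Order(customer="acme", discount=0.15)

by_position = row[1]     # positional identification: "the second attribute"
by_name = row.discount   # by-name identification: "the attribute called discount"
```

Both expressions reach the same value; the difference is only in how the attribute is picked out.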
Marshall
Dawn M. Wolthuis - 27 Dec 2003 16:21 GMT <snip>
> It doesn't affect the semantics of relations or relational operators; > it just affects how attributes are identified. > > Marshall OK. Then a model in which there are relations where the tuples are ordered but where values can be retrieved either by name or by position would be different from a relational model where they are not accessible by position. Then Date's point is not that the MultiValue model is not made up of mathematical relations, but that it is a different model based on relations than what Codd and company call the relational model.
Then we are in agreement. Now I just need words to make it clear that a particular model is based on mathematical relations (ORDERED tuples) even though it is not based on all of the other rules imposed within a theory that names itself the "relational theory" (even though it is based on not perceiving the relation's tuples as ordered).
The developers of MultiValue databases (such as IBM) promote them as relational databases and they are mathematically correct. These databases stem from the Nelson-Pick data model, however, and not from the work of Codd. If any company is permitted to decide what is relational and what is not, it would be IBM, I would think, so I'll figure they have as much a right as Date does to define what is and is not relational.
So, it seems to me we have a problem with our vocabulary. One group of database theorists has taken a mathematical theory -- that of relations -- and named its theory after it, even though they do not stick to the mathematical definition and they extend the mathematics with many other rules. They then tell people working with other models that are equally mathematically relational that they do not conform to relational database theory.
To try to straighten this out, I have referred to "the relational database theory" instead as the RDBMS theory or the SQL database theory, but because those are both implementation-based terms they do not sit well with "relational database theorists". Is there a term for the "relational database theory" that we (at least I) could use to indicate that it is the relational database theory that does not include all databases that conform to the mathematical theory of relations?
I want to be able to agree with IBM that both DB2 and U2 are relational databases (since I DO agree with them) and also agree with Date that the U2 databases are not based on his version of a "relational database theory". Thanks in advance for any help with this vocabulary issue. --dawn
Bob Badour - 27 Dec 2003 22:39 GMT > > I've been puzzled by this for quite a while, just figuring that relational > > theorists have this wrong. But the writings seem so sure of this. I had [quoted text clipped - 13 lines] > not all that great a difference. It's just a question of what the application > is. Physical dependence vs. physical independence is not that big a difference?!?
> If one is writing a page of equations, the convenience of using > positional identification is high. You are confusing an external physical representation with a logical representation.
> It doesn't affect the semantics of relations or relational operators; > it just affects how attributes are identified. Huh? Of course it affects the semantics if positional ordering has meaning!
Dawn M. Wolthuis - 27 Dec 2003 22:46 GMT Do you agree, Bob, that relational database theory seems to require constructs that are NOT mathematical relations (because they have no logical ordering among attributes)?
Do you also agree that some of the data models that are not considered by relational theorists to be relational are, in this way, actually MORE relational in that they do require logical constructs (ordered tuples) that correspond to mathematical relations?
This is one of several issues I'm trying to square away or at least get a vocabulary that is not so contradictory. Thanks --dawn
Lauri Pietarinen - 28 Dec 2003 07:56 GMT > Do you agree, Bob, that relational database theory seems to require > constructs that are NOT mathematical relations (because they have no logical [quoted text clipped - 7 lines] > This is one of several issues I'm trying to square away or at least get a > vocabulary that is not so contradictory. Thanks --dawn To clear this very issue Date & Darwen define a relation in a relational database as a set of sets, each set consisting of a set triplets. For example, we could have the relation value
Person = { { (person_id, integer, 1), (person_name, string, 'Jill') }, { (person_id, integer, 2), (person_name, string, 'Joe') } }
(the second component is there to allow for subtyping)
So, I guess, strictly speaking, you could argue that this is not really a relation. However, the distinction is superficial, since there is a simple mapping from this structure to a mathematical relation by just mapping each attribute name to an ordinal number, e.g.
Person_mapping = {(person_id, 1), (person_name, 2)}
and simply obtain the "true" mathematical relation
Person_math = { (1, 'Jill'), (2, 'Joe') }
or, if you want to include type information,
Person_math = { (1, integer, 'Jill', string), (2, integer, 'Joe', string) }
Given a certain ordering there is a trivial 1:1 mapping from the value of Person to the value of Person_math.
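The mapping can be sketched in a few lines of Python. This is an illustration under assumptions: the type component of each triplet is left out, tuples are modeled as frozensets of (name, value) pairs, and the helper names `to_math` and `from_math` are invented here:

```python
# Relation with named attributes (type component omitted for brevity).
person = {
    frozenset({("person_id", 1), ("person_name", "Jill")}),
    frozenset({("person_id", 2), ("person_name", "Joe")}),
}

# The chosen ordering: attribute name -> ordinal position.
person_mapping = {"person_id": 1, "person_name": 2}


def to_math(relation, mapping):
    """Turn each named tuple into an ordered tuple using the given ordering."""
    order = sorted(mapping, key=mapping.get)
    return {tuple(dict(t)[name] for name in order) for t in relation}


def from_math(math_relation, mapping):
    """Inverse direction: reattach attribute names by position."""
    order = sorted(mapping, key=mapping.get)
    return {frozenset(zip(order, row)) for row in math_relation}


person_math = to_math(person, person_mapping)
```

Choosing a different ordering yields a different set of ordered tuples, but converting back with the same ordering always returns the original named relation, which is why the ordinal position can be treated as immaterial.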
Since the ordinal position is immaterial for the user (or it should be) we can assume any order of attributes, and, hence, just forget about the whole issue.
The conversion from the "Codd" or "D&D" "relation" to the mathematical relation is (logically) done by the system. Note, that all relational operators (UNION, PROJECTION, RESTRICTION, etc..) are expressed very simply even with these "relational database relations" and the mapping to the corresponding mathematical operations is trivial.
The reasons (in my view) why Codd got rid of ordinal positions in his definition of a relation in a relational database were
1) so as not to burden the user with ordinal positions (e.g. column number 87, instead of, say, column named 'discount')
2) to not imply that the system actually stores the values in a certain order (=the system is free to map the columns to physical storage as it wishes)
However, there is no easy mapping between SQL-tables and mathematical relations. Take the SQL-table
SQL-Person
person_id  person_name
---------  -----------
1          Joe
1          Joe
In order to get a mathematical relation out of this we have to number the duplicates, or include some hidden values in the original table. Whichever way this is done, it complicates the mapping.
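A short Python sketch of the duplicate-row problem, using the two-row table above. Treating the table as a list of rows, and numbering duplicates with a `Counter`, are choices made for this illustration rather than anything prescribed in the post:

```python
from collections import Counter

# The SQL table as a list of rows: duplicates are allowed.
sql_person = [(1, "Joe"), (1, "Joe")]

# As a set (a mathematical relation) the duplicate disappears.
as_relation = set(sql_person)

# As a bag (multiset) the multiplicity is kept.
as_bag = Counter(sql_person)

# One way to recover a relation without losing information: number the duplicates.
numbered = {
    (row, n)
    for row, count in as_bag.items()
    for n in range(1, count + 1)
}
```

The set has one element while the table has two rows, so some extra value such as the duplicate number has to be introduced before the table can be read as a relation.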
best regards, Lauri Pietarinen
Bob Badour - 28 Dec 2003 09:04 GMT > > Do you agree, Bob, that relational database theory seems to require > > constructs that are NOT mathematical relations (because they have no logical [quoted text clipped - 7 lines] > > This is one of several issues I'm trying to square away or at least get a > > vocabulary that is not so contradictory. Thanks --dawn Apparently, Dawn does not realise she has been in my twit-filter for months now.
For anyone who doesn't know, ignore Dawn--she's an idiot. Mathematical relations do not have any particular attribute order. The physical representation of mathematical relations using written symbols on planar surfaces relies on physical order for succinctness.
Dawn M. Wolthuis - 28 Dec 2003 14:53 GMT <snip> For anyone who doesn't know, ignore Dawn--she's an idiot. Mathematical
> relations do not have any particular attribute order. The physical > representation of mathematical relations using written symbols on planar > surfaces relies on physical order for succinctness. Your opinions about relations are welcome, but please refrain from telling me again what an idiot I am as I know your opinion already. If you want others to join you in thinking I'm an idiot, use logic to indicate the error of my thinking rather than telling people they ought to ignore me. Thankfully, I can summon up the self-esteem to continue discussing in this forum. I wonder how many people you have bullied or offended enough to leave OR decided this list was not about rational thinking, but rather personal attacks.
As for mathematical relations, they ARE sets of ORDERED TUPLES, whether on paper or in concept. The ordering is important for mapping to the domains, however, it is the case as one person pointed out that with database relations as laid out by Codd, there is a simple mapping from those relations to mathematical relations. There is still a vocabulary problem when working with database models that did not stem from Codd and that DO have actual mathematical relations in their model. "Relational Database" is not a useful designation when both DB2 and U2 have this applied to them. They are based on two very different models.
As someone once said (googling it indicates it was George Box?) "All models are flawed, but some are useful". The "relational database model" has pros and cons. The Nelson-Pick model has pros and cons. The XML model (very similar to the Nelson-Pick model) has pros and cons. One of the advantages of the relational model is that it has a wealth of documentation spelling it out as a logical theory. Almost everything written about the Nelson-Pick model is directed to the implementations (PICK) rather than the abstracted logical model. I'm attempting to provide more of an abstracted model so that the pros and cons of the model and not just the implementations can be discussed. I am guessing that most on this list are big fans of the relational database model, while I prefer other models. I think that is what leads Bob to his false conclusion.
--dawn
Joe \ - 28 Dec 2003 18:30 GMT > For anyone who doesn't know, ignore Dawn--she's an idiot. Mathematical > relations do not have any particular attribute order. The physical > representation of mathematical relations using written symbols on planar > surfaces relies on physical order for succinctness. Heh. If she thinks the relational tuple vs. ordered n-tuple difference is insurmountable, she ought to try getting a mathematician and a theoretical physicist to communicate. They use different definitions for so many of the same terms that they'd have to invent an entirely new language to talk about much of anything besides last night's game!
-- Joe Foster <mailto:jlfoster%40znet.com> Sacrament R2-45 <http://www.xenu.net/> WARNING: I cannot be held responsible for the above because my cats have apparently learned to type. They're coming to take me away, ha ha!
Dawn M. Wolthuis - 28 Dec 2003 19:06 GMT <snip>
> Heh. If she thinks the relational tuple vs. ordered n-tuple difference > is insurmountable, she ought to try getting a mathematician and a [quoted text clipped - 6 lines] > WARNING: I cannot be held responsible for the above because my cats > have apparently learned to type. They're coming to take me away, ha ha! Nope -- nothing insurmountable about it. It is a matter of language. I have been told that a particular model does not include relations because the tuples are ordered. I am also told that the relational model is based on mathematical relations. However, the model I'm trying to describe is based on mathematical relations and is not, by my calculations, at all based on the relational model as understood by Codd and company.
So, I need some new language in order to communicate this. This has nothing to do with insurmountable differences, but I am hopeful someone can help me come up with a way to state this that is fair to both models (doesn't make the Nelson-Pick model sound holier just because it is based on mathematical relations, nor the Codd model sound better because it is based on a definition of relations that it created which has become the database industry standard language). Are you able to understand the question I'm raising?
Once I understood why relational theorists think that mathematical relations are not relations, I was able to narrow this down to an issue of vocabulary. Would it sit OK with relational theorists if I refer to their def of relation as "unordered relations" or "Codd relations"? I don't want to call them database relations because I'll be talking about databases that are using mathematical (ordered) tuples as well. I'm sure I can make something up, but I don't want the language to obscure the information.
--dawn
Marshall Spight - 28 Dec 2003 22:30 GMT > Nope -- nothing insurmountable about it. It is a matter of language. I am sympathetic to your cause, but I am not optimistic about its chances for success. Still, nothing to lose, eh?
> Would it sit OK with relational theorists if I refer to their def of > relation as "unordered relations" or "Codd relations"? I don't want to call > them database relations because I'll be talking about databases that are > using mathematical (ordered) tuples as well. I'm sure I can make something > up, but I don't want the language to obscure the information. Since the current issue under discussion is how attributes are logically identified, something more like "relations with named attributes" might be useful. Probably that's too long, and "Codd relations" will have to do, though.
Maybe it is useful to consider this from the standpoint of the tuple?
Marshall
Bob Badour - 29 Dec 2003 05:12 GMT > > Nope -- nothing insurmountable about it. It is a matter of language. > [quoted text clipped - 13 lines] > > Maybe it is useful to consider this from the standpoint of the tuple? Dawn is an idiot. I suggest you ignore her.
Relations are relations. She is talking about physical representations of relations and not about relations themselves. She is too stupid to understand the difference between a thing and its picture.
Paul G. Brown - 28 Dec 2003 20:30 GMT > For anyone who doesn't know, ignore Dawn--she's an idiot. Mathematical > relations do not have any particular attribute order. The physical > representation of mathematical relations using written symbols on planar > surfaces relies on physical order for succinctness. Not quite right, Bob.
Mathematical relations *do* have an attribute order[1] (or else the term 'mathematical relation' is used in another context entirely: to refer to relationships between maps [can't find a cite]). One of the ways in which Codd's relational model distinguishes itself is that it names relation attributes and thereby does away with the need for ordering. Some interpretations of the relational model retain the attribute order property (Datalog, for example[2]) or require the use of an index offset as an attribute identifier[3].
Mind you, multi-value data management systems don't comply even in spirit with any of these models.
See:
[1] http://en.wikipedia.org/wiki/Mathematical_relation
[2] http://www.cs.buffalo.edu/~chomicki/635/datalog-h.pdf
[3] Abiteboul et al. _Foundations_of_Databases_ Addison-Wesley Publishing Company. 1995. (Specifically comments on the 'named' versus 'unnamed' perspectives in Section 3.2)
Joe \ - 27 Dec 2003 22:58 GMT > > It doesn't affect the semantics of relations or relational operators; > > it just affects how attributes are identified. > > Huh? Of course it affects the semantics if positional ordering has meaning! SELECT * really ought to rearrange the attributes each time it's used! >=) However, alignment can still matter in modern system architectures. Intel CPUs might still be happiest when floating point values are 8-byte-aligned, but this is an implementation, not a logical, detail.
-- Joe Foster <mailto:jlfoster%40znet.com> DC8s in Spaace: <http://www.xenu.net/> WARNING: I cannot be held responsible for the above because my cats have apparently learned to type. They're coming to take me away, ha ha!
Marshall Spight - 28 Dec 2003 22:24 GMT > > The question is, how does one distinguish the attributes in the relation? > > There are two choices: numerically/positionally, or by name. That is, [quoted text clipped - 5 lines] > Physical dependence vs. physical independence is not that big a > difference?!? I was speaking of logically distinguishing attributes. I don't see how the physical level is even relevant here.
> > If one is writing a page of equations, the convenience of using > > positional identification is high. > > You are confusing an external physical representation with a logical > representation. Interesting distinction, but not one that I can follow without further information. Do you have a reference for further reading?
> > It doesn't affect the semantics of relations or relational operators; > > it just affects how attributes are identified. > > Huh? Of course it affects the semantics if positional ordering has meaning! Mumble. Operations like union, intersection, difference, are identical either way. Join needs some work, but it's not what I'd call a huge issue.
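To make the "join needs some work" remark concrete, here is a rough Python sketch, not anything from the thread itself. The function names (`natural_join`, `positional_join`) and the employee/department sample data are hypothetical. With named attributes the join condition falls out of the shared names; with positional attributes the caller has to state which positions to equate.

```python
def natural_join(r, s):
    """Named attributes: join on whichever attribute names both sides share."""
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out


def positional_join(r, s, pairs):
    """Positional attributes: `pairs` lists (left_index, right_index) to equate."""
    drop = {j for _, j in pairs}  # right-hand positions that duplicate left ones
    out = []
    for t in r:
        for u in s:
            if all(t[i] == u[j] for i, j in pairs):
                out.append(t + tuple(v for k, v in enumerate(u) if k not in drop))
    return out


# Hypothetical sample data, the same facts in both styles.
emp_named = [{"name": "ann", "dept": 10}]
dept_named = [{"dept": 10, "city": "Oslo"}]
emp_pos = [("ann", 10)]
dept_pos = [(10, "Oslo")]

named_result = natural_join(emp_named, dept_named)
pos_result = positional_join(emp_pos, dept_pos, pairs=[(1, 0)])
```

Both calls produce the same joined fact; the only difference is that the positional version needs the extra `pairs` argument to say which columns line up.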
Marshall
Bob Badour - 29 Dec 2003 05:11 GMT > > > The question is, how does one distinguish the attributes in the relation? > > > There are two choices: numerically/positionally, or by name. That is, [quoted text clipped - 8 lines] > I was speaking of logically distinguishing attributes. I don't > see how the physical level is even relevant here. You don't see how logically distinguishing attributes by physical position violates physical independence and confuses logical and physical issues?!?
> > > If one is writing a page of equations, the convenience of using > > > positional identification is high. [quoted text clipped - 4 lines] > Interesting distinction, but not one that I can follow without further > information. Do you have a reference for further reading? Um, everything that has ever been written on logical data models and the relational model in particular. What exactly do you not understand? Do you understand external vs. internal? Do you understand physical vs. logical? Actually, you don't have to answer the last question because it is clear you do not.
> > > It doesn't affect the semantics of relations or relational operators; > > > it just affects how attributes are identified. [quoted text clipped - 3 lines] > Mumble. Operations like union, intersection, difference, are identical > either way. Join needs some work, but it's not what I'd call a huge issue. No, they are not identical. Consider the following:
R1 = { { A=1, B=2 } } and R2 = { { B=2, A=1 } }
What is R3 = R1 union R2? What is R4 = R1 intersect R2? What is R5 = R1 minus R2?
If position matters, the answers are:
R3 = { { A=1, B=2 }, { A=2, B=1 } }
R4 = { }
R5 = { { A=1, B=2 } }

If position does not matter, the answers are:
R3 = { { A=1, B=2 } }
R4 = { { A=1, B=2 } }
R5 = { }
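The two sets of answers can be reproduced with a small Python sketch. It is an illustration only: the helper names `named` and `positional` are made up here, and reading "position matters" as "take the values in the order they were written" is an assumption about what the example intends.

```python
def named(**attrs):
    """A tuple as a set of (attribute, value) pairs; written order is irrelevant."""
    return frozenset(attrs.items())


def positional(**attrs):
    """Assumed reading of 'position matters': keep values in the order written."""
    return tuple(attrs.values())


r1_named = {named(A=1, B=2)}
r2_named = {named(B=2, A=1)}

r1_pos = {positional(A=1, B=2)}  # {(1, 2)}
r2_pos = {positional(B=2, A=1)}  # {(2, 1)}

# (union, intersection, difference) under each reading
named_results = (r1_named | r2_named, r1_named & r2_named, r1_named - r2_named)
pos_results = (r1_pos | r2_pos, r1_pos & r2_pos, r1_pos - r2_pos)
```

Under the named reading R1 and R2 are the same relation, so the union and intersection each hold one tuple and the difference is empty. Under the written-order reading they hold different pairs, which gives the two-tuple union and empty intersection listed above.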
Marshall Spight - 29 Dec 2003 17:20 GMT > > I was speaking of logically distinguishing attributes. I don't > > see how the physical level is even relevant here. > > You don't see how logically distinguishing attributes by physical position > violates physical independence and confuses logical and physical issues?!? Again, I don't see the physical level being discussed here. I don't see that position or order are necessarily physical; they can be logical. In this case, they are.
> > > You are confusing an external physical representation with a logical > > > representation. [quoted text clipped - 4 lines] > Um, everything that has ever been written on logical data models and the > relational model in particular. Your citation lacks a certain hoped-for specificity.
> What exactly do you not understand? Do you > understand external vs. internal? Do you understand physical vs. logical? > Actually, you don't have to answer the last question because it is clear you > do not. I don't understand why you believe that order is necessarily physical.
> > > > It doesn't affect the semantics of relations or relational operators; > > > > it just affects how attributes are identified. [quoted text clipped - 19 lines] > R4 = { } > R5 = { { A=1, B=2 } } That's not correct. If we are to do an apples-to-apples comparison of relations with named attributes vs. ordered attributes, the corresponding ordered relation would be:
R1 = { (1,2) }
R2 = { (1,2) }

(I chose attribute A to map to first position, and attribute B to map to second position.)

In which case:

R3 = { (1,2) }
R4 = { (1,2) }
R5 = {}

Which, using A -> first, B -> second, exactly corresponds to your answers for named attributes:
> If position does not matter, the answers are: > R3 = { { A=1, B=2 } } > R4 = { { A=1, B=2 } } > R5 = { } As I said, it is simply a question of how one identifies the attributes.
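A minimal Python sketch of this apples-to-apples mapping, assuming a fixed attribute-to-position table (the names `POSITION` and `to_positional` are hypothetical and only for illustration):

```python
# Hypothetical fixed mapping: attribute A goes first, attribute B second.
POSITION = {"A": 0, "B": 1}


def to_positional(named_tuple):
    """Convert an {attribute: value} dict to an ordered tuple via POSITION."""
    ordered_attrs = sorted(POSITION, key=POSITION.get)
    return tuple(named_tuple[attr] for attr in ordered_attrs)


# The order in which the attributes are written no longer matters,
# because each value is placed by its attribute's assigned position.
r1 = {to_positional({"A": 1, "B": 2})}
r2 = {to_positional({"B": 2, "A": 1})}

union, intersection, difference = r1 | r2, r1 & r2, r1 - r2
```

Once every tuple goes through the same mapping, the set operations give the same answers as the named version; the mapping itself is the extra piece of information the positional form relies on.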
Marshall
Bob Badour - 29 Dec 2003 18:51 GMT > > > I was speaking of logically distinguishing attributes. I don't > > > see how the physical level is even relevant here. [quoted text clipped - 3 lines] > > Again, I don't see the physical level being discussed here. If implicit order is not physical, where does it come from?
> I don't > see that position or order are necessarily physical; they can be > logical. In this case, they are. Nothing is more physical than position, i.e. location.
> > > > You are confusing an external physical representation with a logical > > > > representation. [quoted text clipped - 6 lines] > > Your citation lacks a certain hoped-for specificity. As does your claimed lack of comprehension.
> > What exactly do you not understand? Do you > > understand external vs. internal? Do you understand physical vs. logical? > > Actually, you don't have to answer the last question because it is clear you > > do not. > > I don't understand why you believe that order is necessarily physical. The only logical orders are conventional collating sequences of domains. All other order is physical. It implies physical location whether absolute or relative. If not physical, where does the order come from?
> > > > > It doesn't affect the semantics of relations or relational operators; > > > > > it just affects how attributes are identified. [quoted text clipped - 23 lines] > of relations with named attributes vs. ordered attributes, the > corresponding ordered relation would be: If order has meaning, stick with the order I gave you for the operations.
> (I chose attribute A to map to first position, and attribute B to map to > second position.) It is not your choice. I already gave you the order.
> > If position does not matter, the answers are: > > R3 = { { A=1, B=2 } } > > R4 = { { A=1, B=2 } } > > R5 = { } > > As I said, it is simply a question of how one identifies the attributes. You argue for implicit meaning encoded in order. I correctly identified the attributes in my example, and I showed that if order has meaning the results differ. Order requires an additional step and greater user knowledge to achieve correct results, which means the operations differ.
Omitting the semantic identifiers for the attributes only confuses matters.
Dawn M. Wolthuis - 30 Dec 2003 04:12 GMT > > "Bob Badour" <bbadour@golden.net> wrote in message > news:Ws6dnWXlao4FKnKiRVn-tA@golden.net... [quoted text clipped - 10 lines] > > If implicit order is not physical, where does it come from? <snip>
Just so it is clear, Bob, since you are filtering out my responses anyway, Marshall is correct and you are wrong on this one, darlin'
Ordering of values is logically a function (=operator=map=procedure=method) from (a subset of) the positive integers to a set of values (and likewise ordering of attributes is a function from the positive integers to a set of attributes). One can talk about ordering logically if one can talk about numbers logically (and most of us can!).
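As a rough illustration of ordering as a function from positive integers to attributes, here is a Python sketch. The dict `attribute_at` and the helpers `order_tuple` and `name_tuple` are hypothetical names, and the two-attribute example is made up.

```python
# An attribute ordering as a function from positions 1..n to attribute names.
attribute_at = {1: "A", 2: "B"}  # hypothetical example attributes


def order_tuple(named_tuple, attribute_at):
    """Apply the ordering: build an ordered tuple from a named one."""
    return tuple(named_tuple[attribute_at[i]]
                 for i in range(1, len(attribute_at) + 1))


def name_tuple(ordered, attribute_at):
    """Invert it: recover the named tuple from the ordered one."""
    return {attribute_at[i]: v for i, v in enumerate(ordered, start=1)}


row = {"B": 2, "A": 1}
ordered = order_tuple(row, attribute_at)         # (1, 2)
round_trip = name_tuple(ordered, attribute_at)   # equal to row
```

Nothing here refers to storage or layout: the ordering is just a lookup table over integers, and it converts between the named and ordered forms without losing information.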
--dawn
Mike Preece - 09 Jan 2004 03:06 GMT > > > "Bob Badour" <bbadour@golden.net> wrote in message > news:Ws6dnWXlao4FKnKiRVn-tA@golden.net... [quoted text clipped - 23 lines] > > --dawn Well - that's easy for you to say! ;)
I don't get it - but then, I'm not a mathematician. Make it easy for me, please. Call me an idiot if you like - or let Bob do it. I'd like to learn though.
If I'm following this discussion correctly, you're saying the word "relational", in the generally accepted context of "relational databases", has a different meaning when used in a strictly mathematical context.
The difference has something to do with "ordering". I don't understand. Sorry.
Is it important? and if so, why?
Mike.
Jonathan Leffler - 09 Jan 2004 04:39 GMT > [...various non-illuminating diatribes omitted...] > If I'm following this discussion correctly, you're saying the word [quoted text clipped - 6 lines] > > Is it important? and if so, why? Possibly the simplest thing to do is look at the original paper, which is available online at:
http://www.acm.org/classics/nov95/toc.html
As I pointed out earlier in one of these threads (possibly even this one), there is a section in this about the difference between ordered mathematical relations and unordered 'relationships' used in RDBMS, and why that is important. My take on it is that the primary issue is usability - people have a harder time using numbers to identify columns than using names.
-- Jonathan Leffler #include <disclaimer.h> Email: jleffler@earthlink.net, jleffler@us.ibm.com Guardian of DBD: