Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / General DB Topics / DB Theory / December 2007

Tip: Looking for answers? Try searching our database.

Character string relation and functional dependencies

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Tegiri Nenashi - 03 Dec 2007 19:16 GMT
The connection between objects and databases seems to be a recurring
theme here on c.d.t, so I'd like to contribute one observation.
Consider a Substring relation:

Substring
=======
data | fragment
-------------------
abab | ab
abc | ab
abc | bc
....

This is pretty unintersting relation until we add more attributes: the
"from" and "to" indicating the position of the fragment inside the
data, and "#" -- the occurence number. (All numbers are 0-based,
intervals boudaries are including-excluding). The example becomes:

Substring
=======
data | fragment | from | to | #
------------------------------------
abab | ab | 0 | 2 | 0
abab | ab | 2 | 4 | 1
abc | ab | 0 | 2 | 0
abc | bc | 1 | 3 | 0
....

There are at least 3 functional dependencies:
fragment, from -> to
data, fragment, # -> from
data, from, to -> fragment

Informally these FDs correspond to the length(), instr(), and substr()
functions. So instead of talking about the class String with the
length(), instr(), and substr() member functions, we can focus on a
relation and functional dependencies....
rpost - 06 Dec 2007 17:40 GMT
[...]

>Informally these FDs correspond to the length(), instr(), and substr()
>functions. So instead of talking about the class String with the
>length(), instr(), and substr() member functions, we can focus on a
>relation and functional dependencies....

Certainly.  But we can't describe the full semantics of strings
in that way.  How do you represent concatenation?

Another difference is that database tables are finite and variable,
while the set of strings is infinite and fixed.  But your idea is used
in theoretical papers, where concrete domains are sometimes introduced
in terms of fixed infinite tables.

Signature

Reinier Post
TU Eindhoven

Tegiri Nenashi - 06 Dec 2007 18:15 GMT
> [...]
>
[quoted text clipped - 5 lines]
> Certainly.  But we can't describe the full semantics of strings
> in that way.  How do you represent concatenation?

That was not the right representation, consider the relation:

str | prefix | suffix | pos
-----------------------------
abcd | ab | cd | 2
abac | ab | ac | 2
....

Functional dependencies:
prefix & suffix -> str       (corresponds to concatenation)
prefix -> pos                 (length)
str, pos -> prefix           (prefix substring operation)
str, pos -> suffix           (suffix substring operation)

The combination of the latter two (first extract suffix, and then
extract prefix) is the usual substring operation:  substring(str, pos,
length).

Question: how do we represent the "instr" operator?

> Another difference is that database tables are finite and variable,

Oh, relations in database world are certainly not restricted by finite
cardinality.
Jonathan Leffler - 06 Dec 2007 22:38 GMT
>> Another difference is that database tables are finite and variable,
>
> Oh, relations in database world are certainly not restricted by finite
> cardinality.

I thought that computers are finite, so the relations containable in
them are too - even if damn large.  There's a big difference between
very large and infinite.

One ultimate limitation is the uniqueness requirement.  Suppose you have
a table with two integer columns.  Since the range of the integer types
are finite (even if your DBMS handles multi-precision integers), then
the maximum number of distinct rows in the relation is also finite.

Signature

Jonathan Leffler                   #include <disclaimer.h>
Email: jleffler@earthlink.net, jleffler@us.ibm.com
Guardian of DBD::Informix v2007.0914 -- http://dbi.perl.org/

publictimestamp.org/ptb/PTB-1963 whirlpool 2007-12-06 21:00:03
7275FCF2EDF20725F56D081C5528272FCD717970FCEE5D43EDE454007D35AD3E246B5C
FA6134D8AD5E2A60C6F43508ADFAE3E632D92807ED38395BFD14DA3EA

Tegiri Nenashi - 07 Dec 2007 00:24 GMT
> >> Another difference is that database tables are finite and variable,
>
[quoted text clipped - 4 lines]
> them are too - even if damn large.  There's a big difference between
> very large and infinite.

This doesn't really matter. You can still reason about infinite
relations with finite resources available on you computer platform.

> One ultimate limitation is the uniqueness requirement.  Suppose you have
> a table with two integer columns.  Since the range of the integer types
> are finite (even if your DBMS handles multi-precision integers), then
> the maximum number of distinct rows in the relation is also finite.

All computer algebra systems work with numbers which are not
restricted by a whim of hardware architects. 16/32/64 bit integer
numbers (let alone floats)? give me a break!
Marshall - 07 Dec 2007 02:09 GMT
> > >> Another difference is that database tables are finite and variable,
>
[quoted text clipped - 16 lines]
> restricted by a whim of hardware architects. 16/32/64 bit integer
> numbers (let alone floats)? give me a break!

In fact, it is quite possible to treat n-bit "integers" in an
algebraically pleasant way: as the ring of integers mod 2^n.

Marshall
V.J. Kumar - 09 Dec 2007 04:05 GMT
>> >> Another difference is that database tables are finite and
>> >> variable,
[quoted text clipped - 8 lines]
> This doesn't really matter. You can still reason about infinite
> relations

You can do that with your brain...

with finite resources available on you computer platform.

but not with that.  The computer is an intrinsically finite gadget.
Therefore,  you'd better use the finite model apparatus to reason about
things like the impossibility of expessing transitive closure in the
relational algebra.  A lot of stuff like the compactness theorem does
not work with  finite models which makes infinite model proofs
inapplicable in the finite case.

>> One ultimate limitation is the uniqueness requirement.  Suppose you
>> have a table with two integer columns.  Since the range of the
[quoted text clipped - 5 lines]
> restricted by a whim of hardware architects. 16/32/64 bit integer
> numbers (let alone floats)? give me a break!

Jonathan is of course right, the set of 'floats' is clearly finite that
somewhat clumsily approximates real numbers !
Marshall - 09 Dec 2007 04:38 GMT
> >> >> Another difference is that database tables are finite and
> >> >> variable,
[quoted text clipped - 32 lines]
> Jonathan is of course right, the set of 'floats' is clearly finite that
> somewhat clumsily approximates real numbers !

You fail.

Marshall
Tegiri Nenashi - 10 Dec 2007 18:59 GMT
> >> >> Another difference is that database tables are finite and
> >> >> variable,
[quoted text clipped - 19 lines]
> not work with  finite models which makes infinite model proofs
> inapplicable in the finite case.

Let's make a discussion al little bit more specific. Consider language
recognition problem: given a word (that is a string of symbols) and a
grammar, check if the word generated by the grammar. A naive
evaluation would simply start by generating all the words of a
(potentially infinite) langauge generated by the grammar, and stops as
soon the given word is derived.

Likewise, given a query which involves infinite relations we can have
multiple execution strategies, some of them are not terminating. For
example, a join of the finite relation

x=1

with infinite relation

x=y

has two alternative executions. We can start enumerating all the
tuples in the infinite relation x=y (assuming a countable domain)....

... and will never finish.

However, if we start with the finite relation x=1, the we can fetch
all the matching tuples from x=y by "the index lookup".
V.J. Kumar - 11 Dec 2007 14:08 GMT
>> Tegiri Nenashi <TegiriNena...@gmail.com> wrote
>> innews:cd2ad8bd-78ca-433d-bc0f-3e7ef0c0fe2a@e6g2000prf.googlegroups.co
[quoted text clipped - 48 lines]
> However, if we start with the finite relation x=1, the we can fetch
> all the matching tuples from x=y by "the index lookup".

The formulas look very cute,  no doubt about that,  but immediate questions
are:

how do you propose to index infinite relations ?
what do you do when you join two infinite relations and the result is also
an infinite relation:  x < 5  join x > 1 where x ranges over reals ?
Marshall - 11 Dec 2007 14:57 GMT
> >> Tegiri Nenashi <TegiriNena...@gmail.com> wrote
> >> innews:cd2ad8bd-78ca-433d-bc0f-3e7ef0c0fe2a@e6g2000prf.googlegroups.co
[quoted text clipped - 53 lines]
>
> how do you propose to index infinite relations ?

By formula. For the infinite relation x=y, given x, the formula
for y is straightforward. :-)

> what do you do when you join two infinite relations and the result is also
> an infinite relation:  x < 5  join x > 1 where x ranges over reals ?

You leave it unevaluated until something extensional comes along.

Marshall
David Cressey - 11 Dec 2007 16:50 GMT
> >> Tegiri Nenashi <TegiriNena...@gmail.com> wrote
> >> innews:cd2ad8bd-78ca-433d-bc0f-3e7ef0c0fe2a@e6g2000prf.googlegroups.co
[quoted text clipped - 53 lines]
>
> how do you propose to index infinite relations ?

With an infinite index, of course.
Tegiri Nenashi - 11 Dec 2007 17:20 GMT
> The formulas look very cute,  no doubt about that,  but immediate questions
> are:
>
> how do you propose to index infinite relations ?

First, let's establish the idea that an index for a finite relation
R(x,y) is a function. Given x=1 an index function returns all the
"matching" tuples, or a single tuple if there is a functional
dependency x->y. For an equality relation x=y this function is
identity: given x, return x.

> what do you do when you join two infinite relations and the result is also
> an infinite relation:  x < 5  join x > 1 where x ranges over reals ?

Well, let's not get into incountable domains, because there appeared
no way to apply the ideas of pipelined execution flow there. Also, as
Marshal noted, it is perfectly allright for an intermediatory query
evaluation result to be infinite; it is the final result where a user
is expected to see the finite set.
V.J. Kumar - 11 Dec 2007 19:29 GMT
>> The formulas look very cute,  no doubt about that,  but immediate
>> questions are:
[quoted text clipped - 3 lines]
> First, let's establish the idea that an index for a finite relation
> R(x,y) is a function.

First of all,  if you introduce the arbitrary function as an additional
primitive,  deus ex machina as it were,  then in many cases you do not
need any of the finitely definable infinite relation apparatus.  SQL
already has built-in or user-defined functions that make possible
evaluating queries over a finite tuple domain with embedded infinitary
structures.  You need to try and express query evaluation in your model
with relations only -- that's the whole point !

>Given x=1 an index function returns all the
> "matching" tuples, or a single tuple if there is a functional
[quoted text clipped - 7 lines]
> Well, let's not get into incountable domains, because there appeared
> no way to apply the ideas of pipelined execution flow there.

I do not know what "pipelined execution flow" is, but the uncountable
case is what might actually motivate your research,  as in the spacial-
temporal database area.  The countable case by itself is not very
interesting especially if the user "is expected to see the finite set".  

If you admit first-order formulas ("extended tuples") as a legitimate
query evaluation result, then your ideas coincide with Kanellakis's ideas
wrt constraint databases, a tree that has yet to bear practically usable
fruits !

> Also, as
> Marshal noted, it is perfectly allright for an intermediatory query
> evaluation result to be infinite; it is the final result where a user
> is expected to see the finite set.
Tegiri Nenashi - 11 Dec 2007 20:37 GMT
> >> The formulas look very cute,  no doubt about that,  but immediate
> >> questions are:
[quoted text clipped - 11 lines]
> structures.  You need to try and express query evaluation in your model
> with relations only -- that's the whole point !

Well, in traditional databases index structures are auxiliary.
Likewise, I would like to keep functions hidden. After all there is
one relation

x + y = z

but there are three functions that can index it.

Anyway, this whole sub thread is a digression off the initial thought
about connection between objects and relations on one hand, and object
methods and functional dependencies on the other. I'm surprised nobody
objected the idea (pun intended) that object corresponds to the
relation, and not relation attribute.

In a word, are computers finite machines? Yes. Does it necessarily
restrict relations (and other structures) to be be finite ones? Not
necessarily. I suggest to close this rather unsophisticated
discussion.
Tegiri Nenashi - 11 Dec 2007 20:47 GMT
> Well, in traditional databases index structures are auxiliary.
> Likewise, I would like to keep functions hidden. After all there is
[quoted text clipped - 3 lines]
>
> but there are three functions that can index it.

Let me elaborate a little more. Suppose we are asked to evaluate the
query

x + y = z /\ x=1 /\ z=4

there is no functions there. As usual optimizer engine starts
enumerating and costing every join permutation. It would find out that
the execution below has a finite cost:

1. Evaluate x=1
2. Evaluate z=4
3. Build a Cartesian product result
4. Join with the relation x + y = z using the index (x,z)->z-x
V.J. Kumar - 11 Dec 2007 21:27 GMT
Tegiri Nenashi <TegiriNenashi@gmail.com> wrote in news:38860a21-e69a-
46a4-a798-cc16dde1c4a7@e10g2000prf.googlegroups.com:

>> Well, in traditional databases index structures are auxiliary.
>> Likewise, I would like to keep functions hidden. After all there is
[quoted text clipped - 17 lines]
> 3. Build a Cartesian product result
> 4. Join with the relation x + y = z using the index (x,z)->z-x

What would your optimizer do in the following case:

x^2 + y^2 + z^2 =u^2 /\ u=25

?
Tegiri Nenashi - 11 Dec 2007 21:47 GMT
> Tegiri Nenashi <TegiriNena...@gmail.com> wrote in news:38860a21-e69a-
> 46a4-a798-cc16dde1c...@e10g2000prf.googlegroups.com:
[quoted text clipped - 24 lines]
>
> x^2 + y^2 + z^2 =u^2 /\ u=25

Well how indexes are created in the off-the-shelf databases? DB
administrator creates them, even though the task is rather trivial.
(Remember, that long time ago DBMS were conceived to do it
automatically:-). The case of infinite relations is much more
challenging, therefore I expect a database programmer (or should I
better call him "relational programmer") to code the index function
*manually* and register each index with the corresponding relation.
Tegiri Nenashi - 12 Dec 2007 17:37 GMT
> > Tegiri Nenashi <TegiriNena...@gmail.com> wrote in news:38860a21-e69a-
> > 46a4-a798-cc16dde1c...@e10g2000prf.googlegroups.com:
[quoted text clipped - 32 lines]
> better call him "relational programmer") to code the index function
> *manually* and register each index with the corresponding relation.- Hide quoted text -

The problem is, of course, identifiying the indexes. And it may be
problematic whether more general problems fit into the system R
optimization method. For example given a system of equations:

x + y = 2
x - y = 0

the solution is

(x+y=z) /\ (z=2) /\ (x=y)

Assuming that we have 3 indexes on the first relation:
x,y -> z
x,z -> y
y,z -> x
and two indexes on the last one
x -> y
y -> x
we still can't evaluate this query. BTW, how Kanellakis solves it?
V.J. Kumar - 13 Dec 2007 01:17 GMT
>> > Tegiri Nenashi <TegiriNena...@gmail.com> wrote in
>> > news:38860a21-e69a-
[quoted text clipped - 56 lines]
> y -> x
> we still can't evaluate this query. BTW, how Kanellakis solves it?

He does not !  All you do is just substitute relation definitions for the
query placeholder variables.  You always work with formulas representing
possibly infinite relations,  you do not 'materialize' down to the actual  
tuples; in finite cases such materialization clearly happens
automatically since any finite relation is representable with the
equality predicate and constants.
V.J. Kumar - 11 Dec 2007 21:06 GMT

> Anyway, this whole sub thread is a digression off the initial thought
> about connection between objects and relations on one hand, and object
> methods and functional dependencies on the other. I'm surprised nobody
> objected the idea (pun intended) that object corresponds to the
> relation, and not relation attribute.

That's probably because it is hard to object to or agree with an idea so
vaguely formulated.  Yes,  objects in some sense could be and have been
treated as relations, and vice versa,  so what ?  Some commercial SQL
databases even have both implementations,  'objects as relations' and
'objects as attributes'. Until and unless you define some sort of object
algebra and compare its features with the relational algebra or some
other algebra of relations,  there is nothing to argue about !


> In a word, are computers finite machines? Yes. Does it necessarily
> restrict relations (and other structures) to be be finite ones? Not
> necessarily. I suggest to close this rather unsophisticated
> discussion.
Jan Hidders - 09 Dec 2007 12:47 GMT
> > >> Another difference is that database tables are finite and variable,
>
[quoted text clipped - 7 lines]
> This doesn't really matter. You can still reason about infinite
> relations with finite resources available on you computer platform.

Indeed you can. In case anyone is interested in a concrete example:

http://cse.unl.edu/~revesz/cdb/index.htm

-- Jan Hidders
Marshall - 09 Dec 2007 16:15 GMT
> > > >> Another difference is that database tables are finite and variable,
>
[quoted text clipped - 11 lines]
>
> http://cse.unl.edu/~revesz/cdb/index.htm

Looks very interesting. I bought the intro book.

Marshall
Jan Hidders - 10 Dec 2007 16:19 GMT
> > > > >> Another difference is that database tables are finite and variable,
>
[quoted text clipped - 13 lines]
>
> Looks very interesting. I bought the intro book.

Nice! I only read the other book myself, but I've met Peter and he
writes very nice papers and gives good talks, so I suspect he writes
well.

-- Jan Hidders
V.J. Kumar - 10 Dec 2007 01:09 GMT
>> > >> Another difference is that database tables are finite and
>> > >> variable,
[quoted text clipped - 10 lines]
>
> Indeed you can. In case anyone is interested in a concrete example:

This kind of "reasoning" is an illusion.  The constraint database the
link refers to works with a finite set of first-order formulas ("extended
tuples") that permit quantifier elimination rather than with a finite set
of actual tuples as the relational model would.  It is a simple  
observation that a first-order formula under certain conditions can
"represent" a possibly infinite set.  However,  formula-set
interpretation happens in the user brain only,  not in the database
itself as would be the case with the traditional relational model which
probably explains along with other problems why constraint databases do
not enjoy much success.

> http://cse.unl.edu/~revesz/cdb/index.htm
>
> -- Jan Hidders
Jan Hidders - 10 Dec 2007 17:49 GMT
> >> > >> Another difference is that database tables are finite and
> >> > >> variable,
[quoted text clipped - 19 lines]
> interpretation happens in the user brain only,  not in the database
> itself as would be the case with the traditional relational model [...]

Come on, V.J., I'm quite sure that you can make much more interesting
contributions to this newsgroup than this kind of quibbling. Please
don't let your ego cloud the brightness of your mind.

-- Jan Hidders
paul c - 11 Dec 2007 05:18 GMT
>>>>> Another difference is that database tables are finite and variable,
>>>> Oh, relations in database world are certainly not restricted by finite
[quoted text clipped - 10 lines]
>
> -- Jan Hidders

Because I don't have access to any paid-for resources, I wasn't able to
dig deeper than the page indicated, but when I looked at it, it was
immediately reminiscent to me of type theory (as opposed to relational
theory).  Nothing wrong with that of course.  However, I think it's
important to ask ourselves what it is it that makes sense to implement.

The sense of equality for me has very much to do with one's situation,
ie., the application.  I can see why some apps might see "overlapping"
rectangles, ones that share common vertices if I may use that term, as
equal.  Also how one might consider different photographs of multiple
persons as being the same so far as they might portray the same
individual, among others.

When it comes to a system, I've always had the reaction that a facility
to define domains/types/objects is quite separate from the Rm, even
though any RM implementation must depend on such a facility, whether it
is a mental one or a programmed one.  Usually, from what I've seen, it
is not programmed.

That's about as mystical as I'm prepared to go, outside of a usenet
genealogy group.
Marshall - 09 Dec 2007 08:38 GMT
> There's a big difference between
> very large and infinite.

This gets said a lot, and it's true, but it's not very important.

In practical terms, there is nearly no difference
between a few hundred bits of precision and an infinite
number of bits of precision. Twenty three digits of pi
is sufficient to calculate the circumference of the Milky
Way from the diameter to within a centimeter.

Marshall
Marshall - 07 Dec 2007 02:06 GMT
> > Another difference is that database tables are finite and variable,
>
> Oh, relations in database world are certainly not restricted by finite
> cardinality.

Neither is it restricted to variables--not only in papers and in
theory
but in practical terms.

Marshall
paul c - 07 Dec 2007 03:54 GMT
>>> Another difference is that database tables are finite and variable,
>> Oh, relations in database world are certainly not restricted by finite
[quoted text clipped - 3 lines]
> theory
> but in practical terms.

Marshall, why do I get the feeling you are drifting the theme?  (which
coming from you would be absolutely okay by me, but this seems a bit
cryptic to me and I wonder if you care to expand?)
Marshall - 07 Dec 2007 06:34 GMT
> >>> Another difference is that database tables are finite and variable,
> >> Oh, relations in database world are certainly not restricted by finite
[quoted text clipped - 7 lines]
> coming from you would be absolutely okay by me, but this seems a bit
> cryptic to me and I wonder if you care to expand?)

Well, if nothing else, we can certainly consider relation constants.
DUM and DEE are certainly useful, and of course those are constants,
not variables.

I also don't see anything wrong with considering, say, addition on
arbitrary sized integers as an infinite relation, and reasoning about
it accordingly. Yes, there are machine resource limits, but that
doesn't have to limit our reasoning.

Also, I am working right now on the idea of a relational logic, so
I spend a certain amount of time thinking about equations with
table variables that I don't know much about. Sometimes not
even the predicate.

I guess drifting the theme is something I do a lot. :-)

Marshall
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.