Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / General DB Topics / DB Theory / June 2005

Tip: Looking for answers? Try searching our database.

Proving an Upgrade is Possible

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Kenneth Downs - 02 Jun 2005 04:49 GMT
I would love to be able to determine with certainty if a certain operation
can be validated before it is performed.  This is the operation of changing
the state of a database by an upgrade.

Let's define some terms.

First, the state of the database before the upgrade is S(1).  The state
after the upgrade will be S(2).  The operation we wish to pre-validate is
S(1)->S(2).  Let's get back to this after looking at other stuff.

But before we validate the state change, we should validate that the
proposed new meta-State will be valid.  The database's meta-State is
described in normalized tables (real 1NF, no lists please).  The meta-State
before the upgrade is mS(1), which we assume to be valid.  We must validate
the proposed new meta-State, mS(2).  But this easy because it is in
normalized tables, and we can use integrity and referential constraints to
ensure it is valid.

So having validated our destination mS(2), we should also now determine that
the change in meta-State is valid.  This would be to prevent things like
converting a column from char to int (or perhaps defining what should
happen if one were to do that).  Since mS(1) is a set of tables, and mS(2)
is a set of the same structure tables, the operation mS(1)->mS(2) can be
described in a third set of tables, delta-meta-State, or dmS.  We obtain
this set of tables by outer joining all corresponding tables in mS(1) and
mS(2).  We can then have a fourth set of tables that can be used to do
referential checks on dmS and determine if it is valid.

The point that is bugging me is that, if we can prove mS(2) is valid, and
that dmS is valid, does that imply that dS (  S(1)->S(2) ) is valid?   Can
it be proven to be valid without producing database S(2) and then database
dS, because such a task is insane in actual practice.

So I guess I am looking for a theorem that says, "Given S(1), S(2) and mS(1)
and mS(2), where S(1) mS(1), ms(2) and dmS are all known to be valid, it
can be safely concluded that dS and S(2) will be valid".  Wondering if
anybody has seen a rigorous mathematical treatment of this subject, and
what the Google terms might be.
Signature

Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

Jan Hidders - 04 Jun 2005 13:45 GMT
> I would love to be able to determine with certainty if a certain operation
> can be validated before it is performed.  This is the operation of changing
> the state of a database by an upgrade.

An upgrade of what? The database? The schema of the database? The DBMS
software? What are you talking about here? And what on earth is a
"meta-state"?

-- Jan Hidders
Kenneth Downs - 04 Jun 2005 15:35 GMT
>> I would love to be able to determine with certainty if a certain
>> operation
[quoted text clipped - 6 lines]
>
> -- Jan Hidders

An upgrade of a database, not the DBMS software.  In the trivial case just
changes to the tables and columns, but also at very least unique and
referential constraints.  

The meta-state is all possible states described by the data dictionary (or
meta-data if you prefer).

If you have a data dictionary containing the meta data that describes the
database, an upgrade = a change to the data dicitonary.  So just as a user
transaction changes the state of a database, an upgrade changes the meta
data, or the meta state.  

My goal is to always know completely whether all structure changes to a db
are valid.

Here is a case of where you can prove an upgrade will fail, or at least you
cannot prove it will succeed.  If the WIDGETS table currently has a unique
constraint on COL_A, COL_B, and COL_C, we know that an upgrade (or any
change to the database) can fail if the new version has the unique
constraint only on COL_A and COL_B.  While it may succeed in some cases, it
will fail in others.

How would you know in advance though that this upgrade could not work?
Well, its simple enough.  You have the current state of the database
described in tables, meta-State 1, or mS(1).  You have the destination
state described in other tables, mS(2).  You join them together and look at
everything that changes, such as key definitions.  The dropping of a column
in a key is not allowed, so the upgrade log puts out an error and refuses
to run.  

Signature

Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

Jan Hidders - 04 Jun 2005 23:38 GMT
> If you have a data dictionary containing the meta data that describes the
> database, an upgrade = a change to the data dicitonary.  So just as a user
> transaction changes the state of a database, an upgrade changes the meta
> data, or the meta state.

Ok. So you are talking about updates to the database schema.

> My goal is to always know completely whether all structure changes to a db
> are valid.

When you say "structure changes" you don't mean just changes to the
structure, do you? You mean changes to both the database structure and
the database constraints, right? Although below it seems you in fact
exclude changes to the structure and only consider changes to the
constraints.

> Here is a case of where you can prove an upgrade will fail, or at least you
> cannot prove it will succeed.  If the WIDGETS table currently has a unique
> constraint on COL_A, COL_B, and COL_C, we know that an upgrade (or any
> change to the database) can fail if the new version has the unique
> constraint only on COL_A and COL_B.  While it may succeed in some cases, it
> will fail in others.

Ok, so I suspect are you asking the following:

"Given the old database schema and a few updates to its constraints (but
not its structure) is it possible to algorithmically decide (or
mathematically prove) whether all instances of the old schema are also
instances of the new schema?"

I leave it to you whether it is "algorithmically decide" or
"mathematically prove".

Am I close?

-- Jan Hidders
Kenneth Downs - 05 Jun 2005 14:52 GMT
>> If you have a data dictionary containing the meta data that describes the
>> database, an upgrade = a change to the data dicitonary.  So just as a
[quoted text clipped - 11 lines]
> exclude changes to the structure and only consider changes to the
> constraints.

The example considers only a constraint, to keep it simple.  The OP states
structure and constraints.  Also included in my systems are automated
columns and tables, so they must be included alos.

>> Here is a case of where you can prove an upgrade will fail, or at least
>> you
[quoted text clipped - 15 lines]
>
> Am I close?

You are on target, except that constraints are included, and also automation
(columns like price_extended = price * qty).  So while we are at it, this
means things like NULL/NOT NULL rules, DEFAULT values and so forth.

Another way to state the question is:  "What structure of meta-data allows
us to specify completely a database of arbitrary complexity and also allows
us to demonstrate by analysis that any given change in the meta data is
valid or not valid".

Methinks the structure of the meta-data would reduce the truth to a data
validity issue (an invalid transform would fail a validitity check in the
meta data), which is closer to an algorithm than to a mathematical proof.
Yet, I have the strong intuitive sense that a before-hand mathematical
proof could lead to the table structure.

Signature

Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

Jan Hidders - 05 Jun 2005 18:51 GMT
>>Ok, so I suspect are you asking the following:
>>
[quoted text clipped - 9 lines]
> (columns like price_extended = price * qty).  So while we are at it, this
> means things like NULL/NOT NULL rules, DEFAULT values and so forth.

What do you mean by "constraints are included"? Those were already
included in my formulation. Since you only care about whether instances
are valid or not, automation can in this case be regarded as a special
case of constraints. Default values are completely irrelevant for your
question because they do not change the set of allowed valid instances.

Or do you want to also consider updates that change the structure of the
schema, because in that case my formulation would be completely off track.

> Another way to state the question is:  "What structure of meta-data allows
> us to specify completely a database of arbitrary complexity and also allows
> us to demonstrate by analysis that any given change in the meta data is
> valid or not valid".

And when exactly is a change to the meta-data "valid"? What does the
specification of the meta-data look like? What do you mean by "arbitrary
complexity"? What do you mean by "demonstrate by analysis"?

> Methinks the structure of the meta-data would reduce the truth to a data
> validity issue (an invalid transform would fail a validitity check in the
> meta data), which is closer to an algorithm than to a mathematical proof.
> Yet, I have the strong intuitive sense that a before-hand mathematical
> proof could lead to the table structure.

Whoah, wait right there! You want to derive the table structure? That's
a completely different question! Since you are still a bit sloppy with
your terminology and still have not really given a mathematically exact
definition of the theorem (which is mathematical in nature) that you
want to discuss, I suggest that we first keep things simple and start
with the case where the structure of the tables is fixed. Note that I'm
not even talking about the answer of the problem here, just about
formulating the problem. That's always the first problem you need to
solve. :-)

-- Jan Hidders
Kenneth Downs - 06 Jun 2005 14:29 GMT
>>>Ok, so I suspect are you asking the following:
>>>
[quoted text clipped - 16 lines]
> case of constraints. Default values are completely irrelevant for your
> question because they do not change the set of allowed valid instances.

To provide an analytic and deterministic answer to the viability of any db
change, you must consider constraints, null rules, and defaults.

Consider this case.  Add two columns to a table, COL_A and COL_B, with the
constraint that COL_A < COL_B.  One has a default value, the other does
not, and they may not be null.  What on Earth  do you do to satisfy these
requirements in the rows that already exist in the table?  

I am not here asking for the answer to that question, anybody in this forum
can come up with case-by-case answers to puzzles.  What I am probing is the
ability to determine the answer by analysis in the general case.  This way
a db analsyst punches in what they want to happen, and the system tells
them why it will not work.

> Or do you want to also consider updates that change the structure of the
> schema, because in that case my formulation would be completely off track.

I must not understand your formulation because I thought we were talking
about the same thing: analyzing in advance a change to a database that
might include anything the db server has implemented: tables, columns,
constraints, null rules, and defaults.

Perhaps we should keep automation out of it because that is something I do
that is generally considered heresy and it tends to divert discussions.

>> Another way to state the question is:  "What structure of meta-data
>> allows us to specify completely a database of arbitrary complexity and
[quoted text clipped - 4 lines]
> specification of the meta-data look like? What do you mean by "arbitrary
> complexity"? What do you mean by "demonstrate by analysis"?

The whole point is to arrive at a description of the meta-data that allows
the analytic prediction of the validity of a state change.  The meta-data
is in tables, so it is valid if it satisfies the constraints on those
tables.

Arbitrary complexity means the system you are describing in meta-data has no
limits on complexity, either in number of tables, columns, constraints, or
their interaction.  Put another way, it is not a "toy" system that breaks
down when you try to make real-world mission-critical stuff.

To demonstrate by analysis is to examine the meta-data and know if the
upgrade will succeed.  I did not choose your term "algorithmically decide"
because it is not a step-by-step code thing, the validity or invalidity of
the state change should be captured in the state of the meta-data. For the
same reason I did not choose your term "mathematically prove".  Demonstrate
by Analysis seems closer to the mark because we want to examine (or
analyze) the meta-data.

Signature

Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

Jan Hidders - 07 Jun 2005 22:36 GMT
>>>>Ok, so I suspect are you asking the following:
>>>>
[quoted text clipped - 24 lines]
> not, and they may not be null.  What on Earth  do you do to satisfy these
> requirements in the rows that already exist in the table?  

You don't. In this case the specification of the schema update is
incomplete because the new instance for the new schema is not completely
determined. There are actually two problems that you would like to be
able to decide in advance. The first is whether the new instance is
actually fully determined or not. The second is (presuming the first is
answered positively) whether the new instance always is a valid instance
of the new schema.

To what extent these problems are solvable or not depends upon a lot of
details such as if you consider each schema update as a sequence of
atomic updates or as one big update, but also the nature of the
constraints that are allowed or the type of queries that may be used for
derived columns/tables. So until you give an idea of what your update
language and its semantics are, how the constraint language looks what
type of queries you allow for derived data it's hard to give a
meaningful answer to your question.

> upgrade will succeed.  I did not choose your term "algorithmically decide"
> because it is not a step-by-step code thing, the validity or invalidity of
> the state change should be captured in the state of the meta-data. For the
> same reason I did not choose your term "mathematically prove".  Demonstrate
> by Analysis seems closer to the mark because we want to examine (or
> analyze) the meta-data.

I suspect that what I describe as "mathematically prove" is the same as
your "analyze". It simply means using some kind of formalism (set
theory, logic, calculus, whatever) to derive true statements about the
formal object you are studying. Note that for certain problems for which
there is no finite axiomatization this is in a certain sense sometimes
not possible.

-- Jan Hidders
Kenneth Downs - 08 Jun 2005 19:19 GMT
>>>>>Ok, so I suspect are you asking the following:
>>>>>
[quoted text clipped - 33 lines]
> answered positively) whether the new instance always is a valid instance
> of the new schema.

Exactly.

> To what extent these problems are solvable or not depends upon a lot of
> details such as if you consider each schema update as a sequence of
[quoted text clipped - 4 lines]
> type of queries you allow for derived data it's hard to give a
> meaningful answer to your question.

Here is our current specification for database specifications:

http://www.secdat.com/dev/androspec.html

It is pretty thorough in terms of what it can define, but it is still
lacking in the topic of this thread: fully and completely determining that
an upgrade is possible.  I usually start a thread on something here before
tackling a problem.

>> upgrade will succeed.  I did not choose your term "algorithmically
>> decide" because it is not a step-by-step code thing, the validity or
[quoted text clipped - 10 lines]
> there is no finite axiomatization this is in a certain sense sometimes
> not possible.

Methinks we are in fact saying the same thing.  My final word on it would be
that we cast the "mathematical proof" of the validity of the upgrade to the
act of examining some meta-data tables for conventional stuff like unique
and referential violations.

The question of infinite possibilities is tough.  You don't have to resort
to Godel to know that any set of features will be unsatisfactory for all
possible cases, but you can at least prove that any operation involving
those features is valid or not valid.  If the primitive operations are
powerful enough in combination, you can safely cast all problems in terms
of those primitives.

Signature

Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

mountain man - 05 Jun 2005 08:05 GMT
> My goal is to always know completely whether all structure changes to a db
> are valid.
[quoted text clipped - 7 lines]
> it
> will fail in others.

Here it seems as if you are relaxing a constraint.
I would have thought the relaxation of constraints
should always be successful, whereas the creation
of constrainst may be problematic.

Will not the determination of success of failure ultimately also
resolve to the state of the data (in addition to the schema and
its constraints, etc)?

Achieving your stated goal relies upon an assumption
about the data, that *the_data* will be amenable to
the structural changes imposed.  What if it isn't?

Only a test-conversion will tell you that.  This is
not to say that "proving the schema upgrade is
possible" ---- at the schema level alone ---- is
not a worthwhile step in the upgrade path, as
in the end you need to know about any failure,
hopefully in advance.

Signature

Pete Brown
IT Managers & Engineers
Falls Creek
Australia
www.mountainman.com.au/software

Kenneth Downs - 05 Jun 2005 14:46 GMT
>> My goal is to always know completely whether all structure changes to a
>> db are valid.
[quoted text clipped - 12 lines]
> should always be successful, whereas the creation
> of constrainst may be problematic.

Ooops, I gave the example in reverse, you are correct.

> Will not the determination of success of failure ultimately also
> resolve to the state of the data (in addition to the schema and
> its constraints, etc)?

Yes, this is what makes it not possible to prove in advance that it will
work.  You don't want features in the product that are known to be fallible
before they even go out the door.

> Achieving your stated goal relies upon an assumption
> about the data, that *the_data* will be amenable to
> the structural changes imposed.  What if it isn't?

Again, the point it is to know it in the dev shop and not commit changes
that will not work.  The original OP is about hwo to always know whether or
not the proposed change will work.

> Only a test-conversion will tell you that.  This is
> not to say that "proving the schema upgrade is
> possible" ---- at the schema level alone ---- is
> not a worthwhile step in the upgrade path, as
> in the end you need to know about any failure,
> hopefully in advance.

Not true.  In the example above, I showed how you could know before the
conversion that it would fail, without going through the conversion.  

I am seeking the structure of meta data that would offer this ability in the
general case, that the feature set is complete enough to describe any
application of arbitrary complexity, and that any feature in a particular
case can be known to be possible by analysis before attempting the upgrade
operation.

Signature

Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

mountain man - 06 Jun 2005 00:57 GMT
>> Only a test-conversion will tell you that.  This is
>> not to say that "proving the schema upgrade is
[quoted text clipped - 12 lines]
> case can be known to be possible by analysis before attempting the upgrade
> operation.

It appears that you are trying to develop a generic process
whereby the consistency/validity of schema changes can be
tested at the schema level, before the data is dealt with.
Essentially, a schematic-change exception check between
the old and the new and proposed elements of schema change.

Is this correct?

The testing with any live data happens at a later stage.
Is this correct?

Signature

Pete Brown
Falls Creek
OZ
www.mountainman.com.au

Kenneth Downs - 06 Jun 2005 03:26 GMT
>>> Only a test-conversion will tell you that.  This is
>>> not to say that "proving the schema upgrade is
[quoted text clipped - 20 lines]
>
> Is this correct?

Yes.

> The testing with any live data happens at a later stage.
> Is this correct?

No, I want this to be included in the analysis of the schema change, so it
checks both:

mS(2):  the destination schema (or meta-State)
Delta-mS:  the change in schema (or change in meta-State)
Delta-S: Change in state of user data

Signature

Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2009 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.