Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / General DB Topics / DB Theory / March 2004

Tip: Looking for answers? Try searching our database.

Multiple specification of constraints

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Dawn M. Wolthuis - 25 Feb 2004 18:39 GMT
This is a partial response to Eric's previous e-mail.

Where should constraints be specified?

Take a services framework that includes an interface for "web services" such
as SOAP-wrapped XML that comes in from some place external to our system and
also a user interface for people doing data entry.  This is an appropriate
design for Orders, for example.

In this case, we could break down the high level services as follows:

Web Service "entry service"
GUI

(those two being the analogous interfaces to what used to be called "batch"
and "interactive" computing)
These services might then both be clients to the same

Data Validation Service

This would apply all business rules and constraints related to the data
entered through one means or the other.  Ignoring workflow types of services
for now, we would add in at least

Data CRUD Services to Create, Read, Update, and/or Delete data stored in a
database

When we use DBMS's, the Data Validation Service and CRUD Services are
tightly coupled.  This gives an advantage of a "lock box" (as Al Gore might
say) for our data. It is not only impossible for a user to write data with
an improper type, for example, it is also impossible for an IT professional
without DBA access to do so.

However, if you look at software applications, it is common, even required,
to have a GUI employ such business rules, for example, in giving users a
list of valid codes from which to choose, or even sometimes in finishing the
data entry based on prior entries (as with Excel) or based on valid entries.
Additionally, once a set of data is passed to the Validation Service, prior
to heading to the CRUD Services (whether Validation and CRUD are tightly or
loosely coupled), these business rules are employed again.

Isn't it rather common, for example, for there to be a drop down box locking
in the user to a valid set of choices, with the specification for this drop
down box in one place/language and then these choices again validated when
changes are made to the database?  We are then taking our industry's data
(specs/code) and duplicating it, thus making it difficult to maintain while
keeping integrity in this data.

My point is that with the way that software is written today, quite often
data constraints are coded in multiple places, even multiple languages.
There are apparently many people as well as exceptional conditions that
might prompt multiple locks on a door, but this need not be the norm, I
would think.

If the front-end services are written in Java, for example, then I would
prefer that the GUI and the Validation Services make use of the exact same
constraint specs and logic to validate the data against the constraints.
Even if we tap into the specifications that are placed in a DBMS, taking the
logic of the DBMS that is used to apply the constraints and using that up
front in the GUI isn't happening in current software applications, in
general, for at least a few reasons I can think of:

1) The DBMS's logic is applied when CRUD services are launched, while the
logic in the GUI is not ready to launch such services, so there is not an
easy means (or perhaps any means) for the application software developer to
tap into the DBMS's logic.  This is true for type constraints as well as ref
int constraints where the GUI might need to come back to a user and ask the
equivalent of  "Are you sure you want to delete this folder and all of its
contents?" when a user requests that a parent be removed, for example.

2) The application developer cannot easily see the specs and logic applied
by the DBMS to know whether that is the same as what they believe they need
for this application.  So they do not even make the decision on whether the
constraints they are applying are tighter in this case than what the
database spec permits.  It is easier for the application developer to treat
the DBMS constraints as a black box where they try not to bump into the
limits, but they don't think of them as related.

3) Reading a database to get constraints when trying to respond quickly to
the end-user is sometimes considered by the application developer to be out
of the question compared to employing perhaps byte code that runs in an app
server JVM.

One solid design could be to have all of the constraints specified and the
validations applied in a Validation Service that is able to reuse the same
constraint specifications that the GUI uses and that the CRUD routines use,
all specified outside of the DBMS.  We could code and run all such services
in a JVM on an app server (rather than the database server).  In this case,
employing a file system that might not be considered a DBMS would be
acceptable provided that we still have the door locked.  This can be done by
permitting CRUD services on the stored data only via the designated
services.

The constraint specs and logic need to be moved right up to the front of the
application -- to the GUI.  I know some have advocated for the database
language to deliver the applications, but that tends to tightly couple way
too much of the code, rather than employing a good design pattern, such as a
services pattern to the application.

I'm just thinking out loud and I'm sure something I wrote was not brilliant,
but help me out here -- isn't this a real issue in application software
development written today?  I'm suggesting something rather radical --
decouple the constraint specification and validation logic from the
database.  Does anyone agree?

Cheers!  --dawn
mountain man - 26 Feb 2004 00:29 GMT
> This is a partial response to Eric's previous e-mail.
>
> Where should constraints be specified?

...[trim]...

> I'm just thinking out loud and I'm sure something I wrote was not brilliant,
> but help me out here -- isn't this a real issue in application software
> development written today?  I'm suggesting something rather radical --
> decouple the constraint specification and validation logic from the
> database.  Does anyone agree?

What happens when you have two sources for controls/constraints/validations?
The answer is that there must exist and be maintained coordination tasks
which
ensure that the two sources are consistent.  This is from a 'flat
operational view'
that looks at the total system of a day-to-day basis.

The other type of view is dynamic, incorporates far longer time scales and
must
face the problems associated with change management and evolution of an
entire
system.  It is from this perspective (change management) that the
coordination
tasks that are described above become extremely complex in themselves, and
often require serious resources (or additional costs).

For example, assume you implement such a non-db centric "minder of controls,
constraints, validations, etc".  After a while, the organisation somehow
expands
its operations into another business area, and you acquire another one, or a
set
of new databases and applications to maintain.  The organisation now wants
to
integrate the lot, and have it operational before it announces its next
acquisition.

Change management is the final testing ground of good ideas, so do your
ideas
still appear sound if the environment is undergoing critical change? How big
do
the coordination tasks become before you hire another person, or buy some
more software?

Pete Brown
Falls Creek
Oz
Dawn M. Wolthuis - 26 Feb 2004 01:27 GMT
> > This is a partial response to Eric's previous e-mail.
> >
> > Where should constraints be specified?
>
> ...[trim]...
<snip>

> What happens when you have two sources for controls/constraints/validations?
> The answer is that there must exist and be maintained coordination tasks
[quoted text clipped - 28 lines]
> the coordination tasks become before you hire another person, or buy some
> more software?

Good point.

If you do not tightly couple any database storage with database validation
and integrity constraints processing, then I'm guessing (based on
experience) that you can much more easily react to changes.  I definitely
agree that holding specifications and the processing for such constraint
specs in multiple places is a setup for complexity, cost, and having your
constraints out of synch.  We cannot remove the constraint processing from
easy access from the GUI because what shows on the screen is based on theose
constraints, so I suggest removing them from being so tightly coupled with
the database that they are inaccessible to the GUI.

--dawn
Eric Kaun - 03 Mar 2004 15:30 GMT
> If you do not tightly couple any database storage with database validation
> and integrity constraints processing, then I'm guessing (based on
> experience) that you can much more easily react to changes.

This also introduces more points of failure (not in operations, but in
logic). More chances of data corruption. Logic phrased in GUI-like terms. If
programmers have the choice of placing logic in the GUI, in a "data
validation" component, or in intermediate objects, you'll get 10 different
answers from 10 different programmers. While code and design review help
somewhat, generating those things from a central point is much clearer and
less prone to abuse.

> I definitely
> agree that holding specifications and the processing for such constraint
[quoted text clipped - 3 lines]
> constraints, so I suggest removing them from being so tightly coupled with
> the database that they are inaccessible to the GUI.

Having the GUI driving things is a bad idea. Which GUI? Do you always have
just one?

Having the constraints accessible to the GUI doesn't mean they have to be
removed from elsewhere.

- Eric
Eric Kaun - 03 Mar 2004 14:37 GMT
A good discussion - first my response to Dawn's original thoughts:

Dawn wrote:
> Where should constraints be specified?

In one place, the best place, where they can be specified declaratively, and
which won't allow violations of them. If you require violations, your
consraints are wrong.

In addition, the constraints must be easy to extract, so that boilerplate
code can be generated from them. I think that's key, and while it's done in
ad hoc ways by many different companies, this type of generation must be
elevated. It is logic in action; the implementation technologies are
irrelevant.

> Take a services framework that includes an interface for "web services"
such
> as SOAP-wrapped XML that comes in from some place external to our system
and
> also a user interface for people doing data entry.  This is an appropriate
> design for Orders, for example.
[quoted text clipped - 5 lines]
>
> (those two being the analogous interfaces to what used to be called
"batch"
> and "interactive" computing)
> These services might then both be clients to the same
[quoted text clipped - 3 lines]
> This would apply all business rules and constraints related to the data
> entered through one means or the other.

This means that your constraints have to be applied for ALL data entry.
(Just repeating because it's critical.)

> Ignoring workflow types of services
> for now, we would add in at least
[quoted text clipped - 4 lines]
> When we use DBMS's, the Data Validation Service and CRUD Services are
> tightly coupled.  This gives an advantage of a "lock box" (as Al Gore
might
> say) for our data. It is not only impossible for a user to write data with
> an improper type, for example, it is also impossible for an IT
professional
> without DBA access to do so.

Any file system does CRUD. No one must be allowed to update data independent
of the "data validation service." Given that we're trying to abstract above
the level of a file system, I see no good reason to differentiate the "CRUD
services" - that's what the DBMS vendor does.

> However, if you look at software applications, it is common, even
required,
> to have a GUI employ such business rules, for example, in giving users a
> list of valid codes from which to choose, or even sometimes in finishing
the
> data entry based on prior entries (as with Excel) or based on valid
entries.
> Additionally, once a set of data is passed to the Validation Service,
prior
> to heading to the CRUD Services (whether Validation and CRUD are tightly
or
> loosely coupled), these business rules are employed again.

The GUI data "validation" isn't really validation in the same sense, though
its rules are derived directly from constraints. I'd say the GUI should be
"declarative" - rather than validating, give the user no choice to enter bad
data. A drop-down list doesn't validate - it's the sole input mechanism for
that field.

> Isn't it rather common, for example, for there to be a drop down box
locking
> in the user to a valid set of choices, with the specification for this
drop
> down box in one place/language and then these choices again validated when
> changes are made to the database?  We are then taking our industry's data
> (specs/code) and duplicating it, thus making it difficult to maintain
while
> keeping integrity in this data.

Agreed - that's the trouble with multi-tier systems, and it's the reason
code generation is so popular these days. Declared constraints give a much,
MUCH stronger basis on which to generate code, boilerplate, data access
mechanisms, etc. They are for more than just validation - they are are the
meaning of your application.

> My point is that with the way that software is written today, quite often
> data constraints are coded in multiple places, even multiple languages.

With multi-tier systems, they need to be in multiple places, in slightly
different forms. But that code doesn't need to be written and maintained
separately.

> There are apparently many people as well as exceptional conditions that
> might prompt multiple locks on a door, but this need not be the norm, I
> would think.

Don't think of it as multiple locks on a door - think of it as channeling
everyone going to a secure room past the cameras, through the patrolled
hallway, etc.

If your data has much value to the organization, then it will be accessed by
a wide variety of programmers. Declaring constraints does more than just
defend against corruption (though that's extremely important). It also
communicates the meaning of the data and the system.

> If the front-end services are written in Java, for example, then I would
> prefer that the GUI and the Validation Services make use of the exact same
> constraint specs and logic to validate the data against the constraints.

Agreed, the specs should be the same. Their implications within a tier are
not. Only the GUI needs a drop-down list, for example, but the contents of
that drop-down are derived using constraints. If you have an HTML and a
Swing interface to your data, you'll need different code for the same
drop-down, but it's still the same constraints which define its essence.

Separating essence (the constraints) from accident (the implementation
language) is critical, and I hope it's the "next big thing" in software
development.

> Even if we tap into the specifications that are placed in a DBMS, taking
the
> logic of the DBMS that is used to apply the constraints and using that up
> front in the GUI isn't happening in current software applications, in
[quoted text clipped - 3 lines]
> logic in the GUI is not ready to launch such services, so there is not an
> easy means (or perhaps any means) for the application software developer
to
> tap into the DBMS's logic.

There can and should be. The system catalog should tell the GUI most of what
it needs to know.

> This is true for type constraints as well as ref
> int constraints where the GUI might need to come back to a user and ask
the
> equivalent of  "Are you sure you want to delete this folder and all of its
> contents?" when a user requests that a parent be removed, for example.

That's derivable.

> 2) The application developer cannot easily see the specs and logic applied
> by the DBMS to know whether that is the same as what they believe they
need
> for this application.  So they do not even make the decision on whether
the
> constraints they are applying are tighter in this case than what the
> database spec permits.  It is easier for the application developer to
treat
> the DBMS constraints as a black box where they try not to bump into the
> limits, but they don't think of them as related.

Code generation (or direct higher-level language execution) based on
constraints doesn't give them as much of a choice in the matter.

> 3) Reading a database to get constraints when trying to respond quickly to
> the end-user is sometimes considered by the application developer to be
out
> of the question compared to employing perhaps byte code that runs in an
app
> server JVM.

The constraints can be read up-front; JSPs and ASPs are
dynamically-generated GUIs, and can read whatever they like on the server,
before the HTML is streamed to the browser. The app build process can use
the DBMS to generate much, so the code won't run slower than something with
hard-coded duplicated constraints. The structure of your database is going
to change far, far less than its contents.

> One solid design could be to have all of the constraints specified and the
> validations applied in a Validation Service that is able to reuse the same
> constraint specifications that the GUI uses and that the CRUD routines
use,
> all specified outside of the DBMS.

I still haven't seen a compelling reason to be outside the DBMS. You're
DESCRIBING a DBMS!. It's more than just a file system!

> We could code and run all such services
> in a JVM on an app server (rather than the database server).

And you were worried about speed before? :-)

> In this case,
> employing a file system that might not be considered a DBMS would be
> acceptable provided that we still have the door locked.

And feel like doing a lot more work.

> This can be done by
> permitting CRUD services on the stored data only via the designated
> services.

Could be, but why?

> The constraint specs and logic need to be moved right up to the front of
the
> application -- to the GUI.

Depends on what you mean by "moved." Certainly they need to be manifest
there.

> I know some have advocated for the database
> language to deliver the applications, but that tends to tightly couple way
> too much of the code, rather than employing a good design pattern, such as
a
> services pattern to the application.

I'm unclear as to why a services pattern is "good," and what the nature of
the coupling you mention is. I don't see it.

> I'm just thinking out loud and I'm sure something I wrote was not
brilliant,
> but help me out here -- isn't this a real issue in application software
> development written today?

It's the very heart of the matter, and I thank you for raising this
discussion.

> I'm suggesting something rather radical --
> decouple the constraint specification and validation logic from the
> database.  Does anyone agree?

Not I. A database is defined by constraints and validation - decouple them,
and you have what we already have: file systems. You have what DBMSs were
created to abstract above! But again, a good topic.

- Eric
Dawn M. Wolthuis - 03 Mar 2004 19:30 GMT
> A good discussion - first my response to Dawn's original thoughts:
>
[quoted text clipped - 4 lines]
> which won't allow violations of them. If you require violations, your
> consraints are wrong.

Agreed.

<snip>
> Any file system does CRUD. No one must be allowed to update data independent
> of the "data validation service." Given that we're trying to abstract above
> the level of a file system, I see no good reason to differentiate the "CRUD
> services" - that's what the DBMS vendor does.

The reason to do so is not to sever the relationship between the database
CRUD services and the validation services, but to ensure the validation
services need not require that CRUD requests be directed to them.  We need
to be able to access validation/constraint information and apply it in cases
that have nothing to do with the CRUD services.  It is typically too tightly
coupled with CRUD to do this today.

> > However, if you look at software applications, it is common, even
> required,
[quoted text clipped - 14 lines]
> data. A drop-down list doesn't validate - it's the sole input mechanism for
> that field.

Agreed.

> > Isn't it rather common, for example, for there to be a drop down box
> locking
[quoted text clipped - 11 lines]
> mechanisms, etc. They are for more than just validation - they are are the
> meaning of your application.

Agreed.

> > My point is that with the way that software is written today, quite often
> > data constraints are coded in multiple places, even multiple languages.
>
> With multi-tier systems, they need to be in multiple places, in slightly
> different forms. But that code doesn't need to be written and maintained
> separately.

Agreed.

<snip>
> If your data has much value to the organization, then it will be accessed by
> a wide variety of programmers. Declaring constraints does more than just
[quoted text clipped - 4 lines]
> > prefer that the GUI and the Validation Services make use of the exact same
> > constraint specs and logic to validate the data against the constraints.

Agreed.

> Agreed, the specs should be the same. Their implications within a tier are
> not. Only the GUI needs a drop-down list, for example, but the contents of
> that drop-down are derived using constraints. If you have an HTML and a
> Swing interface to your data, you'll need different code for the same
> drop-down, but it's still the same constraints which define its essence.

Yup.

> Separating essence (the constraints) from accident (the implementation
> language) is critical, and I hope it's the "next big thing" in software
> development.

This statement is somewhat meaningless to me -- I'm pretty sure you are not
saying that you would not encode these constraints in some language, right?
Whether that language is declarative, procedural, OO, or whatever, I just
don't care -- that's an implementation detail.  It is key that the
constraints and the application of the constraint logic (interpretation of
specs, if that is how it is implemented) be accessible (as "services" the
way I view it) to the database CRUD services and any software application UI
services, for example.

> > Even if we tap into the specifications that are placed in a DBMS, taking
> the
[quoted text clipped - 10 lines]
> There can and should be. The system catalog should tell the GUI most of what
> it needs to know.

Yes, in a similar way that it is "telling" the database CRUD services what
they need to know.

> > This is true for type constraints as well as ref
> > int constraints where the GUI might need to come back to a user and ask
[quoted text clipped - 3 lines]
>
> That's derivable.

Yup.

> > 2) The application developer cannot easily see the specs and logic applied
> > by the DBMS to know whether that is the same as what they believe they
[quoted text clipped - 9 lines]
> Code generation (or direct higher-level language execution) based on
> constraints doesn't give them as much of a choice in the matter.

Of course I would prefer to see it as giving them the full choice in the
matter.

> > 3) Reading a database to get constraints when trying to respond quickly to
> > the end-user is sometimes considered by the application developer to be
[quoted text clipped - 9 lines]
> hard-coded duplicated constraints. The structure of your database is going
> to change far, far less than its contents.

why would we gen code from the DBMS for use with the GUI and not think of
gen'ing code for similar applications of constraint logic with the CRUD
services?  Simply RUN (not gen) the same services for both.

> > One solid design could be to have all of the constraints specified and the
> > validations applied in a Validation Service that is able to reuse the same
[quoted text clipped - 4 lines]
> I still haven't seen a compelling reason to be outside the DBMS. You're
> DESCRIBING a DBMS!. It's more than just a file system!

Fine -- call the entire application software development environment a DBMS.
But I haven't seen such an animal yet, so whether there are, in the future,
pre-packaged DBMS's that include this functionality or whether we write all
such components today, it is still functionality I would not want to
compromise.

> > We could code and run all such services
> > in a JVM on an app server (rather than the database server).
>
> And you were worried about speed before? :-)

smiles.  Unfortunately whether tuning Java code or Oracle, there is still
expertise required to get the desired performance, no matter how much bigger
boxes are each year.

> > In this case,
> > employing a file system that might not be considered a DBMS would be
[quoted text clipped - 7 lines]
>
> Could be, but why?

so as not to duplicate the constraint specs, logic, and engine

> > The constraint specs and logic need to be moved right up to the front of
> the
> > application -- to the GUI.
>
> Depends on what you mean by "moved." Certainly they need to be manifest
> there.

granted.

> > I know some have advocated for the database
> > language to deliver the applications, but that tends to tightly couple way
[quoted text clipped - 4 lines]
> I'm unclear as to why a services pattern is "good," and what the nature of
> the coupling you mention is. I don't see it.

I understand why -- it was poorly stated and perhaps even poorly thought
out -- I'll get back to that someday if words to describe my thoughts are
less muddy.  Sorry not to have provided clean lines for that one.  That
isn't even a hypothesis -- it is just intuition, so ...

> > I'm just thinking out loud and I'm sure something I wrote was not
> brilliant,
[quoted text clipped - 3 lines]
> It's the very heart of the matter, and I thank you for raising this
> discussion.

And thank-you for taking the questions seriously and not responding with any
"stupid" accusations, Eric!

> > I'm suggesting something rather radical --
> > decouple the constraint specification and validation logic from the
[quoted text clipped - 3 lines]
> and you have what we already have: file systems. You have what DBMSs were
> created to abstract above! But again, a good topic.

My intent with the "decoupling" suggestion is not to make it so that CRUD
services do not make use of the validation services (multiple negatives
there) but to ensure that these validation/constraint services are available
wherever they are needed in software applications so they are not
duplicated.

--dawn
Eric Kaun - 04 Mar 2004 13:56 GMT
> > Any file system does CRUD. No one must be allowed to update data
> independent
[quoted text clipped - 10 lines]
> that have nothing to do with the CRUD services.  It is typically too tightly
> coupled with CRUD to do this today.

Oh, I agree completely. I was referring to that when I said (somewhere) that
languages are not nearly relational enough... if I had relations in my
programming language (e.g. for client and non-persistence stuff), I'd be
overjoyed.

> > Separating essence (the constraints) from accident (the implementation
> > language) is critical, and I hope it's the "next big thing" in software
[quoted text clipped - 8 lines]
> way I view it) to the database CRUD services and any software application UI
> services, for example.

Yes, definitely a language, though I think declarative is better because it
is easier to prove and reason about, generally shorter, can be optimized
more easily, etc.

I would be interested in what you mean by "accessible as services." I think
services is an over-used word, and I've yet to see a good definition as
such. What differentiates a service from a non-service?

> > Code generation (or direct higher-level language execution) based on
> > constraints doesn't give them as much of a choice in the matter.
>
> Of course I would prefer to see it as giving them the full choice in the
> matter.

We might be talking about different things - I prefer not to give developers
the choice to get around constraints that the designers and analysts have
agreed are critical for the business.

> > The constraints can be read up-front; JSPs and ASPs are
> > dynamically-generated GUIs, and can read whatever they like on the server,
[quoted text clipped - 3 lines]
> > hard-coded duplicated constraints. The structure of your database is going
> > to change far, far less than its contents.

I was unclear here - I meant specifically generate using the constraints
defined in the (R)DBMS.

> why would we gen code from the DBMS for use with the GUI and not think of
> gen'ing code for similar applications of constraint logic with the CRUD
> services?  Simply RUN (not gen) the same services for both.

Not sure I know what you're saying here, but one reason might be to
accomodate technology evolution: "Hey!" says the CEO. "Let's do .NET instead
of J2EE! Cool!"  Etc.

> And thank-you for taking the questions seriously and not responding with any
> "stupid" accusations, Eric!

Not a problem...

> > Not I. A database is defined by constraints and validation - decouple
> them,
[quoted text clipped - 6 lines]
> wherever they are needed in software applications so they are not
> duplicated.

That sounds different than what you were saying before... I thought you were
specifically separating validation from CRUD.

What I think we agree on:
1. that validation services (I'd say constraints) are useful animals with
implications throughout the app
2. that manually keeping multiple tiers in sync with respect to these
contraints is a poor use of time

- Eric
Marshall Spight - 26 Feb 2004 03:14 GMT
> I'm just thinking out loud and I'm sure something I wrote was not brilliant,
> but help me out here -- isn't this a real issue in application software
> development written today?  I'm suggesting something rather radical --
> decouple the constraint specification and validation logic from the
> database.  Does anyone agree?

It is a real issue, but I don't agree with the direction you're taking.

I think the optimal solution here is to have the constraints specified
declaratively and centrally, but to be able to *additionally* execute
them remotely, i.e., on the client. That last part, on the client, is actually
the least essential place for validation to occur; it's the only one that
can be omitted. It's always essential to have it enforced centrally, on the
server, because clients are typically not running on trusted computing
bases.

Marshall
Dawn M. Wolthuis - 26 Feb 2004 14:01 GMT
> > I'm suggesting something rather radical --
> > decouple the constraint specification and validation logic from the
[quoted text clipped - 9 lines]
> server, because clients are typically not running on trusted computing
> bases.

I agree they should be specified "centrally" but don't necessarily agree
with what you mean by that.  If you mean that they should be specified in a
way that is tightly coupled with database storage services, I disagree.  I
would advocate for use of a design that might include an http server with a
jvm as the location for all central rules/constraints processing.  I
certainly wouldn't locate the validation on the client, but might use the
Service UI libraries with Jini for distributed computing -- in that case the
compiled (to byte code) validation logic is executed on whichever platform
it needs to be.

Additionally, I would argue that the user interface is the MOST important
place for knowing the constraints so that the user interface can be as
intuitive as possible, preferably written so as not to permit any illegal
data from even being entered (but where that isn't feasible, at least
letting giving the user the opportunity to fix it immediately).  The user
interface makes a big difference in the accuracy of the information
collected.

I forget who wrote up the story of a major retail chain that built a new
store in a certain zip code because their data said a lot of people go from
that zip to another to go to this chain.  They had to close the store for
lack of customers.  It turnes out that one of the retail clerks didn't like
the constant question of what zip code the buyer was in, so she always
entered her own to get past that question.  Software usability is key to
clean data.  The GUI having full knowledge of the business rules
(validations & all constraints) is critical to a usable GUI.  If the
validation should be coded only once (which I agree with), it MUST be where
it is usable by the user interface routines.  And if we apply all integrity
logic up front, so that nothing gets past our validation routines into the
database CRUD services, then we don't have to have duplicate logic in the
database, right?

Another advantage of this approach is that there is not a need for both a
DBA who knows one language and an application developer who knows another to
either a) cross train so they are both competent in each or b) wait on each
other, often not exactly in a "team player" sort of way when a small change
is required in an application that requires a change in the type checking.

--dawn
ben brugman - 26 Feb 2004 17:14 GMT
I agree with most that is allready replied.
Constraints should be centralised. But the problem
is often where and how.

Constraints which can be enforced in the RDBMS system
should be enforced there. (If the constraint is not to complex).

Then most applications (server or client) will enforce a constraint
again, because the application does not like the exception handling
when a constraint of the database goes 'off'. For the GUI it is a
lot handier to remark there is no data the 30th of februar
(except in sweden) or that data is on a sunday and you can not
use that date, than to do the database access and get an error.
So most programmers doing the data access still have to know
the constraints to anticipate on them, same goes for the Gui programmers.
(I hate web applications which first ask all the questions and in the end
tell me that I am not allowed to do something because of .....)

Then there are constraints which are more difficult or impossible
to implement in the RDBMS, and although the constraints should
be 'kept' centralised. These have to be enforced outside the RDBMS.
(Some database people say that if a constraint can not be implemented
in a database, then it can not be a real constraint. I do not agree with
that.)

The advantage of enforcing the constraints in the RDBMS over
another centralised system is that it also protects against concurrency and
that it is often better scalable than a 'coding' solution.

> I forget who wrote up the story of a major retail chain that built a new
> store in a certain zip code because their data said a lot of people go from
> that zip to another to go to this chain.

Looks like Urban legend to me.

But I know that there is a similar story. A retail chain did not have a
outlet
in a more posh area, because surveys had shown people in that area would
not buy there. But from that area people did use the retail store in other
area's because they where not recognised there.
This the retail knew from payment information.
(The area and the retail store are both know by me).

ben brugman
Bob Badour - 26 Feb 2004 18:19 GMT
> I agree with most that is allready replied.
> Constraints should be centralised. But the problem
> is often where and how.
>
> Constraints which can be enforced in the RDBMS system
> should be enforced there. (If the constraint is not to complex).

No such complex constraint exists.

> Then most applications (server or client) will enforce a constraint
> again, because the application does not like the exception handling
> when a constraint of the database goes 'off'.

If the application was written by a competent programmer, it likes it just
fine.

> Then there are constraints which are more difficult or impossible
> to implement in the RDBMS, and although the constraints should
> be 'kept' centralised.

No such constraint exists. All you have to do to disprove my contention is
demonstrate a single constraint in a well designed schema that contradicts
my assertion.
ben brugman - 01 Mar 2004 08:48 GMT
> No such constraint exists. All you have to do to disprove my contention is
> demonstrate a single constraint in a well designed schema that contradicts
> my assertion.

1. (Data outside the one database).
If the constraint is over more than one database, there is not ONE database
that can check the constraint. If a link is not possible the constraint has
to be checked outside the database.

2. (Security is kept outside the database.)
Bank, black box identity checking, is done outside the database for several
reasons, one is security. (The encryption and the codes are often kept out
of the server system).

3. (Situation is to complex (performance) to be resolved inside the
database).
Content based authorisation. Allthough the authorisation could be handled
by joining the 'correct' tables, this would make most queries (and views)
prohibitive slow.
(The complexity of this situation is far greater if implemented in the
database than it is if implemented in the server coding, therefore the
server coding is a better place to implement these constraints).

4.(Snapshot isolation does not prevent other transactions to alter read
rows).
Depending on the implementation of the RDBMS, some checks can not
be done within the RDBMS. Read-consistency (Snapshot isolation) in
Oracle prevents that within a transaction you can control the change
of only read rows. (Another concurrent transaction can change rows
that the first transaction has only read and are required to remain the
same during the complete transaction).

>>If the application was written by a competent programmer, it likes it just
>>fine.

I am sorry to say, but I have always worked with real existing and to my
opinion competent programmers. (A varying degree of competence I must
admit). But here (and that has been written before)
I must conclude that by the newsgroup standards my organisation has
always worked with NON-competent programmers.
It is often said that if a certain situation arrises then the cause must
be incompetent programmers. I do not agree with that. But if that is the
definition of competent programmers, so be it. Then I have to accomodate
for the group of programmers we actually are working with.

Back to the argument, I think that in a lot of environments the
communication
between the database and the calling code the error handling and exception
mechanism could be improved.
If no checking would be done by the calling code and all checking would be
done by the database. The errorhandling and differentiating between
different errors would give rise to more complex coding than actually
checking
for those conditions before calling the database code.
Inserting a 'child' record is often only allowed after the 'parent' record
has
been read by the application. The database then takes care of concurrency
problems (disappearing parent), but in general the application takes care
of the fact that the parent should exist before inserting.

ben brugman
Bob Badour - 02 Mar 2004 04:07 GMT
> > > Then there are constraints which are more difficult or impossible
> > > to implement in the RDBMS, and although the constraints should
[quoted text clipped - 5 lines]
> >
> 1. (Data outside the one database).

i.e. One made an arbitrary decision not to manage the data. This does has
nothing to do with difficulty or impossibility.

> 2. (Security is kept outside the database.)

i.e. One made an arbitrary decision not to manage the data's security.

> (The encryption and the codes are often kept out
> of the server system).

I suggest you switch to a secure server. Communicating codes from another
source only adds an opportunity for interception.

> 3. (Situation is to complex (performance) to be resolved inside the
> database).

If it is too complex to resolve within the dbms, it is necessarily too
complex to resolve anywhere else.

> 4.(Snapshot isolation does not prevent other transactions to alter read
> rows).

Poor choice of dbms does not relate to difficulty or impossibility.

> Back to the argument, I think that in a lot of environments the
> communication
> between the database and the calling code the error handling and exception
> mechanism could be improved.

Or removed entirely.

> If no checking would be done by the calling code and all checking would be
> done by the database.

A database is a set of facts. It does nothing on its own. A dbms does
things. A file processor does things. An application does things. A database
just is.
Dawn M. Wolthuis - 26 Feb 2004 20:16 GMT
> I agree with most that is allready replied.
> Constraints should be centralised. But the problem
[quoted text clipped - 13 lines]
> (I hate web applications which first ask all the questions and in the end
> tell me that I am not allowed to do something because of .....)

If I understand you correctly, you are actually IN FAVOR OF DUPLICATING THIS
COMPLEX DATA -- locating it once in the database and again in a different
language in the business rules used by the GUI.  Is that correct?  It does
seem like you are in the majority, but I just don't see what is gained by
coding these rules twice in different languages -- sounds like busy work
that leads to complex-to-maintain software and is likely to be out of synch
with itself, with app programmers prone to change only the application
software and not the dbms constraints when they can do so.

> Then there are constraints which are more difficult or impossible
> to implement in the RDBMS, and although the constraints should
> be 'kept' centralised. These have to be enforced outside the RDBMS.
> (Some database people say that if a constraint can not be implemented
> in a database, then it can not be a real constraint. I do not agree with
> that.)

"Centralized" is fine, but another minor point today (more major perhaps in
the future) is that the database could be one of many accessed by the
software application, so that a central place for the business logic /
constraints / data validation information would more likely be within the
software code that is external to one of the databases, right?

> The advantage of enforcing the constraints in the RDBMS over
> another centralised system is that it also protects against concurrency and
> that it is often better scalable than a 'coding' solution.

Let's say that we have no issue with scaling on this or even find it better
not to have to go to the database to find that the valid options for radio
buttons on a flag are True & False, or that a valid date is one on a
calendar.  What is an example of the biggest risk related to concurrency?
If the spec of a constraint is that a field needs to have a value that comes
from a list of stored data, then the database would still be read to get
that data -- just not, perhaps, to find out the name of the table that needs
to be used.  But I suspect you are on to something I'm missing, so please
advise.

<snip>

Thanks.  --dawn
ben brugman - 27 Feb 2004 16:35 GMT
> If I understand you correctly, you are actually IN FAVOR OF DUPLICATING THIS
> COMPLEX DATA -- locating it once in the database and again in a different
[quoted text clipped - 4 lines]
> with itself, with app programmers prone to change only the application
> software and not the dbms constraints when they can do so.

Input of a date for example, it is fairly common dat the Gui only accepts
real date's. The constraint that there is no 30th of februar is checked in
the Gui and it is checked again in the RDBMS.
Often when entering information, for example lending a book, this can
only be entered for an existing lender. The application does not allow
to enter a non existing lender and the RDBMS still checks that that
the lender does exist.

> "Centralized" is fine, but another minor point today (more major perhaps in
> the future) is that the database could be one of many accessed by the
> software application, so that a central place for the business logic /
> constraints / data validation information would more likely be within the
> software code that is external to one of the databases, right?

If the software is used for more than one database and the database is not
used for different types of software then the coding in the software has to
be
done only once. But in the past it has been more likely that the software
will
change than that the database will change.

> Let's say that we have no issue with scaling on this or even find it better
> not to have to go to the database to find that the valid options for radio
[quoted text clipped - 5 lines]
> to be used.  But I suspect you are on to something I'm missing, so please
> advise.

Some constraints are based on data read, if these constraints are done in
the database, there is no concurrency issue. If you check the total amount
of
data in the application the read data has to be locked. (Predicate locking).
Oracle for example does not allow predicate locking so there you have to
find a way around it. Checking the constraint in the database does not
have this problem.

For example two partners have both their own bank account, they are
allowed to have an overdraft, but the two accounts together must be kept
out of the read.
If this constraint is implemented outside the database, depending on the way
it is implemented, both partners will be capable of overdrafting their own
account
if the timing is 'perfect'.
A real 'serializable' level does prevent this, but Oracle does not supply a
real 'serializable' level. So then you might be better of to implement this
constraint
in the database, so that when to transactions occure at the same time the
constraint is still checked against correctly.

ben brugman.

> <snip>
>
> Thanks.  --dawn
Eric Kaun - 03 Mar 2004 16:01 GMT
> If I understand you correctly, you are actually IN FAVOR OF DUPLICATING THIS
> COMPLEX DATA -- locating it once in the database and again in a different
> language in the business rules used by the GUI.  Is that correct?

Yes, it's a necessity if you want any assurances. If you only have ONE
application, with ONE GUI, then you may not have a problem (for now).

And the rules aren't duplicates - not exactly. For example, the GUI sort of
needs the DBMS to provide rapidly-changing data like currency exchange
rates, right?

The GUI, like an XML doc or a nested data structure, is a single and
simplistic view of things. Real things can support multiple views, from
various angles, which each have utility for different parts of the business.
But even in apps I've written, I've provided GUIs that let them envision the
data in different ways (sometimes they wanted to see paint color variants by
model, sometimes they wanted to see models by color variant.) Letting one
particular GUI govern my rules is putting the cart before the horse.

> It does
> seem like you are in the majority, but I just don't see what is gained by
> coding these rules twice in different languages -- sounds like busy work
> that leads to complex-to-maintain software and is likely to be out of synch
> with itself, with app programmers prone to change only the application
> software and not the dbms constraints when they can do so.

I agree, code them in an RDBMS and then apply simple templates to generate
the UIs. We're in agreement.

> "Centralized" is fine, but another minor point today (more major perhaps in
> the future) is that the database could be one of many accessed by the
> software application, so that a central place for the business logic /
> constraints / data validation information would more likely be within the
> software code that is external to one of the databases, right?

If the application accesses many DBMSs, then the other programs that also
access those DBMSs have rules of their own. You have a serious rule
coordination problem.

If your app is going to be the only one setting the rules, then why have
multiple DBMSs anyway?

If the databases really are separate, then you need conversion logic
(predicates) which allow interpretation in one or both directions.

- Eric
Dawn M. Wolthuis - 03 Mar 2004 19:40 GMT
> > If I understand you correctly, you are actually IN FAVOR OF DUPLICATING
> THIS
[quoted text clipped - 7 lines]
> needs the DBMS to provide rapidly-changing data like currency exchange
> rates, right?

There are definitely local constraints for specific applications that would
not be specified as global constraints.  It would be foolish to specify all
global constraints in one language and apply them with one data integrity
engine, while coding local ones using different tools entirely, I would
think.  Local constraints are just global constraints that have some
additional conditional logic in them.  And we definitely want either type of
constraint to be easy to modify, with quality assurance, of course.

> The GUI, like an XML doc or a nested data structure, is a single and
> simplistic view of things. Real things can support multiple views, from
[quoted text clipped - 3 lines]
> model, sometimes they wanted to see models by color variant.) Letting one
> particular GUI govern my rules is putting the cart before the horse.

You are right that it is not the way that the data is presented that is a
constraint.  You should be able to put different GUI views with the same
constraints.

> > It does
> > seem like you are in the majority, but I just don't see what is gained by
[quoted text clipped - 6 lines]
> I agree, code them in an RDBMS and then apply simple templates to generate
> the UIs. We're in agreement.

I'll go for that.  I don't have a reason to "gen" a lot of intermediate
steps -- as someone once said (I'll have to track it down) "the code is the
spec" so taking that code and gen'ing byte-code for whatever virtual machine
you are running the dbms & UI constraint handling within ...  [This is a
dance we could do all day, of course, but bottom line is that we agree on
the theory and envision different implementation approaches, I think]

I better do some "real work" for a while, so I'm leaving many juicy
responses unanswered for now.
Cheers!  --dawn
Eric Kaun - 04 Mar 2004 13:30 GMT
> There are definitely local constraints for specific applications that would
> not be specified as global constraints.

No, there aren't. If app X deals with a completely different set of data
than app Y, then you can easily apply each app's "individual constraints" as
global ones, qualified by the subset predicates. And if they deal with the
same data, then you can't have X writing data that violates Y's constraints,
can you? That would be nonsense.

Do you have an example? If you're talking about simple data transformation,
then calculating f(a) requires that a meets its constraints, and if you're
storing results in b, that f(a) meet the appropriate constraints for b.

I'm not sure where the need for this division is.

> It would be foolish to specify all
> global constraints in one language and apply them with one data integrity
> engine, while coding local ones using different tools entirely, I would
> think.

I agree.

> Local constraints are just global constraints that have some
> additional conditional logic in them.

Then they're global. For example:

A => X & B=>Y

A and B can be propositions about the same data, and X and Y indicate their
implications. A can be for app A, B for app B. The global constraint is
their conjunction (the above can be written differently, of course).

> I better do some "real work" for a while, so I'm leaving many juicy
> responses unanswered for now.

Yeah, me too... hard to tear myself away, though...

- Eric
Dawn M. Wolthuis - 05 Mar 2004 20:41 GMT
> > There are definitely local constraints for specific applications that
> would
[quoted text clipped - 11 lines]
>
> I'm not sure where the need for this division is.

I think we agree that all constraints can be seen as constraints on the data
globally by simply narrowing the scope of the constraint.  However, this is
what I have seen happen -- tell me if you do not think this is the norm:

Data get specified in requirements, we normalize them and stick 'em in
schemas, with constraints and all.  Then somewhere in the life of the
database, changes are made to the initial applications or new applications
are built where the database constraints are necessary but not sufficient
for taking in the data properly from a user, for example.

Let's say that the developer and DBA get along (there have been cases, as I
understand it) and when a constraint allows for all real numbers to two
decimal positions and it is determined that the requirements change so we
are going to collect only whole numbers, the database constraints are
changed.  However, even this programmer, when coding a portion of the
overall application that is used only by the xyz department where the value
of this attribute must be between 2 & 5 simply adds the logic into that
application for this restriction rather than having any change made at the
database level.

Does this still happen in real software development today or am I stuck in
the 80's?  If it does, you might think that the answer is to get the
programmers to do things right.  Ah, but there are ways to provide carots
rather than sticks even for developers.  If a software developer can handily
change a constraint they write but can't easily change one they don't, then
guess what?

smiles.  --dawn
Eric Kaun - 05 Mar 2004 21:07 GMT
> I think we agree that all constraints can be seen as constraints on the data
> globally by simply narrowing the scope of the constraint.  However, this is
[quoted text clipped - 15 lines]
> application for this restriction rather than having any change made at the
> database level.

But if that application then reads data written by a different application,
which wrote a 6, what happens? If that application WRITES only 2-5, fine...
but the 2-5 range either is a restriction for the data domain or it isn't.
What does it mean that the data "must be between 2 & 5"? That introduces a
problem if that application and others are modifying the same data. What
about the "old" values under 2 or above 5?

> Does this still happen in real software development today or am I stuck in
> the 80's?  If it does, you might think that the answer is to get the
> programmers to do things right.  Ah, but there are ways to provide carots
> rather than sticks even for developers.  If a software developer can handily
> change a constraint they write but can't easily change one they don't, then
> guess what?

Either the meaning of the data has changed (in which case you have to cope
with all the data already out there, or convert it), or the application is
dealing with a more restricted set of the data (which is fine). It's not a
local constraint - it's a query. Maybe I'm just confused by the example.

- Eric
Dawn M. Wolthuis - 05 Mar 2004 21:49 GMT
> > I think we agree that all constraints can be seen as constraints on the
> data
[quoted text clipped - 40 lines]
> dealing with a more restricted set of the data (which is fine). It's not a
> local constraint - it's a query. Maybe I'm just confused by the example.

It is a lousy example, but if you can stick with it this "local constraint"
(one that applies to this application related to the database, but not to
other applications on the same data) is one that has me putting a dropdown
list of values 2, 3, 4, & 5 for the user to choose from.  Where did the
application get those values?  From somewhere in the non-database side of
the application, I'm guessing.  Why?  Because I, the programmer, can control
the constraints here so that when 6 becomes a valid value, I can change my
dropdown list without anyone else involved.  This has to do with human
beings and how they work.  Programmers will almost always do what they see
as having the best quality solution from their perspective and that
perspective is often at odds with what a dba would see as the best quality
solution.  If both could manage validation/constraint specification/logic
within the same environment ...

Anyway, it relates to the same point I started with -- that the way we have
our system development partitioned today and where we have our constraint
processing handled, we end up with multiple specifications of
validation/constraint handling.  I don't think our systems should be
partitioned the way they are, nor do I think the roles on the project team
should be partitioned into dba's and programmers.

--dawn
Jerome H. Gitomer - 06 Mar 2004 19:12 GMT
    [Big Snip]

> It is a lousy example, but if you can stick with it this "local constraint"
> (one that applies to this application related to the database, but not to
[quoted text clipped - 18 lines]
>
> --dawn

You might partition your system development that way, but some
of us use a different model.  For example, I use a model based
on the concept of noun and verb.  Verbs (procedures) operate on
Nouns (Data).  Data is further divided into two categories:
Control Data and Application Data.

Using my model your dropdown list would be Control Data stored
in a table and could only be changed when the installations
change and validation criteria were satisfied.

My model perpetuates the dba and developer project team for good
reason -- Separation of Authority.  It reflects the need for
commercial enterprises to protect their data from change except
through authorized, controlled and monitored processes.  Those
who control the data should not develop the procedures that
operate on the data and those that develop the procedures should
not be responsible for the integrity of the data.  Experience
has proven that it is too risky to do otherwise -- not from a
systems point of view, but from the point of view of the
enterprise that owns and uses the data.

Jerry
mountain man - 07 Mar 2004 01:20 GMT
> You might partition your system development that way, but some
> of us use a different model.  For example, I use a model based
[quoted text clipped - 16 lines]
> systems point of view, but from the point of view of the
> enterprise that owns and uses the data.

Well explained Jerry.
Dawn M. Wolthuis - 08 Mar 2004 01:01 GMT
> [Big Snip]

> My model perpetuates the dba and developer project team for good
> reason -- Separation of Authority.  It reflects the need for
[quoted text clipped - 6 lines]
> systems point of view, but from the point of view of the
> enterprise that owns and uses the data.

I think this is the best reason I can see for separating these roles.  I
don't think it is good enough in most cases, however, to put up with the
side-effects such as:

1) code that has duplication of constraints (as is the topic here)
2) meetings and more meetings to have people attempt to agree when they have
such different goals which is often the case when putting a dba and a
programmer in the same room
3) high cost of ownership of software due to cost of maintenance including
changes to "constraints" in multiple locations, multiple people & skillsets
required to make changes, etc.
4) and a few others, but I'm boring myself (sorry, I'm guessing you're bored
then too).

When the issue is whether to drop a bomb somewhere, then I want checks and
balances.  When it has to do with painting a wall in my house, I really
don't want the speed & quality of committee work on that effort.  I guess
I'm saying that the risks that the average company with the average
information system has to accept in not separating these two roles is worth
it in most cases.  --dawn
Jerome H. Gitomer - 08 Mar 2004 15:51 GMT
>>[Big Snip]
>
[quoted text clipped - 12 lines]
> don't think it is good enough in most cases, however, to put up with the
> side-effects such as:

    Stop and think carefully about what you just wrote.  I don't
think you would be happy if the IT departments of your bank,
your credit card company, your mortgage lender, your automobile
loan provider, and whoever you got your tuition loans from
didn't have separation of duties and extensive checks and
balances built into their systems.

    The best way to enforce the universal application of common
constraints is to incorporate them into the database.  This is
because they then appear once and only once in the application
and, hopefully, in the institution as a whole.

    Allowing individual developers to provide their own constraints
(which implies duplication of constraints) will either have an
exponential effect on the cost of maintaining the application or
make it unmanagable.

> 1) code that has duplication of constraints (as is the topic here)

    If the constraints are in the database there is no reason for
them to be coded in the application.

> 2) meetings and more meetings to have people attempt to agree when they have
> such different goals which is often the case when putting a dba and a
> programmer in the same room

    There is one and only one set of constraints that apply to
data.  It either meets the constraints and is therefore
acceptable or it does not and is therefore unacceptable.  There
are no issues between the dbas and the developers in this regard.

> 3) high cost of ownership of software due to cost of maintenance including
> changes to "constraints" in multiple locations, multiple people & skillsets
> required to make changes, etc.

    The cost of software maintenance is negligible in comparison to
the cost and damage resulting from the failue to catch an error
in, to stick with the financial industry, the savings account
processing or loan processing applications in a bank.

> 4) and a few others, but I'm boring myself (sorry, I'm guessing you're bored
> then too).
[quoted text clipped - 5 lines]
> information system has to accept in not separating these two roles is worth
> it in most cases.  --dawn

    In closing I will cite one actual case of a bad program
constraint that cost millions of dollars.  The company in
question had many divisions, including a computer division.  A
new director in looking over the division balance sheets saw
that the computer division was losing money and forced the sale
of the computer division.  When the profits did not increase to
the expected level in the following year an investigation
revealed that the revenues for computer service had been
credited to the service division rather than the computer
division and that the computer division really was profitable.

    The code constraints in the program used to determine which
division received credit for revenue were incorrect.
Unfortunately they were buried too deep in the system to have
been discovered in time to avoid what proved to be a very costly
mistake.

Jerry
Dawn M. Wolthuis - 08 Mar 2004 18:31 GMT
> >>[Big Snip]
> >
[quoted text clipped - 19 lines]
> didn't have separation of duties and extensive checks and
> balances built into their systems.

"Extensive checks and balances" is not the same as "separation of duties"
but even with a separation of duties, separating them so as to get multiple
sets of constraint logic does not make good sense.

> The best way to enforce the universal application of common
> constraints is to incorporate them into the database.  This is
> because they then appear once and only once in the application
> and, hopefully, in the institution as a whole.

If one could incorporate all such constraints into the database and use them
for all needed purposes from there, that's fine.  But you can't, so they end
up strewn throughout the code.  Do you read code often?  Are you not seeing
any additional constraint logic for validation, GUI display or other
purposes within the code?  If not, I suspect your setup is exceptional.

> Allowing individual developers to provide their own constraints
> (which implies duplication of constraints) will either have an
> exponential effect on the cost of maintaining the application or
> make it unmanagable.

No disagreement here -- that's was my point up front, but from the other
direction.  I simply don't want to have the constraints spec'd, respec'd and
then spec'd some more as seems the case in code I have seen over the past
couple of decades.

> > 1) code that has duplication of constraints (as is the topic here)
>
> If the constraints are in the database there is no reason for
> them to be coded in the application.

Perhaps you have missed my argument on this.  Many, many constraints show up
in the software applications.  For example, if the database says that a
particular attribute must be validated against a table of possible values,
then that constraint is applied by the database when CRUD services are
invoked.  However, prior to then, a GUI is likely to toss up a drop down
list of valid values for data entry so that no values other than the ones
the database constraint permits are available to the user. The user never
enters an invalid value in this application, then, but the application
developer invariably as recoded the constraint that says that this field is
validated by that table.

If this is not the case in your situation, how is code that displays the GUI
getting such information?  Is the software developer really reading this
constraint info from the database?  Even if they were to retrieve a spec
from the database, they would then be reimplementing the logic that
interprets and applies that spec unless the database validation logic is
able to be invoked for purposes other than CRUD services.

> > 2) meetings and more meetings to have people attempt to agree when they have
> > such different goals which is often the case when putting a dba and a
[quoted text clipped - 4 lines]
> acceptable or it does not and is therefore unacceptable.  There
> are no issues between the dbas and the developers in this regard.

Is that the experience of others on ths list too?  I work with small to
mid-size businesses.  I don't see blatant issues between the two groups --
just a bit of passive-agressive behavior.  DBAs build walls and programmers
build ladders.

> > 3) high cost of ownership of software due to cost of maintenance including
> > changes to "constraints" in multiple locations, multiple people & skillsets
[quoted text clipped - 4 lines]
> in, to stick with the financial industry, the savings account
> processing or loan processing applications in a bank.

I certainly believe that good QA is important, I just don't see the current
split between the roles of the DBA and programmer as contributing to that
quality in a significant way.  I do see where it could make it a little more
complex for a corrupt developer to steal, but if the production environment
is adequately locked down ... Anyway, as I mentioned, the issue of having
multiple people with multiple roles so there are checks and balances is the
only reason I can see for the currrent strategy of separation of skills and
roles between dba and programmer.  So, that type of security is still
needed, but it doesn't have to require that there be two such groups, nor
that they use two entirely different toolsets for constraint logic (for
example).

> > 4) and a few others, but I'm boring myself (sorry, I'm guessing you're bored
> > then too).
[quoted text clipped - 22 lines]
> been discovered in time to avoid what proved to be a very costly
> mistake.

Good story.  We, as a profession, seem to have a poor handle on "our own"
data that we use (programs, database-spec'd constraints, etc).  It is not
surprising that this data was "buried too deep".

Cheers!  --dawn
mountain man - 08 Mar 2004 20:35 GMT
> > [Big Snip]
>
[quoted text clipped - 29 lines]
> information system has to accept in not separating these two roles is worth
> it in most cases.  --dawn

The roles might simply be two hats that one person wears.
The principle in question here is the multiple specification of constraints.

When systems get large, or when there is much change happening
then these constraints need to be centralised in the DB.

After all, the whole point of a DBMS is to have I/O.  Those who
dont want to "waste valuable i/o" on performing database constraint
or validation checking might consider allocating funds to a get better
CPU if the validation i/o is not quick enough for the users.

Some solutions obviously seek middle ground, such as caching the
data control-data locally.

Programmers and DBA's usually are guided by what the IT manager
may decide.  In Australia, the usual practice is that there is one single
person responsible - in the end - for decisions.  I dont know how you
run things in the north of the planet, but there are times for meetings
and there are times for decisions.

Your points 1 - 3 above would normally be resolved at the IT management
level, and dependent upon the "technical philosophy" of that person, so
the decisions and policies get made, etc.

That person should also be able to implement short-term workarounds
of changing application code to present a new set of contraints, if the
RDBMS cannot be prepared "overnight", and to oversight the change-
management of this, and its coordination with the existing operations.

But in the end, change management (of the RDBMS and the applications)
is the most expensive item in IT today.  And it is for this very reason that
the tendency to move all constraints to the RDBMS is growing.

Most things are better change-managed centrally than distributed.

Pete Brown
Falls Creek
Oz
www.mountainman.com.au
Eric Kaun - 08 Mar 2004 20:47 GMT
> After all, the whole point of a DBMS is to have I/O.

You're not really serious? In other words, as long as it's persisted, it
doesn't matter what "it" is?

> Those who
> dont want to "waste valuable i/o" on performing database constraint
> or validation checking might consider allocating funds to a get better
> CPU if the validation i/o is not quick enough for the users.

If they don't want to waste time doing that, they probably really don't want
to waste time - and money - fixing the inevitable results of data
corruption.

- erk
mountain man - 09 Mar 2004 12:38 GMT
> > After all, the whole point of a DBMS is to have I/O.
>
> You're not really serious? In other words, as long as it's persisted, it
> doesn't matter what "it" is?

I'm not sure I understand what your saying.

What I am advocating is use of the database to hold all control data,
validation data, constraints, etc.  Where these are no longer held in
the application code, this will imply physical RDBMS I/O in order
that the application return them to the end user.

The original statement above might be paraphrased
'the whole point in having a DBMS is utilising it.'

> > Those who
> > dont want to "waste valuable i/o" on performing database constraint
[quoted text clipped - 4 lines]
> to waste time - and money - fixing the inevitable results of data
> corruption.

In the long run, this is correct.
Dawn M. Wolthuis - 09 Mar 2004 03:26 GMT
> > > [Big Snip]

> Programmers and DBA's usually are guided by what the IT manager
> may decide.  In Australia, the usual practice is that there is one single
[quoted text clipped - 5 lines]
> level, and dependent upon the "technical philosophy" of that person, so
> the decisions and policies get made, etc.

Yes, one thing IT management does is resolve issues when two or more people
on a project team are not in agreement. That's how I know that DBAs and
programmers often have very different goals.  I'd prefer that everyone
building my house have the same goal in actually building a quality house
into which I can move.

<snip>
> But in the end, change management (of the RDBMS and the applications)
> is the most expensive item in IT today.  And it is for this very reason that
> the tendency to move all constraints to the RDBMS is growing.

ugh!  That's what gets us to the topic at hand as it is -- the way that
constraints are handled in today's DBMS's, from what I can see, is to employ
constraint logic when CRUD services are requested, rather than having the
necessary constraints available to the GUI developer, for example (I'm being
a broken record, but ...)

> Most things are better change-managed centrally than distributed.

Maybe in 1988 the database was a central repository for all data a company
cared about, but that isn't the way it is today.  The DATA ARE
DISTRIBUTED -- so then you get to that "federated" term and all, but what
you can do to centralize the constraint logic is create services for use by
any and all that need it.  How do you ENFORCE the use of these services?
You can require the CRUD services for any databases you do "own" to employ
these services, for one thing.  And then there is the carrot rather than the
stick for your GUI and other components -- if it is easier to use the data
constraint services than not to and to change them when needed, then we can
guess what will happen.

--dawn
Marshall Spight - 10 Mar 2004 23:19 GMT
> Yes, one thing IT management does is resolve issues when two or more people
> on a project team are not in agreement. That's how I know that DBAs and
> programmers often have very different goals.  I'd prefer that everyone
> building my house have the same goal in actually building a quality house
> into which I can move.

This viewpoint doesn't scale. It requires every engineer working on a given
system understand all the development and operating requirements of
every part of the system. That only works if the system is small enough
so that the engineers can invest the time needed to do that, and also do
their job.

If they don't have enough time to do that, then they will have localized
responsibilities and localized concerns that will sometimes conflict with
those of other engineers working on the system, even if everyone agrees
that the overall goal is to maximize the value of the company.

Marshall
mAsterdam - 28 Feb 2004 08:01 GMT
> Constraints should be centralised.

This is to general. I look at it this way: There are more sets of
constraints, each with a different purpose. The contraints at the
database serve to protect the integrity of the managed set of data.
The constraints at the user-interface on the other hand serve to assist
the user in providing the data he needs to provide in order to achieve
his goal.

So here we have two sets of constraints. Let's call them 'D' and 'U'.
You can look at a set of constraints as defining all possible
combinations of a set of data, i.e. as a type.

Is U defining a subset of D? A superset?
The intersection of the sets defined is the relevant set for the
feared constraint-redundancy.

Now when there are lots of data and - relatively - just a few
constraints ("Large databanks"), I won't worry about that redundancy.
Bank ID's provide a good example for this.

Just my 2 Eurocents.
ben brugman - 01 Mar 2004 09:38 GMT
> > Constraints should be centralised.
>
> This is to general. I look at it this way: There are more sets of
> constraints, each with a different purpose.
Yes, I do agree here.
The sets do not have to be completely disjunct.

And most discussions in this 'newsgroup' only concern constraints
which do apply to the database. (This is the set of data which is
still present, after the machine is shut down. Wel the constraints
'are' not present, but the data still conforms to that set of constraints).

Then the centralising of the constraints. I do not think the implementation
can be 'totally' centralised. But a nice situation would be that the
model with all constraints is somewhere centralised in some form.
(Documentation or something).
This set of constraints has to be implemented and the implementation
can be redundand. (In the code and in the database.)

Example there is an implementation of 'child' and 'parent' in the database,
the database holds the constraint the 'parent' must exist everytime a
child exists. Often the application is build in such a way that even if
the constraint is dropped in the database, the application only can
'make' data in accordance with the constraint. Often the application
can not 'violate' a constraint if it is the only application.

My point :
All constraints should be 'included' in the model 'documentation'.
(This is considered centralised.)
Implementation of constraints should be done as close as possible
to the data, preferably in the RDBMS.
(The implementation can be centralised, but I think this is not
realistic for all situations).
Server and application coding often do implement a part of the
constraints implemented in the database. (The risc here is that
different software has different implementations of the 'same'
constraint).
Then some constraints are only implemented outside the RDBMS,
the risc here is different implementations of the 'same' constraint and
no central enforcement (RDBMS) of the constraint and over time
the constraint can be changed (switched on or off or can be different).
This gives the risc that even if a constraint is implemented that there
is data not in accordance with that constraint.

(
Some object modellers believe that all constraints should be implemented
in 'their' object code. The implementation should be centralised for each
object. The advantage being that there is a centralised constraint and
that then there is a freedom to implement the data of the object independend
of the constraint in any data-repository which can hold the data.

In concept I do agree with those views. But from past experience I totaly
disagree with that concept. Why ? Environments change more than
databases. Constraints are more difficult to enforce in this way.
There is no hard link between the constraint itself and the existing data.

Implementing the constraint in an implemented database, the link is hard.
Because to use the data you need the specific RDBMS and therefore
can enforce the constraints. There are no databases which can be
accesed by RDBMSses from different vendors.
)

> The contraints at the
> database serve to protect the integrity of the managed set of data.
[quoted text clipped - 9 lines]
> The intersection of the sets defined is the relevant set for the
> feared constraint-redundancy.

Even if a constraint if only of the type 'D' or of the type 'U' it can be
implemented more than once. So I do not think this is the problem
of the intersection. For example the 'U' set can be implemented more
than once if more than once if serveral applications work with the
same data.

> Now when there are lots of data and - relatively - just a few
> constraints ("Large databanks"), I won't worry about that redundancy.
> Bank ID's provide a good example for this.

The risc of constraint redundency is not that it is implemented more
than once, the risc is that there can be different implementations of
the same constraint. (For example a new application using the same
data but not implementing the constraint at all).
(I do not know what you mean by the redundancy of Bank ID's).

Thanks for your participation,
ben brugman

> Just my 2 Eurocents.
Marshall Spight - 01 Mar 2004 16:15 GMT
> And most discussions in this 'newsgroup' only concern constraints
> which do apply to the database.

It's a database theory newsgroup, so this should not be a surprise.

> (This is the set of data which is
> still present, after the machine is shut down.

Ack! That's not a good definition! RDBMSs are primarily about data
management, not persistence. They happen to be a good place to
implement persistence as a service, but there are other things that
can do that too just as well. But those things can't do data management.

The idea that databases are primarily about persistence is toxic myth.
You can have persistence without data management, and you can
have data management without persistence.

>Well the constraints
> 'are' not present, but the data still conforms to that set of constraints).
[quoted text clipped - 5 lines]
> This set of constraints has to be implemented and the implementation
> can be redundand. (In the code and in the database.)

> Example there is an implementation of 'child' and 'parent' in the database,
> the database holds the constraint the 'parent' must exist everytime a
> child exists. Often the application is build in such a way that even if
> the constraint is dropped in the database, the application only can
> 'make' data in accordance with the constraint. Often the application
> can not 'violate' a constraint if it is the only application.

It is important to differentiate between application constraints and
data constraints. Something that is only necessary for one particular
application and not for others is not a data constraint, and does
not have the same centralization requirements.

> My point :
> All constraints should be 'included' in the model 'documentation'.
> (This is considered centralised.)

Not by me it isn't.

> Implementation of constraints should be done as close as possible
> to the data, preferably in the RDBMS.
[quoted text clipped - 4 lines]
> different software has different implementations of the 'same'
> constraint).

That's why centralization is essential.

Let's put it this way: you centralize the constraints for the same
reasons you centralize the data. In fact, the constraints *are*
a part of the data; that's why you can't separate them.

> Then some constraints are only implemented outside the RDBMS,
> the risc here is different implementations of the 'same' constraint and
> no central enforcement (RDBMS) of the constraint and over time
> the constraint can be changed (switched on or off or can be different).
> This gives the risc that even if a constraint is implemented that there
> is data not in accordance with that constraint.

This is another problem with centralization solves.

Marshall
ben brugman - 01 Mar 2004 17:08 GMT
> > And most discussions in this 'newsgroup' only concern constraints
> > which do apply to the database.
[quoted text clipped - 8 lines]
> implement persistence as a service, but there are other things that
> can do that too just as well. But those things can't do data management.

The 'definition' concerns the database, not the RDBMS or not the
(RDBMS-)instance. So for the database the definition 'holds'.
For a RDBMS or a (RDBMS-)instance you are correct. The database
part is only the persistence part which can be recovered after a crash.
(But I must admit that I often use the term database where RDBMS should
be used. In the above case I didn't).

> The idea that databases are primarily about persistence is toxic myth.
> You can have persistence without data management, and you can
> have data management without persistence.

Databases are about persistence. RDBMSses are about the system
around the database.

> It is important to differentiate between application constraints and
> data constraints. Something that is only necessary for one particular
> application and not for others is not a data constraint, and does
> not have the same centralization requirements.

Constraints are requirements of the 'user', or the 'one' who is defining
the system. At specification time of the requirements it does not have
to be clear if a constraint will be implemented in a database or in any
other way. So a specification time for some constraints it is not
possible to differentiate between appliation constraints and data
constraints.
When implementing the differentiation has to be made. But before
implementing
it would be nice to have all constraint somewhere centralised.

> > My point :
> > All constraints should be 'included' in the model 'documentation'.
> > (This is considered centralised.)
>
> Not by me it isn't.
Ok.

(Refraise : The requirements should give or dictate the constraints.
The requirements should be a complete set. It would be nice that
the set is not scattered over several systems/documents/buildings).

> > Implementation of constraints should be done as close as possible
> > to the data, preferably in the RDBMS.
[quoted text clipped - 6 lines]
>
> That's why centralization is essential.

No it is not, we run a legacy database only capable of implementing
very rudimentair constraints and constraints have to be implemented
into the coding. We have survived in the marked without this
essential centralization. So centralization is not essential for survival.

And you can see from my other mails that I have given some examples
where constraints outside the RDBMS is the 'best' option. And also
in one of the mails I described that some people call constraints outside
the RDBMS as non database constraints. I do not agree with that.
But I must agree it is debatable when a constraint is a database
constraint and when it is not.

> Let's put it this way: you centralize the constraints for the same
> reasons you centralize the data. In fact, the constraints *are*
[quoted text clipped - 8 lines]
>
> This is another problem with centralization solves.

No it does not. If the centralization is done in application code it does
not solve the problem. And if different database venders are used
it does not solve the problem either. (Two databases can not
be centralised in one RDBMS).
If the constraints are implemented with triggers than this problem is
not solved either.

I agree that referential constraints and domain constraints should be
implemented in the RDBMS (if possible, using two databases does not
allow for referential constraint checking within one RDBMS).
But a lot of constraints (which I still think of as database constraints)
are implemented outside the RDBMS.

ben brugman.

> Marshall
Bob Badour - 02 Mar 2004 04:20 GMT
> > "ben brugman" <ben@niethier.nl> wrote in message
> news:4043049f$0$269$4d4ebb8e@read.news.nl.uu.net...
[quoted text clipped - 24 lines]
>
> Databases are about persistence.

A database is a set of facts. It might persist, and it might not.

> > It is important to differentiate between application constraints and
> > data constraints. Something that is only necessary for one particular
[quoted text clipped - 3 lines]
> Constraints are requirements of the 'user', or the 'one' who is defining
> the system.

Nope. They are limitations imposed on the user to keep the fallible from
doing what they do best.
Eric Kaun - 03 Mar 2004 16:16 GMT
> > > Ack! That's not a good definition! RDBMSs are primarily about data
> > > management, not persistence. They happen to be a good place to
[quoted text clipped - 6 lines]
>
> A database is a set of facts. It might persist, and it might not.

Agreed. And if we had languages that supported relations for all processing,
we wouldn't be in the O-O mess we're in now. It would be just amazingly
useful to be able to perform relational operations on data in memory... and
then, at the end when satisfied, persist it.

(ben wrote)
> > Constraints are requirements of the 'user', or the 'one' who is defining
> > the system.
>
> Nope. They are limitations imposed on the user to keep the fallible from
> doing what they do best.

I'd say both, actually. Certainly they prevent errors, but they also serve
to communicate (to developers and DBAs and such). And they document (in
executable form!) the meaning of the system's data.

- Eric
Marshall Spight - 11 Mar 2004 18:15 GMT
> Agreed. And if we had languages that supported relations for all processing,
> we wouldn't be in the O-O mess we're in now. It would be just amazingly
> useful to be able to perform relational operations on data in memory... and
> then, at the end when satisfied, persist it.

Okay, that sounds great and all, but how's it going to work? For one thing,
the operations shouldn't be limited to what's in memory any more than
the DBMS has that restriction. And what about things like keys?
Let's say we want to migrate some hierarchically-stored data into
our RDBMS, how do we make that work? We need to have the
foreign keys, which means we need to allocate k