Database Forum / General DB Topics / DB Theory / March 2004
Multiple specification of constraints
|
|
Thread rating:  |
Dawn M. Wolthuis - 25 Feb 2004 18:39 GMT This is a partial response to Eric's previous e-mail.
Where should constraints be specified?
Take a services framework that includes an interface for "web services" such as SOAP-wrapped XML that comes in from some place external to our system and also a user interface for people doing data entry. This is an appropriate design for Orders, for example.
In this case, we could break down the high level services as follows:
Web Service "entry service" GUI
(those two being the analogous interfaces to what used to be called "batch" and "interactive" computing) These services might then both be clients to the same
Data Validation Service
This would apply all business rules and constraints related to the data entered through one means or the other. Ignoring workflow types of services for now, we would add in at least
Data CRUD Services to Create, Read, Update, and/or Delete data stored in a database
When we use DBMS's, the Data Validation Service and CRUD Services are tightly coupled. This gives an advantage of a "lock box" (as Al Gore might say) for our data. It is not only impossible for a user to write data with an improper type, for example, it is also impossible for an IT professional without DBA access to do so.
However, if you look at software applications, it is common, even required, to have a GUI employ such business rules, for example, in giving users a list of valid codes from which to choose, or even sometimes in finishing the data entry based on prior entries (as with Excel) or based on valid entries. Additionally, once a set of data is passed to the Validation Service, prior to heading to the CRUD Services (whether Validation and CRUD are tightly or loosely coupled), these business rules are employed again.
Isn't it rather common, for example, for there to be a drop down box locking in the user to a valid set of choices, with the specification for this drop down box in one place/language and then these choices again validated when changes are made to the database? We are then taking our industry's data (specs/code) and duplicating it, thus making it difficult to maintain while keeping integrity in this data.
My point is that with the way that software is written today, quite often data constraints are coded in multiple places, even multiple languages. There are apparently many people as well as exceptional conditions that might prompt multiple locks on a door, but this need not be the norm, I would think.
If the front-end services are written in Java, for example, then I would prefer that the GUI and the Validation Services make use of the exact same constraint specs and logic to validate the data against the constraints. Even if we tap into the specifications that are placed in a DBMS, taking the logic of the DBMS that is used to apply the constraints and using that up front in the GUI isn't happening in current software applications, in general, for at least a few reasons I can think of:
1) The DBMS's logic is applied when CRUD services are launched, while the logic in the GUI is not ready to launch such services, so there is not an easy means (or perhaps any means) for the application software developer to tap into the DBMS's logic. This is true for type constraints as well as ref int constraints where the GUI might need to come back to a user and ask the equivalent of "Are you sure you want to delete this folder and all of its contents?" when a user requests that a parent be removed, for example.
2) The application developer cannot easily see the specs and logic applied by the DBMS to know whether that is the same as what they believe they need for this application. So they do not even make the decision on whether the constraints they are applying are tighter in this case than what the database spec permits. It is easier for the application developer to treat the DBMS constraints as a black box where they try not to bump into the limits, but they don't think of them as related.
3) Reading a database to get constraints when trying to respond quickly to the end-user is sometimes considered by the application developer to be out of the question compared to employing perhaps byte code that runs in an app server JVM.
One solid design could be to have all of the constraints specified and the validations applied in a Validation Service that is able to reuse the same constraint specifications that the GUI uses and that the CRUD routines use, all specified outside of the DBMS. We could code and run all such services in a JVM on an app server (rather than the database server). In this case, employing a file system that might not be considered a DBMS would be acceptable provided that we still have the door locked. This can be done by permitting CRUD services on the stored data only via the designated services.
The constraint specs and logic need to be moved right up to the front of the application -- to the GUI. I know some have advocated for the database language to deliver the applications, but that tends to tightly couple way too much of the code, rather than employing a good design pattern, such as a services pattern to the application.
I'm just thinking out loud and I'm sure something I wrote was not brilliant, but help me out here -- isn't this a real issue in application software development written today? I'm suggesting something rather radical -- decouple the constraint specification and validation logic from the database. Does anyone agree?
Cheers! --dawn
mountain man - 26 Feb 2004 00:29 GMT > This is a partial response to Eric's previous e-mail. > > Where should constraints be specified? ...[trim]...
> I'm just thinking out loud and I'm sure something I wrote was not brilliant, > but help me out here -- isn't this a real issue in application software > development written today? I'm suggesting something rather radical -- > decouple the constraint specification and validation logic from the > database. Does anyone agree? What happens when you have two sources for controls/constraints/validations? The answer is that there must exist and be maintained coordination tasks which ensure that the two sources are consistent. This is from a 'flat operational view' that looks at the total system of a day-to-day basis.
The other type of view is dynamic, incorporates far longer time scales and must face the problems associated with change management and evolution of an entire system. It is from this perspective (change management) that the coordination tasks that are described above become extremely complex in themselves, and often require serious resources (or additional costs).
For example, assume you implement such a non-db centric "minder of controls, constraints, validations, etc". After a while, the organisation somehow expands its operations into another business area, and you acquire another one, or a set of new databases and applications to maintain. The organisation now wants to integrate the lot, and have it operational before it announces its next acquisition.
Change management is the final testing ground of good ideas, so do your ideas still appear sound if the environment is undergoing critical change? How big do the coordination tasks become before you hire another person, or buy some more software?
Pete Brown Falls Creek Oz
Dawn M. Wolthuis - 26 Feb 2004 01:27 GMT > > This is a partial response to Eric's previous e-mail. > > > > Where should constraints be specified? > > ...[trim]... <snip>
> What happens when you have two sources for controls/constraints/validations? > The answer is that there must exist and be maintained coordination tasks [quoted text clipped - 28 lines] > the coordination tasks become before you hire another person, or buy some > more software? Good point.
If you do not tightly couple any database storage with database validation and integrity constraints processing, then I'm guessing (based on experience) that you can much more easily react to changes. I definitely agree that holding specifications and the processing for such constraint specs in multiple places is a setup for complexity, cost, and having your constraints out of synch. We cannot remove the constraint processing from easy access from the GUI because what shows on the screen is based on theose constraints, so I suggest removing them from being so tightly coupled with the database that they are inaccessible to the GUI.
--dawn
Eric Kaun - 03 Mar 2004 15:30 GMT > If you do not tightly couple any database storage with database validation > and integrity constraints processing, then I'm guessing (based on > experience) that you can much more easily react to changes. This also introduces more points of failure (not in operations, but in logic). More chances of data corruption. Logic phrased in GUI-like terms. If programmers have the choice of placing logic in the GUI, in a "data validation" component, or in intermediate objects, you'll get 10 different answers from 10 different programmers. While code and design review help somewhat, generating those things from a central point is much clearer and less prone to abuse.
> I definitely > agree that holding specifications and the processing for such constraint [quoted text clipped - 3 lines] > constraints, so I suggest removing them from being so tightly coupled with > the database that they are inaccessible to the GUI. Having the GUI driving things is a bad idea. Which GUI? Do you always have just one?
Having the constraints accessible to the GUI doesn't mean they have to be removed from elsewhere.
- Eric
Eric Kaun - 03 Mar 2004 14:37 GMT A good discussion - first my response to Dawn's original thoughts:
Dawn wrote:
> Where should constraints be specified? In one place, the best place, where they can be specified declaratively, and which won't allow violations of them. If you require violations, your consraints are wrong.
In addition, the constraints must be easy to extract, so that boilerplate code can be generated from them. I think that's key, and while it's done in ad hoc ways by many different companies, this type of generation must be elevated. It is logic in action; the implementation technologies are irrelevant.
> Take a services framework that includes an interface for "web services" such
> as SOAP-wrapped XML that comes in from some place external to our system and
> also a user interface for people doing data entry. This is an appropriate > design for Orders, for example. [quoted text clipped - 5 lines] > > (those two being the analogous interfaces to what used to be called "batch"
> and "interactive" computing) > These services might then both be clients to the same [quoted text clipped - 3 lines] > This would apply all business rules and constraints related to the data > entered through one means or the other. This means that your constraints have to be applied for ALL data entry. (Just repeating because it's critical.)
> Ignoring workflow types of services > for now, we would add in at least [quoted text clipped - 4 lines] > When we use DBMS's, the Data Validation Service and CRUD Services are > tightly coupled. This gives an advantage of a "lock box" (as Al Gore might
> say) for our data. It is not only impossible for a user to write data with > an improper type, for example, it is also impossible for an IT professional
> without DBA access to do so. Any file system does CRUD. No one must be allowed to update data independent of the "data validation service." Given that we're trying to abstract above the level of a file system, I see no good reason to differentiate the "CRUD services" - that's what the DBMS vendor does.
> However, if you look at software applications, it is common, even required,
> to have a GUI employ such business rules, for example, in giving users a > list of valid codes from which to choose, or even sometimes in finishing the
> data entry based on prior entries (as with Excel) or based on valid entries.
> Additionally, once a set of data is passed to the Validation Service, prior
> to heading to the CRUD Services (whether Validation and CRUD are tightly or
> loosely coupled), these business rules are employed again. The GUI data "validation" isn't really validation in the same sense, though its rules are derived directly from constraints. I'd say the GUI should be "declarative" - rather than validating, give the user no choice to enter bad data. A drop-down list doesn't validate - it's the sole input mechanism for that field.
> Isn't it rather common, for example, for there to be a drop down box locking
> in the user to a valid set of choices, with the specification for this drop
> down box in one place/language and then these choices again validated when > changes are made to the database? We are then taking our industry's data > (specs/code) and duplicating it, thus making it difficult to maintain while
> keeping integrity in this data. Agreed - that's the trouble with multi-tier systems, and it's the reason code generation is so popular these days. Declared constraints give a much, MUCH stronger basis on which to generate code, boilerplate, data access mechanisms, etc. They are for more than just validation - they are are the meaning of your application.
> My point is that with the way that software is written today, quite often > data constraints are coded in multiple places, even multiple languages. With multi-tier systems, they need to be in multiple places, in slightly different forms. But that code doesn't need to be written and maintained separately.
> There are apparently many people as well as exceptional conditions that > might prompt multiple locks on a door, but this need not be the norm, I > would think. Don't think of it as multiple locks on a door - think of it as channeling everyone going to a secure room past the cameras, through the patrolled hallway, etc.
If your data has much value to the organization, then it will be accessed by a wide variety of programmers. Declaring constraints does more than just defend against corruption (though that's extremely important). It also communicates the meaning of the data and the system.
> If the front-end services are written in Java, for example, then I would > prefer that the GUI and the Validation Services make use of the exact same > constraint specs and logic to validate the data against the constraints. Agreed, the specs should be the same. Their implications within a tier are not. Only the GUI needs a drop-down list, for example, but the contents of that drop-down are derived using constraints. If you have an HTML and a Swing interface to your data, you'll need different code for the same drop-down, but it's still the same constraints which define its essence.
Separating essence (the constraints) from accident (the implementation language) is critical, and I hope it's the "next big thing" in software development.
> Even if we tap into the specifications that are placed in a DBMS, taking the
> logic of the DBMS that is used to apply the constraints and using that up > front in the GUI isn't happening in current software applications, in [quoted text clipped - 3 lines] > logic in the GUI is not ready to launch such services, so there is not an > easy means (or perhaps any means) for the application software developer to
> tap into the DBMS's logic. There can and should be. The system catalog should tell the GUI most of what it needs to know.
> This is true for type constraints as well as ref > int constraints where the GUI might need to come back to a user and ask the
> equivalent of "Are you sure you want to delete this folder and all of its > contents?" when a user requests that a parent be removed, for example. That's derivable.
> 2) The application developer cannot easily see the specs and logic applied > by the DBMS to know whether that is the same as what they believe they need
> for this application. So they do not even make the decision on whether the
> constraints they are applying are tighter in this case than what the > database spec permits. It is easier for the application developer to treat
> the DBMS constraints as a black box where they try not to bump into the > limits, but they don't think of them as related. Code generation (or direct higher-level language execution) based on constraints doesn't give them as much of a choice in the matter.
> 3) Reading a database to get constraints when trying to respond quickly to > the end-user is sometimes considered by the application developer to be out
> of the question compared to employing perhaps byte code that runs in an app
> server JVM. The constraints can be read up-front; JSPs and ASPs are dynamically-generated GUIs, and can read whatever they like on the server, before the HTML is streamed to the browser. The app build process can use the DBMS to generate much, so the code won't run slower than something with hard-coded duplicated constraints. The structure of your database is going to change far, far less than its contents.
> One solid design could be to have all of the constraints specified and the > validations applied in a Validation Service that is able to reuse the same > constraint specifications that the GUI uses and that the CRUD routines use,
> all specified outside of the DBMS. I still haven't seen a compelling reason to be outside the DBMS. You're DESCRIBING a DBMS!. It's more than just a file system!
> We could code and run all such services > in a JVM on an app server (rather than the database server). And you were worried about speed before? :-)
> In this case, > employing a file system that might not be considered a DBMS would be > acceptable provided that we still have the door locked. And feel like doing a lot more work.
> This can be done by > permitting CRUD services on the stored data only via the designated > services. Could be, but why?
> The constraint specs and logic need to be moved right up to the front of the
> application -- to the GUI. Depends on what you mean by "moved." Certainly they need to be manifest there.
> I know some have advocated for the database > language to deliver the applications, but that tends to tightly couple way > too much of the code, rather than employing a good design pattern, such as a
> services pattern to the application. I'm unclear as to why a services pattern is "good," and what the nature of the coupling you mention is. I don't see it.
> I'm just thinking out loud and I'm sure something I wrote was not brilliant,
> but help me out here -- isn't this a real issue in application software > development written today? It's the very heart of the matter, and I thank you for raising this discussion.
> I'm suggesting something rather radical -- > decouple the constraint specification and validation logic from the > database. Does anyone agree? Not I. A database is defined by constraints and validation - decouple them, and you have what we already have: file systems. You have what DBMSs were created to abstract above! But again, a good topic.
- Eric
Dawn M. Wolthuis - 03 Mar 2004 19:30 GMT > A good discussion - first my response to Dawn's original thoughts: > [quoted text clipped - 4 lines] > which won't allow violations of them. If you require violations, your > consraints are wrong. Agreed.
<snip>
> Any file system does CRUD. No one must be allowed to update data independent > of the "data validation service." Given that we're trying to abstract above > the level of a file system, I see no good reason to differentiate the "CRUD > services" - that's what the DBMS vendor does. The reason to do so is not to sever the relationship between the database CRUD services and the validation services, but to ensure the validation services need not require that CRUD requests be directed to them. We need to be able to access validation/constraint information and apply it in cases that have nothing to do with the CRUD services. It is typically too tightly coupled with CRUD to do this today.
> > However, if you look at software applications, it is common, even > required, [quoted text clipped - 14 lines] > data. A drop-down list doesn't validate - it's the sole input mechanism for > that field. Agreed.
> > Isn't it rather common, for example, for there to be a drop down box > locking [quoted text clipped - 11 lines] > mechanisms, etc. They are for more than just validation - they are are the > meaning of your application. Agreed.
> > My point is that with the way that software is written today, quite often > > data constraints are coded in multiple places, even multiple languages. > > With multi-tier systems, they need to be in multiple places, in slightly > different forms. But that code doesn't need to be written and maintained > separately. Agreed.
<snip>
> If your data has much value to the organization, then it will be accessed by > a wide variety of programmers. Declaring constraints does more than just [quoted text clipped - 4 lines] > > prefer that the GUI and the Validation Services make use of the exact same > > constraint specs and logic to validate the data against the constraints. Agreed.
> Agreed, the specs should be the same. Their implications within a tier are > not. Only the GUI needs a drop-down list, for example, but the contents of > that drop-down are derived using constraints. If you have an HTML and a > Swing interface to your data, you'll need different code for the same > drop-down, but it's still the same constraints which define its essence. Yup.
> Separating essence (the constraints) from accident (the implementation > language) is critical, and I hope it's the "next big thing" in software > development. This statement is somewhat meaningless to me -- I'm pretty sure you are not saying that you would not encode these constraints in some language, right? Whether that language is declarative, procedural, OO, or whatever, I just don't care -- that's an implementation detail. It is key that the constraints and the application of the constraint logic (interpretation of specs, if that is how it is implemented) be accessible (as "services" the way I view it) to the database CRUD services and any software application UI services, for example.
> > Even if we tap into the specifications that are placed in a DBMS, taking > the [quoted text clipped - 10 lines] > There can and should be. The system catalog should tell the GUI most of what > it needs to know. Yes, in a similar way that it is "telling" the database CRUD services what they need to know.
> > This is true for type constraints as well as ref > > int constraints where the GUI might need to come back to a user and ask [quoted text clipped - 3 lines] > > That's derivable. Yup.
> > 2) The application developer cannot easily see the specs and logic applied > > by the DBMS to know whether that is the same as what they believe they [quoted text clipped - 9 lines] > Code generation (or direct higher-level language execution) based on > constraints doesn't give them as much of a choice in the matter. Of course I would prefer to see it as giving them the full choice in the matter.
> > 3) Reading a database to get constraints when trying to respond quickly to > > the end-user is sometimes considered by the application developer to be [quoted text clipped - 9 lines] > hard-coded duplicated constraints. The structure of your database is going > to change far, far less than its contents. why would we gen code from the DBMS for use with the GUI and not think of gen'ing code for similar applications of constraint logic with the CRUD services? Simply RUN (not gen) the same services for both.
> > One solid design could be to have all of the constraints specified and the > > validations applied in a Validation Service that is able to reuse the same [quoted text clipped - 4 lines] > I still haven't seen a compelling reason to be outside the DBMS. You're > DESCRIBING a DBMS!. It's more than just a file system! Fine -- call the entire application software development environment a DBMS. But I haven't seen such an animal yet, so whether there are, in the future, pre-packaged DBMS's that include this functionality or whether we write all such components today, it is still functionality I would not want to compromise.
> > We could code and run all such services > > in a JVM on an app server (rather than the database server). > > And you were worried about speed before? :-) smiles. Unfortunately whether tuning Java code or Oracle, there is still expertise required to get the desired performance, no matter how much bigger boxes are each year.
> > In this case, > > employing a file system that might not be considered a DBMS would be [quoted text clipped - 7 lines] > > Could be, but why? so as not to duplicate the constraint specs, logic, and engine
> > The constraint specs and logic need to be moved right up to the front of > the > > application -- to the GUI. > > Depends on what you mean by "moved." Certainly they need to be manifest > there. granted.
> > I know some have advocated for the database > > language to deliver the applications, but that tends to tightly couple way [quoted text clipped - 4 lines] > I'm unclear as to why a services pattern is "good," and what the nature of > the coupling you mention is. I don't see it. I understand why -- it was poorly stated and perhaps even poorly thought out -- I'll get back to that someday if words to describe my thoughts are less muddy. Sorry not to have provided clean lines for that one. That isn't even a hypothesis -- it is just intuition, so ...
> > I'm just thinking out loud and I'm sure something I wrote was not > brilliant, [quoted text clipped - 3 lines] > It's the very heart of the matter, and I thank you for raising this > discussion. And thank-you for taking the questions seriously and not responding with any "stupid" accusations, Eric!
> > I'm suggesting something rather radical -- > > decouple the constraint specification and validation logic from the [quoted text clipped - 3 lines] > and you have what we already have: file systems. You have what DBMSs were > created to abstract above! But again, a good topic. My intent with the "decoupling" suggestion is not to make it so that CRUD services do not make use of the validation services (multiple negatives there) but to ensure that these validation/constraint services are available wherever they are needed in software applications so they are not duplicated.
--dawn
Eric Kaun - 04 Mar 2004 13:56 GMT > > Any file system does CRUD. No one must be allowed to update data > independent [quoted text clipped - 10 lines] > that have nothing to do with the CRUD services. It is typically too tightly > coupled with CRUD to do this today. Oh, I agree completely. I was referring to that when I said (somewhere) that languages are not nearly relational enough... if I had relations in my programming language (e.g. for client and non-persistence stuff), I'd be overjoyed.
> > Separating essence (the constraints) from accident (the implementation > > language) is critical, and I hope it's the "next big thing" in software [quoted text clipped - 8 lines] > way I view it) to the database CRUD services and any software application UI > services, for example. Yes, definitely a language, though I think declarative is better because it is easier to prove and reason about, generally shorter, can be optimized more easily, etc.
I would be interested in what you mean by "accessible as services." I think services is an over-used word, and I've yet to see a good definition as such. What differentiates a service from a non-service?
> > Code generation (or direct higher-level language execution) based on > > constraints doesn't give them as much of a choice in the matter. > > Of course I would prefer to see it as giving them the full choice in the > matter. We might be talking about different things - I prefer not to give developers the choice to get around constraints that the designers and analysts have agreed are critical for the business.
> > The constraints can be read up-front; JSPs and ASPs are > > dynamically-generated GUIs, and can read whatever they like on the server, [quoted text clipped - 3 lines] > > hard-coded duplicated constraints. The structure of your database is going > > to change far, far less than its contents. I was unclear here - I meant specifically generate using the constraints defined in the (R)DBMS.
> why would we gen code from the DBMS for use with the GUI and not think of > gen'ing code for similar applications of constraint logic with the CRUD > services? Simply RUN (not gen) the same services for both. Not sure I know what you're saying here, but one reason might be to accomodate technology evolution: "Hey!" says the CEO. "Let's do .NET instead of J2EE! Cool!" Etc.
> And thank-you for taking the questions seriously and not responding with any > "stupid" accusations, Eric! Not a problem...
> > Not I. A database is defined by constraints and validation - decouple > them, [quoted text clipped - 6 lines] > wherever they are needed in software applications so they are not > duplicated. That sounds different than what you were saying before... I thought you were specifically separating validation from CRUD.
What I think we agree on: 1. that validation services (I'd say constraints) are useful animals with implications throughout the app 2. that manually keeping multiple tiers in sync with respect to these contraints is a poor use of time
- Eric
Marshall Spight - 26 Feb 2004 03:14 GMT > I'm just thinking out loud and I'm sure something I wrote was not brilliant, > but help me out here -- isn't this a real issue in application software > development written today? I'm suggesting something rather radical -- > decouple the constraint specification and validation logic from the > database. Does anyone agree? It is a real issue, but I don't agree with the direction you're taking.
I think the optimal solution here is to have the constraints specified declaratively and centrally, but to be able to *additionally* execute them remotely, i.e., on the client. That last part, on the client, is actually the least essential place for validation to occur; it's the only one that can be omitted. It's always essential to have it enforced centrally, on the server, because clients are typically not running on trusted computing bases.
Marshall
Dawn M. Wolthuis - 26 Feb 2004 14:01 GMT > > I'm suggesting something rather radical -- > > decouple the constraint specification and validation logic from the [quoted text clipped - 9 lines] > server, because clients are typically not running on trusted computing > bases. I agree they should be specified "centrally" but don't necessarily agree with what you mean by that. If you mean that they should be specified in a way that is tightly coupled with database storage services, I disagree. I would advocate for use of a design that might include an http server with a jvm as the location for all central rules/constraints processing. I certainly wouldn't locate the validation on the client, but might use the Service UI libraries with Jini for distributed computing -- in that case the compiled (to byte code) validation logic is executed on whichever platform it needs to be.
Additionally, I would argue that the user interface is the MOST important place for knowing the constraints so that the user interface can be as intuitive as possible, preferably written so as not to permit any illegal data from even being entered (but where that isn't feasible, at least letting giving the user the opportunity to fix it immediately). The user interface makes a big difference in the accuracy of the information collected.
I forget who wrote up the story of a major retail chain that built a new store in a certain zip code because their data said a lot of people go from that zip to another to go to this chain. They had to close the store for lack of customers. It turnes out that one of the retail clerks didn't like the constant question of what zip code the buyer was in, so she always entered her own to get past that question. Software usability is key to clean data. The GUI having full knowledge of the business rules (validations & all constraints) is critical to a usable GUI. If the validation should be coded only once (which I agree with), it MUST be where it is usable by the user interface routines. And if we apply all integrity logic up front, so that nothing gets past our validation routines into the database CRUD services, then we don't have to have duplicate logic in the database, right?
Another advantage of this approach is that there is not a need for both a DBA who knows one language and an application developer who knows another to either a) cross train so they are both competent in each or b) wait on each other, often not exactly in a "team player" sort of way when a small change is required in an application that requires a change in the type checking.
--dawn
ben brugman - 26 Feb 2004 17:14 GMT I agree with most that is allready replied. Constraints should be centralised. But the problem is often where and how.
Constraints which can be enforced in the RDBMS system should be enforced there. (If the constraint is not to complex).
Then most applications (server or client) will enforce a constraint again, because the application does not like the exception handling when a constraint of the database goes 'off'. For the GUI it is a lot handier to remark there is no data the 30th of februar (except in sweden) or that data is on a sunday and you can not use that date, than to do the database access and get an error. So most programmers doing the data access still have to know the constraints to anticipate on them, same goes for the Gui programmers. (I hate web applications which first ask all the questions and in the end tell me that I am not allowed to do something because of .....)
Then there are constraints which are more difficult or impossible to implement in the RDBMS, and although the constraints should be 'kept' centralised. These have to be enforced outside the RDBMS. (Some database people say that if a constraint can not be implemented in a database, then it can not be a real constraint. I do not agree with that.)
The advantage of enforcing the constraints in the RDBMS over another centralised system is that it also protects against concurrency and that it is often better scalable than a 'coding' solution.
> I forget who wrote up the story of a major retail chain that built a new > store in a certain zip code because their data said a lot of people go from > that zip to another to go to this chain. Looks like Urban legend to me.
But I know that there is a similar story. A retail chain did not have a outlet in a more posh area, because surveys had shown people in that area would not buy there. But from that area people did use the retail store in other area's because they where not recognised there. This the retail knew from payment information. (The area and the retail store are both know by me).
ben brugman
Bob Badour - 26 Feb 2004 18:19 GMT > I agree with most that is allready replied. > Constraints should be centralised. But the problem > is often where and how. > > Constraints which can be enforced in the RDBMS system > should be enforced there. (If the constraint is not to complex). No such complex constraint exists.
> Then most applications (server or client) will enforce a constraint > again, because the application does not like the exception handling > when a constraint of the database goes 'off'. If the application was written by a competent programmer, it likes it just fine.
> Then there are constraints which are more difficult or impossible > to implement in the RDBMS, and although the constraints should > be 'kept' centralised. No such constraint exists. All you have to do to disprove my contention is demonstrate a single constraint in a well designed schema that contradicts my assertion.
ben brugman - 01 Mar 2004 08:48 GMT > No such constraint exists. All you have to do to disprove my contention is > demonstrate a single constraint in a well designed schema that contradicts > my assertion. 1. (Data outside the one database). If the constraint is over more than one database, there is not ONE database that can check the constraint. If a link is not possible the constraint has to be checked outside the database.
2. (Security is kept outside the database.) Bank, black box identity checking, is done outside the database for several reasons, one is security. (The encryption and the codes are often kept out of the server system).
3. (Situation is to complex (performance) to be resolved inside the database). Content based authorisation. Allthough the authorisation could be handled by joining the 'correct' tables, this would make most queries (and views) prohibitive slow. (The complexity of this situation is far greater if implemented in the database than it is if implemented in the server coding, therefore the server coding is a better place to implement these constraints).
4.(Snapshot isolation does not prevent other transactions to alter read rows). Depending on the implementation of the RDBMS, some checks can not be done within the RDBMS. Read-consistency (Snapshot isolation) in Oracle prevents that within a transaction you can control the change of only read rows. (Another concurrent transaction can change rows that the first transaction has only read and are required to remain the same during the complete transaction).
>>If the application was written by a competent programmer, it likes it just >>fine. I am sorry to say, but I have always worked with real existing and to my opinion competent programmers. (A varying degree of competence I must admit). But here (and that has been written before) I must conclude that by the newsgroup standards my organisation has always worked with NON-competent programmers. It is often said that if a certain situation arrises then the cause must be incompetent programmers. I do not agree with that. But if that is the definition of competent programmers, so be it. Then I have to accomodate for the group of programmers we actually are working with.
Back to the argument, I think that in a lot of environments the communication between the database and the calling code the error handling and exception mechanism could be improved. If no checking would be done by the calling code and all checking would be done by the database. The errorhandling and differentiating between different errors would give rise to more complex coding than actually checking for those conditions before calling the database code. Inserting a 'child' record is often only allowed after the 'parent' record has been read by the application. The database then takes care of concurrency problems (disappearing parent), but in general the application takes care of the fact that the parent should exist before inserting.
ben brugman
Bob Badour - 02 Mar 2004 04:07 GMT > > > Then there are constraints which are more difficult or impossible > > > to implement in the RDBMS, and although the constraints should [quoted text clipped - 5 lines] > > > 1. (Data outside the one database). i.e. One made an arbitrary decision not to manage the data. This does has nothing to do with difficulty or impossibility.
> 2. (Security is kept outside the database.) i.e. One made an arbitrary decision not to manage the data's security.
> (The encryption and the codes are often kept out > of the server system). I suggest you switch to a secure server. Communicating codes from another source only adds an opportunity for interception.
> 3. (Situation is to complex (performance) to be resolved inside the > database). If it is too complex to resolve within the dbms, it is necessarily too complex to resolve anywhere else.
> 4.(Snapshot isolation does not prevent other transactions to alter read > rows). Poor choice of dbms does not relate to difficulty or impossibility.
> Back to the argument, I think that in a lot of environments the > communication > between the database and the calling code the error handling and exception > mechanism could be improved. Or removed entirely.
> If no checking would be done by the calling code and all checking would be > done by the database. A database is a set of facts. It does nothing on its own. A dbms does things. A file processor does things. An application does things. A database just is.
Dawn M. Wolthuis - 26 Feb 2004 20:16 GMT > I agree with most that is allready replied. > Constraints should be centralised. But the problem [quoted text clipped - 13 lines] > (I hate web applications which first ask all the questions and in the end > tell me that I am not allowed to do something because of .....) If I understand you correctly, you are actually IN FAVOR OF DUPLICATING THIS COMPLEX DATA -- locating it once in the database and again in a different language in the business rules used by the GUI. Is that correct? It does seem like you are in the majority, but I just don't see what is gained by coding these rules twice in different languages -- sounds like busy work that leads to complex-to-maintain software and is likely to be out of synch with itself, with app programmers prone to change only the application software and not the dbms constraints when they can do so.
> Then there are constraints which are more difficult or impossible > to implement in the RDBMS, and although the constraints should > be 'kept' centralised. These have to be enforced outside the RDBMS. > (Some database people say that if a constraint can not be implemented > in a database, then it can not be a real constraint. I do not agree with > that.) "Centralized" is fine, but another minor point today (more major perhaps in the future) is that the database could be one of many accessed by the software application, so that a central place for the business logic / constraints / data validation information would more likely be within the software code that is external to one of the databases, right?
> The advantage of enforcing the constraints in the RDBMS over > another centralised system is that it also protects against concurrency and > that it is often better scalable than a 'coding' solution. Let's say that we have no issue with scaling on this or even find it better not to have to go to the database to find that the valid options for radio buttons on a flag are True & False, or that a valid date is one on a calendar. What is an example of the biggest risk related to concurrency? If the spec of a constraint is that a field needs to have a value that comes from a list of stored data, then the database would still be read to get that data -- just not, perhaps, to find out the name of the table that needs to be used. But I suspect you are on to something I'm missing, so please advise.
<snip>
Thanks. --dawn
ben brugman - 27 Feb 2004 16:35 GMT > If I understand you correctly, you are actually IN FAVOR OF DUPLICATING THIS > COMPLEX DATA -- locating it once in the database and again in a different [quoted text clipped - 4 lines] > with itself, with app programmers prone to change only the application > software and not the dbms constraints when they can do so. Input of a date for example, it is fairly common dat the Gui only accepts real date's. The constraint that there is no 30th of februar is checked in the Gui and it is checked again in the RDBMS. Often when entering information, for example lending a book, this can only be entered for an existing lender. The application does not allow to enter a non existing lender and the RDBMS still checks that that the lender does exist.
> "Centralized" is fine, but another minor point today (more major perhaps in > the future) is that the database could be one of many accessed by the > software application, so that a central place for the business logic / > constraints / data validation information would more likely be within the > software code that is external to one of the databases, right? If the software is used for more than one database and the database is not used for different types of software then the coding in the software has to be done only once. But in the past it has been more likely that the software will change than that the database will change.
> Let's say that we have no issue with scaling on this or even find it better > not to have to go to the database to find that the valid options for radio [quoted text clipped - 5 lines] > to be used. But I suspect you are on to something I'm missing, so please > advise. Some constraints are based on data read, if these constraints are done in the database, there is no concurrency issue. If you check the total amount of data in the application the read data has to be locked. (Predicate locking). Oracle for example does not allow predicate locking so there you have to find a way around it. Checking the constraint in the database does not have this problem.
For example two partners have both their own bank account, they are allowed to have an overdraft, but the two accounts together must be kept out of the read. If this constraint is implemented outside the database, depending on the way it is implemented, both partners will be capable of overdrafting their own account if the timing is 'perfect'. A real 'serializable' level does prevent this, but Oracle does not supply a real 'serializable' level. So then you might be better of to implement this constraint in the database, so that when to transactions occure at the same time the constraint is still checked against correctly.
ben brugman.
> <snip> > > Thanks. --dawn Eric Kaun - 03 Mar 2004 16:01 GMT > If I understand you correctly, you are actually IN FAVOR OF DUPLICATING THIS > COMPLEX DATA -- locating it once in the database and again in a different > language in the business rules used by the GUI. Is that correct? Yes, it's a necessity if you want any assurances. If you only have ONE application, with ONE GUI, then you may not have a problem (for now).
And the rules aren't duplicates - not exactly. For example, the GUI sort of needs the DBMS to provide rapidly-changing data like currency exchange rates, right?
The GUI, like an XML doc or a nested data structure, is a single and simplistic view of things. Real things can support multiple views, from various angles, which each have utility for different parts of the business. But even in apps I've written, I've provided GUIs that let them envision the data in different ways (sometimes they wanted to see paint color variants by model, sometimes they wanted to see models by color variant.) Letting one particular GUI govern my rules is putting the cart before the horse.
> It does > seem like you are in the majority, but I just don't see what is gained by > coding these rules twice in different languages -- sounds like busy work > that leads to complex-to-maintain software and is likely to be out of synch > with itself, with app programmers prone to change only the application > software and not the dbms constraints when they can do so. I agree, code them in an RDBMS and then apply simple templates to generate the UIs. We're in agreement.
> "Centralized" is fine, but another minor point today (more major perhaps in > the future) is that the database could be one of many accessed by the > software application, so that a central place for the business logic / > constraints / data validation information would more likely be within the > software code that is external to one of the databases, right? If the application accesses many DBMSs, then the other programs that also access those DBMSs have rules of their own. You have a serious rule coordination problem.
If your app is going to be the only one setting the rules, then why have multiple DBMSs anyway?
If the databases really are separate, then you need conversion logic (predicates) which allow interpretation in one or both directions.
- Eric
Dawn M. Wolthuis - 03 Mar 2004 19:40 GMT > > If I understand you correctly, you are actually IN FAVOR OF DUPLICATING > THIS [quoted text clipped - 7 lines] > needs the DBMS to provide rapidly-changing data like currency exchange > rates, right? There are definitely local constraints for specific applications that would not be specified as global constraints. It would be foolish to specify all global constraints in one language and apply them with one data integrity engine, while coding local ones using different tools entirely, I would think. Local constraints are just global constraints that have some additional conditional logic in them. And we definitely want either type of constraint to be easy to modify, with quality assurance, of course.
> The GUI, like an XML doc or a nested data structure, is a single and > simplistic view of things. Real things can support multiple views, from [quoted text clipped - 3 lines] > model, sometimes they wanted to see models by color variant.) Letting one > particular GUI govern my rules is putting the cart before the horse. You are right that it is not the way that the data is presented that is a constraint. You should be able to put different GUI views with the same constraints.
> > It does > > seem like you are in the majority, but I just don't see what is gained by [quoted text clipped - 6 lines] > I agree, code them in an RDBMS and then apply simple templates to generate > the UIs. We're in agreement. I'll go for that. I don't have a reason to "gen" a lot of intermediate steps -- as someone once said (I'll have to track it down) "the code is the spec" so taking that code and gen'ing byte-code for whatever virtual machine you are running the dbms & UI constraint handling within ... [This is a dance we could do all day, of course, but bottom line is that we agree on the theory and envision different implementation approaches, I think]
I better do some "real work" for a while, so I'm leaving many juicy responses unanswered for now. Cheers! --dawn
Eric Kaun - 04 Mar 2004 13:30 GMT > There are definitely local constraints for specific applications that would > not be specified as global constraints. No, there aren't. If app X deals with a completely different set of data than app Y, then you can easily apply each app's "individual constraints" as global ones, qualified by the subset predicates. And if they deal with the same data, then you can't have X writing data that violates Y's constraints, can you? That would be nonsense.
Do you have an example? If you're talking about simple data transformation, then calculating f(a) requires that a meets its constraints, and if you're storing results in b, that f(a) meet the appropriate constraints for b.
I'm not sure where the need for this division is.
> It would be foolish to specify all > global constraints in one language and apply them with one data integrity > engine, while coding local ones using different tools entirely, I would > think. I agree.
> Local constraints are just global constraints that have some > additional conditional logic in them. Then they're global. For example:
A => X & B=>Y
A and B can be propositions about the same data, and X and Y indicate their implications. A can be for app A, B for app B. The global constraint is their conjunction (the above can be written differently, of course).
> I better do some "real work" for a while, so I'm leaving many juicy > responses unanswered for now. Yeah, me too... hard to tear myself away, though...
- Eric
Dawn M. Wolthuis - 05 Mar 2004 20:41 GMT > > There are definitely local constraints for specific applications that > would [quoted text clipped - 11 lines] > > I'm not sure where the need for this division is. I think we agree that all constraints can be seen as constraints on the data globally by simply narrowing the scope of the constraint. However, this is what I have seen happen -- tell me if you do not think this is the norm:
Data get specified in requirements, we normalize them and stick 'em in schemas, with constraints and all. Then somewhere in the life of the database, changes are made to the initial applications or new applications are built where the database constraints are necessary but not sufficient for taking in the data properly from a user, for example.
Let's say that the developer and DBA get along (there have been cases, as I understand it) and when a constraint allows for all real numbers to two decimal positions and it is determined that the requirements change so we are going to collect only whole numbers, the database constraints are changed. However, even this programmer, when coding a portion of the overall application that is used only by the xyz department where the value of this attribute must be between 2 & 5 simply adds the logic into that application for this restriction rather than having any change made at the database level.
Does this still happen in real software development today or am I stuck in the 80's? If it does, you might think that the answer is to get the programmers to do things right. Ah, but there are ways to provide carots rather than sticks even for developers. If a software developer can handily change a constraint they write but can't easily change one they don't, then guess what?
smiles. --dawn
Eric Kaun - 05 Mar 2004 21:07 GMT > I think we agree that all constraints can be seen as constraints on the data > globally by simply narrowing the scope of the constraint. However, this is [quoted text clipped - 15 lines] > application for this restriction rather than having any change made at the > database level. But if that application then reads data written by a different application, which wrote a 6, what happens? If that application WRITES only 2-5, fine... but the 2-5 range either is a restriction for the data domain or it isn't. What does it mean that the data "must be between 2 & 5"? That introduces a problem if that application and others are modifying the same data. What about the "old" values under 2 or above 5?
> Does this still happen in real software development today or am I stuck in > the 80's? If it does, you might think that the answer is to get the > programmers to do things right. Ah, but there are ways to provide carots > rather than sticks even for developers. If a software developer can handily > change a constraint they write but can't easily change one they don't, then > guess what? Either the meaning of the data has changed (in which case you have to cope with all the data already out there, or convert it), or the application is dealing with a more restricted set of the data (which is fine). It's not a local constraint - it's a query. Maybe I'm just confused by the example.
- Eric
Dawn M. Wolthuis - 05 Mar 2004 21:49 GMT > > I think we agree that all constraints can be seen as constraints on the > data [quoted text clipped - 40 lines] > dealing with a more restricted set of the data (which is fine). It's not a > local constraint - it's a query. Maybe I'm just confused by the example. It is a lousy example, but if you can stick with it this "local constraint" (one that applies to this application related to the database, but not to other applications on the same data) is one that has me putting a dropdown list of values 2, 3, 4, & 5 for the user to choose from. Where did the application get those values? From somewhere in the non-database side of the application, I'm guessing. Why? Because I, the programmer, can control the constraints here so that when 6 becomes a valid value, I can change my dropdown list without anyone else involved. This has to do with human beings and how they work. Programmers will almost always do what they see as having the best quality solution from their perspective and that perspective is often at odds with what a dba would see as the best quality solution. If both could manage validation/constraint specification/logic within the same environment ...
Anyway, it relates to the same point I started with -- that the way we have our system development partitioned today and where we have our constraint processing handled, we end up with multiple specifications of validation/constraint handling. I don't think our systems should be partitioned the way they are, nor do I think the roles on the project team should be partitioned into dba's and programmers.
--dawn
Jerome H. Gitomer - 06 Mar 2004 19:12 GMT [Big Snip]
> It is a lousy example, but if you can stick with it this "local constraint" > (one that applies to this application related to the database, but not to [quoted text clipped - 18 lines] > > --dawn You might partition your system development that way, but some of us use a different model. For example, I use a model based on the concept of noun and verb. Verbs (procedures) operate on Nouns (Data). Data is further divided into two categories: Control Data and Application Data.
Using my model your dropdown list would be Control Data stored in a table and could only be changed when the installations change and validation criteria were satisfied.
My model perpetuates the dba and developer project team for good reason -- Separation of Authority. It reflects the need for commercial enterprises to protect their data from change except through authorized, controlled and monitored processes. Those who control the data should not develop the procedures that operate on the data and those that develop the procedures should not be responsible for the integrity of the data. Experience has proven that it is too risky to do otherwise -- not from a systems point of view, but from the point of view of the enterprise that owns and uses the data.
Jerry
mountain man - 07 Mar 2004 01:20 GMT > You might partition your system development that way, but some > of us use a different model. For example, I use a model based [quoted text clipped - 16 lines] > systems point of view, but from the point of view of the > enterprise that owns and uses the data. Well explained Jerry.
Dawn M. Wolthuis - 08 Mar 2004 01:01 GMT > [Big Snip]
> My model perpetuates the dba and developer project team for good > reason -- Separation of Authority. It reflects the need for [quoted text clipped - 6 lines] > systems point of view, but from the point of view of the > enterprise that owns and uses the data. I think this is the best reason I can see for separating these roles. I don't think it is good enough in most cases, however, to put up with the side-effects such as:
1) code that has duplication of constraints (as is the topic here) 2) meetings and more meetings to have people attempt to agree when they have such different goals which is often the case when putting a dba and a programmer in the same room 3) high cost of ownership of software due to cost of maintenance including changes to "constraints" in multiple locations, multiple people & skillsets required to make changes, etc. 4) and a few others, but I'm boring myself (sorry, I'm guessing you're bored then too).
When the issue is whether to drop a bomb somewhere, then I want checks and balances. When it has to do with painting a wall in my house, I really don't want the speed & quality of committee work on that effort. I guess I'm saying that the risks that the average company with the average information system has to accept in not separating these two roles is worth it in most cases. --dawn
Jerome H. Gitomer - 08 Mar 2004 15:51 GMT >>[Big Snip] > [quoted text clipped - 12 lines] > don't think it is good enough in most cases, however, to put up with the > side-effects such as: Stop and think carefully about what you just wrote. I don't think you would be happy if the IT departments of your bank, your credit card company, your mortgage lender, your automobile loan provider, and whoever you got your tuition loans from didn't have separation of duties and extensive checks and balances built into their systems.
The best way to enforce the universal application of common constraints is to incorporate them into the database. This is because they then appear once and only once in the application and, hopefully, in the institution as a whole.
Allowing individual developers to provide their own constraints (which implies duplication of constraints) will either have an exponential effect on the cost of maintaining the application or make it unmanagable.
> 1) code that has duplication of constraints (as is the topic here) If the constraints are in the database there is no reason for them to be coded in the application.
> 2) meetings and more meetings to have people attempt to agree when they have > such different goals which is often the case when putting a dba and a > programmer in the same room There is one and only one set of constraints that apply to data. It either meets the constraints and is therefore acceptable or it does not and is therefore unacceptable. There are no issues between the dbas and the developers in this regard.
> 3) high cost of ownership of software due to cost of maintenance including > changes to "constraints" in multiple locations, multiple people & skillsets > required to make changes, etc. The cost of software maintenance is negligible in comparison to the cost and damage resulting from the failue to catch an error in, to stick with the financial industry, the savings account processing or loan processing applications in a bank.
> 4) and a few others, but I'm boring myself (sorry, I'm guessing you're bored > then too). [quoted text clipped - 5 lines] > information system has to accept in not separating these two roles is worth > it in most cases. --dawn In closing I will cite one actual case of a bad program constraint that cost millions of dollars. The company in question had many divisions, including a computer division. A new director in looking over the division balance sheets saw that the computer division was losing money and forced the sale of the computer division. When the profits did not increase to the expected level in the following year an investigation revealed that the revenues for computer service had been credited to the service division rather than the computer division and that the computer division really was profitable.
The code constraints in the program used to determine which division received credit for revenue were incorrect. Unfortunately they were buried too deep in the system to have been discovered in time to avoid what proved to be a very costly mistake.
Jerry
Dawn M. Wolthuis - 08 Mar 2004 18:31 GMT > >>[Big Snip] > > [quoted text clipped - 19 lines] > didn't have separation of duties and extensive checks and > balances built into their systems. "Extensive checks and balances" is not the same as "separation of duties" but even with a separation of duties, separating them so as to get multiple sets of constraint logic does not make good sense.
> The best way to enforce the universal application of common > constraints is to incorporate them into the database. This is > because they then appear once and only once in the application > and, hopefully, in the institution as a whole. If one could incorporate all such constraints into the database and use them for all needed purposes from there, that's fine. But you can't, so they end up strewn throughout the code. Do you read code often? Are you not seeing any additional constraint logic for validation, GUI display or other purposes within the code? If not, I suspect your setup is exceptional.
> Allowing individual developers to provide their own constraints > (which implies duplication of constraints) will either have an > exponential effect on the cost of maintaining the application or > make it unmanagable. No disagreement here -- that's was my point up front, but from the other direction. I simply don't want to have the constraints spec'd, respec'd and then spec'd some more as seems the case in code I have seen over the past couple of decades.
> > 1) code that has duplication of constraints (as is the topic here) > > If the constraints are in the database there is no reason for > them to be coded in the application. Perhaps you have missed my argument on this. Many, many constraints show up in the software applications. For example, if the database says that a particular attribute must be validated against a table of possible values, then that constraint is applied by the database when CRUD services are invoked. However, prior to then, a GUI is likely to toss up a drop down list of valid values for data entry so that no values other than the ones the database constraint permits are available to the user. The user never enters an invalid value in this application, then, but the application developer invariably as recoded the constraint that says that this field is validated by that table.
If this is not the case in your situation, how is code that displays the GUI getting such information? Is the software developer really reading this constraint info from the database? Even if they were to retrieve a spec from the database, they would then be reimplementing the logic that interprets and applies that spec unless the database validation logic is able to be invoked for purposes other than CRUD services.
> > 2) meetings and more meetings to have people attempt to agree when they have > > such different goals which is often the case when putting a dba and a [quoted text clipped - 4 lines] > acceptable or it does not and is therefore unacceptable. There > are no issues between the dbas and the developers in this regard. Is that the experience of others on ths list too? I work with small to mid-size businesses. I don't see blatant issues between the two groups -- just a bit of passive-agressive behavior. DBAs build walls and programmers build ladders.
> > 3) high cost of ownership of software due to cost of maintenance including > > changes to "constraints" in multiple locations, multiple people & skillsets [quoted text clipped - 4 lines] > in, to stick with the financial industry, the savings account > processing or loan processing applications in a bank. I certainly believe that good QA is important, I just don't see the current split between the roles of the DBA and programmer as contributing to that quality in a significant way. I do see where it could make it a little more complex for a corrupt developer to steal, but if the production environment is adequately locked down ... Anyway, as I mentioned, the issue of having multiple people with multiple roles so there are checks and balances is the only reason I can see for the currrent strategy of separation of skills and roles between dba and programmer. So, that type of security is still needed, but it doesn't have to require that there be two such groups, nor that they use two entirely different toolsets for constraint logic (for example).
> > 4) and a few others, but I'm boring myself (sorry, I'm guessing you're bored > > then too). [quoted text clipped - 22 lines] > been discovered in time to avoid what proved to be a very costly > mistake. Good story. We, as a profession, seem to have a poor handle on "our own" data that we use (programs, database-spec'd constraints, etc). It is not surprising that this data was "buried too deep".
Cheers! --dawn
mountain man - 08 Mar 2004 20:35 GMT > > [Big Snip] > [quoted text clipped - 29 lines] > information system has to accept in not separating these two roles is worth > it in most cases. --dawn The roles might simply be two hats that one person wears. The principle in question here is the multiple specification of constraints.
When systems get large, or when there is much change happening then these constraints need to be centralised in the DB.
After all, the whole point of a DBMS is to have I/O. Those who dont want to "waste valuable i/o" on performing database constraint or validation checking might consider allocating funds to a get better CPU if the validation i/o is not quick enough for the users.
Some solutions obviously seek middle ground, such as caching the data control-data locally.
Programmers and DBA's usually are guided by what the IT manager may decide. In Australia, the usual practice is that there is one single person responsible - in the end - for decisions. I dont know how you run things in the north of the planet, but there are times for meetings and there are times for decisions.
Your points 1 - 3 above would normally be resolved at the IT management level, and dependent upon the "technical philosophy" of that person, so the decisions and policies get made, etc.
That person should also be able to implement short-term workarounds of changing application code to present a new set of contraints, if the RDBMS cannot be prepared "overnight", and to oversight the change- management of this, and its coordination with the existing operations.
But in the end, change management (of the RDBMS and the applications) is the most expensive item in IT today. And it is for this very reason that the tendency to move all constraints to the RDBMS is growing.
Most things are better change-managed centrally than distributed.
Pete Brown Falls Creek Oz www.mountainman.com.au
Eric Kaun - 08 Mar 2004 20:47 GMT > After all, the whole point of a DBMS is to have I/O. You're not really serious? In other words, as long as it's persisted, it doesn't matter what "it" is?
> Those who > dont want to "waste valuable i/o" on performing database constraint > or validation checking might consider allocating funds to a get better > CPU if the validation i/o is not quick enough for the users. If they don't want to waste time doing that, they probably really don't want to waste time - and money - fixing the inevitable results of data corruption.
- erk
mountain man - 09 Mar 2004 12:38 GMT > > After all, the whole point of a DBMS is to have I/O. > > You're not really serious? In other words, as long as it's persisted, it > doesn't matter what "it" is? I'm not sure I understand what your saying.
What I am advocating is use of the database to hold all control data, validation data, constraints, etc. Where these are no longer held in the application code, this will imply physical RDBMS I/O in order that the application return them to the end user.
The original statement above might be paraphrased 'the whole point in having a DBMS is utilising it.'
> > Those who > > dont want to "waste valuable i/o" on performing database constraint [quoted text clipped - 4 lines] > to waste time - and money - fixing the inevitable results of data > corruption. In the long run, this is correct.
Dawn M. Wolthuis - 09 Mar 2004 03:26 GMT > > > [Big Snip]
> Programmers and DBA's usually are guided by what the IT manager > may decide. In Australia, the usual practice is that there is one single [quoted text clipped - 5 lines] > level, and dependent upon the "technical philosophy" of that person, so > the decisions and policies get made, etc. Yes, one thing IT management does is resolve issues when two or more people on a project team are not in agreement. That's how I know that DBAs and programmers often have very different goals. I'd prefer that everyone building my house have the same goal in actually building a quality house into which I can move.
<snip>
> But in the end, change management (of the RDBMS and the applications) > is the most expensive item in IT today. And it is for this very reason that > the tendency to move all constraints to the RDBMS is growing. ugh! That's what gets us to the topic at hand as it is -- the way that constraints are handled in today's DBMS's, from what I can see, is to employ constraint logic when CRUD services are requested, rather than having the necessary constraints available to the GUI developer, for example (I'm being a broken record, but ...)
> Most things are better change-managed centrally than distributed. Maybe in 1988 the database was a central repository for all data a company cared about, but that isn't the way it is today. The DATA ARE DISTRIBUTED -- so then you get to that "federated" term and all, but what you can do to centralize the constraint logic is create services for use by any and all that need it. How do you ENFORCE the use of these services? You can require the CRUD services for any databases you do "own" to employ these services, for one thing. And then there is the carrot rather than the stick for your GUI and other components -- if it is easier to use the data constraint services than not to and to change them when needed, then we can guess what will happen.
--dawn
Marshall Spight - 10 Mar 2004 23:19 GMT > Yes, one thing IT management does is resolve issues when two or more people > on a project team are not in agreement. That's how I know that DBAs and > programmers often have very different goals. I'd prefer that everyone > building my house have the same goal in actually building a quality house > into which I can move. This viewpoint doesn't scale. It requires every engineer working on a given system understand all the development and operating requirements of every part of the system. That only works if the system is small enough so that the engineers can invest the time needed to do that, and also do their job.
If they don't have enough time to do that, then they will have localized responsibilities and localized concerns that will sometimes conflict with those of other engineers working on the system, even if everyone agrees that the overall goal is to maximize the value of the company.
Marshall
mAsterdam - 28 Feb 2004 08:01 GMT > Constraints should be centralised. This is to general. I look at it this way: There are more sets of constraints, each with a different purpose. The contraints at the database serve to protect the integrity of the managed set of data. The constraints at the user-interface on the other hand serve to assist the user in providing the data he needs to provide in order to achieve his goal.
So here we have two sets of constraints. Let's call them 'D' and 'U'. You can look at a set of constraints as defining all possible combinations of a set of data, i.e. as a type.
Is U defining a subset of D? A superset? The intersection of the sets defined is the relevant set for the feared constraint-redundancy.
Now when there are lots of data and - relatively - just a few constraints ("Large databanks"), I won't worry about that redundancy. Bank ID's provide a good example for this.
Just my 2 Eurocents.
ben brugman - 01 Mar 2004 09:38 GMT > > Constraints should be centralised. > > This is to general. I look at it this way: There are more sets of > constraints, each with a different purpose. Yes, I do agree here. The sets do not have to be completely disjunct.
And most discussions in this 'newsgroup' only concern constraints which do apply to the database. (This is the set of data which is still present, after the machine is shut down. Wel the constraints 'are' not present, but the data still conforms to that set of constraints).
Then the centralising of the constraints. I do not think the implementation can be 'totally' centralised. But a nice situation would be that the model with all constraints is somewhere centralised in some form. (Documentation or something). This set of constraints has to be implemented and the implementation can be redundand. (In the code and in the database.)
Example there is an implementation of 'child' and 'parent' in the database, the database holds the constraint the 'parent' must exist everytime a child exists. Often the application is build in such a way that even if the constraint is dropped in the database, the application only can 'make' data in accordance with the constraint. Often the application can not 'violate' a constraint if it is the only application.
My point : All constraints should be 'included' in the model 'documentation'. (This is considered centralised.) Implementation of constraints should be done as close as possible to the data, preferably in the RDBMS. (The implementation can be centralised, but I think this is not realistic for all situations). Server and application coding often do implement a part of the constraints implemented in the database. (The risc here is that different software has different implementations of the 'same' constraint). Then some constraints are only implemented outside the RDBMS, the risc here is different implementations of the 'same' constraint and no central enforcement (RDBMS) of the constraint and over time the constraint can be changed (switched on or off or can be different). This gives the risc that even if a constraint is implemented that there is data not in accordance with that constraint.
( Some object modellers believe that all constraints should be implemented in 'their' object code. The implementation should be centralised for each object. The advantage being that there is a centralised constraint and that then there is a freedom to implement the data of the object independend of the constraint in any data-repository which can hold the data.
In concept I do agree with those views. But from past experience I totaly disagree with that concept. Why ? Environments change more than databases. Constraints are more difficult to enforce in this way. There is no hard link between the constraint itself and the existing data.
Implementing the constraint in an implemented database, the link is hard. Because to use the data you need the specific RDBMS and therefore can enforce the constraints. There are no databases which can be accesed by RDBMSses from different vendors. )
> The contraints at the > database serve to protect the integrity of the managed set of data. [quoted text clipped - 9 lines] > The intersection of the sets defined is the relevant set for the > feared constraint-redundancy. Even if a constraint if only of the type 'D' or of the type 'U' it can be implemented more than once. So I do not think this is the problem of the intersection. For example the 'U' set can be implemented more than once if more than once if serveral applications work with the same data.
> Now when there are lots of data and - relatively - just a few > constraints ("Large databanks"), I won't worry about that redundancy. > Bank ID's provide a good example for this. The risc of constraint redundency is not that it is implemented more than once, the risc is that there can be different implementations of the same constraint. (For example a new application using the same data but not implementing the constraint at all). (I do not know what you mean by the redundancy of Bank ID's).
Thanks for your participation, ben brugman
> Just my 2 Eurocents. Marshall Spight - 01 Mar 2004 16:15 GMT > And most discussions in this 'newsgroup' only concern constraints > which do apply to the database. It's a database theory newsgroup, so this should not be a surprise.
> (This is the set of data which is > still present, after the machine is shut down. Ack! That's not a good definition! RDBMSs are primarily about data management, not persistence. They happen to be a good place to implement persistence as a service, but there are other things that can do that too just as well. But those things can't do data management.
The idea that databases are primarily about persistence is toxic myth. You can have persistence without data management, and you can have data management without persistence.
>Well the constraints > 'are' not present, but the data still conforms to that set of constraints). [quoted text clipped - 5 lines] > This set of constraints has to be implemented and the implementation > can be redundand. (In the code and in the database.)
> Example there is an implementation of 'child' and 'parent' in the database, > the database holds the constraint the 'parent' must exist everytime a > child exists. Often the application is build in such a way that even if > the constraint is dropped in the database, the application only can > 'make' data in accordance with the constraint. Often the application > can not 'violate' a constraint if it is the only application. It is important to differentiate between application constraints and data constraints. Something that is only necessary for one particular application and not for others is not a data constraint, and does not have the same centralization requirements.
> My point : > All constraints should be 'included' in the model 'documentation'. > (This is considered centralised.) Not by me it isn't.
> Implementation of constraints should be done as close as possible > to the data, preferably in the RDBMS. [quoted text clipped - 4 lines] > different software has different implementations of the 'same' > constraint). That's why centralization is essential.
Let's put it this way: you centralize the constraints for the same reasons you centralize the data. In fact, the constraints *are* a part of the data; that's why you can't separate them.
> Then some constraints are only implemented outside the RDBMS, > the risc here is different implementations of the 'same' constraint and > no central enforcement (RDBMS) of the constraint and over time > the constraint can be changed (switched on or off or can be different). > This gives the risc that even if a constraint is implemented that there > is data not in accordance with that constraint. This is another problem with centralization solves.
Marshall
ben brugman - 01 Mar 2004 17:08 GMT > > And most discussions in this 'newsgroup' only concern constraints > > which do apply to the database. [quoted text clipped - 8 lines] > implement persistence as a service, but there are other things that > can do that too just as well. But those things can't do data management. The 'definition' concerns the database, not the RDBMS or not the (RDBMS-)instance. So for the database the definition 'holds'. For a RDBMS or a (RDBMS-)instance you are correct. The database part is only the persistence part which can be recovered after a crash. (But I must admit that I often use the term database where RDBMS should be used. In the above case I didn't).
> The idea that databases are primarily about persistence is toxic myth. > You can have persistence without data management, and you can > have data management without persistence. Databases are about persistence. RDBMSses are about the system around the database.
> It is important to differentiate between application constraints and > data constraints. Something that is only necessary for one particular > application and not for others is not a data constraint, and does > not have the same centralization requirements. Constraints are requirements of the 'user', or the 'one' who is defining the system. At specification time of the requirements it does not have to be clear if a constraint will be implemented in a database or in any other way. So a specification time for some constraints it is not possible to differentiate between appliation constraints and data constraints. When implementing the differentiation has to be made. But before implementing it would be nice to have all constraint somewhere centralised.
> > My point : > > All constraints should be 'included' in the model 'documentation'. > > (This is considered centralised.) > > Not by me it isn't. Ok.
(Refraise : The requirements should give or dictate the constraints. The requirements should be a complete set. It would be nice that the set is not scattered over several systems/documents/buildings).
> > Implementation of constraints should be done as close as possible > > to the data, preferably in the RDBMS. [quoted text clipped - 6 lines] > > That's why centralization is essential. No it is not, we run a legacy database only capable of implementing very rudimentair constraints and constraints have to be implemented into the coding. We have survived in the marked without this essential centralization. So centralization is not essential for survival.
And you can see from my other mails that I have given some examples where constraints outside the RDBMS is the 'best' option. And also in one of the mails I described that some people call constraints outside the RDBMS as non database constraints. I do not agree with that. But I must agree it is debatable when a constraint is a database constraint and when it is not.
> Let's put it this way: you centralize the constraints for the same > reasons you centralize the data. In fact, the constraints *are* [quoted text clipped - 8 lines] > > This is another problem with centralization solves. No it does not. If the centralization is done in application code it does not solve the problem. And if different database venders are used it does not solve the problem either. (Two databases can not be centralised in one RDBMS). If the constraints are implemented with triggers than this problem is not solved either.
I agree that referential constraints and domain constraints should be implemented in the RDBMS (if possible, using two databases does not allow for referential constraint checking within one RDBMS). But a lot of constraints (which I still think of as database constraints) are implemented outside the RDBMS.
ben brugman.
> Marshall Bob Badour - 02 Mar 2004 04:20 GMT > > "ben brugman" <ben@niethier.nl> wrote in message > news:4043049f$0$269$4d4ebb8e@read.news.nl.uu.net... [quoted text clipped - 24 lines] > > Databases are about persistence. A database is a set of facts. It might persist, and it might not.
> > It is important to differentiate between application constraints and > > data constraints. Something that is only necessary for one particular [quoted text clipped - 3 lines] > Constraints are requirements of the 'user', or the 'one' who is defining > the system. Nope. They are limitations imposed on the user to keep the fallible from doing what they do best.
Eric Kaun - 03 Mar 2004 16:16 GMT > > > Ack! That's not a good definition! RDBMSs are primarily about data > > > management, not persistence. They happen to be a good place to [quoted text clipped - 6 lines] > > A database is a set of facts. It might persist, and it might not. Agreed. And if we had languages that supported relations for all processing, we wouldn't be in the O-O mess we're in now. It would be just amazingly useful to be able to perform relational operations on data in memory... and then, at the end when satisfied, persist it.
(ben wrote)
> > Constraints are requirements of the 'user', or the 'one' who is defining > > the system. > > Nope. They are limitations imposed on the user to keep the fallible from > doing what they do best. I'd say both, actually. Certainly they prevent errors, but they also serve to communicate (to developers and DBAs and such). And they document (in executable form!) the meaning of the system's data.
- Eric
Marshall Spight - 11 Mar 2004 18:15 GMT > Agreed. And if we had languages that supported relations for all processing, > we wouldn't be in the O-O mess we're in now. It would be just amazingly > useful to be able to perform relational operations on data in memory... and > then, at the end when satisfied, persist it. Okay, that sounds great and all, but how's it going to work? For one thing, the operations shouldn't be limited to what's in memory any more than the DBMS has that restriction. And what about things like keys? Let's say we want to migrate some hierarchically-stored data into our RDBMS, how do we make that work? We need to have the foreign keys, which means we need to allocate k
|
|