Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / General DB Topics / DB Theory / November 2008

Tip: Looking for answers? Try searching our database.

Why is database integrity so impopular ?

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
eric.bouchardlefebvre@gmail.com - 05 Oct 2008 19:30 GMT
Hello,

When time comes to build transactional databases (as opposed to data
wharehouses), I belong to the school that STRONGLY believe in
normalizing data with high integrity mechanisms.  I know all the
performance cons but IMHO, pros largely overwhelme.

It amazes me, though, how many systems rely on the application to
manage data integrity.  I work as IT director for a large-size
manufacturer and *none* of our applications use integrity.  And I am
talking here of ERP and other mission-critical systems.

In fact, I had rarely open a database properly normalized and
inforced ... and I have been working with databases for over 10 years,
mostly in sectors where lack of integrity can result in dramatic
consequences.

What is wrong with modern DB design approaches?  And what's the point
of using a big relational DB without the benefits of integrity and
normalization?

Thank you,
EBL
Alfredo Novoa - 05 Oct 2008 23:56 GMT
Hello Eric,

> It amazes me, though, how many systems rely on the application to
> manage data integrity.  I work as IT director for a large-size
> manufacturer and *none* of our applications use integrity.

I supose you are new in the position ;-)

> In fact, I had rarely open a database properly normalized and
> inforced ... and I have been working with databases for over 10 years,
> mostly in sectors where lack of integrity can result in dramatic
> consequences.

I have the same experience, and I never have found an IT director who
thinks like you.

> What is wrong with modern DB design approaches?  And what's the point
> of using a big relational DB without the benefits of integrity and
> normalization?

The problem is not in modern database theory, the problem is that most
developers don't know the foundations of their profession and common
sense is very uncommon.

In my country most people who develop business systems never read a
database theory book. The few ones who studied a database course in
the university never understood database theory very well at all, and
they forgot almost everything just on the end of the final
examination.

The textbook we used didn't have any chapter devoted to database
integrity, only a few pages about the poor SQL declarative integrity
support, and not covered with exercices. The whole Relational Model
was dispatched in five hours or so, and taught with many mistakes and
misconceptions.

The vendor's training materials are usually even worse.

And we also have all that abject oriented programming stuff saying
that RDBMS's are nothing but silly and cumbersome register buckets.

It is not only integrity and normalization. Most developers I know are
not able to write non trivial queries and they load the data in the
applications using simple queries, make several iterations on the
registers, and send the data back to the DBMS.

In the business software industry, technical incompetence is the norm,
and the develpment tools we have are awful.

Regards
eric.bouchardlefebvre@gmail.com - 06 Oct 2008 15:16 GMT
> Hello Eric,
>
[quoted text clipped - 3 lines]
>
> I supose you are new in the position ;-)

Yes :-) but I have been working as a developper for many years before.

And as a developper, even when doing team OO programming, it was clear
that integrity was part of the database.  This did not mean that
applications could not use their own integrity rules, it only meant
that the database was ultimately setting the rules.  Additionaly, it
is now comon to see databases shared among many applications ... it's
therefore common sens to concentrate integrity at the data layer.

When I moved from a developper to a manager position, I had to work
with commercial applications I had not developped.  Eventually - and I
guess it's everywhere the case - we had to share data from those
databases to new internally built applications.  That's when I first
noticed poor data normalizations and complete absence of integrity.
Later on, errors at some ERP applications also put in evidence the
absence of integrity.

The strangest thing about this is the fact programmer use complex
RDBMS as a simple persistent data environemtn.  If I had to developp
an application with all the integrity concentrated  in the
application, I would use simple text files to store data.  The result
would be the same in terms of security but I would obtain more
performance.
patrick61z@yahoo.com - 31 Oct 2008 21:16 GMT
> Hello Eric,
>
[quoted text clipped - 30 lines]
>
> Regards

My complaint is that there's too much work to be done and not enough
money to pay for the more expensive relational purists to implement
stuff. What the industry needs is to split up the database into
components so that implementers can decide how to persist their
applications. Sometimes you just want to write data reliably without
worrying about what normalization level you're at.

If and when a project needs fully relational recordkeeping, this
should be available too for those projects that can budget for it.
paul c - 06 Oct 2008 03:21 GMT
> Hello,
>
[quoted text clipped - 19 lines]
> Thank you,
> EBL

Integrity is rare in general.  When it comes to db's, even rarer, partly
because of pressures on naif's to deliver faster on machines that would
be faster than needed if only people  took the time to determine the
essential requirements of systems.
Tony Toews [MVP] - 06 Oct 2008 03:43 GMT
>What is wrong with modern DB design approaches?  And what's the point
>of using a big relational DB without the benefits of integrity and
>normalization?

Probably because the talented people at the software vendors are the sales droids.
The system was originally designed by someone who didn't have a clue and they managed
to sell a few copies and kept on going.   Any competent database or developers or
systems analyst types likely didn't stay very long either.   And management is
cracking the whips on generally mediocre people to get the product out as quickly as
they can.  "To heck with bugs.  It's features that count."

So how's that for generalizing?  <smile>

I really enjoy reading the WTF and other similar blogs for such stories.  And then
there's comp.risks where mistakes have killed people.

Tony

Signature

Tony Toews, Microsoft Access MVP
  Please respond only in the newsgroups so that others can
read the entire thread of messages.
  Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
  Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/

Alfredo Novoa - 06 Oct 2008 11:58 GMT
Hello Tony,

> The system was originally designed by someone who didn't have a clue and they managed
> to sell a few copies and kept on going.   Any competent database or developers or
> systems analyst types likely didn't stay very long either.

I know cases like this, and also cases where the systems were
originally developed using DBase or Access and ported to a DBMS using
an automatic database importer without any redesign.

Regards
Tony Toews [MVP] - 07 Oct 2008 03:11 GMT
>I know cases like this, and also cases where the systems were
>originally developed using DBase or Access and ported to a DBMS using
>an automatic database importer without any redesign.

Let me add that my databases built in Access are good designs and have relational
integrity. Although I've seen lots that aren't.  <sigh>

Tony
Signature

Tony Toews, Microsoft Access MVP
  Please respond only in the newsgroups so that others can
read the entire thread of messages.
  Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
  Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/

Daniel Pitts - 06 Oct 2008 18:51 GMT
> Hello,
>
[quoted text clipped - 19 lines]
> Thank you,
> EBL
I think that part of the problem is DB design and Application design are
really different types of abstraction.  For application programmers,
dealing with DB constraints is tedious.

The truth is that whenever your "Application" calls for persistence, it
is no longer just an "Application"; it has become a "System".  System
design is a higher level abstraction.

Moving from Application design to System design is /almost/ a natural
progression, and many engineers traverse the barrier without ever
realizing and without learning the other aspects of System design. This
includes learning proper DB design.

I admit that I fell into that category for some time. My background has
been Application design, but I've started to appreciate the concept of
constraints at ever level of the System. I even sometimes wish that the
DB could do more validation than it does, even if it makes things a
little more "tedious".  In this case, tedious just means the hard
problem is already solved.

Signature

Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>

David BL - 07 Oct 2008 09:05 GMT
On Oct 7, 1:51 am, Daniel Pitts
<newsgroup.spamfil...@virtualinfinity.net> wrote:

> I think that part of the problem is DB design and Application design are
> really different types of abstraction.  For application programmers,
> dealing with DB constraints is tedious.

I assume you mean it's tedious when done in application code.   It's
easy when the integrity constraints are defined declaratively and
enforced by the DBMS.

> The truth is that whenever your "Application" calls for persistence, it
> is no longer just an "Application"; it has become a "System".  System
> design is a higher level abstraction.

So is notepad.exe a "system" because it allows for persisting text
files?  ;)

The need for a formal data model has to do with the need to encode
complex values, irrespective of whether persistence is needed.   One
can have persistent or transient data, just as one can have persistent
or transient finite state machines.   Of course mostly we want data to
persist and state machines to be transient, and there are reasons for
that - some quite subtle, such as the difficulty of defining
consistent cuts on distributed state machines, but it is hardly a
definition of the distinction between data and state machine.

A common modern view of a large distributed system, is of interacting
applications that communicate either synchronously or asynchronously
such as by using web services.  The various databases in the system
take a back seat to this application-centric view point.

However I suspect most of the data theorists in this ng prefer the
idea that state machines and data are kept apart at ever increasing
scales in the "system" - so even at an "enterprise" level there is a
clean separation between applications and the enterprise wide data.

I like to define "data-centric architecture" according to the 4th item
in the following article:

http://en.wikipedia.org/wiki/Database-centric_architecture

Another crime perpetrated by ignorant application programmers is to
heavily use application defined messaging protocols to build
distributed systems.   It is usually much easier for applications to
communicate indirectly via a shared database.
patrick61z@yahoo.com - 31 Oct 2008 20:58 GMT
On Oct 6, 12:51 pm, Daniel Pitts
<newsgroup.spamfil...@virtualinfinity.net> wrote:
> eric.bouchardlefeb...@gmail.com wrote:
> > Hello,
[quoted text clipped - 43 lines]
> --
> Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>

I think that relational database theory is too limitting for some
applications. I believe that the modern database needs to be split
into component parts so that not everyone has to be saddled with the
relational part.
paul c - 31 Oct 2008 23:07 GMT
> On Oct 6, 12:51 pm, Daniel Pitts
> <newsgroup.spamfil...@virtualinfinity.net> wrote:
[quoted text clipped - 44 lines]
> into component parts so that not everyone has to be saddled with the
> relational part.

No offence, but I think you are way off-base to associate any version of
the rm as being aimed at transactions in any way, neither theory (that's
being charitable as much of the transaction writing has no formal
theory) has much connection with other, at least so far.  Suspect such
thought has to do with marketing bumpf from various product publicizers,
indoctrination seems to be involved.  Still, I must say that a
declarative model of some version of transaction interests me though
I've never seen anybody try to explain one.  But so far, you haven't
even hinted at that notion  as far as I can tell.  I think the happy
cirsumstance is all-important when trying to transpose one technique to
a new purpose and the first insight must be to recognize the circumstance.
Bob Badour - 01 Nov 2008 00:07 GMT
>> On Oct 6, 12:51 pm, Daniel Pitts
>> <newsgroup.spamfil...@virtualinfinity.net> wrote:
[quoted text clipped - 56 lines]
> cirsumstance is all-important when trying to transpose one technique to
> a new purpose and the first insight must be to recognize the circumstance.

Paul, please, who cares about ignorant opinions that have been asserted
with no support whatsoever? So the guy boldly steps forward to show off
his ignorance. Do you really have to reply to it?
paul c - 01 Nov 2008 03:57 GMT
...
> Paul, please, who cares about ignorant opinions that have been asserted
> with no support whatsoever? So the guy boldly steps forward to show off
> his ignorance. Do you really have to reply to it?

You're probably right.
patrick61z@yahoo.com - 01 Nov 2008 13:55 GMT
> patrick...@yahoo.com wrote:
> > On Oct 6, 12:51 pm, Daniel Pitts
[quoted text clipped - 57 lines]
> cirsumstance is all-important when trying to transpose one technique to
> a new purpose and the first insight must be to recognize the circumstance.

Maybe the A of ACID. For instance, what would I use if I want every
record to store or none, but otherwise wasn't too picky? It could be a
filesystem question I guess.

Look at the web market for instance, every day I run across a web
application that has lost contact with its sql server way before the
webserver has lost contact with its visiting browser. I'm just saying,
sometimes you really want every clock cycle you can get and still want
atomicity of your persistance state (well, maybe could be just me). If
you don't think relational integrity of your data model burns oil then
maybe, but in that case, show me the money and the rdbms that I can
install on my very marginal rented webspace.
Walter Mitty - 01 Nov 2008 14:57 GMT
> I think that relational database theory is too limitting for some
> applications. I believe that the modern database needs to be split
> into component parts so that not everyone has to be saddled with the
> relational part.

I think you might have meant  "the relational data model" where you wrote
"relational database theory".   True?

Could you give an example?

If you design part of your database to be relational, and another part to be
something else that's not conformant to relational,  how do you use data
from the two parts together in an integrated fashion?  Or is that too much
to ask?
patrick61z@yahoo.com - 01 Nov 2008 15:18 GMT
> <patrick...@yahoo.com> wrote in message
>
[quoted text clipped - 14 lines]
> from the two parts together in an integrated fashion?  Or is that too much
> to ask?

Well in theory everything I want could be handled by a relational
database and normalized tables. In practice, some situations are too
expensive to be normalized for me as currently implemented. The ideal
situation would be for my database product to offer fully normalized
looking access to what are essentially repeating groups or cached
direct access, ie., if I could have 'select whatever from wherever
where key = someinteger' be actually and guaranteed translation to
'return whereever[someinteger * elementsize]' then sure, I'll just put
up with a wordy dialect that translates to an array reference.
Sometimes I want to be that precise in how many computer cycles I've
decided to burn for that particular operation. I'm a convenience store
clerk with rented webspace for my forum for instance. I'm spamming
digg with my relational dbms using blog entry and I don't want my $10
a month server to be 'buried'.

Yes, I have this for ram, but can I set array elements 1, 3, and 5 to
value 'something'' and have all three assignments either be persisted
or not and no partial state? Can todays rdbms offer me this without
even using a hash, just indexing into adjacent locations of storage?
Could I do it with sql today, passing an out of band pragma that says
when I'm describing a normalized table that replaces the repeating
groups I loved so much, can I still tell it that "I'd really like this
puppy to be at random access speed into contiguous storage localized
in reference to the current left side of the join?"
patrick61z@yahoo.com - 01 Nov 2008 18:31 GMT
> <patrick...@yahoo.com> wrote in message
>
[quoted text clipped - 14 lines]
> from the two parts together in an integrated fashion?  Or is that too much
> to ask?

This is probably where I proposed ditching relational theory
inadvertently. I think there is a role for relational theory, just
because I'm not immersed into it at the level others are doesn't mean
I'm telling _them_ to ditch it.

I'm saying that I could think of cases where I would avoid relational
technology for particular benefits, chiefly in terms of certain costs
of running it.
Walter Mitty - 07 Oct 2008 15:27 GMT
> Hello,
>
[quoted text clipped - 19 lines]
> Thank you,
> EBL

Let's separate out the issues of data normalization and data integrity
enforcement, even though they are clearly interrelated.

For a period of about 10 years, I was a database consultant, called in to
fix databases which were broken beyond the ability of local talent to
repair.  This could mean "broken" at a very mechanical level, where the DBMS
was refusing to allow connections to the database due to DBMS detected
corruption.  It could also mean "broken" at a very high level, where two
copies of the same database had diverged in their model of the subject
matter domain,  and the data couldn't be used together. Of course, the
nature of my work was such that I had a bias towards observing databases
that had been badly designed or badly managed.

But, nevertheless,  here's the conclusions I draw.  An enormous percentage
of the mission critical databases are intially designed by programmers who
are highly competent at programming, but have no formal training in database
design and no practical experience in database management.  Highly competent
programmers often have the misconception that database design and
administration involve a fairly trivial subset of skills they already have.
By the time they find out differently, it's already too late.

Programmers tend to want to enforce data integrity at the application level,
because that's where their strength lies.  Certainly well written
application programs will not be needlessly careless about data integrity.
There are certain cases where it's wise for the application program to
enforce some constraints,  even if the database will later enforce the same
constraints all over again.  The NOT NULL constraint comes to mind as an
example.

But the failure to enforce data integrity at the database level, together
with the failure to enforce standards of data integrity across all the
application programs, often produces a situation where data integrity falls
by the wayside.  The people who have had a chance to learn from such a
bitter experience often never get another chance to design a database from
scratch.
Roy Hann - 09 Oct 2008 09:38 GMT
> Hello,
>
[quoted text clipped - 16 lines]
> of using a big relational DB without the benefits of integrity and
> normalization?

I am afraid I have to confess (not for the first time) that I am a
prostitute.  I work for the kind of company that builds the kind of
systems you use.

If we ever submitted a bid that included a plan to do a lot of
up-front analysis and up-front database definition (including all the
constraints we could discover, but minus the ones SQL can't even
describe), before we ever delivered a line of executable application
code, we would lose out to the company that didn't bother with all that
fancy nonsense.  And that would be true even if our overall bid was
lower.

This is not speculation.  It happened repeatedly until we were starved
into submission.  The only very slight satisfaction I have is that one
of our present customers once rejected one of our early proposals, for
exactly the reasons I listed, and the IT director was kind enough to
tell me recently that if he had to do that project over again he'd do
it the way we had offered to do it.

But in general, you can't sell pearls to pigs.  (And if you're not a
pig don't expect to buy pearls from a pig-swill merchant.)

Signature

Roy

Walter Mitty - 09 Oct 2008 13:14 GMT
> This is not speculation.  It happened repeatedly until we were starved
> into submission.  The only very slight satisfaction I have is that one
> of our present customers once rejected one of our early proposals, for
> exactly the reasons I listed, and the IT director was kind enough to
> tell me recently that if he had to do that project over again he'd do
> it the way we had offered to do it.

I can echo that.  Right down to the IT director who later said that my
approach was the right one, much later.

That suggests a new discussion about the "sadder but wiser client".  There
must be a lot of stories out there about that.
eric.bouchardlefebvre@gmail.com - 09 Oct 2008 15:48 GMT
> > This is not speculation.  It happened repeatedly until we were starved
> > into submission.  The only very slight satisfaction I have is that one
[quoted text clipped - 8 lines]
> That suggests a new discussion about the "sadder but wiser client".  There
> must be a lot of stories out there about that.

Well, that point bring the question of IT Directors competences.

I start from the assertion that when a company is ready to buy a big
software solution (like an ERP system), IT Managers are part of the
decision.  It is then their reponsability to make database integrity a
requirement, to explain to the financial managers the purpose of such
a requirement and to be able to argue with vendors or integrators that
will try to minimize the importance of integrity.

The case here is classic : 5 years ago (I was not working here) the
company decided to buy an ERP solution.  At that time, the IT director
had a financial profile with non-technical computers knowledge.  IT
team was basically made of clerks playing with Excel pivot reports.

The decision to choose the ERP was therefore exclusively based on the
trust they had with some outside consultants.  Consultants, most of
the time, specialize in one system and stick to it even if it has
major problems, like lack or absence of database integrity.

Conclusion : it's always a question of competence at the IT
departement level.  If IT managers were all the time chose among
technical personel, systems would improve because IT managers would
force vendors to have a technological thinking.

But most of the time, IT managers are former business managers (or
people holing "information systems management" diploma) thinking their
working knowledge of computers gives them aptitudes to take IT mission-
critical decisions.
DBMS_Plumber - 10 Oct 2008 20:36 GMT
On Oct 5, 11:30 am, eric.bouchardlefeb...@gmail.com wrote:

> What is wrong with modern DB design approaches?  And what's the point
> of using a big relational DB without the benefits of integrity and
> normalization?

For what it's worth .....

The situation, I think, is actually getting better. We can and
probably always will find examples where the folk in charge of the IT
were clueless and the result was a complete debacle.

But - I recently spent two days in the company of two very senior IT
people from a very successful Web Company. At one point in the
meetings -- it was a conference about the challenges posed by the
kinds of very, very large databases generated by sensor data -- an
unrepentant OO-DBMS type proclaimed that the only way to make this lot
work was to tie the data management very closely to the programming
language in order to overcome the dreaded impedance mismatch and you
know how this story goes. . .

 Our 'senior IT guy' stood up and basically called the OO-DBMS guy a
moron. He then gave a 5 minute rant that might have come out of the
mouths of any of our more virulent relational bigots. Programmers --
he asserted -- were very, very, VERY bad for the business. He wanted
naked access to structured data by the company's analysts. Central to
that? Get the data integrity right.

  Later, over a cocktail or two, we chatted. In his shop EVERY table
has a declared primary key, every column without NOT NULL and DEFAULT
must have a documented reason for the deviation. Their code review
check list includes questions like "If your code has nested cursor
declarations and looping, please explain why in the comments." Good
Java people are cheap and plentiful. Good SQL people are rare, and he
pays them a lot. His set of interview questions includes "What can you
tell me about relational algebra?"

  An exception? Mebbe. But he's really, really successful. And an
aggressive (abrasive) advocate of relational thinking.
Jon Heggland - 13 Oct 2008 13:24 GMT
>    Later, over a cocktail or two, we chatted. In his shop EVERY table
> has a declared primary key, every column without NOT NULL and DEFAULT
> must have a documented reason for the deviation.

It is considered a deviation not to have a default? Why is that? I would
have thought that having defaults for every column would encourage
sloppiness.
Signature

Jon

DBMS_Plumber - 13 Oct 2008 17:17 GMT
> >    Later, over a cocktail or two, we chatted. In his shop EVERY table
> > has a declared primary key, every column without NOT NULL and DEFAULT
[quoted text clipped - 5 lines]
> --
> Jon

If you mandate that a column cannot contain a NULL, setting a DEFAULT
means that when a programmer legitimately doesn't have a value for the
column they aren't obliged to put in there the first thing that
springs to mind.

I'm sure there would be cases where you could say NOT NULL but not
provide a DEFAULT. The point to me was that this shop is serious about
their integrity and this seriousness is manifested in their
engineering policies and procedures.
paul c - 13 Oct 2008 20:25 GMT
>>>    Later, over a cocktail or two, we chatted. In his shop EVERY table
>>> has a declared primary key, every column without NOT NULL and DEFAULT
[quoted text clipped - 14 lines]
> their integrity and this seriousness is manifested in their
> engineering policies and procedures.

Suggest that a DEFAULT value could also be an integrity mechanism,
especially for updates to projection views, for example, a "Year"
attribute's default might be changed every January 1.  This might help
isolate apps from the db 'rules', if you will.
David BL - 14 Oct 2008 09:24 GMT
> Suggest that a DEFAULT value could also be an integrity mechanism,
> especially for updates to projection views, for example, a "Year"
> attribute's default might be changed every January 1.  This might help
> isolate apps from the db 'rules', if you will.

So the DBMS constrains the year of newly inserted tuples, but not the
year of existing tuples in the database.

This relates to the question of whether it is sufficient for integrity
to only be related to a predicate calculated on a given snapshot of
the database state, without concern for how it got to that state.  I
really like the simplicity of that.

I’m a bit suspicious of reading too much into the history of updates
on a database unless one is going to distinguish between data entry
corrections versus real changes (ie external predicates that are
functions of time). In the latter case it seems much better to
actually model temporal external predicates in the schema – ie use a
temporal database, and then the update history isn’t significant.

Database history seems relevant to auditing (and requirements that
follows from the likes of the Sarbanes-Oxley Act).  I would hope that
a good DBMS provides a generic, hacker proof means to calculate prior
states that would meet the requirements of the law courts for
establishing culpability.   The nice thing is that this is independent
of the logical model and even whether or not it is a temporal
database.
David BL - 14 Oct 2008 08:28 GMT
On Oct 14, 12:17 am, DBMS_Plumber <paul_geoffrey_br...@yahoo.com>
wrote:

> If you mandate that a column cannot contain a NULL, setting a DEFAULT
> means that when a programmer legitimately doesn't have a value for the
> column they aren't obliged to put in there the first thing that
> springs to mind.

How can a programmer legitimately not have a value for the column?
Doesn’t that imply that the schema is inadequate (for supporting
partial information)?
DBMS_Plumber - 14 Oct 2008 18:09 GMT
> On Oct 14, 12:17 am, DBMS_Plumber <paul_geoffrey_br...@yahoo.com>
> wrote:
[quoted text clipped - 7 lines]
> Doesn’t that imply that the schema is inadequate (for supporting
> partial information)?

In engineering, sh.t happens. Design with failure in mind.

I'm not going to argue for two weeks with 'Max' in accounting about
about whether he has a legitimate reason to not know the close date of
some transaction. Waste. Of. Time.  I'm not going to expect
perfection; especially not out of programmers.

It seems prudent management practice as well as sound engineering to
say simply that "If you don't know - that's OK - just go with the
default."
David BL - 15 Oct 2008 02:45 GMT
On Oct 15, 1:09 am, DBMS_Plumber <paul_geoffrey_br...@yahoo.com>
wrote:

> > On Oct 14, 12:17 am, DBMS_Plumber <paul_geoffrey_br...@yahoo.com>
> > wrote:
[quoted text clipped - 18 lines]
> say simply that "If you don't know - that's OK - just go with the
> default."

No doubt many applications need to support partial information.  I
wonder whether there’s a risk with using default values as far as
being at odds with the principle that unusual cases or exceptions
should be explicit, not implicit.   What’s prudent seems to depend on
the dangers of getting it wrong.
Walter Mitty - 15 Oct 2008 15:08 GMT
On Oct 14, 12:28 am, David BL <davi...@iinet.net.au> wrote:

> In engineering, sh.t happens. Design with failure in mind.

> I'm not going to argue for two weeks with 'Max' in accounting about
> about whether he has a legitimate reason to not know the close date of
> some transaction. Waste. Of. Time.  I'm not going to expect
> perfection; especially not out of programmers.

> It seems prudent management practice as well as sound engineering to
> say simply that "If you don't know - that's OK - just go with the
> default."

Outstanding reply.  I'd go even further.  I'd say that in data management,
impossible cases arise routinely.  Planning your systems so that they do
something reasonably intelligent when the impossible happens is just plain
good engineering.

PS, causing an inrecoverable failure during the weekend batch update, when
only a skeleton crew is on duty, is an example of unreasonable behavior.  I
can't tell you the number of production systems that behave this way when
required data is missing.

PPS, the whole discussion of default values has ignored a valuable
distinction, unless I missed it along the way.  There are two points in time
when a default value can be susbstituted for a NULL:  at INSERT or UPDATE
time, or at query time.  If defaults are substituted at INSERT or UPDATE
time,  the fact that a default was used is lost, unless that fact is
recorded somewhere else.  You can't, in general, know whether the
application provided a value that just happens to agree with the default,
or whether the application provided a NULL, and the system plugged in the
default.

Substituting a default value at query time is best illustrated by the
COALESCE function.  There are however, some DBMSes that provide for
substituting a default at query time, which save application programmers and
data analysts the trouble of coding it in.
David BL - 16 Oct 2008 13:32 GMT
> "DBMS_Plumber" <paul_geoffrey_br...@yahoo.com> wrote in message

>  > In engineering, sh.t happens. Design with failure in mind.
>
[quoted text clipped - 10 lines]
> something reasonably intelligent when the impossible happens is just plain
> good engineering.

I'd say impossible cases never arise by definition :)
Roy Hann - 16 Oct 2008 15:23 GMT
>> "DBMS_Plumber" <paul_geoffrey_br...@yahoo.com> wrote in message
>
[quoted text clipped - 14 lines]
>
> I'd say impossible cases never arise by definition :)

I'd say I hear a lot of claims that things are impossible from people
who didn't think about the problem hard enough to begin with.

I'd want some assurance that my database designer isn't using defaults
(or nulls) to gloss over his own sloppiness.

Applications can supply default values if they're needed, but the DBMS
should insist the applications conform to the business model
implemented by the database schema (to the extent that is possible).

Signature

Roy

Ed Prochak - 15 Oct 2008 19:55 GMT
On Oct 14, 12:09 pm, DBMS_Plumber <paul_geoffrey_br...@yahoo.com>
wrote:

> > On Oct 14, 12:17 am, DBMS_Plumber <paul_geoffrey_br...@yahoo.com>
> > wrote:
[quoted text clipped - 18 lines]
> say simply that "If you don't know - that's OK - just go with the
> default."

There's the key word for me: engineering.
The difference is between programming code and software engineering.

 Ed
Jon Heggland - 15 Oct 2008 11:46 GMT
> If you mandate that a column cannot contain a NULL, setting a DEFAULT
> means that when a programmer legitimately doesn't have a value for the
> column they aren't obliged to put in there the first thing that
> springs to mind.

I would have thought the effect was the opposite: If programmers (or
anyone else, of course---I consider myself a programmer:) aren't obliged
to specify a value (and thus aren't required to understand what the
column is for), they will happily go with a default that may very well
be inappropriate/incorrect. But without empirical evidence, either
hypothesis is just speculation.

> I'm sure there would be cases where you could say NOT NULL but not
> provide a DEFAULT. The point to me was that this shop is serious about
> their integrity and this seriousness is manifested in their
> engineering policies and procedures.

Point taken, of course.
Signature

Jon

paul c - 14 Oct 2008 06:16 GMT
>>    Later, over a cocktail or two, we chatted. In his shop EVERY table
>> has a declared primary key, every column without NOT NULL and DEFAULT
[quoted text clipped - 3 lines]
> have thought that having defaults for every column would encourage
> sloppiness.

When it comes to "every column", no argument, but isn't what SQL calls a
DEFAULT really just a special form of constraint?
whileone - 10 Nov 2008 16:06 GMT
On Oct 5, 11:30 am, eric.bouchardlefeb...@gmail.com wrote:
> Hello,
>
> When time comes to build transactional databases (as opposed to data
> wharehouses), I belong to the school that STRONGLY believe in
> normalizing data with high integrity mechanisms.  I know all the
> performance cons but IMHO, pros largely overwhelme.

Just to be difficult.....

This is now a long thread, with numerous well-crafted arguments
exhaustively
defending good database design.  But (largely  for the sake of
argument)
there are times when bad design is called for.

For large, well-funded projects,  performance can always
be purchased, usually  for far less than the maintenance costs
associated with
bad design.

But smaller, not-so-well funded systems are part of the bell curve
too.
I wrote a web-based administration system for an "outfitters"
organization once.
There were about 1000 licensed outfitters that needed to be tracked
by a single-person administrator. So there was a private interface
plus a
public one. The public interface was supposed to
include a keyword search that combined outfitter location with
arbitrary
combinations of attribute keywords like "elk hunting, fly fishing,
walleye,  pheasant" etc.

This organization had a limited budget and insisted on
a cheap shared host server.  Mysql queries with mulitiple named
cursors
ran like lightning on my desktop linux box, but they ran like half-
frozen molasses
on the overloaded shared host server.

But application code that did "string-like" over a long string of
comma-separated
keywords worked, and it ran an order of magnitude faster (on the
chintzy shared
host).
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.