Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / General DB Topics / DB Theory / July 2005

Tip: Looking for answers? Try searching our database.

Property sheet, ad hoc, property page, flexible data

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
AC - 19 Jul 2005 19:06 GMT
Hello,

I am looking for a solution that hopefully someone here has tackled.  I want
to create "base" tables with fields that are intrinsic to the application.
I also want to allow the customers using the application to be able to
create their own fields so that they can easily customize the application.
These new fields would either be added to the base table or to a separate
table with a 1:1 relationship to the base table.  The client application
will render the controls based on the new fields in a grid to allow a
flexible amount of fields to be added by customers.

How would you handle this?

Thanks in advance,

-- AC
Marshall  Spight - 19 Jul 2005 22:13 GMT
> I am looking for a solution that hopefully someone here has tackled.  I want
> to create "base" tables with fields that are intrinsic to the application.
[quoted text clipped - 6 lines]
>
> How would you handle this?

You can have whatever schema you want for your built-in, predefined
fields; that's what schema is for. But the mistake most people
make is thinking that user-defined fields somehow specify more
schema; they don't. I hereby propose, in public for the first
time, my formulation on the relationship between schema and
semantics:

 No schema, no semantics.
 Know schema, know semantics.

Okay, so it's bumper-sticker level material, but it gets the
point across: if you don't have a schema for something, you
don't know jack about it.

You don't know anything about what these user-defined fields
mean. So don't make any schema for them; it's not the right
tool. Instead, just make the schema for what you *do* have
semantics for, which is "user-supplied name".

Marshall
Neo - 19 Jul 2005 22:27 GMT
> No schema, no semantics.

Based on your likely definition of schema, you are correct (but is your
definition of schema correct).
AC - 20 Jul 2005 21:07 GMT
>> I am looking for a solution that hopefully someone here has tackled.  I
>> want
[quoted text clipped - 3 lines]
>> create their own fields so that they can easily customize the
>> application.
<snip/>
>> How would you handle this?

<snip/>

>  No schema, no semantics.
>  Know schema, know semantics.
>
> Okay, so it's bumper-sticker level material, but it gets the
> point across: if you don't have a schema for something, you
> don't know jack about it.
<snip/>

> Marshall

If I load the ad hoc data into a grid what else would I need to know?  This
type of customization was part of Symantec ACT and some other contact
management programs.  How else do you easily allow customization of a
program?

Regards
Neo - 19 Jul 2005 22:19 GMT
> How would you handle this?

These type of applications tend to be impractical in RM. One method
might be to use generic modelling (ie T_Object, T_Property,
T_Relation).
Neo - 19 Jul 2005 22:38 GMT
> How would you handle this?

One way is to design your original schema such that things, properties,
values and relationships between things can be added as records in
existing tables rather than creating additional tables.
AC - 20 Jul 2005 21:28 GMT
>> How would you handle this?
>
> One way is to design your original schema such that things, properties,
> values and relationships between things can be added as records in
> existing tables rather than creating additional tables.

I will have to think about this.  Do you have links to this type of schema?

Thanks
mAsterdam - 20 Jul 2005 00:11 GMT
> ...I want to create "base" tables with fields that are
> intrinsic to the application.
[quoted text clipped - 4 lines]
> will render the controls based on the new fields in a grid to allow a
> flexible amount of fields to be added by customers.  ...

If this group would have a FAQ, this Q would be in it.
For now you'll have to do with googling the archives.

The short answer is: don't go that way, you'll get the worst
of both worlds.

I won't go into this any deeper unless somebody shows a real start (and
commitment to maintain it for say the first year) for a cdt FAQ (or
unless you showed you googled and come up with some new aspects :-)

It's not a bad question, but I don't like going over the same arguments
every time it pops up.
AC - 20 Jul 2005 21:07 GMT
>> ...I want to create "base" tables with fields that are intrinsic to the
>> application. I also want to allow the customers using the application to
>> be able to create their own fields so that they can easily customize the
>> application.
<snip/>

> If this group would have a FAQ, this Q would be in it.
> For now you'll have to do with googling the archives.
>
> The short answer is: don't go that way, you'll get the worst
> of both worlds.
<snip/>
> It's not a bad question, but I don't like going over the same arguments
> every time it pops up.

Can you post the search string I should use?  I did google which is why I
created the subject I did.  So that others that were searching for this type
of answer could find it.  I don't know how others have described this
scenario.  I get a lot of pages on property sheets for programming controls.
I know that this type of setup was used by Symantec ACT but didn't find
references to that either.

Thanks in advance
dawn - 24 Jul 2005 02:33 GMT
> >> ...I want to create "base" tables with fields that are intrinsic to the
> >> application. I also want to allow the customers using the application to
[quoted text clipped - 17 lines]
> I know that this type of setup was used by Symantec ACT but didn't find
> references to that either.

I get a couple of hits on the string

user-defined-attributes

in cdt in googlegroups and I suspect some variations would also yield
results.

The bottom line is that when someone who "does" databases is asked how
to support user-defined attributes, they often tell the user that their
requirements are not as important as the integrity of the database and
that there is simply no way they can figure out how to design a
solution that mitigates the risks within reason.

I have no problem letting users define attributes, but I don't often
use a SQL-DBMS.  Even if using a SQL-DBMS, there are multiple ways to
try to meet the user requirements.  You can assess the risk to your
company if you were to toss in a few strings, ints, and floats with
names like M1, M2, or userDef1, userDef2 or User_Def_1, User_Def_2 ...
(just understand that some DBAs in the audience are certain I'm either
stupid or ignorant when they read this).  Place such attributes (with
unique names) in relations that are likely spots for additions,
document it well, and make UI's to collect the data.

Another approach is one where you store tag & value in two separate
attributes so they can name a value and enter it, thereby mixing
metadata with data.  This is, of course, one of the "never do this"
commandments, but, again, you need to assess the cost vs benefits and
mitigate risks.

Your users will thank you.  Scratch that as overly optimistic, but if
you have determined that they and the company would be more productive
with these attributes when you did your risk / cost assessment, then
you just might have done something good.

Cheers!  --dawn
AC - 24 Jul 2005 06:05 GMT
>> >> ...I want to create "base" tables with fields that are intrinsic to
>> >> the
[quoted text clipped - 10 lines]
>> > The short answer is: don't go that way, you'll get the worst
>> > of both worlds.

<snip/>
> I get a couple of hits on the string
>
> user-defined-attributes
>
> in cdt in googlegroups and I suspect some variations would also yield
> results.

<snip/>
> try to meet the user requirements.  You can assess the risk to your
> company if you were to toss in a few strings, ints, and floats with
[quoted text clipped - 6 lines]
> Another approach is one where you store tag & value in two separate
> attributes so they can name a value and enter it, thereby mixing
<snip/>
> Cheers!  --dawn

Thanks for the reply and the search string suggestion, Dawn.  We considered
tag/value storage but decided that it wouldn't work as well as defining each
field.  The problem with pre-creation of the fields (UserDef1, UserDef2,
etc) is that you would be stuck with for example 5 text fields, 6 ints, 2
date_time, etc.  This would be limitation that does not appear to have much
benefit.

Regards  --AC
Jonathan Leffler - 24 Jul 2005 08:19 GMT
>>>>>...I want to create "base" tables with fields that are intrinsic to
>>>>>the
[quoted text clipped - 39 lines]
>
> Thanks for the reply and the search string suggestion, Dawn.

Another search in groups.google.com that works reasonably well is:

attribute value universal group:comp.databases.theory

I have a feeling that I'm missing the most appropriate combination of
words (but 'attribute value' alone is a bad search in this news group).

Signature

Jonathan Leffler                   #include <disclaimer.h>
Email: jleffler@earthlink.net, jleffler@us.ibm.com
Guardian of DBD::Informix v2005.01 -- http://dbi.perl.org/

David  Cressey - 24 Jul 2005 14:10 GMT
PMFJI.

The fundamental problem with user defined attributes is not a theoretical
one.  It's one of conflicts between two incompatible requirements of the
users.

The first requirement is, as you outlined, that users want to be able to
extend the schema to include data items that are not defined in the schema,
and are defined in each user's head.  Each user's extensions might be
understood by the entire user community,  but more commonly, each user's
extensions are idiosyncratic to that user,  and either meaningless or
misstated in the view of the rest of the users.

The other requirement is that the data in the data base be sharable and
usable by the entire user community in search of information.  You can't
turn data into information without the semantics.  Of all the summaries of
the problems of using data not included in the schema,  I've never seen a
more elegant summary than Marshall's:

No schema, no semantics.
Know Schema, Know Semantics.

Now, here's the kicker:  there's generally a delay time between the time the
users demand the ability to add to the schema,  and the time the same user
community demands that the architects or administrators make sense out of
the whole mess.  By the time the mess has been made,  and the semantics are
encapsulated inside the skulls of hundreds of users, many of whom are former
users,  it's the devil's own work to analyze all that data, and to obtain
the kind of outputs that "ought to be easy, because the data is in a
database".

If you can somehow get the users to agree that they are responsible for
sharing the semantics of the data with each other,  and that the data
architecture can't help them , except for data inside the schema,  then,
and only then,  many of the suggestions already posted will do the job just
fine
David  Cressey - 24 Jul 2005 14:10 GMT
PMFJI.

The fundamental problem with user defined attributes is not a theoretical
one.  It's one of conflicts between two incompatible requirements of the
users.

The first requirement is, as you outlined, that users want to be able to
extend the schema to include data items that are not defined in the schema,
and are defined in each user's head.  Each user's extensions might be
understood by the entire user community,  but more commonly, each user's
extensions are idiosyncratic to that user,  and either meaningless or
misstated in the view of the rest of the users.

The other requirement is that the data in the data base be sharable and
usable by the entire user community in search of information.  You can't
turn data into information without the semantics.  Of all the summaries of
the problems of using data not included in the schema,  I've never seen a
more elegant summary than Marshall's:

No schema, no semantics.
Know Schema, Know Semantics.

Now, here's the kicker:  there's generally a delay time between the time the
users demand the ability to add to the schema,  and the time the same user
community demands that the architects or administrators make sense out of
the whole mess.  By the time the mess has been made,  and the semantics are
encapsulated inside the skulls of hundreds of users, many of whom are former
users,  it's the devil's own work to analyze all that data, and to obtain
the kind of outputs that "ought to be easy, because the data is in a
database".

If you can somehow get the users to agree that they are responsible for
sharing the semantics of the data with each other,  and that the data
architecture can't help them , except for data inside the schema,  then,
and only then,  many of the suggestions already posted will do the job just
fine
Marshall  Spight - 24 Jul 2005 18:14 GMT
> PMFJI.
>
> The fundamental problem with user defined attributes is not a theoretical
> one.  It's one of conflicts between two incompatible requirements of the
> users.
> fine

Strongly agree.

Some more problems:

Dawn describes two possible approaches.
approach 1: add columns "UserDef1", "UserDef2"
approach 2: additional table, with three columns: FK, Name, Value

The first one is just frightening. It's not even in first normal form,
and it will exhaust its expanision capacity quickly.

The second one is better. Amusingly, I would describe that solution
as being one that actually has well-defined semantics. The semantics
are that the first column specifies the entity being described, the
second is the name of the quality being described, and the third is
the value for that quality. Not a *lot* of semantics there, but at
least it's well-defined. Note that I would not agree with Dawn as
describing this as mixing metadata and data, because the values in
the name column *aren't* metadata, since we don't know what they
mean, other than "they are names."

Another significant issue with these approaches is that they both
do not have any way of allowing any structure in the user defined
data; you're limited to exactly whatever domains you allow up front;
probably int and string, say, but certainly not, say, three pairs of
int/float tuples and a list of x, y points.

Marshall
dawn - 24 Jul 2005 21:49 GMT
> > PMFJI.
> >
[quoted text clipped - 13 lines]
> The first one is just frightening. It's not even in first normal form,
> and it will exhaust its expanision capacity quickly.

Yes, it is not a highly scalable approach, but does give more than the
NOTHING that is often offered as an alternative to users.  I'm a
happier e-mail end users with gmail because it not only lets me tag an
e-mail with multiple labels, but I can also click a star on e-mails as
more information.

Users might be able to reuse the same single variable length character
attribute for many purposes over the life of their system.  These are
basically "comment" attributes and you don't restrict the content of a
comment.  Name the comment "MultiPurposeShortComment" or
"ProductNotation" (if an attribute of a product).

I'm guessing you have seen users request comments where a comment is
not really what is needed, but they intend to put intelligent comments
in the system so they can search on those.  An attribute not named
"comment" that is intended for this purpose would be better, right?
Again, there are trade-offs and you need to assess the real
requirements and design for flexibility for users as well as the system
overall.

> The second one is better. Amusingly, I would describe that solution
> as being one that actually has well-defined semantics.

as would I :-)

> The semantics
> are that the first column specifies the entity being described, the
> second is the name of the quality being described, and the third is
> the value for that quality. Not a *lot* of semantics there, but at
> least it's well-defined. Note that I would not agree with Dawn as
> describing this as mixing metadata and data,

that is what I have heard it called.  I would claim that we mix these a
lot, so I do not have a problem with this. Metdata and data are all
data.

> because the values in
> the name column *aren't* metadata, since we don't know what they
> mean, other than "they are names."

Who is the "we" in that sentence?  The user knows what they mean and
can, perhaps, write reports that qualify rows based on the value
related to a particular tag.  So, SQL doesn't grab this metadata from
the same place, but there is still data about data stored in as
attribute data, instead of in the system catalog.

> Another significant issue with these approaches is that they both
> do not have any way of allowing any structure in the user defined
> data;

Yes, that is both their charm and their downfall.

> you're limited to exactly whatever domains you allow up front;
> probably int and string, say, but certainly not, say, three pairs of
> int/float tuples and a list of x, y points.

Or perhaps a relation-valued attribute ...

You could toss in a few of these notational attributes as strings and
perhaps a date or two.  I'd prefer they were all defined as strings and
cast to whatever is needed at the time, but in a SQL-DBMS, that might
not be as easy.

My point is that you need to assess the needs and associated risks with
any design you might choose and also consider the risk of ignoring this
particular perceived need.  I suspect that too many DBAs are inclined
to tell their users they can't have such an attribute without doing an
analysis and without realizing that there are real benefits to
notational fields of this nature.  I would suggest that even when a
user does not identify this need up front, it is still a good idea to
identify if there is a place for such attributes. I didn't ask for that
gmail star, but I really appreciate it and suspect that it increases my
productivity.

Cheers!  --dawn
Marshall  Spight - 25 Jul 2005 04:19 GMT
> > because the values in
> > the name column *aren't* metadata, since we don't know what they
> > mean, other than "they are names."
>
> Who is the "we" in that sentence?

The programmer and the software he writes.

> The user knows what they mean and
> can, perhaps, write reports that qualify rows based on the value
> related to a particular tag.

I disagree. The users (plural) *don't* know what they mean. The
reason this doesn't work is because there is no canonical,
homogenous user who knows what he knows. Instead there are
vast seas of users. And while I would agree that one particular
user probably knows what he means when he enters a particular name,
we have no reason to believe that anyone else will agree. In fact,
we have strong reason to believe that chaos will be the order
of the day.

Tags/labels/what have you are lexical data merely; they have
no semantics exactly because they have no schema. (Beyond the
minimal "this is the name, and this is the value.")

> > you're limited to exactly whatever domains you allow up front;
> > probably int and string, say, but certainly not, say, three pairs of
> > int/float tuples and a list of x, y points.
>
> Or perhaps a relation-valued attribute ...

Sure.

> I'd prefer they were all defined as strings and
> cast to whatever is needed at the time, but in a SQL-DBMS, that might
> not be as easy.

That is a much worse way to do things. If evenything is a string
and can be cast to anything as needed, then you can't reason about
your code. It has no useful statically analyzable properties.

> I didn't ask for that
> gmail star, but I really appreciate it and suspect that it increases my
> productivity.

I am glad you are enjoying gmail.

I would propose that the star is not an example of a used defined
fields, but gmail labels are. They match your option 2) pretty well.

Marshall
David  Cressey - 25 Jul 2005 13:56 GMT
> > > because the values in
> > > the name column *aren't* metadata, since we don't know what they
[quoted text clipped - 3 lines]
>
> The programmer and the software he writes.

I see two problems with the above:

First "we" is plural,  and "the programmer" is singular.  You may think I'm
nitpicking, but I don't think I am.  When several programmers want to
understand what the data means,  the understanding can be "joint" or
"several", to use the legalese terms.

Second, you confuse the role of "data architect" with the role of
"programmer".  There's nothing to say that one person cannot do both roles,
and do each of them well.  But they aren't the same role.  And the vast
majority of programmers I've come in contact with have no knowledge of what
the data "really means".  Their level of interpretation of the data is
strictly what's necessary in order to determine whether their programs are
delivering correct results.  That's it.

That's not a deep enough understanding of the data and its purpose to make
data sharing feasible and cost effective among a wide and diverse user
community.

A disturbing trend is this:  there are an increasing number of database
designers that are not fit to be data architects.
Marshall  Spight - 25 Jul 2005 15:10 GMT
> "Marshall Spight" <marshall.spight@gmail.com> wrote in message
>
> Second, you confuse the role of "data architect" with the role of
> "programmer".  There's nothing to say that one person cannot do both roles,
> and do each of them well.  But they aren't the same role.

Fair enough.

> And the vast
> majority of programmers I've come in contact with have no knowledge of what
> the data "really means".  Their level of interpretation of the data is
> strictly what's necessary in order to determine whether their programs are
> delivering correct results.  That's it.

Hmmm. Could you clarify the difference? The two things you describe
here seem to me to be the same thing.

> That's not a deep enough understanding of the data and its purpose to make
> data sharing feasible and cost effective among a wide and diverse user
> community.

How so?

> A disturbing trend is this:  there are an increasing number of database
> designers that are not fit to be data architects.

I'll buy that!

Marshall
Gene Wirchenko - 26 Jul 2005 00:41 GMT
[snip]

>> And the vast
>> majority of programmers I've come in contact with have no knowledge of what
>> the data "really means".  Their level of interpretation of the data is
>> strictly what's necessary in order to determine whether their programs are
>> delivering correct results.  That's it.

    That is, what the data means to the DBMS.

>Hmmm. Could you clarify the difference? The two things you describe
>here seem to me to be the same thing.

    What does the data mean to the DBMS?  It is merely the sum of all
the constraints.

    What does the data mean to the end user?  What it means to the
DBMS and more.  A value of "Toronto" for City means more to the end
user than just the characters.

[snip]

Sincerely,

Gene Wirchenko
David  Cressey - 26 Jul 2005 15:39 GMT
> > "Marshall Spight" <marshall.spight@gmail.com> wrote in message
> >
[quoted text clipped - 25 lines]
>
> Marshall

I'm going to have to think long and hard about your questions.  I'm hoping
others will jump in.

All I can say off the top of my head is that data can have value (or worth,
if you prefer) that is apart from its correctness.

There is some data in databases that is verifiably correct,  but means
nothing of much value.
AC - 25 Jul 2005 19:04 GMT
> PMFJI.
>
[quoted text clipped - 5 lines]
> extend the schema to include data items that are not defined in the
> schema,
<snip/>

That is the situation.  Just like ACT (previously from Symantec) allowed you
to have "property sheets" we would like to do something similar.  The reason
is that some items will be standard but there are too many possibilities
with various business requirements to specify in advance all fields.  This
way there is flexibility without too much data bastardization.  Ad hoc
fields will show up in a grid view only.

> The other requirement is that the data in the data base be sharable and
> usable by the entire user community in search of information.  You can't
<snip/>
This is not a requirement or expectation.  Base fields would be shareable
but ad hoc fields would of course not be able to be shared.  There is no
expectation for sharing in the same way that two companies that used ACT and
merged could not expect a 1:1 matching of their "property sheet" data.

Has anyone done this before?  I know it's not common as I have not worked on
a project with this scenario before however I know that with Salesforce
Automation packages it is done.

Regards,
--AC
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2010 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.