Database Forum / Ingres Topics / September 2004
primary key as integer
|
|
Thread rating:  |
GG - 25 Aug 2004 16:01 GMT Hello.
I would like use integer typ for primary key. What can I do automatically inserting sequence value when new row is inserting ???
Thanks for help GG
Karl & Betty Schendel - 25 Aug 2004 14:30 GMT >Hello. > > I would like use integer typ for primary key. >What can I do automatically inserting sequence value when new row is >inserting ??? Ingres r3 implements ANSI sequences. You can wait for r3 to become generally available. For 2.6 and older, the usual way is to have a number generator table with a single row containing the next-value for the key; increment that before inserting a new row into your table.
Karl
GG - 25 Aug 2004 15:34 GMT >> Hello. >> [quoted text clipped - 8 lines] > > Karl OK but I would like automatically inserting sequence value when insert query not contain key field.
GG
Roy Hann - 26 Aug 2004 09:24 GMT > >> Hello. > >> [quoted text clipped - 11 lines] > OK but I would like automatically inserting sequence value when insert > query not contain key field. That's a problem then. If you have designated that a column is a primary key then Ingres insists that you supply a (unique) value for it, when you insert, per the ANSI SQL standard.
Just out of curiosity, if you don't already know the value of the primary key when you insert, and you want Ingres to generate a value for you when it inserts, how will you ever find that row again?
Roy Hann (rhann at rationalcommerce dot com) Rational Commerce Ltd. www.rationalcommerce.com "Ingres development, tuning, and training experts"
Philip Lewis - 26 Aug 2004 14:56 GMT >Just out of curiosity, if you don't already know the value of the primary >key when you insert, and you want Ingres to generate a value for you when it >inserts, how will you ever find that row again? Perhaps i'm thinking too simply here, or am just not understanding your concern... but my guess would be via other fields in the record.
We have a contact database. Names (and addresses for those names... say invoice and shipping) not being unique, we needed a unique identifier for each record, so each name saved gets a name_key associated with it.
We also have a "mailing list" database, each list having a list_key.
When we want to add a name to a mailing list, we browse both lists and names via a textual information, not the keys (well... certain folks see the keys). The function "add this name to this list" will create an entry in a join table which has list/name key pairs.
 Signature be safe. flip Ich habe keine Ahnung was das bedeutet, oder vielleicht doch? Remove origin of the word spam from address to reply (leave "+")
Roy Hann - 27 Aug 2004 00:44 GMT > >Just out of curiosity, if you don't already know the value of the primary > >key when you insert, and you want Ingres to generate a value for you when it > >inserts, how will you ever find that row again? > > Perhaps i'm thinking too simply here, or am just not understanding > your concern... but my guess would be via other fields in the record. Well of course that is exactly what I assume he's expecting to do. But if those other fields (sic) constitute a candidate key why is there any need to introduce a spurious synthetic key? (There may be a good reason, but we're not told. Experience suggests synthetic keys are almost always introduced unnecessarily.) And if there isn't a candidate key my first question still stands; how will the row ever be reliably identified again? Distinguishing a row and identifying a row are not the same thing.
> We have a contact database. Names (and addresses for those > names... say invoice and shipping) not being unique, we needed a unique > identifier for each record, so each name saved gets a name_key associated > with it. If the rows that represent the entities in your logical model are not intrinsically distinct then there is an error in your logical model. Making them distinct by fiat only tricks SQL into not noticing the problem. It doesn't get rid of the problem (which is that you have the potential for ambiguous "facts" in the database).
(BTW, I believe surrogate and synthetic keys do have a place, but it is not clear that this is one of those places.)
Roy Hann (rhann at rationalcommerce dot com) Rational Commerce Ltd. www.rationalcommerce.com "Ingres development, tuning, and training experts"
Spartacus - 08 Sep 2004 06:58 GMT > Well of course that is exactly what I assume he's expecting to do. But if > those other fields (sic) constitute a candidate key why is there any need to > introduce a spurious synthetic key? (There may be a good reason, but we're > not told. Experience suggests synthetic keys are almost always introduced > unnecessarily.) I always figured that a 'simple' integer key as opposed to a composite key would yield a speed and storage benefits for FK relations. For a contacts database, for example, a candidate key could be FirstName,LastName,Address, but it would be tedious to repeat those columns in every table that references contacts. No?
Roy Hann - 08 Sep 2004 12:16 GMT > > Well of course that is exactly what I assume he's expecting to do. But if > > those other fields (sic) constitute a candidate key why is there any need to [quoted text clipped - 7 lines] > FirstName,LastName,Address, but it would be tedious to repeat those > columns in every table that references contacts. No? There are definitely places where surrogate keys are required, as (I think) I said.
I have two comments about the proposed candidate key above. First of all I have no numbers that would support the claim that a simple integer key offers performance benefits over a composite key. I am sure one could construct an enormously large composite key that would perform poorly, but based on experience investigating and benchmarking customers' systems over the years I've never seen a key "in the wild" that was so outrageously big that it had any measurable adverse performance effect. Even if it were measurable, I still wouldn't propose introducing a surrogate key to improve performance unless it were so bad that it was noticeable. (Users only complain about what they can see/feel.) There are fantastically larger perfomance gains to be had elsewhere.
Interestingly, about a year ago I spent some time investigating a customer's request for verification that their composite key components were in the appropriate order. (The dogma has always been that it is better to specify the more selective components first so that they are logically on the left of the key.) Even for a very big table and a very big key it was virtually impossible to discern any benefit from one key order over another. Ingres just doesn't seem to be very vulnerable that way.
My second comment is that I would not regard that as a very good candidate key anyway. If that were the best I could do given the entity in question I might very well introduce a surrogate key. The problem is that LastName and Address can easily change, and there is no logical necessity why they shouldn't change in a way that becomes ambiguous. Therefore the fact that those three values identify a row at some point in time is a coincidence rather than a logical necessity. Depending on the application domain that might be more or less unlikely, but it could happen. But if it were very unlikely, and now that Ingres supports FOREIGN KEY constraints with ON UPDATE CASCADE, I think I would probably try to use exactly the key you suggest (assuming there was nothing better).
If I were to introduce a surrogate key I would absolutely *never* assign it automatically. I would require the users to go through all kinds of hoops to absolutely guarantee the proposed new row really does describe a new entity previously unknown to the system. But before even resorting to that, I'd spend a really long time satisfying myself that I really, really couldn't find a reliable natural candidate key.
As for tediousness of repeating long composite keys, computers don't get bored. Programmers might get bored repeatedly writing long join conditions, but the payoff from using a complete, accurate, and robust logical model (part of which involves indentifying reliable natural candidate keys and excluding unnecessary surrogates) is that you need to write way less code elsewhere. I am often amazed at how little code I sometimes have to write once I get rid of some innocent-looking but extraneous attribute. For instance, using your example key above, I would have to write no application code whatever to maintain that (assuming it really is a candidate key).
Roy Hann (rhann at rationalcommerce dot com) Rational Commerce Ltd. www.rationalcommerce.com "Ingres development, tuning, and training experts"
Emiliano Heyns - 09 Sep 2004 05:07 GMT > My second comment is that I would not regard that as a very good candidate > key anyway. If that were the best I could do given the entity in question I [quoted text clipped - 7 lines] > UPDATE CASCADE, I think I would probably try to use exactly the key you > suggest (assuming there was nothing better). Hey, I was looking for just that construct! I must have missed it reading the R3 docs -- good to hear it's in!
> If I were to introduce a surrogate key I would absolutely *never* assign it > automatically. I would require the users to go through all kinds of hoops > to absolutely guarantee the proposed new row really does describe a new > entity previously unknown to the system. But before even resorting to that, > I'd spend a really long time satisfying myself that I really, really > couldn't find a reliable natural candidate key. Could a key as was suggested be made case-insensitive? That would pretty much be a requirement in this case, right?
Emiliano
Roy Hann - 09 Sep 2004 11:19 GMT > Could a key as was suggested be made case-insensitive? That would pretty > much be a requirement in this case, right? Absolutely!
I'd probably impose a constraint, and if I were feeling generous I might also supply a rule to do automatic case folding. Unfortunately Ingres doesn't have a built-in namecase() function, so I'd have to code it (somehow).
(Are there any open-source weenies out there who want to see my wish-list?
:-) Roy
Emiliano Heyns - 09 Sep 2004 21:31 GMT >>Could a key as was suggested be made case-insensitive? That would pretty >>much be a requirement in this case, right? > > Absolutely! > > I'd probably impose a constraint, and if I were feeling generous I might Could you describe such a constraint?
> also supply a rule to do automatic case folding. That would depend on the situation... in general, I'd not want to change the name as the user entered it.
Emile
Roy Hann - 10 Sep 2004 09:25 GMT > >>Could a key as was suggested be made case-insensitive? That would pretty > >>much be a requirement in this case, right? [quoted text clipped - 4 lines] > > Could you describe such a constraint? First of all I was being sloppy. Ingres doesn't have a case-insensitive type so I can't impose the required domain constraint. Because I can't make individual attributes case-insensitive, I leapt ahead to making the attribute a known case.
I would like to have more kinds of constraints that I could easily enforce. For instance I might like to enforce "namecase", where each initial is in uppercase and the rest of the name is in lower case (e.g. Roy Hann). But I can't easily do that in Ingres. I can easily enforce all uppercase though, or all lowercase:
ALTER TABLE foo ADD CONSTRAINT name_in_uppercase CHECK ( name = uppercase(name) );
As I say, I *might* add a rule to automatically fold inserted/updated names to uppercase (or lowercase or whatever I've decided names should be), although I am actually reluctant to mess with what the application (*not the user*--see below) is doing. I assume the app and the database are working off the same spec, and while they might not agree, they are intended to agree, so any disagreement is just a bug, to be fixed.
> > also supply a rule to do automatic case folding. > > That would depend on the situation... in general, I'd not want to change > the name as the user entered it. That would be a mistake. If there is no good reason to interfere with the data as the user entered it, then by all means leave it alone. But if there is a good reason then the database designer needs to collaborate with the business to get agreement on how the data should look and then they should impose constraints in the database to ensure that is the only way it can look. But in no case should individual end-users ever be allowed to make decisions about how the data should look.
Now in the particular case at hand, a case-insensitive text type would be the ideal solution. But Ingres has no such type. (Leandro Fava has just posted a neat solution for making Ingres everywhere case insensitive though.) We are therefore compelled to decide (in collaboration with the business) and enforce one standard.
Roy Hann (rhann at rationalcommerce dot com) Rational Commerce Ltd. www.rationalcommerce.com "Ingres development, tuning, and training experts"
Emiliano - 20 Sep 2004 13:48 GMT >> I always figured that a 'simple' integer key as opposed to a composite >> key would yield a speed and storage benefits for FK relations. For a [quoted text clipped - 13 lines] > complain about what they can see/feel.) There are fantastically larger > perfomance gains to be had elsewhere. So what you're saying is that with Ingres, it doesn't really matter all that much wether I would use an INT or a VARCHAR(80) as the promary key (and use it as a foreign key in other relationships)? That would be great, queries would be simpler and the data more meaningful upon simple inspection.
> My second comment is that I would not regard that as a very good candidate > key anyway. But in my case, the candidate key would be the users' login name, and in another table it would be pc name. If I can get these enforced to lowercase, they should make good PKs.
 Signature Emiliano
Emiliano - 22 Sep 2004 19:10 GMT >>I have two comments about the proposed candidate key above. First of all I >>have no numbers that would support the claim that a simple integer key [quoted text clipped - 13 lines] > great, queries would be simpler and the data more meaningful upon simple > inspection. I'm currently drafting my schema; it'd mean a lot to me if someone could offer some insight here. Choosing the primary key type wrongly would mean lots of changes in my application code later.
Thanks, Emiliano
Ковернинский Виктор Игоревич - 26 Aug 2004 17:52 GMT > Karl & Betty Schendel wrote: > > At 5:01 PM +0200 8/25/04, GG wrote: [quoted text clipped - 14 lines] > OK but I would like automatically inserting sequence value when insert > query not contain key field. That's a problem then. If you have designated that a column is a primary key then Ingres insists that you supply a (unique) value for it, when you insert, per the ANSI SQL standard.
Just out of curiosity, if you don't already know the value of the primary key when you insert, and you want Ingres to generate a value for you when it inserts, how will you ever find that row again?
Roy Hann (rhann at rationalcommerce dot com) Rational Commerce Ltd. www.rationalcommerce.com "Ingres development, tuning, and training experts"
---------------------------------------------------------------------------------------------------- Try this:
/* Procedure for getting next i4 value */
CREATE PROCEDURE GetSysValue ( tag varchar(256) not null, value integer4 not null ) AS DECLARE erno = integer4 not null; rowc = integer4 not null; BEGIN UPDATE sys_value SET value=value+1 WHERE tag_pkey=:tag; SELECT iirowcount,iierrornumber INTO rowc,erno; IF iierrornumber <> 0 THEN RAISE ERROR -1 'GetSysValue ERROR -1!!!'; RETURN -1; ENDIF; IF rowc = 0 AND erno = 0 THEN INSERT INTO sys_value (tag_pkey,value) VALUES (:tag,1); SELECT iirowcount,iierrornumber INTO rowc,erno; value=1; ELSEIF rowc = 1 AND erno = 0 THEN SELECT :value=value FROM sys_value WHERE tag_pkey=:tag; SELECT iirowcount,iierrornumber INTO rowc,erno; ENDIF; IF rowc <> 1 OR erno <> 0 THEN RAISE ERROR -2 'GetSysValue ERROR -2!!!'; RETURN -2; ENDIF; RETURN :value; END
/* Procedure for getting new sequence value for inserted row */ /* NOTE: "-999" can be any value what cannot be a value of primary key of the table "depo" */
CREATE PROCEDURE auto_inserting ( depo_no = integer4 not null ) AS BEGIN IF depo_no <> -999 THEN RETURN 0; ENDIF;
depo_no = EXECUTE PROCEDURE GetSysValue(tag='depo_no');
IF depo_no <= 0 THEN RAISE ERROR -1 'auto_inserting ERROR -1!!!''; RETURN -1; ENDIF;
UPDATE depo SET depo_no = :depo_no WHERE depo_no = -999;
IF iierrornumber <> 0 THEN RAISE ERROR -2 'auto_inserting ERROR -2!!!'; RETURN -2; ENDIF;
RETURN 0; END
CREATE RULE ru_auto_inserting AFTER INSERT INTO depo EXECUTE PROCEDURE auto_inserting ( depo_no = new.depo_no)
INSERT INTO depo ( depo_no, ... ) VALUES (-999, ...); SELECT * FROM depo WHERE depo_no = (SELECT value FROM sys_value WHERE tag='depo_no');
OR
INSERT INTO depo (depo_no, ...) SELECT -999, ... FROM ...
This is working.
Best regards, Victor I Koverninsky
Roy Hann - 27 Aug 2004 00:48 GMT > > OK but I would like automatically inserting sequence value when insert > > query not contain key field. [quoted text clipped - 10 lines] > > /* Procedure for getting next i4 value */ [snip]
> This is working. I'm going to rant about this a bit, because wanton use of surrogate and synthetic keys is one of my pet-peeves. (And if that wasn't enough to set me off, this thread also talks of "primary keys", which are another of my pet-peeves. :-)
Does it not occur to anyone that Ingres (and indeed ANSI SQL) insists on a unique key value on purpose, for a reason? This code is undoubtedly "working". It works the same way that a steel nail works when you use it to replace a 1 amp fuse that keeps blowing.
When you declare that a set of attributes constitute a primary key, you are asserting that all your rows are guaranteed to represent distinct real world entities because that irreducible set of attributes identifies each one unambiguously *in the real world*. Put slightly differently, the database is guaranteed not to contain ambiguous facts. That guarantee is the whole, entire point of the primary key and unique constraints. Sticking some bogus unique number on the end of a row and calling it a primary key is just defeating something that was put there to help and protect you.
I wouldn't get so exercised about this if I saw even a breath of hint that anyone contributing to this thread acknowledges the need to ensure every row is uniquely identifiable *before* adding a synthetic key value. Synthetic keys are very convenient, and I use them myself when the situation requires it, but used carelessly (as they virtually always are) they are a subtle cause of database corruption.
End of rant. I had to get that off my chest.
Roy Hann (rhann at rationalcommerce dot com) Rational Commerce Ltd. www.rationalcommerce.com "Ingres development, tuning, and training experts"
Patrick Spinler - 27 Aug 2004 16:29 GMT > (lots of information about unique keys) So, something I've always wondered about - what do you do when in fact the real world problem you area attempting to represent is ambiguous ? (Or at least, you don't have and can't get enough information to de-ambiguate it.)
As an example, if one were creating a database of people who lived in a place without a assigned unique identifier of some sort, it's plausible that you might have in your data set one or more sets of identical twins with the same names who lived at the same location.
Of course, there's also a (considerable) problem in the real world in instances such as this, but that doesn't excuse us from throwing up our hands and saying we won't at least attempt to deal with it.
-- Pat
Roy Hann - 27 Aug 2004 19:53 GMT > > (lots of information about unique keys) > > [quoted text clipped - 3 lines] > (Or at least, you don't have and can't get enough information to > de-ambiguate it.) Then you would be stuck, because there simply is no way to deal with that. Adding an arbitrary value to the rows in question will allow you to "do stuff", but since the entire problem is that you don't know what you're doing the "stuff" to (it's ambiguous), you don't know if the result has any relationship to reality. Putting it bluntly, you would just be choosing rows at random and hoping for the best.
> As an example, if one were creating a database of people who lived in a > place without a assigned unique identifier of some sort, it's plausible > that you might have in your data set one or more sets of identical twins > with the same names who lived at the same location. That is rather recherche. But suppose it's true. What makes anyone, including ourselves, think it is reasonable to expect to do sensible, coherent, everywhere meaningful operations on things we can't identify? And what gives us any reason to be confident there really are two or more "indistinguishable" things in the real world anyway?--after all, we can't distinguish them!
> Of course, there's also a (considerable) problem in the real world in > instances such as this, but that doesn't excuse us from throwing up our > hands and saying we won't at least attempt to deal with it. I don't know if I follow this. Are you saying we are obliged to try, even in face of logic, or that we can be excused for saying it can't properly be done? If the latter, then I agree; in fact it is our professional duty to say when something fundamentally can't be done, instead of pretending it can and hoping no one ever finds out. (Civil engineers who do that kind of thing kill people. Accountants who do it crash stock markets.)
Roy Hann (rhann at rationalcommerce dot com) Rational Commerce Ltd. www.rationalcommerce.com "Ingres development, tuning, and training experts"
Emiliano - 23 Sep 2004 11:30 GMT > I'm going to rant about this a bit, because wanton use of surrogate and > synthetic keys is one of my pet-peeves. (And if that wasn't enough to set [quoted text clipped - 21 lines] > it, but used carelessly (as they virtually always are) they are a subtle > cause of database corruption. Hmm, I have one table that holds data on PCs. I was thinking of using either their IP address or hostname as PK, but IPs are less 'stable' than hostnames in most environments this will be deployed in (DHCP), but OTOH, hostnames can legally be up to 255 octets, which seems a bit long for a PK (although I really don't expect to see more than a few thousand of these rows, and typically < 400 rows), and hostnames aren't necesarily unique (multiple names can refer to a single PC).
Any recommendations?
Thanks, Emiliano Heyns
Roy Hann - 25 Aug 2004 15:50 GMT > Hello. > > I would like use integer typ for primary key. > What can I do automatically inserting sequence value when new row is > inserting ??? There is a paper on generating keys available from www.rationalcommerce.com under the resources link. I draw your attention particularly to the part that warns against using sequential numbers, and to the alternate suggestion. The paper includes example code.
Automatic numbering can be implemented using a rule and a procedure. I would tend to recommend against doing that however, because it creates the risk of a subtle kind of database corruption involving duplicate facts.
Roy Hann (rhann at rationalcommerce dot com) Rational Commerce Ltd. www.rationalcommerce.com "Ingres development, tuning, and training experts"
|
|
|