Database Forum / DB2 Topics / May 2008
ADD A COLUMN IN SERTAIN POSITIONS:
|
|
Thread rating:  |
lenygold - 28 May 2008 15:42 GMT I have a table FAMILY folowing columns: SSN,NAME,DOB,AGE,PL_BIRTH,MARITAL_STAT
I would like to add a column GENDER between columns NAME and DOB. Is posiible wihout Droping table FAMILY?
Thank's in avance. Leny G.
Dave Hughes - 28 May 2008 16:53 GMT > I have a table FAMILY folowing columns: > SSN,NAME,DOB,AGE,PL_BIRTH,MARITAL_STAT [quoted text clipped - 4 lines] > Thank's in avance. > Leny G. No, columns can only be appended to a table. Still, I have to ask: why do you care what position the column is in? It can only make a difference to queries utilising the evil SELECT * syntax (or the slightly-less-evil-and-sometimes-genuinely-useful SELECT qualifier.* syntax).
If you need a column in a different position it's a lot easier to just define a view ontop of the table which rearranges the columns, rather than all the rigmarole associated with recreating it.
Cheers,
Dave.
Frank Swarbrick - 28 May 2008 21:11 GMT >>> On 5/28/2008 at 9:53 AM, in message <67qdnZR2d89B4qDVnZ2dnUVZ8uWdnZ2d@posted.plusnet>, Dave Hughes<dave@waveform.plus.com> wrote:
>> I have a table FAMILY folowing columns: >> SSN,NAME,DOB,AGE,PL_BIRTH,MARITAL_STAT [quoted text clipped - 10 lines] > slightly-less-evil-and-sometimes-genuinely-useful SELECT qualifier.* > syntax). Can you give an example of "the slightly-less-evil-and-sometimes-genuinely-useful SELECT qualifier.* syntax"? I've never heard of it.
Thanks, Frank
jefftyzzer - 28 May 2008 22:17 GMT On May 28, 1:11 pm, "Frank Swarbrick" <Frank.Swarbr...@efirstbank.com> wrote:
> >>> On 5/28/2008 at 9:53 AM, in message > [quoted text clipped - 24 lines] > Thanks, > Frank An example of SELECT qualifier.* (the "T.*"):
WITH T(C1, C2) AS ( VALUES (1,1), (1,2), (2,1), (2,2) ) SELECT T.*, MAX(C2) OVER (PARTITION BY C1) MAX_C2_PER_C1_GRP FROM T;
--Jeff
jefftyzzer - 28 May 2008 22:48 GMT > On May 28, 1:11 pm, "Frank Swarbrick" <Frank.Swarbr...@efirstbank.com> > wrote: [quoted text clipped - 47 lines] > > --Jeff Or you could do something like:
SELECT A.*, B.C5 FROM A JOIN B ON A.COL = B.COL;
--Jeff
Frank Swarbrick - 29 May 2008 00:05 GMT >>> On 5/28/2008 at 3:48 PM, in message <202e6a45-33c6-4263-8632-f0531a218a56@s33g2000pri.googlegroups.com>,
>> On May 28, 1:11 pm, "Frank Swarbrick" <Frank.Swarbr...@efirstbank.com> >> wrote: [quoted text clipped - 57 lines] > ON > A.COL = B.COL; Now that I understand what it does, why is it any less evil than just '*'?
Is this now less evil?
SELECT T.* FROM T
:-) Frank
jefftyzzer - 29 May 2008 17:50 GMT On May 28, 4:05 pm, "Frank Swarbrick" <Frank.Swarbr...@efirstbank.com> wrote:
> >>> On 5/28/2008 at 3:48 PM, in message > [quoted text clipped - 72 lines] > > Frank I leave the final word on the nefariousness of SELECT * to Dave, but I'd imagine SELECT QUALIFIER.* is considered less evil than SELECT * precisely because the former *is* qualified, i.e., it's at least a bit more targeted than SELECT <all columns from all tables>.
--Jeff
Dave Hughes - 30 May 2008 23:46 GMT > I leave the final word on the nefariousness of SELECT * to Dave, but > I'd imagine SELECT QUALIFIER.* is considered less evil than SELECT * > precisely because the former is qualified, i.e., it's at least a bit > more targeted than SELECT <all columns from all tables>. Sorry for taking a while to respond to this, unfortunately this is a complex subject, one which I suspect I'm barely qualified to be commenting upon. Still, I'll give it a whirl...
First a simple riddle. Given a table T with columns A and B, the following query will select all columns of T and add an extra constant column C to the output, containing the value 1:
SELECT T.*, CAST(1 AS INTEGER) AS C FROM T
Simple enough. Surely the following query will do the same?
SELECT *, CAST(1 AS INTEGER) AS C FROM T
But no! The above query will generate a syntax error. SELECT * and SELECT qualifier.* are two fundamentally different things. This is best illustrated by the syntax diagram for select-clause non-terminal [1]:
.-ALL------.
>>-SELECT--+----------+-----------------------------------------> '-DISTINCT-'
>--+-*-----------------------------------------------+--------->< | .-,-------------------------------------------. | | V | | '---+-expression--+-------------------------+-+-+-' | | .-AS-. | | | '-+----+--new-column-name-' | '-exposed-name.*--------------------------'
Note that if using SELECT *, /nothing/ else can be specified in the select-clause. For an explanation, we'll start by looking at the (simplified) definition of SELECT *:
Represents a list of names that identify the columns of table R. The first name in the list identifies the first column of R, the second name identifies the second column of R, and so on...
Okay, so what's table "R"? See the definition of select-clause:
The SELECT clause specifies the columns of the final result table, R. The column values are produced by the application of the select list to R. The select list is the names or expressions specified in the SELECT clause, and R is the result of the previous operation of the subselect. For example, if the only clauses specified are SELECT, FROM, and WHERE, R is the result of that WHERE clause.
So, "SELECT *" means "all columns of table R", where table R is the "final result table" ... although it's not quite /that/ final or "SELECT *" would be an impossible self-referential definition. Table R is the "almost final" table; the result of everything before the final stage of the query, the select-clause (the first clause of the query ... which is executed last ... ahem).
Now, what about "SELECT exposed-name.*":
Represents the list of names that identify the columns of the result table identified by exposed-name. The exposed-name may be a table name, view name, nickname, or correlation name, and must designate a table, view or nickname named in the FROM clause. The first name in the list identifies the first column of the table, view or nickname, the second name in the list identifies the second column of the table, view or nickname, and so on.
This one's a bit more obvious; "does what it says on the tin". So, there's one reason SELECT * is really nasty - it doesn't do what you might think it does while SELECT qualifier.* does.
Now for reasons that both are evil to some extent. The following definitely applies to DB2; whether it applies to SQL in general I can't say (I've tried finding the relevant passages in a draft of SQL-2003 but my brain went on strike after an hour, before I'd found anything substantive). Firstly a quote from Celko's reply:
...Data elements are accessed by a LOGICAL data element name, and not a PHYSICAL location. There is no ordering. Rows have no ordering within a table; columns have no ordering within a row...
It's a nice theory, but as far as columns in a row (of a query) are concerned it's also not true. If columns have no order in a row, the following query specifying that the result set should be ordered by the first and second columns (whatever they may be) shouldn't be possible:
SELECT * FROM T ORDER BY 1, 2
Why is it permitted? Because in some cases, you /can't/ reliably access columns by name:
SELECT * FROM T, T ORDER BY 1, 2
Assuming our original definition of T as having columns A and B, the result set of the above query has two A columns and two B columns! Attempting to order by column A would result in SQL0203: "A reference to column "A" is ambiguous" (how could such an error exist in a system which only permitted access by element name?). It's still possible with the qualifier.* syntax, although it takes quite a bit more effort:
SELECT T1.*, T2.* FROM T AS T1, T AS T2 ORDER BY 1, 2
Of course, this isn't really the fault of poor old *; the following query is perfectly legal too:
SELECT A, B AS A FROM T
However, the use of * (or qualifier.*) makes it that bit easier to wind up with duplicated column names in a result set (especially in the case of queries involving self-referential joins). If it's any consolation, such queries can't be used as the definition of a view (without including an explicit column name list with no duplicates).
The use of * also leaves queries somewhat ambiguous - you can't tell from the query alone what columns it will return; the query must be compiled before that can be determined.
This in turn leads to another interesting situation regarding static vs. dynamic SQL. For brevity, I ommitted some bits from the definition of the select-clause above. These basically stated that the columns returned by SELECT * (or SELECT qualifier.*) are determined at bind time (for static SQL). Should columns be added to the table later on, static SQL will not return them.
Of course, dynamic SQL is a different matter - the "meaning" of any dynamic SQL containing SELECT * will immediately change when columns are added to a table it targets. If that doesn't sound too serious, consider the query from earlier:
SELECT T.*, CAST(1 AS INTEGER) AS C FROM T
If we now add a column C to table T the query is still valid, yet contains ambiguous column names. If a program is executing this query and accesses columns by name, which column C will it access? If it accesses columns by position, it will definitely access the new column C, not the original one.
So, my simple rule of thumb regarding SELECT * is as follows: it's a handy idiom on the command line for quick'n'dirty adhoc queries, but it should /never/ appear in code within a program / script. SELECT qualifier.* is a little different. It's also a handy idiom on the command line, but it does have occassional uses in code - specifically for the case where one wishes to add columns to a query against an arbitrary table.
Here's a real-world example where I've used SELECT qualifier.* in some code. It's a stored procedure for creating exceptions tables (see below) which utilizes the SELECT qualifier.* syntax to extend an existing table definition:
CREATE PROCEDURE CREATE_EXCEPTION_TABLE( SSCHEMA VARCHAR(30), STABLE VARCHAR(128), DSCHEMA VARCHAR(30), DTABLE VARCHAR(128), DTBSPACE VARCHAR(18) ) SPECIFIC CREATE_EXCEPT_T1 MODIFIES SQL DATA NOT DETERMINISTIC NO EXTERNAL ACTION LANGUAGE SQL BEGIN ATOMIC DECLARE DDL VARCHAR(8192) DEFAULT ''; -- Drop any existing table with the same name as the -- destination table FOR D AS SELECT 'DROP TABLE "' || TABSCHEMA || '"."' || TABNAME || '"' AS DROP_CMD FROM SYSCAT.TABLES WHERE TABSCHEMA = DSCHEMA AND TABNAME = DTABLE AND TYPE = 'T' DO EXECUTE IMMEDIATE D.DROP_CMD; END FOR; -- Create the exceptions table based on the source table SET DDL = 'CREATE TABLE "' || DSCHEMA || '"."' || DTABLE || '" AS ' || '(' || ' SELECT' || ' T.*,' || ' CAST(NULL AS TIMESTAMP) AS EXCEPT_TS,' || ' CAST(NULL AS CLOB(32K)) AS EXCEPT_MSG' || ' FROM' || ' "' || SSCHEMA || '"."' || STABLE || '" T' || ')' || 'WITH NO DATA IN ' || DTBSPACE; EXECUTE IMMEDIATE DDL; END!
In case you're not familiar with exceptions tables:
When using the LOAD command (as we frequently do to push data from our sources into our database), if the destination table has foreign keys or check constraints it will be placed in "check pending" mode after the load has completed, indicating that there may be rows in the table which don't conform to all the constraints. Tables in check pending mode are not accessible.
The SET INTEGRITY statement can be used to bring the table out of check pending mode - however, if it finds rows that don't comply with the table's constraints it needs somewhere to move them. This is where an "exceptions" table comes in. An exceptions table has the same structure as the original table, but has no constraints and may optionally include two additional columns to store a description of the exception (i.e. which constraint(s) in the original table a row violated) and a timestamp (storing when the SET INTEGRITY statement was executed).
So, there you have it. Hopefully the above is a vaguely convincing argument for the Dr. Evil status of SELECT * and the Mini-Me status of SELECT qualifier.* :-). I've doubtless missed some reasons (which more qualified people are more than welcome to elucidate upon), but for me at least the above is sufficient to make me extremely wary of using * in a select-clause.
Cheers,
Dave.
P.S. Incidentally, there's additional stored procedures to go with the one above which provide overloaded versions with parameter defaults, and another set which layers views on top of the generated tables. The views split out the complex EXCEPT_MSG columns recursively into rows for easier analysis of what caused a row to be rejected (the query utilized is an extended version of the one from the Exception Tables topic [2] of the InfoCenter). If anyone's interested in these I'll post'em too.
Having such stored procedures means that we can generate exceptions tables (and views) for all tables in a source schema with a simple chunk of dynamic compound SQL (a FOR loop over a query against SYSCAT.TABLES). Hence, if the source schema changes (new columns, new tables, whatever), we can regenerate the exceptions schema with a very simple script and without having to duplicate changes in another script.
[1] http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/topic/com.ibm.db2.l uw.sql.ref.doc/doc/r0000875.html
[2] http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/topic/com.ibm.db2.l uw.sql.ref.doc/doc/r0001111.html
Frank Swarbrick - 31 May 2008 00:12 GMT >>> On 5/30/2008 at 4:46 PM, in message <8aCdnVR61-NaHt3VnZ2dnUVZ8v6dnZ2d@posted.plusnet>, Dave Hughes<dave@waveform.plus.com> wrote:
> Sorry for taking a while to respond to this, unfortunately this is a > complex subject, one which I suspect I'm barely qualified to be > commenting upon. Still, I'll give it a whirl... Thank you for the very informative and interesting post. I learned quite a few useful things from it!
Frank
--CELKO-- - 28 May 2008 21:27 GMT >> I would like to add a column GENDER between columns NAME and DOB. Is posiible wihout Droping table FAMILY? <<
>> I would like to add a column GENDER between columns NAME and DOB. Is possible without dropping the table FAMILY? << You have missed an basic concept of RDBMS and SQL. Data elements are accessed by a LOGICAL data element name, and not a PHYSICAL location. There is no ordering. Rows have no ordering within a table; columns have no ordering within a row. Tables have no ordering in a schema. You are still thinking of a sequential file with contiguous storage.
You ALTER the table, add the sex_code column with proper constraints (look up ISO-5218 for details). You also missed the ISO-11179 rules for naming data elements -- "name" is too vague by itself (it begs the question, "of what?" -- town, person, etc.?), and while dob is common, "birth_date" is better since it shows the property (_date) of the data element.
When you do a SELECT or other statement, put the columns in the order you want them passed to the application program.
--CELKO-- - 28 May 2008 21:30 GMT >> I would like to add a column GENDER between columns NAME and DOB. Is possible without dropping the table FAMILY? << You have missed an basic concept of RDBMS and SQL. Data elements are accessed by a LOGICAL data element name, and not a PHYSICAL location. There is no ordering. Rows have no ordering within a table; columns have no ordering within a row. Tables have no ordering in a schema. You are still thinking of a sequential file with contiguous storage.
You ALTER the table, add the sex_code column with proper constraints (look up ISO-5218 for details). You also missed the ISO-11179 rules for naming data elements -- "name" is too vague by itself (it begs the question, "of what?" -- town, person, etc.?), and while dob is common, "birth_date" is better since it shows the property (_date) of the data element. And age is redundant given the birth date.
When you do a SELECT or other statement, put the columns in the order you want them passed to the application program.
|
|
|