Database Forum / General DB Topics / DB Theory / July 2005

A good argument for XML

arthernan@hotmail.com - 01 Jul 2005 00:00 GMT
    I have spent some time looking in this newsgroup to make sure this
has not been mentioned before and it seems like it has not.

    Most of the expressivity in SQL queries is required to build
complex reports. Common reporting tools like Crystal Reports break the
data processing in two. A tabular result set is extracted and then
post-processed to generate a report. Conceptually the reporting tool does a
last pass at the data and forms the output. This pass is often used to
process data.

    Some database engines, like SQL Server, allow a procedure to return
a result set, allowing for a procedural pass at the data before
reaching the reporting tool. But this still leaves a lot of
post-processing to be done at the reporting tool level.

    Looking at the information and shape of multiple reports, they
resemble XML (hierarchical) more than tabular data. So why not have a
better data processing tool that produces XML that should only
require light page forming to produce a report?

    That way the data processing and graphic report design can be
logically separated.

    If we were going to take that route then it does make sense to
have a query language that produces hierarchical file outputs. Today
many people do this by querying the database from a procedural language
and producing an XML document iteratively. But that leaves programmers
like me hoping for a richer syntax.

    Arturo Hernandez
Gene Wirchenko - 01 Jul 2005 00:22 GMT
>     I have spent some time looking in this newsgroup to make sure this
>has not been mentioned before and it seems like it has not.
[quoted text clipped - 5 lines]
>last pass at the data and forms the output. This Pass is often used to
>process data.

    And whether it is a cursor being processed or XML, it is
processing.

>     Some database engines allow a procedure to return a result set
>like SQL server, allowing for a procedural pass at the data before
[quoted text clipped - 8 lines]
>     That way the data processing and graphic report design can be
>logically separated.

    They are.  The output of the main processing still requires some
interpretation to generate the output.  The output of the data
gathering processing is the input to the graphics processing.  Since
the two are dependent (After all, what are you printing?), there is
some coupling.

>     If we were going to take that route then it does make sense to
>have a query language that produces hierarchical file outputs. Today
>many people do this by querying the database from a procedural language
>and producing and XML iteratively. But leaves programmers like me
>hoping for a richer syntax.

    XML is an alleged solution looking for a problem.

Sincerely,

Gene Wirchenko
arthernan@hotmail.com - 01 Jul 2005 03:53 GMT
> >processed to generate a report. Conceptually the reporting tool does a
> >last pass at the data and forms the output. This Pass is often used to
> >process data.
>
>      And whether it is a cursor being processed or XML, it is
> processing.

I am talking of hiding calculation fields that are placed on a report
just for the sake of data calculation, totaling, averaging, subreports
... I could go on and on. This is the stuff that is co-mingled with the
report page forming. If anybody wanted to see how the data
processing is happening in a report they NEED to navigate through the
page layout.

XML can have all those calculations done already, even if multiple
passes were required to get it (though multiple passes are rare). The
only things that might need to be calculated are fields like totals per
page. But the report writer can have all the fields necessary from the
XML definition, and then the fields can be placed in a GUI page forming
tool. Only links between the attributes of the XML vs the attributes of
the report would need to be established.

These are very different kinds of processing. And there is an
opportunity to distribute processes better, from a logical point of
view. Not for performance or anything else. This helps understanding of
the code and therefore maintenance, coding, productivity ....

> >many people do this by querying the database from a procedural language
> >and producing and XML iteratively. But leaves programmers like me
> >hoping for a richer syntax.
>
>      XML is an alleged solution looking for a problem.

I know what you are talking about. The arguments in favor of XML as I
have seen them in many places are not very strong. And it also makes me
wonder about the motivations behind it. In this case I am coming from a
different perspective. XML is out there. And report building does seem
to be a problem for which it might be well suited.

> Sincerely,
>
> Gene Wirchenko

Arturo Hernandez
Tom Bradford - 01 Jul 2005 13:35 GMT
>      XML is an alleged solution looking for a problem.

This is a '1997' argument whose validity is no longer acceptably spoken
as such a generalization.

The W3C has been more than happy to synthesize the problems for us
through the introduction of XML dependencies...  RDF, SVG, XHTML, et al.
  The problems are here, welcome or not.  Beyond that, there have
definitely been problems to which XML is an ideal solution.  Mixed
content is a major one.   Parts explosion is another.

There have been many who have argued that the relational model is
all-encompassing, and that is definitely true.  But just because you
'can' do something with a particular solution, it doesn't necessarily
mean that you 'should' do something with a particular solution, or even
that the solution is ideal in all cases.

Signature

Tom Bradford - http://www.tbradford.org/
 EmbedDB API - http://www.embeddb.org/
Spinneret DB - http://www.spinneret.org/

Gene Wirchenko - 01 Jul 2005 17:34 GMT
>>      XML is an alleged solution looking for a problem.
>
>This is a '1997' argument whose validity is no longer acceptibly spoken
>as such a generalization.

    No, this is an argument that I just made.  It is 2005.

>The W3C has been more than happy to synthesize the problems for us
>through the introduction of XML dependencies...  RDF, SVG, XHTML, et al.
>   The problems are here, welcome or not.  Beyond that, there have
>definitely been problems to which XML is an ideal solution.  Mixed
>content is a major one.   Parts explosion is another.

    The problem has been solved before.  XML is solving old problems.

>There have been many who have argued that the relational model is
>all-encompassing, and that is definitely true.  But just because you
>'can' do something with a particular solution, it doesn't necessarily
>mean that you 'should' do something with a particular solution, or even
>that the solution is ideal in all cases.

    And just because you can come up with an alternative does not
make that alternative the end-all and be-all.  That is the way the XML
pushers push though.

Sincerely,

Gene Wirchenko
Anne & Lynn Wheeler - 01 Jul 2005 17:47 GMT
>      The problem has been solved before.  XML is solving old problems.

gml was invented in '69 at the science center
http://www.garlic.com/~lynn/subtopic.html#545tech

by "G", "M", and "L". and gml support added to the existing cms script
processing command. later in the 70s, it was standardized in iso
as SGML
http://www.garlic.com/~lynn/subtopic.html#sgml

also in somewhat the same mid-70s time-frame ... the original
relational, sql implementation (system/r) was done on the
same platform at sjr
http://www.garlic.com/~lynn/subtopic.html#systemr

Signature

Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Tom Bradford - 01 Jul 2005 17:54 GMT
>      The problem has been solved before.  XML is solving old problems.

What was the solution for mixed content then?  There are many people
that would argue that SGML was no solution at all because the document
structure was not explicit enough on its own to imply any form of
validity or well-formedness, thus a DTD was required with every parse.
XML, in that sense was a step in the right direction, especially
considering that the infrastructure of the web was already built on top
of SGML, which for the most part, looks like XML.

>      And just because you can come up with an alternative does not
> make that alternative the end-all and be-all.  That is the way the XML
> pushers push though.

I've seen the same arguments from people like Fabian Pascal... that
somehow the relational model, which was an alternative to previous
models, is the end all be all.  Fact is that XML zealots admit that XML
is not the end all be all for data representation, and they often design
systems that are hybrids between relational storage and XML
representation/storage, leaving each respective system to do what
it does best.  You don't often find that type of optional flexibility in
relational zealots.

Signature

Tom Bradford - http://www.tbradford.org/
 EmbedDB API - http://www.embeddb.org/
Spinneret DB - http://www.spinneret.org/

Tom Ivar Helbekkmo - 01 Jul 2005 19:07 GMT
> There are many people that would argue that SGML was no solution at
> all because the document structure was not explicit enough on its
> own to imply any form of validity or well-formedness, thus a DTD was
> required with every parse.  XML, in that sense was a step in the
> right direction, [...]

Well, yes, but all it took was to turn off the various short cut
options in SGML, and you had exactly the same thing.  Those options
were bad ideas in the first place, and many of us have them off by
default.

-tih
Signature

Don't ascribe to stupidity what can be adequately explained by ignorance.

Marshall  Spight - 13 Jul 2005 04:04 GMT
> >      The problem has been solved before.  XML is solving old problems.
>
> What was the solution for mixed content then?

An insight I have had lately is that the xml people are
at heart document management people. By and large they
don't have any history with data management, and don't
understand what it is for or what it needs to do.

Then they try to evaluate things like sql *as a document
management system* and note that it doesn't work very well.
Where is the full text search? How come I can't find
related documents? What do I do about mixed content?
All questions that make sense in the document management field
and really miss the point of data management entirely.

The nasty bit seems to come into play because the xml people have
somehow gotten it into their heads that data management and
document management are really the same problem. They think
that their techniques have some relevance to the problems of
structure, integrity, and manipulation, when in fact they
don't, at all, because the two fields are almost completely disjoint.

> There are many people
> that would argue that SGML was no solution at all because the document
[quoted text clipped - 3 lines]
> considering that the infrastructure of the web was already built on top
> of SGML, which for the most part, looks like XML.

Note how much this paragraph has to do with document management
and how it has nothing to do with data management.

> >      And just because you can come up with an alternative does not
> > make that alternative the end-all and be-all.  That is the way the XML
[quoted text clipped - 8 lines]
> they do best.  You don't often find that type of opional flexibility in
> relational zealots.

Speaking as a relational zealot, let me just say that there are lots
of document management issues to which I don't think the field of
data management has much to offer. Full text indexing, for example.

I still don't see that xml has any least interesting thing to
contribute to the field of data management, though. I suppose it
might, but you'd think if it did, someone could point out something,
something about *data management* and not *document management* that
is better/easier/cleaner/simpler with xml than with the RM. Or even
than just with sql.

Of course, this does nothing to explain away people like Jan Hidders,
who clearly have a deep understanding of the RM and yet remain
fascinated with xml. I have to say I find that a puzzle.

Marshall
arthernan@hotmail.com - 13 Jul 2005 16:20 GMT
>An insight I have had lately is that the xml people are
>at heart document management people. By and large they
>don't have any history with data management.

>The nasty bit seems to come in to play because the xml people have
>somehow gotten it into their heads that data management and
>document management are really the same problem

Very interesting ideas. Really.

>I still don't see that xml has any least interesting thing to
>contribute to the field of data management.

My argument in favor of "XML", or rather hierarchical output, is not
what xml/document management would contribute to data management, but
the opposite. Every day I deal with the need to create documents out of
relational databases. I want to leave the data relational. But for
document construction I have to somehow make a data structure that in
many instances looks more like a hierarchy of data.

XQuery is an XML query language with XML output, and it suggests the
concept of an XML database. What I want is a query syntax that will allow
me to create hierarchical data out of a relational database, and then
this output could be made into XML. To me XML is to hierarchical data
what CSV (Comma Separated Values) could be to tabular data.

I know I can query the data and then let's say in VB or PL/SQL compose
an XML document. But there ought to be a better solution than that.
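
What the procedural approach described above tends to look like can be sketched as follows. This is only an illustration under stated assumptions: Python with its built-in sqlite3 and xml.etree.ElementTree modules stands in for VB or PL/SQL, and the Customers/Projects tables, their columns, the sample rows, and the function name customers_to_xml are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical two-table schema, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustId INTEGER PRIMARY KEY, Name TEXT, City TEXT);
    CREATE TABLE Projects  (ProjId INTEGER PRIMARY KEY, CustId INTEGER, Name TEXT);
    INSERT INTO Customers VALUES (1, 'Acme', 'Austin'), (2, 'Globex', 'Boston');
    INSERT INTO Projects  VALUES (10, 1, 'Billing'), (11, 1, 'Payroll'), (12, 2, 'Audit');
""")

def customers_to_xml(conn):
    """Run one tabular join, then fold the flat rows into a two-level XML tree."""
    rows = conn.execute("""
        SELECT c.CustId, c.Name, c.City, p.ProjId, p.Name
        FROM Customers c LEFT JOIN Projects p ON p.CustId = c.CustId
        ORDER BY c.CustId, p.ProjId
    """).fetchall()
    root = ET.Element("customers")
    # groupby relies on the ORDER BY above keeping each customer's rows adjacent.
    for (cust_id, name, city), group in groupby(rows, key=lambda r: r[:3]):
        cust = ET.SubElement(root, "Customer", CustId=str(cust_id))
        ET.SubElement(cust, "Name").text = name
        ET.SubElement(cust, "City").text = city
        projects = ET.SubElement(cust, "projects")
        for row in group:
            if row[3] is not None:  # LEFT JOIN gives NULLs for customers without projects
                proj = ET.SubElement(projects, "project", ProjId=str(row[3]))
                ET.SubElement(proj, "Name").text = row[4]
    return root

print(ET.tostring(customers_to_xml(conn), encoding="unicode"))
```

In this sketch the hierarchy exists only in the loop: the query itself still returns a flat result, and the nesting has to be reconstructed row by row in procedural code.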

Arturo Hernandez
Marshall  Spight - 13 Jul 2005 03:35 GMT
> >      XML is an alleged solution looking for a problem.
>
[quoted text clipped - 3 lines]
> The W3C has been more than happy to synthesize the problems for us
> through the introduction of XML dependencies...  RDF, SVG, XHTML, et al.

I see only one way to parse this sentence, and in it,

>    The problems are here, welcome or not.  Beyond that, there have
> definitely been problems to which XML is an ideal solution.  Mixed
[quoted text clipped - 10 lines]
>   EmbedDB API - http://www.embeddb.org/
> Spinneret DB - http://www.spinneret.org/
Jan Hidders - 01 Jul 2005 21:12 GMT
>      Looking at the information and shape of multiple reports, they
> resemble more XML (hierarchical) than tabular data. So why not have a
[quoted text clipped - 9 lines]
> and producing and XML iteratively. But leaves programmers like me
> hoping for a richer syntax.

Richer than, say, SQL-2003 or Date & Darwen's "Tutorial-D"? Note that
these also allow you to define hierarchical data-structures (although
not as unrestricted as XQuery would) but are easier to optimize than
XQuery and are therefore much more likely to give you scalable solutions.

-- Jan Hidders
arthernan@hotmail.com - 13 Jul 2005 01:28 GMT
> Richer than, say, SQL-2003 or Date & Darwen's "Tutorial-D"? Note that
> these also allow you to define hierarchical data-structures (although
> not as unrestricted as XQuery would) but are easier to optimize than
> XQuery and are therefore much more likely to give you scalable
> solutions.
>
> -- Jan Hidders

OK, I had to do some reading before I answered. I looked at Rel's
documentation, which claims to be the most thorough implementation of
Tutorial D. I also read

SQL:2003 Has Been Published by  Andrew Eisenberg, Krishna Kulkarni, Jim
Melton, Jan-Eike Michels, Fred Zemke
http://www.sigmod.org/record/issues/0403/index.html#standards

which in turn referred to

SQL/XML is making good progress by Andrew Eisenberg
http://portal.acm.org/citation.cfm?id=565141&coll=portal&dl=ACM

I also have read about XQuery.

I did not find anything about Tutorial D that produced a hierarchical
output syntactically. Maybe I am looking in the wrong place.

SQL/XML looks very similar to XQuery. XQuery looks interesting
for querying hierarchical data to produce hierarchical data, but databases
are mostly relational and the syntax of XQuery for this purpose seems
overly redundant. Here is an extract from "SQL/XML is making good
progress":

<highemps>
   { for $e in table ("Sample_db",
                      "Andrew",
                "*password*",
                "HR.ADMIN.EMPLOYEE"
                      ) /EMPLOYEE/row
     where $e/SALARY > 40000
     return
         <emp>
            {  $e/FIRSTNAME, $e/LASTNAME }
         </emp>
   }
</highemps>

the syntax:     "for $e in table     where       return"

Is very reminiscent of SQL but there is so much "extra" stuff here. The
parameters for the TABLE function are strings; why? Shouldn't there
be an easier way? "/EMPLOYEE/row" is again something that does not add
to the operation being done. I could go on and on.

I am not trying to blast the SQL/XML committee. They are proposing a
solution. In my thinking they are just too focused on the XML aspect
rather than the broader hierarchical output requirement.

To illustrate: the statement "select lastname,firstname from employee"
uses the select clause to specify column names and positions. The
output is clearly tabular whether stored as text, comma separated or
even XML.

I could only imagine having to write

for $e in table employee
     where $e/SALARY > 40000
     return '"'||$e/FIRSTNAME||'","'||$e/LASTNAME||'"'

Just to get a tabular output. As defined on some SQL/CSV standard.

I am not a database theorist. But I do use databases on a daily basis.
And I will read "any" reference given to me.

Arturo Hernandez
Jan Hidders - 13 Jul 2005 21:39 GMT
> I am not a database theorist. But I do use databases on a daily basis.
> And I will read "any" reference given to me.

From a practical point of view the SQL/XML and XQuery approaches will
probably have industrial strength implementations soon that you can
actually use. I know for example that DataDirect Technologies has an
XQuery implementation in beta that allows you to access your relational
data and their SQL/XML implementation is already in 2.0.

-- Jan Hidders
Gene Wirchenko - 13 Jul 2005 22:15 GMT
>> I am not a database theorist. But I do use databases on a daily basis.
>> And I will read "any" reference given to me.
>
> From a practical point of view the SQL/XML and XQuery approaches will
>probably have industrial strength implementations soon that you can
>actually use. I know for example that DataDirect Technologies has an

    Real Soon Now, eh?

>XQuery implementation in beta that allows you to access your relational
>data and their SQL/XML implementation is already in 2.0.

Sincerely,

Gene Wirchenko
Jan Hidders - 13 Jul 2005 23:06 GMT
>>>I am not a database theorist. But I do use databases on a daily basis.
>>>And I will read "any" reference given to me.
[quoted text clipped - 4 lines]
>
>      Real Soon Now, eh?

Yep. Right now it's still in beta but we already had the honour of using
it and as someone who does research on query language implementation I
can confirm that it is already quite mature.

-- Jan Hidders
Gene Wirchenko - 14 Jul 2005 01:27 GMT
[snip]

>>>From a practical point of view the SQL/XML and XQuery approaches will
>>>probably have industrial strength implementations soon that you can
>>>actually use. I know for example that DataDirect Technologies has an

>>      Real Soon Now, eh?
>
>Yep. Right now it's still in beta but we already had the honour of using
>it and as someone who does research on query language implementation I
>can confirm that it is already quite mature.

    "still in beta" and "already quite mature" are not synonyms.

Sincerely,

Gene Wirchenko
dawn - 14 Jul 2005 01:50 GMT
> [snip]
>
[quoted text clipped - 9 lines]
>
>      "still in beta" and "already quite mature" are not synonyms.

Nor are they mutually exclusive, even if it is only the rare product
that can be described by both.  smiles.  --dawn

> Sincerely,
>
> Gene Wirchenko
Mikito Harakiri - 13 Jul 2005 22:18 GMT
> and their SQL/XML implementation is already in 2.0.

It is well known that some very big name database vendor labeled their
first release as 2.0, because they figured out that nobody would bother
buying release 1.0.
Jan Hidders - 13 Jul 2005 23:17 GMT
>>and their SQL/XML implementation is already in 2.0.
>  
> It is well known that some very big name database vendor labeled their
> first release as 2.0, because they figured out that nobody would bother
> buying release 1.0.

It's not a DBMS and, yes, there was a 1.0 edition.

-- Jan Hidders
arthernan@hotmail.com - 14 Jul 2005 01:08 GMT
> actually use. I know for example that DataDirect Technologies has an
>XQuery implementation in beta that allows you to access your relational
>data and their SQL/XML implementation is already in 2.0.
>
>-- Jan Hidders

I looked at DataDirect Technologies' website and I found a few examples
of their SQL/XML implementation. Here is the link
http://www.datadirect.com/developer/xquery/topics/sqlxml-xquery-nativexml/index.ssp

This is an attempt to justify in detail parts of what I have been
"argumenting" in general. Here is a code example from the link
above. I took off some aliasing that I thought would be superfluous for
the discussion:

select CustId,
    xmlelement(name "Customer",
        xmlattributes(CustId),
        xmlforest(c.Name, City),
        xmlelement(name projects,
        (select xmlagg(xmlelement(name project,
            xmlattributes(ProjId),
            xmlforest(p.name)))
        from Projects p
        where p.CustId=c.CustId))) as "customer-projects"
from Customers c

On first impression this code looks complex and verbose. It is better
than what XQuery provides. I can already see part of the source of the
animosity towards XML. Why are all the functions prefixed with XML?
When SQL was developed there wasn't any SQLSELECT SQLFROM SQLWHERE.
It's almost like they are drilling it into my head, and I end up typing
more code. It is also harder to read. If the emphasis had been on
hierarchical output instead, the code could have looked like:

select CustId,
     node(name Customer, attribute CustId, value
          nodes(node(c.Name),
                node(City),
                (select aggnodes(node(name Project, attribute ProjID,
                                      value node(p.Name)))
                 from Projects p where p.CustId=c.CustId))) as "customer-projects"
from Customers c
I personally think the second is easier to read. This is closer to a
better syntactic solution than what I had seen before. Not my code
example but the concept. One thing that still looks awkward to me is
the combination of keywords and functions. I know people are familiar
with SQL but select(select-clause, from-clause, where-clause, order-by,
having......) could be better since functions do seem more hierarchical
than statements. But again if we leave it to our marketing side of the
brain......

Let's see how that would look:

select CustId,
     node("Customer", att(CustId),
          node(node(c.Name),
               node(City),
               selectnodes(
                     node("Project", att(ProjID), node(p.Name)),
                     Projects p, p.CustId=c.CustId))) as "customer-projects"
from Customers c

I think this is easier to read. I am assuming overloading of the
function node, and I eliminated the "name" keyword from the first
parameter since Oracle does it on their xmlelement implementation. The
overloading allows doing away with the difference between "xmlforest"
and "xmlelement". I also bound "select" and "xmlagg" together.

Anyway I am not trying to make up my own language. I am going to look
into possibly using SQL/XML.

Arturo Hernandez
Mikito Harakiri - 14 Jul 2005 17:33 GMT
http://www.datadirect.com/developer/xquery/topics/sqlxml-xquery-nativexml/index.ssp

> This is an attempt to justify in detail parts of what I have been
> "argumenting" in general. Here is a code example form the link
[quoted text clipped - 12 lines]
>         where p.CustId=c.CustId))) as "customer-projects"
> from Customers c

This is supposedly simpler than

select c.*, p.*
from Projects p, Customers c
where p.CustId=c.CustId

right? BTW, where is the hierarchy that XML nuts are so much talking about?

Quote from the article:
"To many intelligent and articulate XML programmers, "XML is just
Unicode with pointy brackets" is almost a statement of faith."

What a misnomer. "intelligent XML programmer"
arthernan@hotmail.com - 14 Jul 2005 21:59 GMT
>select c.*, p.*
>from Projects p, Customers c
>where p.CustId=c.CustId
>
>right? BTW, where is hierarchy that XML nuts are so much talking about?

I think you are missing the point. Looking at the first example with
the ugly XML prefixes, I think it would be safe to assume there are
many Projects per Customer. So there are just two hierarchy levels in
this example.

Reporting tools easily convert a table result like this into a
hierarchy, where customer data can appear as a group header and project
data as detail data.  I doubt that anybody would like to present a
report without a group header; assuming the header contains several
fields, which is very common.

Now the following is a more dramatic example to illustrate the concept.
This comes from an actual production report. I will just write out
table names and indent to represent the hierarchy.

event
     participant
           agenda
           survey_headings
           ce_data
           faculty
           participant

OK, there is more but I'll leave it there. Currently multiple statements
are issued to the database for every node under the participant node.
That is, for every participant in a given event, which makes for a lot of
round trips to the server and a lot of scattered code.

If I wanted to put them all in the same tabular result I would have to
include 7 tables in the same FROM clause, and maybe have a cross
product between 5 tables. Somehow the reporting tool would decipher
that? All this if we want to avoid a hierarchical output from the
database where the final report output is clearly hierarchical.
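
The row multiplication behind that cross product can be seen in a small sketch. It is written in Python with the built-in sqlite3 module purely for illustration; the column names and sample rows are assumptions, and only two of the sibling tables under participant are modeled.

```python
import sqlite3

# Hypothetical, cut-down version of the participant subtree:
# one participant with 2 agenda rows and 3 ce_data rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participant (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE agenda  (id INTEGER PRIMARY KEY, participant_id INTEGER, subject TEXT);
    CREATE TABLE ce_data (id INTEGER PRIMARY KEY, participant_id INTEGER, form TEXT);
    INSERT INTO participant VALUES (1, 'Peter');
    INSERT INTO agenda  VALUES (1, 1, 'Day 1'), (2, 1, 'Day 2');
    INSERT INTO ce_data VALUES (1, 1, 'Form A'), (2, 1, 'Form B'), (3, 1, 'Form C');
""")

# Joining two sibling child tables to the same parent pairs every agenda
# row with every ce_data row of that participant.
flat = conn.execute("""
    SELECT p.name, a.subject, c.form
    FROM participant p
    JOIN agenda  a ON a.participant_id = p.id
    JOIN ce_data c ON c.participant_id = p.id
""").fetchall()

# Number of child rows that actually exist across both tables.
child_rows = conn.execute(
    "SELECT (SELECT COUNT(*) FROM agenda) + (SELECT COUNT(*) FROM ce_data)"
).fetchone()[0]

print(len(flat), child_rows)  # prints: 6 5
```

With only two sibling tables the flat result already holds 2 x 3 = 6 rows for 5 real child rows, and each additional sibling table multiplies the row count again.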

I am still considering Jan Hidders' suggestion. I just have not had
enough time; I don't even know if what I looked at was SQL/XML v2 or
v1.

Arturo Hernandez
Mikito Harakiri - 14 Jul 2005 23:02 GMT
> >select c.*, p.*
> >from Projects p, Customers c
[quoted text clipped - 9 lines]
> Reporting tools easily convert a table result like this into a
> hierarchy,

Excuse me, but a hierarchy with 2 levels is not really a hierarchy.

> where customer data can appear as a group header and project
> data as detail data.  I doubt that anybody would like to present a
> report without a group header; assuming the header contains several
> fields, which is very common.

Nonsense. Displaying master-detail is an elementary application
programming skill.

> Now the following is a more dramatic example to illustrate the concept.
> This comes from an actual production report. I will just write out
[quoted text clipped - 7 lines]
>             faculty
>             participant

So you have several tables, and the join graph is a tree? Nothing
special here.

> OK there is more but I'll leave it there. Currently multiple statements
> are issued to the database for every node under the participant node.
> That is for every participant in a given event that makes for a lot of
> roundtrips to the server, and a lot of sparse code.

How big is the report? There is a natural limit on a human's ability to
digest information. When the data volume is big the front end is forced
to present the information in a compact form. It would be either
1. A Report with Aggregated Data, or
2. Local Data view
By local view of the data I mean that there is no aggregation, but
simple data cutoff, e.g. only a page of the big list is shown, or only
a node with its direct children is displayed. How do you want to
present the information to the end user in your case and not overwhelm
him?

Multiple round trips to a server are not a problem. I'm talking about
basics of application programming performance. It is a problem when one
of your queries returns a list of data, and you issue a new query once
for each row returned.  Once again, if you designed your front-end
correctly, such that your data display is not a humongous mess, then
it's not hard to avoid this problem.
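
The difference between the two query patterns can be sketched as follows. This is an illustration only, in Python with the built-in sqlite3 module; the participant and agenda columns, the 300-row sample size, and the function names per_row and per_table are assumptions, not anything taken from a real schema.

```python
import sqlite3

# Hypothetical schema: one event with 300 participants, two agenda rows each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participant (id INTEGER PRIMARY KEY, event_id INTEGER, name TEXT);
    CREATE TABLE agenda (id INTEGER PRIMARY KEY, participant_id INTEGER, subject TEXT);
""")
conn.executemany("INSERT INTO participant VALUES (?, 1, ?)",
                 [(i, f"person{i}") for i in range(1, 301)])
conn.executemany("INSERT INTO agenda (participant_id, subject) VALUES (?, ?)",
                 [(i, s) for i in range(1, 301) for s in ("Day 1", "Day 2")])

def per_row(conn, event_id):
    """One query for the list, then one more query for each row returned."""
    ids = conn.execute("SELECT id FROM participant WHERE event_id = ?",
                       (event_id,)).fetchall()
    queries = 1
    result = {}
    for (pid,) in ids:
        result[pid] = [s for (s,) in conn.execute(
            "SELECT subject FROM agenda WHERE participant_id = ? ORDER BY id", (pid,))]
        queries += 1
    return result, queries

def per_table(conn, event_id):
    """One query per table; child rows are grouped by parent key in memory."""
    result = {pid: [] for (pid,) in conn.execute(
        "SELECT id FROM participant WHERE event_id = ?", (event_id,))}
    for pid, subject in conn.execute("""
            SELECT a.participant_id, a.subject
            FROM agenda a JOIN participant p ON p.id = a.participant_id
            WHERE p.event_id = ? ORDER BY a.id""", (event_id,)):
        result[pid].append(subject)
    return result, 2

slow, n_slow = per_row(conn, 1)
fast, n_fast = per_table(conn, 1)
print(n_slow, n_fast, slow == fast)  # prints: 301 2 True
```

Both functions return the same nested structure; the first issues a number of statements that grows with the row count, while the second stays at one statement per table regardless of how many rows come back.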

You are seriously deluded if you think XML would deliver any scalable
solution out of the box. By definition XML=slooooooow.

> If I wanted to put them all on the same tabular result I would have to
> include 7 tables on the same FROM-CLAUSE. And maybe have a cross
> product between 5 tables.

A cross product smells of a design error. If you need data from 7 views,
shoot 7 queries. An extra round trip to a server is not in itself an evil.
It is bad when these trips are out of control, that is, when they are
buried in some loop within application programming code. Also, the data
from 7 queries sounds like a big volume of data to me. An end-user is not
going to digest all this information in a second. The latency of the user
processing your report indicates that, perhaps, he can afford waiting a
minute or two while the report is generating.

> Somehow the reporting tool would decipher
> that? All this if we want to avoid a hierarchical output from the
> database where the final report output is clearly hierarchical.

What is "clearly hierarchical" about it? If your front-end uses a tree
or tree table widget, then I will be convinced. Though, I don't see any
use for XML in this case, either.
arthernan@hotmail.com - 15 Jul 2005 16:46 GMT
>> event
>>       participant
[quoted text clipped - 6 lines]
>So you have several tables, and the join graph is a tree? Nothing
>special here.

I don't think you are giving me the benefit of the doubt. I'll be more
specific. An event in our database is a class held in a hotel or the like.
Each participant gets a packet with several pages which include an
agenda, a participant list, faculty, CE forms and a survey form, and this
is probably just a third of the current contents of the report.

Now how many roundtrips? We have classes of 300 people so you do the
math.

Now let me save some time here. Someone could say aha!! you are talking
about separate reports. Well this report is made into a single
postscript file one class at a time. And then it's sent to be printed
by a third party.

>going to digest all this information in a second. The latency of user
>processing your report indicates that, perhaps, he can afford waiting a
>minute or two while the report is generating.

You are correct; the user can wait on this one. And for the size of the
report 5 to 20 minutes is actually pretty impressive. But maintaining
these kinds of reports is a nightmare. All the queries are separated
from each other. And fields that were already queried at the top level
"have" to be queried again. I wonder if the self-imposed tabular output
restriction is preventing the report from running in less than a
minute.

Any report can be made from tabular results; that is how most people
traditionally do them. Tools have been developed so a simple two level
hierarchy is easy to handle. Even this complex report is working with
tabular input. The finished content in this case is clearly
hierarchical, and it is in a substantial number of cases.

Arturo Hernandez
Mikito Harakiri - 15 Jul 2005 18:08 GMT
> >> event
> >>       participant
[quoted text clipped - 12 lines]
> agenda, a participant list, faculty, CE forms, a survey form and this
> is probably just a third of the current contents of the report.

Does every participant get an identical package or not? As they
participate in the same event I assume they do.

> Now how many roundtrips? We have classes of 300 people so you do the
> math.
[quoted text clipped - 21 lines]
> tabular input. The finished content in this case is clearly
> hierarchical, and it is in a substantial amount of cases.

I still fail to understand why the output is hierarchical. One way to
improve communication is to express your example formally in SQL. About
1 page, no more, although you can indicate places where the scale goes
up.

Then, we have a discussion.
arthernan@hotmail.com - 15 Jul 2005 19:30 GMT
>Does every participant gets identical package or not? As they
>participate in the same event I assume they do.

I would say 70 to 80 percent is identical because their personal data
is prefilled in many places.

>I still fail to understand why the output is hierarchical. One way to
>improve communication is to express your example formally in SQL. About
>1 page, no more, although you can indicate places where the scale goes
>up.
>
>Then, we have a discussion.

I hear the part about writing formal SQL. Let me try this first. The
Event is root level, Participant is second level, Faculty would be
third level. I personally don't care much for XML as a "Brand" but I'll
use it to illustrate.

<event>
  <location>Singapore</location>
  <startday>1-Jan-2006</startday>
  <participant>
     <FirstName>Peter</FirstName>
     <LastName>Gonzales</LastName>
     <agenda>
        <item>
           <day>1</day>
           <subject>Why XML is bad</subject>
        </item>
        <item>
           <day>2</day>
           <subject>Is Bill Gates related to Bush? </subject>
        </item>
     </agenda>
     <ce_forms>
        <form> .....  </form>
        <form> .....  </form>
     </ce_forms>
  </participant>
  <participant>
           ...
  </participant>
</event>

OK, this is just a portion of what the XML could look like. It looks a
lot more like the final content than a very big spreadsheet does. That is
my whole argument. I don't think anybody intends that reports should be
big spreadsheets. But even from the report design perspective, being
able to use only one query to produce the data to fill out the
entire report seems more manageable. It's like a table of contents.
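
For comparison, the usual workaround of querying from a procedural language and building the XML iteratively might look roughly like the sketch below. The table and column names, the sample rows, and the function build_event_xml are all assumptions made up for illustration; they are not the production schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, location TEXT, startday TEXT);
    CREATE TABLE participant (id INTEGER PRIMARY KEY, event_id INTEGER,
                              first_name TEXT, last_name TEXT);
    CREATE TABLE agenda_item (participant_id INTEGER, day INTEGER, subject TEXT);
    INSERT INTO event VALUES (1, 'Singapore', '1-Jan-2006');
    INSERT INTO participant VALUES (10, 1, 'Peter', 'Gonzales');
    INSERT INTO agenda_item VALUES (10, 1, 'Why XML is bad'), (10, 2, 'Second topic');
""")

def build_event_xml(conn, event_id):
    """Nest tabular query results into an <event> element tree."""
    location, startday = conn.execute(
        "SELECT location, startday FROM event WHERE id = ?", (event_id,)
    ).fetchone()
    event = ET.Element("event")
    ET.SubElement(event, "location").text = location
    ET.SubElement(event, "startday").text = startday

    participants = conn.execute(
        "SELECT id, first_name, last_name FROM participant "
        "WHERE event_id = ? ORDER BY id", (event_id,)
    ).fetchall()
    for pid, first, last in participants:
        p = ET.SubElement(event, "participant")
        ET.SubElement(p, "FirstName").text = first
        ET.SubElement(p, "LastName").text = last
        agenda = ET.SubElement(p, "agenda")
        # One extra query per participant: the roundtrip cost discussed above.
        for day, subject in conn.execute(
            "SELECT day, subject FROM agenda_item "
            "WHERE participant_id = ? ORDER BY day", (pid,)
        ):
            item = ET.SubElement(agenda, "item")
            ET.SubElement(item, "day").text = str(day)
            ET.SubElement(item, "subject").text = subject
    return event

root = build_event_xml(conn, 1)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The nesting logic lives in loops outside the query, which is the part a richer query syntax would presumably absorb.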

That last sentence just made me think of another example. This report
is hierarchical in the same sense many books are:

book
  introduction
     paragraph
        sentence
  chapter 1
     section 1.1
     section 1.2
     exercises
     bibliography
  chapter 2

My contention is that even when a simple master-detail report could be
done from either tabular or hierarchical input, it might be better to
design the data transformation in the query rather than in the report
formation. There are gray areas here. I would like to hear from
somebody who might have tried it and see how his/her experience
turned out. And I will admit that until I test this idea I am not completely
sure how it will turn out. I am just frustrated with the continuous
struggle between scattered code in the report writer and SQL functions or
views. And the bottleneck looks like it is the tabular output.

Arturo Hernandez
Lauri - 15 Jul 2005 20:05 GMT
> My contention is that even when a simple master-detail report could be
> done from tabular or hierarchical input. That it might be better to
[quoted text clipped - 5 lines]
> strugle between sparse code on the report writer and sql functions or
> views. And the bottle neck looks like it is the tabular output.

There was a related discussion in a previous thread about how nested
relations could help.  Of course the current DBMS'es don't support such
constructs, so this is just "theory".

see: http://tinyurl.com/clk43

regards,
Lauri Pietarinen
Mikito Harakiri - 15 Jul 2005 20:38 GMT
> The
> Event is root level, Participant is second level, Faculty would be
[quoted text clipped - 32 lines]
> the ability of using only one query to produce the data to fill out the
> entire report seems more manageable. It's like a table of contents.

Interesting. Reduced to essentials you have

Participants ---< Registrations >--- Sessions

Unfortunately, the report is not a simple list of the participants with
the sessions they are registered in. This report is fragmented and
buried somewhere in the bottom or middle layer of the document you are
producing.

Let me take the other perspective, though. I'm a participant and I want
to know what sessions I'm registered to participate in tomorrow. OK, look
it up in the Event brochure, browse through the table of contents, find the
right page, see my schedule. This traditional way of searching for
information in a booklet is slowly dying. Today, I just go to the web
page, log in, get to my schedule page, and print it. With a mobile device
I don't even have to print anything.

In short, you are fighting the mess of things organized the wrong way.
There is no future in it. This conclusion is not really helpful for solving
your real problem, but this is the theory group, right;-?

P.S. Spoken by somebody who is still doing things the traditional way
-- writing a book.
Marshall  Spight - 15 Jul 2005 18:09 GMT
> >> event
> >>       participant
[quoted text clipped - 15 lines]
> Now how many roundtrips? We have classes of 300 people so you do the
> math.

I didn't quite follow that. Could you be specific? You have one
roundtrip per what? Participant, event, which?

Marshall
arthernan@hotmail.com - 18 Jul 2005 17:28 GMT
>I didn't quite follow that. Could you be specific? You have one
>roundtrip per? Participant, event, which?
>
>Marshall

I *think* it is for every subreport, every time it displays the page.
About 10 subreports per participant and classes up to 300 participants.
Of those you can count on 5 returning the same rows every time, and the
rest return personalized data. This is using Crystal Reports.
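
A rough sketch of that arithmetic, assuming one roundtrip per subreport per participant (which is only my guess at how the tool behaves, not something measured); the function names are made up for illustration:

```python
def roundtrips_naive(participants: int, subreports: int) -> int:
    """Every subreport is queried again for every participant."""
    return participants * subreports

def roundtrips_with_shared_cached(participants: int, subreports: int, shared: int) -> int:
    """Shared subreports are queried once; personalized ones once per participant."""
    personalized = subreports - shared
    return shared + participants * personalized

# Figures from the post: up to 300 participants, about 10 subreports,
# about 5 of which return the same rows every time.
naive = roundtrips_naive(300, 10)
cached = roundtrips_with_shared_cached(300, 10, 5)
print(naive, cached)
```

Under these assumptions, reusing the shared results roughly halves the number of queries, but the personalized subreports still dominate.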

Arturo Hernandez
Gene Wirchenko - 15 Jul 2005 00:08 GMT
[snip]

>Now the following is a more dramatic example to illustrate the concept.
>This comes from an actual production report. I will just write out
[quoted text clipped - 7 lines]
>            faculty
>            participant
            ^^^^^^^^^^^
    ???

>OK there is more but I'll leave it there. Currently multiple statements
>are issued to the database for every node under the participant node.
[quoted text clipped - 3 lines]
>If I wanted to put them all on the same tabular result I would have to
>include 7 tables on the same FROM-CLAUSE. And maybe have a cross

    So?

>product between 5 tables. Somehow the reporting tool would decipher
>that? All this if we want to avoid a hierarchical output from the

    Easily.  Create a view.

>database where the final report output is clearly hierarchical.

    And what about another "hierarchical report" with a different set
of joins?  You are stuck with your first hierarchy.

         invoice
              invoice line
                   product
is fine for printing invoices.  It is not so good when you want to
know what invoices a given product appears in.  I had that exact
problem in an accounting system.  I ended up having to write a program
to grovel through the batches.

    The beauty of relationally-organised data is that many more
questions can be easily answered.
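
A small sketch of the invoice example, with made-up invoice numbers and product names. The nested dictionary stands in for hierarchically stored data, and an in-memory SQLite table stands in for the relational version; neither reflects the actual accounting system.

```python
import sqlite3

# Hierarchical storage: a product is reachable only through its invoice.
invoices = {
    "INV-1": [{"product": "widget", "qty": 2}, {"product": "gadget", "qty": 1}],
    "INV-2": [{"product": "gadget", "qty": 5}],
    "INV-3": [{"product": "widget", "qty": 1}],
}

def invoices_containing(tree, product):
    """Answer the 'inverse' question by walking every invoice and line."""
    return sorted(
        inv for inv, lines in tree.items()
        if any(line["product"] == product for line in lines)
    )

# Relational storage: the same facts as rows, with no built-in direction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice_line (invoice TEXT, product TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO invoice_line VALUES (?, ?, ?)",
    [(inv, line["product"], line["qty"])
     for inv, lines in invoices.items() for line in lines],
)

# Both directions are the same kind of query.
by_product = [r[0] for r in conn.execute(
    "SELECT DISTINCT invoice FROM invoice_line WHERE product = ? ORDER BY invoice",
    ("widget",))]
by_invoice = [r[0] for r in conn.execute(
    "SELECT product FROM invoice_line WHERE invoice = ? ORDER BY product",
    ("INV-1",))]
print(invoices_containing(invoices, "widget"), by_product, by_invoice)
```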

[snip]

Sincerely,

Gene Wirchenko
arthernan@hotmail.com - 18 Jul 2005 18:09 GMT
>     The beauty of relationally-organised data is that many more
>questions can be easily answered.

I couldn't agree more, the database is and should stay relational. Or
at least that is the way I am thinking now.

>>product between 5 tables. Somehow the reporting tool would decipher
>>that? All this if we want to avoid a hierarchical output from the
>
>     Easily.  Create a view.

How is a view going to solve the problem? There is an agenda and a
list of participants for the class; each participant gets both, and they
are both tied to an event. Put them in the same query and you get a
cross product. How will a view fix this problem?
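
To make the cross product concrete, here is a minimal sketch using an in-memory SQLite database. The tables, the sample rows, and the view name packet are assumptions for illustration only, not our real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, location TEXT);
    CREATE TABLE agenda_item (event_id INTEGER, day INTEGER, subject TEXT);
    CREATE TABLE participant (event_id INTEGER, name TEXT);
    INSERT INTO event VALUES (1, 'Singapore');
    INSERT INTO agenda_item VALUES (1, 1, 'Topic A'), (1, 2, 'Topic B');
    INSERT INTO participant VALUES (1, 'Peter'), (1, 'Ann'), (1, 'Luis');

    -- Agenda and participants are siblings under the event, so joining
    -- both to the event pairs every agenda item with every participant.
    CREATE VIEW packet AS
    SELECT a.subject, p.name
    FROM event e
    JOIN agenda_item a ON a.event_id = e.id
    JOIN participant p ON p.event_id = e.id;
""")

combined = conn.execute("SELECT * FROM packet").fetchall()
agenda = conn.execute(
    "SELECT subject FROM agenda_item WHERE event_id = 1").fetchall()
people = conn.execute(
    "SELECT name FROM participant WHERE event_id = 1").fetchall()

# The view returns agenda x participants rows; two separate queries
# return agenda + participants rows.
print(len(combined), len(agenda), len(people))
```

Wrapping the join in a view changes where the SQL lives, not the shape of the result: the rows are still the product of the two sibling lists.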

Arturo Hernandez
 