On Dec 18, 4:12 am, "Roy Hann" <specia...@processed.almost.meat>
wrote:
> I've got a couple of customers who have found out the hard way that parsing
> gargantuan XML documents can kill the biggest systems. Presented with the
> option of using schema-based transformation or switching to files of
> fixed-width fields they both opted to abandon XML entirely and are delighted
> with their new fixed-width field files.
Or use an XML-compliant structure:
<comma_separated_doc>
a,b,c,d
e,f,g,h
...
</comma_separated_doc>
It is much funnier to read on thedailywtf.com, but I'm unable to recover
the reference :-)
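On the parsing cost mentioned in the quoted text: a common way to keep a huge document from exhausting memory is to stream it instead of building the whole tree. The following is only a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the element name "row" and the in-memory sample document are made up for illustration and are not taken from the customers' data.

```python
import io
import xml.etree.ElementTree as ET


def count_rows_streaming(source, tag="row"):
    """Count elements named `tag` while parsing the document incrementally."""
    count = 0
    # iterparse yields each element as soon as its end tag has been read,
    # so the parser never needs the complete document up front.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Release the row's children and text once it has been handled.
            # The emptied element itself still hangs off the root element.
            elem.clear()
    return count


# Hypothetical sample document, generated in memory for illustration only.
sample = b"<doc>" + b"".join(
    b"<row><a>%d</a><b>x</b></row>" % i for i in range(1000)
) + b"</doc>"

print(count_rows_streaming(io.BytesIO(sample)))
```

Streaming trades convenience for memory: there is no random access to earlier elements, so whatever is needed from a row has to be extracted before it is cleared.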
>> The thing that makes XML attractive to some people is not that it
>> would be a good basis on which to build a database, but that it seems
>> convenient for data exchange.
>
> Is XML actually attractive to significant numbers of people?
I'd divide the answer into two. For the original use SGML/XML and
related markup languages (like HTML) were meant for, i.e. markup of
running text compatible with plain text editors, they still hold quite a
lot of value.
The match for data is much weaker. The only real value I can find in XML
there is that currently we don't really have a commonly accepted and
implemented format in which to serialize datasets. At least serializing
in-transit ones into a metaformat that all the other people use gives us
some minimal sharing of tools, like parsers, browsers and APIs.
Sometimes we even get to the point of serializing to a proper, shared,
concrete format (say vCal or RSS), so that the compatibility runs
deeper.
But of course that is not how most people use XML; they use it for
everything. Then Bad Things Happen: using XML implies inserting an extra
layer of complexity into your system. Its textual encoding entails
bloat. The rich hierarchical structure which serves text so well invites
all of the usual data modelling mistakes. People worry too much about
the metaformat to lay the proper groundwork for the actual formats. Data
ends up being locked up inside a serialization, when we'd really like
random access. Tying applications to a serialization, i.e. a physical
data format, makes data independence impossible to achieve. And of
course the rest of the big picture, like data language design,
transactional guarantees and integrity still haven't been solved, so
that everybody ends up cooking their own and more often than not getting
it horribly wrong. The list goes on.
XML has its uses. Unfortunately the hype causes people to misapply it.
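To make the bloat and random-access points concrete, here is a rough sketch that serializes the same records as XML and as fixed-width text and compares the sizes. The record layout (id, name, amount), the field widths and the element names are all invented for illustration; real ratios depend entirely on the data and the tag names chosen.

```python
import xml.etree.ElementTree as ET

# Hypothetical dataset: (id, name, amount) tuples.
records = [(i, "name%04d" % i, i * 1.5) for i in range(1000)]

# Assumed fixed-width layout: 6 + 10 + 10 characters plus a newline.
RECORD_LEN = 6 + 10 + 10 + 1


def to_xml(rows):
    """Serialize rows as one XML document with an element per field."""
    root = ET.Element("records")
    for rid, name, amount in rows:
        rec = ET.SubElement(root, "record")
        ET.SubElement(rec, "id").text = str(rid)
        ET.SubElement(rec, "name").text = name
        ET.SubElement(rec, "amount").text = "%.2f" % amount
    return ET.tostring(root)


def to_fixed_width(rows):
    """Serialize rows as fixed-width lines of RECORD_LEN bytes each."""
    lines = ["%6d%-10s%10.2f" % row for row in rows]
    return ("\n".join(lines) + "\n").encode("ascii")


def read_fixed(buf, index):
    """Fetch record `index` directly by offset, without scanning the file."""
    line = buf[index * RECORD_LEN:(index + 1) * RECORD_LEN].decode("ascii")
    return int(line[0:6]), line[6:16].rstrip(), float(line[16:26])


xml_bytes = to_xml(records)
fixed_bytes = to_fixed_width(records)

print("XML bytes:        ", len(xml_bytes))
print("fixed-width bytes:", len(fixed_bytes))
print("record 500:       ", read_fixed(fixed_bytes, 500))
```

The fixed-width form gets its random access from a layout agreed out of band, which is exactly the kind of coupling to a physical format described above; the sketch only shows the size and access trade-off, not a recommendation.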

Sampo Syreeni, aka decoy - mailto:decoy@iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Anne & Lynn Wheeler - 19 Dec 2007 15:55 GMT
> I'd divide the answer into two. For the original use SGML/XML and
> related markup languages (like HTML) were meant for, i.e. markup of
> running text compatible with plain text editors, they still hold quite
> a lot of value.
i.e. SGML was originally GML before becoming "standard gml" as an
ISO standard. GML
http://www.garlic.com/~lynn/subtopic.html#sgml
was invented at the science center in '69
http://www.garlic.com/~lynn/subtopic.html#545tech
(G, M, and L are the inventors' initials, which motivated the requirement
for an acronym with those letters) ... at the time, somewhat motivated by
the need for use in legal documents.
http://xml.coverpages.org/sgmlhist0.html
science center was also responsible for virtual machine implementation
and a lot of timesharing and interactive related applications.
the original documentation formatter developed at the science center was
called *script*, used "dot" formatting commands ... somewhat similar
to earlier implementation done for ctss
http://en.wikipedia.org/wiki/RUNOFF
and
http://mit.edu/Saltzer/www/publications/CC-244.html
aka some of the ctss people went to the science center on 4th flr of 545
tech sq and some went to multics on 5th flr.
the initial "gml" implementation was done by adding gml tags to the script
document formatter.
cern was also a large virtual machine implementation ... using various
applications ... including a script clone written by univ. of waterloo.
this talks about evolution from sgml into html:
http://infomesh.net/html/history/early
the first webserver outside of europe was on the virtual machine
system at slac (slac and cern shared a lot of software)
http://www.slac.stanford.edu/history/earlyweb/history.shtml
the science center's virtual machine implementation was also
used at sjr for the original relational/sql implementation
http://www.garlic.com/~lynn/subtopic.html#systemr
Lon Stowell - 22 Dec 2007 01:56 GMT
>> I'd divide the answer into two. For the original use SGML/XML and
>> related markup languages (like HTML) were meant for, i.e. markup of
[quoted text clipped - 12 lines]
> need for use in legal documents.
> http://xml.coverpages.org/sgmlhist0.html
Ah, I get misty eyed reading about GML, SGML and remembering SNADS,
DISOSS and the Common Suppository.