Database Forum / General DB Topics / DB Theory / October 2005
Looking for how to model 3D objects in 2D relational databases
Doug C - 09 Oct 2005 09:15 GMT Hi - I am looking for sources of any kind that have to do with modelling objects, specifically organic molecules and are 3D, in a 2D relational database. Can anyone suggest anything?
Thanks, Doug C.
Marshall Spight - 09 Oct 2005 17:01 GMT > Hi - I am looking for sources of any kind that have to do with > modelling objects, specifically organic molecules and are 3D, in a 2D > relational database. Can anyone suggest anything? The first thing you need to observe is that relational databases are not 2D. A dimension is a measure independent of any other dimensions. The columns of a relational table are independent; they are dimensions.
You could have a table with three columns: x, y, and z. The rows model points in space. Notice how you now have 3D data in the table. Notice also that you could add 11 other columns and have 14 dimensional data.
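As a rough illustration of the three-column idea, here is a minimal sketch using Python's built-in sqlite3 module. The table name `points` and the sample coordinates are made up for the example; nothing in the thread prescribes them.

```python
import sqlite3

# In-memory database; the table name and sample values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x REAL, y REAL, z REAL)")
conn.executemany(
    "INSERT INTO points (x, y, z) VALUES (?, ?, ?)",
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 2.5)],
)

# Each row is one point in 3-space; each extra column would add a coordinate.
rows = conn.execute("SELECT x, y, z FROM points ORDER BY x, y, z").fetchall()
for row in rows:
    print(row)
```

Adding eleven more columns to the CREATE TABLE statement would give 14-dimensional rows in exactly the same way.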
If you just want to model points in space with various attributes, it's easy to extend the above technique. If you want to also model edges, faces, colors, various rendering parameters, etc. then you need to first figure out what you want to model, and then design the tables for it. You'd likely have a table for each of edges, faces, etc. Parameters of each go in the same table as columns.
HTH
Marshall
Gene Wirchenko - 10 Oct 2005 21:59 GMT [snip]
>The first thing you need to observe is that relational databases >are not 2D. A dimension is a measure independent of any other >dimensions. The columns of a relational table are independent; >they are dimensions. That has bothered me for some time now.
Given a PK, the columns are not independent. By that, the dimensionality of a table would be the number of columns in the key. What am I missing?
[snip]
Sincerely,
Gene Wirchenko
Mikito Harakiri - 10 Oct 2005 22:10 GMT > [snip] > [quoted text clipped - 8 lines] > dimensionality of a table would be the number of columns in the key. > What am I missing? Exactly. This is why some tables are even called dimensions in DW terminology. For example, it would be ridiculous to suggest that
Time(year, month, date)
is a 3 dimensional entity. On the other hand, we have fact tables, which could clearly have any number of dimensions.
JOG - 11 Oct 2005 03:06 GMT > > [snip] > > [quoted text clipped - 16 lines] > is a 3 dimensional entity. On the other hand, we have fact tables, > which could clearly have any number of dimensions. You have to be really specific when you are talking about dimensions, as there are few terms that possess such a range of subtly different connotations across maths and computer science. (Although to the OP, there are no definitions of a dimension that make RM 2D).
Each Time(year, month, date) may be plotted as a point in a 3 dimensional space, certainly making the entity 3D. Face recognition algorithms use similar techniques, where a 100x100 image might be modelled as a point in a Euclidean space with 10,000 dimensions, or its dimensionality reduced to a much more tractable vector of n "feature" coefficients, in an n dimensional space. As such, almost anything can be viewed as a dimensional entity (whether it seems initially intuitive or not - and the model will of course leave that intuition to you).
Mikito Harakiri - 11 Oct 2005 03:17 GMT > Each Time(year, month, date) may be plotted as a point in a 3 > dimensional space, certainly making the entity 3D. The "year, month, date" part is a calendar artifact. It is indeed formally 3 dimensional. Let me ask a related question though. Is
1024
a 4 dimensional object? Because we can represent it as
number of thousands   number of hundreds   ...
-------------------   ------------------   ...
1                     0                    ...
JOG - 11 Oct 2005 03:33 GMT > > Each Time(year, month, date) may be plotted as a point in a 3 > > dimensional space, certainly making the entity 3D. [quoted text clipped - 9 lines] > ------------------- ------------------ ... > 1 0 ... No, I totally agree. The term dimension is so abstract in terms of information modelling I struggle to define exactly what it means. It's often used as a synonym for a "property of" an entity, and such properties are rarely wholly independent. It is also often confused with the term as used for the 3 physical dimensions, which are of course orthogonal.
dawn - 12 Oct 2005 01:08 GMT > > > Each Time(year, month, date) may be plotted as a point in a 3 > > > dimensional space, certainly making the entity 3D. [quoted text clipped - 15 lines] > wholly independent, and often confused with the term as used with the 3 > physical dimensions, which are of course orthogonal. Some possible glossary entries for "dimension" & perhaps also "multidimensional" ?
From mathematics: a tuple with n elements is said to have a dimension of n. For numerical data visualization, n-dimensional tuples can be plotted in an n-dimensional graph. For example, points in space can be graphed using three dimensions, such as x, y, and z axes.
From business intelligence online analytical processing (OLAP): a portion of a candidate key for a "fact table" serves as a key to a "dimension table". Each such key part is a dimension. Where there are two or more dimensions, the data is "multidimensional".
From data visualization: if a logical view (not necessarily SQL VIEW) of data maps directly to a visualization of that data in a fully populated (possibly including "nulls") table of single-valued row-column cells, that view might be classified as a two-dimensional view of data. In cases where some cells would be empty of any values or nulls from the view or where some cells hold multiple values, the data is "multidimensional".
------end of first shot at such entries in case mAsterdam can incorporate them into the cdt glossary
Anyone can feel free to fix up the wording or add other variations to the mix. I think the original question was how to take data that is n-dimensional from the first (mathematical) definition and use data visualization (the last definition) to view that data, possibly in order to then turn such a view into a relational view of the data (not necessarily the way to go about it, but that might be what was intended?) --dawn
Christian Paro - 14 Oct 2005 18:50 GMT Perhaps it would be better to tackle this question from the more specific context of modelling molecular structures *before* trying to wade through the tarpit of theoretical semantics.
Modelling *a* molecule as a relational database is straightforward enough - the molecule is a graph consisting of some set of atoms and a relation enumerating the bonds between each. I'm assuming, for the moment, that you are only trying to store a simple "ball-and-stick" model rather than some fancy quantum representation that deals with all the physical causes and effects responsible for and resulting from each of the bonds within the molecule. All you need in this case is three relations:
- A relation mapping atomic number to whatever info you're interested in using for each element present. A "periodic table", you could say.
- A relation listing each of the atoms in an instance of the molecule being described by a generated ID and relating each to the appropriate element by atomic number.
- A relation which associates each atom with each of the other atoms in the molecule, enumerating the molecule's bonds by way of an adjacency list.
A simple database for the friendly water molecule would look like this (my apologies for the ugly notation, I haven't much patience for trying to format things in plain text):
------------------------------------------------------------------
tbl_periodic (atomic_number, name): {(1, Hydrogen), (8, Oxygen)}
tbl_atoms (id, atomic_number): {(1, 1), (2, 1), (3, 8)}
tbl_bonds (id1, id2): {(1, 3), (2, 3), (3, 1), (3, 2)}
------------------------------------------------------------------
At least it would if you want your bonds to be undirected. You could also represent the polarity relationship by making them directed, but good luck with non-polar bonds.
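For anyone who wants to try this out, here is a minimal sketch of the three relations using Python's built-in sqlite3 module. The table and column names follow the notation above; the foreign keys and the CHECK constraint are my own additions rather than anything the notation requires, and oxygen is entered with atomic number 8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE tbl_periodic (
    atomic_number INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE tbl_atoms (
    id            INTEGER PRIMARY KEY,
    atomic_number INTEGER NOT NULL REFERENCES tbl_periodic (atomic_number)
);
CREATE TABLE tbl_bonds (
    id1 INTEGER NOT NULL REFERENCES tbl_atoms (id),
    id2 INTEGER NOT NULL REFERENCES tbl_atoms (id),
    PRIMARY KEY (id1, id2),
    CHECK (id1 <> id2)  -- assumption: no atom bonds to itself
);
""")

# The water molecule: two hydrogens (ids 1, 2) bonded to one oxygen (id 3).
conn.executemany("INSERT INTO tbl_periodic VALUES (?, ?)",
                 [(1, "Hydrogen"), (8, "Oxygen")])
conn.executemany("INSERT INTO tbl_atoms VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 8)])
conn.executemany("INSERT INTO tbl_bonds VALUES (?, ?)",
                 [(1, 3), (2, 3), (3, 1), (3, 2)])

# Which elements is the oxygen atom bonded to?
neighbours = conn.execute("""
    SELECT p.name
    FROM tbl_bonds b
    JOIN tbl_atoms a    ON a.id = b.id2
    JOIN tbl_periodic p ON p.atomic_number = a.atomic_number
    WHERE b.id1 = 3
    ORDER BY b.id2
""").fetchall()

# Undirected bonds are stored in both directions; list any row lacking its reverse.
asymmetric = conn.execute("""
    SELECT id1, id2 FROM tbl_bonds b
    WHERE NOT EXISTS (
        SELECT 1 FROM tbl_bonds r WHERE r.id1 = b.id2 AND r.id2 = b.id1
    )
""").fetchall()

print(neighbours)
print(asymmetric)
```

The second query is one way to check that the adjacency list really is undirected; a directed (polarity) variant would simply drop that check.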
Now, it *does* seem awfully silly to use an entire database for each molecule, so you'd probably make some adjustments to this concept for a practical implementation. You could, for instance, add a "molecule" table which identifies molecules and use the molecule id as a foreign key. To illustrate, let's make this change to our example db's structure and add Hydrochloric Acid to the dataset:
------------------------------------------------------------------
tbl_periodic (atomic_number, name): {(1, Hydrogen), (8, Oxygen), (17, Chlorine)}
tbl_molecules (mid, name): {(1, Water), (2, Hydrochloric Acid)}
tbl_atoms (mid, aid, atomic_number): {(1, 1, 1), (1, 2, 1), (1, 3, 8), (2, 1, 1), (2, 2, 17)}
tbl_bonds (mid, aid1, aid2): {(1, 1, 3), (1, 2, 3), (1, 3, 1), (1, 3, 2), (2, 1, 2), (2, 2, 1)}
------------------------------------------------------------------
And thus you have a well-normalized relational database that can store multiple molecules and represent a simple ball-and-stick model of their structure. Fancier models would require a bit more work to tie in the extra information, and you would probably benefit from adding a set of rules that can identify and remove descriptions of impossible molecular structures (bonds that violate orbital rules, 'molecules' that don't form a connected graph...), but this should be a good starting point.
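One of the validity rules mentioned above, rejecting 'molecules' that don't form a connected graph, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the relations are held as plain sets of tuples, the function name `is_connected` is hypothetical, oxygen uses atomic number 8, and the two atoms of hydrochloric acid get the distinct ids 1 and 2 that the bond table refers to.

```python
from collections import defaultdict

# (mid, aid, atomic_number)
tbl_atoms = {(1, 1, 1), (1, 2, 1), (1, 3, 8), (2, 1, 1), (2, 2, 17)}
# (mid, aid1, aid2), each undirected bond stored in both directions
tbl_bonds = {(1, 1, 3), (1, 2, 3), (1, 3, 1), (1, 3, 2), (2, 1, 2), (2, 2, 1)}


def is_connected(mid, atoms, bonds):
    """Return True if the atoms of molecule `mid` form one connected graph."""
    nodes = {aid for m, aid, _ in atoms if m == mid}
    if not nodes:
        return False  # no atoms recorded for this molecule id

    # Build an adjacency map restricted to this molecule.
    adjacency = defaultdict(set)
    for m, a, b in bonds:
        if m == mid:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Depth-first search from an arbitrary atom.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current] - seen:
            seen.add(neighbour)
            stack.append(neighbour)
    return seen == nodes


print(is_connected(1, tbl_atoms, tbl_bonds))  # water
print(is_connected(2, tbl_atoms, tbl_bonds))  # hydrochloric acid

# A stray, unbonded hydrogen in molecule 1 breaks connectivity.
with_stray = tbl_atoms | {(1, 4, 1)}
print(is_connected(1, with_stray, tbl_bonds))
```

Rules about orbitals or valence would need element data that the simple periodic relation above does not carry, so they are left out here.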
------------------------------------------------------------------
Now, to my take on the dimensionality issue: It's really just a perfect example of how diction often seems to fall apart at the boundary of two disparate technical disciplines.
When you speak of a molecule as a 3-dimensional structure, you are probably referring to the fact that most molecules are non-planar structures whose physical manifestations exist in a (for all practical purposes, anyhow) 3-dimensional data model I'll refer to as the "real world".
From a computer science and database theory perspective, however, N-dimensional seems to be interpreted more along the lines of "having N independent measurable quantities". I could be off by a bit on this one, being just a lowly undergrad student with a fuzzy memory and sloppy study habits, but it seems to me that the dimensionality of anything in a software context is really determined less by the nature of the thing being modeled than by the cardinality of the set of attributes which are to be accounted for in the digital realm. A cluster of balloons could be an element in a 3-dimensional data structure which tracks the cluster's latitude, longitude, and altitude as it flies over a radar dish - but it could just as easily be an element in a 2-dimensional structure recording the fact that there were 99 of them, and that they were red.
From a user perspective, databases (or at least all of the tables within) are 2-dimensional. This is just a side effect of using tables to display the contents of a database. Throw in some OpenGL and render the same data as a 3-dimensional network graph and the data suddenly becomes 3-dimensional. This hasn't much to do with either the nature of the thing being modeled or the nature of the model underlying the view. It's just a pragmatic observation that the men in the cave may be 3D, but their shadows are not. The mistake comes in confusing the shadow for that by which it is cast.
In the case of our molecular database, we have two 2-D relations (the periodic table and the molecule table) and two 3-D relations (the atom table and the bond table). This isn't because the molecules are 3-D, since the first version of the database did fine with only 2-D relations, but because an extra dimension was required to disambiguate atoms and bonds belonging to disparate molecular structures. Also, depending on what information you extract from it and how you put it together for viewing, the result might be a 1-D string of characters representing the molecule's formula, a 2-D adjacency matrix or Lewis dot-diagram, or a rotatable 3-D visualization of the molecule's physical structure. Colour code the atoms by element and that 3-D visualization is actually representing four dimensions of data - three for position and one for colour/type.
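To make the "same data, different views" point concrete, here is a small Python sketch that derives a 1-D formula string and a 2-D adjacency matrix from the atom and bond relations. The `symbols` mapping and both function names are hypothetical additions (the periodic relation above only stores names), oxygen is entered as atomic number 8, hydrochloric acid's atoms are given ids 1 and 2, and the formula simply lists element symbols alphabetically, which is a simplification.

```python
from collections import Counter

# Hypothetical symbol lookup; not part of the relations in the post.
symbols = {1: "H", 8: "O", 17: "Cl"}

# (mid, aid, atomic_number)
tbl_atoms = {(1, 1, 1), (1, 2, 1), (1, 3, 8), (2, 1, 1), (2, 2, 17)}
# (mid, aid1, aid2), each undirected bond stored in both directions
tbl_bonds = {(1, 1, 3), (1, 2, 3), (1, 3, 1), (1, 3, 2), (2, 1, 2), (2, 2, 1)}


def formula(mid, atoms):
    """1-D view: element symbols in alphabetical order with counts."""
    counts = Counter(symbols[z] for m, _, z in atoms if m == mid)
    return "".join(
        sym + (str(n) if n > 1 else "") for sym, n in sorted(counts.items())
    )


def adjacency_matrix(mid, atoms, bonds):
    """2-D view: matrix[i][j] is 1 when the i-th and j-th atoms are bonded."""
    aids = sorted(aid for m, aid, _ in atoms if m == mid)
    index = {aid: i for i, aid in enumerate(aids)}
    matrix = [[0] * len(aids) for _ in aids]
    for m, a, b in bonds:
        if m == mid:
            matrix[index[a]][index[b]] = 1
    return matrix


print(formula(1, tbl_atoms))
for row in adjacency_matrix(1, tbl_atoms, tbl_bonds):
    print(row)
```

Neither view changes the stored relations; a 3-D rendering would additionally need coordinates, which this ball-and-stick schema does not store.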
In essence, the problem with "dimensionality" doesn't seem to be that it's a difficult concept - but that its meaning is largely dependent upon the context in which it is being applied.