Database Forum / General DB Topics / General DB Topics / October 2004
starting out... setting up DB for home library (long)
|
|
Thread rating:  |
nuk - 17 Oct 2004 23:42 GMT Hello,
I'm just starting out w/ SQL databases. I'm running Mac OS X 10.3.5 'Panther' and have both MySQL and PostgreSQL installed and running. Haven't completely decided which one to use yet; may end up playing with both off and on. I've played w/ most versions of Linux and the free BSD's over the years, and have browsed thru a few DB books, but that's about it for my experience w/ databases and SQL to date. Please bear w/ me and correct me if I use the wrong terminology, etc.
The first 'project' I'd like to accomplish is to set up a DB for keeping track of the books in my home library. Figured it'll be easier to decide now on some of the layout of things rather than try to change in midstream later.
One issue I'm not entirely sure how to handle is that some of the books have multiple authors, and other ones, in addition to having multiple authors, are compilations of stories/articles. I could probably live w/o indexing the individual articles/stories in the compilation books, but I would very much like to be able to search by author, *any author*, and find which books they were involved with.
I could see having a separate table w/ an 'author_id' field as a primary key and then something like 'author_name_first', 'author_name_last', 'author_name_suffix', etc., but that leads into my next question...
I don't have a bunch of separate data files, one being a list of author names, one being titles, etc. I probably will end up picking up one book at a time, and recording the applicable information off of it and entering it. What would be the best way to do the data entry? Some sort of CSV text file(s) that I could feed to the DB software, or to set up some sort of data entry form? I'm guessing something maybe web-based could be put together... I've got Apache running on this machine, I suppose getting PHP running wouldn't be that much more difficult. Learning to use either/both of them wouldn't be a bad thing, just that much more time (not that this is exactly a rush project! ;p ).
Any suggestions or comments?
TIA,
nuk
nuk - 20 Oct 2004 15:03 GMT BTT
 Signature I know more than enough *nix to do some very destructive things, and not nearly enough to do very many useful things.
nuk - 20 Oct 2004 22:45 GMT At the risk of stepping in another religious war here... is there anything specific that would make one DB or the other (MySQL or PostgreSQL or others such as Firebird) preferable to another in a complete beginner situation such as this? I'm reasonably familiar w/ the basic differences btwn GPL and BSD licenses, I doubt I'll be stressing either one at all, etc. sub-SELECTs sound like they'd be handy to have (supposedly coming in mysql 4.1?), but the mounds of documentation and examples for mysql are awfully handy as well.
Thanks,
nuk
 Signature I know more than enough *nix to do some very destructive things, and not nearly enough to do very many useful things.
Kenneth Downs - 20 Oct 2004 16:30 GMT > Hello, > [quoted text clipped - 5 lines] > about it for my experience w/ databases and SQL to date. Please bear w/ > me and correct me if I use the wrong terminology, etc. I would guess with a panix email address you would be no stranger to *nix :)
Some general comments:
1) mySQL lacks very critical server-side features. mySQl also does not, AFAIK, have a completely free license. PostreSQL has all necessary server-side features to let you go as far as you want with this, and it is under the BSD license.
2) As for the design, I suggest you google on "normalization" and read through 1st, 2nd, and 3rd normal forms. You want to get your list of information into 3rd normal form as a good start for a beginner. I will give a short summary here:
-> 1nf says no lists or things inside of columns. This means you don't have a table called BOOKS with a column called AUTHORS that contains the value "John Smyth, Joseph Bleaugh,..."
-> 2nf says no repeating columns in a table, very similar to 1nf. It means you don't have a table BOOKS with columns called AUTHOR1, AUTHOR2, AUTHOR3 and so forth. Instead you have table BOOKS, table AUTHORS,and a cross-reference that links them: BOOKS_AUTHORS.
-> 3nf says don't repeat information at a detail level that can be put at a higher level. Meaning? In the example above of AUTHORS, BOOKS and AUTHORS_BOOKS, don't put the author's birthdate into the AUTHORS_BOOKS x-ref, put it into the author's table.
3) Use the Normal Forms! I cannot stress this enough. Look at your requirements again, learn the normal forms and put your data into those forms. Then ask more questions.
It so happens I've got a database generator running in Java that builds PostgreSQL databases without messing around with code, you can specify the table layouts in a gnumeric spreadsheet. It will be GPL when released, but I can share the current version as of Friday. Once the builder phase is complete (today or tommorrow, hence the Friday availability), I go directly to code generation for a web interface, if you are interested let me know.
 Signature Kenneth Downs Use first initial plus last name at last name plus literal "fam.net" to email me
nuk - 20 Oct 2004 17:09 GMT > I would guess with a panix email address you would be no stranger to *nix :) Stranger, no. Guru, no. As familiar as I used to be, heck no ;)
> Some general comments: > > 1) mySQL lacks very critical server-side features. mySQl also does not, > AFAIK, have a completely free license. PostreSQL has all necessary > server-side features to let you go as far as you want with this, and it is > under the BSD license. Is this something that would come into play in a small home scenario? I don't forsee (hah! I'm sure no one ever does) there being much use for this outside my household, nor much demand/load on it besides myself doing occasional searches.
> 2) As for the design, I suggest you google on "normalization" and read > through 1st, 2nd, and 3rd normal forms. You want to get your list of > information into 3rd normal form as a good start for a beginner. I will > give a short summary here: Thanks for the pointer in the right direction; I'll start searching and reading up!
> It so happens I've got a database generator running in Java that builds > PostgreSQL databases without messing around with code, you can specify the > table layouts in a gnumeric spreadsheet. It will be GPL when released, but > I can share the current version as of Friday. Once the builder phase is > complete (today or tommorrow, hence the Friday availability), I go directly > to code generation for a web interface, if you are interested let me know. It does sound interesting, but for the time being I'd like to see if I can figure out how to make this happen by using the code. Might give up and throw in the towel at some point in the future, dunno. Do you see the various web/php tools like phpMyAdmin, phpPGadmin, etc. as being useful, a crutch, what?
Thanks again for the reply,
nuk
 Signature I know more than enough *nix to do some very destructive things, and not nearly enough to do very many useful things.
Kenneth Downs - 20 Oct 2004 17:52 GMT > Do you see > the various web/php tools like phpMyAdmin, phpPGadmin, etc. as being > useful, a crutch, what? I use pgAdmin3 for Postgres3 and find it very useful. A crutch? Nah, more like training. It really brings the system to life because it shows me everything, and gives me ideas for what to read up on in the manual.
 Signature Kenneth Downs Use first initial plus last name at last name plus literal "fam.net" to email me
nuk - 20 Oct 2004 18:36 GMT >> Do you see >> the various web/php tools like phpMyAdmin, phpPGadmin, etc. as being [quoted text clipped - 3 lines] > like training. It really brings the system to life because it shows me > everything, and gives me ideas for what to read up on in the manual. It does look nice... but currently no Mac OS X version :(
nuk
 Signature I know more than enough *nix to do some very destructive things, and not nearly enough to do very many useful things.
Kenneth Downs - 20 Oct 2004 19:13 GMT >>> Do you see >>> the various web/php tools like phpMyAdmin, phpPGadmin, etc. as being [quoted text clipped - 8 lines] > > nuk Didn't know that.
Well, perhaps you will be able to report back at some point on how you make out with one that works on MacOS. Many people, myself included, like anything cross-platform, and I would readily drop pgAdmin3 for something Good Enough that was also cross-platform.
 Signature Kenneth Downs Use first initial plus last name at last name plus literal "fam.net" to email me
Gene Wirchenko - 20 Oct 2004 17:24 GMT >> Hello, >> [quoted text clipped - 29 lines] > you have table BOOKS, table AUTHORS,and a cross-reference > that links them: BOOKS_AUTHORS. No. "A relation is in second normal form if every nonprimary key attribute is functionally dependent on the whole primary key." -- Jeffrey A. Hoffer et al, "Modern Systems Analysis and Design / Third Edition", p. 396.
>-> 3nf says don't repeat information at a detail level that > can be put at a higher level. Meaning? In the example > above of AUTHORS, BOOKS and AUTHORS_BOOKS, don't put the > author's birthdate into the AUTHORS_BOOKS x-ref, put > it into the author's table. No. "A relation is in third normal form (3NF) if it is in second normal form and there are no functional (transitive) dependencies between two (or more) nonprimary key attributes." -- Hoffer, p. 397.
>3) Use the Normal Forms! I cannot stress this enough. Look at your > requirements again, learn the normal forms and put your data into [quoted text clipped - 6 lines] >complete (today or tommorrow, hence the Friday availability), I go directly >to code generation for a web interface, if you are interested let me know. Sincerely,
Gene Wirchenko
Computerese Irregular Verb Conjugation: I have preferences. You have biases. He/She has prejudices.
Tim Smith - 24 Oct 2004 04:53 GMT > 1) mySQL lacks very critical server-side features. mySQl also does not, > AFAIK, have a completely free license. PostreSQL has all necessary > server-side features to let you go as far as you want with this, and it is > under the BSD license. The server-side features MySQL lacks are not critical at all for the kind of thing the original poster has in mind.
I'd say MySQL is the clear winner for his application, because it is easier to get decent performance out of it (although it doesn't sound like he has a big enough application for this to matter), easier to poke around in (phpMyAdmin), has more and better introductory books available at places like Barnes and Nobel, and if he ever decides to move his database to his web site (everyone who makes a database of their home library or CD or DVD collection eventually decides it would be cool to put it on their web site so their friends can look at their stuff :-)), it is far more likely his hosting company will be running MySQL than PostgreSQL. (Assuming he doesn't have a problem with the license).
So, I'd recommend people start with MySQL, unless they are doing something big that *requires* something not in MySQL, but be aware of the differences between MySQL and PostgreSQL (and perhaps Firebird and MaxDB?), and try not to build too many dependencies on a particular DB into their stuff.
 Signature --Tim Smith
Kenneth Downs - 25 Oct 2004 02:08 GMT >> 1) mySQL lacks very critical server-side features. mySQl also does not, >> AFAIK, have a completely free license. PostreSQL has all necessary [quoted text clipped - 21 lines] > MaxDB?), and try not to build too many dependencies on a particular DB > into their stuff. Sometimes 1/2 of 1 is zero.
 Signature Kenneth Downs Use first initial plus last name at last name plus literal "fam.net" to email me
Laconic2 - 20 Oct 2004 18:48 GMT > Hello, > [quoted text clipped - 38 lines] > > nuk Laconic2 - 20 Oct 2004 18:54 GMT I suggest you learn how to do it right, even if it doesn't make much difference for a small home project.
You may never get another golden opportunity to learn the difference between good DB design and lousy DB design in a situation where you master the subject matter as well as you master the home library subject.
Unfortunately, you'll get some disagreement about which way is "right", or even about whether or not there is more than one right way to do it.
One place I suggest you look is: www.databaseanswers.com
They've got some 200 data models in there. Maybe one of them models the home library. You might even be able to download an MS Access dB already built.
On data normalization: is more useful for critiquing a bad design than for arriving at a good one. It's also good for doing data analysis in retrospect.
Tim Smith - 24 Oct 2004 05:05 GMT > I don't have a bunch of separate data files, one being a list of author > names, one being titles, etc. I probably will end up picking up one book [quoted text clipped - 3 lines] > of data entry form? I'm guessing something maybe web-based could be put > together... I've got Apache running on this machine, I How much hackery are you willing to do? If the answer is "a lot!", then you could put together something that lets you enter the ISBN, goes and searches for the book at Amazon or Barnes & Noble, and grabs the author(s), title, and whatever else you want out of the results with a screen scraper.
Or, with less hacking, I believe Amazon offers a web services interface for doing searches and getting information, so you can probably do the above with nice clean SOAP requests (or whatever web services scheme they use), instead of an ugly screen scraper.
You can also search www.loc.gov by ISBN, so the screen scraper idea might work there, too. I don't know if they have a web services interface. This would also get you the LOC call number, which could be useful if you want to organize your library physically to match other libraries.
If you want to totally go overboard in a geekly way...the Library of Congress gives the *size* of the books, too. I've never been able to decide what height is best for all the adjustable shelves in my bookcases. If you too have that problem, then a program to look up the height of all your books, and calculate how to best set the shelf heights, would be awesome. It would indicate a man with WAAAAAY too much time on his hands, but still would be awesome. :-)
 Signature --Tim Smith
|
|
|