Database Forum / General DB Topics / DB Theory / July 2008
Calculated value dilemma
Rainboy - 16 Jul 2008 11:57 GMT
Hi all
I am designing a database for my charity (we are a local special needs charity) to replace our existing one, and my main goals are to make the database accurate, easy to use, and able to expand (the current database is not!).
The database will hold contact and membership details for about 300 families, staff/volunteers and other contacts, along with event booking (we hold about 5 a week) and payment details.
So, for each family we need to see how much they owe, plus a full list of all their transactions. I am considering having a transaction table, which details every payment anyone has ever made.
IDEA 1
To follow normalisation rules I should avoid having calculated values in the table, and I should therefore simply calculate the balance whenever I need it by querying the table. I guess as the database grows I would have to periodically 'consolidate' the balance to maintain efficiency.
However, I see each updated balance as a snapshot in time - when I look back to figure out 'who paid what when', I want to be 100% confident that the database will return the same balance it would have returned at the time. If a rogue entry were to somehow find its way into the table, it would mess all the balances up...
IDEA 2
As keeping historically accurate info is important, 'hard-coding' the balance in the table seems to me like it might be the most appropriate solution, even though it breaks the normalisation rules.
Any suggestions would be greatly appreciated!
Mark
PS I've never designed a database before, so please let me know if I'm completely barking up the wrong tree!
Bob Badour - 16 Jul 2008 13:13 GMT
> Hi all
> [quoted text clipped - 17 lines]
> grows I would have to periodically 'consolidate' the balance to
> maintain efficiency.
Where did you learn that was part of a normalization rule? It's just plain wrong.
> However, I see each updated balance as a snapshot in time - when I
> look back to figure out 'who paid what when', I want to be 100%
> confident that the database will return the same balance it would have
> returned at the time. If a rogue entry were to somehow find its way
> into the table, it would mess all the balances up...
Historical data should accurately reflect the history. What if a calculation changes? What if a calculation depends on the instantaneous value of some attribute where no history is stored? As a hypothetical situation, suppose families with 5 or more kids get a special discount. When baby #5 is born, the family doesn't suddenly get a discount on all their past transactions.
> IDEA 2
> As keeping historically accurate info is important, 'hard-coding' the
> balance in the table seems to me like it might be the most appropriate
> solution, even though it breaks the normalisation rules.
Exactly which normal form do you suppose it violates?
> Any suggestions would be greatly appreciated!
>
> Mark
>
> PS I've never designed a database before, so please let me know if
> I'm completely barking up the wrong tree!
I think you have been misinformed. Where are you getting your information?
Rainboy - 16 Jul 2008 15:00 GMT
> > Hi all
> [quoted text clipped - 51 lines]
Hmmmm...
Well, I had assumed that the table would be violating 3NF because the calculated value of the balance would be dependent on the transaction amount, which is not the primary key.
Ah, I think I'm starting to understand... is historical data treated differently because as soon as an entry is 'in the past', the balance (at that time) no longer depends on the transaction amount, and it does only depend on the primary key of transaction ID. So I just have a balance column, and whenever I want someone's current balance, I just query this historical table for the most recent entry? And I won't be breaking any rules? Smooth.
Although (assuming I've got this right) something does intuitively bother me about this... can't quite put my finger on it! I feel a bit wary that the database's accuracy will depend on someone always entering in the correct balance to square with the latest transaction amount... what if they don't? I know it can be automated, but I don't feel that comfy knowing that the database structure doesn't take care of it, and it'll have to be some extra logic. Does this make sense?
Mark
PS Thanks for your reply. I hope you don't mind bearing with me while I get my head around these new concepts!
JOG - 17 Jul 2008 12:33 GMT
> > > Hi all
> [quoted text clipped - 57 lines]
> calculated value of the balance would be dependent on the transaction
> amount, which is not the primary key.
3NF is deprecated really. You should start at BCNF when normalizing. Regards, J.
> Ah, I think I'm starting to understand... is historical data treated
> differently because as soon as an entry is 'in the past', the balance
[quoted text clipped - 16 lines]
> PS Thanks for your reply. I hope you don't mind bearing with me
> while I get my head around these new concepts!
Brian Selzer - 18 Jul 2008 04:33 GMT
>> > > Hi all
>> [quoted text clipped - 62 lines]
>> calculated value of the balance would be dependent on the transaction
>> amount, which is not the primary key.
The balance, while dependent upon the transaction amount, is not functionally dependent upon it, so the dependency has absolutely nothing to do with 3NF, or any of the classic normal forms for that matter--including 5NF.
> 3NF is deprecated really. You should start at BCNF when normalizing.
> Regards, J.
BCNF is 3NF or at least what 3NF should have been to begin with.
>> Ah, I think I'm starting to understand... is historical data treated
>> differently because as soon as an entry is 'in the past', the balance
[quoted text clipped - 11 lines]
>> feel that comfy knowing that the database structure doesn't take care
>> of it, and it'll have to be some extra logic. Does this make sense?
You are justified in your concern. The problem with storing both an aggregate and the values that combine to form the aggregate is not due to a lack of normalization, but is a design flaw nonetheless. Bob's response was woefully inadequate in answering your question. Storing both the result of an aggregate and the values that combine to yield the aggregate result is generally not a good idea, unless what you're storing is qualified by a date/time stamp or some other temporal device, such as an open or closed interval. While there is still a link between the transaction amounts and the balance, that link is frozen in time--that is, tied to a particular point in time (even if that point is indeterminate, as is the case for an open interval). Another downside of storing an unqualified aggregate result is that in order to verify its value, you need to retain all of the transaction amounts that when combined yield the aggregate result. That may be five or ten years' worth of transactions, and that's a problem if you are only retaining the data for the last year.
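One way to read the point about qualifying a stored aggregate with a date/time stamp is sketched below, using Python's built-in sqlite3 module. Nothing here comes from the thread itself: the table names (txn, balance_snapshot), the column names, the sample figures and the choice to hold amounts as integer pence are all assumptions made for illustration. Each stored balance carries the date it refers to, and the current balance is the latest snapshot plus whatever was posted after it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE txn (
    txn_id    INTEGER PRIMARY KEY,
    family_id INTEGER NOT NULL,
    posted_at TEXT    NOT NULL,   -- ISO 8601 date
    amount    INTEGER NOT NULL    -- pence; positive = charge, negative = payment
);
-- A stored balance is always qualified by the date it refers to.
CREATE TABLE balance_snapshot (
    family_id INTEGER NOT NULL,
    as_of     TEXT    NOT NULL,   -- covers every txn with posted_at <= as_of
    balance   INTEGER NOT NULL,
    PRIMARY KEY (family_id, as_of)
);
""")

conn.executemany(
    "INSERT INTO txn (family_id, posted_at, amount) VALUES (?, ?, ?)",
    [(1, "2008-06-10", 2000), (1, "2008-06-20", -500), (1, "2008-07-05", 1000)],
)

def take_snapshot(conn, family_id, as_of):
    """Freeze the balance as of a given date and return it."""
    (balance,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM txn "
        "WHERE family_id = ? AND posted_at <= ?",
        (family_id, as_of),
    ).fetchone()
    conn.execute(
        "INSERT INTO balance_snapshot (family_id, as_of, balance) VALUES (?, ?, ?)",
        (family_id, as_of, balance),
    )
    return balance

def current_balance(conn, family_id):
    """Latest snapshot plus only the transactions posted after it."""
    row = conn.execute(
        "SELECT as_of, balance FROM balance_snapshot "
        "WHERE family_id = ? ORDER BY as_of DESC LIMIT 1",
        (family_id,),
    ).fetchone()
    as_of, base = row if row else ("", 0)
    (delta,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM txn "
        "WHERE family_id = ? AND posted_at > ?",
        (family_id, as_of),
    ).fetchone()
    return base + delta

june_balance = take_snapshot(conn, 1, "2008-06-30")
july_balance = current_balance(conn, 1)
```

Because a snapshot states which transactions it covers, older rows could in principle be archived once a snapshot exists for them, which is the retention concern raised above. Whether that is acceptable for a given charity's record keeping is a separate question.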
>> Mark
>>
>> PS Thanks for your reply. I hope you don't mind bearing with me
>> while I get my head around these new concepts!
Tony Toews [MVP] - 16 Jul 2008 21:17 GMT
>As keeping historically accurate info is important, 'hard-coding' the
>balance in the table seems to me like it might be the most appropriate
>solution,
Put the balance on the parent table. But also create a method of double checking that number: add up the individual transactions, tell the user there is a problem, log that problem so they can bring it to your attention at a later date, and then fix the balance. Also create this method before even coding the logic to update the balance. Thus you can use it as part of your testing.
Tony
 Signature Tony Toews, Microsoft Access MVP Please respond only in the newsgroups so that others can read the entire thread of messages. Microsoft Access Links, Hints, Tips & Accounting Systems at http://www.granite.ab.ca/accsmstr.htm Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
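The double-check described in this post could look roughly like the following sketch in Python with the standard sqlite3 module. The schema (family, txn, balance_problem), the sample family names and the integer-pence amounts are hypothetical, not taken from the thread; the check compares each stored balance with the sum of that family's transactions, logs any difference, and then resets the stored figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE family (
    family_id INTEGER PRIMARY KEY,
    name      TEXT    NOT NULL,
    balance   INTEGER NOT NULL DEFAULT 0   -- stored balance, in pence
);
CREATE TABLE txn (
    txn_id    INTEGER PRIMARY KEY,
    family_id INTEGER NOT NULL REFERENCES family(family_id),
    amount    INTEGER NOT NULL
);
CREATE TABLE balance_problem (
    family_id INTEGER NOT NULL,
    stored    INTEGER NOT NULL,
    computed  INTEGER NOT NULL,
    found_at  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);
""")

def find_mismatches(conn):
    """Return (family_id, stored, computed) for every balance that is out of step."""
    return conn.execute("""
        SELECT f.family_id, f.balance, COALESCE(SUM(t.amount), 0) AS computed
        FROM family f LEFT JOIN txn t ON t.family_id = f.family_id
        GROUP BY f.family_id, f.balance
        HAVING f.balance <> COALESCE(SUM(t.amount), 0)
        ORDER BY f.family_id
    """).fetchall()

def log_and_fix(conn):
    """Log each mismatch, then reset the stored balance to the computed one."""
    problems = find_mismatches(conn)
    for family_id, stored, computed in problems:
        conn.execute(
            "INSERT INTO balance_problem (family_id, stored, computed) VALUES (?, ?, ?)",
            (family_id, stored, computed),
        )
        conn.execute(
            "UPDATE family SET balance = ? WHERE family_id = ?",
            (computed, family_id),
        )
    return problems

# Sample data: family 1 is consistent, family 2 has a stale stored balance.
conn.execute("INSERT INTO family (family_id, name, balance) VALUES (1, 'Smith', 1500)")
conn.execute("INSERT INTO family (family_id, name, balance) VALUES (2, 'Jones', 700)")
conn.executemany(
    "INSERT INTO txn (family_id, amount) VALUES (?, ?)",
    [(1, 2000), (1, -500), (2, 1000)],
)

problems = log_and_fix(conn)
remaining = find_mismatches(conn)
```

Writing find_mismatches first, as the post suggests, means it can double as a test for whatever code later maintains the stored balance.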
Evan Keel - 16 Jul 2008 21:53 GMT
> >As keeping historically accurate info is important, 'hard-coding' the
> >balance in the table seems to me like it might be the most appropriate
[quoted text clipped - 7 lines]
> > Tony
Can't wait to see the responses to this.
Tony Toews [MVP] - 17 Jul 2008 23:49 GMT
>Can't wait to see the responses to this.
So why don't you respond then?
Tony
JOG - 18 Jul 2008 01:32 GMT
On Jul 17, 11:49 pm, "Tony Toews [MVP]" <tto...@telusplanet.net> wrote:
> >Can't wait to see the responses to this.
> [quoted text clipped - 8 lines]
> Microsoft Access Links, Hints, Tips & Accounting Systems at http://www.granite.ab.ca/accsmstr.htm
> Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
What does MVP stand for? I looked it up on Wikipedia and it tells me MVP stands for "Most Valuable Player" - which in turn translates to "Man of the Match" in UK English. Man of the Match at Access? Wuh? Yours Bamboozled, J.
Bob Badour - 18 Jul 2008 02:35 GMT
> On Jul 17, 11:49 pm, "Tony Toews [MVP]" <tto...@telusplanet.net>
> wrote:
[quoted text clipped - 16 lines]
> of the Match" in UK english. Man of the Match at Access? Wuh? Yours
> Bamboozled, J.
It stands for "Most Vociferous Person"
Tony Toews [MVP] - 18 Jul 2008 04:11 GMT
>It stands for "Most Vociferous Person"
Also Most Vicious Psychotic and Most Vocal Pr*ck.
Tony
Tony Toews [MVP] - 18 Jul 2008 08:19 GMT
>What does MVP stand for? I looked up and wikipedia and it tells me MVP
>stands for "Most Valuable Player" - which in turn translates to "Man
>of the Match" in UK english. Man of the Match at Access? Wuh? Yours
>Bamboozled, J.
http://mvp.support.microsoft.com/
Tony
Roy Hann - 18 Jul 2008 06:52 GMT
>> >As keeping historically accurate info is important, 'hard-coding' the
>> >balance in the table seems to me like it might be the most appropriate
[quoted text clipped - 11 lines]
>>
>> Tony
> Can't wait to see the responses to this.
Oddly this is one of the first such postings of Tony's I don't really have a problem with. In fact I entirely approve of the idea of imposing integrity constraints before coding. My main gripe is that it is something of a recipe and although the business process it describes is potentially sound, I wouldn't have assumed any favorite process of mine would automatically suit everyone else.
If the OP notices that and considers it, he might be OK.
 Signature Roy
Tim X - 17 Jul 2008 10:06 GMT
> Hi all
> [quoted text clipped - 17 lines]
> grows I would have to periodically 'consolidate' the balance to
> maintain efficiency.
I think you may be a little off the mark with your interpretation of 3NF. It's not about not storing calculated values. Although a bit simplistic, think of it as not having the same data stored in multiple places because then you can get data inconsistencies when one copy is updated and the other is not - how do you determine which is correct?
Also, don't worry too much about efficiency initially. What you're describing is a normal, expected application of the technology. You also don't seem to be dealing with that many records given common data set sizes and the processing power of hardware these days. Optimizing too early is often a fatal mistake. Wait and see (ideally in a testing stage rather than a production stage) - you will be surprised how much performance can be affected by the right use of indexes. If performance becomes an issue and you cannot resolve it with available database features (partitioning, indexing, summary tables, views etc.) then you can consider using summary tables and data archiving etc.
> However, I see each updated balance as a snapshot in time - when I
> look back to figure out 'who paid what when', I want to be 100%
> confident that the database will return the same balance it would have
> returned at the time. If a rogue entry were to somehow find its way
> into the table, it would mess all the balances up...
Think about it from a different perspective. The problem isn't that the balances get all messed up. The problem is allowing historical data to be modified. If you cannot change the balance or date of a record and you cannot insert a record with a retrospective date, then your historical data doesn't change and hence any calculations based on that data don't change. Note that most accountants and auditors would not be very impressed with any system that allowed either updates of past records or insertion of records retrospectively.
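As a rough illustration of "historical data cannot be modified", the sketch below uses SQLite triggers (through Python's sqlite3 module) to reject updates, deletes and backdated inserts on a transaction table. The table, trigger names and error messages are assumptions for the example, and other database products would express the same rule differently, for instance through permissions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE txn (
    txn_id    INTEGER PRIMARY KEY,
    family_id INTEGER NOT NULL,
    amount    INTEGER NOT NULL,                           -- pence
    posted_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP  -- set by the database
);
-- Existing rows can be neither changed nor removed.
CREATE TRIGGER txn_no_update BEFORE UPDATE ON txn
BEGIN
    SELECT RAISE(ABORT, 'transactions are append-only');
END;
CREATE TRIGGER txn_no_delete BEFORE DELETE ON txn
BEGIN
    SELECT RAISE(ABORT, 'transactions are append-only');
END;
-- A new row may not be dated earlier than the latest existing row.
CREATE TRIGGER txn_no_backdating BEFORE INSERT ON txn
WHEN NEW.posted_at < (SELECT MAX(posted_at) FROM txn)
BEGIN
    SELECT RAISE(ABORT, 'backdated transactions are not allowed');
END;
""")

def attempt(conn, sql):
    """Run one statement; return 'ok' or the database's error message."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.DatabaseError as exc:
        return str(exc)

conn.execute("INSERT INTO txn (family_id, amount) VALUES (1, 2000)")

update_result = attempt(conn, "UPDATE txn SET amount = 1 WHERE txn_id = 1")
delete_result = attempt(conn, "DELETE FROM txn WHERE txn_id = 1")
backdate_result = attempt(
    conn,
    "INSERT INTO txn (family_id, amount, posted_at) "
    "VALUES (1, 500, '2000-01-01 00:00:00')",
)
insert_result = attempt(conn, "INSERT INTO txn (family_id, amount) VALUES (1, -2000)")
(row_count,) = conn.execute("SELECT COUNT(*) FROM txn").fetchone()
```

With rules like these in place, a mistake is corrected by adding a new offsetting row rather than by editing the old one, so a balance computed for any past date stays the same.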
> IDEA 2
> As keeping historically accurate info is important, 'hard-coding' the
> balance in the table seems to me like it might be the most appropriate
> solution, even though it breaks the normalisation rules.
Personally, I'd avoid storing the balance unless the calculation of that balance is too resource hungry. As long as you can't change data retrospectively and can't insert records retrospectively, just calculate the balance when you need it.
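A minimal sketch of "just calculate the balance when you need it" follows, again with Python's sqlite3 module and a hypothetical txn table holding integer pence. It produces a family's statement up to a chosen date with a running balance worked out at query time; the index on (family_id, posted_at) is an assumption about what would keep such a query cheap, not something measured.

```python
import sqlite3
from itertools import accumulate

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE txn (
    txn_id    INTEGER PRIMARY KEY,
    family_id INTEGER NOT NULL,
    posted_at TEXT    NOT NULL,   -- ISO 8601 date
    amount    INTEGER NOT NULL    -- pence; positive = charge, negative = payment
);
CREATE INDEX txn_family_date ON txn (family_id, posted_at);
""")

conn.executemany(
    "INSERT INTO txn (family_id, posted_at, amount) VALUES (?, ?, ?)",
    [
        (1, "2008-06-10", 2000),
        (2, "2008-06-15", 300),
        (1, "2008-06-20", -500),
        (1, "2008-07-05", 1000),
    ],
)

def statement(conn, family_id, as_of):
    """Return (posted_at, amount, running_balance) rows up to and including as_of."""
    rows = conn.execute(
        "SELECT posted_at, amount FROM txn "
        "WHERE family_id = ? AND posted_at <= ? "
        "ORDER BY posted_at, txn_id",
        (family_id, as_of),
    ).fetchall()
    running = accumulate(amount for _, amount in rows)
    return [(posted_at, amount, balance)
            for (posted_at, amount), balance in zip(rows, running)]

june = statement(conn, 1, "2008-06-30")
july = statement(conn, 1, "2008-07-31")
balance_end_of_july = july[-1][2] if july else 0
```

Asking for the statement as of an earlier date returns the same rows and balances it would have returned then, provided no rows are changed or backdated afterwards.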
What I would possibly do is -
1. Use a timestamp on the records that is automatically set when you insert the record. i.e. not set by the user.
2. If performance becomes an issue, consider having either some form of partitioning or end of month process. Track when the end of month process has been run and use that to generate a historical summary table. If/when you need a detailed report that covers more than the current period, use the data in the historical summary table.
3. Take advantage of 300 years of modern accounting practices and design your system as a double entry accounting system. This provides additional checks and balances and you tend to get lots of other benefits, such as simplified auditing processes. Include the ability to enter bank statements or deposit receipts. This can be used to ensure everything balances. Don't allow a month to be closed off until everything balances.
4. Ensure you have some type of journaling function that will enable adjustments. Data will be entered incorrectly - that is the one thing you can guarantee - for example, simple transposing of numbers during data entry etc. You need to be able to 'fix' such things and you need to be able to do it in an acceptable way from an accounting and auditing perspective. Rule of thumb, you should be able to reproduce *all* the data entry processes that have occurred easily and pretty much transparently.
5. If your database supports constraints (foreign keys, check constraints, not null, primary key constraints, etc.) then use them. In the short term, it may feel like a lot of extra work, but once you start coding the application logic, things become a lot easier. Much better to prevent bad data from getting in than trying to fix it once it does.
6. Make sure you thoroughly know and understand how this information is recorded and managed now. I'm assuming it is a manual system. Being really familiar with current practice will provide valuable information in both the design of your underlying data model and the overall implementation. Don't try to just replicate the process using technology - your solution will almost certainly require those who manage the process to refine their process, but make sure you understand their current process before trying to design them a new one.
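Points 3 to 5 in the list above can be sketched together, under heavy assumptions, with Python's sqlite3 module. The account names, the three-table layout (account, journal, posting), the sign convention (debits positive, credits negative, in pence) and the sample amounts are all invented for illustration; a real double entry design would need an accountant's input. The idea shown is that every journal entry must sum to zero, constraints reject malformed rows, and a keying error is corrected by a reversing entry instead of an edit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE journal (
    journal_id INTEGER PRIMARY KEY,
    posted_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    memo       TEXT NOT NULL,
    reverses   INTEGER REFERENCES journal(journal_id)  -- set on correcting entries
);
CREATE TABLE posting (
    journal_id INTEGER NOT NULL REFERENCES journal(journal_id),
    account_id INTEGER NOT NULL REFERENCES account(account_id),
    amount     INTEGER NOT NULL CHECK (amount <> 0)     -- debit > 0, credit < 0
);
""")

def post_entry(conn, memo, lines, reverses=None):
    """Record one journal entry; lines is a list of (account_name, amount)."""
    if sum(amount for _, amount in lines) != 0:
        raise ValueError("entry does not balance")
    with conn:  # commit on success, roll back on any error
        cur = conn.execute(
            "INSERT INTO journal (memo, reverses) VALUES (?, ?)", (memo, reverses)
        )
        journal_id = cur.lastrowid
        for name, amount in lines:
            conn.execute(
                "INSERT INTO posting (journal_id, account_id, amount) "
                "VALUES (?, (SELECT account_id FROM account WHERE name = ?), ?)",
                (journal_id, name, amount),
            )
    return journal_id

def reverse_entry(conn, journal_id, memo):
    """Cancel an earlier entry by posting its exact opposite."""
    rows = conn.execute(
        "SELECT a.name, p.amount FROM posting p "
        "JOIN account a USING (account_id) WHERE p.journal_id = ?",
        (journal_id,),
    ).fetchall()
    return post_entry(conn, memo, [(n, -amt) for n, amt in rows], reverses=journal_id)

def account_balance(conn, name):
    """Sum of all postings to the named account."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(p.amount), 0) FROM posting p "
        "JOIN account a USING (account_id) WHERE a.name = ?",
        (name,),
    ).fetchone()
    return total

conn.executemany(
    "INSERT INTO account (name) VALUES (?)",
    [("family-1 receivable",), ("event income",), ("bank",)],
)

post_entry(conn, "event booking",
           [("family-1 receivable", 2500), ("event income", -2500)])
wrong = post_entry(conn, "payment (mis-keyed as 52.00)",
                   [("bank", 5200), ("family-1 receivable", -5200)])
reverse_entry(conn, wrong, "reverse mis-keyed payment")
post_entry(conn, "payment 25.00",
           [("bank", 2500), ("family-1 receivable", -2500)])

owed = account_balance(conn, "family-1 receivable")
banked = account_balance(conn, "bank")
(journal_count,) = conn.execute("SELECT COUNT(*) FROM journal").fetchone()
```

The mis-keyed payment and its reversal both stay in the journal, so the sequence of data entry can be replayed, while the family's balance still comes out as if the error had never happened.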
Finally, make sure there isn't an affordable or open source solution that would suffice. Building such an application is a fun project and if it's for an organisation you feel warrants such support, it's generally a good thing. However, all applications need maintenance and support. If you're not prepared or are unable to provide this maintenance and support, they need to get it from somewhere else and it can come at a cost - particularly for one-off custom apps. Given you have never designed and implemented a database and application before, you are almost certainly going to get things wrong. I've been working in the industry for over 20 years and I still get things wrong - the only difference is that I get it wrong less often and when I do it tends to be less critical.
good luck
Tim
 Signature tcross (at) rapttech dot com dot au
Ed Prochak - 21 Jul 2008 19:35 GMT
> 6. Make sure you thoroughly know and understand how this information is
> recorded and managed now. I'm assuming it is a manual system. Being
[quoted text clipped - 4 lines]
> process to refine their process, but make sure you understand their
> current process before trying to design them a new one.
A very important bit of advice for beginner developers. Good job Tim!
> Finally, make sure there isn't an affordable or open source solution
> that would suffice. Building such an application is a fun project and if
[quoted text clipped - 14 lines]
> --
> tcross (at) rapttech dot com dot au
Ed