Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
Database Servers
DB2InformixIngresMS SQLOraclePervasive.SQLPostgreSQLProgressSybase
Desktop Databases
FileMakerFoxProMS AccessParadox
General
General DB TopicsDatabase Theory
Related Topics
Java Development.NET DevelopmentVB DevelopmentMore Topics ...

Database Forum / FoxPro / General FoxPro Topics / July 2008

Tip: Looking for answers? Try searching our database.

Confusing performance difference

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
Tim Witort - 10 Jul 2008 17:37 GMT
VFP8 SP1 on XP Pro SP2

I have some code that was running a bit slowly:

 SELECT DISTINCT colToFix ;
 INTO ARRAY colArray ;
 FROM myTbl

 SELECT myTbl
 FOR EACH curVal IN colArray
   lcNewVal = ThisForm.Standardize(curVal)

   REPLACE colToFix WITH lcNewVal FOR colToFix = curVal
 ENDFOR

There is an index on the ColToFix column.  myTbl is a
free table.

Standardize is a method that makes sure the passed in
value follows certain formatting rules.

On a table with 50K records, this was taking almost 2
minutes.  There are about 10 distinct values in the
column in question.  All of the execution time was
happening in the 10 calls to REPLACE - some of them
taking over 20 seconds!  With such a small table and
indexed on the column in question, I would think
each REPLACE would take no more than a second.  I've
tested on many, many tables of the same size with the
same result, so it's not a corruption issue.

Playing around with it, I changed it to the following:

 SELECT myTbl
 SCAN
    REPLACE colToFix WITH ThisForm.Standardize(colToFix)
 ENDSCAN

Amazingly, this was 5 times faster!  How could performing
50K replacements and 50K calls to the standardize method
take *less* time than doing 10 calls to each?!?

This was also 5 times faster:

 SELECT myTbl
 REPLACE colToFix WITH ThisForm.Standardize(colToFix) ALL

I rarely have performance problems in VFP, so this one's
a bit of a head scratcher.  Any ideas on why this difference
and how to possibly make this run even faster?

-- TRW
_______________________________________
t i m  .  w i t o r t
_______________________________________
Dan Freeman - 10 Jul 2008 18:50 GMT
Is colToFix indexed? If not, VFP will pause to create one in memory on each
REPLACE ... FOR.

The other approaches won't worry about indexes.

Dan

> VFP8 SP1 on XP Pro SP2
>
[quoted text clipped - 51 lines]
> t i m  .  w i t o r t
> _______________________________________
Tim Witort - 14 Jul 2008 17:56 GMT
>> VFP8 SP1 on XP Pro SP2
>>
[quoted text clipped - 46 lines]
>> a bit of a head scratcher.  Any ideas on why this difference
>> and how to possibly make this run even faster?

> Is colToFix indexed? If not, VFP will pause to create one in memory on
> each REPLACE ... FOR.
>
> The other approaches won't worry about indexes.
>
> Dan

Yes, colToFix has an index on it.

-- TRW
_______________________________________
t i m  .  w i t o r t
_______________________________________
Gene Wirchenko - 10 Jul 2008 20:01 GMT
[snip]

>I rarely have performance problems in VFP, so this one's
>a bit of a head scratcher.  Any ideas on why this difference
>and how to possibly make this run even faster?

    For the first way, the program is going through the table about
eleven times: once in the initial select and then ten (or whatever)
times for the values.

    The second way goes through the table once.

Sincerely,

Gene Wirchenko

Computerese Irregular Verb Conjugation:
    I have preferences.
    You have biases.
    He/She has prejudices.
Tim Witort - 14 Jul 2008 18:04 GMT
Gene Wirchenko seemed to utter in news:gvmc74pmf77dfqlg9ul1jnn3ueevnh20u5@
4ax.com:

> [snip]
>
[quoted text clipped - 11 lines]
>
> Gene Wirchenko

But with an index on the column used in the FOR condition,
it will just use the index to go directly to the
correct records.  It shouldn't scan the entire table for
each REPLACE.  And the SCAN calls the function 50K times!
This is a pretty streamlined function, but I would think
that the SCAN would *at best* take a little longer than
the REPLACE FOR approach.  Yet it's 5 times faster!

I wonder if some strange RUSHMORE optimization is happening
that is taking extra time to setup and not paying back in
speedier REPLACEments.

-- TRW
_______________________________________
t i m  .  w i t o r t
_______________________________________
Gene Wirchenko - 14 Jul 2008 20:34 GMT
>Gene Wirchenko seemed to utter in news:gvmc74pmf77dfqlg9ul1jnn3ueevnh20u5@
>4ax.com:
[quoted text clipped - 10 lines]
>>
>>      The second way goes through the table once.

>But with an index on the column used in the FOR condition,
>it will just use the index to go directly to the
>correct records.  It shouldn't scan the entire table for

    Which correct records?  Your code touches EVERY one of them.

>each REPLACE.  And the SCAN calls the function 50K times!
>This is a pretty streamlined function, but I would think
>that the SCAN would *at best* take a little longer than
>the REPLACE FOR approach.  Yet it's 5 times faster!

    First scan.  All of the values that occur are extracted.

    Second to n+1 scans.  Each of the values is processed.  There are
quite possibly multiple occurrences of value.

    Since every value that occurs is processed, EVERY row is going to
be processed.

    Which do you think is the more efficient way to process?

      1) n passes through the data (using indexes or not)
      2) one pass through the data

    Note that processing a file sequentially is faster than
processing the same file in index order.  The sequential access does
not have jumping about.  The indexed access may have a lot; at best,
it will be the same as sequential (and that is unlikely to occur).

>I wonder if some strange RUSHMORE optimization is happening
>that is taking extra time to setup and not paying back in
>speedier REPLACEments.

    Nope.  You are doing a pessisimisation by separating out the
processing for each value into a pass.

    It does depend on what .Standardize() does.  If it is slow, then
your organisation of code could work.  If it is not, then the time it
takes to repeatedly execute is drowned out by the disk I/O.  It would
take a rather time-consuming method for it to not be drowned out by
disk I/O.

    Even if .Standardize() were somewhat slow, one could use caching
of the results.  Primary memory access is faster than secondary memory
access.

    A final note: curval is a resrved word, so you should use
something else.

Sincerely,

Gene Wirchenko

Computerese Irregular Verb Conjugation:
    I have preferences.
    You have biases.
    He/She has prejudices.
Christof Wollenhaupt - 15 Jul 2008 08:42 GMT
Hi Tim,

How large is the table, what is the record size and what size does the
indexed field have? This all would have impact on the performance. The
difference between the two approaches, for instance, could be that the first
one reads lots of record individually, whereas the SCAN loop reads huge
chunks of the table in one pass. On the other hand, REPLACE FOR requires a
table lock, whereas REPLACE in the second sample uses a record lock. A table
lock should be faster unless there are others working on the table or FOR
EACH loop iterates about the same number of times as the SCAN loop.

Many factors may be important here... If you want to see what is going on I
would suggest using FILEMON/PROCMON to monitor file activity during the two
samples.

Signature

Christof

Tim Witort - 29 Jul 2008 01:11 GMT
> Hi Tim,
>
[quoted text clipped - 11 lines]
> on I would suggest using FILEMON/PROCMON to monitor file activity
> during the two samples.

Good idea.  I might have to do that.

-- TRW
_______________________________________
t i m  .  w i t o r t
_______________________________________
Gene Wirchenko - 29 Jul 2008 02:52 GMT
"Christof Wollenhaupt" <msnews.microsoft.com.20050422@prolib.de>
wrote:

[snip]

>Many factors may be important here... If you want to see what is going on I
>would suggest using FILEMON/PROCMON to monitor file activity during the two
                    ^^^^^^^^^^^^^^^
>samples.

    Is this referring to the Sysinternals tools or something else?

Sincerely,

Gene Wirchenko

Computerese Irregular Verb Conjugation:
    I have preferences.
    You have biases.
    He/She has prejudices.
Christof Wollenhaupt - 29 Jul 2008 19:56 GMT
Hi Gene,

>     Is this referring to the Sysinternals tools or something else?

Yes, the Sysinternals tools (now Microsoft). Sorry, I should have been more
verbose on that.

http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx

Signature

Christof

 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.