From: Johnny Lee (typo_pl@hotmail.com)
Date: Fri Oct 03 2003 - 01:25:47 EDT
Summary:
There are several speedups here and some size reduction with these changes.
Import and building the RTF doc has been reduced from 23 seconds to ~15 
seconds total.
Some of the changes may not be acceptable since I'm concentrating on 
speeding up loading the RTF spec and I haven't been able to determine if 
some of my assumptions are valid.
I'm just proposing a solution. It may not be the best solution since I 
barely know the code and the coding style. I did increase the size of 
fp_Container objects by 4 bytes and the size of UT_Stringbuf objects by 4 
bytes. Feel free to code up a different solution.
The code for change #3 is in the attached bug5291e.txt.
The code for change #1, #2, and #4 is in bug5291f.txt.
Change #1
==============
After profiling Abiword with all the previous 5291 patches, the function 
listed at the top was fp_TableContainer::getCellAtRowColumn.
Looking at the code for this function, there's a linear search thru the 
cells in a table to find the cell in the appropriate spot.
When I dumped the position portion of the cell objects, they turned up in 
sorted order, from top to bottom, left to right.
*** If this assumption is not correct, then the patch will probably fail to 
work correctly. ***
Using that sorted array, I replaced the linear search with a binary search.
This saved about 4 seconds on building the document.
Change #2
==============
Profiling the new version, the top function listed was 
fp_Container::clearBrokenContainers. This is a recursive function which is 
called millions of times while building the document.
It took me a long time to figure out the function of 'broken containers'.
Robert Wilhelm had a patch so that clearBrokenContainers would only write to 
the m_pMyBrokenContainer field if it was non-NULL.
I added code to see how often the code NULLed out this field.
Less then 20 times for the RTF spec! Millions of calls to NULL a variable 
less than 20 times.
The reason it has to do this recursively is that the parent container 
doesn't know anything about broken containers in its children.
And how many broken containers are there?
fp_Container::setMyBrokenContainer is only called about 120 times.
So I added a field to fp_Container which represents the number of broken 
containers in the current container and any children.
When fp_Container::setMyBrokenContainer is called, the code goes from the 
current container up to its parent, grandparent, etc. and increments the 
count of broken containers in each of them.
When fp_Container::clearBrokenContainers is called, if m_pMyBrokenContainer 
is cleared, the broken--container-count is decremented in the current 
container, and in its parent and grandparent, etc.
When it loops through the children containers, the code checks their 
broken-container-count. If it is zero, then we don't bother calling 
clearBrokenContainers on them.
This saves a couple of seconds building the RTF spec.
Change #3
==============
Profiling again, showed the following functions near the top:
operator delete
UT_String::UT_String
UT_StringPtrMap::find_slot
UT_StringPtrMap::pick
Looking at the code, UT_StringPtrMap::pick creates a UT_String object which 
allocates space for the string, copies the string, and gets passed to 
UT_StringPtrMap::find_slot. Afterwards, the UT_String is destroyed which 
includes freeing the allocated string.
I modified UT_Stringbuf and UT_String so that you can pass in a 
null-terminated string and no memory is allocated for the string, the code 
just references the string passed in - ideal when used for a read-only 
string, such as checking if something is in a hash table.
My code changes add a few checks to the base string classes so that you 
can't modify the external string. You may want to come up with another 
solution that doesn't involve such extensive changes in the class. The idea 
is to get rid of the memory allocation/deallocation, etc. when the string is 
only examined. There are other places in ut_hash.cpp where this occurs as 
well.
Change #4
==============
Finally, a patch that deals more with reducing size than speed. I noticed in 
fp_TableContainer.cpp that there's lotsa code similar to this:
         getNthRow(i)->requisition = 0;
         getNthRow(i)->allocation = 0;
         getNthRow(i)->spacing = m_iRowSpacing;
         getNthRow(i)->need_expand = 0;
         getNthRow(i)->need_shrink = 0;
         getNthRow(i)->expand = 0;
         getNthRow(i)->shrink = 0;
I looked at the disassembly of the code. The compiler didn't realize that 
the result from getNthRow(i) was the same for every line in this block of 
code. So the compiler generated code which would calculate the result for 
each statement.
I created a variable and assigned the result from getNthRow/getNthCol to the 
variable and used the variable instead.
This saved about 16 bytes each time with VC6.
_________________________________________________________________
Instant message during games with MSN Messenger 6.0. Download it now FREE!  
http://msnmessenger-download.com
This archive was generated by hypermail 2.1.4 : Fri Oct 03 2003 - 01:41:20 EDT