On Tue, 2009-07-14 at 17:16 +0100, Tomas Frydrych wrote:
> Hi Robert,
> 
> Robert Wilhelm wrote:
> > Hello Tomas,
> > 
> > this patch is in reply to 
> > http://www.abisource.com/mailinglists/abiword-dev/2006/Feb/0081.html
> > 
> > Some things take a while...
> 
> :) Probably about as long since I have contributed anything ...
> 
> > It avoids quadratic behavior and many calls to g_utf8_pointer_to_offset
> > by iterating only one time through the string.
> 
> The patch looks good; getting rid of the g_utf8_pointer_to_offset() in
> this case makes sense, I think (though I do not see how this change
> avoids quadratic behaviour, as the while loop is equivalent of the
> g_utf8_pointer_to_offset() function call).
> 
> Tomas
Hi Tomas,
thanks for the review.
                for(int i = 0; i < pGlyphs->num_glyphs; ++i)
                {
                        int iOff = pGlyphs->log_clusters[i];
-                       pLogOffsets[i] =  g_utf8_pointer_to_offset (pUtf8, pUtf8 + iOff);
+                       while (s <  pUtf8 + iOff)
+                       {
+                               s = g_utf8_next_char (s);
+                               offset++;
+                       }
+                       pLogOffsets[i] =  offset;
                }
        }
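The point is that s and offset are remembered between iterations of
the outer for loop. The inner while loop is therefore very short: it
only advances from the previous cluster position to the current one.
Before, every call to g_utf8_pointer_to_offset() walked the string
from pUtf8 all the way to pUtf8 + iOff, so the total work grew with
the sum of all the iOff values (quadratic in the length of the run),
whereas now the string is traversed only once.
For my test case (first doc from #9411), we are now down to 42s from
50s.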
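
To make the difference concrete, here is a minimal standalone sketch of
both variants. It is not the AbiWord code: next_char() is a hypothetical
stand-in for g_utf8_next_char() so that the example builds without GLib,
and the function and parameter names are made up for illustration. It
assumes valid UTF-8 input, and the incremental variant additionally
assumes that the byte offsets never decrease from one glyph to the next.

```c
#include <stddef.h>

/* Hypothetical stand-in for g_utf8_next_char(): step over one UTF-8
   character. Assumes p points at the start of a valid sequence. */
static const char *next_char(const char *p)
{
    p++;
    while (((unsigned char)*p & 0xC0) == 0x80)  /* skip continuation bytes */
        p++;
    return p;
}

/* Old behaviour: for every glyph, count characters from the start of the
   string up to the glyph's byte offset. Work is the sum of all offsets. */
static void offsets_rescan(const char *utf8, const int *byte_offs, int n,
                           int *out)
{
    for (int i = 0; i < n; ++i) {
        int count = 0;
        for (const char *s = utf8; s < utf8 + byte_offs[i]; s = next_char(s))
            count++;
        out[i] = count;
    }
}

/* Patched behaviour: s and offset survive across iterations, so the
   string is walked once in total. Assumes byte_offs is non-decreasing. */
static void offsets_incremental(const char *utf8, const int *byte_offs, int n,
                                int *out)
{
    const char *s = utf8;
    int offset = 0;

    for (int i = 0; i < n; ++i) {
        while (s < utf8 + byte_offs[i]) {
            s = next_char(s);
            offset++;
        }
        out[i] = offset;
    }
}
```

Both functions should fill out[] with the same character offsets as long
as the byte offsets are in non-decreasing order; if they could decrease,
the cursor would have to be reset before continuing.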
Best regards,
row 
Received on Tue Jul 14 21:12:31 2009