[EDIT - Fix unintentional italics] Another ~5% off by getting rid of the branch ...

jasonmoo · on Sept 5, 2012

Hey thanks for the suggestions. One thing that I found is that testing both strings for null actually performs faster because you may have parameters of different lengths and bailing on the null character of the shortest is ideal. And actually testing for an ampersand would probably bail even earlier in cases of duplicate params.

And maybe you can shed some light on this. I am a fan of the while loop too, but I found the for loop to be more performant. Is it possible that there are optimizations for variables touched in the for(;;) definition that do not exist for the while() definition?

gsg · on Sept 5, 2012

How is testing for a terminator the right approach? The comparator is only supposed to look at a single parameter, and the only null byte in the url string is at the end...

alexkus · on Sept 5, 2012

Both implementations have exactly the same outcome for all inputs, it's just one has less code (well, code that is less complex) inside the loop and therefore is more pipeline friendly and/or just executes faster.

Yes the string comparator is comparing items like:

  "a=3&a=4" and "a=4"

but it doesn't really matter since it'll never put such items in the wrong order. It's a quirk of passing the original url as 'const char *' so that it can't be modified (to temporarily replace the & symbols with NULs) and the OP was trying to avoid the cost of copying the input string except when creating the output string.

You should only look at the next byte in either string if both of the current bytes are non-NUL. If both are equal, and one is non-NUL then the other is non-NUL and so you can continue.

If they're non-equal then you stop and return the difference between the two which will be +ve or -ve.

If they're equal and one is NUL then you stop and return the difference which will be 0 since they are equal.

gsg · on Sept 5, 2012

I'm aware that it will return the right answer, but I was worried that it may look at too much of the url string in order to do so.

Consider the (admittedly perverse) input "test.php?a=1&a=1&a=1&a=1&a=1". At each comparison, the comparator you suggest will look at the entire remaining part of the string.

Maybe duplicate elements are considered invalid input and aren't a concern, I'm not sure. But fixing it is not hard and doesn't require fiddling about with copying strings or adding terminating nulls: just find and remember the parameter length once, and use a length-based comparator.

alexkus · on Sept 5, 2012

True, and it's a simple change to stop the comparison at either a NUL or a '&', so that it doesn't that extra unnecessary time.

The other two improvements I'd be looking to do are:

1) try to rework the insertion sort to make it use a binary search (if over a certain threshold that is yet to be determined by experimentation). Should be faster than looping through each element each time.

2) Not automatically extend the tail each time (where insertion somewhere between head and tail is required). The element being added may be one from the head so it's best to work out where it goes first and then pick the extension direction that requires the fewest number of elements to be shuffled up. Also the shuffling can be done in bulk using memmove() (can't use memcpy since the areas of memory will overlap when shuffling up by more than one position); this should also be faster than doing them one at a time.

Anyway, i'm off on holiday so I've got about another 15 minutes to look at this so I'm not going to get far. If I remember I'll try and check up on the status of it in github when I get back.