09-07-2013 07:00 AM
Another solution. 🙂
(The other case is empty and thus adds 0.)
/Y
09-07-2013 08:35 AM
The data type is fixed point, meaning the position of the decimal point is fixed for the number and you define where it is located.
For your example
0.11101100100000000000000
  <----- 23 bits of fraction ----->
Those fractional bits have the decimal weights you are looking for:
2^-1, 2^-2, 2^-3, ..., 2^-23
It is set to unsigned; otherwise the most significant bit (MSB, the leftmost bit) would be used for the sign (+/-) of the input.
This does have the shortcoming that the width is fixed for your input. You could zero-pad the right side to 32 bits wide and use the int->fxp cast with a <+32,0> configuration to give more flexibility.
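Since the diagrams under discussion are LabVIEW VIs, here is the same fixed-point weighting written out as a plain Python sketch, purely to illustrate the arithmetic (the function name is mine, not from any posted VI):

```python
def binary_fraction_to_float(bits: str) -> float:
    """Interpret a string of '0'/'1' characters as the fractional
    bits of an unsigned fixed-point number 0.b1b2...bn, where the
    bit at position i carries the weight 2**-i."""
    total = 0.0
    for i, bit in enumerate(bits, start=1):
        if bit == "1":
            total += 2.0 ** -i
    return total

print(binary_fraction_to_float("11101100100000000000000"))  # → 0.923828125
```

Every term is an exact power of two, so the DBL result is exact here; rounding only becomes a concern once positions exceed the 53-bit mantissa.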
09-07-2013 08:44 AM
Does using more structures (loops) mean more memory usage? If the job gets done without any For Loop or Case Structure, isn't that good?
09-07-2013 10:37 AM - edited 09-07-2013 10:47 AM
@NapDynamite wrote:
Does using more structures (loops) mean more memory usage? If the job gets done without any For Loop or Case Structure, isn't that good?
Not necessarily. Loops are highly optimized! Even the scan from string must loop internally one way or another, but it probably also contains a lot of safety code, e.g. to verify the string and generate errors if it contains stray characters or is otherwise problematic. None of the loop solutions contain any kind of error handling and thus could be slightly more efficient, because they do less!
Since the input strings are very small (~23 bytes here), memory use is not a concern. My solution creates a DBL array at the beginning and does the addition in place. Other solutions build the DBL array just to sum it later. I would think they are relatively similar, but it does not really matter here.
I have not measured the relative performance between "power of two" and "scale by power of two". Maybe the compiled code is the same for both in the case of DBL inputs.
If performance is of concern (and here it isn't!) and you can guarantee that the input string is always well formed (can you?), I probably would use a hybrid solution with elements from several of the above solutions as well as a lookup table for the exponentiation, e.g. as follows (the other case is "default" and wired across). Of course the LUT needs to be pre-calculated for the maximum number of characters expected, but also taking into account the limited mantissa size of DBL.
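The lookup-table idea can be sketched in Python terms (again only an illustration of the posted LabVIEW diagram; names and the table size are my assumptions): precompute the per-position weights once, then index into them instead of recomputing a power per character.

```python
# Hypothetical LUT sketch: precompute the weights 2**-1 .. 2**-N once.
# N should cover the longest expected string; a DBL mantissa holds
# 53 significant bits, so weights beyond that add nothing.
MAX_BITS = 53
WEIGHTS = [2.0 ** -(i + 1) for i in range(MAX_BITS)]

def fraction_via_lut(bits: str) -> float:
    # Sum the precomputed weights of the positions holding '1'.
    return sum(WEIGHTS[i] for i, c in enumerate(bits) if c == "1")

print(fraction_via_lut("11101100100000000000000"))  # → 0.923828125
```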
If you want to add some crude sanity checks, you could add a third case (48: wired across, 49: shown case, default: generate error and stop loop). However, Case Structures with more than two cases are always slower than Case Structures with only two cases.
Here is also a no-Loop version. It might be able to better leverage the SIMD features of the CPU.
I have not done any benchmarking and it is probably difficult because the code is so minimal.
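A loosely analogous no-loop formulation in Python/NumPy terms (my sketch, not the posted VI): build a Boolean mask from the bytes, then take a dot product with the weight vector, leaving the element-wise work to vectorized primitives.

```python
import numpy as np

def fraction_no_loop(bits: str) -> float:
    # Compare each ASCII byte to '1' to get a Boolean mask, then
    # dot it with the precomputed weights -- no explicit loop in
    # user code; NumPy can use SIMD internally.
    mask = np.frombuffer(bits.encode("ascii"), dtype=np.uint8) == ord("1")
    weights = 2.0 ** -np.arange(1, len(bits) + 1)
    return float(mask @ weights)

print(fraction_no_loop("11101100100000000000000"))  # → 0.923828125
```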
09-07-2013 11:24 AM - edited 09-07-2013 11:28 AM
09-07-2013 11:40 AM
altenbach wrote:
You might be able to squeak out some more performance by subtracting 48 from the byte array instead of comparing it to 48, and changing the Boolean array to a 0/1 number array. Of course, this would leave you open to issues if you input something other than a 0 or 1 into the string.
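The subtract-48 variant, again sketched in Python/NumPy terms rather than the actual LabVIEW diagram:

```python
import numpy as np

def fraction_subtract_48(bits: str) -> float:
    # Subtract ord('0') == 48 from the ASCII bytes so '0'/'1'
    # become 0/1 directly: no comparison and no Boolean-to-number
    # conversion.  Any character other than '0' or '1' silently
    # corrupts the result -- the caveat mentioned above.
    digits = np.frombuffer(bits.encode("ascii"), dtype=np.uint8) - 48
    weights = 2.0 ** -np.arange(1, len(bits) + 1)
    return float(digits @ weights)

print(fraction_subtract_48("11101100100000000000000"))  # → 0.923828125
```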
09-07-2013 12:11 PM
@crossrulz :
Indeed, one would have thought so! But no, it's a bit slower.
09-07-2013 12:39 PM
ouadji wrote: Indeed, one would have thought so! But no, it's a bit slower.
Debugging needs to be off and you need to avoid constant folding. Your current test is completely meaningless and does not measure the execution of the actual code.
As I said, it will be difficult to do any meaningful benchmarking on this. Try again.
09-07-2013 12:55 PM
Debugging is "on" precisely to avoid constant folding.
These are comparison tests, and I don't care to know the exact time of execution.
Only comparisons are important and meaningful.
Completely meaningless? Certainly not.
If you are able to do better, please show us.
09-07-2013 01:04 PM
@ouadji wrote:
Debugging is "on" precisely to avoid constant folding.
These are comparison tests, and I don't care to know the exact time of execution.
Only comparisons are important and meaningful.
Completely meaningless? Certainly not.
Debugging overhead for code containing a loop is more complicated than debugging overhead in the absence of loops. You are basically measuring the debugging overhead. While you get comparison results, they are not meaningful and thus misleading.
ouadji wrote: If you are able to do better, please show us.
I already said that benchmarking all this reliably will be very difficult and would probably cost me a couple of hours. I have not tried and I don't have time for it this weekend.