Ask HN: How is it possible to get -0.0 in a sum?

sparkie 3 days ago

It depends on the FP rounding mode. If rounding mode is FE_TOWARDZERO/FE_UPWARD/FE_TONEAREST then the case you gave is the only one I'm aware of. If rounding mode is FE_DOWNWARD (towards negative infinity) then other calculations that result in a zero will give a -0.0.

Here's an example of -1.0f + 1.0f resulting in -0.0: https://godbolt.org/z/5qvqsdh9P

gus_massa 2 days ago
Thanks! [Sorry for the delay.]
---
FYI: For more context, I'm trying to send a PR to Chez Scheme (and indirectly to Racket) https://github.com/cisco/ChezScheme/pull/959 to reduce expressions like
```
  (+ 1.0 (length L))  ;  ==>  (+ 1.0 (fixnum->flonum (length L)))
```
where the "fixnums" are small integers and "flonums" are double.
It's fine, unless you have the case
```
  (+ -0.0 (length L))  ;  =wrong=>  (+ -0.0 (fixnum->flonum (length L)))
```
because if the length is 0, it get's transformed into 0.0 instead of -0.0
There are a few corner cases, in particular because it's possible to have
```
   (+ 1.0 x (length L))
```
and I really want to avoid the runtime check of (length L) == 0 if possible.
So I took a look, asked there, and now your opinion confirms what I got so far. My C is not very good, so it's nice to have a example of how the rounding directions are used. Luckily Chez Scheme only uses the default rounding and it's probably correct to cut a few corners. I'll take a looks for a few days in case there is some surprise.
- sparkie a day ago
  I'm not sure you can avoid the check, but you can avoid a branch.
  An AVX-512 extension has a `vfixupimm` instruction[1] which can adjust special floating point values. You could use this to adjust all zeroes to -0 but leave any non-zeroes untouched. It isn't very obvious how to use though.
  vfixupimmsd dst, src, fixup, flag * The `flag` is for error reporting - we can set it to zero to ignore errors. * `dst` and `src` are a floating point value - they can be the same register. * The instruction first checks `src` and turns any denormals into zero if the MXCSR.DAZ flag is set. * It then categorizes `src` as one of {QNAN, SNAN, ZERO, ONE, NEG_INF, POS_ING, NEG_VALUE, POS_VALUE} * `fixup` is an array of 8 nybbles (a 32-bit int) and is looked up based on the categorization of `src` {QNAN = 0 ... POS_VALUE = 7} * The values of each nybble denote which value to place into `dst`: 0x0 : dst (unchanged) 0x1 : src (with denormals as zero if MXCSR.DAZ is set) 0x2 : QNaN(src) 0x3 : QNAN_Indefinite 0x4 : -INF 0x5 : +INF 0x6 : src < 0 ? -INF : +INF 0x7 : -0 0x8 : +0 0x9 : -1 0xA : +1 0xB : 1/2 0xC : 90.0 0xD : PI/2 0xE : MAX_FLOAT 0xF : -MAX_FLOAT
  You want to set the nybble for categorization ZERO (bits 11..8) to 0x7 (-0) in `fixup`. This would mean you want `fixup` to be equal to `0x00000700`. So usage would be:
  static __m128i zerofixup = { 0x700 }; double fixnum_to_flonum(int64_t fixnum) { __m128d flonum = { (double)fixnum }; return _mm_cvtsd_f64(_mm_fixupimm_sd(flonum, flonum, zerofixup, 0)); }
  Which compiles to just 4 instructions, with no branches:
  .FIXUP: .long 1792 # 0x700 .long 0 # 0x0 .long 0 # 0x0 .long 0 # 0x0 fixnum_to_flonum: vcvtsi2sd xmm0, xmm0, rdi vmovq xmm0, xmm0 vfixupimmsd xmm0, xmm0, qword ptr [rip + .FIXUP], 0 ret
  It can be extended to operate on 8 int64->double at a time (__m512d) with little extra cost.
  You could maybe use this optimization where the instruction is available and just stick with a branch version otherwise, or figure out some other way to make it branchless - though I can't think of any other way which would be any faster than a branch.
  [1]:https://www.intel.com/content/www/us/en/docs/intrinsics-guid...

gethly 2 days ago

i would guess that because of how *** * floats are in binary computers, you have something like -0.0000000000000000000000000000000000001 and when you round it you end up with -0.0. Same goes for positive value, you're just not used to write the + sign before every number, so seeing the minus feels strange.

dcminter 2 days ago

You're answering a question that OP did not ask.

kazinator 3 days ago

What happens if we take the smallest (as in closest to zero) negative subnormal and add it to itself?

gus_massa 3 days ago

Copying the example by sparkie, something like this? https://godbolt.org/z/xhdnb9ax3 I get +0.0 if I comment the round to negative option.