NaNs, Uninitialized Variables, and C++

I’ve become a big fan of fail fast programming, or what Matthew Wilson colorfully refers to as “hairshirt programming.” In other words, if I make a mistake in my code, or if I use a piece of code in a way that violates the assumptions under which it was written, I want to know as soon as possible. C++ is fairly good at doing this at compile time, thanks to a static, reasonably strong, extensible type system, and thanks to add-on features like BOOST_STATIC_ASSERT. For problems that can’t be detected at compile time, the liberal use of assertions can help enforce preconditions and postconditions, and classes like std::string and boost::array add error checking over C-style strings and arrays.

When I first read about signaling NaNs, I thought that they would be a wonderful addition to this fail fast toolbox. As documented, a signaling NaN is basically a magic value that, when assigned to a floating point variable, causes any attempt to use that variable to throw an immediate exception. This could be an immensely useful tool for tracking down uninitialized variable use and for tracing the operation of legacy code (to answer questions such as, “Is this variable really unused along this code path?”).

Unfortunately, signaling NaNs don’t work nearly as well as advertised. But first, some background…

NaNs are used in the IEEE 754 floating point standard to describe indeterminate forms such as 0/0, ∞ - ∞, 0 * ∞, and so on. NaNs are not the same thing as infinity (either positive or negative infinity). Attempting to do arithmetic on a NaN (almost always) results in another NaN. So if 0/0 is not a number, then 0/0 + 1 is also not a number, and 0/0 * 100 is not a number, and so on. Under some circumstances, this NaN propagation may be what you want; under other circumstances, you may want to catch NaNs as soon as possible (to invoke error recovery, or to launch a separate calculation path, or whatever). Presumably to accommodate this, IEEE 754 distinguishes between quiet NaNs (which quietly propagate when used) and signaling NaNs (which throw exceptions when used).
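As a rough illustration of quiet NaN propagation, here is a minimal sketch. The function names zero_over_zero and check_propagation are made up for this example, and it assumes IEEE 754 doubles with floating point exceptions left masked (the usual default):

```cpp
#include <cassert>
#include <cmath>

// Produce 0/0 at run time. The volatile operand keeps the compiler from
// folding the division away at compile time.
double zero_over_zero()
{
  volatile double zero = 0.0;
  return zero / zero;
}

// Fail fast if a NaN does not behave the way the text describes.
void check_propagation()
{
  const double n = zero_over_zero();
  assert(std::isnan(n));          // 0/0 is not a number...
  assert(std::isnan(n + 1.0));    // ...and neither is 0/0 + 1...
  assert(std::isnan(n * 100.0));  // ...or 0/0 * 100.
  assert(!std::isinf(n));         // A NaN is not an infinity.
}
```

Calling check_propagation() should return silently on a conforming platform; with exceptions unmasked, the division itself would trap instead.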

NaNs are represented by storing certain bit patterns in place of a “normal” floating point number. For details, see here. Interestingly, many of the bits in a NaN encoding are unused. (I.e., a NaN can be represented by many different bit patterns.) I’ve seen suggestions that the unused bits could be used to encode a line number or module number, although I’ve seen no code that takes advantage of this.
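To make the "unused bits" idea concrete, here is a hedged sketch of stashing a small tag (say, a line number) in the low mantissa bits of a quiet NaN. The names make_tagged_nan and nan_tag are hypothetical. The code assumes a 64-bit IEEE 754 double whose top mantissa bit marks a quiet NaN, as on x86, and it only shows that the bits survive a plain copy; whether they survive arithmetic is up to the hardware.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Exponent all ones plus the top mantissa bit: a quiet NaN (assumed layout).
const std::uint64_t quiet_nan_bits = 0x7ff8000000000000ULL;
// The remaining 51 mantissa bits are free to carry a tag.
const std::uint64_t payload_mask = 0x0007ffffffffffffULL;

// Build a quiet NaN carrying 'tag' in its low mantissa bits.
double make_tagged_nan(std::uint64_t tag)
{
  assert(tag <= payload_mask);  // Fail fast on tags that do not fit.
  const std::uint64_t bits = quiet_nan_bits | tag;
  double d;
  std::memcpy(&d, &bits, sizeof d);  // Copy bytes; no pointer casts.
  return d;
}

// Recover the tag from a NaN built by make_tagged_nan().
std::uint64_t nan_tag(double d)
{
  std::uint64_t bits;
  std::memcpy(&bits, &d, sizeof bits);
  return bits & payload_mask;
}
```

For instance, nan_tag(make_tagged_nan(__LINE__)) should hand back the line number it was given, and the tagged value still satisfies std::isnan.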

C++ supports quiet and signaling NaNs through its std::numeric_limits class template. For example:

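#include <cstring>
#include <iostream>
#include <limits>

int main(int argc, char *argv[])
{
  if (std::numeric_limits<double>::has_quiet_NaN) {
      double d = std::numeric_limits<double>::quiet_NaN();
      // Copy the bytes out rather than casting the pointer, which would
      // violate strict aliasing.  Assumes a 64-bit double.
      unsigned long long bits = 0;
      std::memcpy(&bits, &d, sizeof(d));
      std::cout << "The bit pattern of a quiet NaN on your platform is 0x"
           << std::hex << bits << std::endl;
  } else {
      std::cout << "You do not have a quiet NaN for doubles" << std::endl;
  }
}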

Similarly, std::numeric_limits<double>::signaling_NaN() returns a signaling NaN. Unfortunately, as far as I can tell, this feature of the Standard C++ Library is completely useless:

You can store a signaling NaN to a variable by directly assigning the appropriate bit pattern:

void set_snan(double& f)
{
  *((long long*)&f) = 0x7ff0000000000001LL;
}

Or, depending on your C++ library implementation, you might find it simpler to steal its signaling_NaN() implementation:

// Assign a signaling NaN using Dinkumware's implementation; works in MSVC
// and C++Builder.
void set_snan(double& f)
{
  memcpy(&f, &_Snan._Double, sizeof(f));
}

Even after you take care of assigning a signaling NaN, making sure that it actually signals is non-trivial:
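One way to probe this without trapping is to look at the floating point status flags after using the value. The sketch below is an assumption-laden experiment, not a guarantee: the names snan_probe and probe_snan_addition are invented here, it assumes IEEE 754 doubles and a C++11 <cfenv>, and on some targets (x87 code generation, for example) merely moving the value around may already quiet it before the addition runs.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// What was observed when 1.0 was added to a signaling NaN.
struct snan_probe
{
  bool invalid_flag_set;  // Did the addition set the FE_INVALID flag?
  bool result_is_nan;     // Is the sum a NaN?
};

snan_probe probe_snan_addition()
{
  assert(std::numeric_limits<double>::is_iec559);  // IEEE 754 assumed.

  // Signaling NaN bit pattern for a 64-bit double.
  const std::uint64_t snan_bits = 0x7ff0000000000001ULL;
  double tmp;
  std::memcpy(&tmp, &snan_bits, sizeof tmp);

  // volatile keeps the compiler from folding the addition at compile time.
  volatile double f = tmp;

  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double g = f + 1.0;
  const bool invalid = std::fetestexcept(FE_INVALID) != 0;

  const double sum = g;
  snan_probe r = { invalid, std::isnan(sum) };
  return r;
}
```

The sum should be a NaN everywhere; whether invalid_flag_set comes back true depends on the compiler and the floating point unit. A set flag is only a sticky status bit. Turning it into an actual trap requires unmasking the exception, which is what the _control87 and feenableexcept calls in the test program at the end of this post do.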

To summarize:

So, to conclude, signaling NaNs are probably not a good approach to handling uninitialized variables for the “fail fast” toolbox. In fact, I don’t know what practical use signaling NaNs have in the C++ standard at all.

Is there a good solution for catching uninitialized variable use? I haven’t yet had time to test out options against my code base to see what would work for me, but a few possibilities come to mind:
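As one illustration of the kind of option worth testing, here is a hypothetical sketch (the class name checked_value is invented for this example, and it is not something I have run against a real code base): a thin wrapper that remembers whether it has been assigned and asserts on any read before that.

```cpp
#include <cassert>

// Hypothetical helper: a value that must be assigned before it is read.
template <typename T>
class checked_value
{
public:
  checked_value() : value_(), initialized_(false) {}
  checked_value(const T& v) : value_(v), initialized_(true) {}

  checked_value& operator=(const T& v)
  {
    value_ = v;
    initialized_ = true;
    return *this;
  }

  bool is_initialized() const { return initialized_; }

  // Reading fails fast (in builds with assertions enabled) if nothing
  // has been assigned yet.
  operator T() const
  {
    assert(initialized_ && "read of uninitialized checked_value");
    return value_;
  }

private:
  T value_;
  bool initialized_;
};
```

A checked_value<double> can then stand in for a plain double: assigning 2.5 and reading it back works as usual, while reading a default-constructed instance trips the assertion. Unlike a signaling NaN, this works for any type, at the cost of an extra flag per variable and checks that vanish when NDEBUG is defined.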

Finally, for anyone who’s persistent enough to have read this far, here’s a complete test program for playing with the various issues raised in this posting. Tested on Visual C++ 2008 and Debian 4.0’s g++.

#include <float.h>
#include <string.h>
#include <iostream>
#include <sstream>
#include <string>
#include <limits>
#include <iomanip>

#if defined(__unix)
#include <fenv.h>
#endif

#if defined(_YMATH)

// (Ab)use Dinkumware's implementation if found.

void set_snan(long double& f)
{
  memcpy(&f, &_LSnan._Long_double, sizeof(f));
}

void set_snan(double& f)
{
  memcpy(&f, &_Snan._Double, sizeof(f));
}

void set_snan(float& f)
{
  memcpy(&f, &_FSnan._Float, sizeof(f));
}

#else

// Add some type safety to our evil, non-portable bit-flipping.
#include <boost/static_assert.hpp>
BOOST_STATIC_ASSERT(
  sizeof(long double) == sizeof(long long) + sizeof(long)
  && sizeof(double) == sizeof(long long)
  && sizeof(float) == sizeof(long));

void set_snan(long double& f)
{
  *((long long*)&f) = 0x0000000000000001LL;
  *((long*)&f + 2) = 0x7fff;
}

void set_snan(double& f)
{
  *((long long*)&f) = 0x7ff0000000000001LL;
}

void set_snan(float& f)
{
  *((long*)&f) = 0x7f800001L;
}

#endif

// Return a string containing p's raw bits, in hex value.
// Assume little endian.
template<typename T>
std::string ascii_bits(const T& p)
{
  std::ostringstream o;
  o << "0x" << std::setfill('0');
  for (int i = sizeof(p) - 1; i >= 0; i--) {
    o << std::hex << std::setw(2)
      << int(reinterpret_cast<const unsigned char*>(&p)[i]);
  }
  return o.str();
}

using namespace std;

int main(int argc, char* argv[])
{
  typedef double TYPE_TO_TEST;
  TYPE_TO_TEST f, g;

  // Enable exceptions.  A real app would be more selective and may
  // need to save the previous mask.
#if !defined(__unix)
  _control87(0, _EM_INVALID);
#else
  feenableexcept(FE_ALL_EXCEPT);
#endif

  f = std::numeric_limits<TYPE_TO_TEST>::quiet_NaN();
  cout << "Has quiet NaN?  "
       << std::numeric_limits<TYPE_TO_TEST>::has_quiet_NaN << endl;
  cout << "Quiet NaN is printed like this: " << f << endl;
  cout << "Bit pattern for quiet NaN:      "
       << ascii_bits(f) << endl;

  set_snan(f);

  cout << "Has signaling NaN?  "
       << std::numeric_limits<TYPE_TO_TEST>::has_signaling_NaN << endl;
  cout << "Bit pattern for signaling NaN:  " << ascii_bits(f) << endl;

  g = f;

  cout << "Depending on your compiler, you may see this." << endl;

  g = f + 1;

  cout << "You should never see this." << endl;

  return 0;
}