Bugs in C++ — returned memory refs

One of the major causes of bugs in C++ is bad pointers and references to memory. This is one reason people like garbage collectors so much, along with avoiding memory leaks and simplifying code.

Compilers do try to watch for obvious memory mistakes. For example, all of the following functions generate warnings when compiled with VC++9. The warnings are not errors; they do not stop you from linking and running. You can live dangerously if you want.

// these compile with warnings
const int& unsafe_ref_a( ) { int a; return a; }
const int& unsafe_ref_b( ) { const int a = 1; return a; }
const int& unsafe_ref_c( ) { return 1; }
const int& unsafe_ref_e( ) { return (1 + 2); }

const int* unsafe_ptr_a( ) { int a; return &a; }

Oddly, the following ARE errors: they do not compile. This seems inconsistent with the functions above being only warnings, since they’re the same with pointers used instead of refs. But the rules for pointers and refs are different: a const ref can bind to a literal or a temporary, while & cannot take the address of one. The functions are obviously wrong anyway.

// these don't compile
const int* unsafe_ptr_b( ) { const int a = 1; return &a; }
const int* unsafe_ptr_c( ) { return &1; }
const int* unsafe_ptr_e( ) { return &(1 + 2); }

You can get around all the warnings and errors though, and write code with all the mistakes displayed above and with none of the compiler complaints. C++ tries hard to be type safe, but hardly tries at all to be memory safe.

// this code has all the same bugs as the code above, and the
// VC++9 compiler does not complain at all.

template< typename T >
    const T* get_ptr( const T& a) { return &a; }
template< typename T >
    const T& get_ref( const T& a) { return a; }

const int& unsafe_ref_a( ) { int a; return get_ref( a); }
const int& unsafe_ref_b( ) { const int a = 1;
                             return get_ref( a); }
const int& unsafe_ref_c( ) { return get_ref( 1); }
const int& unsafe_ref_e( ) { return get_ref( (1 + 2)); }

const int* unsafe_ptr_a( ) { int a; return get_ptr( a); }
const int* unsafe_ptr_b( ) { const int a = 1;
                             return get_ptr( a); }
const int* unsafe_ptr_c( ) { return get_ptr( 1); }
const int* unsafe_ptr_e( ) { return get_ptr( (1 + 2)); }

These errors are less obvious but just as bad as the earlier ones. I’ve seen mistakes like this many times when reviewing code.

The challenge is that a function that returns a ref or pointer type does not make assumptions about the object’s lifetime. Sometimes it is up to the caller to know the relationship between passed-in parameters and returned values.
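
To make the caller's side of that bargain concrete, here is a minimal sketch that reuses get_ref from the listing above. The two caller functions, copy_out and use_while_alive, are hypothetical names made up for illustration. Both are fine because the referenced object outlives every use of the returned reference.

```cpp
#include <cassert>

// Same helper as in the listing above. It returns a reference to
// whatever was passed in and promises nothing about lifetime.
template< typename T >
    const T& get_ref( const T& a) { return a; }

// OK: the return type is int, so the value is copied out
// before the local variable is destroyed.
int copy_out( ) { int a = 7; return get_ref( a); }

// OK: the reference is only used while a is still alive.
int use_while_alive( )
{
    int a = 7;
    const int& r = get_ref( a);
    assert( &r == &a); // r refers to a itself, not to a copy
    return r + 1;
}
```

The helper is identical in the safe and unsafe versions. Only the caller's return type and the scope of the argument decide whether the code is a bug.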

The problem is somewhat exacerbated by r-values (unnamed temporary variables), which can be passed as “const T&”. R-value reference (&&) types in C++0x may help the compiler a little in catching these kinds of problems, but these dangers are inherent in C++. You just have to watch out.
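
As a sketch of how r-value references might help, assuming a compiler with the finished C++11 feature set: a deleted overload taking const T&& makes get_ref refuse temporaries at compile time. The function read_through_ref is a hypothetical caller added for illustration.

```cpp
#include <cassert>

template< typename T >
    const T& get_ref( const T& a) { return a; }

// Assumption: C++11 or later. Temporaries prefer this overload,
// and because it is deleted the call fails to compile.
template< typename T >
    const T& get_ref( const T&& ) = delete;

int read_through_ref( )
{
    int a = 5;
    const int& r = get_ref( a); // named variable: still allowed
    assert( &r == &a);
    return r;
}

// These no longer compile:
//   const int& bad_c( ) { return get_ref( 1); }
//   const int& bad_e( ) { return get_ref( (1 + 2)); }
// This still compiles silently, and is still a bug:
//   const int& bad_a( ) { int a; return get_ref( a); }
```

std::ref and std::cref in the standard library use a deleted r-value overload in the same way. As the last comment shows, it only catches temporaries, not named locals that are about to go out of scope.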

Function types in C++

I’m going to talk about something very simple today: function types and pointers in C++. This is not an overview — it’s a close look at one small thing. To keep it short I’m not going to look at method pointers or classes like std::function<> and std::reference_closure<>. I’m also going to ignore lambdas and lambda-like function syntax for now.

I’ll need a few assertive header files to start.

# include <boost/static_assert.hpp>
# include <boost/type_traits/is_same.hpp>
# include <boost/mpl/assert.hpp>

I’ll also need some declarations.

int raw_fn( char) throw( );

// typedef int (type_raw_fn)(char)throw();
//   illegal - throw() not allowed in typedef

typedef int (  type_raw_fn)( char);
typedef int (* type_ptr_fn)( char);
typedef int (& type_ref_fn)( char);

The function raw_fn(..) is of type int(char). It is not type int(*)(char) or int(&)(char). The throw() is not part of the type. The following sizeof(..) expressions for pointers all equal sizeof( void* ).
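
The sizeof listing is not reproduced here, so the following is only a sketch of what such expressions can look like. The function names fn_ptr_size and matches_void_ptr are hypothetical. The comparison with sizeof( void* ) holds on mainstream desktop platforms, but the standard does not promise it, so it is returned as a value rather than asserted.

```cpp
#include <cassert>
#include <cstddef>

typedef int (  type_raw_fn)( char);
typedef int (* type_ptr_fn)( char);

std::size_t fn_ptr_size( )
{
    // Three spellings of one pointer type, so the sizes must agree.
    assert( sizeof( type_ptr_fn ) == sizeof( type_raw_fn* ));
    assert( sizeof( type_ptr_fn ) == sizeof( int(*)(char) ));
    return sizeof( type_ptr_fn );
}

// Assumption: true on common platforms, not guaranteed by the standard.
bool matches_void_ptr( ) { return fn_ptr_size( ) == sizeof( void* ); }
```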

The type int(char) has an unknown size. The following sizeof() expressions do not compile.

Note that the following DO compile. They yield sizeof( void* ) even though they are not pointer types. Their values do not depend on how long or short the function body is. The compiler changes these to pointer types when they appear in a parameter list.
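
One case that fits this description is a parameter declared with the raw function type. This is a guess at the kind of expression meant, not the original listing. The sketch assumes C++11 for static_assert and decltype, and sample_fn and size_of_raw_param are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

typedef int (  type_raw_fn)( char);
typedef int (* type_ptr_fn)( char);

int sample_fn( char c) { return c; } // hypothetical function to pass in

// Declared with the raw function type, but f is really a pointer.
std::size_t size_of_raw_param( type_raw_fn f)
{
    static_assert( std::is_same< decltype( f), type_ptr_fn >::value,
                   "parameter was adjusted to a pointer");
    assert( f != nullptr);
    return sizeof( f); // size of a pointer, not of the function body
}

// By contrast, sizeof( type_raw_fn ) applied to the type itself is
// ill-formed in standard C++ (some compilers accept it as an extension).
```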

The following confirms that the raw, ref, and ptr function types are distinct. It also shows that type_raw_fn does not always automatically convert to a pointer type.

BOOST_MPL_ASSERT(
    (boost::is_same< type_raw_fn , int(char)   >));
BOOST_MPL_ASSERT(
    (boost::is_same< type_ptr_fn , int(*)(char)>));
BOOST_MPL_ASSERT(
    (boost::is_same< type_ref_fn , int(&)(char)>));

BOOST_MPL_ASSERT(
    (boost::is_same< type_raw_fn*, int(*)(char)>));
BOOST_MPL_ASSERT(
    (boost::is_same< type_raw_fn&, int(&)(char)>));

// Raw type is not converted to pointer here.
BOOST_MPL_ASSERT_NOT(
    (boost::is_same< type_raw_fn , type_ptr_fn >));
BOOST_MPL_ASSERT_NOT(
    (boost::is_same< int(char)   , int(*)(char)>));

// Raw type is distinct from ref type.
BOOST_MPL_ASSERT_NOT(
    (boost::is_same< type_raw_fn , type_ref_fn >));
// Another way to assert the same thing.
BOOST_STATIC_ASSERT(
    ! (boost::is_same< type_raw_fn , type_ref_fn >::value));

// The following assert fails. Adding () makes the type
// declaration illegal.
//   BOOST_MPL_ASSERT(
//       (boost::is_same< type_raw_fn , int()(char)  >));

The following test shows the compiler treats type_raw_fn and type_ptr_fn the same when they are function params. The two types are distinct, but in this case the compiler converts type_raw_fn to type_ptr_fn.

// The raw and ref types can be overloaded.
int raw_ref_distinct( type_raw_fn) { return 1; }
int raw_ref_distinct( type_ref_fn) { return 2; }

// The ptr and ref types can be overloaded.
int ptr_ref_distinct( type_ptr_fn) { return 1; }
int ptr_ref_distinct( type_ref_fn) { return 2; }

// The raw and ptr types cannot be overloaded!
// You can compile either one of these separately,
// but not both together because the compiler
// converts int(char) into int(*)(char).
/*
int raw_ptr_distinct( type_raw_fn) { return 1; }
int raw_ptr_distinct( type_ptr_fn) { return 2; }
*/
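
The reason raw and ptr cannot be overloaded can be checked directly: a function taking type_raw_fn and a function taking type_ptr_fn have the same function type. A minimal sketch, assuming C++11 for static_assert and std::is_same. The names sample_fn and call_it are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

typedef int (  type_raw_fn)( char);
typedef int (* type_ptr_fn)( char);
typedef int (& type_ref_fn)( char);

// Same function type: the raw parameter is adjusted to a pointer.
static_assert( std::is_same< int( type_raw_fn),
                             int( type_ptr_fn) >::value, "raw == ptr as param");
// Different function type: a ref parameter stays a ref.
static_assert( ! std::is_same< int( type_raw_fn),
                               int( type_ref_fn) >::value, "raw != ref as param");

int sample_fn( char c) { return c + 1; } // hypothetical function to pass in

// Declared with the raw type, defined with the pointer type.
// This is one function, not two overloads.
int call_it( type_raw_fn f);
int call_it( type_ptr_fn f) { assert( f != nullptr); return f( 'a'); }
```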

Also note that pointer-to-pointer-to-pointer function types can be declared this way. Remember that raw types are converted to pointers, but pointers are not converted or dereferenced automatically.

// The compiler does not collapse these types.
int ptr_distinct( int(*)(char)) { return 1; }
int ptr_distinct( int(**)(char)) { return 2; }
int ptr_distinct( int(***)(char)) { return 3; }
int ptr_distinct( int(***&)(char)) { return 4; }

// These do not compile. Pointer to reference is illegal.
//   int illegal_param( int(&*)(char)  ) ;
//   int illegal_param( int(*&*)(char)  ) ;

There are a lot of ways to declare function parameter variables. It’s good to get familiar with them so you can tell when you are overloading.

// Eight different ways to declare param_ptr_fn(..)
// with exactly the same parameter type.
// These all declare one function; they do not
// overload.
// The last four are declared with a non-pointer
// type which is converted to a pointer type.
int param_ptr_fn( int(*)(char)  ) ;
int param_ptr_fn( int(*f)(char) ) ;
int param_ptr_fn( type_ptr_fn  f) ;
int param_ptr_fn( type_raw_fn* f) ;
int param_ptr_fn( int(char)     ) ;
int param_ptr_fn( int(f)(char)  ) ;
int param_ptr_fn( int f(char)   ) ;
int param_ptr_fn( type_raw_fn  f) ;

// Four different ways to declare param_ref_fn().
// The parameter lists are identical.
int param_ref_fn( int(&)(char)  ) ;
int param_ref_fn( int(&f)(char) ) ;
int param_ref_fn( type_ref_fn  f) ;
int param_ref_fn( type_raw_fn& f) ;

// These type declarations are illegal:
//   int illegal_raw_param( int(char)  f);
//   int illegal_ptr_param( int(char)* f);
//   int illegal_ref_param( int(char)& f);
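
One way to convince yourself that these are redeclarations and not overloads is to take the address of the function: that only compiles when the name is not overloaded. A minimal sketch with four of the eight spellings, assuming C++11 for decltype and static_assert. The function to_int is a hypothetical sample.

```cpp
#include <cassert>
#include <type_traits>

typedef int (  type_raw_fn)( char);
typedef int (* type_ptr_fn)( char);

// Four spellings, one function.
int param_ptr_fn( int(*)(char)  ) ;
int param_ptr_fn( type_raw_fn  f) ;
int param_ptr_fn( int f(char)   ) ;
int param_ptr_fn( type_ptr_fn  f) { assert( f != nullptr); return f( 'x'); }

// &param_ptr_fn would be ambiguous if the lines above were overloads.
static_assert( std::is_same< decltype( &param_ptr_fn),
                             int(*)( int(*)(char)) >::value,
               "one function taking a function pointer");

int to_int( char c) { return c; } // hypothetical function to pass in
```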

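Finally, a bunch of examples. These mainly show that the compiler is pretty free about converting between raw functions and function pointers. Be aware that the C++ standard adjusts a parameter declared as type_raw_fn to a pointer, so a strictly conforming compiler rejects the lines that treat &f as a function pointer, such as (&f)( 'x').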

void test( type_raw_fn f)
{
    // These are equivalent. There are several
    // ways you can call a function pointer.
    f( 'x');
    (f)( 'x');
    (&f)( 'x');
    (*f)( 'x');
    (***f)( 'x'); /* yikes */
    (&*&*&*&f)( 'x'); /* yikes!! */
    // (**&&**f)( 'x'); - illegal

    // These are also equivalent. You can use this
    // syntax on a normal function as well as on a
    // function variable. Notice f, &f, and *f are
    // all converted to the same function pointer.
    param_ptr_fn( f);
    (param_ptr_fn)( f);
    (&param_ptr_fn)( &f);
    (*param_ptr_fn)( ****f);
    (***param_ptr_fn)( *f);
    (&*&*&*&param_ptr_fn)( *f);

    // You can bind a type_raw_fn variable as a function
    // parameter, but you cannot declare a type_raw_fn as
    // a block variable.
    // All of these choke the compiler:
    //    type_raw_fn f2( f);
    //    type_raw_fn f2 = type_raw_fn( f);
    //    type_raw_fn f2 = f;
    //    type_raw_fn f2 = &f;
    //    type_raw_fn f2 = *f;

    // Function pointer variables are OK.
    // They auto-convert from raw to pointer.
    type_raw_fn* p1 =  f; /* auto convert */
    type_raw_fn* p2 = &f;
    type_ptr_fn  p3 =  f; /* auto convert */
    type_ptr_fn  p4 = &f;
    type_ptr_fn  p5 = ***&***f;
    type_ptr_fn  p6 = ***&***&f;

    // Function ref vars are also OK.
    // But there are no automatic conversions
    // between pointer and ref.
    type_raw_fn& r1 = f;
    type_ref_fn  r2 = f;
    type_ref_fn  r3 = **&***f;
    type_ref_fn  r4 = **&**&*&f;

    // We can treat a function pointer variable
    // like a non-pointer, except that taking its
    // address gives a pointer to a pointer, which
    // cannot be called. Just like it is not legal
    // to have two ampersands next to each other
    // in the above examples, you cannot have
    // (&p1)('x'). But you can have (&*p1)('x').
    p1( 'x');
    (p1)( 'x');
    (*p1)( 'x');
    (***p1)( 'x');
    (&*&*&*p1)( 'x');
    // (&p1)( 'x'); - illegal, pointer to pointer
    // (&*&*&*&p1)( 'x'); - illegal

    // You can call a function in a ref var
    // with the same syntax as a raw function.
    r1( 'x');
    (r1)( 'x');
    (&r1)( 'x');
    (*r1)( 'x');
    (***r1)( 'x');
    (&*&*&*&r1)( 'x');

    // You can test function pointers for
    // equality. Raw variables are converted
    // to pointers.
    bool t1 = (&f == &f);
    bool t2 = ( f ==  f);
    bool t3 = ( f ==  p1);
    // A de-ref'ed pointer converts back to
    // a pointer.
    bool t4 = ( f == *p1);
    // A ref converts to a pointer.
    bool t5 = ( f ==  r1);
    // Everything converts to a pointer.
    bool t6 = ( f == *r1);
    bool t7 = ( f == **r1);
    bool t8 = ( **f == **r1);
    bool t9 = ( *&*&*&f == ***p1);
}
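
For a compact version that a strictly conforming compiler also accepts, here is a sketch that takes the function by reference. That is an assumption that differs from the listing above: with a type_raw_fn& parameter, f stays a function and &f is a plain function pointer. The names twice and call_every_way are hypothetical.

```cpp
#include <cassert>

typedef int (  type_raw_fn)( char);
typedef int (* type_ptr_fn)( char);

int twice( char c) { return 2 * c; } // hypothetical sample function

// f is a reference to a function, so it behaves like the
// function name itself.
int call_every_way( type_raw_fn& f)
{
    type_ptr_fn p = f;       // function-to-pointer conversion
    assert( p == &f);        // explicit address gives the same pointer
    assert( p == *f);        // *f is the function again; it converts back
    assert( p == ***f);
    assert( p == &*&*&*&f);

    // Four call syntaxes, one function.
    return f( 'a') + (&f)( 'a') + (*f)( 'a') + (***p)( 'a');
}
```

Calling call_every_way( twice) runs the same function four times and sums the results, so it returns four times twice( 'a').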

Arrrg, what a mess. That’s all for now, and I’m sure it’s enough.

I wish I could conclude with something theoretical, but this paper is about shining a light in a dark C++ corner. In your own code you probably treat function variables in a uniform way, maybe always as pointers. But when you review other people’s code you run into other habits, so it’s good to occasionally look and see how far the syntax goes.
