Meta functions and parallel programming: map, filter, and reduce

(This is a continuation of my previous post, Why is parallel programming difficult?.)

When programming, we’re always dealing with collections — objects that hold other objects. You cannot go far without them — they’re as fundamental as bools and forks (if statements).

In Erlang and Lisp we often use lists as the default generic collection. Lists are elegant structures, defined recursively from pairs. They are easy to build and take apart.

But lists impose an ordering on their contents even when what you really want is an unordered collection. And since you’ve got a list, it’s easy to start using it in an ordered way and to write code that depends on that order, even when it’s not necessary.

When you do that you add an unnecessary constraint to your code, and you can inadvertently lose parallelism. Ordering doesn’t have to break parallelism, but it often does.

And of course it doesn’t matter much yet, since we mostly write linear programs. We don’t program in parallel. But we might start in the near future, so it doesn’t hurt to raise your awareness now.

One way to stay order-free is to use the meta-functions map, filter, and reduce when working with lists (or any other collection). These meta-functions are order-free if you use them like you’re supposed to.

map: Apply a function to each element in a collection. Collect the return values into a collection of the same type as the source.

filter: Apply a predicate function to each element in a collection. Return a subset of the original collection containing only the elements for which the predicate returned true.

reduce: Apply a function to a small group of elements from the collection. Keep doing this to the original elements and to the return values until there is only one value left. Return that reduced value.
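
In Erlang the first two come straight from the standard lists module:

% map and filter over a list, order preserved:
lists:map( fun( X ) -> 2 * X end, [1, 2, 3] ).                % -> [2,4,6]
lists:filter( fun( X ) -> X rem 2 =:= 0 end, [1, 2, 3, 4] ).  % -> [2,4]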

Binary reduction applies an associative function to pairs in the collection, and then applies the function to the results of the pairs, etc, until it finally boils down to a single value. So if a collection has 11 values, the first reduction invokes the reduction function 5 times (in parallel), which reduces the collection to 6 values. The next reduction leaves 3 values, then 2, and finally 1 which is the return value.
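
Here is a minimal sequential sketch of that pairwise scheme in Erlang (reduce_pairs/2 and pair_up/2 are names I made up; F must be a binary associative function):

% Pairwise (binary) reduction, written sequentially. A parallel
% version could evaluate the F( X, Y ) calls within each pass
% concurrently, since they are independent of each other.
reduce_pairs( _, [] ) -> erlang:error( empty_collection );  % see below
reduce_pairs( _, [X] ) -> X;
reduce_pairs( F, List ) -> reduce_pairs( F, pair_up( F, List )).

% One pass: combine adjacent pairs, carry an odd leftover along.
pair_up( F, [X, Y | Rest] ) -> [F( X, Y ) | pair_up( F, Rest )];
pair_up( _, Leftover ) -> Leftover.

With plus as the function, reduce_pairs( fun erlang:'+'/2, [1, 2, 3, 4, 5] ) reduces [1,2,3,4,5] to [3,7,5], then [10,5], then [15], and returns 15.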

When there is only one element in the original collection, reduce usually returns that value without applying the function. Or it could apply the binary function to that one element and a default value. How reduce handles odd-lot remainders with higher-arity functions also has to be specified.

When the original collection has no elements reduce might return a special (default) value. Or it might throw an error.

If the source collection is unordered, the reducing function should be commutative as well as associative.

These are my personal definitions for these functions, but I’ve seen some variation. Reduce in particular seems to have different meanings in different situations.

For me, reduce is not the same as fold. Fold (or accumulate) sweeps through a collection, repeatedly applying a function to the next element and an accumulated value. If the collection is ordered, a left-fold sweeps through the elements from first to last, and a right-fold sweeps from last to first. Fold keeps the accumulating value separate from the element values, while reduce doesn’t have an accumulating value (although it may have a default value).
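
Erlang’s standard library takes the fold view:

% foldl computes F( 3, F( 2, F( 1, 0 ))), sweeping left to right;
% foldr computes F( 1, F( 2, F( 3, 0 ))), sweeping right to left.
lists:foldl( fun( X, Acc ) -> X + Acc end, 0, [1, 2, 3] ).  % -> 6
lists:foldr( fun( X, Acc ) -> X + Acc end, 0, [1, 2, 3] ).  % -> 6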

Distinguishing between reduce and fold is not universal. For example, the reduce functions in Python, Common Lisp, and JavaScript are really fold functions. At least to me they are.

In my definition, map and filter should be able to apply their functions in parallel or in any order without changing the outcome. So you should not use map to print out the items of a list in order, for example.

Because the predicate functions of filter are usually small and fast, it is rare to see a multi-threaded stand-alone filter. But filter is sometimes combined with map and/or reduce into something like filter-map. For example, Qt provides a QtConcurrent::filteredReduced function.

If you map or filter a list, the returned list will reflect the order in the original list. This is how it should be — the result collection should keep the structure of the original collection as much as possible. But if you’re using a list to represent an unordered collection, or if you’re going to reduce mapped results in an order-independent way, then you can relax this constraint.

Maybe I’ll talk about that next time.

Transformative pattern: re-factoring a function to make it tail recursive

A few days ago I talked a little about tail recursion, and I mentioned a pattern I called iterator/builder to transform some simple recursive functions into tail-recursive functions. The transformation looks like this (in Erlang).

% not tail recursive
factorial_a( 0 ) -> 1;
factorial_a( N ) -> N * factorial_a( N - 1 ).

% tail recursive
factorial_b( N ) -> fact_tail( N, 1 ).
fact_tail( 0, Product ) -> Product;
fact_tail( N, Product ) -> fact_tail( N - 1, N * Product ).

In this example of the iterator/builder pattern, N is the iterator and Product is the builder. Here are some more examples.

% count items in a list
% not tail recursive
count_a( [     ] ) -> 0;
count_a( [_ | R] ) -> 1 + count_a( R).

% tail recursive
count_b( List           ) -> count_t( List, 0).
count_t( [     ], Total ) -> Total;
count_t( [_ | R], Total ) -> count_t( R, 1 + Total ).

% triangle number (0 + 1 + 2 + 3 ...)
% not tail recursive
triangle_a( 0 ) -> 0;
triangle_a( N ) -> N + triangle_a( N - 1 ).

% tail recursive
triangle_b( N      ) -> triangle_t( N, 0 ).
triangle_t( 0, Sum ) -> Sum;
triangle_t( N, Sum ) -> triangle_t( N - 1, N + Sum ).

Transforming these functions to be tail-recursive is simple. The pattern looks like this:

% Iterator/Builder pattern to transform a recursive
% function into a tail-recursive function.
% This is a common pattern. It is probably described
% elsewhere with a different name.
%
% Pattern variables:
%   null_pattern - matches null iter
%   null_value   - initial value
%   iter_pattern - matches non-null iter
%   iter_first   - first element value
%   iter_rest    - iter with first element removed
%   fn_name( iter )
%   pre_transform( iter ) - optional preparation of the
%                           iter (often just the iter itself)
%   combine_expression( iter_first, iter )

% Before (not tail recursive):
fn_name( null_pattern ) ->
  null_value;
fn_name( iter_pattern ) ->
  combine_expression( iter_first, fn_name( iter_rest )).

% After (tail recursive):
fn_name( X ) ->
  fn_name_tail( pre_transform( X ), null_value ).
fn_name_tail( null_pattern, Collect ) ->
  Collect;
fn_name_tail( iter_pattern, Collect ) ->
  fn_name_tail( iter_rest,
    combine_expression( iter_first, Collect )).

This is not perfect however. Consider the triangle function above. triangle_a(3) calculates (3+(2+(1+0))) while triangle_b(3) calculates (1+(2+(3+0))). If you change N+Sum to Sum+N in triangle_b you end up calculating (((0+3)+2)+1) instead. This is the left-fold vs right-fold problem that you sometimes run into when you flatten recursion, and also shows how the initial value for the fold operation (zero in this case) can move around. For an operation like + (plus) it doesn’t matter since + is associative and commutative, but it matters in other cases. Consider the stutter/1 function.

% stutter( [a,b,c] ) -> [a,a,b,b,c,c]

% not tail recursive:
stutter_a( [     ] ) -> [];
stutter_a( [H | R] ) -> glom( H, stutter_a( R )).

% tail recursive
stutter_b( A ) -> lists:reverse( stutter_t( A, [] )).
stutter_t( []     , C ) -> C;
stutter_t( [H | R], C ) -> stutter_t( R, glom( H, C )).

% another tail recursive version
stutter_c( A ) -> stutter_t( lists:reverse( A ), [] ).

% combine expression for stutter
glom( H, R ) -> [H, H | R].

stutter_a([a,b,c]) calculates glom(a,glom(b,glom(c,[]))) while stutter_b([a,b,c]) calculates glom(c,glom(b,glom(a,[]))) and then reverses the result. stutter_c([a,b,c]) starts by reversing the iterator and so calculates glom(a,glom(b,glom(c,[]))) just like stutter_a.

So the simple transformative pattern described above sometimes has to be modified if the combine expression is not associative and commutative. Sometimes you can fix the result, sometimes you can reverse the initial iterator, sometimes you need an extra end-of-iterator parameter when you reverse the iterator, and sometimes you have to change the combine expression.
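
For stutter, that last option might look like this (stutter_d is my own sketch): append each new pair at the tail instead of prepending, which removes the reverse but makes every step pay for copying the collected list.

% yet another tail-recursive version: changed combine expression
% (++ copies its left operand, so this is O(n^2) overall)
stutter_d( A ) -> stutter_u( A, [] ).
stutter_u( [     ], C ) -> C;
stutter_u( [H | R], C ) -> stutter_u( R, C ++ [H, H] ).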

And of course sometimes you just want to leave the function alone and forget about tail recursion. This kind of code transformation or re-factoring requires some understanding of the problem before being applied.

Erlang as C++ code generator

In my last post I talked about how templates could be extended to manipulate and transform bits of the C++ syntax tree. Parsed C++ syntax objects, like block statements and parameter lists, are just the kind of thing you need when generating code.

But there are other ways to approach the problem. You can generate code as text, spitting out source code files to be compiled. Or you can generate code with the preprocessor.

With the preprocessor you don’t deal with syntax trees, of course, since the preprocessor doesn’t know how to parse C++. It only knows how to tokenize the text stream, and so it deals with tokens. Tokens are syntactic primitives, not as abstract as statements or expressions, but you can do a lot with just tokens. Sometimes you want to work at that level anyway, like when you want to generate a piece of code with unmatched curly brackets, maybe to open a namespace.

Of course the preprocessor is weak without recursion, looping, or data structures. But you can imagine a sophisticated language, embedded in C++ source like the preprocessor, that inserted generated C++ token streams. You’d want a language with recursion, data structures, and data types that could be manipulated and transformed.

The Erlang language gives us a hint of what this might look like. (We could also look to Prolog as a model, although it may be overkill, at least for what I have in mind.)

Erlang appears well suited to code generation. It’s declarative and functional like the C++ template language. Functional means a function’s return value depends only on its arguments, so the same parameters always produce the same result. In other words, my_struct<43,char*> always evaluates to the same type, no matter where it appears.

Erlang’s pattern matching and native list structure are good for simple parsing and list processing, which are also useful for code generation. Erlang-style pattern matching and guards are particularly useful if you’re working without strong typing.
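
For example, a toy token classifier (my own sketch, not from any real preprocessor) can dispatch on the shape of untyped data with a guard per clause:

% guards sort raw, untyped tokens by their shape
token_kind( ',' )                    -> separator;
token_kind( T ) when is_integer( T ) -> number;
token_kind( T ) when is_list( T )    -> string_literal;
token_kind( T ) when is_atom( T )    -> identifier.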

Here’s what an Erlang-like preprocessing language might look like embedded in C++. Here we define a simple function, chop_off_head/1, which returns a list sans the first element. (The /1 part of the name means it takes one argument.)

// Warning - this is all fantasy/speculative C++
// It will NOT compile.

// How would we embed a language like Erlang in
// C++ source code?

// 1. Surround the meta code in a special wrapper,
// like you do with asm { ... }
erlang {
  -module( cpp).
  -export( [chop_off_head_a/1]).
  chop_off_head_a( [] ) -> [];
  chop_off_head_a( [_A | Rest] ) -> Rest.
}

// 2. Make it clear that we're defining parts of a module
// in the special wrapper, and we're not in "shell" mode.
erlang:module( cpp ) {
  -export( [chop_off_head_b/1]).
  chop_off_head_b( [] ) -> [];
  chop_off_head_b( [_A | Rest] ) -> Rest.
}

// 3. Keep the Erlang code in a separate file. This way
// we can also load/test the Erlang code in a native
// environment. Allow a "-module(cpp)." line in the
// included file.
erlang:module( cpp ) {
# include "cpp.erl"
}

We have to distinguish module definitions from commands. Modules only define Erlang functions and do not generate any C++ code until the functions are called.

How do we feed C++ tokens to the Erlang functions, and how are the return tokens injected back into the compiled program? I’m thinking the code might look something like this.

// Warning - this is all fantasy/speculative C++
// It will NOT compile.

// What would a call to an embedded Erlang function
// look like?

  template< typename ... TYPEs >
  struct
mystruct_a
  {
    // We could set an erlang variable to build a list
    // and then pass that to the function. The last function
    // returns a list of cpp tokens to be taken up by the
    // compiler.
    erlang {
      X =
      -cpp_tokens_start
        TYPEs ...
      -cpp_tokens_end.
      cpp:generate_member_vars_a( X ).
    }
  };

  template< typename ... TYPEs >
  struct
mystruct_b
  {
    // Or we could unpack TYPEs into a comma-separated list.
    erlang:cpp:generate_member_vars_b( [ TYPEs ... ] ).
  };

In these examples I assume TYPEs… expands to a comma-separated list of types which will have to have some standard representation for the embedded Erlang, maybe as atoms or Erlang tuples.

In the mystruct_a example, X is set to a list of all the tokens of the expanded TYPEs… including the comma separators. Since a type can consist of many tokens (think int const * const &) the Erlang function will need the comma tokens. Unless we get the compiler to parse the tokens for us and set X to an Erlang list of type objects.

I use -cpp_tokens_start and -cpp_tokens_end (which is NOT standard Erlang) to escape out of Erlang and tell the compiler to tokenize everything between as C++, and to bundle all those tokens up into a list and assign them to X. If we use delimiters like { and } (curly brackets) we’ll have a harder time specifying unmatched } in our escaped C++ token stream.

Maybe the C++ parser could compose the C++ tokens between -cpp_tokens_start and -cpp_tokens_end into more abstract objects, like type objects.

The last function in the wrapper, cpp:generate_member_vars_a(X), presumably returns a list of C++ tokens that the compiler takes and treats as C++ code.

In mystruct_b I was thinking [TYPEs…] would expand into something like [int,float,char] which is an Erlang list. In this case the commas would not be tokenized since they are part of the Erlang syntax.

Finally, here’s what the Erlang code used above might look like.

-module( cpp).

% File contains functions to parse, manipulate,
% and generate C++ token streams.


% == exports ==========================

-export( [make_type/1]).
-export( [make_variable/2]).
-export( [make_numbered_name/2]).
-export( [parse_tokens_type_list/1]).
-export( [transform_types_to_variables/3]).
-export( [generate_tokens_for_variable_decl/1]).
-export( [generate_tokens_for_variable_decls/1]).
-export( [generate_member_vars/1]).


% == record declarations ====================

-record( type, {name}).
-record( variable, {name, type}).


% == make_type/1 =======================

make_type( Name ) ->
  #type{ name=Name }.


% == make_variable/2 =======================

make_variable( Name, Type = #type{} ) ->
  #variable{ name=Name, type=Type }.


% == make_numbered_name/2 =======================
%
% Generates an atom from a prefix and a number:
%   make_numbered_name( "my_name_prefix_", 3245) ->
%   my_name_prefix_3245

make_numbered_name( Prefix, Counter ) ->
  list_to_atom(
    string:concat(
      Prefix,
      integer_to_list( Counter ))).


% == parse_tokens_type_list/1 =======================
%
% This only parses simple type lists, where each type is
% a single token. Works with type lists like this:
%   type1 , type2 , type3
% Does not work for a type consisting of more than one
% token, like this:
%   type1 * , type2 const & , int ( * ) ( int , char )

parse_tokens_type_list( [] ) ->
  [];
parse_tokens_type_list( [Arg | Rest] ) ->
  [ make_type( Arg)
  | lists:reverse(
      parse_tokens_type_list_tail( Rest, [] ))].

parse_tokens_type_list_tail( [], Collect ) ->
  Collect;
parse_tokens_type_list_tail( [',', Arg | Rest], Collect ) ->
  parse_tokens_type_list_tail(
    Rest,
    [ make_type( Arg) | Collect ]).


% == transform_types_to_variables/3 ======================
%
% From a list of #type{} records, generates a list of
% #variable{} records with generated (numbered) names.
% The following:
%   transform_types_to_variables(
%     [make_type( int), make_type( char)], "var_", 34)
% yields two variable records which later expand into
% 6 tokens like this:
%   int var_34 ; char var_35 ;
%
% Can be used to generate class-member variables.

transform_types_to_variables( Types, Prefix, Counter ) ->
  lists:reverse(
    transform_types_to_variables_tail(
      Types, Prefix, Counter, [] )).

transform_types_to_variables_tail( [], _, _, Collect ) ->
  Collect;
transform_types_to_variables_tail(
    [Type = #type{} | Rest], Prefix, Counter, Collect ) ->
  transform_types_to_variables_tail(
    Rest,
    Prefix,
    Counter + 1,
    [ make_variable(
	    make_numbered_name( Prefix, Counter ),
        Type)
    | Collect
    ]).


% == generate_tokens_for_variable_decl/1 ======================

generate_tokens_for_variable_decl( Var = #variable{} ) ->
  [ (Var#variable.type)#type.name,
    Var#variable.name,
    ';'
  ].


% == generate_tokens_for_variable_decls/1 ======================

generate_tokens_for_variable_decls( Vars ) ->
  lists:append(
    lists:reverse(
      generate_tokens_for_variable_decls_tail( Vars, [] ))).

generate_tokens_for_variable_decls_tail( [], Collect ) ->
  Collect ;
generate_tokens_for_variable_decls_tail(
    [Var = #variable{} | Rest], Collect ) ->
  generate_tokens_for_variable_decls_tail(
    Rest,
    [generate_tokens_for_variable_decl( Var ) | Collect]).


% == generate_member_vars/1 ======================
%
% Given a stream of tokens from a type list, generates
% a bunch of variable declarations.
%
% The following:
%   generate_member_vars( ['my_type_x', ',', 'my_type_y'])
% will generate the following C++ tokens:
%   my_type_x member_var_0 ;
%   my_type_y member_var_1 ;
%
% Another name for this could be:
%   transform_tokens_type_list_to_variable_decls

generate_member_vars( Tokens ) ->
  generate_tokens_for_variable_decls(
    transform_types_to_variables(
      parse_tokens_type_list( Tokens), "member_var_", 0)).
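
The module is plain Erlang, so it can be exercised in an ordinary shell before any C++ compiler gets involved. Feeding it the example input from the comments should give:

1> cpp:generate_member_vars( ['my_type_x', ',', 'my_type_y'] ).
[my_type_x,member_var_0,';',my_type_y,member_var_1,';']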

As you can see, expressing ideas in Erlang is a lot easier than expressing them as C++ template meta-functions, which is the reason I started thinking about using it for code generation. Erlang is a clean and easy-to-understand language that could work well at the token level or higher. But you can also see how the Erlang and C++ syntax clash, particularly when feeding C++ tokens to Erlang functions, although my proposal above could probably be improved.

I firmly believe that code-generating languages and development environments will become much more important in the future, and the C++ language is clearly moving in that direction. One tool that could help is a re-vamped C++ preprocessor, maybe based on a language like Erlang or Prolog. Or maybe it could even look like JavaScript, which is imperative, and would be good at treating the code as a character stream. Or maybe C++ templates really are the best solution, especially if they could be extended to work with smaller fragments of parse tree.

It’s something to think about.
