In the world of computer science, algorithms play a crucial role in tackling complex problems. Algorithms are are at the core of computing. One intriguing approach is Genetic Algorithms (GAs), which draw inspiration from natural selection and genetics. These algorithms excel at solving optimization and search problems by mimicking evolutionary processes. When it comes to putting these ideas into code, C++ stands out as a powerful and versatile language.
The subset of the standard C++ library known as the Standard Template Library (STL) was originally designed around generic algorithms code that processes sequences of any type of values in a type-safe manner. The goal was to use predefined algorithms for almost every task, instead of hand-coding loops every time you need to process a collection of data.
Table of Contents
- Stream Iterators
- Algorithm Complexity
- Function Objects
- Adaptable Function Objects
- Function Pointer Adapters
- STL Algorithms Quick Reference
Stream Iterators
Like any good software library, the Standard C++ Library aims to make common tasks easier. In earlier articles, we mentioned that instead of using loops, you can employ generic algorithms. However, our examples have been using explicit loops to display output so far. Since displaying output is a frequent task, it would be great if there was an automated way to handle it.
That’s where stream iterators come in. A stream iterator allows you to use a stream as either an input or an output sequence. To eliminate the output loop in the C++ program, for instance, you can do something like the following.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>
using namespace std;
bool gt15(int x) {
return 15 < x;
}
int main() {
int a[] = {10, 20, 30};
const size_t SIZE = sizeof a / sizeof a[0];
remove_copy_if(a, a + SIZE, ostream_iterator<int>(cout, "\n"), gt15);
}
In this example we have replaced the output sequence b in the third argument to remove_copy_if( ) with an output stream iterator, which is an instance of the ostream_iterator class template declared in the <iterator> header. Output stream iterators overload their copy-assignment operators to write to their stream. This particular instance of ostream_iterator is attached to the output stream cout. Every time remove_copy_if( ) assigns an integer from the sequence a to cout through this iterator, the iterator writes the integer to cout and also automatically writes an instance of the separator string found in its second argument, which in this case contains just the newline character.
It is just as easy to write to a file instead of to cout, of course. All you have to do is provide an output file stream instead of cout:
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iterator>
using namespace std;
bool gt15(int x) {
return 15 < x;
}
int main() {
int a[] = {10, 20, 30};
const size_t SIZE = sizeof a / sizeof a[0];
ofstream outf("integers.out");
remove_copy_if(a, a + SIZE, ostream_iterator<int>(outf, "\n"), gt15);
}
An input stream iterator allows an algorithm to get its input sequence from an input stream. This is accomplished by having both the constructor and operator++( ) read the next element from the underlying stream and by overloading operator*( ) to yield the value previously read. Since algorithms require two pointers to delimit an input sequence, you can construct an istream_iterator in two ways, as you can see in the program that follows.
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include "../require.h"
using namespace std;
bool gt15(int x) {
return 15 < x;
}
int main() {
ifstream inf("integers.dat");
assure(inf, "integers.dat");
remove_copy_if(istream_iterator<int>(inf), istream_iterator<int>(), ostream_iterator<int>(cout, "\n"), gt15);
}
The first argument to replace_copy_if( ) in this program attaches an istream_iterator object to the input file stream containing integers. The second argument uses the default constructor of the istream_iterator class. This call constructs a special value of istream_iterator that indicates end-of-file, so that when the first iterator finally encounters the end of the physical file, it compares equal to the value istream_iterator<int>( ), allowing the algorithm to terminate correctly. Note that this example avoids using an explicit array altogether.
Algorithm Complexity
Using a software library is a matter of trust. You’re counting on the people who created it to not only make it work correctly but also to make it work as fast as possible. Sometimes, it’s better to create your own loops instead of using pre-made algorithms that might slow things down.
To ensure that libraries meet high standards, the C++ standard not only says what an algorithm should do but also how fast it should do it and how much space it should use. If an algorithm doesn’t meet these standards, it doesn’t meet the rules set by the standard. We measure how well an algorithm works in terms of its efficiency, and this is called its “complexity.”
The standard tries to be specific about how many operations an algorithm should use whenever possible. For example, the count_if()
algorithm tells you how many elements in a sequence meet a certain condition. So, if you use count_if()
on a list of numbers, like the ones we talked about earlier, it will tell you how many of them are greater than 15:
size_t n = count_if(a, a + SIZE, gt15);
Since count_if( ) must look at every element exactly once, it is specified to make a number of comparisons exactly equal to the number of elements in the sequence. Naturally, the copy( ) algorithm has the same specification.
Other algorithms can be specified to take at most a certain number of operations. The find( ) algorithm searches through a sequence in order until it encounters an element equal to its third argument:
int* p = find(a, a + SIZE, 20);
It stops as soon as the element is found and returns a pointer to that first occurrence. If it doesn’t find one, it returns a pointer one position past the end of the sequence (a+SIZE in this example). Therefore, find is said to make at most a number of comparisons equal to the number of elements in the sequence.
Sometimes the number of operations an algorithm takes cannot be measured with such precision. In such cases, the standard specifies the algorithm’s asymptotic complexity, which is a measure of how the algorithm behaves with large sequences compared to well-known formulas. A good example is the sort( ) algorithm, which the standard says takes approximately n log n comparisons on average? (n is the number of elements in the sequence) . Such complexity measures give a feel for the cost of an algorithm and at least give a meaningful basis for comparing algorithms. As you’ll see, the find( ) member function for the set container has logarithmic complexity, which means that the cost of searching for an element in a set will, for large sets, be proportional to the logarithm of the number of elements. This is much smaller than the number of elements for large n, so it is always better to search a set by using its find( ) member function rather than by using the generic find( ) algorithm.
Function Objects
As you study some of the examples earlier in this tutorial, you will probably notice the limited utility of the function gt15( ). What if you want to use a number other than 15 as a comparison threshold? You may need a gt20( ) or gt25( ) or others as well. Having to write a separate function for each such comparison has two distasteful difficulties:
- You may have to write a lot of functions!
- You must know all required values when you write your application code.
The second limitation means that you can’t use runtime values to govern your searches, which is downright unacceptable. Overcoming this difficulty requires a way to pass information to predicates at runtime. For example, you would need a greater-than function that you can initialize with an arbitrary comparison value. Unfortunately, you can’t pass that value as a function parameter, because unary predicates, such as our gt15( ), are applied to each value in a sequence individually and must therefore take only one parameter.
The way out of this dilemma is, as always, to create an abstraction. In this case, we need an abstraction that can act like a function as well as store state, without disturbing the number of function parameters it accepts when used. This abstraction is called a function object .
A function object is an instance of a class that overloads operator( ), the function call operator. This operator allows an object to be used with function call syntax. As with any other object, you can initialize it via its constructors. Here is a function object that can be used in place of gt15( ):
#include <iostream>
using namespace std;
class gt_n {
int value;
public:
gt_n(int val) : value(val) {}
bool operator()(int n) {
return n > value;
}
};
int main() {
gt_n f(4);
cout << f(3) << endl; // Prints 0 (for false)
cout << f(5) << endl; // Prints 1 (for true)
}
The fixed value to compare against (4) is passed when the function object f is created. The expression f(3) is then evaluated by the compiler as the following function call: f.operator()(3);
which returns the value of the expression 3 > value, which is false when value is 4, as it is in this example.
Since such comparisons apply to types other than int, it would make sense to define gt_n( ) as a class template. It turns out you don’t have to do it yourself, though the standard library has already done it for you. The following descriptions of function objects should not only make that topic clear, but also give you a better understanding of how the generic algorithms work.
Classification of function objects
The standard C++ library classifies function objects based on the number of arguments that their operator( ) takes and the kind of value it returns. This classification is organized according to whether a function object’s operator( ) takes zero, one, or two arguments, as the following definitions illustrate.
Generator: A type of function object that takes no arguments and returns a value of an arbitrary type. A random number generator is an example of a generator. The standard library provides one generator, the function rand( ) declared in <cstdlib>, and has some algorithms, such as generate_n( ), which apply generators to a sequence.
- Unary Function: A type of function object that takes a single argument of any type and returns a value that may be of a different type (which may be void).
- Binary Function: A type of function object that takes two arguments of any two (possibly distinct) types and returns a value of any type (including void).
- Unary Predicate: A Unary Function that returns a bool.
- Binary Predicate: A Binary Function that returns a bool.
- Strict Weak Ordering: A binary predicate that allows for a more general interpretation of equality. Some of the standard containers consider two elements equivalent if neither is less than the other (using operator<( )). This is important when comparing floating-point values, and objects of other types where operator==( ) is unreliable or unavailable. This notion also applies if you want to sort a sequence of data records (structs) on a subset of the struct’s fields, that comparison scheme is considered a strict weak ordering because two records with equal keys are not really ?equal? as total objects, but they are equal as far as the comparison you’re using is concerned.
In addition, certain algorithms make assumptions about the operations available for the types of objects they process. We will use the following terms to indicate these assumptions:
- LessThanComparable: A class that has a less-than operator<.
- Assignable: A class that has a copy-assignment operator= for its own type.
- EqualityComparable: A class that has an equivalence operator== for its own type.
Adaptable Function Objects
Standard function adapters such as bind1st( ) and bind2nd( ) make some assumptions about the function objects they process. To illustrate, consider the following expression from the last line of the earlier CountNotEqual.cpp program:
not1(bind1st(equal_to<int>(), 20))
The bind1st( ) adapter creates a unary function object of type binder1st, which simply stores an instance of equal_to<int> and the value 20. The binder1st::operator() function needs to know its argument type and its return type; otherwise, it will not have a valid declaration. The convention to solve this problem is to expect all function objects to provide nested type definitions for these types. For unary functions, the type names are argument_type and result_type; for binary function objects they are first_argument_type, second_argument_type, and result_type. Looking at the implementation of bind1st( ) and binder1st in the <functional> header reveals these expectations. First inspect bind1st( ), as it might appear in a typical library implementation:
template <class Op, class T>
binder1st<Op>
bind1st(const Op& f, const T& val)
{
typedef typename Op::first_argument_type Arg1_t;
return binder1st<Op>(f, Arg1_t(val));
}
Note that the template parameter, Op, which represents the type of the binary function being adapted by bind1st( ), must have a nested type named first_argument_type. Now notice how binder1st uses the type names in Op in its declaration of its function call operator:
// Inside the implementation for binder1st<Op>?
typename Op::result_type
operator()(const typename Op::second_argument_type& x)
const;
Function objects whose classes provide these type names are called adaptable function objects.
Since these names are expected of all standard function objects as well as of any function objects you create that you want to use with the function object adapters, the <functional> header provides two templates that define these types for you: unary_function and binary_function. You simply derive from these classes while filling in the argument types as template parameters. Suppose, for example, that we want to make the function object gt_n, defined earlier in this tutorial, adaptable. All we need to do is the following:
class gt_n : public unary_function<int, bool> {
int value;
public:
gt_n(int val) : value(val) {}
bool operator()(int n) {
return n > value;
}
};
All unary_function does is to provide the appropriate type definitions, which it infers from its template parameters as you can see in its definition:
template <class Arg, class Result>
struct unary_function {
typedef Arg argument_type;
typedef Result result_type;
};
These types become accessible through gt_n because it derives publicly from unary_function. The binary_function template behaves in a similar manner.
Function Pointer Adapters
Wherever a function-like entity is expected by an algorithm, you can supply either a pointer to an ordinary function or a function object. When the algorithm issues a call, if it is through a function pointer, than the native function-call mechanism is used. If it through a function object, then that objects operator( ) member executes. You saw earlier, for example, that we passed a raw function, gt_15( ), as a predicate to remove_copy_if( ) in the first C++ program of this article. We also passed pointers to functions returning random numbers to generate( ) and generate_n( ).
You cannot, however, use raw functions with function object adapters, such as bind2nd( ), because they assume the existence of type definitions for the argument and result types. Instead of manually converting your native functions into function objects yourself, the standard library provides a family of adapters to do the work for you. The ptr_fun( ) adapters take a pointer to a function and turn it into a function object. They are not designed for a function that takes no arguments?they must only be used with unary functions or binary functions.
The following program uses ptr_fun( ) to wrap a unary function.
#include <algorithm>
#include <cmath>
#include <functional>
#include <iostream>
#include <iterator>
#include <vector>
using namespace std;
int d[] = {123, 94, 10, 314, 315};
const int DSZ = sizeof d / sizeof *d;
bool isEven(int x) {
return x % 2 == 0;
}
int main() {
vector<bool> vb;
transform(d, d + DSZ, back_inserter(vb),
not1(ptr_fun(isEven)));
copy(vb.begin(), vb.end(),
ostream_iterator<bool>(cout, " "));
cout << endl;
// Output: 1 0 0 0 1
}
We can’?’t simply pass isEven to not1, because not1 needs to know the actual argument type and return type its argument uses. The ptr_fun( ) adapter deduces those types through template argument deduction. The definition of the unary version of ptr_fun( ) looks something like this:
template <class Arg, class Result>
pointer_to_unary_function<Arg, Result>
ptr_fun(Result (*fptr)(Arg))
{
return pointer_to_unary_function<Arg, Result>(fptr);
}
As you can see, this version of ptr_fun( ) deduces the argument and result types from fptr and uses them to initialize a pointer_to_unary_function object that stores fptr. The function call operator for pointer_to_unary_function just calls fptr, as you can see by the last line of its code:
template <class Arg, class Result>
class pointer_to_unary_function
: public unary_function<Arg, Result> {
Result (*fptr)(Arg); // stores the f-ptr
public:
pointer_to_unary_function(Result (*x)(Arg))
: fptr(x) {}
Result operator()(Arg x) const {return fptr(x);}
};
Since pointer_to_unary_function derives from unary_function, the appropriate type definitions come along for the ride and are available to not1.
There is also a binary version of ptr_fun( ), which returns a pointer_to_binary_function object (which derives from binary_function, of course) that behaves analogously to the unary case. The following program uses the binary version of ptr_fun( ) to raise numbers in a sequence to a power. It also reveals a ?gotcha? when passing overloaded functions to ptr_fun( ).
#include <algorithm>
#include <cmath>
#include <functional>
#include <iostream>
#include <iterator>
#include <vector>
using namespace std;
double d[] = { 01.23, 91.370, 56.661,
023.230, 19.959, 1.0, 3.14159 };
const int DSZ = sizeof d / sizeof *d;
int main() {
vector<double> vd;
transform(d, d + DSZ, back_inserter(vd),
bind2nd(ptr_fun<double, double, double>(pow), 2.0));
copy(vd.begin(), vd.end(),
ostream_iterator<double>(cout, " "));
cout << endl;
}
The pow( ) function is overloaded in the standard C++ header <cmath> for each of the floating-point data types, as follows:
float pow(float, int);
double pow(double, int);
long double pow(long double, int);
float pow(float, float);
double pow(double, double);
long double pow(long double, long double);
Since there are multiple versions of pow( ), the compiler has no way of knowing which to choose. In this case, we have to help the compiler by using explicit function template specialization.
An even trickier problem is that of converting a member function into a function object suitable for using with the generic algorithms. As a simple example, suppose we have the classical ?shape? problem and want to apply the draw( ) member function to each pointer in a container of Shape:
#include <algorithm>
#include <functional>
#include <iostream>
#include <vector>
#include "../purge.h"
using namespace std;
class Shape {
public:
virtual void draw() = 0;
virtual ~Shape() {}
};
class Circle : public Shape {
public:
virtual void draw() {
cout << "Circle::Draw()" << endl;
}
~Circle() {
cout << "Circle::~Circle()" << endl;
}
};
class Square : public Shape {
public:
virtual void draw() {
cout << "Square::Draw()" << endl;
}
~Square() {
cout << "Square::~Square()" << endl;
}
};
int main() {
vector<Shape*> vs;
vs.push_back(new Circle);
vs.push_back(new Square);
for_each(vs.begin(), vs.end(),
mem_fun(&Shape::draw));
purge(vs);
}
The for_each()
algorithm passes each element in a sequence to the function object specified as its third argument. To achieve this, we utilize mem_fun()
with a pointer to the member function, where the function object’s “argument” is the pointer to the object the member function is called for. The purge()
function, a custom creation, employs delete
on every element in the sequence.
mem_fun()
is employed to create function objects called using a pointer to the object for which the member function is invoked. On the other hand, mem_fun_ref()
directly calls the member function for an object. Both mem_fun()
and mem_fun_ref()
have overloads for member functions taking zero or one argument, multiplied by two to handle const vs. non-const member functions. However, the specifics of templates and overloading are managed behind the scenes; the key is knowing when to use mem_fun()
versus mem_fun_ref()
.
Suppose you have a container of objects (not pointers), and you want to call a member function that takes an argument. The argument you pass should come from a second container of objects. To accomplish this, use the second overloaded form of the transform( ) algorithm:
#include <algorithm>
#include <functional>
#include <iostream>
#include <iterator>
#include <vector>
using namespace std;
class Angle {
int degrees;
public:
Angle(int deg) : degrees(deg) {}
int mul(int times) {
return degrees *= times;
}
};
int main() {
vector<Angle> va;
for(int i = 0; i < 50; i += 10)
va.push_back(Angle(i));
int x[] = { 1, 2, 3, 4, 5 };
transform(va.begin(), va.end(), x,
ostream_iterator<int>(cout, " "),
mem_fun_ref(&Angle::mul));
cout << endl;
// Output: 0 20 60 120 200
}
Because the container is holding objects, mem_fun_ref( ) must be used with the pointer-to-member function. This version of transform( ) takes the start and end point of the first range (where the objects live); the starting point of the second range, which holds the arguments to the member function; the destination iterator, which in this case is standard output; and the function object to call for each object. This function object is created with mem_fun_ref( ) and the desired pointer to member. Notice that the transform( ) and for_each( ) template functions are incomplete; transform( ) requires that the function it calls return a value, and there is no for_each( ) that passes two arguments to the function it calls. Thus, you cannot call a member function that returns void and takes an argument using transform( ) or for_each( ).
Most any member function works with mem_fun_ref( ). You can also use standard library member functions, if your compiler doesn?t add any default arguments beyond the normal arguments specified in the standard . For example, suppose you?d like to read a file and search for blank lines; your compiler may allow you to use the string::empty( ) member function like this:
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>
#include <vector>
#include "../require.h"
using namespace std;
typedef vector<string>::iterator LSI;
int main(int argc, char* argv[]) {
char* fname = "FindBlanks.cpp";
if(argc > 1) fname = argv[1];
ifstream in(fname);
assure(in, fname);
vector<string> vs;
string s;
while(getline(in, s))
vs.push_back(s);
vector<string> cpy = vs; // For testing
LSI lsi = find_if(vs.begin(), vs.end(),
mem_fun_ref(&string::empty));
while(lsi != vs.end()) {
*lsi = "A BLANK LINE";
lsi = find_if(vs.begin(), vs.end(),
mem_fun_ref(&string::empty));
}
for(size_t i = 0; i < cpy.size(); i++)
if(cpy[i].size() == 0)
assert(vs[i] == "A BLANK LINE");
else
assert(vs[i] != "A BLANK LINE");
}
This example uses find_if( ) to locate the first blank line in the given range using mem_fun_ref( ) with string::empty( ). After the file is opened and read into the vector, the process is repeated to find every blank line in the file. Each time a blank line is found, it is replaced with the characters “A BLANK LINE.” All you have to do to accomplish this is dereference the iterator to select the current string.
STL Algorithms Quick Reference
This section offers a swift guide for pinpointing the right algorithm, leaving comprehensive exploration and detailed performance considerations to other references. Our focus is to swiftly familiarize you with the algorithms, assuming you’ll delve into specialized references for more in-depth details.
While the algorithms are commonly presented with full template declaration syntax, we omit it here, assuming your awareness of their template nature. The function declarations make it easy to discern template arguments, and the type names for arguments describe the required iterator types. This approach enhances readability, with the full declaration available in the template header file if needed.
Iterator class names convey the expected iterator type without the need for interface base classes. Your compiler will alert you if required iteration operations are absent. Brief descriptions of iterator variations follow.
InputIterator: An Input Iterator allows reading elements in a forward pass using operator++ and operator*. It can be tested with operator== and operator!=.
OutputIterator: An Output Iterator allows writing elements in a forward pass using operator++ and operator*. It cannot be tested with operator== and operator!=. Output Iterators are designed for continuous writing without checking for the destination’s end marker, crucial for compatibility with ostream and insert iterators like back_inserter().
ForwardIterator: A Forward Iterator relaxes the restrictions of Input and Output Iterators. It allows both reading and writing in a forward pass, supporting comparisons within the same range.
BidirectionalIterator: A Bidirectional Iterator extends the capabilities of a Forward Iterator by allowing backward movement with operator–.
RandomAccessIterator: A Random Access Iterator mirrors regular pointer operations. It supports jumps forward and backward, subscripting with operator[], comparison with operator<, operator>, and other integral operations, making it crucial for efficient algorithms like sorting.
Iterator names used in algorithm descriptions include the listed types, sometimes with “1” or “2” distinctions and additional arguments like function objects.
The mathematical range notation [first, last) indicates a range between the iterator pointing to the initial element and the past-the-end iterator. Algorithms often organize based on problem-solving, like searching or modifying elements, rather than the underlying behavior.
All algorithms are in the std namespace, typically found in <algorithm> unless specified otherwise in headers like <utility> or <numeric>.