[ Seminars ] [ Seminars on CD ROM ] [ Consulting ]

The other half of the STL is the algorithms,
which are templatized functions designed to work with the containers (or, as you
will see, anything that can behave like a container, including arrays** **and
**string** objects).

The STL was originally designed around the algorithms. The
goal was that you use algorithms for almost every piece of code that you write.
In this sense it was a bit of an experiment, and only time will tell how well it
works. The real test will be in how easy or difficult it is for the average
programmer to adapt. At the end of this chapter you’ll be able to decide
for yourself whether you find the algorithms addictive or too confusing to
remember. If you’re like me, you’ll resist them at first but then
tend to use them more and more.

Before you make your judgment, however, there’s one
other thing to consider. The STL algorithms provide a *vocabulary* with
which to describe solutions. That is, once you become familiar with the
algorithms you’ll have a new set of words with which to discuss what
you’re doing, and these words are at a higher level than what you’ve
had before. You don’t have to say “this loop moves through and
assigns from here to there ... oh, I see, it’s copying!” Instead,
you say **copy( )**. This is the kind of thing we’ve been doing in
computer programming from the beginning – creating more dense ways to
express *what* we’re doing and spending less time saying *how*
we’re doing it. Whether the STL algorithms and *generic programming
*are a great success in accomplishing this remains to be seen, but that is
certainly the objective.

A concept that is used heavily in the STL algorithms is the
*function object*, which was introduced in the previous chapter. A function
object has an overloaded **operator( )**, and the result is that a
template function can’t tell whether you’ve handed it a pointer to a
function or an object that has an **operator( )**; all the template
function knows is that it can attach an argument list to the object *as if*
it were a pointer to a function:

//: C05:FuncObject.cpp // Simple function objects #include <iostream> using namespace std; template<class UnaryFunc, class T> void callFunc(T& x, UnaryFunc f) { f(x); } void g(int& x) { x = 47; } struct UFunc { void operator()(int& x) { x = 48; } }; int main() { int y = 0; callFunc(y, g); cout << y << endl; y = 0; callFunc(y, UFunc()); cout << y << endl; } ///:~

The template **callFunc( )** says “give me an
**f** and an **x**, and I’ll write the code **f(x)**.” In
**main( )**, you can see that it doesn’t matter if **f** is a
pointer to a function (as in the case of **g( )**), or if it’s a
function object (which is created as a temporary object by the expression
**UFunc( )**). Notice you can only accomplish this genericity with a
template function; a non-template function is too particular about its argument
types to allow such a thing. The STL algorithms use this flexibility to take
either a function pointer or a function object, but you’ll usually find
that creating a function object is more powerful and flexible.

The function object is actually a variation on the theme of a
*callback*, which is described in the design patterns chapter. A callback
allows you to vary the behavior of a function or object by passing, as an
argument, a way to execute some other piece of code. Here, we are handing
**callFunc( )** a pointer to a function or a function object.

The following descriptions of function objects should not only
make that topic clear, but also give you an introduction to the way the STL
algorithms work.

Just as the STL classifies iterators (based on their
capabilities), it also classifies function objects based on the number of
arguments that their **operator( )** takes and the kind of value
returned by that operator (of course, this is also true for function pointers
when you treat them as function objects). The classification of function objects
in the STL is based on whether the **operator( )** takes zero, one or
two arguments, and if it returns a **bool** or non-**bool**
value.

**Generator**: Takes no arguments, and returns a value of
the desired type. A **RandomNumberGenerator** is a special case.

**UnaryFunction**: Takes a single argument of any type and
returns a value which may be of a different type.

**BinaryFunction**: Takes two arguments of any two types
and returns a value of any type.

A special case of the unary and binary functions is the
*predicate*, which simply means a function that returns a **bool**. A
predicate is a function you use to make a** true**/**false**
decision.

**Predicate**: This can also be called a
**UnaryPredicate**. It takes a single argument of any type and returns a
**bool**.

**BinaryPredicate**: Takes two arguments of any two types
and returns a **bool**.

**StrictWeakOrdering**: A binary predicate that says that
if you have two objects and neither one is less than the other, they can be
regarded as equivalent to each other.

In addition, there are sometimes qualifications on object
types that are passed to an algorithm. These qualifications are given in the
template argument type identifier name:

**LessThanComparable**: A class that has a less-than
**operator<**.

**Assignable**: A class that has an assignment
**operator=** for its own type.

The STL has, in the header file **<functional>**, a
set of templates that will automatically create function objects for you. These
generated function objects are admittedly simple, but the goal is to provide
very basic functionality that will allow you to compose more complicated
function objects, and in many situations this is all you’ll need. Also,
you’ll see that there are some *function object adapters* that allow
you to take the simple function objects and make them slightly more
complicated.

Here are the templates that generate function objects, along
with the expressions that they effect.

Name |
Type |
Result produced by generated function object |
---|---|---|

plus |
BinaryFunction |
arg1 + arg2 |

minus |
BinaryFunction |
arg1 - arg2 |

multiplies |
BinaryFunction |
arg1 * arg2 |

divides |
BinaryFunction |
arg1 / arg2 |

modulus |
BinaryFunction |
arg1 % arg2 |

negate |
UnaryFunction |
- arg1 |

equal_to |
BinaryPredicate |
arg1 == arg2 |

not_equal_to |
BinaryPredicate |
arg1 != arg2 |

greater |
BinaryPredicate |
arg1 > arg2 |

less |
BinaryPredicate |
arg1 < arg2 |

greater_equal |
BinaryPredicate |
arg1 >= arg2 |

less_equal |
BinaryPredicate |
arg1 <= arg2 |

logical_and |
BinaryPredicate |
arg1 && arg2 |

logical_or |
BinaryPredicate |
arg1 || arg2 |

logical_not |
UnaryPredicate |
!arg1 |

not1( ) |
Unary Logical |
!(UnaryPredicate(arg1)) |

not2( ) |
Binary Logical |
!(BinaryPredicate(arg1, arg2)) |

The following example provides simple tests for each of the
built-in basic function object templates. This way, you can see how to use each
one, along with their resulting behavior.

//: C05:FunctionObjects.cpp // Using the predefined function object templates // in the Standard C++ library // This will be defined shortly: #include "Generators.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; template<typename T> void print(vector<T>& v, char* msg = "") { if(*msg != 0) cout << msg << ":" << endl; copy(v.begin(), v.end(), ostream_iterator<T>(cout, " ")); cout << endl; } template<typename Contain, typename UnaryFunc> void testUnary(Contain& source, Contain& dest, UnaryFunc f) { transform(source.begin(), source.end(), dest.begin(), f); } template<typename Contain1, typename Contain2, typename BinaryFunc> void testBinary(Contain1& src1, Contain1& src2, Contain2& dest, BinaryFunc f) { transform(src1.begin(), src1.end(), src2.begin(), dest.begin(), f); } // Executes the expression, then stringizes the // expression into the print statement: #define T(EXPR) EXPR; print(r, "After " #EXPR); // For Boolean tests: #define B(EXPR) EXPR; print(br,"After " #EXPR); // Boolean random generator: struct BRand { BRand() { srand(time(0)); } bool operator()() { return rand() > RAND_MAX / 2; } }; int main() { const int sz = 10; const int max = 50; vector<int> x(sz), y(sz), r(sz); // An integer random number generator: URandGen urg(max); generate_n(x.begin(), sz, urg); generate_n(y.begin(), sz, urg); // Add one to each to guarantee nonzero divide: transform(y.begin(), y.end(), y.begin(), bind2nd(plus<int>(), 1)); // Guarantee one pair of elements is ==: x[0] = y[0]; print(x, "x"); print(y, "y"); // Operate on each element pair of x & y, // putting the result into r: T(testBinary(x, y, r, plus<int>())); T(testBinary(x, y, r, minus<int>())); T(testBinary(x, y, r, multiplies<int>())); T(testBinary(x, y, r, divides<int>())); T(testBinary(x, y, r, modulus<int>())); T(testUnary(x, r, negate<int>())); vector<bool> br(sz); // For Boolean results B(testBinary(x, y, br, equal_to<int>())); B(testBinary(x, y, br, not_equal_to<int>())); B(testBinary(x, y, br, greater<int>())); B(testBinary(x, y, br, less<int>())); B(testBinary(x, y, br, greater_equal<int>())); B(testBinary(x, y, br, less_equal<int>())); B(testBinary(x, y, br, not2(greater_equal<int>()))); B(testBinary(x,y,br,not2(less_equal<int>()))); vector<bool> b1(sz), b2(sz); generate_n(b1.begin(), sz, BRand()); generate_n(b2.begin(), sz, BRand()); print(b1, "b1"); print(b2, "b2"); B(testBinary(b1, b2, br, logical_and<int>())); B(testBinary(b1, b2, br, logical_or<int>())); B(testUnary(b1, br, logical_not<int>())); B(testUnary(b1, br, not1(logical_not<int>()))); } ///:~

To keep this example small, some tools are created. The
**print( )** template is designed to print any **vector<T>**,
along with an optional message. Since **print( )** uses the STL
**copy( )** algorithm to send objects to **cout** via an
**ostream_iterator**, the **ostream_iterator** must know the type of
object it is printing, and therefore the **print( )** template must know
this type also. However, you’ll see in **main( )** that the
compiler can deduce the type of **T** when you hand it a
**vector<T>**, so you don’t have to hand it the template argument
explicitly; you just say **print(x)** to print the **vector<T>
x**.

The next two template functions automate the process of
testing the various function object templates. There are two since the function
objects are either unary or binary. In **testUnary( )**, you pass a
source and destination vector, and a unary function object to apply to the
source vector to produce the destination vector. In **testBinary( )**,
there are two source vectors which are fed to a binary function to produce the
destination vector. In both cases, the template functions simply turn around and
call the **transform( )** algorithm, although the tests could certainly
be more complex.

For each test, you want to see a string describing what the
test is, followed by the results of the test. To automate this, the preprocessor
comes in handy; the **T( )** and **B( )** macros each take the
expression you want to execute. They call that expression, then call
**print( )**, passing it the result vector (they assume the expression
changes a vector named **r** and **br**, respectively), and to produce the
message the expression is “string-ized” using the preprocessor. So
that way you see the code of the expression that is executed followed by the
result vector.

The last little tool is a generator object that creates random
**bool** values. To do this, it gets a random number from **rand( )**
and tests to see if it’s greater than **RAND_MAX/2**. If the random
numbers are evenly distributed, this should happen half the time.

In **main( )**, three **vector<int>** are
created: **x** and **y** for source values, and **r** for results. To
initialize **x** and **y** with random values no greater than 50, a
generator of type **URandGen** is used; this will be defined shortly. Since
there is one operation where elements of **x** are divided by elements of
**y**, we must ensure that there are no zero values of **y**. This is
accomplished using the **transform( )** algorithm, taking the source
values from **y** and putting the results back into **y**. The function
object for this is created with the expression:

bind2nd(plus<int>(), 1)

This uses the **plus** function object that adds two
objects together. It is thus a binary function which requires two arguments; we
only want to pass it one argument (the element from **y**) and have the other
argument be the value 1. A “binder” does the trick (we will look at
these next). The binder in this case says “make a new function object
which is the **plus** function object with the second argument fixed at
1.”

Another of the tests in the program compares the elements in
the two vectors for equality, so it is interesting to guarantee that at least
one pair of elements is equivalent; in this case element zero is
chosen.

Once the two vectors are printed, **T( )** is used to
test each of the function objects that produces a numerical value, and then
**B( )** is used to test each function object that produces a Boolean
result. The result is placed into a **vector<bool>**, and when this
vector is printed it produces a ‘**1**’ for a true value and a
‘**0**’ for a false value.

It’s common to want to take a binary function object and
to “bind” one of its arguments to a constant value. After binding,
you get a unary function object.

For example, suppose you want to find integers that are less
than a particular value, say 20. Sensibly enough, the STL algorithms have a
function called **find_if( )** that will search through a sequence;
however, **find_if( )** requires a unary predicate to tell it if this is
what you’re looking for. This unary predicate can of course be some
function object that you have written by hand, but it can also be created using
the built-in function object templates. In this case, the **less** template
will work, but that produces a binary predicate, so we need some way of forming
a unary predicate. The binder templates (which work with any binary function
object, not just binary predicates) give you two choices:

**bind1st(const BinaryFunction& op, const T&
t);****bind2nd(const BinaryFunction& op, const T&
t);**

Both bind **t **to one of the arguments of **op**, but
**bind1st( )** binds **t** to the first argument, and
**bind2nd( )** binds **t** to the second argument. With **less**,
the function object that provides the solution to our exercise is:

bind2nd(less<int>(), 20);

This produces a new function object that returns true if its
argument is less than 20. Here it is, used with
**find_if( )**:

//: C05:Binder1.cpp // Using STL "binders" #include "Generators.h" #include "copy_if.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; int main() { const int sz = 10; const int max = 40; vector<int> a(sz), r; URandGen urg(max); ostream_iterator<int> out(cout, " "); generate_n(a.begin(), sz, urg); copy(a.begin(), a.end(), out); int* d = find_if(a.begin(), a.end(), bind2nd(less<int>(), 20)); cout << "\n *d = " << *d << endl; // copy_if() is not in the Standard C++ library // but is defined later in the chapter: copy_if(a.begin(), a.end(), back_inserter(r), bind2nd(less<int>(), 20)); copy(r.begin(), r.end(), out); cout << endl; } ///:~

The **vector<int>** **a** is filled with random
numbers between 0 and **max**. **find_if( )** finds the first element
in **a** that satisfies the predicate (that is, which is less than 20) and
returns an iterator to it (here, the type of the iterator is actually just
**int*** although I could have been more precise and said
**vector<int>::iterator** instead).

A more interesting algorithm to use is **copy_if( )**,
which isn’t part of the STL but is defined at the end of this chapter.
This algorithm only copies an element from the source to the destination if that
element satisfies a predicate. So the resulting vector will only contain
elements that are less than 20.

Here’s a second example, using a
**vector<string>** and replacing strings that satisfy particular
conditions:

//: C05:Binder2.cpp // More binders #include <algorithm> #include <vector> #include <string> #include <iostream> #include <functional> using namespace std; int main() { ostream_iterator<string> out(cout, " "); vector<string> v, r; v.push_back("Hi"); v.push_back("Hi"); v.push_back("Hey"); v.push_back("Hee"); v.push_back("Hi"); copy(v.begin(), v.end(), out); cout << endl; // Replace each "Hi" with "Ho": replace_copy_if(v.begin(), v.end(), back_inserter(r), bind2nd(equal_to<string>(), "Hi"), "Ho"); copy(r.begin(), r.end(), out); cout << endl; // Replace anything that's not "Hi" with "Ho": replace_if(v.begin(), v.end(), not1(bind2nd(equal_to<string>(),"Hi")),"Ho"); copy(v.begin(), v.end(), out); cout << endl; } ///:~

This uses another pair of STL algorithms. The first,
**replace_copy_if( )**, copies each element from a source range to a
destination range, performing replacements on those that satisfy a particular
unary predicate. The second, **replace_if( )**, doesn’t do any
copying but instead performs the replacements directly into the original
range.

A binder doesn’t have to produce a unary predicate; it
can also create a unary function (that is, a function that returns something
other than **bool**). For example, suppose you’d like to multiply every
element in a **vector** by 10. Using a binder with the
**transform( )** algorithm does the trick:

//: C05:Binder3.cpp // Binders aren't limited to producing predicates #include "Generators.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; int main() { ostream_iterator<int> out(cout, " "); vector<int> v(15); generate(v.begin(), v.end(), URandGen(20)); copy(v.begin(), v.end(), out); cout << endl; transform(v.begin(), v.end(), v.begin(), bind2nd(multiplies<int>(), 10)); copy(v.begin(), v.end(), out); cout << endl; } ///:~

Since the third argument to **transform( ) **is the
same as the first, the resulting elements are copied back into the source
vector. The function object created by **bind2nd( )** in this case
produces an **int** result.

The “bound” argument to a binder cannot be a
function object, but it does not have to be a compile-time constant. For
example:

//: C05:Binder4.cpp // The bound argument does not have // to be a compile-time constant #include "copy_if.h" #include "PrintSequence.h" #include "../require.h" #include <iostream> #include <algorithm> #include <functional> #include <cstdlib> using namespace std; int boundedRand() { return rand() % 100; } int main(int argc, char* argv[]) { requireArgs(argc, 1, "usage: Binder4 int"); const int sz = 20; int a[20], b[20] = {0}; generate(a, a + sz, boundedRand); int* end = copy_if(a, a + sz, b, bind2nd(greater<int>(), atoi(argv[1]))); // Sort for easier viewing: sort(a, a + sz); sort(b, end); print(a, a + sz, "array a", " "); print(b, end, "values greater than yours"," "); } ///:~

Here, an array is filled with random numbers between 0 and
100, and the user provides a value on the command line. In the
**copy_if( )** call, you can see that the bound argument to
**bind2nd( )** is the result of the function call **atoi( )**
(from **<cstdlib>**).

Any place in an STL algorithm where a function object is
required, it’s very conceivable that you’d like to use a function
pointer instead. Actually, you *can* use an ordinary function pointer
– that’s how the STL was designed, so that a “function
object” can actually be anything that can be dereferenced using an
argument list. For example, the **rand( )** random number generator can
be passed to **generate( )** or **generate_n( )** as a function
pointer, like this:

//: C05:RandGenTest.cpp // A little test of the random number generator #include <algorithm> #include <vector> #include <iostream> #include <functional> #include <cstdlib> #include <ctime> using namespace std; int main() { const int sz = 10000; int v[sz]; srand(time(0)); // Seed the random generator for(int i = 0; i < 300; i++) { // Using a naked pointer to function: generate(v, v + sz, std::rand); int count = count_if(v, v + sz, bind2nd(greater<int>(), RAND_MAX/2)); cout << (((double)count)/((double)sz)) * 100 << ' '; } } ///:~

The “iterators” in this case are just the starting
and past-the-end pointers for the array **v**, and the generator is just a
pointer to the standard library **rand( )** function. The program
repeatedly generates a group of random numbers, then it uses the STL algorithm
**count_if( )** and a predicate that tells whether a particular element
is greater than **RAND_MAX/2**. The result is the number of elements that
match this criterion; this is divided by the total number of elements and
multiplied by 100 to produce the percentage of elements greater than the
midpoint. If the random number generator is reasonable, this value should hover
at around 50% (of course, there are many other tests to determine if the random
number generator is reasonable).

The **ptr_fun( ) **adapters take a pointer to a
function and turn it into a function object. They are not designed for a
function that takes no arguments, like the one above (that is, a generator).
Instead, they are for unary functions and binary functions. However, these could
also be simply passed as if they were function objects, so the
**ptr_fun( )** adapters might at first appear to be redundant.
Here’s an example where using **ptr_fun( )** and simply passing
the address of the function both produce the same results:

//: C05:PtrFun1.cpp // Using ptr_fun() for single-argument functions #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; char* n[] = { "01.23", "91.370", "56.661", "023.230", "19.959", "1.0", "3.14159" }; const int nsz = sizeof n / sizeof *n; template<typename InputIter> void print(InputIter first, InputIter last) { while(first != last) cout << *first++ << "\t"; cout << endl; } int main() { print(n, n + nsz); vector<double> vd; transform(n, n + nsz, back_inserter(vd), atof); print(vd.begin(), vd.end()); transform(n,n + nsz,vd.begin(), ptr_fun(atof)); print(vd.begin(), vd.end()); } ///:~

The goal of this program is to convert an array of
**char*** which are ASCII representations of floating-point numbers into a
**vector<double>**. After defining this array and the
**print( )** template (which encapsulates the act of printing a range of
elements), you can see **transform( )** used with **atof( )** as
a “naked” pointer to a function, and then a second time with
**atof** passed to **ptr_fun( )**. The results are the same. So why
bother with **ptr_fun( )**? Well, the actual effect of
**ptr_fun( )** is to create a function object with an
**operator( )**. This function object can then be passed to other
template adapters, such as binders, to create new function objects. As
you’ll see a bit later, the SGI extensions to the STL contain a number of
other function templates to enable this, but in the Standard C++ STL there are
only the **bind1st( )** and **bind2nd( )** function templates,
and these expect binary function objects as their first arguments. In the above
example, only the **ptr_fun( )** for a unary function is used, and that
doesn’t work with the binders. So **ptr_fun( )** used with a unary
function in Standard C++ really is redundant (note that Gnu g++ uses the SGI
STL).

With a binary function and a binder, things can be a little
more interesting. This program produces the squares of the input vector
**d**:

//: C05:PtrFun2.cpp // Using ptr_fun() for two-argument functions #include <algorithm> #include <vector> #include <iostream> #include <functional> #include <cmath> using namespace std; double d[] = { 01.23, 91.370, 56.661, 023.230, 19.959, 1.0, 3.14159 }; const int dsz = sizeof d / sizeof *d; int main() { vector<double> vd; transform(d, d + dsz, back_inserter(vd), bind2nd(ptr_fun(pow), 2.0)); copy(vd.begin(), vd.end(), ostream_iterator<double>(cout, " ")); cout << endl; } ///:~

Here, **ptr_fun( )** is indispensable;
**bind2nd( )** *must* have a function object as its first argument
and a pointer to function won’t cut it.

A trickier problem is that of converting a member function
into a function object suitable for using in the STL algorithms. As a simple
example, suppose we have the “shape” problem and would like to apply
the **draw( )** member function to each pointer in a container of
**Shape**:

//: C05:MemFun1.cpp // Applying pointers to member functions #include "../purge.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; class Shape { public: virtual void draw() = 0; virtual ~Shape() {} }; class Circle : public Shape { public: virtual void draw() { cout << "Circle::Draw()" << endl; } ~Circle() { cout << "Circle::~Circle()" << endl; } }; class Square : public Shape { public: virtual void draw() { cout << "Square::Draw()" << endl; } ~Square() { cout << "Square::~Square()" << endl; } }; int main() { vector<Shape*> vs; vs.push_back(new Circle); vs.push_back(new Square); for_each(vs.begin(), vs.end(), mem_fun(&Shape::draw)); purge(vs); } ///:~

The **for_each( )** function does just what it sounds
like it does: passes each element in the range determined by the first two
(iterator) arguments to the function object which is its third argument. In this
case we want the function object to be created from one of the member functions
of the class itself, and so the function object’s “argument”
becomes the pointer to the object that the member function is called for. To
produce such a function object, the **mem_fun( )** template takes a
pointer to member as its argument.

The **mem_fun( )** functions are for producing
function objects that are called using a pointer to the object that the member
function is called for, while **mem_fun_ref( )** is used for calling the
member function directly for an object. One set of overloads of both
**mem_fun( )** and **mem_fun_ref( )** are for member functions
that take zero arguments and one argument, and this is multiplied by two to
handle **const** vs. non-**const** member functions. However, templates
and overloading takes care of sorting all of that out; all you need to remember
is when to use **mem_fun( )** vs. **mem_fun_ref( )**.

Suppose you have a container of objects (not pointers) and you
want to call a member function that takes an argument. The argument you pass
should come from a second container of objects. To accomplish this, the second
overloaded form of the **transform( )** algorithm is used:

//: C05:MemFun2.cpp // Applying pointers to member functions #include <algorithm> #include <vector> #include <iostream> #include <functional> using namespace std; class Angle { int degrees; public: Angle(int deg) : degrees(deg) {} int mul(int times) { return degrees *= times; } }; int main() { vector<Angle> va; for(int i = 0; i < 50; i += 10) va.push_back(Angle(i)); int x[] = { 1, 2, 3, 4, 5 }; transform(va.begin(), va.end(), x, ostream_iterator<int>(cout, " "), mem_fun_ref(&Angle::mul)); cout << endl; } ///:~

Because the container is holding objects,
**mem_fun_ref( )** must be used with the pointer-to-member function.
This version of **transform( )** takes the start and end point of the
first range (where the objects live), the starting point of second range which
holds the arguments to the member function, the destination iterator which in
this case is standard output, and the function object to call for each object;
this function object is created with **mem_fun_ref( )** and the desired
pointer to member. Notice the **transform( )** and
**for_each( )** template functions are incomplete;
**transform( )** requires that the function it calls return a value and
there is no **for_each( )** that passes two arguments to the function it
calls. Thus, you cannot call a member function that returns **void** and
takes an argument using **transform( )** or
**for_each( )**.

Any member function works, including those in the Standard
libraries. For example, suppose you’d like to read a file and search for
blank lines; you can use the **string::empty( )** member function like
this:

//: C05:FindBlanks.cpp // Demonstrate mem_fun_ref() with string::empty() #include "../require.h" #include <algorithm> #include <list> #include <string> #include <fstream> #include <functional> using namespace std; typedef list<string>::iterator LSI; LSI blank(LSI begin, LSI end) { return find_if(begin, end, mem_fun_ref(&string::empty)); } int main(int argc, char* argv[]) { requireArgs(argc, 1); ifstream in(argv[1]); assure(in, argv[1]); list<string> ls; string s; while(getline(in, s)) ls.push_back(s); LSI lsi = blank(ls.begin(), ls.end()); while(lsi != ls.end()) { *lsi = "A BLANK LINE"; lsi = blank(lsi, ls.end()); } string f(argv[1]); f += ".out"; ofstream out(f.c_str()); copy(ls.begin(), ls.end(), ostream_iterator<string>(out, "\n")); } ///:~

The **blank( )** function uses **find_if( )**
to locate the first blank line in the given range using
**mem_fun_ref( )** with **string::empty( )**. After the file is
opened and read into the **list**, **blank( )** is called repeated
times to find every blank line in the file. Notice that subsequent calls to
**blank( )** use the current version of the iterator so it moves forward
to the next one. Each time a blank line is found, it is replaced with the
characters “A BLANK LINE.” All you have to do to accomplish this is
dereference the iterator, and you select the current
**string**.

The SGI STL (mentioned at the end of the previous chapter)
also includes additional function object templates, which allow you to write
expressions that create even more complicated function objects. Consider a more
involved program which converts strings of digits into floating point numbers,
like **PtrFun2.cpp** but more general. First, here’s a generator that
creates strings of integers that represent floating-point values (including an
embedded decimal point):

//: C05:NumStringGen.h // A random number generator that produces // strings representing floating-point numbers #ifndef NUMSTRINGGEN_H #define NUMSTRINGGEN_H #include <string> #include <cstdlib> #include <ctime> class NumStringGen { const int sz; // Number of digits to make public: NumStringGen(int ssz = 5) : sz(ssz) { std::srand(std::time(0)); } std::string operator()() { static char n[] = "0123456789"; const int nsz = 10; std::string r(sz, ' '); for(int i = 0; i < sz; i++) if(i == sz/2) r[i] = '.'; // Insert a decimal point else r[i] = n[std::rand() % nsz]; return r; } }; #endif // NUMSTRINGGEN_H ///:~

You tell it how big the **string**s should be when you
create the **NumStringGen** object. The random number generator is used to
select digits, and a decimal point is placed in the middle.

The following program (which works with the Standard C++ STL
without the SGI extensions) uses **NumStringGen** to fill a
**vector<string>**. However, to use the Standard C library function
**atof( )** to convert the strings to floating-point numbers, the
**string** objects must first be turned into **char** pointers, since
there is no automatic type conversion from **string** to **char***. The
**transform( )** algorithm can be used with **mem_fun_ref( )**
and **string::c_str( )** to convert all the **string**s to
**char***, and then these can be transformed using **atof**:

//: C05:MemFun3.cpp // Using mem_fun() #include "NumStringGen.h" #include <algorithm> #include <vector> #include <string> #include <iostream> #include <functional> using namespace std; int main() { const int sz = 9; vector<string> vs(sz); // Fill it with random number strings: generate(vs.begin(), vs.end(), NumStringGen()); copy(vs.begin(), vs.end(), ostream_iterator<string>(cout, "\t")); cout << endl; const char* vcp[sz]; transform(vs.begin(), vs.end(), vcp, mem_fun_ref(&string::c_str)); vector<double> vd; transform(vcp,vcp + sz,back_inserter(vd), std::atof); copy(vd.begin(), vd.end(), ostream_iterator<double>(cout, "\t")); cout << endl; } ///:~

The SGI extensions to the STL contain a number of additional
function object templates that accomplish more detailed activities than the
Standard C++ function object templates, including **identity **(returns its
argument unchanged), **project1st** and **project2nd** (to take two
arguments and return the first or second one, respectively), **select1st**
and **select2nd** (to take a **pair** object and return the first or
second element, respectively), and the “compose” function
templates.

If you’re using the SGI extensions, you can make the
above program denser using one of the two “compose” function
templates. The first, **compose1(f1, f2)**, takes the two function
objects **f1 **and **f2 **as its arguments. It produces a function object
that takes a single argument, passes it to **f2**, then takes the result of
the call to **f2** and passes it to **f1**. The result of **f1** is
returned. By using **compose1( )**, the process of converting the
**string** objects to **char***, then converting the **char*** to a
floating-point number can be combined into a single operation, like
this:

//: C05:MemFun4.cpp // Using the SGI STL compose1 function #include "NumStringGen.h" #include <algorithm> #include <vector> #include <string> #include <iostream> #include <functional> using namespace std; int main() { const int sz = 9; vector<string> vs(sz); // Fill it with random number strings: generate(vs.begin(), vs.end(), NumStringGen()); copy(vs.begin(), vs.end(), ostream_iterator<string>(cout, "\t")); cout << endl; vector<double> vd; transform(vs.begin(), vs.end(), back_inserter(vd), compose1(ptr_fun(atof), mem_fun_ref(&string::c_str))); copy(vd.begin(), vd.end(), ostream_iterator<double>(cout, "\t")); cout << endl; } ///:~

You can see there’s only a single call to
**transform( )** now, and no intermediate holder for the **char**
pointers.

The second “compose” function is
**compose2( )**, which takes three function objects as its arguments.
The first function object is binary (it takes two arguments), and its arguments
are the results of the second and third function objects, respectively. The
function object that results from **compose2( )** expects one argument,
and it feeds that argument to the second and third function objects. Here is an
example:

//: C05:Compose2.cpp // Using the SGI STL compose2() function #include "copy_if.h" #include <algorithm> #include <vector> #include <iostream> #include <functional> #include <cstdlib> #include <ctime> using namespace std; int main() { srand(time(0)); vector<int> v(100); generate(v.begin(), v.end(), rand); transform(v.begin(), v.end(), v.begin(), bind2nd(divides<int>(), RAND_MAX/100)); vector<int> r; copy_if(v.begin(), v.end(), back_inserter(r), compose2(logical_and<bool>(), bind2nd(greater_equal<int>(), 30), bind2nd(less_equal<int>(), 40))); sort(r.begin(), r.end()); copy(r.begin(), r.end(), ostream_iterator<int>(cout, " ")); cout << endl; } ///:~

The **vector<int> v** is first filled with random
numbers. To cut these down to size, the **transform( )** algorithm is
used to divide each value by **RAND_MAX/100**, which will force the values to
be between 0 and 100 (making them more readable). The **copy_if( )**
algorithm defined later in this chapter is then used, along with a composed
function object, to copy all the elements that are greater than or equal to 30
and less than or equal to 40 into the destination **vector<int> r**.
Just to show how easy it is, **r** is sorted, and then displayed.

The arguments of **compose2( )** say, in
effect:

(x >= 30) && (x <= 40)

You could also take the function object that comes from a
**compose1( )** or **compose2( )** call and pass it into another
“compose” expression ... but this could rapidly get very difficult
to decipher.

Instead of all this composing and transforming, you can write
your own function objects (*without *using the SGI extensions) as
follows:

//: C05:NoCompose.cpp // Writing out the function objects explicitly #include "copy_if.h" #include <algorithm> #include <vector> #include <string> #include <iostream> #include <functional> #include <cstdlib> #include <ctime> using namespace std; class Rgen { const int max; public: Rgen(int mx = 100) : max(RAND_MAX/mx) { srand(time(0)); } int operator()() { return rand() / max; } }; class BoundTest { int top, bottom; public: BoundTest(int b, int t) : bottom(b), top(t) {} bool operator()(int arg) { return (arg >= bottom) && (arg <= top); } }; int main() { vector<int> v(100); generate(v.begin(), v.end(), Rgen()); vector<int> r; copy_if(v.begin(), v.end(), back_inserter(r), BoundTest(30, 40)); sort(r.begin(), r.end()); copy(r.begin(), r.end(), ostream_iterator<int>(cout, " ")); cout << endl; } ///:~

There are a few more lines of code, but you can’t deny
that it’s much clearer and easier to understand, and therefore to
maintain.

We can thus observe two drawbacks to the SGI extensions to the
STL. The first is simply that it’s an extension; yes, you can download and
use them for free so the barriers to entry are low, but your company may be
conservative and decide that if it’s not in Standard C++, they don’t
want to use it. The second drawback is complexity. Once you get familiar and
comfortable with the idea of composing complicated functions from simple ones
you can visually parse complicated expressions and figure out what they mean.
However, my guess is that most people will find anything more than what you can
do with the Standard, non-extended STL function object notation to be
overwhelming. At some point on the complexity curve you have to bite the bullet
and write a regular class to produce your function object, and that point might
as well be the point where you can’t use the Standard C++ STL. A
stand-alone class for a function object is going to be much more readable and
maintainable than a complicated function-composition expression (although my
sense of adventure does lure me into wanting to experiment more with the SGI
extensions...).

As a final note, you can’t compose generators; you can
only create function objects whose **operator( )** requires one or two
arguments.

This section provides a quick reference for when you’re
searching for the appropriate algorithm. I leave the full exploration of all the
STL algorithms to other references (see the end of this chapter, and Appendix
XX), along with the more intimate details of complexity, performance, etc. My
goal here is for you to become rapidly comfortable and facile with the
algorithms, and I will assume you will look into the more specialized references
if you need more depth of detail.

Although you will often see the algorithms described using
their full template declaration syntax, I am not doing that here because you
already know they are templates, and it’s quite easy to see what the
template arguments are from the function declarations. The type names for the
arguments provide descriptions for the types of iterators required. I think
you’ll find this form is easier to read, while you can quickly find the
full declaration in the template header file if for some reason you feel the
need.

The names of the iterator classes describe the iterator type
they must conform to. The iterator types were described in the previous chapter,
but here is a summary:

**InputIterator**.** **You (or rather, the STL algorithm
and any algorithms you write that use **InputIterator**s)** **can
increment this with **operator++** and dereference it with **operator***
to *read* the value (and *only* read the value), but you can only read
each value once. **InputIterator**s** **can be tested with
**operator==** and **operator!=**. That’s all. Because an
**InputIterator** is so limited, it can be used with **istream**s (via
**istream_iterator**).

**OutputIterator**.** **This can be incremented with
**operator++**, and dereferenced with **operator*** to *write* the
value (and *only* write the value), but you can only dereference/write each
value once. **OutputIterator**s cannot be tested with **operator==** and
**operator!=**, however, because you assume that you can just keep sending
elements to the destination and that you don’t have to see if the
destination’s end marker has been reached. That is, the container that an
**OutputIterator** references can take an infinite number of objects, so no
end-checking is necessary. This requirement is important so that an
**OutputIterator** can be used with **ostream**s (via
**ostream_iterator**), but you’ll also commonly use the
“insert” iterators **insert_iterator**,
**front_insert_iterator** and **back_insert_iterator** (generated by the
helper templates **inserter( )**, **front_inserter( )** and
**back_inserter( )**).

With both **InputIterator** and **OutputIterator**, you
cannot have multiple iterators pointing to different parts of the same range.
Just think in terms of iterators to support **istream**s and **ostream**s,
and **InputIterator** and **OutputIterator** will make perfect sense. Also
note that **InputIterator** and **OutputIterator** put the weakest
restrictions on the types of iterators they will accept, which means that you
can use any “more sophisticated” type of iterator when you see
**InputIterator** or **OutputIterator** used as STL algorithm template
arguments.

**ForwardIterator**. **InputIterator** and
**OutputIterator** are the most restricted, which means they’ll work
with the largest number of actual iterators. However, there are some operations
for which they are too restricted; you can only read from an
**InputIterator** and write to an **OutputIterator**, so you can’t
use them to read and modify a range, for example, and you can’t have more
than one active iterator on a particular range, or dereference such an iterator
more than once. With a **ForwardIterator** these restrictions are relaxed;
you can still only move forward using **operator++**, but you can both write
and read and you can write/read multiple times in each location. A
**ForwardIterator** is much more like a regular pointer, whereas
**InputIterator** and **OutputIterator** are a bit strange by
comparison.

**BidirectionalIterator**.** **Effectively, this is a
**ForwardIterator** that can also go backward. That is, a
**BidirectionalIterator** supports all the operations that a
**ForwardIterator** does, but in addition it has an **operator--**.

**RandomAccessIterator**. An iterator that is random access
supports all the same operations that a regular pointer does: you can add and
subtract integral values to move it forward and backward by jumps (rather than
just one element at a time), you can subscript it with **operator[ ]**,
you can subtract one iterator from another, and iterators can be compared to see
which is greater using **operator<**, **operator>**, etc. If
you’re implementing a sorting routine or something similar, random access
iterators are necessary to be able to create an efficient algorithm.

The names used for the template parameter types consist of the
above iterator types (sometimes with a ‘1’ or ‘2’
appended to distinguish different template arguments), and may also include
other arguments, often function objects.

When describing the group of elements that an operation is
performed on, mathematical “range” notation will often be used. In
this, the square bracket means “includes the end point” while the
parenthesis means “does not include the end point.” When using
iterators, a range is determined by the iterator pointing to the initial
element, and the “past-the-end” iterator, pointing past the last
element. Since the past-the-end element is never used, the range determined by a
pair of iterators can thus be expressed as **[first, last)**, where
**first** is the iterator pointing to the initial element and **last** is
the past-the-end iterator.

Most books and discussions of the STL algorithms arrange them
according to side effects: non-mutating algorithms don’t change the
elements in the range, mutating algorithms do change the elements, etc. These
descriptions are based more on the underlying behavior or implementation of the
algorithm – that is, the designer’s perspective. In practice, I
don’t find this a very useful categorization so I shall instead organize
them according to the problem you want to solve: are you searching for an
element or set of elements, performing an operation on each element, counting
elements, replacing elements, etc. This should help you find the one you want
more easily.

Note that all the algorithms are in the **namespace std**.
If you do not see a different header such as **<utility>** or
**<numerics>** above the function declarations, that means it appears
in **<algorithm>**.

It’s useful to create some basic tools with which to
test the algorithms.

Displaying a range is something that will be done constantly,
so here is a templatized function that allows you to print any sequence,
regardless of the type that’s in that sequence:

//: C05:PrintSequence.h // Prints the contents of any sequence #ifndef PRINTSEQUENCE_H #define PRINTSEQUENCE_H #include <iostream> template<typename InputIter> void print(InputIter first, InputIter last, char* nm = "", char* sep = "\n", std::ostream& os = std::cout) { if(*nm != '\0') // Only if you provide a string os << nm << ": " << sep; // is this printed while(first != last) os << *first++ << sep; os << std::endl; } // Use template-templates to allow type deduction // of the typename T: template<typename T, template<typename> class C> void print(C<T>& c, char* nm = "", char* sep = "\n", std::ostream& os = std::cout) { if(*nm != '\0') // Only if you provide a string os << nm << ": " << sep; // is this printed std::copy(c.begin(), c.end(), std::ostream_iterator<T>(os, " ")); cout << endl; } #endif // PRINTSEQUENCE_H ///:~

There are two forms here, one that requires you to give an
explicit range (this allows you to print an array or a sub-sequence) and one
that prints any of the STL containers, which provides notational convenience
when printing the entire contents of that container. The second form performs
template type deduction to determine the type of **T** so it can be used in
the **copy( )** algorithm. That trick wouldn’t work with the first
form, so the **copy( )** algorithm is avoided and the copying is just
done by hand (this could have been done with the second form as well, but
it’s instructive to see a template-template in use). Because of this, you
never need to specify the type that you’re printing when you call either
template function.

The default is to print to **cout** with newlines as
separators, but you can change that. You may also provide a message to print at
the head of the output.

Next, it’s useful to have some generators (classes with
an **operator( )** that returns values of the appropriate type) which
allow a sequence to be rapidly filled with different values.

//: C05:Generators.h // Different ways to fill sequences #ifndef GENERATORS_H #define GENERATORS_H #include <set> #include <cstdlib> #include <cstring> #include <ctime> // A generator that can skip over numbers: class SkipGen { int i; int skp; public: SkipGen(int start = 0, int skip = 1) : i(start), skp(skip) {} int operator()() { int r = i; i += skp; return r; } }; // Generate unique random numbers from 0 to mod: class URandGen { std::set<int> used; int modulus; public: URandGen(int mod) : modulus(mod) { std::srand(std::time(0)); } int operator()() { while(true) { int i = (int)std::rand() % modulus; if(used.find(i) == used.end()) { used.insert(i); return i; } } } }; // Produces random characters: class CharGen { static const char* source; static const int len; public: CharGen() { std::srand(std::time(0)); } char operator()() { return source[std::rand() % len]; } }; // Statics created here for convenience, but // will cause problems if multiply included: const char* CharGen::source = "ABCDEFGHIJK" "LMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"; const int CharGen::len = std::strlen(source); #endif // GENERATORS_H ///:~

To create some interesting values, the **SkipGen**
generator skips by the value **skp** each time its **operator( )** is
called. You can initialize both the start value and the skip value in the
constructor.

**URandGen** (‘U’ for “unique”) is
a generator for random **int**s between 0 and **mod**, with the additional
constraint that each value can only be produced once (thus you must be careful
not to use up all the values). This is easily accomplished with a
**set**.

**CharGen **generates **char**s and can be used to fill
up a **string** (when treating a **string** as a sequence container).
You’ll note that the one member function that any generator implements is
**operator( )** (with no arguments). This is what is called by the
“generate” functions.

The use of the generators and the **print( )**
functions is shown in the following section.

Finally, a number of the STL algorithms that move elements of
a sequence around distinguish between “stable” and
“unstable” reordering of a sequence. This refers to preserving the
original order of the elements for those elements that are equivalent but not
identical. For example, consider a sequence **{ c(1), b(1), c(2), a(1), b(2),
a(2) }**. These elements are tested for equivalence based on their letters,
but their numbers indicate how they first appeared in the sequence. If you sort
(for example) this sequence using an unstable sort, there’s no guarantee
of any particular order among equivalent letters, so you could end up with **{
a(2), a(1), b(1), b(2), c(2), c(1) }**. However, if you used a stable sort, it
guarantees you will get **{ a(1), a(2), b(1), b(2), c(1), c(2) }**.

To demonstrate the stability versus instability of algorithms
that reorder a sequence, we need some way to keep track of how the elements
originally appeared. The following is a kind of **string** object that keeps
track of the order in which that particular object originally appeared, using a
**static map** that maps **NString**s to **Counter**s. Each
**NString** then contains an **occurrence** field that indicates the order
in which this **NString** was discovered:

//: C05:NString.h // A "numbered string" that indicates which // occurrence this is of a particular word #ifndef NSTRING_H #define NSTRING_H #include <string> #include <map> #include <iostream> class NString { std::string s; int occurrence; struct Counter { int i; Counter() : i(0) {} Counter& operator++(int) { i++; return *this; } // Post-incr operator int() { return i; } }; // Keep track of the number of occurrences: typedef std::map<std::string, Counter> csmap; static csmap occurMap; public: NString() : occurrence(0) {} NString(const std::string& x) : s(x), occurrence(occurMap[s]++) {} NString(const char* x) : s(x), occurrence(occurMap[s]++) {} // The synthesized operator= and // copy-constructor are OK here friend std::ostream& operator<<( std::ostream& os, const NString& ns) { return os << ns.s << " [" << ns.occurrence << "]"; } // Need this for sorting. Notice it only // compares strings, not occurrences: friend bool operator<(const NString& l, const NString& r) { return l.s < r.s; } // For sorting with greater<NString>: friend bool operator>(const NString& l, const NString& r) { return l.s > r.s; } // To get at the string directly: operator const std::string&() const {return s;} }; // Allocate static member object. Done here for // brevity, but should actually be done in a // separate cpp file: NString::csmap NString::occurMap; #endif // NSTRING_H ///:~

In the constructors (one that takes a **string**, one that
takes a **char***), the simple-looking initialization
**occurrence(occurMap[s]++)** performs all the work of maintaining and
assigning the occurrence counts (see the demonstration of the **map** class
in the previous chapter for more details).

To do an ordinary ascending sort, the only operator
that’s necessary is **NString::operator<( )**, however to sort
in reverse order the **operator>( )** is also provided so that the
**greater** template can be used.

As this is just a demonstration class I am getting away with
the convenience of putting the definition of the static member **occurMap**
in the header file, but this will break down if the header file is included in
more than one place, so you should normally relegate all **static**
definitions to **cpp** files.

These algorithms allow you to automatically fill a range with
a particular value, or to generate a set of values for a particular range (these
were introduced in the previous chapter). The “fill” functions
insert a single value multiple times into the container, while the
“generate” functions use an object called a *generator*
(described earlier) to create the values to insert into the container.

**void fill(ForwardIterator first, ForwardIterator last,
const T& value);****void fill_n(OutputIterator first, Size n, const
T& value);**

**fill( )** assigns **value** to every element in
the range **[first, last)**. **fill_n( )** assigns **value** to
**n** elements starting at **first**.

**void generate(ForwardIterator first, ForwardIterator last,
Generator gen);****void generate_n(OutputIterator first, Size n,
Generator gen);**

**generate( )** makes a call to **gen( )** for
each element in the range **[first, last)**, presumably** **to produce a
different value for each element. **generate_n( )** calls
**gen( )** **n** times and assigns each result to **n** elements
starting at **first**.

The following example fills and generates into **vector**s.
It also shows the use of **print( )**:

//: C05:FillGenerateTest.cpp // Demonstrates "fill" and "generate" #include "Generators.h" #include "PrintSequence.h" #include <vector> #include <algorithm> #include <string> using namespace std; int main() { vector<string> v1(5); fill(v1.begin(), v1.end(), "howdy"); print(v1, "v1", " "); vector<string> v2; fill_n(back_inserter(v2), 7, "bye"); print(v2.begin(), v2.end(), "v2"); vector<int> v3(10); generate(v3.begin(), v3.end(), SkipGen(4,5)); print(v3, "v3", " "); vector<int> v4; generate_n(back_inserter(v4),15, URandGen(30)); print(v4, "v4", " "); } ///:~

A **vector<string>** is created with a pre-defined
size. Since storage has already been created for all the **string** objects
in the **vector**, **fill( )** can use its assignment operator to
assign a copy of “howdy” to each space in the **vector**. To
print the result, the second form of **print( )** is used which simply
needs a container (you don’t have to give the first and last iterators).
Also, the default newline separator is replaced with a space.

The second **vector<string> v2** is not given an
initial size so **back_inserter** must be used to force new elements in
instead of trying to assign to existing locations. Just as an example, the other
**print( )** is used which requires a range.

The **generate( )** and **generate_n( )**
functions have the same form as the “fill” functions except that
they use a generator instead of a constant value; here, both generators are
demonstrated.

All containers have a method **size( )** that will
tell you how many elements they hold. The following two algorithms count objects
only if they satisfy certain criteria.

**IntegralValue count(InputIterator first, InputIterator
last, **** const EqualityComparable& value);**

Produces the number of elements in **[first, last)** that
are equivalent to **value** (when tested using **operator==**).

**IntegralValue count_if(InputIterator first, InputIterator
last, Predicate pred);**

Produces the number of elements** **in **[first, last)**
which each cause **pred** to return **true**.

Here, a **vector<char> v** is** **filled with
random characters (including some duplicates). A **set<char>** is
initialized from **v**, so it holds only one of each letter represented in
**v**. This **set** is used to count all the instances of all the
different characters, which are then displayed:

//: C05:Counting.cpp // The counting algorithms #include "PrintSequence.h" #include "Generators.h" #include <vector> #include <algorithm> using namespace std; int main() { vector<char> v; generate_n(back_inserter(v), 50, CharGen()); print(v, "v", ""); // Create a set of the characters in v: set<char> cs(v.begin(), v.end()); set<char>::iterator it = cs.begin(); while(it != cs.end()) { int n = count(v.begin(), v.end(), *it); cout << *it << ": " << n << ", "; it++; } int lc = count_if(v.begin(), v.end(), bind2nd(greater<char>(), 'a')); cout << "\nLowercase letters: " << lc << endl; sort(v.begin(), v.end()); print(v, "sorted", ""); } ///:~

The **count_if( )** algorithm is demonstrated by
counting all the lowercase letters; the predicate is created using the
**bind2nd( )** and **greater** function object
templates.

These algorithms allow you to move sequences around.

**OutputIterator copy(InputIterator, first InputIterator
last, OutputIterator destination);**

Using assignment, copies from **[first, last)** to
**destination**, incrementing **destination** after each assignment. Works
with almost any type of source range and almost any kind of destination. Because
assignment is used, you cannot directly insert elements into an empty container
or at the end of a container, but instead you must wrap the **destination**
iterator in an **insert_iterator** (typically by using
**back_inserter( )**, or **inserter( )** in the case of an
associative container).

The copy algorithm is used in many examples in this
book.

**BidirectionalIterator2 copy_backward(BidirectionalIterator1
first, **** BidirectionalIterator1 last, BidirectionalIterator2
destinationEnd);**

Like **copy( )**, but performs the actual copying of
the elements in reverse order. That is, the resulting sequence is the same,
it’s just that the copy happens in a different way. The source range
**[first, last)** is copied to the destination, but the first destination
element is **destinationEnd - 1**. This iterator is then decremented after
each assignment. The space in the destination range must already exist (to allow
assignment), and the destination range cannot be within the source
range.

**void reverse(BidirectionalIterator first,
BidirectionalIterator last);****OutputIterator
reverse_copy(BidirectionalIterator first, BidirectionalIterator last,****
OutputIterator destination);**

Both forms of this function reverse the range **[first,
last)**. **reverse( )** reverses the range in place, while
**reverse_copy( )** leaves the original range alone and copies the
reversed elements into **destination**, returning the past-the-end iterator
of the resulting range.

**ForwardIterator2 swap_ranges(ForwardIterator1 first1,
ForwardIterator1 last1, **** ForwardIterator2 first2);**

Exchanges the contents of two ranges of equal size, by moving
from the beginning to the end of each range and swapping each set of
elements.

**void rotate(ForwardIterator first, ForwardIterator middle,
ForwardIterator last);****OutputIterator rotate_copy(ForwardIterator
first, ForwardIterator middle,**** ForwardIterator last,
OutputIterator destination);**

Swaps the two ranges **[first, middle)** and **[middle,
last)**. With **rotate( )**, the swap is performed in place, and with
**rotate_copy( )** the original range is untouched and the rotated
version is copied into **destination**, returning the past-the-end iterator
of the resulting range. Note that while **swap_ranges( )** requires that
the two ranges be exactly the same size, the “rotate” functions do
not.

**bool next_permutation(BidirectionalIterator first,
BidirectionalIterator last);****bool
next_permutation(BidirectionalIterator first, BidirectionalIterator
last,**** StrictWeakOrdering binary_pred);****bool
prev_permutation(BidirectionalIterator first, BidirectionalIterator
last);****bool prev_permutation(BidirectionalIterator first,
BidirectionalIterator last,**** StrictWeakOrdering
binary_pred);**

A *permutation* is one unique ordering of a set of
elements. If you have **n** unique elements, then there are **n!**
(**n** factorial) distinct possible combinations of those elements. All these
combinations can be conceptually sorted into a sequence using a lexicographical
ordering, and thus produce a concept of a “next” and
“previous” permutation. Therefore, whatever the current ordering of
elements in the range, there is a distinct “next” and
“previous” permutation in the sequence of permutations.

The **next_permutation( )** and
**prev_permutation( )** functions re-arrange the elements into their
next or previous permutation, and if successful return **true**. If there are
no more “next” permutations, it means that the elements are in
sorted order so **next_permutation( )** returns **false**. If there
are no more “previous” permutations, it means that the elements are
in descending sorted order so **previous_permutation( )** returns
**false**.

The versions of the functions which have a
**StrictWeakOrdering** argument perform the comparisons using
**binary_pred** instead of **operator<**.

**void random_shuffle(RandomAccessIterator first,
RandomAccessIterator last);****void random_shuffle(RandomAccessIterator
first, RandomAccessIterator last**** RandomNumberGenerator&
rand);**

This function randomly rearranges the elements in the range.
It yields uniformly distributed results. The first form uses an internal random
number generator and the second uses a user-supplied random-number
generator.

**BidirectionalIterator partition(BidirectionalIterator
first, BidirectionalIterator last, **** Predicate
pred);****BidirectionalIterator stable_partition(BidirectionalIterator
first, **** BidirectionalIterator last, Predicate pred);**

The “partition” functions use **pred** to
organize the elements in the range **[first, last)** so they are before or
after the partition (a point in the range). The partition point is given by the
returned iterator. If **pred(*i)** is **true** (where **i** is the
iterator pointing to a particular element), then that element will be placed
before the partition point, otherwise it will be placed after the partition
point.

With **partition( )**, the order of the elements is
after the function call is not specified, but with
**stable_parition( )** the relative order of the elements before and
after the partition point will be the same as before the partitioning
process.

This gives a basic demonstration of sequence
manipulation:

//: C05:Manipulations.cpp // Shows basic manipulations #include "PrintSequence.h" #include "NString.h" #include "Generators.h" #include <vector> #include <string> #include <algorithm> using namespace std; int main() { vector<int> v1(10); // Simple counting: generate(v1.begin(), v1.end(), SkipGen()); print(v1, "v1", " "); vector<int> v2(v1.size()); copy_backward(v1.begin(), v1.end(), v2.end()); print(v2, "copy_backward", " "); reverse_copy(v1.begin(), v1.end(), v2.begin()); print(v2, "reverse_copy", " "); reverse(v1.begin(), v1.end()); print(v1, "reverse", " "); int half = v1.size() / 2; // Ranges must be exactly the same size: swap_ranges(v1.begin(), v1.begin() + half, v1.begin() + half); print(v1, "swap_ranges", " "); // Start with fresh sequence: generate(v1.begin(), v1.end(), SkipGen()); print(v1, "v1", " "); int third = v1.size() / 3; for(int i = 0; i < 10; i++) { rotate(v1.begin(), v1.begin() + third, v1.end()); print(v1, "rotate", " "); } cout << "Second rotate example:" << endl; char c[] = "aabbccddeeffgghhiijj"; const char csz = strlen(c); for(int i = 0; i < 10; i++) { rotate(c, c + 2, c + csz); print(c, c + csz, "", ""); } cout << "All n! permutations of abcd:" << endl; int nf = 4 * 3 * 2 * 1; char p[] = "abcd"; for(int i = 0; i < nf; i++) { next_permutation(p, p + 4); print(p, p + 4, "", ""); } cout << "Using prev_permutation:" << endl; for(int i = 0; i < nf; i++) { prev_permutation(p, p + 4); print(p, p + 4, "", ""); } cout << "random_shuffling a word:" << endl; string s("hello"); cout << s << endl; for(int i = 0; i < 5; i++) { random_shuffle(s.begin(), s.end()); cout << s << endl; } NString sa[] = { "a", "b", "c", "d", "a", "b", "c", "d", "a", "b", "c", "d", "a", "b", "c"}; const int sasz = sizeof sa / sizeof *sa; vector<NString> ns(sa, sa + sasz); print(ns, "ns", " "); vector<NString>::iterator it = partition(ns.begin(), ns.end(), bind2nd(greater<NString>(), "b")); cout << "Partition point: " << *it << endl; print(ns, "", " "); // Reload vector: copy (sa, sa + sasz, ns.begin()); it = stable_partition(ns.begin(), ns.end(), bind2nd(greater<NString>(), "b")); cout << "Stable partition" << endl; cout << "Partition point: " << *it << endl; print(ns, "", " "); } ///:~

The best way to see the results of the above program is to run
it (you’ll probably want to redirect the output to a file).

The **vector<int> v1** is initially loaded with a
simple ascending sequence and printed. You’ll see that the effect of
**copy_backward( )** (which copies into **v2**, which is the same
size as **v1**) is the same as an ordinary copy. Again,
**copy_backward( )** does the same thing as **copy( )**, it just
performs the operations in backward order.

**reverse_copy( )**, however, actually does created a
reversed copy, while **reverse( )** performs the reversal in place.
Next, **swap_ranges( )** swaps the upper half of the reversed sequence
with the lower half. Of course, the ranges could be smaller subsets of the
entire vector, as long as they are of equivalent size.

After re-creating the ascending sequence,
**rotate( )** is demonstrated by rotating one third of **v1**
multiple times. A second **rotate( )** example uses characters and just
rotates two characters at a time. This also demonstrates the flexibility of both
the STL algorithms and the **print( )** template, since they can both be
used with arrays of **char** as easily as with anything else.

To demonstrate **next_permutation( )** and
**prev_permutation( )**, a set of four characters “abcd” is
permuted through all **n!** (**n** factorial) possible combinations.
You’ll see from the output that the permutations move through a
strictly-defined order (that is, permuting is a deterministic
process).

A quick-and-dirty demonstration of
**random_shuffle( )** is to apply it to a **string** and see what
words result. Because a **string** object has **begin( )** and
**end( )** member functions that return the appropriate iterators, it
too may be easily used with many of the STL algorithms. Of course, an array of
**char** could also have been used.

Finally, the **partition( )** and
**stable_partition( )** are demonstrated, using an array of
**NString**. You’ll note that the aggregate initialization expression
uses **char** arrays, but **NString** has a **char*** constructor which
is automatically used.

When partitioning a sequence, you need a predicate which will
determine whether the object belongs above or below the partition point. This
takes a single argument and returns **true** (the object is above the
partition point) or **false** (it isn’t). I could have written a
separate function or function object to do this, but for something simple, like
“the object is greater than ‘b’”, why not use the
built-in function object templates? The expression is:

bind2nd(greater<NString>(), "b")

And to understand it, you need to pick it apart from the
middle outward. First,

greater<NString>()

produces a binary function object which compares its first and
second arguments:

return first > second;

and returns a **bool**. But we don’t want a binary
predicate, and we want to compare against the constant value
“**b**.” So **bind2nd( )** says: create a new function
object which only takes one argument, by taking this
**greater<NString>( )** function and forcing the second argument
to always be “**b**.” The first argument (the only argument) will
be the one from the vector **ns**.

You’ll see from the output that with the unstable
partition, the objects are correctly above and below the partition point, but in
no particular order, whereas with the stable partition their original order is
maintained.

All of these algorithms are used for searching for one or more
objects within a range defined by the first two iterator arguments.

**InputIterator find(InputIterator first, InputIterator
last,**** const EqualityComparable& value);**

Searches for **value **within a range of elements. Returns
an iterator in the range **[first, last)** that points to the first
occurrence of **value**. If **value** isn’t in the range, then
**find( )** returns **last**. This is a *linear search*, that
is, it starts at the beginning and looks at each sequential element without
making any assumptions about the way the elements are ordered. In contrast, a
**binary_search( )** (defined later) works on a sorted sequence and can
thus be much faster.

**InputIterator find_if(InputIterator first, InputIterator
last, Predicate pred);**

Just like **find( )**, **find_if( )** performs
a linear search through the range. However, instead of searching for
**value**, **find_if( )** looks for an element such that the
**Predicate pred** returns **true** when applied to that element. Returns
**last** if no such element can be found.

**ForwardIterator adjacent_find(ForwardIterator first,
ForwardIterator last);****ForwardIterator adjacent_find(ForwardIterator
first, ForwardIterator last, **** BinaryPredicate
binary_pred);**

Like **find( )**, performs a linear search through the
range, but instead of looking for only one element it searches for two elements
that are right next to each other. The first form of the function looks for two
elements that are equivalent (via **operator==**). The second form looks for
two adjacent elements that, when passed together to **binary_pred**, produce
a **true** result. If two adjacent elements cannot be found, **last** is
returned.

**ForwardIterator1 find_first_of(ForwardIterator1 first1,
ForwardIterator1 last1, **** ForwardIterator2 first2, ForwardIterator2
last2);****ForwardIterator1 find_first_of(ForwardIterator1 first1,
ForwardIterator1 last1, **** ForwardIterator2 first2, ForwardIterator2
last2, BinaryPredicate binary_pred); **

Like **find( )**, performs a linear search through the
range. The first form finds the first element in the first range that is
equivalent to any of the elements in the second range. The second form finds the
first element in the first range that produces **true** when passed to
**binary_pred** along with any of the elements in the second range. When a
**BinaryPredicate** is used with two ranges in the algorithms, the element
from the first range becomes the first argument to **binary_pred**, and the
element from the second range becomes the second argument.

**ForwardIterator1 search(ForwardIterator1 first1,
ForwardIterator1 last1, **** ForwardIterator2 first2, ForwardIterator2
last2);****ForwardIterator1 search(ForwardIterator1 first1,
ForwardIterator1 last1, **** ForwardIterator2 first2, ForwardIterator2
last2 BinaryPredicate binary_pred);**

Attempts to find the entire range **[first2, last2)**
within the range **[first1, last1)**. That is, it checks to see if the second
range occurs (in the exact order of the second range) within the first range,
and if so returns an iterator pointing to the place in the first range where the
second range begins. Returns **last1** if no subset can be found. The first
form performs its test using **operator==**, while the second checks to see
if each pair of objects being compared causes **binary_pred** to return
**true**.

**ForwardIterator1 find_end(ForwardIterator1 first1,
ForwardIterator1 last1,**** ForwardIterator2 first2, ForwardIterator2
last2);****ForwardIterator1 find_end(ForwardIterator1 first1,
ForwardIterator1 last1,**** ForwardIterator2 first2, ForwardIterator2
last2, BinaryPredicate binary_pred);**

The forms and arguments are just like **search( )** in
that it looks for the second range within the first range, but while
**search( )** looks for the first occurrence of the second range,
**find_end( )** looks for the *last* occurrence of the second range
within the first.

**ForwardIterator search_n(ForwardIterator first,
ForwardIterator last, **** Size count, const T&
value);****ForwardIterator search_n(ForwardIterator first,
ForwardIterator last, **** Size count, const T& value,
BinaryPredicate binary_pred);**

Looks for a group of **count** consecutive values in
**[first, last)** that are all equal to **value** (in the first form) or
that all cause a return value of **true** when passed into **binary_pred**
along with **value** (in the second form). Returns **last** if such a
group cannot be found.

**ForwardIterator min_element(ForwardIterator first,
ForwardIterator last);****ForwardIterator min_element(ForwardIterator
first, ForwardIterator last, **** BinaryPredicate
binary_pred);**

Returns an iterator pointing to the first occurrence of the
smallest value in the range (there may be multiple occurrences of the smallest
value). Returns **last** if the range is empty. The first version performs
comparisons with **operator<** and the value **r ** returned is such
that

***e < *r**

is false for every element **e** in the range.
The second version compares using **binary_pred** and the value **r**
returned is such that **binary_pred (*e, *r)** is false for every element
**e** in the range.

**ForwardIterator max_element(ForwardIterator first,
ForwardIterator last);****ForwardIterator max_element(ForwardIterator
first, ForwardIterator last, **** BinaryPredicate
binary_pred);**

Returns an iterator pointing to the first occurrence of the
largest value in the range (there may be multiple occurrences of the largest
value). Returns **last** if the range is empty. The first version performs
comparisons with **operator<** and the value **r ** returned is such
that

***r < *e**

is false for every element **e** in the range.
The second version compares using **binary_pred** and the value **r**
returned is such that **binary_pred (*r, *e)** is false for every element
**e** in the range.

**void replace(ForwardIterator first, ForwardIterator last,
**** const T& old_value, const T& new_value);****void
replace_if(ForwardIterator first, ForwardIterator last, **** Predicate
pred, const T& new_value);****OutputIterator
replace_copy(InputIterator first, InputIterator last, ****
OutputIterator result, const T& old_value, const T&
new_value);****OutputIterator replace_copy_if(InputIterator first,
InputIterator last, **** OutputIterator result, Predicate pred, const
T& new_value);**

Each of the “replace” forms moves through the
range **[first, last)**, finding values that match a criterion and replacing
them with **new_value**. Both **replace( )** and
**replace_copy( )** simply look for **old_value** to replace, while
**replace_if( )** and **replace_copy_if( )** look for values
that satisfy the predicate **pred**. The “copy” versions of the
functions do not modify the original range but instead make a copy with the
replacements into **result** (incrementing **result** after each
assignment).

To provide easy viewing of the results, this example will
manipulate **vector**s of **int**. Again, not every possible version of
each algorithm will be shown (some that should be obvious have been
omitted).

//: C05:SearchReplace.cpp // The STL search and replace algorithms #include "PrintSequence.h" #include <vector> #include <algorithm> #include <functional> using namespace std; struct PlusOne { bool operator()(int i, int j) { return j == i + 1; } }; class MulMoreThan { int value; public: MulMoreThan(int val) : value(val) {} bool operator()(int v, int m) { return v * m > value; } }; int main() { int a[] = { 1, 2, 3, 4, 5, 6, 6, 7, 7, 7, 8, 8, 8, 8, 11, 11, 11, 11, 11 }; const int asz = sizeof a / sizeof *a; vector<int> v(a, a + asz); print(v, "v", " "); vector<int>::iterator it = find(v.begin(), v.end(), 4); cout << "find: " << *it << endl; it = find_if(v.begin(), v.end(), bind2nd(greater<int>(), 8)); cout << "find_if: " << *it << endl; it = adjacent_find(v.begin(), v.end()); while(it != v.end()) { cout << "adjacent_find: " << *it << ", " << *(it + 1) << endl; it = adjacent_find(it + 2, v.end()); } it = adjacent_find(v.begin(), v.end(), PlusOne()); while(it != v.end()) { cout << "adjacent_find PlusOne: " << *it << ", " << *(it + 1) << endl; it = adjacent_find(it + 1, v.end(), PlusOne()); } int b[] = { 8, 11 }; const int bsz = sizeof b / sizeof *b; print(b, b + bsz, "b", " "); it = find_first_of(v.begin(), v.end(), b, b + bsz); print(it, it + bsz, "find_first_of", " "); it = find_first_of(v.begin(), v.end(), b, b + bsz, PlusOne()); print(it,it + bsz,"find_first_of PlusOne"," "); it = search(v.begin(), v.end(), b, b + bsz); print(it, it + bsz, "search", " "); int c[] = { 5, 6, 7 }; const int csz = sizeof c / sizeof *c; print(c, c + csz, "c", " "); it = search(v.begin(), v.end(), c, c + csz, PlusOne()); print(it, it + csz,"search PlusOne", " "); int d[] = { 11, 11, 11 }; const int dsz = sizeof d / sizeof *d; print(d, d + dsz, "d", " "); it = find_end(v.begin(), v.end(), d, d + dsz); print(it, v.end(),"find_end", " "); int e[] = { 9, 9 }; print(e, e + 2, "e", " "); it = find_end(v.begin(), v.end(), e, e + 2, PlusOne()); print(it, v.end(),"find_end PlusOne"," "); it = search_n(v.begin(), v.end(), 3, 7); print(it, it + 3, "search_n 3, 7", " "); it = search_n(v.begin(), v.end(), 6, 15, MulMoreThan(100)); print(it, it + 6, "search_n 6, 15, MulMoreThan(100)", " "); cout << "min_element: " << *min_element(v.begin(), v.end()) << endl; cout << "max_element: " << *max_element(v.begin(), v.end()) << endl; vector<int> v2; replace_copy(v.begin(), v.end(), back_inserter(v2), 8, 47); print(v2, "replace_copy 8 -> 47", " "); replace_if(v.begin(), v.end(), bind2nd(greater_equal<int>(), 7), -1); print(v, "replace_if >= 7 -> -1", " "); } ///:~

The example begins with two predicates: **PlusOne** which
is a binary predicate that returns **true** if the second argument is
equivalent to one plus the first argument, and **MulMoreThan** which returns
**true** if the first argument times the second argument is greater than a
value stored in the object. These binary predicates are used as tests in the
example.

In **main( )**, an array **a** is created and fed
to the constructor for **vector<int> v**. This vector will be used as
the target for the search and replace activities, and you’ll note that
there are duplicate elements – these will be discovered by some of the
search/replace routines.

The first test demonstrates **find( )**, discovering
the value 4 in **v**. The return value is the iterator pointing to the first
instance of 4, or the end of the input range (**v.end( )**) if the
search value is not found.

**find_if( )** uses a predicate to determine if it has
discovered the correct element. In the above example, this predicate is created
on the fly using **greater<int>** (that is, “see if the first
**int **argument is greater than the second”) and
**bind2nd( )** to fix the second argument to 8. Thus, it returns true if
the value in **v** is greater than 8.

Since there are a number of cases in **v** where two
identical objects appear next to each other, the test of
**adjacent_find( )** is designed to find them all. It starts looking
from the beginning and then drops into a **while** loop, making sure that the
iterator **it** has not reached the end of the input sequence (which would
mean that no more matches can be found). For each match it finds, the loop
prints out the matches and then performs the next **adjacent_find( )**,
this time using **it + 2** as the first argument (this way, it moves past the
two elements that it already found).

You might look at the **while** loop and think that you can
do it a bit more cleverly, to wit:

while(it != v.end()) { cout << "adjacent_find: " << *it++ << ", " << *it++ << endl; it = adjacent_find(it, v.end()); }

Of course, this is exactly what I tried at first. However, I
did not get the output I expected, on any compiler. This is because there is no
guarantee about when the increments occur in the above expression. A bit of a
disturbing discovery, I know, but the situation is best avoided now that
you’re aware of it.

The next test uses **adjacent_find( )** with the
**PlusOne** predicate, which discovers all the places where the next number
in the sequence **v** changes from the previous by one. The same **while**
approach is used to find all the cases.

**find_first_of( )** requires a second range of
objects for which to hunt; this is provided in the array **b**. Notice that,
because the first range and the second range in **find_first_of( )** are
controlled by separate template arguments, those ranges can refer to two
different types of containers, as seen here. The second form of
**find_first_of( )** is also tested, using **PlusOne**.

**search( )** finds exactly the second range inside
the first one, with the elements in the same order. The second form of
**search( )** uses a predicate, which is typically just something that
defines equivalence, but it also opens some interesting possibilities –
here, the **PlusOne** predicate causes the range **{ 4, 5, 6 }** to be
found.

The **find_end( )** test discovers the *last*
occurrence of the entire sequence **{ 11, 11, 11 }**. To show that it has in
fact found the last occurrence, the rest of **v** starting from **it** is
printed.

The first **search_n( )** test looks for 3 copies of
the value 7, which it finds and prints. When using the second version of
**search_n( )**, the predicate is ordinarily meant to be used to
determine equivalence between two elements, but I’ve taken some liberties
and used a function object that multiplies the value in the sequence by (in this
case) 15 and checks to see if it’s greater than 100. That is, the
**search_n( )** test above says “find me 6 consecutive values
which, when multiplied by 15, each produce a number greater than 100.” Not
exactly what you normally expect to do, but it might give you some ideas the
next time you have an odd searching problem.

**min_element( )** and **max_element( )** are
straightforward; the only thing that’s a bit odd is that it looks like the
function is being dereferenced with a ‘*****’. Actually, the
returned iterator is being dereferenced to produce the value for
printing.

To test replacements, **replace_copy( )** is used
first (so it doesn’t modify the original vector) to replace all values of
8 with the value 47. Notice the use of **back_inserter( )** with the
empty vector **v2**. To demonstrate **replace_if( )**, a function
object is created using the standard template **greater_equal** along with
**bind2nd** to replace all the values that are greater than or equal to 7
with the value -1.

These algorithms provide ways to compare two ranges. At first
glance, the operations they perform seem very close to the **search( )**
function above. However, **search( )** tells you where the second
sequence appears within the first, while **equal( )** and
**lexicographical_compare( ) **simply tell you whether or not two
sequences are exactly identical (using different comparison algorithms). On the
other hand, **mismatch( )** does tell you where the two sequences go out
of sync, but those sequences must be exactly the same length.

**bool equal(InputIterator first1, InputIterator last1,
InputIterator first2);****bool equal(InputIterator first1, InputIterator
last1, InputIterator first2**** BinaryPredicate
binary_pred);**

In both of these functions, the first range is the typical
one, **[first1, last1)**. The second range starts at **first2**, but there
is no “last2” because its length is determined by the length of the
first range. The **equal( )** function returns true if both ranges are
exactly the same (the same elements in the same order); in the first case, the
**operator==** is used to perform the comparison and in the second case
**binary_pred** is used to decide if two elements are the same.

**bool lexicographical_compare(InputIterator1 first1,
InputIterator1 last1**** InputIterator2 first2, InputIterator2
last2);****bool lexicographical_compare(InputIterator1 first1,
InputIterator1 last1**** InputIterator2 first2, InputIterator2 last2,
BinaryPredicate binary_pred);**

These two functions determine if the first range is
“lexicographically less” than the second (they return **true** if
range 1 is less than range 2, and false otherwise. Lexicographical equality, or
“dictionary” comparison, means that the comparison is done the same
way we establish the order of strings in a dictionary, one element at a time.
The first elements determine the result if these elements are different, but if
they’re equal the algorithm moves on to the next elements and looks at
those, and so on. until it finds a mismatch. At that point it looks at the
elements, and if the element from range 1 is less than the element from range
two, then **lexicographical_compare( )** returns **true**, otherwise
it returns **false**. If it gets all the way through one range or the other
(the ranges may be different lengths for this algorithm) without finding an
inequality, then range 1 is *not *less than range 2 so the function returns
**false**.

If the two ranges are different lengths, a missing element in
one range acts as one that “precedes” an element that exists in the
other range. So {‘a’, ‘b’} lexicographically precedes
{‘a’, ‘b’, ‘a’ }.

In the first version of the function, **operator<** is
used to perform the comparisons, and in the second version **binary_pred** is
used.

**pair<InputIterator1, InputIterator2>
mismatch(InputIterator1 first1, **** InputIterator1 last1,
InputIterator2 first2);****pair<InputIterator1, InputIterator2>
mismatch(InputIterator1 first1, **** InputIterator1 last1,
InputIterator2 first2, BinaryPredicate binary_pred);**

As in **equal( )**, the length of both ranges is
exactly the same, so only the first iterator in the second range is necessary,
and the length of the first range is used as the length of the second range.
Whereas **equal( )** just tells you whether or not the two ranges are
the same, **mismatch( )** tells you where they begin to differ. To
accomplish this, you must be told (1) the element in the first range where the
mismatch occurred and (2) the element in the second range where the mismatch
occurred. These two iterators are packaged together into a **pair** object
and returned. If no mismatch occurs, the return value is **last1** combined
with the past-the-end iterator of the second range.

As in **equal( )**, the first function tests for
equality using **operator==** while the second one uses
**binary_pred**.

Because the standard C++ **string** class is built like a
container (it has **begin( )** and **end( )** member functions
which produce objects of type **string::iterator**), it can be used to
conveniently create ranges of characters to test with the STL comparison
algorithms. However, you should note that **string **has a fairly complete
set of native operations, so you should look at the **string** class before
using the STL algorithms to perform operations.

//: C05:Comparison.cpp // The STL range comparison algorithms #include "PrintSequence.h" #include <vector> #include <algorithm> #include <functional> #include <string> using namespace std; int main() { // strings provide a convenient way to create // ranges of characters, but you should // normally look for native string operations: string s1("This is a test"); string s2("This is a Test"); cout << "s1: " << s1 << endl << "s2: " << s2 << endl; cout << "compare s1 & s1: " << equal(s1.begin(), s1.end(), s1.begin()) << endl; cout << "compare s1 & s2: " << equal(s1.begin(), s1.end(), s2.begin()) << endl; cout << "lexicographical_compare s1 & s1: " << lexicographical_compare(s1.begin(), s1.end(), s1.begin(), s1.end()) << endl; cout << "lexicographical_compare s1 & s2: " << lexicographical_compare(s1.begin(), s1.end(), s2.begin(), s2.end()) << endl; cout << "lexicographical_compare s2 & s1: " << lexicographical_compare(s2.begin(), s2.end(), s1.begin(), s1.end()) << endl; cout << "lexicographical_compare shortened " "s1 & full-length s2: " << endl; string s3(s1); while(s3.length() != 0) { bool result = lexicographical_compare( s3.begin(), s3.end(), s2.begin(),s2.end()); cout << s3 << endl << s2 << ", result = " << result << endl; if(result == true) break; s3 = s3.substr(0, s3.length() - 1); } pair<string::iterator, string::iterator> p = mismatch(s1.begin(), s1.end(), s2.begin()); print(p.first, s1.end(), "p.first", ""); print(p.second, s2.end(), "p.second",""); } ///:~

Note that the only difference between **s1** and **s2**
is the capital ‘T’ in **s2**’s “Test.”
Comparing **s1** and **s1** for equality yields **true**, as expected,
while **s1** and **s2** are not equal because of the capital
‘T’.

To understand the output of the
**lexicographical_compare( )** tests, you must remember two things:
first, the comparison is performed character-by-character, and second that
capital letters “precede” lowercase letters. In the first test,
**s1** is compared to **s1**. These are exactly equivalent, thus one is
*not* lexicographically less than the other (which is what the comparison
is looking for) and thus the result is **false**. The second test is asking
“does **s1** precede **s2**?” When the comparison gets to the
‘t’ in “test”, it discovers that the lowercase
‘t’ in **s1** is “greater” than the uppercase
‘T’ in **s2**, so the answer is again **false**. However, if
we test to see whether **s2** precedes **s1**, the answer is
**true**.

To further examine lexicographical comparison, the next test
in the above example compares **s1** with **s2** again (which returned
**false** before). But this time it repeats the comparison, trimming one
character off the end of **s1** (which is first copied into **s3**) each
time through the loop until the test evaluates to **true**. What you’ll
see is that, as soon as the uppercase ‘T’ is trimmed off of
**s3** (the copy of **s1**), then the characters, which are exactly equal
up to that point, no longer count and the fact that **s3** is shorter than
**s2** is what makes it lexicographically precede **s2**.

The final test uses** mismatch( )**. In order to
capture the return value, you must first create the appropriate **pair p**,
constructing the template using the iterator type from the first range and the
iterator type from the second range (in this case, both
**string::iterator**s). To print the results, the iterator for the mismatch
in the first range is **p.first**, and for the second range is
**p.second**. In both cases, the range is printed from the mismatch iterator
to the end of the range so you can see exactly where the iterator
points.

Because of the genericity of the STL, the concept of removal
is a bit constrained. Since elements can only be “removed” via
iterators, and iterators can point to arrays, vectors, lists, etc., it is not
safe or reasonable to actually try to destroy the elements that are being
removed, and to change the size of the input range **[first, last)** (an
array, for example, cannot have its size changed). So instead, what the STL
“remove” functions do is rearrange the sequence so that the
“removed” elements are at the end of the sequence, and the
“un-removed” elements are at the beginning of the sequence (in the
same order that they were before, minus the removed elements – that is,
this is a *stable* operation). Then the function will return an iterator to
the “new last” element of the sequence, which is the end of the
sequence without the removed elements and the beginning of the sequence of the
removed elements. In other words, if **new_last** is the iterator that is
returned from the “remove” function, then **[first, new_last)**
is the sequence without any of the removed elements, and **[new_last, last)**
is the sequence of removed elements.

If you are simply using your sequence, including the removed
elements, with more STL algorithms, you can just use **new_last** as the new
past-the-end iterator. However, if you’re using a resizable container **c
**(not an array) and you actually want to eliminate the removed elements from
the container you can use **erase( )** to do so, for example:

c.erase(remove(c.begin(), c.end(), value), c.end());

The return value of **remove( )** is the
**new_last** iterator, so **erase( )** will delete all the removed
elements from **c**.

The iterators in **[new_last, last)** are dereferenceable
but the element values are undefined and should not be used.

**ForwardIterator remove(ForwardIterator first,
ForwardIterator last, const T& value);****ForwardIterator
remove_if(ForwardIterator first, ForwardIterator last, **** Predicate
pred);****OutputIterator remove_copy(InputIterator first, InputIterator
last, **** OutputIterator result, const T&
value);****OutputIterator remove_copy_if(InputIterator first,
InputIterator last, **** OutputIterator result, Predicate
pred);**

Each of the “remove” forms moves through the range
**[first, last)**, finding values that match a removal criterion and copying
the un-removed elements over the removed elements (thus effectively removing
them). The original order of the un-removed elements is maintained. The return
value is an iterator pointing past the end of the range that contains none of
the removed elements. The values that this iterator points to are
unspecified.

The “if” versions pass each element to
**pred( )** to determine whether it should be removed or not (if
**pred( )** returns **true**, the element is removed). The
“copy” versions do not modify the original sequence, but instead
copy the un-removed values into a range beginning at **result**, and return
an iterator indicating the past-the-end value of this new range.

**ForwardIterator unique(ForwardIterator first,
ForwardIterator last);****ForwardIterator unique(ForwardIterator first,
ForwardIterator last, **** BinaryPredicate
binary_pred);****OutputIterator unique_copy(InputIterator first,
InputIterator last, **** OutputIterator
result);****OutputIterator unique_copy(InputIterator first, InputIterator
last, **** OutputIterator result, BinaryPredicate
binary_pred);**

Each of the “unique” functions moves through the
range **[first, last)**, finding adjacent values that are equivalent (that
is, duplicates) and “removing” the duplicate elements by copying
over them. The original order of the un-removed elements is maintained. The
return value is an iterator pointing past the end of the range that has the
adjacent duplicates removed.

Because only duplicates that are adjacent are removed,
it’s likely that you’ll want to call **sort( )** before
calling a “unique” algorithm, since that will guarantee that
*all* the duplicates are removed.

The versions containing **binary_pred** call, for each
iterator value **i** in the input range:

binary_pred(*i, *(i-1));

and if the result is true then ***(i-1)** is considered a
duplicate.

The “copy” versions do not modify the original
sequence, but instead copy the un-removed values into a range beginning at
**result**, and return an iterator indicating the past-the-end value of this
new range.

This example gives a visual demonstration of the way the
“remove” and “unique” functions work.

//: C05:Removing.cpp // The removing algorithms #include "PrintSequence.h" #include "Generators.h" #include <vector> #include <algorithm> #include <cctype> using namespace std; struct IsUpper { bool operator()(char c) { return isupper(c); } }; int main() { vector<char> v(50); generate(v.begin(), v.end(), CharGen()); print(v, "v", ""); // Create a set of the characters in v: set<char> cs(v.begin(), v.end()); set<char>::iterator it = cs.begin(); vector<char>::iterator cit; // Step through and remove everything: while(it != cs.end()) { cit = remove(v.begin(), v.end(), *it); cout << *it << "[" << *cit << "] "; print(v, "", ""); it++; } generate(v.begin(), v.end(), CharGen()); print(v, "v", ""); cit = remove_if(v.begin(), v.end(), IsUpper()); print(v.begin(), cit, "after remove_if", ""); // Copying versions are not shown for remove // and remove_if. sort(v.begin(), cit); print(v.begin(), cit, "sorted", ""); vector<char> v2; unique_copy(v.begin(), cit, back_inserter(v2)); print(v2, "unique_copy", ""); // Same behavior: cit = unique(v.begin(), cit, equal_to<char>()); print(v.begin(), cit, "unique", ""); } ///:~

The **vector<char> v** is filled with
randomly-generated characters and then copied into a **set**. Each element of
the **set** is used in a **remove** statement, but the entire **vector
v** is printed out each time so you can see what happens to the rest of the
range, after the resulting endpoint (which is stored in **cit**).

To demonstrate **remove_if( )**, the address of the
Standard C library function **isupper( ) **(in **<cctype> **is
called inside of the function object class **IsUpper**, an object of which
is** **passed as the predicate for **remove_if( )**.** **This only
returns **true** if a character is uppercase, so only lowercase characters
will remain. Here, the end of the range is used in the call to
**print( )** so only the remaining elements will appear. The copying
versions of **remove( )** and **remove_if( )** are not shown
because they are a simple variation on the non-copying versions which you should
be able to use without an example.

The range of lowercase letters is sorted in preparation for
testing the “unique” functions (the “unique” functions
are not undefined if the range isn’t sorted, but it’s probably not
what you want). First, **unique_copy( )** puts the unique elements into
a new **vector** using the default element comparison, and then the form of
**unique( )** that takes a predicate is used; the predicate used is the
built-in function object **equal_to( )**, which produces the same
results as the default element comparison.

There is a significant category of STL algorithms which
require that the range they operate on be in sorted order.

There is actually only one “sort” algorithm used
in the STL. This algorithm is presumably the fastest one, but the implementer
has fairly broad latitude. However, it comes packaged in various flavors
depending on whether the sort should be stable, partial or just the regular
sort. Oddly enough, only the partial sort has a copying version; otherwise
you’ll need to make your own copy before sorting if that’s what you
want. If you are working with a very large number of items you may be better off
transferring them to an array (or at least a **vector**, which uses an array
internally) rather than using them in some of the STL containers.

Once your sequence is sorted, there are many operations you
can perform on that sequence, from simply locating an element or group of
elements to merging with another sorted sequence or manipulating sequences as
mathematical sets.

Each algorithm involved with sorting or operations on sorted
sequences has two versions of each function, the first that uses the
object’s own **operator<** to perform the comparison, and the second
that uses an additional **StrictWeakOrdering** object’s
**operator( )(a, b)** to compare two objects for **a** **<**
**b**. Other than this there are no differences, so the distinction will not
be pointed out in the description of each algorithm.

One STL container (**list**) has its own built-in
**sort( )** function which is almost certainly going to be faster than
the generic sort presented here (especially since the **list **sort just
swaps pointers rather than copying entire objects around). This means that
you’ll only want to use the sort functions here if (a) you’re
working with an array or a sequence container that doesn’t have a
**sort( )** function or (b) you want to use one of the other sorting
flavors, like a partial or stable sort, which aren’t supported by
**list**’s **sort( )**.

**void sort(RandomAccessIterator first, RandomAccessIterator
last);****void sort(RandomAccessIterator first, RandomAccessIterator
last, **** StrictWeakOrdering binary_pred);**

Sorts **[first, last)** into ascending order. The second
form allows a comparator object to determine the order.

**void stable_sort(RandomAccessIterator first,
RandomAccessIterator last);****void stable_sort(RandomAccessIterator
first, RandomAccessIterator last, **** StrictWeakOrdering
binary_pred);**

Sorts **[first, last)** into ascending order, preserving
the original ordering of equivalent elements (this is important if elements can
be equivalent but not identical). The second form allows a comparator object to
determine the order.

**void partial_sort(RandomAccessIterator first, ****
RandomAccessIterator middle, RandomAccessIterator last);****void
partial_sort(RandomAccessIterator first, **** RandomAccessIterator
middle, RandomAccessIterator last, **** StrictWeakOrdering
binary_pred);**

Sorts the number of elements from **[first, last)** that
can be placed in the range **[first, middle)**. The rest of the elements end
up in **[middle, last)**, and have no guaranteed order. The second form
allows a comparator object to determine the order.

**RandomAccessIterator partial_sort_copy(InputIterator first,
InputIterator last, **** RandomAccessIterator result_first,
RandomAccessIterator result_last);****RandomAccessIterator
partial_sort_copy(InputIterator first, **** InputIterator last,
RandomAccessIterator result_first, **** RandomAccessIterator
result_last, StrictWeakOrdering binary_pred);**

Sorts the number of elements from **[first, last)** that
can be placed in the range **[result_first, result_last)**, and copies those
elements into **[result_first, result_last)**. If the range **[first,
last)** is smaller than **[result_first, result_last)**, then the smaller
number of elements is used. The second form allows a comparator object to
determine the order.

**void nth_element(RandomAccessIterator first, ****
RandomAccessIterator nth, RandomAccessIterator last);****void
nth_element(RandomAccessIterator first, **** RandomAccessIterator nth,
RandomAccessIterator last, **** StrictWeakOrdering
binary_pred);**

Just like **partial_sort( )**,
**nth_element( )** partially orders a range of elements. However,
it’s much “less ordered” than **partial_sort( )**. The
only thing that **nth_element( )** guarantees is that whatever
*location *you choose will become a dividing point. All the elements in the
range **[first, nth) **will be less than (they could also be equivalent to)
whatever element ends up at location **nth **and all the elements in the
range **(nth, last]** will be greater than whatever element ends up location
**nth**. However, neither range is in any particular order, unlike
**partial_sort( )** which has the first range in sorted order.

If all you need is this very weak ordering (if, for example,
you’re determining medians, percentiles and that sort of thing) this
algorithm is faster than **partial_sort( )**.

The **StreamTokenizer** class from the previous chapter is
used to break a file into words, and each word is turned into an **NString**
and added to a **deque<NString>**. Once the input file is completely
read, a **vector<NString> **is created from the contents of the
**deque**. The **vector** is then used to demonstrate the sorting
algorithms:

//: C05:SortTest.cpp //{L} ../C04/StreamTokenizer // Test different kinds of sorting #include "../C04/StreamTokenizer.h" #include "NString.h" #include "PrintSequence.h" #include "Generators.h" #include "../require.h" #include <algorithm> #include <fstream> #include <queue> #include <vector> #include <cctype> using namespace std; // For sorting NStrings and ignore string case: struct NoCase { bool operator()( const NString& x, const NString& y) { /* Somthing's wrong with this approach but I can't seem to see it. It would be much faster: const string& lv = x; const string& rv = y; int len = min(lv.size(), rv.size()); for(int i = 0; i < len; i++) if(tolower(lv[i]) < tolower(rv[i])) return true; return false; } */ // Brute force: copy, force to lowercase: string lv(x); string rv(y); lcase(lv); lcase(rv); return lv < rv; } void lcase(string& s) { int n = s.size(); for(int i = 0; i < n; i++) s[i] = tolower(s[i]); } }; int main(int argc, char* argv[]) { requireArgs(argc, 1); ifstream in(argv[1]); assure(in, argv[1]); StreamTokenizer words(in); deque<NString> nstr; string word; while((word = words.next()).size() != 0) nstr.push_back(NString(word)); print(nstr); // Create a vector from the contents of nstr: vector<NString> v(nstr.begin(), nstr.end()); sort(v.begin(), v.end()); print(v, "sort"); // Use an additional comparator object: sort(v.begin(), v.end(), NoCase()); print(v, "sort NoCase"); copy(nstr.begin(), nstr.end(), v.begin()); stable_sort(v.begin(), v.end()); print(v, "stable_sort"); // Use an additional comparator object: stable_sort(v.begin(), v.end(), greater<NString>()); print(v, "stable_sort greater"); copy(nstr.begin(), nstr.end(), v.begin()); // Partial sorts. The additional comparator // versions are obvious and not shown here. partial_sort(v.begin(), v.begin() + v.size()/2, v.end()); print(v, "partial_sort"); // Create a vector with a preallocated size: vector<NString> v2(v.size()/2); partial_sort_copy(v.begin(), v.end(), v2.begin(), v2.end()); print(v2, "partial_sort_copy"); // Finally, the weakest form of ordering: vector<int> v3(20); generate(v3.begin(), v3.end(), URandGen(50)); print(v3, "v3 before nth_element"); int n = 10; vector<int>::iterator vit = v3.begin() + n; nth_element(v3.begin(), vit, v3.end()); cout << "After ordering with nth = " << n << ", nth element is " << v3[n] << endl; print(v3, "v3 after nth_element"); } ///:~

The first class is a binary predicate used to compare two
**NString** objects while ignoring the case of the **string**s. You can
pass the object into the various sort routines to produce an alphabetic sort
(rather than the default lexicographic sort, which has all the capital letters
in one group, followed by all the lowercase letters).

As an example, try the source code for the above file as
input. Because the occurrence numbers are printed along with the strings you can
distinguish between an ordinary sort and a stable sort, and you can also see
what happens during a partial sort (the remaining unsorted elements are in no
particular order). There is no “partial stable sort.”

You’ll notice that the use of the second
“comparator” forms of the functions are not exhaustively tested in
the above example, but the use of a comparator is the same as in the first part
of the example.

The test of **nth_element** does not use the **NString**
objects because it’s simpler to see what’s going on if **int**s
are used. Notice that, whatever the nth element turns out to be (which will vary
from one run to another because of **URandGen**), the elements before that
are less, and after that are greater, but the elements have no particular order
other than that. Because of **URandGen**, there are no duplicates but if you
use a generator that allows duplicates you can see that the elements before the
nth element will be less than or equal to the nth element.

Once a range is sorted, there are a group of operations that
can be used to find elements within those ranges. In the following functions,
there are always two forms, one that assumes the intrinsic **operator<**
has been used to perform the sort, and the second that must be used if some
other comparison function object has been used to perform the sort. You must use
the same comparison for locating elements as you do to perform the sort,
otherwise the results are undefined. In addition, if you try to use these
functions on unsorted ranges the results will be undefined.

**bool binary_search(ForwardIterator first, ForwardIterator
last, const T& value);****bool binary_search(ForwardIterator first,
ForwardIterator last, const T& value, **** StrictWeakOrdering
binary_pred);**

Tells you whether **value** appears in the sorted range
**[first, last)**.

**ForwardIterator lower_bound(ForwardIterator first,
ForwardIterator last, **** const T&
value);****ForwardIterator lower_bound(ForwardIterator first,
ForwardIterator last, **** const T& value, StrictWeakOrdering
binary_pred);**

Returns an iterator indicating the first occurrence of
**value** in the sorted range **[first, last)**. Returns **last** if
**value** is not found.

**ForwardIterator upper_bound(ForwardIterator first,
ForwardIterator last, **** const T&
value);****ForwardIterator upper_bound(ForwardIterator first,
ForwardIterator last, **** const T& value, StrictWeakOrdering
binary_pred);**

Returns an iterator indicating one past the last occurrence of
**value** in the sorted range **[first, last)**. Returns **last** if
**value** is not found.

**pair<ForwardIterator, ForwardIterator> ****
equal_range(ForwardIterator first, ForwardIterator last, **** const
T& value);****pair<ForwardIterator, ForwardIterator>
**** equal_range(ForwardIterator first, ForwardIterator last,
**** const T& value, StrictWeakOrdering
binary_pred);**

Essentially combines **lower_bound( )** and
**upper_bound( )** to return a **pair** indicating the first and
one-past-the-last occurrences of **value** in the sorted range **[first,
last)**. Both iterators indicate **last** if **value** is not
found.

Here, we can use the approach from the previous
example:

//: C05:SortedSearchTest.cpp //{L} ../C04/StreamTokenizer // Test searching in sorted ranges #include "../C04/StreamTokenizer.h" #include "PrintSequence.h" #include "NString.h" #include "../require.h" #include <algorithm> #include <fstream> #include <queue> #include <vector> using namespace std; int main() { ifstream in("SortedSearchTest.cpp"); assure(in, "SortedSearchTest.cpp"); StreamTokenizer words(in); deque<NString> dstr; string word; while((word = words.next()).size() != 0) dstr.push_back(NString(word)); vector<NString> v(dstr.begin(), dstr.end()); sort(v.begin(), v.end()); print(v, "sorted"); typedef vector<NString>::iterator sit; sit it, it2; string f("include"); cout << "binary search: " << binary_search(v.begin(), v.end(), f) << endl; it = lower_bound(v.begin(), v.end(), f); it2 = upper_bound(v.begin(), v.end(), f); print(it, it2, "found range"); pair<sit, sit> ip = equal_range(v.begin(), v.end(), f); print(ip.first, ip.second, "equal_range"); } ///:~

The input is forced to be the source code for this file
because the word “include” will be used for a find string (since
“include” appears many times). The file is tokenized into words that
are placed into a **deque** (a better container when you don’t know how
much storage to allocate), and left unsorted in the **deque**. The
**deque** is copied into a **vector** via the appropriate constructor, and
the **vector** is sorted and printed.

The **binary_search( )** function only tells you if
the object is there or not; **lower_bound( )** and
**upper_bound( )** produce iterators to the beginning and ending
positions where the matching objects appear. The same effect can be produced
more succinctly using **equal_range( )** (as shown in the previous
chapter, with **multimap** and **multiset**).

As before, the first form of each function assumes the
intrinsic **operator<** has been used to perform the sort. The second form
must be used if some other comparison function object has been used to perform
the sort. You must use the same comparison for locating elements as you do to
perform the sort, otherwise the results are undefined. In addition, if you try
to use these functions on unsorted ranges the results will be
undefined.

**OutputIterator merge(InputIterator1 first1, InputIterator1
last1, **** InputIterator2 first2, InputIterator2 last2,
OutputIterator result);****OutputIterator merge(InputIterator1 first1,
InputIterator1 last1, **** InputIterator2 first2, InputIterator2
last2, OutputIterator result, **** StrictWeakOrdering
binary_pred);**

Copies elements from **[first1, last1)** and **[first2,
last2)** into **result**, such that the resulting range is sorted in
ascending order. This is a stable operation.

**void inplace_merge(BidirectionalIterator first, ****
BidirectionalIterator middle, BidirectionalIterator last);****void
inplace_merge(BidirectionalIterator first, **** BidirectionalIterator
middle, BidirectionalIterator last, **** StrictWeakOrdering
binary_pred);**

This assumes that **[first, middle)** and **[middle,
last)** are each sorted ranges. The two ranges are merged so that the
resulting range **[first, last)** contains the combined ranges in sorted
order.

It’s easier to see what goes on with merging if
**int**s are used; the following example also emphasizes how the algorithms
(and my own **print** template) work with arrays as well as
containers.

//: C05:MergeTest.cpp // Test merging in sorted ranges #include <algorithm> #include "PrintSequence.h" #include "Generators.h" using namespace std; int main() { const int sz = 15; int a[sz*2] = {0}; // Both ranges go in the same array: generate(a, a + sz, SkipGen(0, 2)); generate(a + sz, a + sz*2, SkipGen(1, 3)); print(a, a + sz, "range1", " "); print(a + sz, a + sz*2, "range2", " "); int b[sz*2] = {0}; // Initialize all to zero merge(a, a + sz, a + sz, a + sz*2, b); print(b, b + sz*2, "merge", " "); // set_union is a merge that removes duplicates set_union(a, a + sz, a + sz, a + sz*2, b); print(b, b + sz*2, "set_union", " "); inplace_merge(a, a + sz, a + sz*2); print(a, a + sz*2, "inplace_merge", " "); } ///:~

In **main( )**, instead of creating two separate
arrays both ranges will be created end-to-end in the same array **a** (this
will come in handy for the **inplace_merge**). The first call to
**merge( )** places the result in a different array, **b**. For
comparison, **set_union( )** is also called, which has the same
signature and similar behavior, except that it removes the duplicates. Finally,
**inplace_merge( )** is used to combine both parts of
**a**.

Once ranges have been sorted, you can perform mathematical set
operations on them.

**bool includes(InputIterator1 first1, InputIterator1 last1,
**** InputIterator2 first2, InputIterator2 last2);****bool
includes (InputIterator1 first1, InputIterator1 last1, ****
InputIterator2 first2, InputIterator2 last2, **** StrictWeakOrdering
binary_pred);**

Returns **true** if **[first2, last2)** is a subset of
**[first1, last1)**. Neither range is required to hold only unique elements,
but if **[first2, last2)** holds **n** elements of a particular value,
then **[first1, last1)** must also hold **n** elements if the result is to
be **true**.

**OutputIterator set_union(InputIterator1 first1,
InputIterator1 last1, **** InputIterator2 first2, InputIterator2
last2, OutputIterator result);****OutputIterator set_union(InputIterator1
first1, InputIterator1 last1, **** InputIterator2 first2,
InputIterator2 last2, OutputIterator result, **** StrictWeakOrdering
binary_pred);**

Creates the mathematical union of two sorted ranges in the
**result** range, returning the end of the output range. Neither input range
is required to hold only unique elements, but if a particular value appears
multiple times in both input sets, then the resulting set will contain the
larger number of identical values.

**OutputIterator set_intersection (InputIterator1 first1,
InputIterator1 last1, **** InputIterator2 first2, InputIterator2
last2, OutputIterator result);****OutputIterator set_intersection
(InputIterator1 first1, InputIterator1 last1, **** InputIterator2
first2, InputIterator2 last2, OutputIterator result, ****
StrictWeakOrdering binary_pred);**

Produces, in **result**, the intersection of the two input
sets, returning the end of the output range. That is, the set of values that
appear in both input sets. Neither input range is required to hold only unique
elements, but if a particular value appears multiple times in both input sets,
then the resulting set will contain the smaller number of identical
values.

**OutputIterator set_difference (InputIterator1 first1,
InputIterator1 last1, InputIterator2 first2, InputIterator2 last2,
OutputIterator result);****OutputIterator set_difference (InputIterator1
first1, InputIterator1 last1, **** InputIterator2 first2,
InputIterator2 last2, OutputIterator result, **** StrictWeakOrdering
binary_pred);**

Produces, in **result**, the mathematical set difference,
returning the end of the output range. All the elements that are in **[first1,
last1)** but not in **[first2, last2)** are placed in the result set.
Neither input range is required to hold only unique elements, but if a
particular value appears multiple times in both input sets (**n** times in
set 1 and **m** times in set 2), then the resulting set will contain
**max(n-m, 0)** copies of that value.

**OutputIterator set_symmetric_difference(InputIterator1
first1, **** InputIterator1 last1, InputIterator2 first2,
InputIterator2 last2, **** OutputIterator
result);****OutputIterator set_symmetric_difference(InputIterator1
first1, **** InputIterator1 last1, InputIterator2 first2,
InputIterator2 last2, **** OutputIterator result, StrictWeakOrdering
binary_pred);**

Constructs, in **result**, the set containing:

- All the elements in set 1 that are not in set 2
- All the elements in set 2 that are not in set 1.

Neither input range is required to hold only unique
elements, but if a particular value appears multiple times in both input sets
(**n** times in set 1 and **m** times in set 2), then the resulting set
will contain **abs(n-m)** copies of that value, where **abs( )** is
the absolute value. The return value is the end of the output range

It’s easiest to see the set operations demonstrated
using simple vectors of characters, so you view the sets more easily. These
characters are randomly generated and then sorted, but the duplicates are not
removed so you can see what the set operations do when duplicates are
involved.

//: C05:SetOperations.cpp // Set operations on sorted ranges #include <vector> #include <algorithm> #include "PrintSequence.h" #include "Generators.h" using namespace std; int main() { vector<char> v(50), v2(50); CharGen g; generate(v.begin(), v.end(), g); generate(v2.begin(), v2.end(), g); sort(v.begin(), v.end()); sort(v2.begin(), v2.end()); print(v, "v", ""); print(v2, "v2", ""); bool b = includes(v.begin(), v.end(), v.begin() + v.size()/2, v.end()); cout << "includes: " << (b ? "true" : "false") << endl; vector<char> v3, v4, v5, v6; set_union(v.begin(), v.end(), v2.begin(), v2.end(), back_inserter(v3)); print(v3, "set_union", ""); set_intersection(v.begin(), v.end(), v2.begin(), v2.end(), back_inserter(v4)); print(v4, "set_intersection", ""); set_difference(v.begin(), v.end(), v2.begin(), v2.end(), back_inserter(v5)); print(v5, "set_difference", ""); set_symmetric_difference(v.begin(), v.end(), v2.begin(), v2.end(), back_inserter(v6)); print(v6, "set_symmetric_difference",""); } ///:~

After **v** and **v2** are generated, sorted and
printed, the **includes( )** algorithm is tested by seeing if the entire
range of **v** contains the last half of **v**, which of course it does so
the result should always be true. The vectors **v3**, **v4**, **v5**
and **v6** are created to hold the output of **set_union( )**,
**set_intersection( )**, **set_difference( )** and
**set_symmetric_difference( )**, and the results of each are displayed
so you can ponder them and convince yourself that the algorithms do indeed work
as promised.

The heap operations in the STL are primarily concerned with
the creation of the STL **priority_queue**, which provides efficient access
to the “largest” element, whatever “largest” happens to
mean for your program. These were discussed in some detail in the previous
chapter, and you can find an example there.

As with the “sort” operations, there are two
versions of each function, the first that uses the object’s own
**operator<** to perform the comparison, the second that uses an
additional **StrictWeakOrdering** object’s **operator( )(a,
b)** to compare two objects for **a** **<** **b**.

**void make_heap(RandomAccessIterator first,
RandomAccessIterator last);****void make_heap(RandomAccessIterator first,
RandomAccessIterator last, **** StrictWeakOrdering
binary_pred);**

Turns an arbitrary range into a heap. A heap is just a range
that is organized in a particular way.

**void push_heap(RandomAccessIterator first,
RandomAccessIterator last);****void push_heap(RandomAccessIterator first,
RandomAccessIterator last, **** StrictWeakOrdering
binary_pred);**

Adds the element *(**last-1)** to the heap determined by
the range **[first, last-1)**. Yes, it seems like an odd way to do things but
remember that the **priority_queue** container presents the nice interface to
a heap, as shown in the previous chapter.

**void pop_heap(RandomAccessIterator first,
RandomAccessIterator last);****void pop_heap(RandomAccessIterator first,
RandomAccessIterator last, **** StrictWeakOrdering
binary_pred);**

Places the largest element (which is actually in
***first**, before the operation, because of the way heaps are defined) into
the position ***(last-1)** *and* reorganizes the remaining range so that
it’s still in heap order. If you simply grabbed ***first**, the next
element would not be the next-largest element so you must use
**pop_heap( )** if you want to maintain the heap in its proper
priority-queue order.

**void sort_heap(RandomAccessIterator first,
RandomAccessIterator last);****void sort_heap(RandomAccessIterator first,
RandomAccessIterator last, **** StrictWeakOrdering
binary_pred);**

This could be thought of as the complement of
**make_heap( )**, since it takes a range that is in heap order and turns
it into ordinary sorted order, so it is no longer a heap. That means that if you
call **sort_heap( )** you can no longer use **push_heap( )** or
**pop_heap( )** on that range (rather, you can use those functions but
they won’t do anything sensible). This is not a stable
sort.

These algorithms move through the entire range and perform an
operation on each element. They differ in what they do with the results of that
operation: **for_each( )** discards the return value of the operation
(but returns the function object that has been applied to each element), while
**transform( )** places the results of each operation into a destination
sequence (which can be the original sequence).

**UnaryFunction for_each(InputIterator first, InputIterator
last, UnaryFunction f); **

Applies the function object **f** to each element in
**[first, last)**, discarding the return value from each individual
application of **f**. If **f **is just a function pointer then you are
typically not interested in the return value, but if **f **is an object that
maintains some internal state it can capture the combined return value of being
applied to the range. The final return value of **for_each( )** is
**f**.

**OutputIterator transform(InputIterator first, InputIterator
last, **** OutputIterator result, UnaryFunction
f);****OutputIterator transform(InputIterator1 first, InputIterator1
last, **** InputIterator2 first2, OutputIterator result,
BinaryFunction f);**

Like **for_each( )**, **transform( )** applies
a function object **f** to each element in the range **[first, last)**.
However, instead of discarding the result of each function call,
**transform( )** copies the result (using **operator=**) into
***result**, incrementing **result** after each copy (the sequence pointed
to by **result** must have enough storage, otherwise you should use an
inserter to force insertions instead of assignments).

The first form of **transform( )** simply calls
**f( )** and passes it each object from the input range as an argument.
The second form passes an object from the first input range and one from the
second input range as the two arguments to the binary function **f **(note
the length of the second input range is determined by the length of the first).
The return value in both cases is the past-the-end iterator for the resulting
output range.

Since much of what you do with objects in a container is to
apply an operation to all of those objects, these are fairly important
algorithms and merit several illustrations.

First, consider **for_each( )**. This sweeps through
the range, pulling out each element and passing it as an argument as it calls
whatever function object it’s been given. Thus **for_each( )**
performs operations that you might normally write out by hand. In
**Stlshape.cpp**, for example:

for(Iter j = shapes.begin(); j != shapes.end(); j++) delete *j;

If you look in your compiler’s
header file at the template defining **for_each( )**, you’ll see
something like this:

template <class InputIterator, class Function> Function for_each(InputIterator first, InputIterator last, Function f) { while (first != last) f(*first++); return f; }

**Function** **f **looks at first like it must be
a pointer to a function which takes, as an argument, an object of whatever
**InputIterator** selects. However, the above template actually only says
that you must be able to call **f** using parentheses and an argument. This
is true for a function pointer, but it’s also true for a function object
– any class that defines the appropriate **operator( )**.**
**The following example shows several different ways this template can be
expanded. First, we need a class that keeps track of its objects so we can know
that it’s being properly destroyed:

//: C05:Counted.h // An object that keeps track of itself #ifndef COUNTED_H #define COUNTED_H #include <vector> #include <iostream> class Counted { static int count; char* ident; public: Counted(char* id) : ident(id) { count++; } ~Counted() { std::cout << ident << " count = " << --count << std::endl; } }; int Counted::count = 0; class CountedVector : public std::vector<Counted*> { public: CountedVector(char* id) { for(int i = 0; i < 5; i++) push_back(new Counted(id)); } }; #endif // COUNTED_H ///:~

The **class Counted** keeps a static count of how many
**Counted** objects have been created, and tells you as they are destroyed.
In addition, each **Counted** keeps a **char*** identifier to make
tracking the output easier.

The **CountedVector** is inherited from
**vector<Counted*>**, and in the constructor it creates some
**Counted** objects, handing each one your desired **char***. The
**CountedVector** makes testing quite simple, as you’ll see.

//: C05:ForEach.cpp // Use of STL for_each() algorithm #include "Counted.h" #include <iostream> #include <vector> #include <algorithm> using namespace std; // Simple function: void destroy(Counted* fp) { delete fp; } // Function object: template<class T> class DeleteT { public: void operator()(T* x) { delete x; } }; // Template function: template <class T> void wipe(T* x) { delete x; } int main() { CountedVector A("one"); for_each(A.begin(), A.end(), destroy); CountedVector B("two"); for_each(B.begin(),B.end(),DeleteT<Counted>()); CountedVector C("three"); for_each(C.begin(), C.end(), wipe<Counted>); } ///:~

In **main( )**, the first approach is the simple
pointer-to-function, which works but has the drawback that you must write a new
**Destroy** function for each different type. The obvious solution is to make
a template, which is shown in the second approach with a templatized function
object. On the other hand, approach three also makes sense: template functions
work as well.

Since this is obviously something you might want to do a lot,
why not create an algorithm to **delete** all the pointers in a container?
This was accomplished with the **purge( )** template created in the
previous chapter. However, that used explicitly-written code; here, we could use
**transform( )**. The value of **transform( )** over
**for_each( )** is that **transform( )** assigns the result of
calling the function object into a resulting range, which can actually be the
input range. That case means a literal transformation for the input range, since
each element would be a modification of its previous value. In the above example
this would be especially useful since it’s more appropriate to assign each
pointer to the safe value of zero after calling **delete** for that pointer.
**Transform( )** can easily do this:

//: C05:Transform.cpp // Use of STL transform() algorithm #include "Counted.h" #include <iostream> #include <vector> #include <algorithm> using namespace std; template<class T> T* deleteP(T* x) { delete x; return 0; } template<class T> struct Deleter { T* operator()(T* x) { delete x; return 0; } }; int main() { CountedVector cv("one"); transform(cv.begin(), cv.end(), cv.begin(), deleteP<Counted>); CountedVector cv2("two"); transform(cv2.begin(), cv2.end(), cv2.begin(), Deleter<Counted>()); } ///:~

This shows both approaches: using a template function or a
templatized function object. After the call to **transform( )**, the
vector contains zero pointers, which is safer since any duplicate **delete**s
will have no effect.

One thing you cannot do is **delete** every pointer in a
collection without wrapping the call to **delete **inside a function or an
object. That is, you don’t want to say something like this:

for_each(a.begin(), a.end(), ptr_fun(operator delete));

You can say it, but what you’ll get is a sequence of
calls to the function that releases the storage. You will not get the effect of
calling **delete** for each pointer in **a**, however; the destructor will
not be called. This is typically not what you want, so you will need wrap your
calls to **delete**.

In the previous example of **for_each( )**, the return
value of the algorithm was ignored. This return value is the function that is
passed in to **for_each( )**. If the function is just a pointer to a
function, then the return value is not very useful, but if it is a function
object, then that function object may have internal member data that it uses to
accumulate information about all the objects that it sees during
**for_each( )**.

For example, consider a simple model of inventory. Each
**Inventory** object has the type of product it represents (here, single
characters will be used for product names), the quantity of that product and the
price of each item:

//: C05:Inventory.h #ifndef INVENTORY_H #define INVENTORY_H #include <iostream> #include <cstdlib> #include <ctime> class Inventory { char item; int quantity; int value; public: Inventory(char it, int quant, int val) : item(it), quantity(quant), value(val) {} // Synthesized operator= & copy-constructor OK char getItem() const { return item; } int getQuantity() const { return quantity; } void setQuantity(int q) { quantity = q; } int getValue() const { return value; } void setValue(int val) { value = val; } friend std::ostream& operator<<( std::ostream& os, const Inventory& inv) { return os << inv.item << ": " << "quantity " << inv.quantity << ", value " << inv.value; } }; // A generator: struct InvenGen { InvenGen() { std::srand(std::time(0)); } Inventory operator()() { static char c = 'a'; int q = std::rand() % 100; int v = std::rand() % 500; return Inventory(c++, q, v); } }; #endif // INVENTORY_H ///:~

There are member functions to get the item name, and to get
and set quantity and value. An **operator<<** prints the
**Inventory** object to an **ostream**. There’s also a generator
that creates objects that have sequentially-labeled items and random quantities
and values. Note the use of the return value optimization in
**operator( )**.

To find out the total number of items and total value, you can
create a function object to use with **for_each( )** that has data
members to hold the totals:

//: C05:CalcInventory.cpp // More use of for_each() #include "Inventory.h" #include "PrintSequence.h" #include <vector> #include <algorithm> using namespace std; // To calculate inventory totals: class InvAccum { int quantity; int value; public: InvAccum() : quantity(0), value(0) {} void operator()(const Inventory& inv) { quantity += inv.getQuantity(); value += inv.getQuantity() * inv.getValue(); } friend ostream& operator<<(ostream& os, const InvAccum& ia) { return os << "total quantity: " << ia.quantity << ", total value: " << ia.value; } }; int main() { vector<Inventory> vi; generate_n(back_inserter(vi), 15, InvenGen()); print(vi, "vi"); InvAccum ia = for_each(vi.begin(),vi.end(), InvAccum()); cout << ia << endl; } ///:~

**InvAccum**’s **operator( )** takes a single
argument, as required by **for_each( )**. As **for_each( )**
moves through its range, it takes each object in that range and passes it to
**InvAccum::operator( )**, which performs calculations and saves the
result. At the end of this process, **for_each( )** returns the
**InvAccum** object which you can then examine; in this case it is simply
printed.

You can do most things to the **Inventory** objects using
**for_each( )**. For example, if you wanted to increase all the prices
by 10%, **for_each( )** could do this handily. But you’ll notice
that the **Inventory** objects have no way to change the **item** value.
The programmers who designed **Inventory** thought this was a good idea,
after all, why would you want to change the name of an item? But marketing has
decided that they want a “new, improved” look by changing all the
item names to uppercase; they’ve done studies and determined that the new
names will boost sales (well, marketing has to have *something* to do ...).
So **for_each( )** will not work here, but **transform( )**
will:

//: C05:TransformNames.cpp // More use of transform() #include "Inventory.h" #include "PrintSequence.h" #include <vector> #include <algorithm> #include <cctype> using namespace std; struct NewImproved { Inventory operator()(const Inventory& inv) { return Inventory(toupper(inv.getItem()), inv.getQuantity(), inv.getValue()); } }; int main() { vector<Inventory> vi; generate_n(back_inserter(vi), 15, InvenGen()); print(vi, "vi"); transform(vi.begin(), vi.end(), vi.begin(), NewImproved()); print(vi, "vi"); } ///:~

Notice that the resulting range is the same as the input
range, that is, the transformation is performed in-place.

Now suppose that the sales department needs to generate
special price lists with different discounts for each item. The original list
must stay the same, and there need to be any number of generated special lists.
Sales will give you a separate list of discounts for each new list. To solve
this problem we can use the second version of
**transform( )**:

//: C05:SpecialList.cpp // Using the second version of transform() #include "Inventory.h" #include "PrintSequence.h" #include <vector> #include <algorithm> #include <cstdlib> #include <ctime> using namespace std; struct Discounter { Inventory operator()(const Inventory& inv, float discount) { return Inventory(inv.getItem(), inv.getQuantity(), inv.getValue() * (1 - discount)); } }; struct DiscGen { DiscGen() { srand(time(0)); } float operator()() { float r = float(rand() % 10); return r / 100.0; } }; int main() { vector<Inventory> vi; generate_n(back_inserter(vi), 15, InvenGen()); print(vi, "vi"); vector<float> disc; generate_n(back_inserter(disc), 15, DiscGen()); print(disc, "Discounts:"); vector<Inventory> discounted; transform(vi.begin(),vi.end(), disc.begin(), back_inserter(discounted), Discounter()); print(discounted, "discounted"); } ///:~

**Discounter** is a function object that, given an
**Inventory** object and a discount percentage, produces a new
**Inventory** with the discounted price. **DiscGen** just generates random
discount values between 1 and 10 percent to use for testing. In
**main( )**, two **vector**s are created, one for **Inventory**
and one for discounts. These are passed to **transform( )** along with a
**Discounter** object, and **transform( ) **fills a new
**vector<Inventory>** called
**discounted**.

These algorithms are all tucked into the header
**<numeric>**, since they are primarily useful for performing numerical
calculations.

**<numeric>****T accumulate(InputIterator first,
InputIterator last, T result);****T accumulate(InputIterator first,
InputIterator last, T result, **** BinaryFunction f);**

The first form is a generalized summation; for each element
pointed to by an iterator **i** in **[first, last)**, it performs the
operation **result = result + *i**, where **result** is of type **T**.
However, the second form is more general; it applies the function **f(result,
*i)** on each element ***i** in the range from beginning to end. The value
**result** is initialized in both cases by **resultI**, and if the range
is empty then **resultI** is returned.

Note the similarity between the second form of
**transform( )** and the second form of **accumulate( )**.

**<numeric>****T inner_product(InputIterator1
first1, InputIterator1 last1, **** InputIterator2 first2, T
init);****T inner_product(InputIterator1 first1, InputIterator1 last1,
**** InputIterator2 first2, T init**** BinaryFunction1 op1,
BinaryFunction2 op2);**

Calculates a generalized inner product of the two ranges
**[first1, last1)** and **[first2, first2 + (last1 - first1))**. The
return value is produced by multiplying the element from the first sequence by
the “parallel” element in the second sequence, and then adding it to
the sum. So if you have two sequences {1, 1, 2, 2} and {1, 2, 3, 4} the inner
product becomes:

(1*1) + (1*2) + (2*3) + (2*4)

Which is 17. The **init** argument is the initial value for
the inner product; this is probably zero but may be anything and is especially
important for an empty first sequence, because then it becomes the default
return value. The second sequence must have at least as many elements as the
first.

While the first form is very specifically mathematical, the
second form is simply a multiple application of functions and could conceivably
be used in many other situations. The **op1** function is used in place of
addition, and **op2** is used instead of multiplication. Thus, if you applied
the second version of **inner_product( )** to the above sequence, the
result would be the following operations:

init = op1(init, op2(1,1)); init = op1(init, op2(1,2)); init = op1(init, op2(2,3)); init = op1(init, op2(2,4));

Thus it’s similar to **transform( )** but two
operations are performed instead of one.

**<numeric>****OutputIterator
partial_sum(InputIterator first, InputIterator last, ****
OutputIterator result);****OutputIterator partial_sum(InputIterator
first, InputIterator last, **** OutputIterator result, BinaryFunction
op);**

Calculates a generalized partial sum. This means that a new
sequence is created, beginning at **result**, where each element is the sum
of all the elements up to the currently selected element in **[first,
last)**. For example, if the original sequence is **{1, 1, 2, 2, 3}** then
the generated sequence is **{1, 1 + 1, 1 + 1 + 2, 1 + 1 + 1 + 2 + 2, 1 + 1 + 1
+ 2 + 2 + 3}**, that is, **{1, 2, 4, 6, 9}**.

In the second version, the binary function **op** is used
instead of the **+** operator to take all the “summation” up to
that point and combine it with the new value. For example, if you use
**multiplies<int>( )** as the object for the above sequence, the
output is **{1, 1, 2, 4, 12}**. Note that the first output value is always
the same as the first input value.

The return value is the end of the output range **[result,
result + (last - first) )**.

**<numeric>****OutputIterator
adjacent_difference(InputIterator first, InputIterator last, ****
OutputIterator result);****OutputIterator
adjacent_difference(InputIterator first, InputIterator last,****
OutputIterator result, BinaryFunction op);**

Calculates the differences of adjacent elements throughout the
range **[first, last)**. This means that in the new sequence, the value is
the value of the difference of the current element and the previous element in
the original sequence (the first value is the same). For example, if the
original sequence is **{1, 1, 2, 2, 3}**, the resulting sequence is **{1, 1
– 1, 2 – 1, 2 – 2, 3 – 2}**, that is: **{1, 0, 1, 0,
1}**.

The second form uses the binary function **op** instead of
the **–** operator to perform the “differencing.” For
example, if you use **multiplies<int>( )** as the function object
for the above sequence, the output is **{1, 1, 2, 4, 6}**.

The return value is the end of the output range **[result,
result + (last - first) )**.

This program tests all the algorithms in
**<numeric>** in both forms, on integer arrays. You’ll notice
that in the test of the form where you supply the function or functions, the
function objects used are the ones that produce the same result as form one so
the results produced will be exactly the same. This should also demonstrate a
bit more clearly the operations that are going on, and how to substitute your
own operations.

//: C05:NumericTest.cpp #include "PrintSequence.h" #include <numeric> #include <algorithm> #include <iostream> #include <iterator> #include <functional> using namespace std; int main() { int a[] = { 1, 1, 2, 2, 3, 5, 7, 9, 11, 13 }; const int asz = sizeof a / sizeof a[0]; print(a, a + asz, "a", " "); int r = accumulate(a, a + asz, 0); cout << "accumulate 1: " << r << endl; // Should produce the same result: r = accumulate(a, a + asz, 0, plus<int>()); cout << "accumulate 2: " << r << endl; int b[] = { 1, 2, 3, 4, 1, 2, 3, 4, 1, 2 }; print(b, b + sizeof b / sizeof b[0], "b", " "); r = inner_product(a, a + asz, b, 0); cout << "inner_product 1: " << r << endl; // Should produce the same result: r = inner_product(a, a + asz, b, 0, plus<int>(), multiplies<int>()); cout << "inner_product 2: " << r << endl; int* it = partial_sum(a, a + asz, b); print(b, it, "partial_sum 1", " "); // Should produce the same result: it = partial_sum(a, a + asz, b, plus<int>()); print(b, it, "partial_sum 2", " "); it = adjacent_difference(a, a + asz, b); print(b, it, "adjacent_difference 1"," "); // Should produce the same result: it = adjacent_difference(a, a + asz, b, minus<int>()); print(b, it, "adjacent_difference 2"," "); } ///:~

Note that the return value of **inner_product( )** and
**partial_sum( )** is the past-the-end iterator for the resulting
sequence, so it is used as the second iterator in the **print( )**
function.

Since the second form of each function allows you to provide
your own function object, only the first form of the functions is purely
“numeric.” You could conceivably do some things that are not
intuitively numeric with something like
**inner_product( )**.

Finally, here are some basic tools that are used with the
other algorithms; you may or may not use them directly yourself.

**<utility>****struct
pair;****make_pair( );**

This was described and used in the previous chapter and in
this one. A **pair** is simply a way to package two objects (which may be of
different types) together into a single object. This is typically used when you
need to return more than one object from a function, but it can also be used to
create a container that holds **pair **objects, or to pass more than one
object as a single argument. You access the elements by saying **p.first**
and **p.second**, where **p** is the **pair** object. The function
**equal_range( )**, described in the last chapter and in this one,
returns its result as a **pair** of iterators. You can **insert( )**
a **pair** directly into a **map** or **multimap**; a **pair** is
the **value_type** for those containers.

If you want to create a **pair**, you typically use the
template function **make_pair( )** rather than explicitly constructing a
**pair** object.

**<iterator>****distance(InputIterator first,
InputIterator last);**

Tells you the number of elements between **first** and
**last**. More precisely, it returns an integral value that tells you the
number of times **first** must be incremented before it is equal to
**last**. No dereferencing of the iterators occurs during this
process.

**<iterator>****void advance(InputIterator&
i, Distance n);**

Moves the iterator **i** forward by the value of **n**
(the iterator can also be moved backward for negative values of **n** if the
iterator is also a bidirectional iterator). This algorithm is aware of
bidirectional iterators, and will use the most efficient approach.

**<iterator>****back_insert_iterator<Container>
back_inserter(Container&
x);****front_insert_iterator<Container>
front_inserter(Container& x);****insert_iterator<Container>
inserter(Container& x, Iterator i);**

These functions are used to create iterators for the given
containers that will insert elements into the container, rather than overwrite
the existing elements in the container using **operator= **(which is the
default behavior). Each type of iterator uses a different operation for
insertion: **back_insert_iterator** uses **push_back( )**,
**front_insert_iterator** uses **push_front( )** and
**insert_iterator** uses **insert( )** (and thus it can be used with
the associative containers, while the other two can be used with sequence
containers). These were shown in some detail in the previous chapter, and also
used in this chapter.

**const LessThanComparable& min(const
LessThanComparable& a, **** const LessThanComparable&
b);****const T& min(const T& a, const T& b, BinaryPredicate
binary_pred);**

Returns the lesser of its two arguments, or the first argument
if the two are equivalent. The first version performs comparisons using
**operator<** and the second passes both arguments to **binary_pred**
to perform the comparison.

**const LessThanComparable& max(const
LessThanComparable& a, **** const LessThanComparable&
b);****const T& max(const T& a, const T& b, BinaryPredicate
binary_pred);**

Exactly like **min( )**, but returns the greater of
its two arguments.

**void swap(Assignable& a, Assignable&
b);****void iter_swap(ForwardIterator1 a, ForwardIterator2
b);**

Exchanges the values of **a** and **b** using
assignment. Note that all container classes use specialized versions of
**swap( )** that are typically more efficient than this general
version.

Once you become comfortable with the STL algorithm style, you
can begin to create your own STL-style algorithms. Because these will conform to
the format of all the other algorithms in the STL, they’re easy to use for
programmers who are familiar with the STL, and thus become a way to
“extend the STL vocabulary.”

The easiest way to approach the problem is to go to the
**<algorithm>** header file and find something similar to what you
need, and modify that (virtually all STL implementations provide the code for
the templates directly in the header files). For example, an algorithm that
stands out by its absence is **copy_if( ) **(the closest approximation
is **partition( )**), which was used in **Binder1.cpp** at the
beginning of this chapter, and in several other examples in this chapter. This
will only copy an element if it satisfies a predicate. Here’s an
implementation:

//: C05:copy_if.h // Roll your own STL-style algorithm #ifndef COPY_IF_H #define COPY_IF_H template<typename ForwardIter, typename OutputIter, typename UnaryPred> OutputIter copy_if(ForwardIter begin, ForwardIter end, OutputIter dest, UnaryPred f) { while(begin != end) { if(f(*begin)) *dest++ = *begin; begin++; } return dest; } #endif // COPY_IF_H ///:~

The return value is the past-the-end iterator for the
destination sequence (the copied sequence).

Now that you’re comfortable with the ideas of the
various iterator types, the actual implementation is quite straightforward. You
can imagine creating an entire additional library of your own useful algorithms
that follow the format of the STL.

The goal of this chapter, and the previous one, was to give
you a programmer’s-depth understanding of the containers and algorithms in
the Standard Template Library. That is, to make you aware of and comfortable
enough with the STL that you begin to use it on a regular basis (or at least, to
think of using it so you can come back here and hunt for the appropriate
solution). It is powerful not only because it’s a reasonably complete
library of tools, but also because it provides a vocabulary for thinking about
problem solutions, and because it is a framework for creating additional
tools.

Although this chapter and the last did show some examples of
creating your own tools, I did not go into the full depth of the theory of the
STL that is necessary to completely understand all the STL nooks and crannies to
allow you to create tools more sophisticated than those shown here. I did not do
this partially because of space limitations, but mostly because it is beyond the
charter of this book; my goal here is to give you practical understanding that
will affect your day-to-day programming skills.

There are a number of books dedicated solely to the STL (these
are listed in the appendices), but the two that I learned the most from, in
terms of the theory necessary for tool creation, were first, *Generic
Programming and the STL* by Matthew H. Austern, Addison-Wesley 1999 (this
also covers all the SGI extensions, which Austern was instrumental in creating),
and second (older and somewhat out of date, but still quite valuable), *C++
Programmer’s Guide to the Standard Template Library* by Mark Nelson,
IDG press 1995.

- Create a generator that returns the current value of
**clock( )**(in**<ctime>**). Create a**list<clock_t>**and fill it with your generator using**generate_n( )**. Remove any duplicates in the list and print it to**cout**using**copy( )**. - Modify
**Stlshape.cpp**from chapter XXX so that it uses**transform( )**to delete all its objects. - Using
**transform( )**and**toupper( )**(in**<cctype>**) write a single function call that will convert a**string**to all uppercase letters. - Create a
**Sum**function object template that will accumulate all the values in a range when used with**for_each( )**. - Write an anagram generator that takes a word as a command-line argument and produces all possible permutations of the letters.
- Write a “sentence anagram generator” that takes a sentence as a command-line argument and produces all possible permutations of the words in the sentence (it leaves the words alone, just moves them around).
- Create a class hierarchy with a base class
**B**and a derived class**D**. Put a**virtual**member function**void****f( )**in**B**such that it will print a message indicating that**B**’s**f( )**has been called, and redefine this function for**D**to print a different message. Create a**deque<B*>**and fill it with**B**and**D**objects. Use**for_each( )**to call**f( )**for each of the objects in your**deque**. - Modify
**FunctionObjects.cpp**so that it uses**float**instead of**int**. - Modify
**FunctionObjects.cpp**so that it templatizes the main body of tests so you can choose which type you’re going to test (you’ll have to pull most of**main( )**out into a separate template function). - Using
**transform( )**,**toupper( )**and**tolower( )**(in**<ccytpe>**), create two functions such that the first takes a**string**object and returns that**string**with all the letters in uppercase, and the second returns a**string**with all the letters in lowercase. - Create a
container of containers of
**Noisy**objects, and sort them. Now write a template for your sorting test (to use with the three basic sequence containers), and compare the performance of the different container types. - Write a program that takes as a command line argument the name of a
text file. Open this file and read it a word at a time (hint: use
**>>**). Store each word into a**deque<string>**. Force all the words to lowercase, sort them, remove all the duplicates and print the results. - Write a program that finds all the words that are in common between
two input files, using
**set_intersection( )**. Change it to show the words that are not in common, using**set_symmetric_difference( )**. - Create a program that, given an
integer on the command line, creates a “factorial table” of all the
factorials up to and including the number on the command line. To do this, write
a generator to fill a
**vector<int>**, then use**partial_sum( )**with a standard function object. - Modify
**CalcInventory.cpp**so that it will find all the objects that have a quantity that’s less than a certain amount. Provide this amount as a command-line argument, and use**copy_if( )**and**bind2nd( )**to create the collection of values less than the target value. - Create
template function objects that perform bitwise operations for
**&**,**|**,**^**and**~**. Test these with a**bitset**. - Fill a
**vector<double>**with numbers representing angles in radians. Using function object composition, take the sine of all the elements in your vector (see**<cmath>**). - Create a
**map**which is a cosine table where the keys are the angles in degrees and the values are the cosines. Use**transform( )**with**cos( )**(in**<cmath>**) to fill the table. - Write a program to compare the speed of sorting a
**list**using**list::sort( )**vs. using**std::sort( )**(the STL algorithm version of**sort( )**). Hint: see the timing examples in the previous chapter. - Create and test a
**logical_xor**function object template to implement a logical exclusive-*or*. - Create
an STL-style algorithm
**transform_if( )**following the first form of**transform( )**which only performs transformations on objects that satisfy a unary predicate. - Create an STL-style algorithm which is an
overloaded version of
**for_each( )**that follows the second form of**transform( )**and takes two input ranges so it can pass the objects of the second input range a to a binary function which it applies to each object of the first range. - Create a
**Matrix**class which is made from a**vector<vector<int> >**. Provide it with a**friend ostream& operator<<(ostream&, const Matrix&)**to display the matrix. Create the following using the STL algorithms where possible (you may need to look up the mathematical meanings of the matrix operations if you don’t remember them):**operator+(const Matrix&, const Matrix&)**for**Matrix**addition,**operator*(const Matrix&, const vector<int>&)**for multiplying a matrix by a vector, and**operator*(const Matrix&, const Matrix&)**for matrix multiplication. Demonstrate each. - Templatize the
**Matrix**class and associated operations from the previous example so they will work with any appropriate type.

[ Previous Chapter ]
[ Short TOC ]
[ Table of Contents ]
[ Index ]
[ Next Chapter ]

Last Update:02/22/2000

Last Update:02/22/2000