Functions

oop

Learning Objectives

Explain why the use of functions is desirable and write C++ programs that make use of them.

Explain the different types of variable scope in C++ and declare and use them appropriately.

Explain the difference between passing arguments by value and reference and use this in C++ programs

Implement recursive and iterative solutions to problems and compare and contrast them with regards to efficiency and simplicity

Code reuse

Previously we described how conditional and iterative control structures are implemented in C++, and in this chapter we list the third type of control structure, subprograms.

Often, to perform the same of a similar sequence of operations in different parts of our program rather than enter the same statements multiple times, it is better to reuse a single piece of code. This is considered good practice because it can make code:

less work to write
less work to modify
less prone to error
more readable, helping others to make changes

Functions

A function is a sequence of statements defined by the programmer that is identified by name. The components of a function are:

Arguments or values that are passed to the function from the program (the types of each must be specified at compile time).
Return type or the value of a specified type which is returned.
Body which is where the programmer defines the sequence of statements to be executed.

Basic function syntax

Return_type function_name (argument_list)
{
  statements
}

To illustrate the use of a function, consider the following example program which calculates the standard error of a data sample ( $ \frac{standard \ deviation}{\sqrt{sample \ size}} $ ) using a function named std_err:

Standard Error Program

#include <iostream>
#include <math.h>
using namespace std;

// function prototype
double std_err (double std_dev, int n);

int main() {
  const double std_dev = 3.0;
  for (int i = 1; i < 30; i++)
    cout << "Std dev = " << std_dev 
         << ", Sample Size = " << i
         << ", Std err = "
         << std_err(std_dev,i)
         << endl;
  return 0;  
}

// function signature
double std_err (double std_dev, int n)
{
  return std_dev / sqrt(n);
}

The fourth line of the program contains double std_err (double std_dev, int n); which is the function prototype. This tells the compiler that the std_err function exists and it will be fully defined later. Although it is possible to include the entire function definition before the main function definition or even in header files.

The last four lines of the code above represent the function signature and it contains the following information:

Return type - double
Function name - std_err
Argument list - double std_dev, int n
Function body - return std_dev / sqrt(n);

The function call denotes where the function is called, and in this case it is << std_err(std_dev,size) << endl;. The function call will return a value of the return type.

Note that in C++ functions only return a single value but there are ways around this.

Developing subjects with multiple source files

With larger programs, code readability can be improved if related code/functions are separated into separate source files which can also promote code reuse. It is typical to store:

Function prototypes in a “.h” header file
Function bodies in a “.cpp” source file

#include can be used to make sure that the appropriate code is available.

Default values for arguments

Default values can also be specified for function arguments, making it possible to omit the argument when the function is called - the missing argument will take on the specified value, for example consider this alternate definition of std_err:

double std_err (double std_dev, int n=10)
{
  return std_dev / sqrt(n);
}

Here the =10 after the argument n means that if omitted the argument n will take on the value 10. This function can still be called using both arguments, for example:

float se1 = std_err (sd, 5); // n takes a value of 5
float se2 = std_err (sd); // n takes default value of 10

Function arguments

When calling a function, its argument types must match the values passed. Consider a function:

Pretty_print(int x);

int input

int a = 33;
pretty_print(a);

With an integer input, the types match and function completes without errors.

char input

char c = 'x';
pretty_print(c);

char can be converted to int using ASCII form, and the function will still work.

string input

string str = "hello";
pretty_print(str);

string cannot be cast into int - this will lead to a compilation error.

Function overloading

Function overloading is where multiple functions with the same name but different argument types can be built. Consider the following example:

int main() {
  int a = 33;
  pretty_print(a);
  char c = 'x';
  pretty_print(c);
  string str = "hello";
  pretty_print(str);
  return 0;
}
void pretty_print (int x) {
  cout << "Integer: " << x << endl;
}
void pretty_print (string x) {
  cout << "String: " << x << endl;
}

The first two pretty_print()s call the int version of the function, whereas the third calls the string version. the varying overloaded functions can even have different numbers of arguments.

As C++ performs static type checking the compiler will decide which of the overloaded functions to use for each function call.

Variable scope

Many variables can only be used in a limited part of the overall program. This is known as the scope of the variable. there are 3 types of scope in C++:

1. Global scope

A variable declared outside of all functions, usually at the beginning of the program will have global scope (ie. between the using namespace --- and int main() ). Variables with global scope are accessible from anywhere in the program, after the declaration.

2. Function scope

A variable which is a function argument or declared inside a function body will have Function scope. Function scope variables are only accessible inside the function, after the declaration.

3. Block scope

A variable declared inside a code block (within the {…….}) with have block scope. Block scope variables are only accessible inside the block, after their declaration.

Note that, if at any point there are two variables with the same identifier in scope, any use of that identifier will access the ‘closest’ one in scope, i.e. block, then function, then global.

Passing variables into functions

Pass-by-value

So far, when arguments have been passed to functions they have been passed by value. This is where the value of the variable is copied into a new variable inside the function. Any changes made to the new variable inside the function will not apply to the original variable.

Pass-by-reference &

Instead of copying the value of the variable, we pass a reference to the variable itself into the function. Any changes to the new variable inside the function will apply to the original variable.

What’s the difference

This example will illustrate the difference between the two methods.

#include <iostream>

void test_by_value (int);
void test_by_reference (int&);

int main () {
  int x;
  x = 5;
  cout << "x = " << x << endl ;
  test_by_value (x);
  cout << "x = " << x << endl;

  test_by_reference (x);
  cout << "x = " << x << endl;
}

void test_by_value (int n ){
  n++
}

void test_by_reference (int& n){
  n++
}

This program contains two functions with identical bodies. The only difference is that the first uses pass-by-value and the second uses pass-by-reference.

The & after the argument data type tells the compiler to use pass-by-reference in the function definition. The output of this program will look like:

x = 5
x = 5 // pass-by-value
x = 6 // pass-by-reference

The test by value does not change the value of x, where the test by reference increments its value. Effectively we are extending the scope of the variable by passing the reference into the function so that the function can access it.

Pass-by-reference is useful when values are swapped over in functions affecting their global scope. Another possible reason for using pass-by-reference is program efficiency. If we are making many function calls in a program, it can be inefficient to copy large amounts of data from one place to another. Using pass-by-reference can improve efficiency in such cases.

Recursion

Recursion is the ability of a function to call itself. For example, the factorial function can be defined as follows:

$\begin{equation} n! = \begin{cases} 1 & \text{if $n=0$ }\\ n \cdot (n - 1)! & \text{if $n>0$ } \end{cases} \end{equation}$

In mathematical terms, we can say that this recursive definition consists of two parts: the anchor (n=0) and the inductive step (n>0).

Anchor condition

The anchor condition is the point at which no recursive calls will be made. It is sometimes called the ground case. With each iteration of a recursive there should be some progress towards this anchor condition.

Inductive step

This is where the function calls itself, and arguments passed should make some progress towards the anchor condition.

Recursion example

...
int main() {
  int n;
  cout << "Enter number:" << endl;
  cin >> n;
  cout << n << "th Fibonacci number is " << Fib(n) << endl;
  return 0;
}

int Fib (int n) {
  if ((n == 0) || (n == 1))
    return 1;
  else
    return (Fib(n-1) + Fib(n-2));
}

The if statement, if ((n == 0) || (n == 1)) represent the anchor condition. The two recursive calls, return (Fib(n-1) + Fib(n-2)); represents the inductive step.

Recursive implementations can appear strange at first. To answer this it is useful to examine what exactly happens when a function is called.

Environment of a function

When a function is called, the operating system needs to store information about that function:

Values of arguments
Local variables (within function scope)
Memory slot allocated to hold the returned value
What originally called the function so that execution can continue from where it left off before the function.

All this information is collectively referred to as the environment of the function.

Program Stack

If recursive calls are being made the compiler needs to store and access all of the separate environments for the different function calls. Compilers use a data structure which can be visualised and is known as a stack. The program stack is a list of values accessed using last-in-first-out (LIFO) principle as illustrated in the table below. Two operations are defined on the stack:

Push refers to a an environment being added to the stack
Pop refers to an environment being removed from the stack

Push/Pop	Contents of stack
Push 6	`6`
Push 11	`11` `6`
Pop	`6`
Push 8	`8` `6`
Push 5	`5` `8` `6`
Pop	`6` `8`

When a function call is made, the current environment is pushed onto the stack and creates a new environment. When the function terminates, the compiler pops the saved environment off the stack, replacing the function’s environment. This process happens every time a function call is made, so if multiple (recursive) calls are made we will end up with multiple environments on the program stack. For example consider the following program:

#include <iostream>
using namespace std

int f1 (int);
int f2 (int);

int main () {
  int x = 3;
  cout << "f1(3) = " << f1(x) << endl;
  return 0;
}

int f1 (int a) {
  int p = a * 3;
  return f2(p + 2);
}
int f2 (int b) {
  int q = b * 2;
  return q - 1;
}

The state of the program stack during execution of the program shown:

Push/Pop	Contents of stack
Push x=3	`x=3`
Push a=3, p=9	`a=3` `p=9` `x=3`
Pop	`p=9` `x=3`
Pop	`x=3`

Creating environments and pushing and popping them onto and from the stack takes time. Recursive implementations typically make a lot more function calls than equivalent iterative implementations, and for this reason they tend to be a lot less efficient. Therefore, from an efficiency perspective, it is generally better to use iteration.

However, some solutions are much more elegantly and simply expressed in recursive form, so they can often be a trade-off between efficiency and simplicity.

Types of Recursion

All recursive functions contain at least one call to themselves. However there are a number of different types of recursive functions, based on how many recursive calls there are, and how many recursive calls there are, and where in the function they occur.

Tail recursion

The recursive call is the last statement in the function.

Head recursion

The recursive call is the first statement in the function.

Middle recursion

The recursive call occurs in the middle of the function.

Multi recursion

This refers to the case where there are more than one recursive calls in a function, for example a recursive call in a function to create the Fibonacci sequence called fib(n) ( Fib (n -1) + Fib (n -2));.

Mutual recursion indirect recursion

Functions X and Y are mutually-recursive when X calls Y and Y also calls X.

Tail recursion is considered the most efficient form of recursion (almost comparable in terms of efficiency with iteration) after the recursive call the storage can be immediately deallocated.

Efficiency

If a function is called a very large number of times in a program in which efficiency is a prime concern, it can become a problem. In such cases the compiler should be instructed to create an inline function.

With inline functions, rather than treating a function call in the normal way (i.e. using the program stack), the function body code is inserted/copied into the program to replace the function call. This means more code in the program, and hence longer compilation time, but fewer overheads at run-time. The example function below illustrates the definition of an inline function:

inline double sdt_err (double std_dev, int n)
// compute standard error of mean given sample
// standard deviation std_dev and sample size n
{
  return std_dev / sqrt(n);
}

return link

Written by Toby Morris & Tobias Whetton