Learning Objectives
- Describe the use of pointers.
- Describe how one- and multi-dimensional arrays are defined and implemented.
- Define and manipulate string variables.
- Define and use new data types using struct and enum.
- Perform file input/output operations.
There are two main differences between C++ and Matlab with regard to data types, and these are summed up in the table below:
Language | Type Definition | Type Checking |
---|---|---|
Matlab | implicit: the type of a variable is worked out from the value it is assigned | dynamic: type consistency is checked at run time |
C++ | explicit: stated by the programmer in the variable declaration | static: type consistency is checked at compile time |
We can think of simple data types as containers that hold a value of the specified type. Pointers are then the addresses of those containers in memory. Just like ordinary variables, pointer variables have an associated data type; for example, a variable may have the type ‘pointer to an int’ or ‘pointer to a char’. Pointers are declared by placing the * symbol before the variable name in the variable declaration (do not confuse this with dereferencing, which uses the same symbol). For example:
int *p1, *p2; // pointers to 'int' values
char *cp; // Pointer to a 'char' value
Declaring a pointer variable does not mean that it points to a valid value in memory - it must be initialised to point to something. For example:
int val = 1000;
p1 = &val;
The & symbol means ‘the address of’; in this case p1 points to an area of memory which holds the int value 1000. Consider another example below:
char *c = new char;
*c = 'x';
This code creates a variable called c that points to a char value. The new keyword allocates memory without first defining a named variable, as was done with int val = 1000; in the previous example. Similarly, we can delete the space pointed to:
delete c;
This statement frees the memory allocated by new, making it available to the program for something else. After this, c no longer points to a valid value and must not be dereferenced. It is good practice to delete every block of memory allocated with new once it is no longer needed.
It is possible to have multiple pointers pointing to the same memory space, for example:
int *p1 , *p2 ;
int val = 1000;
p1 = &val ;
p2 = &val ;
When more than one pointer points to the same area of memory, a change made to the value via one pointer is also visible via the other pointer. For example:
*p1 = 500;
cout << "p1 -> " << *p1 << ", p2 -> " << *p2 << endl;
The output produced by this code would be p1 -> 500, p2 -> 500. The * before a variable name dereferences the pointer (i.e. it returns the value pointed to), and this is effectively the opposite of &.
A 1-D array is a sequence of values of the same type (e.g. an array of 10 integers). The values are commonly referred to as elements. Higher-dimensional arrays are also possible such as 2-D arrays where every element is itself an array.
In C++, array variables must be explicitly declared, and the array size must be specified and fixed at compile time (i.e. in the array variable declaration). Consider the following example:
int x[10]; // defines array of 10 integers
x[9] = 3; // assign to last one
The array size is specified in the square brackets [ ] after the variable name. Elements of an array are undefined until they are initialised. Square brackets are also used to access array elements; note that array indices start at 0.
To assign an entire array in one statement when the array variable is declared, use curly brackets, e.g.
int x[10] = {1,2,3,4,5,6,7,8,9,10};
Note that in C++ commas are required as a delimiter; spaces cannot be used to separate the element values.
When initialising an array it is possible to omit the array size and let the compiler work out the size itself. Editing the previous example:
int x[] = {1,2,3,4,5,6,7,8,9,10};
Once an array has been declared, only individual elements can be assigned and accessed, not the array as a whole. Similarly, an entire array cannot be displayed in a single statement; only individual array elements can be displayed using cout. For example, to display the array variable x declared in the previous example, the following code is required:
for (int xind = 0; xind < 10; xind++)
    cout << x[xind] << " ";
cout << endl;
Multi-dimensional arrays can also be defined in C++. For example, to declare 2-D arrays:
char a[10][10]; // 10 by 10 array of chars
int b[2][2] = {{2, 3}, {1, 4}}; // 2x2 array of ints
The size of the second dimension is specified in a second set of square brackets after the first. If the elements are initialised in the declaration, the initialiser must match the shape of the array: here each row has its own set of curly brackets, nested inside an outer set.
Higher-dimensional arrays are accessed in the same way as 1-D arrays; the second array index is simply added in its own square brackets after the first. For example:
int d = b[0][0] * b[1][1] - b[0][1] * b[1][0];
Arrays are implemented by the C++ compiler using pointers. For instance, an array of integers is treated as a pointer to an integer, where the value pointed to is the first element of the array and the other elements occupy contiguous memory locations following the first element.
Consider the following example:
int p1[3] = {1000, 500, 750};
The array variable p1 (which is in effect a pointer to an int) points to the area of memory containing the first element (1000). The second and third elements are contained in the areas of memory immediately following this first element:
p1 -> | 1000 |
      | 500  |
      | 750  |
So any array element can be accessed by dereferencing the pointer and moving forward a certain number of blocks of memory. In fact, the square bracket notation is simply a short-hand way of doing this. The same concept is applied for 2-D arrays, for example:
int p2[2][3] = {{1000, 500, 750}, {100, 200, 300}};
In this 2-D case, the array is laid out in memory as follows:
p2 -> | 1000 |
      | 500  |
      | 750  |
      | 100  |
      | 200  |
      | 300  |
This explains why the sizes of higher-dimensional arrays need to be known at compile time. From the above illustration alone we would not know whether p2 was a 2 x 3 or a 3 x 2 array (or indeed a 1-D array of 6 elements). Therefore, this information needs to be specified in the array declaration, and is remembered by the compiler so it can access the array elements correctly.
Arrays can be passed as arguments to functions, just like any other variable. However, all array arguments to functions are treated as pass-by-reference, as arrays are essentially pointers. The syntax for passing array arguments to functions is:
void func (int x[]) {...}
For 1-D arrays, the array size doesn’t need to be specified when defining the function header. But for 2-D arrays at least the second dimension needs to be specified, although both can be specified:
void function (int x[2][3]){...}
The following example illustrates the passing of an array variable to a function. The code displays a frequency table of true positive (TP), false positive (FP), true negative (TN), false negative (FN) values.
#include <iostream>
#include <iomanip>
#include "freq_table."
using namespace std;
// Displays a 2 x 2 frequency table with row, column and overall totals
void dispFreqTable(int freq[2][2])
{
    const char *rowLabel[2] = {"Test +ve |", "Test -ve |"};
    int rsum[2] = {0,0}, csum[2] = {0,0}, tot = 0;
    cout << "         | GT +ve | GT -ve | Total"
         << endl;
    cout << "---------|--------|--------|------"
         << endl;
    for (int r = 0; r < 2; r++) {
        cout << rowLabel[r];
        for (int c = 0; c < 2; c++) {
            cout << setw(8) << freq[r][c] << "|";
            rsum[r] += freq[r][c];   // running total for this row
            csum[c] += freq[r][c];   // running total for this column
            tot += freq[r][c];       // overall total
        }
        cout << setw(6) << rsum[r] << endl;
    }
    cout << "---------|--------|--------|------"
         << endl;
    cout << "Total    |" << setw(8)
         << csum[0] << "|" << setw(8) << csum[1]
         << "|" << setw(6) << tot << endl;
}
In the above program the dispFreqTable function takes a 2 x 2 array of integers as an argument. The first 2 in the argument type is optional, as only the second array dimension is required in the argument type. The setw function sets the width of the next item output to cout. To use setw, the <iomanip> header must be included with #include <iomanip>.
An array of characters is known as a string. The most common way of using strings is to #include the standard <string> library. Consider the following program:
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string greeting = "hello", name;
    cout << "What's your name? ";
    cin >> name;
    string a = greeting + " " + name;   // + joins strings together
    cout << a << endl;
    int n = name.length();
    cout << "Your name has " << n << " letters"
         << endl;
    return 0;
}
The string library provides its own versions of built-in C++ operators that take strings as arguments; this is known as overloading. Here are a few definitions from the string library:
- = : assignment
- + : string concatenation
- cin : input
- cout : output
- == != > < >= <= : string comparison operators (performed character by character)
- getline : gets a line of text from standard input
The above program also illustrates the use of a special function that is associated with a string variable: name.length(). Here, name is a string variable and the function call length() is appended to it, separated by a full stop. This function call returns the number of characters in name. These special functions are known as member functions. Other member functions can be called in the same way on string variables:
- find : finds an instance of a substring within a string
- replace : replaces a substring within a string by another string
The following program demonstrates the use of the find and replace member functions with string variables:
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string str ("Brazil are the best team in the world.");
    // can search for a constant string
    size_t found = str.find("the");
    if (found != string::npos)
        cout << "'the' found at: " << found << endl;
    // continue the search just after the first match
    found = str.find("the", found + 1);
    if (found != string::npos)
        cout << "second 'the' found at: "
             << found << endl;
    // can search for another string variable
    string str2 = "England";
    found = str.find(str2);
    if (found != string::npos)
        cout << "'England' found at: " << found << endl;
    // reuse the same string variable with a new value
    str2 = "Brazil";
    found = str.find(str2);
    if (found != string::npos)
        cout << "'Brazil' found at: " << found << endl;
    // replace a substring with another string
    str.replace(str.find(str2), str2.length(), "England");
    cout << str << endl;
    return 0;
}
The find function takes the substring to search for, which can be another string variable or a string constant, and optionally the index at which to start searching (as in the second call above). It returns the index at which the substring starts. The type of the returned value is size_t, which is an unsigned integer type guaranteed to be large enough to hold the size of any object in memory.
If the substring is not found, as with ‘England’ in the above program, then find returns a special value from the string library called string::npos. This means the constant npos from the string library. The :: symbol is called the scope operator.
The replace function takes 3 arguments: the start index of the substring to be replaced, the number of characters to replace, and the string to replace it with. The output of the program when run would be:
'the' found at: 11
second 'the' found at: 28
'Brazil' found at: 0
England are the best team in the world.
Structures are another way of defining new data types. Whereas array types are used for storing a collection of values of the same type, structures are used for storing a collection of values of different types. Each component of a structure is called a member.
Consider the following example for defining a data type to store information about patients:
struct PatientData {
    string firstName;
    string lastName;
    unsigned int age;
    double bloodPressure;
};
This defines a new data type, called PatientData. Variables of type PatientData contain four values: two string values, an unsigned int and a double. Variables can be declared of type PatientData just as for built-in C++ types, as the following code illustrates:
PatientData p1;
cout << "Enter patient's name (first last):";
cin >> p1.firstName >> p1.lastName;
cout << "Enter " << p1.firstName << " "
     << p1.lastName << "'s age:";
cin >> p1.age;
cout << "Enter " << p1.firstName << " "
     << p1.lastName << "'s blood pressure:";
cin >> p1.bloodPressure;
struct provides a convenient mechanism for creating new types that group related data together.
Enumeration types are another way of creating new data types. A variable that is declared as an enumeration type can take any one of a pre-defined number of symbolic values.
For example, the following code creates a new data type to store information about chess pieces:
enum ChessPiece {Pawn, Rook, Knight, Bishop, King,
                 Queen, Empty};
enum Colour {White, Black, None};
struct Square {
    ChessPiece piece;
    Colour colour;
};
Here we have defined 3 new data types:
- ChessPiece variables can take any one of these symbolic values: Pawn, Rook, Knight, Bishop, King, Queen, Empty
- Colour variables can take any one of these symbolic values: White, Black, None
- Square variables contain two values: a ChessPiece and a Colour
Based on these newly defined data types, we can then go on to declare a chess board, and to start to fill it up with pieces:
Square b[8][8];
b[0][0].piece = Rook;
b[0][0].colour = White;
The variable b represents the chess board, and is a 2-D (8 x 8) array of Square. We have initialised the square indexed by [0][0] to be a white rook. The use of enum types is good practice when the data is symbolic, i.e. categorical, non-numeric data with no ordering.
Reading and writing data from and to external files occurs much in the same way with advanced data types as standard ones. Consider the following example program:
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
    ifstream inFile;
    inFile.open("data.txt");
    if (!inFile) {
        cerr << "Error opening file: data.txt" << endl;
        return 1;
    }
    int ages[5];
    for (int i = 0; i < 5; i++)
        inFile >> ages[i];
    inFile.close();
    return 0;
}
In the above program, #include <fstream> allows the program to use file input/output operations. Variables can then be declared of type ifstream (for input files) or ofstream (for output files).
All files (input or output) must be opened before use, in this case with the open() function. If the stream variable evaluates to false after the call to open(), the file was not successfully opened (e.g. it does not exist or is locked).
The ifstream variable, inFile, can be used like cin to input data. cerr is an alternative output stream that sends data to standard error rather than standard output. It is good practice to separate normal program output from error messages in this way. All files (input and output) should be closed after use.
Although not illustrated in the above example, the same principles apply to ofstream variables, which can be used just like cout. In addition, there are a number of other file input/output functions that we can make use of:
- get : gets a single character from the input file
- put : puts a single character into the output file
- getline : gets an entire line from the input file, i.e. it reads all the data up to the next newline character (rather than stopping at any space, tab or newline, as >> does)
The following example shows the use of getline:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
    ifstream namesFile;
    namesFile.open("names.txt");
    if (!namesFile) {
        cerr << "Error opening file: names.txt"
             << endl;
        return 1;
    }
    string names[10];
    // read one full line of the file into each array element
    for (int i = 0; i < 10; i++)
        getline (namesFile, names[i]);
    namesFile.close();