The last article introduced our first fundamental concept, the variable, and explained that variables are a named memory location for which we want to store a specific type of data, more on variables can be read here. Variables are useful for storing a single value, a single piece of data of some type, however you may want to store more than one piece of data, and that requires the use of a data structure.
Before describing data structures though I would like to look back briefly at data types. As discussed in the previous article I explained how we must declare a type for the data we want to store in a variable, so we can give meaning to the data, which we know is essentially a sequence of binary digits, through the operations that can be performed on that data.
For example, if we look at a common data type, an integer, which is used to store whole numbers, we can perform typical operations on these numbers such as addition, subtraction, division, and multiplication. Therefore, the type integer, states that the binary number stored represents a whole number, as well as describing the operations that can be performed, mathematical operations.
All programming languages have a set of basic types, defined within the language itself, known as primitive types such as: integers (whole numbers), float-pointing (decimal numbers), and char (characters). We frequently use these types, and the operations that go with them, but many languages also give us the opportunity to define our own types, known as user-defined types. What this means is that we can store our own data and describe what operations we can carry out on that data with the help of certain data structures.
Data structures are then in many ways like data types. They are created in an analogous way to variables, requiring a name, specifying the data structure in use (more on this later), as well as the type of data it can store. What makes them different though is that they store more than one value, for example a data structure can store one integer, or it could store 100 integers, or it could be created to store 50 floating-point values, or any number of any type, both primitive, or user-defined. The operations of data structure are also different, instead of providing us a means in which to operate on the type of data it stores, they instead provide operations that can be carried out on all the data, also known as an element, stored within the structure. The operations commonly associated with data structures include:
Traversing: accessing all the data elements once and only once in the data structure and doing something with that data, such as updating all elements, or outputting them to the screen.
Inserting: adding new data to the data structure.
Deleting: deleting data that already exists in the data structure.
Merging: merge two data structures together.
Searching: attempt to the find location of a data element within the data structure, if it exists.
Sorting: order the data within the data structure, in some way, which usually depends upon the type of data being stored, i.e. numbers arranged smallest to largest.
We use data structures to store more than one element of data, the data we store is intended to be traversed, organised, sorted, and searched, and they allow insertion and deletion in as efficient a way as possible. Efficiency is determined by the design of the data structure, but also efficiency of a program can be determined in the choice of structure we use.
There are a broad range of data structures but they typically fall into two groups: linear and non-linear, both of which affect the way we interact with the data stored in the structure.
With linear data structures the data stored with them is stored sequentially, that means if we start at the 1st element in the structure, and want to get to the 4th element we must first traverse to the 2nd, then traverse from 2nd to 3rd, then finally from 3rd to 4th.
This isn’t the case with non-linear data structures, we don’t have to traverse through the data elements linearly, one after another. Elements of data can be linked to more than one element. This is achieved by having our data type not only store the data we want but also values containing addresses to other types which hold data and addresses, also commonly known as a node. This means we don’t necessarily have a 1st, 2nd, 3rd, 4th data element instead we have a root, which is the first node in our data structure, and this could store addresses that point to many different elements.
As previously mentioned deciding what data structure to use is important, and can have a huge effect on the overall efficiency of your program. Both in how the data structure is implemented in memory, and how the data is stored in that structure, linear or non-linear.
The main thing to take away from this article is to understand how they work, so that you know in what situation you can apply each one. This requires understanding a wide variety of different data structures and how they are implemented for each programming language, which can differ. This article is just an overview on data structures, so I will forgo any details regarding specifics, in favour of discussing them in detail in future posts.
To conclude a data structure is like a data type, but instead of defining operations to be that relate to the data itself, it instead allows storing of more than one piece of data, and defines operations that relate to the data it stores. We can create a data structure much like a variable by defining the data structure we want to use, the type to be stored within it, and a name to identify the address of the memory the data will be stored.
With variables and data structures discussed the following articles will focus back on concepts related to writing programs. How we can alter the instructions being executed using conditions, repeat instructions with the help of loops, and finally the organisation of instructions using functions. The next article will focus on the conditions, and how they help expand the potential of our programs.