In this article, I will try to explain the 7 most important data structures in the most easy manner. These data structures are important to know whether you are preparing for a coding interview or you are a CS student.
Arrays are the ordered collection of Data. Typically the data is all of a similar type, like integers or strings, but some languages also allow for differing data types.
An example of a real-life use for an array would be if you had temperatures for the next 5 days, and wanted to store them so your program could access them later. Arrays are used all the time and for pretty much everything, so they’re the most important data structure to learn first.
One of the amazing benefits of using an array is that it’s very easy to find any element, as each element in an array is assigned a number called an index that you can use to find it. This form of numbering is often called “zero-based indexing”, which just means the first element in the array is at an index of 0. This can often confuse new programmers because if you want to retrieve the second element, you have to use its index, which is 1, not 2.
While the advantage of arrays is that they make it easy to read elements (O(1)), the disadvantage is that they have a slightly harder time inserting or deleting elements (O(n)).
Now, just a quick note.
We won’t talk about memory much in this article, but for the first two data structures in this article, the memory component is very simple, and important so you can tell the difference between them.
We’ll use our temperature array from earlier as an example.
Arrays are stored in contiguous memory, which means each of the elements in the array are next to one another in memory. If a new element is added in the middle of the array, the entire array must shift down. But what would’ve happened if there was already something in the next memory address? To fit this new element in now, the array’s memory will have to be reallocated to an entirely new space where all of the elements fit. While computers are incredibly quick, this is not very efficient. Arrays are very good for reading elements but can be a bit less efficient when it comes to insertion or deletion.
Now we’ll take a look at a data structure that’s the opposite.
While arrays were fast at reading elements, and a bit slower at inserting or deleting, linked lists are a bit slower at reading elements, but fast at inserting or deleting.
Linked lists are similar to arrays, in that they also store ordered lists of data elements. However, a huge difference is in the way they are stored in memory.
Each element of a linked list has what we call a pointer, which is the address of the next element of the list. As a result, elements in a linked list do not have to be stored beside each other. You can store the next element in any location, and the previous element will point to it if you want to find it.
The advantage of this is that it solves our problem with array insertions and deletions. To add a new element, you find a free spot in memory for it, and have the previous element point to it.
To remove an element, you just delete the element and have the previous point to the one ahead of the deleted element.
The disadvantage of this is that linked lists do not have indexes, as the elements are not stored right beside each other. This means that to find an element, we have to go through the list starting from the beginning. If we want the third element, we have to first look at the first element, see where it points, go to the second, see where it points, and then we find the third. If you think of a huge linked list, you can start to understand why it’s not the fastest at reading.
So, to recap, Arrays are faster when it comes to reading, linked lists are faster when it comes to inserting and deleting.
Remember how arrays had values stored, and for each value, there was an index that numbered them? Well, hash maps are essentially the same thing.
except that you can choose what the “index” is, which hash maps call a key. The key and it’s value are commonly known as key-value pairs.
The other major difference between arrays and hash maps is that hash maps are unordered.
Hash maps are fast (O(1)) for both inserting and removing elements. But the real benefit to using hash maps is their ability to search quickly (O(1)).
Let’s say you wanted to store a list of capital cities. If we stored these in an array, we would have to know the index for each one to read it. But for a hash map, if we make the keys countries, we can just look up the country to find the capital city associated with it.
Hash maps go by a few different names. They are sometimes called hash tables, or, if you know Python, they’re called dictionaries. For the purposes of this article, you can assume they’re all the same thing. So, if you know how to use dictionaries in Python, congratulations! You’ve actually been using hash maps.
So recap, just understand that they’re unordered, and their custom keys allow for very quick searching.
The most simple way to describe stacks is to think of a stack of plates or pancakes. The first plate goes on the bottom, the last plate goes the top. Stacks are LIFO structures, which stands for last in, first out, because the last element in is like the last plate that goes on top the stack. When you go to grab a plate, this last plate on top will be the first you take off. Stacks have three common operations, which are push, pop, and peek.
Pushing is when you add a new element to the top of the stack.
Popping is when you remove the top-most element from the stack.
And peeking is when you’re just taking a look at what the element at the top of the stack is.
All of these are very fast, which is why stacks are optimal for certain problems. If you’re wondering when stacks might be used, think of the pancake and plate examples. For any scenario that has a similar structure where the last element in is the first element out, stacks are likely a good data structure to use.
Queues are the opposites of stacks. The simplest way to think of a queue is like a lineup at a grocery store. The first person in the line will get serviced first, and every additional person who joins the line goes at the end.
Queues are FIFO structures, which stands for first in, first out, because the first element in is the first element to come out. Queues have very similar operations to stacks, which are enqueue, dequeue, and front.
Enqueue is like push for a stack, and is when a new element is added to the back of the queue.
Dequeue is like pop for a stack, and is when the element on the front of the queue is removed.
The front is like a peek for a stack and is when you take a look at the front-most element in the queue.
Queues are more frequently used than stacks, especially in real-world programming. Think of YouTube playlists. When you start watching a playlist, you’ll start with the video that was added first, and the last video you watch will be the final one that was added.
Trees are a category of data structure that, as you might have guessed by the name, resemble trees. Trees have nodes, which are connected to each other by edges. The first node in a tree is called the root node. Nodes have a parent-child direction, where one is a parent node that leads to another node, which is a child node. Sometimes, the parent nodes are sometimes just called nodes, and the child nodes are called leaves.
There are tons of tree-based data structures, but in this video, we’ll be talking about binary trees, and in particular, binary search trees.
A binary tree is a tree where each parent node has up to two children nodes.
A binary search tree is a type of binary tree, where all left children nodes are less than the parent node, and all right children nodes are greater than the parent node. These binary search trees make it very easy to search through large amounts of ordered values.
The classic example is to think of a number guessing game, where one person thinks of a number between 1 and 100, and the other person has to guess. With each guess, they get told whether the correct number is higher or lower than their guess. The strategy is to always guess the middle number, because then you’re eliminating the most amount possible each time.
This is what a binary search tree does.We can eliminate a parent node and everything below it or above it,
and continue that process until we get to our correct number. This is not only useful for this game though.
A more practical example is to think of a digital dictionary. Dictionaries have over 100,000 words. If you give a computer a word and want it to give you the definition,
it would be incredibly slow for it to start at the beginning and look through each word until it finds the correct one (O(n)). Instead, because the dictionary is sorted alphabetically, the computer is able to go right to the word in the middle of the dictionary, and check if the target word comes before or after. It continues sorting like this until it reaches the word (O(long)). There are tons of other tree structures like heaps and tries, but we’ll leave those for another article.
Lastly we have graphs. Graphs are basically models for a set of connections. Like trees, graphs are made up of nodes and edges. In fact, trees, and even linked lists to an extent, are technically types of graphs. But graphs can get a lot more complicated.
In a graph, there are fewer restrictions than with trees. Nodes can be connected to any amount of neighbors.
Graphs can be directed, where nodes point to other nodes,
but they can also be undirected.
Some graphs have cycles, where two nodes both point to each other.
The edges between nodes can also be weighted, where the path has a value associated with it. As you can see, graphs are very complicated, which is why many people consider them to be one of the hardest data structures to learn. I’m going to make an entirely separate article dedicated to graphs, so we’ll tackle the complexities in a little bit.
Using this data structure, we can develop an algorithm that allows us to calculate what the shortest route between all five places is.
For a real-life example, think of Uber. Every Uber driver and user could be seen as nodes, and the application is constantly trying to optimize so that the waiting time for each rider is as short as possible. There are endless applications for graphs, which is why they’re such an important data structure to understand.
Thats it, Thank you! :)