Introduction to Data Structures in Java
Data Structure is a systematic way of organizing and accessing the data efficiently.
We need Data Structures to organize the data i.e. to store and arrange the data or information in an efficient way in the memory.
Data structures are the basic fundamentals of any program or application as the main function of a program or software is to store and retrieve the user's data as fast as possible and this is only possible through the use of data structures.
For example, if we have a room full of hundreds of books in a haphazard manner then it will be very difficult for us to select one particular book amongst so many. But if we have a library where all the books are arranged in specified sections accordingly, then it will be much easier for anyone to retrieve any book and use it. That's what data structures do. It organizes the data in an easier way!
Data Structures can be classified as Linear and Non-Linear data structures.
When the data is stored in a sequential and single-level manner i.e. all the elements are arranged in linear order and each element is connected to its previous and next element, then it is said to be a linear data structure.
Types of linear data structures are-
- Arrays - An array is a collection of similar types of elements where indexing plays a very important role. Using indexing, we can randomly access the data located at any position in the array. The disadvantage is that an array has a fixed size and cannot be modified. To solve this problem, collection framework is used in java which dynamically grows the array size (an example is ArrayList). Arrays can be of two types namely Single Dimensional Array and Multidimensional array.
- Linked List- The linked list is used to store elements in "containers". The containers are also referred to as "nodes" and each node has a link to the next node in the list. To add an element to a linked list, the element is placed into a new node and that node is linked to one of the other nodes in the linked list. Unlike arrays, linked list elements are not stored at a contiguous location; the elements are linked using pointers.
The various types of linked list are :
- Singly Linked list- Items in a singly linked list can be traversed in only one direction from head to last node(tail).
- Doubly Linked list- Items in a doubly linked list can be traversed in both directions. It contains a pointer to the next node as well as the previous node.
- Circular Linked list- It is a combination of singly and doubly linked list where the first element points to the last element and the last element points to the first element.
- Stack- The stack data structure is based on the principle of LIFO (last-in-first-out). It has various operations such as push(which inserts elements), pop (to delete elements), empty, search and peek.
- Queue- Queue is based on the principle of FIFO (first-in-first-out). Queue is an interface that extends the collection interface and has ordered list of elements where we can insert at the end and delete from the start of the list.
Four basic operations are performed on queue:
- Enqueue- It adds an element to the queue. If the queue is full, then it is an Overflow condition.
- Dequeue- It removes an element from the queue. If the queue is empty, then it is an Underflow condition.
- Front- It fetches the front element from the queue.
- Rear : It fetches the last item from the queue.
A Static data structure is one that has a fixed size and cannot change at run time, example is Arrays.
A Dynamic data structure is able to modify the data stored inside it and so it does not waste as much space, example are Stack, Queues, Linked lists.
In Non-Linear data structures, the data elements are not arranged in a single-level, sequential order and are connected in a non-linear arrangement.
Types of Non-Linear data structures are-
- Trees- Trees are hierarchical data structures that contain root and nodes and each node has almost two children. Elements are arranged in multiple levels like a tree branch.
A binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child. It is implemented mainly using Links.
A Binary Search Tree is a Binary Tree with the following additional properties as follows-
- The left subtree of a node contains only nodes with keys less than the node’s key.
- The right subtree of a node contains only nodes with keys greater than the node’s key.
- The left and right subtree each must also be a binary search tree.
- Graphs- A graph consists of a finite set of vertices (or nodes) and set of Edges that are interconnected with the vertices.
- Heaps- A heap is a specialized tree-based data structure whose main function is to provide easy access to the minimum and maximum item.
Heaps can be of two types:
- Max-Heap : In a Max-Heap the key present at the root node must be greatest among the keys present at all of its children. The same property must be recursively true for all sub-trees in that Binary Tree.
- Min-Heap : In a Min-Heap the key present at the root node must be minimum among the keys present at all of its children. The same property must be recursively true for all sub-trees in that Binary Tree.
- Tries- A Trie is also called a digital tree or prefix tree, it is a type of search tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters.
Now to work on the data structures or to solve a problem, we need Algorithms. Algorithms are methods or set of instructions which are executed in a certain way to give the desired output. Algorithms do not depend on a particular programming language. They are often known as pseudocode or represented as flowcharts.
Every Algorithm must satisfy the following properties:
- Input- There should be 0 or more inputs provided to the algorithm.
- Output- There should be atleast 1 output obtained.
- Definiteness- Every step of the algorithm should be clear and unambiguous.
- Finiteness- The algorithm should have finite number of steps.
- Effectiveness- Every step of the algorithm must be basic and essential.
Analysis/Complexity of an Algorithm -
The complexity of an algorithm is basically a mathematical function g(n) that gives the upper bound of the number of operations performed by an algorithm when the input size is n.
Types of complexity -
- Space Complexity- It is the amount of memory used by the algorithm (including the input values to the algorithm) to execute and produce the result.
- Time Complexity- Time complexity of an algorithm signifies the total time required by the program to run till its completion.
Cases of complexity-
- Worst Case Complexity - In the worst-case analysis, we calculate upper bound on the running time of an algorithm. For example in Linear Search, the worst case happens when the element to be searched is not present in the array. When the element is not present, the searching functions compare it with all the elements of array one by one. Therefore, the worst-case time complexity of linear search would be Θ(n).
- Average Case Complexity - In average case analysis, we take all possible inputs and calculate computing time for all of the inputs. We find sum of all the calculated values and divide the sum by total number of inputs.
- Best Case Complexity - In the best case analysis, we calculate lower bound on running time of an algorithm. In the linear search problem, the best case occurs when an element is present at the first location. So time complexity in the best case would be Θ(1).
An Asymptotic notation is a shorthand way to represent the fastest possible and slowest possible running time of an algorithm. It is a line that stays within bounds.
Notations used for analyzing complexity are:-
- Θ-Notation (Same order)
- O-Notation (Upper bound)
- Omega-Notation (Lower bound)
- Little-oh Notation (o)
- Little omega Notation
We will be discussing about these notations in-depth in the next article.