Data Structures to Learn (Ordered by Priority)
- Arrays
- Strings
- Hash Table
- Recursion
- Sorting/searching
- Matrix
- Linked List
- Queue
- Stack
- Tree
- Graph
- Heap
- Trie
- Interval
- Dynamic Programming
- Binary
- Math
- Geometry
Arrays
A type of data structure that stores more than one data item of the similar data type. It is a container of elements. Items of an array are allocated at adjacent memory locations.
Elements - Memory locations found in array.
Length - Total number of elements in an array.
NOTE:
To access a specific element of an array, you use its index/subscript.
Declaring an array: In python, you declare an array the following way:
arrName = array.array( 'i', [300,200,100] )
Where i
is the data type (type code) and [300,200,100]
are the elements of the array
In C++, you declare an array the following way:
int arrName[3] = {300,200,100};
Why Are Arrays Important/Useful?
- They can store multiple values in a single variable.
- Better at processing many values easily and quickly.
- Sorting and searching values are easy to do.
Difference Between Lists and Arrays in Python
In Python, a list can have different data types, whereas arrays can only have items of the same type.
Array Operations
For arrays, there are various operations:
- Insert - Inserting one or more elements into an array at the beginning, end, or any given index.
- Delete - Can delete an item from an array by value (value as in, removing the specified element in array).
- Search - Search an item in an array based by value.
- Traverse - Traversing through an array using a loop.
B Tree
It is a self-balancing data structure that follows a set of rules for searching, inserting, and deleting data in a faster and memory efficient way.
NOTE: Self-balancing means the height/depth of the left and right subtrees of any node differ by at most one.
Rules for B Tree
- All leaves are created at the same level.
- B Tree is determined by a number of degree, which is called the “order”
- Left subtree of the node will have lesser values than the right side of the subtree–nodes are sorted in ascending order from left to right (basically, an in-order traversal).
- For EVERY node, the maximum number of keys each node can have is: $m - 1$ For example, if $m = 4$, then the maximum number of children each node can have is $4$ and node can have at most $3$ keys.
- The minimum children a node can have is half of $m$, which is $m/2$ (the ceiling is taken).
- Every node, except for the root, MUST contain minimum keys of $[m/2]-1$
$m = 4$
[30]
/ \
[10, 20] [40, 50, 60]
/ | \ | | \
[5] [15] [25] [35] [45] [55] [65, 70]
Why use a B Tree?
- It reduces number of reads on the disk.
- They are optimized to adjust the number of children nodes according to disk size.
- It is designed for handling a bulky amount of data.
- Useful for databases and file systems.
- Useful for reading and writing large blocks of data.
How a B Tree Works in Memory
Index Tables are created that saves the record reference of records based on the blocks they reside in–this reduces time and memory consumption.
Operations of a B Tree
- Search
- Insert
- Delete
Search Operation
Steps to search:
- You have some value/key, $k$, be searched.
- Start searching from the root and recursively traverse down.
- If $k$ is less than root value, search the left subtree, else search the right.
- If $k$ is found within a node, simply return the node, else keep traversing down to the child with a greater key.
- If $k$ is not found in the tree, return NULL.
Insert Operation
Since a B Tree is self-balanced, you cannot just insert a key into any node.
Steps to insert:
- First search and find the appropriate place to insert the key.
- Once a proper location is found, insert the key at that location.
- If the proper location is found but the node has a maximum number of keys:
- Then the node, along with the inserted key, will split from the middle element (the split will have each key be its own node).
- The middle element will become the parent for the other two child nodes.
- Nodes need to re-arrange keys in ascending order.