Today I Learned

Some of the things I've learned every day since Oct 10, 2016

Category Archives: data structures

184: TreeMap vs. HashMap vs. LinkedHashMap (Java)

In Java, TreeMap, HashMap, and LinkedHashMap are all implementations of the Map interface, and are all just different ways of mapping keys to values. Importantly, though, they differ in their lookup and insertion times, as well as in the ordering they maintain.

  • HashMap is implemented as an array of linked lists (buckets), and is the hash table implementation most people are probably familiar with. It offers O(1) lookup and insertion, but doesn’t maintain any ordering of its keys.
  • TreeMap is implemented as a red-black tree, and so offers O(\log N) lookup and insertion. In exchange, its keys are kept in sorted order according to their implementation of the Comparable interface.
  • LinkedHashMap is implemented as a hash table with a linked list running through its entries, so like HashMap it offers O(1) lookup and insertion, but in addition it maintains the ordering of its keys according to the order in which they were added to the map.

When to use which? If you need to retrieve keys ordered by their implementation of Comparable (perhaps when the keys are Integers or Strings), TreeMap can do that. If you need to retrieve keys ordered by their insertion time (as in a caching system), LinkedHashMap can do that. If neither of these orderings is needed, HashMap is the best go-to, being typically faster with less overhead than the other two options.
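For example, here is a minimal sketch (not from the original post) that puts the same keys into all three maps and prints the resulting key sets, showing the ordering each one maintains:

    import java.util.*;

    public class MapOrderingDemo {
        public static void main(String[] args) {
            List<String> keys = List.of("banana", "apple", "cherry");

            Map<String, Integer> hash = new HashMap<>();
            Map<String, Integer> tree = new TreeMap<>();
            Map<String, Integer> linked = new LinkedHashMap<>();
            for (Map<String, Integer> m : List.of(hash, tree, linked)) {
                for (String k : keys) m.put(k, k.length());
            }

            System.out.println(hash.keySet());   // no guaranteed order
            System.out.println(tree.keySet());   // [apple, banana, cherry] -- sorted (Comparable) order
            System.out.println(linked.keySet()); // [banana, apple, cherry] -- insertion order
        }
    }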


157: Dijkstra’s and Negative Weights

Dijkstra’s algorithm, commonly used to find the path of minimum weight to any given node from a starting node in a weighted graph, does not work if negative weights are allowed.

Intuitively, the reason is that Dijkstra’s makes the assumption while working through the graph that paths can only become ‘heavier’, and so while visiting a node does not consider neighbors which themselves have already been visited. For instance, consider running the algorithm on the following graph with A as the starting node:

[Example graph with a negative-weight edge]

Image source: https://stackoverflow.com/q/13159337

The algorithm will first look at C as a neighbor of A and set its minimum known distance to 2. It will then look at B and set its minimum known distance to 5. It will then move on to visit C without making any changes, since C has no outgoing neighbors. Finally it will visit B, but will not consider the edge from B to C because C has already been visited. So the algorithm will wrongly conclude that the shortest path from A to C is 2, not -5 (the weight of the path A → B → C).
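To make the failure concrete, here is a minimal sketch (not from the original post) of textbook Dijkstra’s with a visited set, run on a graph whose edge weights are assumptions chosen to match the walkthrough above (A→C of weight 2, A→B of weight 5, B→C of weight -10):

    import java.util.*;

    public class DijkstraNegativeWeightDemo {
        public static void main(String[] args) {
            // Edge weights are assumptions consistent with the walkthrough above.
            Map<String, Map<String, Integer>> graph = new HashMap<>();
            graph.put("A", Map.of("C", 2, "B", 5));
            graph.put("B", Map.of("C", -10));
            graph.put("C", Map.of());

            Map<String, Integer> dist = new HashMap<>(Map.of("A", 0));
            graph.keySet().forEach(v -> dist.putIfAbsent(v, Integer.MAX_VALUE));

            Set<String> visited = new HashSet<>();
            PriorityQueue<String> pq = new PriorityQueue<>(Comparator.comparingInt(dist::get));
            pq.add("A");

            while (!pq.isEmpty()) {
                String u = pq.poll();
                if (!visited.add(u)) continue;              // already finalized
                for (Map.Entry<String, Integer> e : graph.get(u).entrySet()) {
                    String v = e.getKey();
                    if (visited.contains(v)) continue;      // the assumption that breaks with negative weights
                    int alt = dist.get(u) + e.getValue();
                    if (alt < dist.get(v)) {
                        dist.put(v, alt);
                        pq.add(v);
                    }
                }
            }
            System.out.println(dist.get("C")); // prints 2, even though A -> B -> C has weight -5
        }
    }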

One adaptation which does work on graphs with negative edge weights (provided there are no negative-weight cycles) is Johnson’s algorithm, which itself uses Dijkstra’s as a subroutine.

149: Floyd’s Cycle-Finding Algorithm

Suppose you want to detect a cycle in a linked list. More specifically, you want to find the node N where the cycle begins. One way to do this is Floyd’s cycle-finding algorithm, sometimes referred to as the “Tortoise and the Hare” algorithm.

Informally, the way it works is:

  • Initialize two pointers P, Q, with P = next(head), Q = next(next(head)).
  • While P \neq Q, increment P by 1 node and Q by 2. (P is the “slow” pointer and Q is the “fast” one. At each step of this loop the distance between the two will increment by 1.)
  • Initialize counter n = 0, and reset P to point to the head of the list.
  • While P \neq Q, increment each of P, Q by 1 node and n by 1.
  • Return node N which is n nodes after the head of the list.

The runtime of this algorithm is O(n + \lambda), where n is the number of nodes before the cycle begins (the value of the counter above) and \lambda is the period of the cycle being found.
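Here is a minimal Java sketch of the two phases described above, using a hypothetical Node class (not from the original post); starting both pointers at the head and using a do-while is equivalent to the P = next(head), Q = next(next(head)) initialization:

    class Node {
        int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    public class FloydCycleFinder {
        // Returns the first node of the cycle, or null if the list has no cycle.
        static Node findCycleStart(Node head) {
            if (head == null) return null;

            // Phase 1: the "slow" pointer advances 1 node per step, the "fast"
            // pointer 2, until they meet somewhere inside the cycle.
            Node slow = head, fast = head;
            do {
                if (fast == null || fast.next == null) return null; // fell off the end: no cycle
                slow = slow.next;
                fast = fast.next.next;
            } while (slow != fast);

            // Phase 2: reset one pointer to the head; advancing both 1 node per
            // step, they meet again exactly at the node where the cycle begins.
            slow = head;
            while (slow != fast) {
                slow = slow.next;
                fast = fast.next;
            }
            return slow;
        }
    }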

123: Skip Lists (Intro)

A skip list is a data structure similar to a linked list which stores an ordered sequence of elements and is built specifically for fast search within the sequence. The way it’s commonly implemented is as a hierarchy of linked lists, with the first list containing all the elements of the skip list, and each successive linked list containing a sparser subset of the elements in the previous one.

To search for an element within the skip list one then starts with the sparsest linked list, getting as ‘close’ to the desired element as possible, then moves directly down to that same position in the next linked list and continues searching until the item is found (or not).

This animation depicts the insertion of the element 80 into a skip list of positive integers (insertion being equivalent to searching):

[Skip list insertion animation]

Credit: Artyom Kalinin

In the average case, the skip list uses O(n) space and takes just O(\log n) time for search operations.
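A minimal sketch of such a skip list over ints (not from the original post), in which each node carries a tower of forward pointers and a search starts at the sparsest level, dropping down whenever it can no longer move right:

    import java.util.Random;

    public class IntSkipList {
        private static final int MAX_LEVEL = 16;
        private final Random random = new Random();
        // Sentinel head whose forward pointers enter every level.
        private final Node head = new Node(Integer.MIN_VALUE, MAX_LEVEL);
        private int level = 0; // highest level currently in use

        private static final class Node {
            final int value;
            final Node[] forward;
            Node(int value, int levels) {
                this.value = value;
                this.forward = new Node[levels + 1];
            }
        }

        public boolean contains(int target) {
            Node current = head;
            for (int i = level; i >= 0; i--) {
                // Move right while the next element at this level is still too small,
                // then drop down to the denser level below.
                while (current.forward[i] != null && current.forward[i].value < target) {
                    current = current.forward[i];
                }
            }
            Node candidate = current.forward[0];
            return candidate != null && candidate.value == target;
        }

        public void insert(int value) {
            Node[] update = new Node[MAX_LEVEL + 1];
            Node current = head;
            for (int i = level; i >= 0; i--) {
                while (current.forward[i] != null && current.forward[i].value < value) {
                    current = current.forward[i];
                }
                update[i] = current; // last node before the insertion point at level i
            }
            // Flip coins to decide how many levels the new node participates in.
            int newLevel = 0;
            while (newLevel < MAX_LEVEL && random.nextBoolean()) newLevel++;
            if (newLevel > level) {
                for (int i = level + 1; i <= newLevel; i++) update[i] = head;
                level = newLevel;
            }
            Node node = new Node(value, newLevel);
            for (int i = 0; i <= newLevel; i++) {
                node.forward[i] = update[i].forward[i];
                update[i].forward[i] = node;
            }
        }
    }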

112: Persistent Data Structures

A data structure is said to be persistent if whenever it is modified or updated, it preserves the ‘old’ version of itself. This results in a kind of immutability. A good example of this is the simple singly-linked list: when an element is added onto the front, the new version contains the old version, unmodified, as its tail.
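A minimal sketch of that example in Java (not from the original post): ‘adding’ to the front builds a new node that points at the existing list, so the old version survives, unchanged, inside the new one.

    final class PersistentList<T> {
        final T value;
        final PersistentList<T> rest; // shared tail; never mutated

        private PersistentList(T value, PersistentList<T> rest) {
            this.value = value;
            this.rest = rest;
        }

        // The empty list is represented by null here for brevity.
        static <T> PersistentList<T> cons(T value, PersistentList<T> rest) {
            return new PersistentList<>(value, rest);
        }

        public static void main(String[] args) {
            PersistentList<String> v1 = cons("a", null);  // version 1: [a]
            PersistentList<String> v2 = cons("b", v1);    // version 2: [b, a]
            // v1 is still exactly [a]; v2 simply shares it as its tail.
            System.out.println(v2.value + " -> " + v2.rest.value); // prints "b -> a"
            System.out.println(v1.rest == null);                   // prints "true"
        }
    }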