Today I Learned

Some of the things I've learned every day since Oct 10, 2016

231: Min Heap Using Dynamically-Sized Array

Today I learned how to implement a Min Heap using a dynamically-sized array and the typical representation of a complete binary tree as an array where the children of index k are 2k + 1 and 2k + 2, respectively.

Specifically, I used Python’s List data structure for an impromptu implementation of a basic min heap which uses its items’ values themselves for comparisons. Below is my code (which I wrote in like 5 minutes so it’s not super pretty):

class MyMinHeap(object):
    """
    Implementation of Min Heap using dynamic array
    """
    def __init__(self):
        self.heap = []
        self.size = 0

    def peek(self):
        if self.size == 0:
            return None
        return self.heap[0]

    def push(self, val):
        self.heap.append(val)
        self.size += 1

        # float up
        curr = self.size - 1
        while curr != 0:
            parent = (curr - 1) // 2
            # if min heap invariant violated, swap with parent
            if self.heap[curr] < self.heap[parent]:
                self.heap[curr], self.heap[parent] = (self.heap[parent], self.heap[curr])
                curr = parent
            else:
                break

    def pop(self):
        if self.size == 0:
            return None

        # replace root of tree with last child, maintaining completeness
        return_val = self.heap[0]
        self.heap[0] = self.heap[-1]
        del self.heap[-1]
        self.size -= 1

        # sink down
        curr = 0
        while 2 * curr + 1 <= self.size:
            left = 2 * curr + 1
            right = 2 * curr + 2
            # pick which child has smallest value
            if right <= self.size and self.heap[right] < self.heap[left]:
                child = right
            else:
                child = left

            #swap with child if necessary to maintain min heap invariant
            if self.heap[child] < self.heap[curr]:
                self.heap[curr], self.heap[child] = (self.heap[child], self.heap[curr])
                curr = child
            else:
                break

        return return_val

 

Advertisements

230: ‘Product of Array Except at Each Index’ O(n) Algorithm

Problem:

Given an array A = [a_0, \dots, a_{n - 1}] of n integers, find an O(n) time algorithm for producing a new array P also of length n, where p_i = \prod _{j \neq i} a_j. That is, where p_i is the product of all elements of A except a_i.

Solution:

Considering how we can break down the problem of finding p_i into subproblems, we note that this product is that of all the a_j with j < i times that of the a_j with j > i. In the case of i = 0, n - 1 we can consider the product of zero numbers to be 1.

Now that we’ve broken the problem into two subproblems which are identical by symmetry, finding a simple O(n) dynamic programming algorithm to solve this subproblem and thus the original is fairly obvious.

For each i = 0, \dots, n - 1, the product L_i of all a_j with j < i is inductively a_{i - 1} L_{i - 1}, where as our base case we define L_0 = 1.

Working in the other direction from right to left, for each i = n - 1, \dots, 0 the product R_i of all a_j with j > i is a_{i + 1} R_{i + 1} with R_{n -1 } = 1.

Now we have after these 2 linear-time passes of A that p_i = L_i R_i for each i.

Because of the properties of multiplication there are several ways you can implement this, but the simplest I’ve seen is in the Java code below (courtesy of this Leetcode submission) which additionally uses only O(1) extra space:

public int[] productExceptSelf(int[] A) {
    int[] P = new int[A.length];
    int p;

    // forward pass
    p = 1;
    for (int i = 0; i < A.length; i++) {
        P[i] = p;
        p *= A[i];
    }

    // backward pass
    p = 1;
    for (int i = A.length - 1; i >= 0; i--) {
        P[i] *= p;
        p *= A[i]
    }

    return P;
}

229: Vim Basics

Today I learned the basics of Vim and used it for the first time, which took me a weirdly long time to get around to.

I’m not going to go through a long list of commands and keyboard shortcuts; there are already plenty of really good guides out there, and I’d better remember it all just by continuing to use it.

That said, I will note one takeaway from using Vim for the first time: It’s not nearly as hard as I’d been led to believe.

228: `where` Helper Functions (Haskell)

In Haskell, the where keyword can be used to define local variables in the scope of guards (the rough equivalent of if-else chains in imperative languages).

The same keyword can be used to define a helper functions within a function, local to its scope. For instance, the following function takes in a list of RealFloat values representing the radii of circles, and outputs 2-tuples of their respective perimeters and areas, using helper functions for each:

circlePerimsAndAreas :: (RealFloat a) => [a] -> [(a, a)]
circlePerimsAndAreas rs = [(perimeter r, area r) | r <- rs]
    where
        perimeter r = 2 * 3.14 * r
        area r = 3.14 * (r ^ 2)

This isn’t necessarily a great pattern to use (I don’t know enough about how people typically structure things in Haskell), but it’s another possible use for where that you could use if needed.

227: Pattern Matching with Haskell Lists

Continuing with pattern matching in Haskell, today I learned some basic ways to apply the tool to lists. Recalling that lists in Haskell are represented as linked lists in that they have a ‘head’ and ‘tail’ (or ‘head’ and ‘rest’, if you prefer), we can use pattern matching and recursion to implement our own length function which gives the length of a list:

length' :: (Num b) => [a] -> b
length' [] = 0
length' (_:rst) = 1 + length' rst

We put the base case — the empty list — first so that it is matched before the general case. In the general case, the _ character signifies that we don’t really care what the head of the list is, since we don’t use it, and recurse on the tail of the list rst.

As another illustration, below we implement a function s which returns a string showing up to the first 3 elements of a list:

s :: (Show a) => [a] -> String
s ls@[] = show ls
s ls@(x:[]) = show ls
s ls@(x:y:[]) = show ls
s ls@(x:y:z:_) = "[" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ "..]"

Yes, there are obviously simpler ways to do this (notably just by using the length function), but the point is to illustrate pattern matching, not write succinct code.

226: Haskell Pattern Matching (Intro)

In Haskell, pattern matching is (sort of) a way of defining piecewise functions. For example,

isSeven :: (Integral a) => a -> Bool
isSeven 7 = True
isSeven x = False

This function takes an Integral input and outputs True if it’s 7 and False if it’s not. (This is obviously something that could just be done with if-else, but we’re just showing the syntax for the sake of it.)

The way this works is that upon the function being called on an input, the interpreter reads the piecewise function definitions from top to bottom, and executes the first function body whose input pattern matches the actual argument. For instance, if the above 2 definitions were swapped the function would always return False, even if the input was 7.

If a function with pattern matching is called on an input which doesn’t match any of the patterns, an error is raised, so you always have to be sure to include a general catch-all case at the end.

As another illustration, here’s the factorial function implemented using pattern matching:

factorial :: (Integral a) => a -> a
factorial 0 = 1
factorial n = n * factorial (n - 1)

225: Deep Copies vs. Shallow Copies (vs. Variable Assignment)

Say you want to copy an object. If the object is just a primitive value, depending on the language in question, just assigning a new variable to this value will copy everything about it. For example, in Java int a = 5;  int b = a; copies the value 5. This is pretty straightforward.

With non-primitive objects, the situation gets more nuanced. For example, in Python

a = [[1], [1, 2], [1, 2, 3]]
b = a

just points the variable b towards the same object as a, so nothing is copied. However, you can do the following in Python to perform a shallow copy:

import copy
b = copy.copy(a)
id(a) == id(b) # returns False, since b is a new object
id(a[0]) == id(b[0]) # returns True. a[0] and b[0] are the same object

That 3rd line is the reason the copy is ‘shallow’: a new object was created to assign b to, but the attributes of b (its values) are the same ones as in a.

Enter the ‘deep copy’, which does pretty much what it sounds like. A deep copy not only creates a new object, but recursively creates new objects copying that object’s attributes. Thus it’s not only just a copied wrapper, but a real copy all the way down.

To contrast with the above example:

b = copy.deepcopy(a)
id(a) == id(b) # returns False, as with the shallow copy
id(a[0]) == id(b[0]) # now returns False. b[0] is a new value separate from a[0]

To summarize, the way the Python docs summarize the difference is:

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):

  • shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
  • deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

Finally, as with any recursive procedure, deep copying should be done carefully to avoid infinite loops and other problems.

224: defaultdict (Python)

In Python, defaultdict is a class (imported using from collections import defaultdict) which is a subclass of dict (the Python hashmap). Its behavior is almost exactly the same of dict, with the critical exception that it has an attribute default_factory. This attribute tells the defaultdict what default value to provide if an attempt is made to get the value for a key not already present.

Example:

>>> d_int = defaultdict(int)
>>> d_list = defaultdict(list)
>>> def foo():
...     return 'default value'
... 
>>> d_foo = defaultdict(foo)
>>> d_int['key'] 
0 
>>> d_list['key'] 
[] 
>>> d_foo['key'] 
'default value'

This example illustrates that the single mandatory argument to the constructor, which determines default_factory, can be any object that belongs to the class callable. When necessary the dict then simply calls this callable object to provide a default value.

There are a lot of uses for this behavior, but the simplest one I can think of is using a defaultdict(int) in place of {} for counting the number of occurrences of every character in a string. Instead of looping over the string and at every point checking whether a character has already been seen, you can just let the default dict handle it for you.

223: “use strict”; (JavaScript)

As I’ve started to look at some more JavaScript code, I’ve come across multiple instances of scripts with the mysterious statement "use strict"; placed at the top, and was naturally curious what it does.

It turns out "use strict"; is placed at the beginning of a script, or even at the beginning of a function block (to only use strict mode in that context), to instruct the browser to use Strict Mode in evaluating the code. This essentially does what it sounds like: evaluate code according to a stricter set of rules about what is and isn’t okay, often catching ‘mistakes’ or bad practices which wouldn’t normally result in errors in non-strict mode.

For instance, strict mode will:

  • Throw errors instead of silently ignoring failed assignments such as NaN = 5
  • Throw errors when attempts are made to delete un-deleteable properties of objects
  • Require function parameter names to be unique
  • etc.

All in all strict mode is not strictly necessary but seems like it would be incredibly useful for avoiding stupid mistakes, which is often half the struggle.

222: == vs === (JavaScript)

In recently picking up JavaScript, one of the first things I’ve had to learn is the distinction between the 2 equality operators, namely == and ===. The shorter of the two, ==, which should by all means be the one which makes more sense and is more safe to use, is the one that makes less sense and can wind you up in all sorts of trouble if you’re not paying attention to what you’re doing.

Namely, == has an appalling lack of transitivity for supposedly representing an ‘equality’ relation, as in the below example

'0' == 0 // true

0 == '' // true

'' == '0' // false

and also as represented by this diagram:

Related image

The reason for this weird behavior is that the == operator compares equality after doing any type conversions deemed necessary. If both the operands are of the same type then everything behaves as expected, but if not then the interpreter starts converting things and you get unexpected results.

By contrast, === does not do type conversion between operands, and returns true only if both are of the same type and are equal under the rules of that type. This is the behavior which is more in line with what equality is supposed to represent, and thus is the one I’ll be using exclusively until someone gives me a strong reason to use ==.