Monday, May 4, 2020

Efficiency pitfalls in Haskell

This is my third note on programming in Haskell with performance-centric commentary. The first two are here and here. In this post, I hope to consolidate in one place several best practices and gotchas regarding runtime performance in Haskell, at an elementary level.

When I first learned Haskell, I tended to solve any problem involving speed with the same old imperative tricks and techniques, with just the 'for loop' replaced by a map, aggregating actions replaced by folds and, of course, liberal use of explicit recursion. If some data had to be mutated, I would introduce additional arguments to the function involved to bypass the need for mutation.

Example,

raisedToLinear :: Int -> Int -> Int
raisedToLinear _ 0 = 1
raisedToLinear x 1 = x
raisedToLinear 1 _ = 1
raisedToLinear x y = x * raisedToLinear x (y - 1)

-- Linear time w.r.t exponent. To make it efficient...
-- Divide and Conquer makes this logarithmic time
raisedTo :: Int -> Int -> Int
raisedTo _ 0 = 1
raisedTo x 1 = x
raisedTo 1 _ = 1
raisedTo x y
  | y `mod` 2 == 0 = sq * sq
  | otherwise      = x * sq * sq
  where sq = raisedTo x (y `div` 2)
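
By the way, the 'additional arguments instead of mutation' habit mentioned above looks like this: a tail-recursive sketch (the helper raisedToAcc is hypothetical, not part of the code above) where the accumulator plays the role of the mutable running product in an imperative loop.

raisedToAcc :: Int -> Int -> Int
raisedToAcc x y = go y 1
  where
    -- acc stands in for the variable a for loop would mutate
    go 0 acc = acc
    go n acc = go (n - 1) (acc * x)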


Here is another example where carrying forward problem-solving thinking from imperative languages can still help obtain an efficient solution in Haskell. It is an O(N)-time solution to the maximum subarray problem, the one every job applicant to the tech companies has learned to write in their sleep,



acf :: Int -> (Int, Int) -> (Int, Int)
acf e (rs, best)
  | rs + e <= 0 = (0, best)
  | otherwise   = (rs + e, max (rs + e) best)

maxsub :: [Int] -> Int
maxsub a = snd (foldr acf (0, 0) a)



It is the same well-known logic expressed in Haskell, except that the mutable variable that sums up incoming list elements is now handled as an accumulator while 'reducing' in a foldr. It's linear time all right, but what about space? Refer here for the implications of using foldr on memory requirements in a lazy setting.
Also observe how passing an 'accumulator' argument helps the function acf stay efficient. It uses the common technique of tupling the accumulator to carry both the best result so far and the ongoing running sum.
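
For completeness, an eager left fold sidesteps that foldr space concern. Here is a minimal sketch using foldl' from Data.List (the eager left fold we will meet again below); acf' and maxsub' are my hypothetical names, and note the flipped argument order a left fold needs:

import Data.List (foldl')

acf' :: (Int, Int) -> Int -> (Int, Int)
acf' (rs, best) e
  | s <= 0    = best `seq` (0, best)   -- reset the running sum
  | otherwise = b `seq` (s, b)
  where s = rs + e
        b = max s best

-- seq forces both components of the accumulator at every step,
-- so no chain of thunks builds up and space stays constant
maxsub' :: [Int] -> Int
maxsub' a = snd (foldl' acf' (0, 0) a)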

Well, it makes for a rough-and-ready solution that doesn't require too much knowledge of Haskell, making do with generic logical skill as far as possible. It still works well, as long as one is properly cognizant of the mistakes beginners usually make around lazy evaluation and space leaks.

However, the good Haskell books approach problems differently. They go strong on first defining the types involved, writing composable functions that each do one small thing, and then intelligently composing them. This can make a solution difficult to read and understand for someone fed on imperative algorithms, and leaves us wondering why all the paraphernalia is needed.

Under strict evaluation, even a pure language forces one to use explicit recursion to define functions for reasons of efficiency; it does not allow composing program logic out of simple library functions like folds, maps and filters without paying a performance cost.
Example,

-- With eager evaluation, code to select the first transformed element
-- of a list that passes a filter would be -
first :: (b -> Bool) -> (a -> b) -> [a] -> b
first p f xs
  | null xs = error "Empty List"
  | p x = x
  | otherwise = first p f (tail xs)
  where x = f (head xs)

-- However, under lazy evaluation, happiness looks like ...
first :: (b -> Bool) -> (a -> b) -> [a] -> b
first p f = head . filter p . map f
-- Under eager evaluation, first p f = head . filter p . map f would make us
-- evaluate the entire list instead of stopping as soon as the filter
-- produces the one item that head really needs.
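
Better still, the lazy composed version even works on an infinite list, since head demands only one element. A quick check of my own, run in GHCi:

ghci> first (> 100) (^ 2) [1..]
121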

Now, an example where seemingly innocuous code can deliver a painful bite when you least expect it,

sum = foldl (+) 0

-- If sum is defined as above,
sum [1..10000]
-- builds a chain of 10000 unevaluated additions (thunks) in memory
-- before it actually adds anything at all


If you are coming from a Scala background, foldl is known to be much more efficient than foldr: it eagerly applies the passed function to the elements of the list starting from the beginning, and so needs no extra space. Tail-recursively implemented functions were supposed to be memory efficient in a guaranteed manner; foldr was supposed to be the problem, since it must evaluate the list from the rightmost/last element onwards and thus consumes stack space.
In Haskell, even foldl doesn't cure all ills.
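
To see why, here is roughly how the lazy left fold unfolds on a tiny input (an informal reduction trace, not actual GHC output):

foldl (+) 0 [1,2,3]
= foldl (+) (0 + 1) [2,3]
= foldl (+) ((0 + 1) + 2) [3]
= foldl (+) (((0 + 1) + 2) + 3) []
= ((0 + 1) + 2) + 3
-- only now, once the whole chain of thunks has been built,
-- do the additions actually happen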

Space Leak ?
Confusing the term 'space leak' with the usual 'memory leak' leaves most C++ programmers dumbfounded when they first see space leaks mentioned as a problem in automatically garbage-collected Haskell programs. A space leak really refers to the fact that badly written Haskell code can end up consuming far more memory than it should need. Lazy evaluation can blow up memory requirements and can make what should be simple 'bread and butter' code genuinely tricky to get right. What is worse, the problem often doesn't show up until you pass in a really large input dataset and track the memory consumption closely, and runtime failures such as stack overflows can easily be triggered.
Here is how it can hurt,



-- We want to compute the simple average of a list of numbers ...
average :: [Double] -> Double
average [] = 0
average xs = sum xs / fromIntegral (length xs)
-- Has a rookie mistake: it consumes far more memory than it needs to
-- (the whole list stays live between the two traversals) and traverses
-- the list twice unnecessarily.


Of course, summing and calculating the length traverse the list twice, so we try to fix that with a single fold that computes both together. As explained here, we use an eager version of the left fold (foldl'), defined as follows,


foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f acc [] = acc
foldl' f acc (x:xs) = x `seq` acc `seq` foldl' f (f acc x) xs

sumlen :: [Int] -> (Int, Int)
sumlen = foldl' g (0, 0)
  where g (s, n) x = s `seq` n `seq` (s + x, n + 1)

-- a fresh definition of average ...
mean :: [Int] -> Double
mean [] = 0
mean xs = fromIntegral s / fromIntegral n
  where (s, n) = sumlen xs



The above piece of code, which calculates just an average, contains a good number of concepts to grok. First there is the specially made eager version of foldl, which is memory efficient and uses the primitive function seq; seq evaluates its left argument (to weak head normal form) and returns the right one as is.
foldl' is then used by another function, sumlen, so that the sum of the input numbers and the length of the list are calculated in one pass and in a constant amount of memory.
Just knowing about the happy lever named 'seq' that turns on eager evaluation is not enough! Since seq only evaluates down to the outermost constructor, forcing the accumulator pair does not force the numbers inside it; seq has to be applied thoughtfully inside sumlen to make sure both 's' and 'n' are eagerly evaluated.
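
For what it is worth, the same idea is often spelled with bang patterns these days. A sketch assuming GHC's BangPatterns extension and the library foldl' from Data.List (my version, not Bird's code):

{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

meanBang :: [Int] -> Double
meanBang [] = 0
meanBang xs = fromIntegral s / fromIntegral n
  where
    (s, n) = foldl' step (0, 0) xs
    -- the bangs force both components as soon as the pair is matched,
    -- which keeps the accumulator free of nested thunks
    step (!s', !n') x = (s' + x, n' + 1)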

Purity does benefit Haskell code too, sometimes. The runtime can cache and share a value when it does not depend on any function arguments, i.e. when the value is "bound" to a top-level definition itself (what Haskell folks call a constant applicative form).
In the following example, the sum of the first few prime numbers is calculated only for the first invocation; running the same code again reuses it,



sumOfPrimes :: Int -> Int
sumOfPrimes n = sum (take n primes)

primes :: [Int]
primes = [x | x <- [2..], divisors x == [x]]

divisors :: Int -> [Int]
divisors x = [d | d <- [2..x], x `mod` d == 0]

-- the following blazes through, because the top-level primes list is
-- computed just once and shared across all the calls...
map (\x -> x + sumOfPrimes 1000) [1..1000]

-- but the following code is unable to re-use the prime calculations and
-- crawls interminably: primes2 lives inside sumOfPrimes2, so it is
-- rebuilt on every single call,
sumOfPrimes2 :: Int -> Int
sumOfPrimes2 n = sum (take n primes2)
  where primes2 = [x | x <- [2..], divisors2 x == [x]]
        divisors2 x = [d | d <- [2..x], x `mod` d == 0]

map (\x -> x + sumOfPrimes2 1000) [1..1000]



Finally, read the following two versions of a function and work out which one is more efficient.
The function calculates all possible subsequences of a list, so for any list of length N it necessarily takes exponential time. Still, one of these functions does the same job better -



subseq1 :: [a] -> [[a]]
subseq1 [] = [[]]
subseq1 (x:xs) = subseq1 xs ++ map (x:) (subseq1 xs)

subseq2 :: [a] -> [[a]]
subseq2 [] = [[]]
subseq2 (x:xs) = sxss ++ map (x:) sxss
  where sxss = subseq2 xs

subseq2 is more efficient because it avoids the repeated work of computing subseq2 xs: the shared result sxss is computed once, whereas subseq1 computes subseq1 xs twice at every level of the recursion.
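
A quick sanity check in GHCi (the order of the results follows from the definition above):

ghci> subseq2 "abc"
["","c","b","bc","a","ac","ab","abc"]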
I learnt most of what I know about efficiency in Haskell programming from this excellent resource - Thinking Functionally with Haskell by Richard Bird. The writing is comprehensive in coverage, precise and systematic. I encourage you all to go and read it.