
Monday, May 4, 2020

Efficiency pitfalls in Haskell

This is my third note on programming in Haskell with performance-centric commentary. The first two are here and here. In this post, I hope to consolidate in one place several best practices and gotchas regarding runtime performance in Haskell, at an elementary level.

When I first learned Haskell, I tended to solve any problem involving speed with the same old imperative tricks and techniques, just with the 'for loop' replaced by a map, aggregating actions replaced by a fold and, of course, explicit recursion. If some data had to be mutated, I would introduce additional arguments to the function involved, to bypass the need for mutation.

Example,

raisedToLinear :: Int -> Int -> Int
raisedToLinear _ 0 = 1
raisedToLinear x 1 = x
raisedToLinear 1 _ = 1
raisedToLinear x y = x * raisedToLinear x (y - 1)

-- Linear time w.r.t exponent. To make it efficient...
-- Divide and Conquer makes this logarithmic time
raisedTo :: Int -> Int -> Int
raisedTo _ 0 = 1
raisedTo x 1 = x
raisedTo 1 _ = 1
raisedTo x y
    | y `mod` 2 == 0 = sq * sq
    | otherwise      = x * sq * sq
    where sq = raisedTo x (y `div` 2)
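A quick sanity check in GHCi; the call counts follow directly from the two definitions:

-- ghci> raisedToLinear 2 10
-- 1024   (10 calls, decrementing the exponent each time)
-- ghci> raisedTo 2 10
-- 1024   (4 calls in all, halving the exponent each time)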


Here is another example where carrying over the problem-solving thinking from imperative languages can still yield an efficient solution in Haskell. It is an O(N)-time solution to the maximum subarray problem, the one that every job applicant to the tech companies has learned to write in their sleep,



acf :: Int -> (Int, Int) -> (Int, Int)
acf e (rs, best)
    | rs + e <= 0 = (0, best)
    | otherwise   = (rs + e, max (rs + e) best)

maxsub :: [Int] -> Int
maxsub a = snd (foldr acf (0, 0) a)



It is the same logic expressed in Haskell, except that the variable which sums up incoming list elements is now threaded through as an accumulator while 'reducing' in a foldr. It's linear time all right, but what about space? Refer here for the implications of using foldr on memory requirements in a lazy setting.
Also observe how passing an 'accumulator' argument helps acf stay efficient. It uses the common technique of tupling the accumulator so that it carries both the best result so far and the ongoing running sum.
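For comparison, here is a minimal sketch of the same fold done with the strict left fold foldl' from Data.List (the name maxsub' and the flipped argument order of the step function are mine). The seqs force the tuple components so that no thunks pile up inside the accumulator, a point this post returns to later:

import Data.List (foldl')

maxsub' :: [Int] -> Int
maxsub' = snd . foldl' step (0, 0)
    where
      step (rs, best) e
          | rs + e <= 0 = rs `seq` best `seq` (0, best)
          | otherwise   = rs `seq` best `seq` (rs + e, max (rs + e) best)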

Well, this makes for a rough and ready solution that does not demand too much knowledge of Haskell, making do with generic logical skills as far as possible. It works well enough, as long as one is properly cognizant of the mistakes beginners usually make around lazy evaluation and space leaks.

However, good books on Haskell approach problems differently. They go strong on first defining the types involved, writing small composable functions that each do a single thing, and then intelligently composing them. That style can make a solution difficult to read for someone fed on imperative algorithms, and it leaves us wondering why the paraphernalia is needed; a taste of it follows below.
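As that taste, here is a sketch of the same maximum subarray sum built purely from library functions (the name maxsubC is mine): scanl produces the running 'best sum ending here', clamped at zero exactly as acf does, and maximum picks out the answer.

maxsubC :: [Int] -> Int
maxsubC = maximum . scanl (\rs e -> max 0 (rs + e)) 0
-- scanl emits every intermediate accumulator, starting with 0,
-- so maximum sees the best clamped running sum at each position.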

Under strict evaluation, even in pure languages, one is pushed toward explicit recursion when defining functions, for reasons of efficiency. Strictness does not allow composing program logic out of simple library functions like folds, maps and filters without paying a performance price.
Example,

-- With eager evaluation, code to select the first mapped element of a list that passes a filter would be -
first :: (b -> Bool) -> (a -> b) -> [a] -> b
first p f xs
    | null xs   = error "Empty List"
    | p x       = x
    | otherwise = first p f (tail xs)
    where x = f (head xs)

-- However, under lazy evaluation, happiness looks like ...
first :: (b -> Bool) -> (a -> b) -> [a] -> b
first p f = head . filter p . map f
-- Under eager evaluation, this same definition would make us
-- evaluate the entire list instead of stopping at the first
-- item passing the filter, which is all that head really needs.
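A quick way to convince yourself that the lazy version terminates even on an infinite list (a GHCi check that follows from the definitions):

-- ghci> first (> 10) (^ 2) [1..]
-- 16
-- map and filter produce only as much as head demands.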

Now, an example where seemingly innocuous code can deliver a painful bite when you least expect,

sum = foldl (+) 0

-- If sum is defined as above,
sum [1..10000]
-- consumes 10000 units of memory before it actually adds anything at all
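To see why, here is the chain of thunks that foldl builds (shown for a shorter list) before any addition happens; nothing is added until the whole chain is demanded at the end:

--   foldl (+) 0 [1, 2, 3]
-- = foldl (+) (0 + 1) [2, 3]
-- = foldl (+) ((0 + 1) + 2) [3]
-- = foldl (+) (((0 + 1) + 2) + 3) []
-- = ((0 + 1) + 2) + 3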


If you are coming from a Scala background, you know foldl to be much more efficient than foldr: it eagerly applies the passed function to the elements of the list starting at the beginning, so it consumes no extra space, and tail-recursive functions were supposed to be memory efficient in a guaranteed manner. foldr was supposed to be the problem child, since it requires evaluating the list from the rightmost/last element and thus consumes stack space.
In Haskell, even foldl doesn't cure all ills.

Space Leak ?
Confusing the term 'space leak' with the usual 'memory leak' leaves most C++ programmers dumbfounded when they first see space leaks mentioned as a problem in automatically garbage collected Haskell programs. A space leak refers to the fact that badly written Haskell code can end up consuming a lot more memory than should be needed. Lazy evaluation can blow up memory requirements and can make writing what should be simple 'bread and butter' code really tricky to get correct. What is worse, the problem doesn't show up until you pass in a really large input dataset and track the memory consumption closely; runtime failures such as stack overflows are then easily triggered.
Here is how it can hurt,



-- We want to compute the simple average of a list of Doubles ...
average :: [Double] -> Double
average [] = 0
average xs = sum xs / fromIntegral (length xs)
-- Has a rookie mistake: it consumes far more memory than it needs to,
-- and traverses the list twice.


Of course, the summing and the length calculation traverse the list twice, so we try to fix that by computing both together in one fold. As explained here, we use an eager version of the left fold (foldl'), defined as follows,


foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f acc [] = acc
foldl' f acc (x:xs) = x `seq` acc `seq` foldl' f (f acc x) xs

sumlen :: [Int] -> (Int, Int)
sumlen = foldl' g (0, 0)
    where g (s, n) x = s `seq` n `seq` (s + x, n + 1)

-- a fresh definition of average ...
mean :: [Int] -> Double
mean [] = 0
mean xs = fromIntegral s / fromIntegral n
    where (s, n) = sumlen xs



The above piece of code, which calculates just an average, contains a good number of concepts to grok. First there is the specially made eager version of foldl, which is memory efficient and uses the primitive function 'seq'. seq evaluates its left argument (to weak head normal form) and returns the right one as it is.
foldl' is then used by sumlen so that the sum of the input numbers and the length of the list are calculated in one pass, using a constant amount of memory.
Just knowing about the happy lever named 'seq' that turns on eager evaluation is not enough ! seq needs to be applied thoughtfully inside sumlen too, to take care that both 's' and 'n' are eagerly evaluated, as the sketch below shows.
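Here is why the seqs inside g matter; a sketch of a variant without them (the name gLeaky is mine). foldl' forces the accumulator pair only to weak head normal form, i.e. just to the (,) constructor, so the components are free to grow as thunks:

-- gLeaky (s, n) x = (s + x, n + 1)
-- After three elements, the accumulator is the pair
--   (((0 + x1) + x2) + x3, ((0 + 1) + 1) + 1)
-- The pair itself is in WHNF, but each component is a chain of
-- thunks, which is the space leak all over again.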

Purity does benefit Haskell code too, sometimes. The runtime is able to cache the value of a top-level definition that takes no arguments, since, depending on nothing, it can never change.
In the following example, the sum of the first few prime numbers is actually calculated only on the first invocation; running the same code again reuses the cached list,



sumOfPrimes :: Int -> Int
sumOfPrimes n = sum (take n primes)

primes :: [Int]
primes = [x | x <- [2..], divisors x == [x]]

divisors :: Int -> [Int]
divisors x = [d | d <- [2..x], x `mod` d == 0]

-- the following blazes through...
map (\x -> x + sumOfPrimes 1000) [1..1000]

-- but the following code is unable to re-use the calculation of primes and crawls interminably,
sumOfPrimes2 :: Int -> Int
sumOfPrimes2 n = sum (take n primes2)
    where primes2 = [x | x <- [2..], divisors2 x == [x]]
          divisors2 x = [d | d <- [2..x], x `mod` d == 0]

map (\x -> x + sumOfPrimes2 1000) [1..1000]



Finally, read the following two versions of a function and work out which is more efficient.
The function calculates all possible subsequences of a list, so for any list of length N it has exponential complexity. Still, one of these functions does the same job better -



subseq1 :: [a] -> [[a]]
subseq1 [] = [[]]
subseq1 (x:xs) = subseq1 xs ++ map (x:) (subseq1 xs)

subseq2 :: [a] -> [[a]]
subseq2 [] = [[]]
subseq2 (x:xs) = sxss ++ map (x:) sxss
    where sxss = subseq2 xs

subseq2 is more efficient because it prunes the repeated work of calculating subseq2 xs by reusing the shared result sxss; subseq1 makes two recursive calls per element where subseq2 makes only one.
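A quick check of the output in GHCi (the ordering falls straight out of the definition):

-- ghci> subseq2 "abc"
-- ["","c","b","bc","a","ac","ab","abc"]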
I learnt most of what I know about efficiency in Haskell programming from an excellent resource - Thinking Functionally with Haskell by Richard Bird. The writing is comprehensive in coverage, precise and systematic. I encourage you all to go and read it.

Saturday, April 25, 2020

Folds unfolded


I wish that we, Haskell programmers, never fall into the ditch that Alan Perlis had imagined when he quipped that a functional programmer is someone who knows the value of everything and the cost of nothing. Haskell is a compiled language and can run at speeds approaching C++. I do not want to take up learning a new language merely for the promise of elegant and short expression, but for comparable run time performance too. The ability to write concurrent processing code with no more difficulty than a single threaded program should be great for improving runtime performance. Over and above that, it should also help deliver software in less development time and with definitely fewer bugs. Haskell, as a pure functional programming language, promises all of that.

However, I would be far less obsessed with Haskell if I couldn't find a way to create software with superb runtime performance. Haskell can show surprising runtime behavior: it can shoot you in one foot and blow up the other one. For larger pieces of code, it can be tricky to predict runtime performance because of lazy evaluation and non-strict function arguments. I ran into such problems while studying a very basic Haskell function that helps 'reduce' a list or some complex structure into a single value - "fold" and its variants foldl and foldr. An excellent introduction to the folds, with comments on efficiency, is written here. This write-up adds to that wiki's explanation. Lazy evaluation is known to consume a lot of memory while holding references to pending computations, so understanding how to use folds is critical.

To understand folds, it is necessary to also understand the two concepts of lazy evaluation and non-strict function arguments. Both of them are quite unusual for any programmer born and bred in C++ or Java.

Lazy evaluation - Haskell evaluates expressions lazily, meaning it does no processing until and unless the programmer needs to print the result to the console, save it to the database or some such thing. Further, Haskell will only evaluate to the extent the printing requires and no more. So it is possible to declare and manipulate an infinitely sized list of objects in Haskell, as long as you require processing only a finite number of them -

sumOf10Primes = sum (take 10 primes)
primes = [x | x <- [2..], divisors x == [x]]
divisors x = [d | d <- [2..x], x `mod` d == 0]
-- Dutifully prints 129

last [1..]
-- gone forever !
Non-strict functions - The lazy evaluation described previously is made possible by the non-strict nature of some of the functions involved. The function 'take' stopped touching its second argument the moment it was done returning 10 numbers: 'take' does not evaluate its second argument strictly, though it is strict about the first one. Basically, a function that can be passed 'undefined' as one of its arguments and still sometimes complete successfully is non-strict in that argument.
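A couple of one-liners to make this concrete; both follow from the standard Prelude definition of take:

take 2 (1 : 2 : undefined)
-- [1,2] ; the rest of the list is never looked at

take undefined [1, 2, 3]
-- blows up ; take must examine its first argument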
Folds - Finally, coming to the pitfalls that await a novice programmer using foldl or the foldr variant. The following piece of code, taken from this wiki, exhausts available RAM -
foldl (+) 0 [1..100000000]
-- Sum to 100M, are you crazy ? At least on my Ubuntu with GHCi ver 7.10.3
-- bloats memory, freezes GHCi for several seconds even after a ^C...

foldr (+) 0 [1..100000000]
-- foldr doesn't help either !
-- A stack overflow exception is seen fairly quickly.

Well, both foldl and foldr failed, but in apparently different ways. foldl allocated memory for its so-called thunks - an unevaluated chain of function applications piled up in memory. foldr exhausted the stack while doing constructor replacement, it not being tail recursive. A different implementation of the fold, foldl', described in the wiki, works well and finishes in under 2 seconds, even for 100M numbers -
foldl' f z []     = z
foldl' f z (x:xs) = let z' = z `f` x in seq z' $ foldl' f z' xs

sum' = foldl' (+) 0
Here, foldl' worked by suppressing lazy evaluation (or if you prefer, by forcing eager evaluation) through 'seq', a primitive function; otherwise, foldl' is very much like foldl. What about foldr? It works from the opposite end of the list and maintains the initial order of the list in case the output of the fold is another list. Also, foldr can short circuit evaluation under certain conditions, whereas foldl is condemned to process the entire list before returning anything, always. This means foldr may be used to process an infinite list but foldl never can. There are some intricacies to how the 'seq' function works, and it won't behave as expected if used incorrectly, but that is not the topic for now.
Here is an example where short circuiting with foldr makes a helpful difference.
foldr (\e a -> if mod e 10 == 0 then 0 else (mod e 10) * a) 1 [1..10000000]
-- returns 0 instantly.
-- Short circuiting based on non-strict evaluation of the second argument
-- of the lambda function.
-- It works only because of the way the lambda is implemented.

foldr (\e a -> (mod e 10) * a) 1 [1..10^7]
-- Even foldr does NOT work well here.
-- The lambda "uses" both its arguments.
Recommendations - So here we are left with three possible choices whenever we want to fold or reduce some list or a complex structure.

Use foldl' when - performing a constant-memory operation on inputs where short circuiting is not going to work out. For possibly infinitely sized inputs, consider using foldr.

Use foldr when - almost always, if foldl' is not usable. If inputs are infinitely sized, or short circuiting will lead to better performance, this is the only option.

Use foldl when - almost never ! It may be needed if there is some possibility that the input arguments are undefined and their evaluation can be skipped thanks to the lazy behavior of the function passed to foldl.

Consider this concocted example -
mult _ 0 = 0
mult x y = x * y

foldl mult 1 [2, 3, undefined, 5, 0]
-- works and prints 0

foldl' mult 1 [2, 3, undefined, 5, 0]
-- blows up, since eagerly evaluating the accumulator forces undefined.
Haskell being lazy and garbage collected, and I being someone from a C++ background, I personally prefer to use functions that have predictable and low memory consumption. The strictly evaluating foldl' serves such a purpose well. Unless of course, there is a good reason to use foldr or even the god forsaken foldl.

Saturday, January 18, 2014

Beginning with Haskell

I just gave myself permission to spend some time learning a new programming language, and here are my notes on the initial effort. I bet everyone who reads this has it on their radar to learn some functional programming language. I chose Haskell. Judging by the experiences of others who attempted the same thing with vastly more neural firepower than me, maybe that is a brave decision !

So, yes, Haskell is daunting from the outside, but I am only first checking whether it is really possible to make any progress without being an expert in Category Theory or Type Systems ... After all, Haskell is supposed to be very general purpose. Can it give me a leg up by letting a humble "Hello World" come out easy ?

Here is one of the first programs I attempted : calculating the 'n'th Fibonacci number.

With Haskell, this is a first attempt,

fib n
    | n == 0 = 0
    | n == 1 = 1
    | n > 1  = fib (n - 1) + fib (n - 2)


So terse and clean. I could challenge someone to write a specification of the Fibonacci program in plain English in fewer characters. Even without any introduction to the syntax, we can see through this affair easily, as if someone had just written the pseudo code. Of course, it performs as horribly as a naive C++ program following the same pattern ! Calculating the 30th Fibonacci number takes upwards of 5 seconds, and if someone asks for the 60th, this program won't return before Halley's comet does. Exponential algorithmic complexity can bite hardest, even with Haskell.
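To see where the time goes, expand a small call tree by hand:

-- fib 5
-- = fib 4 + fib 3
-- = (fib 3 + fib 2) + (fib 2 + fib 1)
-- fib 3 is computed twice, fib 2 three times, and the duplicated
-- work roughly doubles with every increment of n.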

So here is a result after more hours of reading this fantastic introductory online book named "Learn You a Haskell" :

fib_improved n = fst $ fib_helper n (0, 1)

fib_helper 0 (a, b) = (a, b)
fib_helper n (a, b) = fib_helper (n - 1) (b, a + b)


Somehow, this is just as short as the earlier version but is super fast. Asking for the 100th Fibonacci number instantly gets me this result : "354224848179261915075" !

The last line of this code tells the whole story: it uses recursion to iterate over a "tuple" repeatedly, transforming the tuple in the process. Probably this textual explanation does a far worse job (at explaining) than what the single line of Haskell makes obvious. The first two lines are just special cases and presentation.

The patterns I see so far are twofold :
1. Recursion is cool, but algorithmic considerations are paramount. Haskell, with its principle of immutable data, presents additional twists to the efficient coding game. We need to worry about unnecessary copying (done to implement data immutability), and about not blowing up the stack even while using recursion.
2. One technique for making things efficient in Haskell is to pass additional arguments to function calls and "build" the answer through them, instead of first calling the function and then processing the data it returns. That is what the second version of the Fibonacci program does differently from the first; the sketch below shows the same pattern in its simplest form.
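To isolate the pattern, here is a minimal sketch of the same accumulator idea applied to factorial (the names factorial and go are mine):

factorial :: Integer -> Integer
factorial n = go n 1
    where
      go 0 acc = acc
      go k acc = go (k - 1) (k * acc)
-- go carries the partial product along in acc, instead of
-- multiplying after the recursive call returns.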

Encouraging !