"Pearls of Functional Algorithm Design", by Richard Bird. Cambridge University Press. 2010.
Inspirational in it's succinct and clear way of presenting the material.
No wonder: It is a collection of some of the pearls published in the 20-years period (1990-2010) in the Journal of Functional Programming.
P1: Smallest free number -- Given a list of natural numbers, find the smallest number NOT in the list, in LINEAR time -- Array -- Easy.
-- We go over the list xs, and mark in an auxiliary array the numbers that are between [0..length xs]. Then, finding the firsy non-appearing number in this aux-array.
p1.hs (opens in new window).
P2: Surpassing numbers -- Given a list [a] of (Ord a), for each element define as 'surpasser count' the number of remaining elements to the right which are larger than it. Find the maximum surpasser count of a list in O(n*log n) -- Divide
and Conquer -- Medium.
-- We define data-strucutre (and operations) that will enable us to join (=merge) two sublists in linear time. The data base is an ordered list of the elements and their surpasser for each half-list, and then the join (of right and left
lists) can be done element-by-element in linear time.
P3: Saddleback search + Binary search goes 2D -- Given f(x,y)-> z over the Natural numbers. f(,) is strictly increasing in both its arguments.
Find all the pairs (x,y) such that f(x,y) = z_0. -- Medium.
-- Brute force is searching in the whole square of (0,0) to (z_0,z_0). Then, Saddleback goes over a 'line', stepping gently from the top-left along a column, and then jumps to the right when needed, and continues to the bottom right.
Like going on an iso-line and not loosing height. Then, the last method is a simple '2D binary search', in which we do binary search along row (or column), and then 'throw-away' two rectangels (top-right and bottom-left). This turns out
to be the most efficnet by far!
P4: Smallest K element in Union of sets -- Given two disjoint sets X and Y, each is sorted. Find the K-smallest element in their union.
Of course, O(K) solution is simple and direct. BUT, we can do it in O(log |X| + log |Y|), by using binary search and divide-and-conquer. -- Medium
-- Brute force is O(K). 5-lines of code. Then, using lists and split/merging according to binary search. Last, using arrays rather then lists, which implies just playing with indices rather than creating new lists.
Note: The first (maybe) incosistent-code I noticed in the book. This is in the part of working with array-indices rather than creating sub-lists (Maybe they somehow meant the function to be called from a different place). I left in
the code all debugging-printouts to show that in my code, there's perfect agreement between the list and array based solutions.
P6: Making a Century -- Given the digits 1..9, list all the ways the operations + and x can be inserted into the sequence so as to make the total sum 100.
100 = 12 + 34 + 5x6 +7 + 8 + 9
100 = 1 + 2x3 + 4 + 5 + 67 + 8 + 9
No parthneses, normal order of operations. -- Easy.
-- This is a brute-force problem, but the goal of the pearl is to establish some general formulation of 'Brute-Force' search, and improve on it with 'little' assumptions.
Two main things for brute-force methods:
1. Generating the 'values' while generating the 'candidates', can produce savings (Especially if generations of candidates can be 'sequential').
2. Reduce 'good' critertia to 'ok', it can help trim the generation of all sequences.
Note: As of now, didn't implement the 'ok' shortcut, as right now these ru nso fast anyway.
P9: Celebrity Clique -- Finding a "Celebrity Clique" in linear time -- Fusion, Graphs -- Hard.
-- Interesting problem: It is more efficient to *find* a solution assuming one exists, rather than *check* it is actually a solution!
Dijkstra -- Shortest path in a graph (no heuristics).
Needleman-Wunsch -- Global sequences alignment.
Smith-Waterman -- Local sequences alignment.
Clock -- Measuring execution time -- Easy. The goal is to time a ceratin function or program. Of course, there's difference between "wall-time" and "CPU time", and whether you want to count IO-Time as well (waiting for File I/O), multi-threads,
and so on.
System.CPUTime- Simplest. Coems with Base.
Clock package - Various options for 'Time' definition. Also using the Formatting package for nicer output.
Other possible packages : Timeit package, Criterion package.
foldr and fusion -- Simple example of fusing function on foldr -- Med.
Source code example: fusion1.hs.
Fusion is the idea of combining (or 'fusing') a few operations together in a manner that avoids intemediate data-structures or claculations. This is a very common practice in functional programming, where fusion can improve dramatically
Basic text-definition of fusion for foldr is: " f*foldr g a = foldr h b ", Provided f is strict, f a = b, and f (g x y) = h x (f y) for all x and y.
The fusion condition on "f strict" can be relaxed if we just want to use " f (foldr g a xs) = foldr h b xs ", for all finite sets xs.
A note about foldr: " foldr:: (a->b->b) -> b -> [a] -> b " : Takes the second argument, and the last item of the list, and applies the function. Then, takes the result and the pentumilate element of the list, and repeats.
For a more elaborated example, see the Celebrity Clique pearl (P9) in Bird's book.
Triangular numbers -- Fiding all triangular numbers.
Problem 14 -- Longest Collatz sequence: The following iterative sequence is defined for the set of positive integers:
n -> n/2 (n is even) n -> 3n + 1 (n is odd) Using the rule above and starting with 13, we generate the following sequence:
13 -> 40 -> 20 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1. Which starting number, under one million,
produces the longest chain?
Problem 15 -- Lattice Paths Starting in the top left corner of a 2x2 grid, and only being able to move to the right and down, there are exactly 6 routes to the bottom right corner. How many such routes are there through a 20x20 grid?
Problem 16 -- Power digit sum 2^15 = 32768 and the sum of its digits is 3 + 2 + 7 + 6 + 8 = 26. What is the sum of the digits of the number 261000?
Problem 17 -- Number letter counts If the numbers 1 to 5 are written out in words: one, two, three, four, five, then there are 3 + 3 + 5 + 4 + 4 = 19 letters used in total. If all the numbers from 1 to 1000 (one thousand) inclusive were
written out in words, how many letters would be used?
NOTE: Do not count spaces or hyphens. For example, 342 (three hundred and forty-two) contains 23 letters and 115 (one hundred and fifteen) contains 20 letters. The use of "and" when writing out numbers is in compliance with British usage.
Problem 18 -- Maximum path sum I By starting at the top of the triangle below and moving to adjacent numbers on the row below, the maximum total from top to bottom is 23.
2 4 6
8 5 9 3
That is, 3 + 7 + 4 + 9 = 23. Find the maximum total from top to bottom of the triangle below (in the text file)
NOTE: As there are only 16384 routes, it is possible to solve this problem by trying every route. However, Problem 67, is the same challenge with a triangle containing one-hundred rows; it cannot be solved by brute force, and requires
a clever method! ;o)
Problem 22 -- Names scores: Using names.txt (right click and 'Save Link/Target As...'), a 46K text file containing over five-thousand first names, begin by sorting it into alphabetical order. Then working out the alphabetical
value for each name, multiply this value by its alphabetical position in the list to obtain a name score. For example, when the list is sorted into alphabetical order, COLIN, which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the
list. So, COLIN would obtain a score of 938 × 53 = 49714. What is the total of all the name scores in the file?
Problem 25 -- 1000-digit Fibonacci number: The Fibonacci sequence is defined by the recurrence relation: Fn = Fn-1 + Fn-2, where F1 = 1 and F2 = 1. The 12th term, F12, is the first term to contain three digits. (the
7th is the first with 2 digits). What is the index of the first term in the Fibonacci sequence to contain 1000 digits?
Problem 28 -- Number spiral diagonals Starting with the number 1 and moving to the right in a clockwise direction a 5 by 5 spiral is formed as follows (see in the code file).
It can be verified that the sum of the numbers on the diagonals is 101.
What is the sum of the numbers on the diagonals in a 1001 by 1001 spiral formed in the same way? -- Easy.