"Pearls of Functional Algorithm Design", by Richard Bird. Cambridge University Press. 2010.
Inspirational in it's succinct and clear way of presenting the material.
No wonder: It is a collection of some of the pearls published in the 20-years period (1990-2010) in the Journal of Functional Programming.
P1: Smallest free number -- Given a list of natural numbers, find the smallest number NOT in the list, in LINEAR time -- Array -- Easy.
-- We go over the list xs, and mark in an auxiliary array the numbers that are between [0..length xs]. Then, finding the firsy non-appearing number in this aux-array.
p1.hs (opens in new window).
P2: Surpassing numbers -- Given a list [a] of (Ord a), for each element define as 'surpasser count' the number of remaining elements to the right which are larger than it. Find the maximum surpasser count of a list in O(n*log n) -- Divide
and Conquer -- Medium.
-- We define data-strucutre (and operations) that will enable us to join (=merge) two sublists in linear time. The data base is an ordered list of the elements and their surpasser for each half-list, and then the join (of right and left
lists) can be done element-by-element in linear time.
P3: Saddleback search + Binary search goes 2D -- Given f(x,y)-> z over the Natural numbers. f(,) is strictly increasing in both its arguments.
Find all the pairs (x,y) such that f(x,y) = z_0. -- Medium.
-- Brute force is searching in the whole square of (0,0) to (z_0,z_0). Then, Saddleback goes over a 'line', stepping gently from the top-left along a column, and then jumps to the right when needed, and continues to the bottom right.
Like going on an iso-line and not loosing height. Then, the last method is a simple '2D binary search', in which we do binary search along row (or column), and then 'throw-away' two rectangels (top-right and bottom-left). This turns out
to be the most efficnet by far!
P4: Smallest K element in Union of sets -- Given two disjoint sets X and Y, each is sorted. Find the K-smallest element in their union.
Of course, O(K) solution is simple and direct. BUT, we can do it in O(log |X| + log |Y|), by using binary search and divide-and-conquer. -- Medium
-- Brute force is O(K). 5-lines of code. Then, using lists and split/merging according to binary search. Last, using arrays rather then lists, which implies just playing with indices rather than creating new lists.
Note: The first (maybe) incosistent-code I noticed in the book. This is in the part of working with array-indices rather than creating sub-lists (Maybe they somehow meant the function to be called from a different place). I left in
the code all debugging-printouts to show that in my code, there's perfect agreement between the list and array based solutions.
P6: Making a Century -- Given the digits 1..9, list all the ways the operations + and x can be inserted into the sequence so as to make the total sum 100.
100 = 12 + 34 + 5x6 +7 + 8 + 9
100 = 1 + 2x3 + 4 + 5 + 67 + 8 + 9
No parthneses, normal order of operations. -- Easy.
-- This is a brute-force problem, but the goal of the pearl is to establish some general formulation of 'Brute-Force' search, and improve on it with 'little' assumptions.
Two main things for brute-force methods:
1. Generating the 'values' while generating the 'candidates', can produce savings (Especially if generations of candidates can be 'sequential').
2. Reduce 'good' critertia to 'ok', it can help trim the generation of all sequences.
Note: As of now, didn't implement the 'ok' shortcut, as right now these ru nso fast anyway.
P9: Celebrity Clique -- Finding a "Celebrity Clique" in linear time -- Fusion, Graphs -- Hard.
-- Interesting problem: It is more efficient to *find* a solution assuming one exists, rather than *check* it is actually a solution!
Dijkstra -- Shortest path in a graph (no heuristics).
Needleman-Wunsch -- Global sequences alignment.
Smith-Waterman -- Local sequences alignment.
Clock -- Measuring execution time -- Easy. The goal is to time a ceratin function or program. Of course, there's difference between "wall-time" and "CPU time", and whether you want to count IO-Time as well (waiting for File I/O), multi-threads,
and so on.
System.CPUTime- Simplest. Coems with Base.
Clock package - Various options for 'Time' definition. Also using the Formatting package for nicer output.
Other possible packages : Timeit package, Criterion package.
foldr and fusion -- Simple example of fusing function on foldr -- Med.
Source code example: fusion1.hs.
Fusion is the idea of combining (or 'fusing') a few operations together in a manner that avoids intemediate data-structures or claculations. This is a very common practice in functional programming, where fusion can improve dramatically
Basic text-definition of fusion for foldr is: " f*foldr g a = foldr h b ", Provided f is strict, f a = b, and f (g x y) = h x (f y) for all x and y.
The fusion condition on "f strict" can be relaxed if we just want to use " f (foldr g a xs) = foldr h b xs ", for all finite sets xs.
A note about foldr: " foldr:: (a->b->b) -> b -> [a] -> b " : Takes the second argument, and the last item of the list, and applies the function. Then, takes the result and the pentumilate element of the list, and repeats.
For a more elaborated example, see the Celebrity Clique pearl (P9) in Bird's book.
Triangular numbers -- Fiding all triangular numbers.
Want to make sure I am not a spoiler, so am not linking to the original page and details.
The idea in most is to use some Math to make the coding (and solution) much simpler.
Update: Since first put over here, I started solving these serially (planning to clean the table, if possible), so removed it alltogether. If you want to discuss these, please reach out to me directly.