Loops vs Recursion: a LISP example

Most programmers are comfortable with loops. I think that is because loops are a necessary flow construct in most imperative programming languages. But when doing high level programming, recursion has several key advantages:

  • logic is more precise and short
  • code is more expressive
  • there is no or minimal overhead where potential bugs could occur
  • logic is isolated
  • reading code doesn’t require evaluating loops in your head while remembering previous states of variables

Yet it is still argued that loops are, on occasion, superior where programmer has a choice. One common example seems to be the Fibonacci function and one such case is “Ansi Common Lisp” by Paul Graham. Here’s the quote from the book:

The other issue to bear in mind is that the obvious recursive algorithm is not always the most efficient. The classic example is the Fibonacci function. It is defined recursively,

  1. Fib(0) = Fib(1) = 1.
  2. Fib(n) = Fib(n – 1) + Fib(n – 2).

but the literal translation of this definition,

(defun fib (n)
  (if (<= n 1)
      1
      (+ (fib (- n 1))
         (fib (- n 2)))))

is appallingly inefficient. The same computations are done over and over. If you ask for (fib 10), the function computes (fib 9) and (fib 8). But to compute (fib 9), it has to compute (fib 8) again, and so on.

Here’s an iterative function that computes the same result:

(defun fib (n)
  (do ((i  n (- i 1))
       (f1 1 (+ f1 f2))
       (f2 1 f1))
      ((<= i 1) f1)))

The iterative version is not as clear, but it is far more efficient. How often does this kind of thing happen in practice? Very rarely – that’s why all textbooks use the same example – but it is something one should be aware of.

The iterative version is “not as clear?” That’s a severe understatement. Try reading it – at first glance it’s just a bunch of gibberish. It takes reading it over and over few times to understand the logic. Pay attention to how f1 is defined in terms of f2; and then f2 is defined in terms of f1. Then it takes another massive wave of thought to evaluate the loop several times in your head to validate its logic. Are you sure there’s no bug, or you haven’t left out an edge case? It’s not as easy to tell. And this function is just adding 2 numbers. Imagine what happens when you actually need to do something more complex! But you don’t have to imagine – if you’re reading this post, you probably have written loops that are a lot worse than this.

Paul Graham is right that the recursive implementation as he provided is very inefficient. But by giving such argument for the iterative logic, it gives readers a false sense of understanding that in these types of cases one should always pick an iterative solution. This bar then becomes completely arbitrary, and most programmers, because of aforementioned existing habits, write iterative logic.

Instead, we should write Fibonacci function in recursive format that is both expressive, concise and efficient. Many programming books I read concentrate on the factorial function. So I thought I’d give a shot to writing a Fibonacci function without looking it up, especially since I’m new to and still learning Lisp. So this is initially what I came up with:

(defun fib (n)
  (fib-raw n 1 1 1))

(defun fib-raw (n f0 f1 cur)
  (cond ((eql n 0) 1)
        ((eql n 1) 1)
        ((eql cur n) f1)
        (t (fib-raw n f1 (+ f0 f1) (+ cur 1)))))

The fib function provides the interface, and the main computation is happening in the fib-raw function. The fib-raw function takes 3 extra arguments: f0 and f1 are previous 2 Fibonacci calculations; and cur is the current number we’re on. n remains the number we’re trying to get to.

Once these arguments are understood, it is really painless to read the function:

  • if n is 0 or 1 the answer is 1
  • if current number is n (the number we want) the answer is the last computed Fibonacci number – f1
  • otherwise, re-run the same logic, but increment cur and add 2 previous Fibonacci numbers to compute a new one

It also closely follows the original definition provided in the book. Looks good, right? It’s easy to read and efficient too.

But wait, we can make this simpler, because Lisp supports optional arguments with default values:

(defun fib (n &optional (f0 1) (f1 1) (cur 1))
  (cond ((eql n 0) 1)
        ((eql n 1) 1)
        ((eql cur n) f1)
        (t (fib n f1 (+ f0 f1) (+ cur 1)))))

So now it’s just one function and the 3 extra arguments are optional.

But why stop there? Looking at it in this form, we can see that the initial default values of arguments f0 and f1 act like seeds to the function, so we might as well use them for the 0 and 1 cases:

(defun fib (n &optional (f0 1) (f1 1) (cur 1))
  (cond ((eql n 0) f0)
        ((eql n 1) f1)
        ((eql cur n) f1)
        (t (fib n f1 (+ f0 f1) (+ cur 1))))

Thinking about this further, what are 0 and 1 really? They are arbitrary basis upon which the rest of the computations are done. There’s no need to have them hard-coded in the function – we are already supplying their results, we might as well supply their values:

(defun fib (n &optional (f0 1) (f1 1) (n0 0) (n1 1) (cur 1))
  (cond ((eql n n0) f0)
        ((eql n n1) f1)
        ((eql cur n) f1)
        (t (fib n f1 (+ f0 f1) n0 n1 (+ cur 1))))

Here I added n0 and n1 optional arguments to specify base values. They default to 0 and 1, but now we can specify any base values when we run it.

This last change allows us to make the computation even more efficient by specifying a larger base. For example, if we knew all our requests are going to be higher than 20, we can seed the function with a known Fibonacci number of 20 and 21 like this:

(defun fib (n &optional (f0 10946) (f1 17711) (n0 20) (n1 21) (cur 1))
  (cond ((eql n n0) f0)
        ((eql n n1) f1)
        ((eql cur n) f1)
        (t (fib n f1 (+ f0 f1) n0 n1 (+ cur 1))))

And now all our computations start at the base of 20 and 21, while all the logic stays exactly the same.

Further, I noticed on Wikipedia page about Fibonacci number that the definition provided by Paul Graham that equates to this sequence:

1, 1, 2, 3, 5, 8…

is a classic one. Wikipedia claims the more modern sequence is defined as:

0, 1, 1, 2, 3, 5, 8…

Notice the latter starts with 0 instead of 1. But this is trivial for us. Similar to how we changed the base, instead of giving the default result of 1 for f0, we’ll give 0:

(defun fib (n &optional (f0 0) (f1 1) (n0 0) (n1 1) (cur 1))
  (cond ((eql n n0) f0)
        ((eql n n1) f1)
        ((eql cur n) f1)
        (t (fib n f1 (+ f0 f1) n0 n1 (+ cur 1))))

And now our computations start from 0, because we like to be modern, of course.

Feel free to attempt to make similar changes to the iterative version and see how it turns out. Can you make changes this simply? Can you reason about your logic with the same clarity? Can you isolate your logic – the problem you are actually solving – better from the overhead of the imperative style? This is why functional style recursion is more practical and makes more sense.

Also, keep in mind I took an example from the book that claimed this to be one of the very few cases where loops were superior. I believe I demonstrated they are not, even in this rare case. This should give you additional hints as to almost every other case where you use loops.

There’s more to recursion than this. Consider that your environment should support tail call optimization before seriously using it – this is a technique that prevents stack frames getting added to the stack with each recursive call. You should also learn how to write tail-recursive functions. Furthermore, if your environment or libraries provide tools for higher order functions, learn how to make use of them. Common functions such as map and fold (aka reduce) provide even easier and more expressive ways to define your logic.

Finally, this is not, by any means, meant as a tool to crusade against loops. I just wanted to demonstrate to myself for sanity’s sake, and others that might stumble upon this post, that when programming in a high level language, functional style recursion almost always makes more sense than imperative loops.

How to Set Up Erlang Common Test for Code Coverage

Common Test code coverage and usage is documented here and here, but:

  • there are no examples given
  • documentation fails to state what the minimum required configuration is to turn on code coverage
  • the implementation has no sane defaults and by default code coverage is disabled

So, here’s the minimum sane default:

  • create a cover.spec file in your application directory with the following contents
    {incl_app, my_app, details}.
    {excl_mods, my_app, [my_app_SUITE]}.

    where my_app and my_app_SUITE are names of your application and the test suite(s) you have
  • when running using ct_run, make sure to
    • either specify -cover cover.spec option; OR
    • add this line to your main CT spec file
      {cover, "cover.spec"}.

That’s it! Next time you run CT you should have a nice code coverage report generated with it.

Oh, also, it’s probably worth mentioning that I couldn’t get erlang.mk to run ct_run with the -spec option, so I manually added it in the CT_RUN command.

Elastic Applications in Erlang

How realistic or useful are elastic applications in general, or specifically in Erlang? To demonstrate, I start with Joe Armstrong’s favorite Erlang program – the universal server:

universal_server() ->
    receive
        {become, F} ->
            F()
    end.

Once spawned, it sits and waits until it gets an instruction to become something. Then it becomes that thing it’s told to be.

First, this is really cool. Second, we should expand a bit:

  • make the wrapping process loop
  • optionally make the internal function loop
  • if internal function receives messages, make it respect an “abort” message which would make it stop executing and return the control back to its parent

So, blah blah – all of the above is only few lines of code in Erlang. I’ll call this module gen_node and the gen_node.erl file is as follows:

-module(gen_node).
-export([start/0]).

start() ->
    receive
        {become, F} ->
            F(),
            start();
        reset ->
            start();
        stop ->
            ok
    end.

and then the way we invoke it would be:

1> c(gen_node).
{ok,gen_node}
2> N = spawn(gen_node, start, []).
<0.40.0>

OK, the process is spawned and waiting, let’s give it something to do. Let’s define 2 functions: one that doubles the given number, and another that squares it. They both communicate via messages:

3> Double = fun Double() ->
    receive
      reset -> ok;
      {From, Args} -> From ! Args+Args, Double()
    end
   end.
#Fun<erl_eval.44.106461118>
4> Square = fun Square() ->
    receive
      reset -> ok;
      {From, Args} -> From ! Args*Args, Square()
    end
   end.
#Fun<erl_eval.44.106461118>

Now, let’s tell the spawned process to become “Double”, then reset it and tell it to become “Square”:

5> N ! {become, Double}.
{become,#Fun<erl_eval.44.106461118>}
6> N ! {self(), 16}.
{<0.33.0>,16}
7> flush().
Shell got 32
ok
8> N ! reset.
reset
9> N ! {become, Square}.
{become,#Fun<erl_eval.44.106461118>}
10> N ! {self(), 16}.
{<0.33.0>,16}
11> flush().
Shell got 256
ok

So, this is cool because now we have a generic computing node that we can tell to transform into any arbitrary processor and communicate to it via messages.

Now, finally, let’s wrap this up in an OTP application using gen_server so the worker processes are also supervised:

https://github.com/unix1/gen_node

Now the interaction becomes:

1> Double = fun Double() ->
    receive
      {_, reset} -> ok;
      {From, Args} -> From ! Args+Args, Double()
    end
   end.
#Fun<erl_eval.44.106461118>
2> Square = fun Square() ->
    receive
      {_, reset} -> ok;
      {From, Args} -> From ! Args*Args, Square()
    end
   end.
#Fun<erl_eval.44.106461118>
3> application:start(gen_node).
ok
4> {ok, N, _} = gen_node:start_server().
{ok,<0.42.0>,#Ref<0.0.0.45>}
5> gen_node:become(N, Double).
ok
6> gen_node:send(N, 16).
32
7> gen_node:reset(N).
ok
8> gen_node:become(N, Square).
ok
9> gen_node:send(N, 16).
256

It worked!

So, what’s next?

This is obviously just a concept code. To be usable something needs to track the states and types of nodes, and something else to intelligently control what they are doing and how the behavior should adapt. Possibilities could include:

  • predicting application-specific demand and adjusting resources: e.g. queue X is filling up, while queue Y processes are waiting too long – so make an adjustment in real-time
  • defining capacities of systems by giving different weights to each type of operation, or calculating resources needed to make certain computations
  • more fully using existing resources, or distributing load where the right resources are available

Questions: is this applicable and interesting? Boring? Already been done? Do you have any ideas which direction this should take?

Any constructive feedback is fully welcome.

PHP Sessions in Erlang Mnesia

Motivation

  • create a very simple first Erlang/OTP application
  • link to conventional web development

I decided one of the simplest things to do would be to create a session storage for PHP in Mnesia. But that, in doing so, I would also create an extremely simple key value store wrapper around Mnesia which would track access times.

Disclaimer

Do NOT use this in production, or anywhere where it’s important. If you do, you’re crazy because:

  • it has not been tested in production or production-like environment
  • as I mentioned, this is my first Erlang application
  • connection is made to a single Erlang node, and there is no failover mechanism if that fails
  • session garbage collection is not implemented

Good Stuff

OK, the disclaimer is out of the way, let’s get to the good stuff! Here’s what I did and how.

Tools used

  • Erlang runtime and development headers
  • PHP >=5.3 (>=5.4 preferred) binary and development headers
  • Apache
  • mypeb
  • kvstore
  • lib360

Architecture

[ PHP ] <–> [ mypeb ] <–> [ Erlang ] <–> [ kvstore ] <–> [ Mnesia ]

Instructions

  • download and install prerequisites as best done in your environment
    • Erlang runtime and development headers
    • Apache
    • PHP >=5.3 binary and development headers
  • download and compile mypeb
    • git clone git://github.com/videlalvaro/mypeb.git
    • cd mypeb
    • phpize
    • ./configure
    • make
    • sudo make install
    • php -m | grep peb
    • Note 1: the last command is a test to verify peb module loads successfully
    • Note 2: you might need to specify --with-erllib and --with-erlinc options with ./configure command if they are not automatically found
  • download, compile and run kvstore
  • download lib360
    • git clone git@github.com:unix1/spoof.git
  • add sample PHP script to your web server and test everything
    • download sample gist
    • replace the path in require_once() to wherever you downloaded lib360
    • replace the your-erlang-cookie-here with the contents of ~/.erlang.cookie file
    • access the file through the web server and have fun!

Missing

These are things that I thought of but didn’t bother implementing that I might someday add:

  • make kvstore more configurable and easy to install/start/stop
    • kvstore currently fails to stop (you have to abort)
    • it could use rebar and more automated scripts instead of manual commands
    • support for configurable multiple table names
    • make access time tracking optional
    • implement other storage backends
  • account for failover when primary node connection fails from PHP
  • PHP session garbage collection
  • implement kvstore access through lib360 database layer

Suggestions/Contributions/Feedback

If you have any feedback, feel free to give it via github, blog comments, etc. as appropriate. Enjoy!

How To Add QML Module with Plugins for Qt Creator

I decided to take a plunge into a KDE Plasma development with a simple plasmoid with Qt Creator. There’s a very good overview and a guide

However, you will notice that when you do this in Qt Creator

import org.kde.plasma.core 0.1 as PlasmaCore

you will get an underline and the following error

QML module not found

followed by import paths and some vague tip about QML_IMPORT_PATH and importPaths property. What this means is that you need to explicitly declare the import path for KDE components in your *.qmlproject file. Specifically, directly in/under the Project container add the following property

importPaths: [ "/usr/lib64/kde4/imports" ]

or whatever the case is for your specific environment. Now switching back to your .qml file you will notice that the red underline is still there but the message is different when you hover over it

QML module does not contain information contained in plugins

First, the QML can be run at this point. Second, this is because KDE has not included plugins.qmltypes files with their Plasma components. To generate them yourself you can run a command

qmlplugindump org.kde.plasma.core 0.1 /usr/lib64/kde4/imports > /usr/lib64/kde4/imports/org/kde/plasma/plugins.qmltypes

substituting your paths, if different. You’ll note that your user probably doesn’t have privileges to write to that directory – so either run the command as root or generate the file in another directory and then copy it.

After refreshing/reloading the project in Qt Creator you will see the red underline has disappeared and Qt Creator is in a happy state. You can use this method with other components with missing plugins.qmltypes files; and use this with your custom C++ components as well.

Simple PHP Object Oriented Framework

This marks the initial release of Simple PHP Object Oriented Framework – spoof on codefly.org. Well, I intend for this to be a framework, but I’m starting with the extensible core – the main class library. To that end, the first crack is at:

  • autoload
  • data access components

Autoload is a simple implementation for spl_autoload_register:

  • one class per file
  • \namespace1\subnamespace\MyClass automatically turns into [root_dir]/namespace1/subnamespace/MyClass.php
  • you never have to use require_once again

Database components are a more significant undertaking, as they are designed to be language, connection and execution agnostic.

Finally, currently both PHP 5.2 and 5.3+ are supported. However, I am making a full use of PHP 5.3 namespaces which are not backward compatible. I would expect to continue work on 5.3+ version only.

Please check it out and feedback is more than welcome.