Colin Macleod's Random Ramblings

Tuesday, 26 April 2022

Excavations in the Midden-Heap of my Memory

And now for something completely different

My secondary school was Morgan Academy, Dundee. In my class there were two people who later became professional newsreaders - Susan Rae and Craig Millar - that can't be very common. Susan was already well-known as an award-winning singer. But I particularly remember Craig because it was through him and a friend of his whose name I can't remember now that I got to know most of the classic Monty Python sketches. My parents didn't approve of MP so we didn't watch it in our house. But the next day Craig and his friend would re-enact the routines during break time, so that was how I got to know the dead parrot sketch, the four Yorkshiremen, nobody expects the Spanish Inquisition, etc., etc.

Northern Lights

In my second year at St. Andrews University, I shared a flat with a few people including a certain Robert McNaught. To be honest he seemed a bit of a weirdo (which could probably have been said about me too). He was fanatically obsessed with practical astronomy. He would stay up most of the night, photographing the night sky, searching for meteors etc., then sleep in and miss his lectures. One night he woke us all up to tell us that there was a spectacular display of the Northern Lights; that's the only time in my life I have ever seen them. However he fell out badly with the academic astronomers; one time they accused him of coming into the university observatory when drunk and misusing their equipment. Eventually he graduated with a degree in Psychology. So recently I googled him and discovered that he is now considered the world's greatest comet discoverer. He lives in the Australian outback with a partner called Tanya and retired a few years ago after a "stellar" career in more senses than one. He has both an asteroid and a comet named after him!

I did learn something important from Robert. At one time he asked me about how he could approach doing some astronomical calculation by computer. This was long before the days of home computers, there were probably two or three computers in the whole University, and the kind of "user-friendly" interfaces we take for granted now were a distant dream. Robert was a clever guy and he knew what he wanted to do, but he was not a computer specialist and could not see how to achieve it. I don't think I managed to give him much practical help, but his question started me thinking about how computer systems could be made more accessible to a wider audience. That theme is something that has stayed with me ever since. I think it's a lesson that some of my colleagues would do well to learn; sometimes there is an attitude that if the programmer can solve the problem to his/her own satisfaction that is all that's needed, but that solution actually has little value if it can't be comprehended and used straightforwardly by other people.

The Dawn of Functional Programming

At St. Andrews I studied Maths, Physics and Computing; while poking around the computing lab I discovered the manual for a strikingly innovative programming language which had just been invented by one of their junior staff, David Turner - a year or two later he wrote his PhD thesis on it. I started experimenting with this and wrote various programs in St Andrews Static Language (SASL), which made a lasting impression on me. SASL was the first computer language to turn purely functional programming from a theoretical idea into a practical technique, and gave rise to a whole new area of computing. It then took a long time to move from academia into industrial practice, but is now recognised worldwide as an important alternative programming technique which has some downsides, but also some significant advantages.

What a load of old KACC

I later dropped out of university due to mental health problems. But I had caught the computing bug, so when home computers started to emerge a few years later I was keen to experiment with them. In those pre-internet days information about such things was spread by computer magazines and also by local clubs which sprang up all over the country. One such club was started in my home town of Dundee, and I soon got involved. This was the Kingsway Amateur Computer Club (KACC), which met at Kingsway Technical College. The club ran for quite a few years, I was its secretary at one time.

Some of the young lads who came along were fanatical about games (which never interested me much) and started writing their own, showing off their latest work each week at the club. A group who had met at the club then set up a company to turn this interest into a career. After a few moderately successful releases they hit the jackpot with "Lemmings" which sold in millions. A recent documentary looking back at the development of Lemmings can be seen at https://www.youtube.com/watch?v=RbAVNKdk9gA. I don't know of any other computer game which has been commemorated with a group of bronze statues.

The same people later developed the famous (or infamous) and highly influential "Grand Theft Auto" series of games. I learned recently that "Grand Theft Auto: San Andreas" includes a military installation called K.A.C.C., so they haven't forgotten their roots! Several other computer gaming companies were set up in Dundee by people I knew either through the club or from Dundee University, where I later worked. One of those is responsible for the Xbox and PlayStation versions of "Minecraft", another innovative and hugely influential game. All this activity eventually led to the other university in Dundee, Abertay, setting up the world's first degree course in computer game technology.

Apart from the gamers, there were quite a few club members, mostly older, who could see the potential of computers to assist them with their professions or hobbies. Since there were no off-the-shelf packages for such things, not even spreadsheets in the early days, they had to write their own programs from scratch. For example I remember one man who worked in electrical engineering and was writing programs for various calculations he needed to perform for that. I later got a job where part of my work involved professionally maintaining and enhancing some similar amateur-written software which had become indispensable to the operation of some local companies.

Early home computers had no disk drives. They could save and load programs on cassette tapes, but this was slow and not very reliable. Also there were few ready-made programs available, but the magazines would publish programs that had to be laboriously typed in. About half-way through one club session, one young member accidentally bumped the main power switch feeding the whole room, turning off all the computers there. Some of those present had just spent an hour typing in programs, only to have them wiped out before they had the chance to actually run them. They were not best pleased 😖.

The college used to run an annual open day or days, and the club would participate in that. We had an area with a few computers set up and we would chat to whoever came along. One year I got access to an early speech output unit that the college had bought, and I wrote a basic text-to-speech program to drive it. Rather naively we set this up for anyone to try at the open day, and of course the local youths got great amusement from typing in all the rude words and phrases they could think of and listening to it struggling to pronounce them.

Eventually the contacts I made through the club helped me get started on a professional computing career. But since I'm now retired I can say I have at last returned to Amateur Computing 😎.

Wednesday, 26 January 2022

Review of the Year 2021

Sadly, 2021 must be rated as Failing To Meet Expectations.

The target for 2022 is Must Improve. If the required standard is not met further action may be taken.

I'm afraid I neglected this blog throughout 2021, I hope to give it some more attention in 2022.

Like everyone (except perhaps our Prime Minister) my own and my family's lives were limited quite severely by the restrictions made necessary by the Covid-19 epidemic. Also I'm alternately outraged and depressed by the many ill effects of the continuing Brexit disaster, or the "Inglorious Revolution" as I like to call it.

My son John had a very bad year. He started 2021 ok-ish. In June we took John on holiday to the Isle of Wight, which we had done successfully a number of times before. But this time he reacted very badly, possibly because of some upsets on the journey and at the hotel there, though it's hard to be entirely sure. He started refusing to get into his car, and often refusing to go out at all. Unfortunately this continued after returning from the holiday and is still a problem, which made it impossible for him to engage in many of the activities he had previously seemed to enjoy. Then we heard that the owner of John's rented house had died and the heirs wanted to sell it, so he would need to move. Arranging new accommodation for John became a real headache, and this process is still going on now.

My wife Eleni has been very concerned about John's difficulties. But in July-August she managed to visit Greece for 7 weeks to catch up with family and friends and have a holiday. In September Eleni and I visited my mother in Scotland.

For myself, my mental health continues to be rather flaky. I had plans for various activities but have achieved very little. In the summer my brother Alasdair asked if I could write a program to design Scottish tartans, which is something he sometimes does as part of his job. I spent about a month on this and eventually got something working that he was happy with. It is now available for anyone to download from https://chiselapp.com/user/cmacleod/repository/tartaniser/index - screenshot:

When I retired in October 2020, I basically treated my time as a long holiday. This was fine for a while and perhaps necessary since I had been feeling quite exhausted, but eventually just slobbing around gets old. So now I'm aiming to follow a more work-like routine:

for weekdays, have a TODO list combining tedious but necessary jobs with things which are interesting but require some organisation to make progress with, and work through the list.
keep the weekends as unstructured down-time.

In summary, 2022 still has the potential to correct the faults of 2021, but it needs to get its act together, put its nose to the grindstone, and pull its finger out!

Happy New Year to all ✨

Monday, 21 December 2020

Winter Solstice Felicitations!

I wish you a very happy Winter Solstice/Christmas/New Year/(substitute your midwinter festival of choice) 🎈🎈🎈
Here are a few recent pictures -

Canvey Island Seafront:

We're all feeling the pinch these days - the Bentley dealership in High Barnet has closed:

The shining wit of Barnet:

Family selfie:

John with our Christmas Tree:

Wednesday, 18 November 2020

Fundis - FUNctional DIStributed processing

This is a technical post about a project I've been thinking about on-and-off for a while. The idea is to use Redis as the middleware for a heterogeneous network of processing agents. Clients would write requests for particular computations into Redis, servers would pick these up, do the computations and write the results back into Redis, from where the clients would pick them up. I haven't yet written any code for this, I'm just trying to clarify the design at present.

This could be seen as a more ambitious successor to "Fundep" - a little library I developed while working at Bloomberg to connect and control a network of FUNctional DEPendencies within a single process. "Fundis" aims to do a similar job, but working between multiple processes distributed over multiple machines.

Background

This project was motivated by problems I faced as a software engineer at Bloomberg, from where I recently retired. One might say that they are no longer my problems, but I found them interesting and spent quite a bit of time thinking about ways to overcome them, but never had the time to implement these ideas. So I'm reluctant to just abandon all these thoughts, and would still like to implement them in the hope that they may be useful somewhere.

In the early days, all the financial analysis/trading/management/etc functionality which Bloomberg provides its users was driven by a few huge monolithic executables running on huge monolithic "big iron" machines. This was actually pretty efficient, but very inflexible and hard to scale-up further. So the trend has been to move specific chunks of functionality into lots of different executables, each providing specialised services, and distribute these across an increasing number of more cost-effective and energy-efficient "commodity" machines. Bloomberg differs from other prominent providers of online services in that Google, Facebook, etc. provide a fairly narrow range of functionality on a very large scale, while Bloomberg provides thousands of different functions, each of interest to a different subset of their users. (To give an extreme example, I once had to implement some special logic in one function which was only expected to be used by a single client, and then only a few times each year - however that client was the central bank of a country, so the work was considered justified.) So the back-end of the Bloomberg system now consists of many thousands of different executables, each of which may be running tens or hundreds of instances spread over a cluster of machines dedicated to that particular area of functionality. Responding to a single user click on a button may involve chains of calls to dozens of different services.

Clearly, efficient communication between all these services is critical to keep the whole system responsive. I will not go into the details of how this works, it uses some proprietary protocols which always seemed rather heavyweight to me. What is relevant is that there can be major mismatches between the speed of different operations, and updates that need to be fast can be held up waiting for operations that take significant time. Sometimes this is simply unavoidable, but in many cases delays can be minimised by:

Caching results which may be needed more than once;
Pre-computing results which take time and are likely to be needed later;
Doing repeated similar operations in batches, thus saving communication and setup time;
Making requests to multiple services in parallel, as long as the operations are independent.

However these optimisations are often easier said than done. The natural way to write code tends to lead to making requests for remote data/operations as and when they are needed. Implementing any of the optimisations above requires some refactoring and more complex code, so it tends not to get done when the priority is to deliver a working system quickly.

Caching and pre-computing is sometimes done, e.g. with Redis, but extra code has to be written for this in each case. Note that caching within a client process or even one client machine is usually not ideal as the next request involving the same data may be served on a different machine.
Due to the volume and complexity of the existing code, which has often been updated by dozens of developers over tens of years, it can be quite hard to get enough of an overview to see clearly where batching is possible. Similarly, it can be very difficult to trace the interdependencies between sub-computations to see when it is safe to re-order them or do them in parallel.

All these forms of optimisation depend on decoupling the sequencing of calling computationally expensive remote operations from the sequencing of the code which consumes their results. So rather than making such remote calls directly from the consuming code, we need some infrastructure to manage these calls and store their results. Instead of building such infrastructure in ad-hoc fashion for each use, it seems worthwhile to create a generic infrastructure which can manage many such uses.

Note that if the sequencing of calls can be changed by the infrastructure, it is essential that such re-ordering has no side-effects. This implies that the remote calls must operate as pure functions, whose only action is to produce a result which depends only on their inputs. However if the result is expected to vary over time, we can add a timestamp or version number parameter to make this explicit.
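To make the point about pure functions concrete, here is a minimal Python sketch. The function name `fx_rate`, its parameters and the rate figures are all invented for illustration; the only idea taken from the text is that a time-varying result becomes safe to cache and re-order once the date (or version) is an explicit input.

```python
from functools import lru_cache

# Made-up sample data standing in for a remote source whose answers change over time.
_RATES_BY_DATE = {
    "2020-11-17": {("GBP", "USD"): 1.32},
    "2020-11-18": {("GBP", "USD"): 1.33},
}

@lru_cache(maxsize=None)
def fx_rate(base, quote, as_of):
    """Pure function: the result depends only on (base, quote, as_of)."""
    return _RATES_BY_DATE[as_of][(base, quote)]

# Because the date is an explicit input, the calls can be made in any order,
# or repeated, without changing any of the answers.
late = fx_rate("GBP", "USD", "2020-11-18")
early = fx_rate("GBP", "USD", "2020-11-17")
again = fx_rate("GBP", "USD", "2020-11-18")   # served from the cache
```

Without the `as_of` parameter the cache could not tell yesterday's answer from today's, which is exactly the kind of hidden dependency that makes re-ordering unsafe.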

Also, it greatly simplifies the client code if it can just ask for what it wants without needing to specify which server its request should be routed to. However in the Bloomberg environment the system for specifying how to route requests to the appropriate servers had become highly complicated, requiring considerable attention to configure and update correctly. I believe it should be possible to manage this in a way which is both simpler and more dynamic.

Proposal

Redis is often used simply as a cache, but the Redis home page describes its uses more broadly as "database, cache and message broker". I want to explore using Redis as a single integrated messaging and caching system which would handle all the communication between services in a distributed processing environment like that described above.

If we assume that processing results are going to be cached in Redis, we will need to have code to write and read input and output in the string key/value form which Redis supports. The key here needs to include (or at least depend on) all the relevant inputs, otherwise we will get false hits. So rather than re-encoding this same data in another format in order to call a remote service, we can use the Redis-compatible format to communicate with the remote service as well. The procedure would be:

1. Client formats the input parameters as a string that can be used as a Redis key.
2. Client queries Redis for this key, if found client gets the data in string form and decodes it.
3. If the key was not present in Redis, client requests it by writing the key to a Redis queue. Note that the query and request (when needed) can be done atomically by sending a Lua script to be executed on the Redis server.
4. Servers for this data will be monitoring this Redis queue, so one of them will pick up the requested key, do the necessary computation and write the input-key/output-value back to Redis.
5. Client then reads the result from Redis as it would have done at step 2 if it was already available.
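The procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FakeRedis` is an in-memory stand-in written for this example (it is not the real Redis client API), the key encoding is JSON with sorted keys, and the names `client_get_or_request` and `server_serve_one` are hypothetical. The query and the request are not atomic here, and the client returns `None` on a miss rather than waiting for the result.

```python
import json
from collections import deque

class FakeRedis:
    """In-memory stand-in for the few hash and list operations used below."""
    def __init__(self):
        self.hashes = {}
        self.lists = {}

    def hget(self, name, field):
        return self.hashes.get(name, {}).get(field)

    def hset(self, name, field, value):
        self.hashes.setdefault(name, {})[field] = value

    def rpush(self, name, value):
        self.lists.setdefault(name, deque()).append(value)

    def lpop(self, name):
        pending = self.lists.get(name)
        return pending.popleft() if pending else None

def make_key(params):
    # Canonical encoding: the same inputs always give the same key string.
    return json.dumps(params, sort_keys=True, separators=(",", ":"))

def client_get_or_request(store, func, params):
    """Client side: return the cached value, or queue a request and return None."""
    key = make_key(params)
    cached = store.hget(f"{func}:results", key)
    if cached is None:
        store.rpush(f"{func}:queue", key)
        return None
    return json.loads(cached)

def server_serve_one(store, func, compute):
    """Server side: take one requested key, compute it, write the result back."""
    key = store.lpop(f"{func}:queue")
    if key is None:
        return False                      # nothing waiting
    result = compute(**json.loads(key))
    store.hset(f"{func}:results", key, json.dumps(result))
    return True

store = FakeRedis()
params = {"a": 2, "b": 3}
first = client_get_or_request(store, "add", params)           # miss, request queued
served = server_serve_one(store, "add", lambda a, b: a + b)   # server computes 5
second = client_get_or_request(store, "add", params)          # hit
```

Note that the key doubles as the request message: the server decodes its inputs from the same string the client used for the lookup, so nothing has to be re-encoded in a second format.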

Data structure / granularity

For a first version, I would represent each function in Redis by one hash for the key/data pairs and one list for a queue of keys being requested. In a later version I would hope to support more sophisticated, possibly hierarchical structures. When a client wants the data for a key which has not yet been computed, it will RPUSH the key to the relevant queue. Each server will monitor the queues for the functions it supports with BLPOP - this ensures that each request will be processed by one and only one server.
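The "one and only one server" property can be illustrated without Redis at all. In this sketch Python's thread-safe `queue.Queue` plays the part of the Redis list, with `put` standing in for RPUSH and the blocking `get` standing in for BLPOP; the squaring "computation" and the server names are placeholders invented for the example.

```python
import queue
import threading

requests = queue.Queue()   # stands in for the Redis list of requested keys
results = {}               # stands in for the Redis hash of key/data pairs
handled_by = []            # (server name, key) pairs, kept only for inspection
lock = threading.Lock()

def server(name):
    while True:
        key = requests.get()          # blocks until a request arrives
        if key is None:               # sentinel value: shut this server down
            break
        with lock:
            results[key] = key * key  # placeholder for the real computation
            handled_by.append((name, key))

servers = [threading.Thread(target=server, args=(f"server-{i}",)) for i in range(3)]
for thread in servers:
    thread.start()
for key in range(10):
    requests.put(key)                 # each client request is queued once
for _ in servers:
    requests.put(None)                # one shutdown sentinel per server
for thread in servers:
    thread.join()
```

Which server handles which key varies from run to run, but every key appears in `handled_by` exactly once, because removing an item from the queue is atomic.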

Pre-computation

When a client knows that certain data is likely to be needed soon but not immediately, it could write requests for this data into a low-priority queue (represented as another Redis list). When a server is idle and has no work waiting in the main (high priority) queue it would serve requests from the low-priority queue, writing those results into Redis so that when the client later needs them they are immediately available.
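A minimal sketch of the two-queue policy, using plain Python deques in place of the two Redis lists. The function name `next_request` and the request labels are hypothetical. With real Redis, a single BLPOP call naming the high-priority list before the low-priority one should give much the same effect, since BLPOP checks its keys in the order given.

```python
from collections import deque

def next_request(high, low):
    """Pick the next job: the low-priority queue is served only when high is empty."""
    if high:
        return ("high", high.popleft())
    if low:
        return ("low", low.popleft())
    return None

high = deque(["h1"])
low = deque(["p1", "p2"])

order = [next_request(high, low), next_request(high, low)]
high.append("h2")                      # an urgent request arrives mid-stream
order.append(next_request(high, low))
order.append(next_request(high, low))
```

Here `order` ends up as h1, p1, h2, p2: the pre-computation work fills idle time but never delays a request that a client is actually waiting for.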

Common parameter data

Sometimes several different computations will require the same input data, e.g. info about a user such as full name, address, organisation, privileges, etc. Rather than passing each of these parameters individually to each function which needs them, the client could write a "user" record into Redis with all this info and then just pass each function a single identifier which enables the server to find this info.
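As a small illustration of the shared-record idea, the sketch below uses a plain Python dict in place of Redis. The `user:<id>` key layout, the field names and the two example functions are assumptions made up for this example.

```python
import json

store = {}   # plain dict standing in for Redis

def put_user(user_id, record):
    """Client writes the shared record once."""
    store[f"user:{user_id}"] = json.dumps(record)

def load_user(user_id):
    return json.loads(store[f"user:{user_id}"])

# Two unrelated server-side functions, each taking only the identifier.
def greeting(user_id):
    return "Hello " + load_user(user_id)["full_name"]

def can_trade(user_id):
    return "trade" in load_user(user_id)["privileges"]

put_user("u42", {"full_name": "A. N. Other",
                 "organisation": "Example Ltd",
                 "privileges": ["view", "trade"]})
message = greeting("u42")
allowed = can_trade("u42")
```

A side effect is that the identifier is short, so the keys built from function inputs stay compact even when the underlying record is large.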

Scaling

Note that, depending on load, extra instances of specific servers could not only be started or stopped on-the-fly, but could even be moved to different machines without needing any special routing configuration changes.
Redis itself could become a bottleneck but if necessary multiple Redis instances could be used, along with some scheme for sharding data across instances.
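One simple sharding scheme, sketched here as an assumption rather than a recommendation, is to hash each key and take the result modulo the number of Redis instances. The instance names are placeholders. A stable hash from `hashlib` is used because Python's built-in `hash()` of a string can differ between processes, which would send clients and servers to different instances.

```python
import hashlib

def shard_for(key, n_shards):
    """Map a key to a shard index that is the same in every process."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards

instances = ["redis-a", "redis-b", "redis-c"]   # hypothetical instance names
key = 'add:{"a":2,"b":3}'
chosen = instances[shard_for(key, len(instances))]
```

The weakness of plain modulo sharding is that changing the number of instances remaps most keys; consistent hashing, or the slot-based scheme built into Redis Cluster, reduces that churn.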

Side benefits

This system has the side effect that requests to services and their replies are automatically recorded in Redis. Retention times may need to be tuned depending on the storage space available. But this data can then be inspected and monitored by other tools for debugging, testing, system health checks etc.

Next steps

If anyone finds this interesting or has feedback, please post a comment. I hope to start prototyping this scheme soon, and will post any results here.

Update 17/5/24 - after a ridiculously long delay, I have now made a Tcl implementation of this idea, see https://wiki.tcl-lang.org/page/DisTcl+%2D+Distributed+Programming+Infrastructure+for+Tcl.

Wednesday, 4 November 2020

More Autumn Photos

Not much time to post anything as we are now in Family Lockdown No. 2. Every Monday evening my son John goes to an activity group run by Resources for Autism which he enjoys. But a week ago they emailed to say that one of their staff who was at this group had been confirmed to have coronavirus, so everyone who had attended the group, including John, needs to self-isolate for two weeks from that Monday, i.e. until Monday 9th November. Since it's difficult to do any real isolation while John is at his usual Supported Living house with a whole team of staff coming and going, my wife and I have been looking after him back at our family house for the last week.

I have been able to get out for a walk now and then, so here are some pictures taken in and around Hadley Wood (the actual wood that is, not the suburb where the Porsches have stickers saying "My other car is a Bentley" 😄) :

Wednesday, 28 October 2020

Odds and Sods

What the ****! I started this blog thinking I would be speaking to maybe a dozen people. At the weekend I posted my usual grumbles about the general lack of appreciation for my favourite programming language Tcl. My former colleagues at Bloomberg must have been fed up of hearing me repeating this stuff. Anyway, someone I don't know linked my post on "Hacker News" and 24 hours later it had 15000 views😵. Any time my ego needs a boost I can reread this comment!

Trying to improve your mood by thinking about it can seem rather like trying to make your car move by pushing on the dashboard (a phrase I read recently in a completely different context). Just getting out for a walk can be much more effective. And it turns out that walking around my area High Barnet is so interesting they made a film about it - see trailer at 23 Walks, also background info. Most of the locations in the trailer are extremely familiar to me, even the council office which appears briefly looks like one my wife and I have visited several times for meetings with social services about our son's care.

But isn't it about time the scriptwriters for the dystopian future drama we seem to be living in decided to lighten up a bit? Surely it was enough to have half the world ruled by mad dictators and would-be dictators, impending environmental catastrophe, the UK tearing itself apart, without adding a world-wide killer virus on top as well? Perhaps in the next episode Steven Pinker will assure us that all is well?

Some handy points I picked up from Pinker's book How the Mind Works, paraphrased somewhat:

Love is the state of mind where the well-being of another person becomes as important as your own (not very romantic, but it works for me).
The conflict between logic and emotion is bogus, because logic tells you how to do things but not what to do, while emotion tells you what to do but not how to achieve it (in theory of course, in practice this conflict still often seems problematic).
It's really not surprising that human beings can be obsessive and/or highly sensitive about almost anything even remotely related to sex. For the "selfish genes" which ultimately shape our behaviour, whether and with whom we have sex is quite literally a matter of life or death, determining which of those genes live on in the next generation.

The "Keep Calm and Carry On" attitude has a lot to answer for 😕. Of course if you're in the middle of a crisis, you have to focus on the immediate practicalities of the situation and emotional reactions may be luxuries you can't afford. But ignoring these upsets doesn't necessarily mean they go away. In computing terms, the various alarm signals that go off are flagged as high priority, so if they can't be handled at the time they get queued for later processing. If the crisis is intense or prolonged (such as struggling to care for a disabled family member in parallel with a demanding job), it may never be possible to process this queue. But it's also never possible to entirely ignore it and if the queue of deferred alerts continues to build up, its pressure will eventually start to disrupt one's normal functioning. So for the benefit of one's long-term mental health, a better policy may be the classic "When in danger or in doubt, run in circles, scream and shout" 😲.

I used to be rather dubious about Julian Assange and WikiLeaks. But after following reports of his recent extradition hearings, e.g. from the Independent, the indefatigable Craig Murray, etc., I have started to think that he is being "railroaded" for the crime of shining a light into dark places. Also since I tend to get most of my news from the Guardian, I am seriously disturbed by the allegation that the Guardian betrayed Assange after getting a lot of copy out of his earlier revelations.

Finally, since 40+ years ago I was diagnosed as having a schizoid personality, I leave you with the appropriate theme music.

Saturday, 24 October 2020

Why I'm Tcl-ish

I'm a big fan of programming in Tcl, the "Tool Command Language", although it is distinctly out-of-fashion these days. When I have the freedom to choose, I tend to use Tcl for anything that doesn't need to run at maximum possible speed (and probably C++ for anything that does).

One of my colleagues at Bloomberg once asked when I would give up writing utilities in such an ancient language as Tcl and update myself to something more contemporary like Python. I should perhaps have replied "I find your lack of faith disturbing" but I just said something lame to the effect that such an "update" would make me less productive 😉.

Over my 47-year involvement with computing, at various times I have been enthusiastic about several different programming languages:

St. Andrews Static Language - the first practical implementation of a pure functional programming language anywhere, which I just happened to get the chance to use in 1975-6.
Modula-2 - a very clean, predictable, understandable conventional algorithmic language.
Prolog - the classic PROgramming in LOGic language, yet another fundamentally different yet consistent paradigm.
Perl - quite the opposite of all the above, a very "hacky" language based on practicality not purity, great for solving certain types of problems quickly, but really not scaling up nicely at all.
Tcl - "Tool Command Language", for me this hits the "sweet spot" between all of the above.

Programmers who like Tcl tend to think of it as being clean, logical and consistent. However the majority tend to reject it, complaining about "quoting hell" and various awkwardnesses which basically come down to it being too different from what they are used to. Really Tcl has a radical minimalism which makes it genuinely different from the common patterns that most programming languages follow.

Most programming languages blend syntax and semantics. Each language construct (e.g. if-then-else for conditional execution) has individual rules for how it is written (syntax) and how it operates (semantics). The language definition as a whole includes all of these specific elements of syntax and semantics.

In contrast, the essence of Tcl is a very small and simple core which defines only how to define and use variables, data values, commands in general, and events. The only syntactic rules are those which define how to invoke a generic command and pass data to and from it. These are documented at man Tcl; there is no special syntax for specific commands. All functionality is defined as the semantics of individual commands. Flow control is done by commands which take other commands as their arguments. So if-then-else functionality is provided by a command called "if" whose arguments are the condition to test, the code to execute when the condition is true, and optionally the code to execute when the condition is false.

This design can be cumbersome in some ways. For example, the core has no syntax for arithmetic expressions; this is delegated to the command expr, which the programmer has to explicitly invoke in various places where some calculation is needed.

However this division of concerns creates a unique flexibility. Commands can be created or redefined on-the-fly. To give an extreme example, it's perfectly possible to redefine the "if" command to reverse its logic. More constructively, before Tcl added built-in commands for object-oriented programming, many people exploited the language's flexibility to make their own support for object-orientation.
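For readers who don't know Tcl, the idea of "if" being an ordinary, replaceable command can be imitated in Python. This is only an analogy, not Tcl itself: the `commands` table and the `run` helper are invented for the example, and lambdas stand in for the unevaluated code that Tcl passes to a command as arguments.

```python
commands = {}

def cmd_if(condition, then_body, else_body=None):
    """An ordinary function playing the role of the 'if' command."""
    if condition():
        return then_body()
    return else_body() if else_body is not None else None

commands["if"] = cmd_if

def run(name, *args):
    # Every construct is invoked the same way: look up the command, pass its arguments.
    return commands[name](*args)

normal = run("if", lambda: 1 < 2, lambda: "yes", lambda: "no")

# Redefine "if" on the fly so that its logic is reversed.
def reversed_if(condition, then_body, else_body=None):
    return cmd_if(lambda: not condition(), then_body, else_body)

commands["if"] = reversed_if
flipped = run("if", lambda: 1 < 2, lambda: "yes", lambda: "no")
```

The caller's code is identical both times; only the entry in the command table changed, which is the same mechanism people used to add their own object systems to Tcl.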

I suspect this modular design has also enabled Tcl to evolve more smoothly. Since it was originally designed, Tcl has incorporated many innovations (caching of optimised internal representations for code and data; unicode support; multi-threading; coroutines; fully-virtualised filesystem operations; decoupling of versioning for language extensions, etc.) with almost no disruption for existing running code, something which Python still struggles with.

I should say that Lisp has many of the same attributes that I'm claiming for Tcl. One difference is that historically, Lisp systems tended to be conceived of as a universe of their own, with little regard for interoperation with anything else. Tcl on the other hand started life as an extension language intended to be embedded in other software, and so has strong support for integrating with other systems on multiple levels.

Finally we have the cross-platform GUI (Graphical User Interface) support provided by Tk. This can be used from other languages, but is most closely integrated with Tcl. For an example of the kind of handy but lightweight tools that can easily be put together with the Tcl/Tk combination, see Diskusage.