A++ [Eric Torreborre's Blog]: programming

Showing posts with label programming. Show all posts

11 March 2009

How we decide

For once, let's think about something else than programming. Or will we ;-)?

The bookshop

Two weeks ago, going back home after some consulting, I arrived early at the airport with a few dollars in my pocket, and I decided to stop by the book shop to have a look at some refreshing mental food.

Haha, I wish that was so easy! I had a first round, looking at almost all the books, then I narrowed down my choice to a few books, maybe 2 or 3 and started thinking really hard.

"Do I prefer a short and small book?"

"Should I take the most interesting?"

"Should it rather be entertaining? I should relax"

"Do I have enough cash? If not, is that worth withdrawing more money?"

"Do I rather to read a more political book?"

"Do I need a book at all?"

All this thinking (I'm not kidding, I spent at least 20 minutes) convinced me that this was the book I should buy: How we decide.

Feelings can be a good decision-maker

This book was for me a very good follow-up on Damasio's book Descartes' error where we learn, through different stories, the role of emotions in the decision making process. Like the story of that guy having a brain injury and not feeling any more emotions.

One morning, coming to Damasio's lab he explained everyone how he's been able to stay very calm in his car, avoiding a deadly car accident due to the iced road that morning. After the session, as he was about to leave, the doctor asked him to select the best date for the next appointment. That emotionless guy stood there, weighting for 30 minutes all the possibles options that were open on his schedule. The team just wanted to strangle him for not being able to make up his mind!

This kind of story is one of many showing the power of emotions or, you could say, the power of the unconscious decision circuits. Most of other Jonas Lehrer examples are taken from games, whether decisions must be taken fast, like baseball or american football, or whether decisions are very complex, like chess or backgammon.

In all those cases, Jonas Lehrer tells us that we have "dopamin learning circuits" which are directly connected to the reward system. Some neurons register when "it's good", when "it works", while others register the surrounding context and the consequences of our actions. The net result is that we progressively get a "feeling", an "intuition" of what works and what doesn't.

For example, Garry Kasparov has a global feeling of what is a good strategic position on the chessboard. Joe Montana had an instinctive feeling of when was the best moment to make a pass on the field. We must also note that both of them have spent a lot of time thinking hard about how they failed, how they made mistakes.

Is there anything we can use out of this for programming?

We should fail early, fail often and try to learn as much as possible from our failures:

by reducing the develop-test cycle so that we have very frequent feedbacks of when changes are breaking stuff

by using retrospectives/post mortems to analyse what is right when working as a group of people on a project

by scrutinizing the code when it hurts. We have to change 100 lines of code all over the place for a seemingly innocent evolution. That hurts, doesn't it? Let's use this as a reinforcement for extreme refactoring. Next time we have to decide if we should refactor or not, the decision should feel obvious

This is also a good justification for code elegance. Code is elegant when it's easy to read, modify, reuse and getting that feeling helps making all the micro-decisions when writing code:

naming
formatting
refactoring
abstracting
checking, validating

Dont always trust your feelings, Luke

Our "dopamin circuits" are trained to recognize success or failure patterns and give us a good feeling of what the next decision should be. Unless we totally fool ourselves.

For example, we're easily fooled by randomness, thinking we understand how probabilities work. In 1913, in a casino at the Roulette wheel, the Black color came out 26 times in a row! Can you imagine the number of people who put all their money on the White after the tenth or fifteenth time,... Even if you're reminded that each draw is independent from the previous one, wouldn't you bet all your money too?

That's not the only situation where we should make better use of our reason. Behavioral science has now plenty of evidence showing how limited our rationality is:

we sometimes just don't learn from failing patterns. In some crisis situations, when emotions are high, we keep doing the wrong thing despite any adverse evidence. It's just impossible to think out of the box.

we're generally very bad at understanding probabilities. For example, if there are 10 lottery tickets at 10$ and the reward is 200$, there's no reason not to play. But if we learn that we buy one ticket while another guy/gal buys the remaining 9 tickets we usually find the situation unfair and we don't play!

we're subject to "framing issues" and "loss aversion". Will you prefer your doctor to say that you have 20% chances to die or 80% chances to live?

The solution here is to use our frontal precortex and really try to take rational decisions knowing our natural biaises and weaknesses.

Is there anything we can use out of this for programming?

Programming is a highly rational activity, but not always. Let's not forget to use our frontal precortex when:

we have "impossible" bugs. Especially under pressure, we're can sometimes be completely stucked on a bug just thinking: "this is impossible!!!! I'm doing A, I should have B!!!!". In this situation the best is to stop and brainstorm. Try to come out with as many ideas as possible. Then carefully select and experiment each of them. There's no magic in programming, things happen for a reason, so let's try to find it fast!

we're tuning for performance. Performance tuning should be as scientific as possible. If I have the feeling that X could be slowing down my system, I shouldn't tweak anything unless I have real evidence that X is the culprit

we're interacting with others and our ego gets in the way

Too much information

Rationality= good. Too much rationality = bad.

Yes, this not obvious but for example, too much information is not the necessary ingredient for a good decision:

some clients are tasting and comparing strawberry jams. They do a first ranking of their preferences. Now we ask them to think about why they like them and the rankings are completely changed! I have the feeling that this is related to Condorcet's paradox.

numerous studies suggest that "experts" may not be good predictors of what's really going to happen in their field (well, also because they can be biaised by their own opinions)

before MRI, doctors were sending patients with back pains to rest for a few days. Now they can see inside the body they suggest surgery and medecine. Even to well-being patients!

Is there anything we can use out of this for programming?

Focus!

when tuning performance, it is very easy to be completely overwhelmed by data. I would suggest taking one "red flag" only (like a display issue) and making a thourough analysis of that only issue until it is completely explained, ignoring what's not relevant

lean programming suggests that accumulating huge backlogs of features and issues may not be the best and most way to make decisions. One reason is that we're very sensitive to irrelevant alternatives.

Conclusion: Emotions vs Reason?

I really learned something with that book. When too many rational criteria come up to my mind on which book I should buy, I'm just losing my time and most probably I am going for a bad choice. I'd rather let my emotions drive me and learn instinctively what a good decision is for me (talk about an obsessional guy,...).

In conclusion, the debate is not such much "emotions or reason" but more why and how we use both of them to make our decisions.

15 October 2008

Commando programming

When you program for a trader you don't have time to write tests

I've been wondering for some time if the quote above, or a variant like: "you don't have time to refactor your code", was really true or not. I generally bug my colleagues when they don't write tests along with their code and I usually prophetize that they will actually spare time by writing tests instead of losing time.

Unfortunately I don't have strong evidence if this is true or not. The only thing I can say is that almost every time I've tried to take shortcuts in my developments, I've been bitten by very silly bugs and regretted my recklessness!

Anyway, what would you do if you had to code at the speed of light for a drug-addict trader (and they really need drugs those days,...)? What kind of practices would you adopt to go faster? How would you prepare yourself for those situations?

With those questions in mind, I was happy to be recently involved in a one week project where the biggest part of my job was to do some "Commando programming" to create the necessary tools to support the project.

In this post, I'll do my own retrospective of that project and try to highlight the points which I think are decisive in the context of "Commando programming". Actually most of the "Ding!" points (does that ring a bell to you?) below are applicable to any programming situation. The only difference is that there's no time for preparation in a "Commando" situation. You have to be seriously fit for the job.

This post is too long, that's the Steve Yegge syndrom,... So here's a summary of the take-away points, so you can get a first impression if it's worth reading or not:
Ding! Know your destination
Ding! Be a fast code-reader
Ding! Know at least one scripting language really well
Ding! Know your platform ecosystem
Ding! Don't rewrite anything
Ding! Take careful risks and prototype
Ding! Have at least one high-level functional test
Ding! Test the difficult stuff
Ding! Isolate the file system
Ding! Use the console
Ding! Have lots of sample data
Ding! Time is money, concepts are time
Ding! PPP: Practice, Practice, Practice!!!
Ding! Cool down and document/review your code
Ding! Cool down and take a break

The project: go see your doctor, now!

In my company we've setup a methodology and a set of tools to assess the performance of our customers deployments and one of them asked us to come over for one week to deploy this so-called "HealthCheck" process. After the first initial interviews, we determined that one very essential objective of the project was to reduce the time to save a trade from 1.2 seconds to less than 500 ms. Very clear and quantifiable objective, that's a good start!

Ding! Know your destination

First point here, this may look totally obvious but still it needs to be said: the overall user objective must be very clear for everyone.

This point is important for at least 4 reasons:

It's hard to say that you're going fast if you don't know where you're going!
The fastest development in the world is the one you don't do because you don't need it
It helps a lot, as seen later, to have a clear overall objective when negotiating with yourself or with your customer the next micro-task to develop
Developing fast is not the alpha and omega of productivity. Seeing the project from its overall objective can make you realize that saving other people's time may actually more important than saving your own

Reading the thermometer

Assessing the performance of our systems involves parsing and analyzing lots of different log files: client requests, server requests, database requests, sql execution times, workflow times, garbage collection times,... The scripts we started working with were shell scripts doing the following:

Reading and managing log files, including archiving old versions
Parsing the files and extracting the time spent for a given "quantity" (client request, SQL call,...)
Computing small statistics: minimum / average / maximum time
Creating graphs in order to be able to spot "spikes" that we will be able to label as "Red flags"

The scripts we had were doing the job, but were pretty slow considering the amount of data we had to process. More than one hour could be spent just running the scripts on a subset of the log files.

Analyzing the scripts, I thought that I could implement something more effective,(...)

Ding! Be a fast code-reader

Commando programming may involve reading a lot of code that's not yours before being able to do anything else. This has been written before: we usually spend far more time reading code than writing it. Reading code faster will necessarily speed you up.

(...) more effective so I started prototyping something using Ruby. Why Ruby? Because I knew I would be fast to implement my ideas with it. Or was I?

Ding! Know at least one scripting language really well

No, I wasn't so fast,... I haven't programmed in Ruby for quite some time and I've forgotten many of the idioms or methods of the core api. My head is so full of Scala nowadays, that I would certainly have been faster using Scala for that task. The morality here is that knowing a "scripting" language is very helpful but you really need to have everything in your mental RAM in order to be effective.

This goes for api knowledge as well as "environmental" knowledge: you should spend no time figuring out how to develop/build/test/deploy your code. Being stuck for 2 hours because you don't know how to set a class path for example is a huge waste of value.

Anyway, I was able to show the effectiveness of using a language like Ruby to speed up the analysis time so I decided to officially port the scripts from shell to another language. But now which language should I select for my commando developments?

Choosing a language is a vast question and the criteria governing that choice may not be entirely related to the intrinsic language features. This choice may also be governed by more "contextual" considerations, like the ability of other developers to learn the language and associated tools.

In my case, Groovy was the winner for 4 reasons:

It has all sort of features to help write code faster like closures, literals,...
It can access Java libraries and I was planning to integrate some Database analysis tools we had written in Java
That's one of the closest language to Java on the JVM so other developers will be able to pick up the new scripts faster. And it is usually better known than JRuby, Scala or Jython for example

Ding! Know your platform ecosystem

I also knew, when choosing Groovy, that I would be able to use all sorts of niceties. One of them was the Ant integration which allowed me to easily reuse the exec task (see below). Knowing my way around Ant was a good thing (I know that's not exceptional for a Java developer, it's just an example ;-) ).

Rewrite everything?

That was the most "dangerous" part of this project.

I would honestly have preferred avoiding it. Since I had to replace one small part of the scripts with some new Groovy logic and also, since the scripts would need to be extended anyway, I decided to rewrite them all in Groovy.

Ding! Don't rewrite anything

Haha, I just said I did it! Well, not that you can never rewrite code or some library part, but you have to be damn sure about what you're doing. So,...

Ding! Take careful risks and prototype

Doing "Commando programming" will undoubtedly require taking risks and trying out some solutions before going on. More than ever, it needs to be done with lots of care because the last thing you want, is to find yourself in the middle of the road at the end of the week, because you didn't have time to finish implementing your new "Grand Vision", whatever it is.

In my case I mitigated the risks a lot by "wrapping" the existing shell commands in my Groovy scripts. So I was essentially running the same thing but executed from Groovy! That was really worth it because in the process, I could also reduce a lot the existing amount of code thanks to the usual Groovy tricks.

Test or not to test

Now we come to the meat of the discussion! TDD, BDD, unit tests, code-and-see, what should be done? I honestly don't have a definitive answer but here's what I did.

Ding! Have at least one high-level functional test

You need to have at least one high-level functional test guiding you in your developments. Something which you can come back to and say: "this case still doesn't pass, I'm not finished yet". You may actually have implemented more code than just what was necessary to pass that test case but at least this primary use case should be ok.

Ding! Test the difficult stuff

Complex logic is definitely something you want to isolate and test early. If you don't unit test it thoroughly, it will usually haunt you in the most painful way: hidden inside all the rest. Besides, this complex logic usually means "value" for your customer, more than anything else, so it's worth cherishing it.

Ding! Isolate the file system

Actually, you should soon realize that any interaction with the file system is slowing you down.
Being able to mock the file system, or to write to temporary files with automatic clean up is a huge time-saver for unit testing. In a "Commando" situation you would be prepared with this and have all sorts of functions and ready to use design to help with this. In my case, I had to write all lot of it,...

Ding! Use the console

The Read-Eval-Print loop. In my developments I found very useful to test interactively my regular expressions inside the Groovy console. Whether or not I turned those expressions to actual unit tests,... I didn't do it all the time I must say. One reason is that, once I had confidence that this regular expression was encapsulated in a higher-level concept, like "extractClassName", it would be right forever.

Ding! Have lots of sample data

If possible try to gather as much sample data as you can and run it through your program. Does it blow-up right away? Well, that's actually good, it may be the sign that you forgot a requirement. Instead of spending time trying to refine your requirements until they're exhaustive, or try to think about all possible cases, run the system with the maximum data available. There are 2 drawbacks to this: one is the time spent re-executing large datasets and analyzing failures, the other is not knowing what to look at! Your program may not break with enough smoke to prove you wrong.

On my project I had lots of log files I could run my scripts against. And fortunately my scripts broke in very unambiguous ways, showing where I was wrong.

To design or not to design

So you're programming like hell, you don't have time to draw all those nifty UML diagrams or devise the best design patterns for the job. "We'll design when we'll have time."

But is it really buying you time? I don't think so. I'm not saying that you should write diagrams or add tons of interface and clever indirections.

But you need design in the sense that the problem and solution domains should be very, very clear to you.

Ding! Time is money, concepts are time

So it's worth taking a bit of time and think: "What's really this system like?". Clarifying the concepts is very, very important. For us, it was first of all giving proper names to our analysis concepts:

An entity can have measurable properties ("A trade server has requests")
For each property, you can define several quantities ("each request can take some average time")
For a given "Test environment", we can run several "Test campaigns"

This helped a lot in our discussions, in structuring the scripts and planning our actions.

The next step was to realize that the type of queries we wanted to execute to analyze logging data looked pretty much like SQL queries: "between 15:00 and 15:03. return all update queries with a time > 300ms and group them by workflow rules". This gave us a good hint that the next step for our scripts would be to put everything in a database!

Coding

Consider this:

"I know where I'm going, I have a clear mind about my system and it's concepts, I know how to write minimal and effective tests, is there anything more I could do now to speed up my coding?"

Let's reformulate the sentence above:

"I know I want a gold medal at the Olympics, I know I'll get it by being a table tennis champion and I know the rules of the game, I have a very good racket, is there anything more I could do now to be the best at table tennis?"

Ding! PPP: Practice, Practice, Practice!!!

That's so obvious I'm ashamed to have to write it,... but that makes a real difference for "Commando programming":

technique: practice your regular-expression-fu for example, or know how to write a parser for a simple language (I planned to do that for the verbose gc logs but we didn't have time nor interest to rewrite this part)
libraries: no time should be spent looking at the API docs
tools: if you need remote debugging, it should be fast to setup, because you did it thousands of times before
computer science: you should have at least a good sense of the complexity of your algorithms
integration: how to call Excel from Groovy to create graphics was one of our coding issues
system / network tools: that wasn't too necessary on our project, but I can imagine this making a real difference in other settings

Cooling down for a minute

Two things of equal importance I've also noticed during that week. First of all, when I tried to go fast I couldn't help but noticing entropy. Yes my code was doing the job, but at the same time it wasn't always consistent, documented, properly named or refactored.

Ding! Cool down and document/review your code

This really can look like a pure loss of time. But I really observed that it wasn't. For 2 reasons:

Actually it doesn't take that long to review/document/refactor the code (try it at home)! Except maybe for the naming part, it's a bit like playing sudoku, you just try to make things fall into place
I observed that I was much less dragged down implementing new feature on clear-cut code with obvious intentions, I had much less mental noise. You know, like: "This computeTotal function is actually also sorting the results, so I know I don't have to do it there". If computing and sorting are better separated or named, that can reduce the "mental tax" while reasoning about something else.

The second thing is the equivalent of "cooling down the code",... applied to your body!

Ding! Cool down and take a break

We're not machines. I was so enthusiastic with that project that I almost coded day and night (and I was away from my family,...). But I also observed one very interesting phenomenon: time was slowing down! Sometimes I was just starring at my screen without realizing that 5 minutes had passed without doing nothing substantial. That should be a clear signal for taking a break. This gives me an interesting idea: "stopwatch" programming. Prepare the list of tasks you want to program and program features by 5, 7 or 10 minutes cycles. And watch out when you have almost empty cycles.

The doctor says: fit for service

Conclusion for the project: it was very successful (this doesn't happen all the time so I'm glad to report it). We were able to analyze and spot 3 major bottlenecks and give good recommendations so that our maximum saving time was eventually around 400 ms. The tools served us, but actually ours brains served us more.

But what's the biggest room in the world? The room for improvement ;-) ! I think we'll be able to transfer more of that wisdom to smarter analysis scripts in the future.

Back to the original question

When you program for a trader you don't have time to write tests

True? False? At the light of my experience of this project, the first thing I'm really convinced of is the first part of the sentence: "You don't have time"!

And if you don't have time, you can essentially do 2 things:

Observe and experiment: whatever makes you spare time is good. In my case, writing tests is a proven way to go faster for example
Practice, learn, invest: new tools, new languages, new libraries, contests, coding dojos,...

Ok. Thanks for listening, I leave you there, I have to go,... and practice my "Commando-fu"!

Pages

11 March 2009

How we decide

15 October 2008

Commando programming