Jan 05 2015
 
Commenting and documenting your code

From observation of the computer industry going back to 1987, when I did a pre-university year at IBM Scientific Centre in Winchester, it seems that the very brightest and best programmers (software creatives) are often not very good at or not very keen on documenting their work.

I think the reason for this is often misunderstood. Poor documentation is a source of great consternation in the world of Linux. In fact it’s a large part of the raison d’être for RasPi.TV. We, the users, often feel that the geniuses who write awesome software can’t be bothered to write good instructions for it because…

  • it’s beneath them
  • they’re lazy
  • they think “it’s obvious how it works unless you’re a retard”

…and, in some cases, there may be some truth in some of those statements. But I suppose the real reason is different. They find creating and writing great software quite a stimulating and fun mental challenge. But, for them, writing instructions is totally boring. And since, when we have a choice, we all naturally shy away from doing things that we don’t enjoy, a lot of Free and Open Source Software (FOSS) remains…

  • undocumented
  • badly documented, or
  • incompletely documented (or permanently out of date docs)

I think it’s the nature of the beast. Free software is free because someone enjoys writing it. But once you’ve sweated long hours over debugging your code and it finally works just as it should, you feel like it’s finished. At that point, the thought of going through it all to show people how to use it may well seem quite repellent. “I’ll have a rest and do it next week,” you think to yourself. But next week a new and interesting programming challenge comes along and the documentation gets swept under the carpet and forgotten about.

Professional Manual Writers Do Exist

Back in 1987, at IBM, I met a guy who was a professional software manual writer (Technical Author). That was a new one on me. “Wow. You mean people actually write these things for a living?” This guy’s job (I’ve forgotten his name) was to play with the software alongside the guys who were writing it and make them show him how it worked. Then he’d convert that knowledge and information into a readable user manual (oxymoron?). The software in question was called IAX – an image processing system, which ran on a mainframe. The digital cameras back then were huge and cost tens of thousands of pounds.

So this guy’s full-time, well paid job was to write software manuals. I don’t know if this approach is still taken in the world of paid-for software. Judging by the quality and comprehensibility of some software guides and built-in help, you’d have to guess “not always”.

But it certainly doesn’t apply in the world of FOSS. Some people are great at documenting their work. Others aren’t. This leaves a niche for people like me, who are willing to embrace, and sometimes even enjoy, the challenge of figuring out how something works and writing “bullet-proof instructions that work first time” for those who want or need them.

So How Should We Document Our Software?

The answer depends on how complex it is. For something really simple, a ReadMe file or comments in the code might be enough. For something more complex, a web page or even a fully-fledged user guide may be in order.

No Comment?

Shortly after I published this article, Peter Onion pointed out that carefully chosen variable names can greatly reduce the need for comments.

Particularly in languages like Python, your code can often read almost like English, so the need for comments can be eliminated or greatly reduced.
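Here’s a minimal sketch of what that can look like. The function and variable names are made up purely for illustration; the point is only that the second version explains itself without a single comment.

```python
# Hard to follow without comments: what are a, b and c?
def calc(a, b, c):
    return a * b * (1 + c)


# The same calculation with descriptive names (all hypothetical).
# It now reads almost like English, so it needs no further explanation.
def total_price(unit_price, quantity, tax_rate):
    return unit_price * quantity * (1 + tax_rate)


print(total_price(unit_price=2.50, quantity=4, tax_rate=0.2))
```

Both functions do exactly the same arithmetic. Only the names have changed.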

Comments in the code

Here’s an example of commenting in the code…

# This is a comment in a Python script. 
# Anything preceded by a # is ignored by the program 

# The official way to comment is to write your comments
# as separate lines immediately before the
# section of code they describe.

# Official way to comment...
# Here is a simple loop from 0 to 9
for x in range(10):
    print(x)

# Alternative way. Not officially recommended, but I sometimes use it...

for x in range(10):             # this is an inline comment - I like them if kept short
    print(x)                    # as you can see the uninterrupted flow of the code

Schools Sometimes Get It Wrong

Apparently, in some UK schools, a lot of people are taught to write long comments inline on every line. This is highly questionable! It’s considered best practice to keep your lines to roughly 80 characters or fewer (PEP 8, Python’s style guide, sets the limit at 79). Inline comments that push a line well beyond that are not a good idea. And the idea of commenting every line is ludicrous!
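If you want to check your own scripts for over-long lines, something like the following would do it. This is only a rough sketch: find_long_lines is a name I’ve made up, and the default limit of 79 is PEP 8’s figure, so pass a different limit if you prefer a looser one.

```python
def find_long_lines(source, limit=79):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    long_lines = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > limit:
            long_lines.append((number, len(line)))
    return long_lines


# A two-line sample whose second line carries an over-long inline comment
sample = (
    "for x in range(10):\n"
    "    print(x)  # " + "a very long inline comment " * 4 + "\n"
)

for number, length in find_long_lines(sample):
    print("line %d is %d characters long" % (number, length))
```

To check a real script, read the file into a string and pass that in instead of the sample.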

There’s a common saying that you should comment your code as if the person who will have to maintain it is an axe-wielding psychopath (i.e. plenty of comments, clear enough to explain what the program is doing).

Read Me!

A “Readme” file is simply a text file, usually called ReadMe.txt, which accompanies the files for the software you’re creating/installing/using. It’s used for cases where a full-blown manual is not needed, but where simple program commenting is not enough. If you do a ReadMe file, you should still put comments in your code, as it will be easier to understand the code if you come back to it later.

An example of a ReadMe.txt file can be found in the GertBoard Python Scripts I wrote a couple of years ago. There’s a ReadMe.txt file in the .zip file. This one contains information like a full list of programs, installation instructions and usage instructions.

(Image: example ReadMe file)

Manual Labour

If your software is really involved or complex, a manual might be in order. This is really just a comprehensive set of instructions for how best to use the software. I include it for completeness, but the chances are if you’re reading this article you’re not yet writing software that requires a full-blown manual.

Lots Of Examples Please

However you choose to document your work, it’s great if you can have plenty of examples including code snippets. It’s quite hard scanning through a list of options, parameters, operators, arguments, modifiers and tags to work out the exact command syntax required. Please try to show examples of each (if used) in your documentation. The world will thank you for it.

Examples are worth a lot more than descriptive text.
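As a rough sketch of one way to do this for a command-line script, Python’s argparse module lets you put worked examples straight into the built-in help text. The script name blink.py and its options are hypothetical, invented for this illustration.

```python
import argparse

# Worked examples shown at the bottom of the --help output.
# blink.py and its options are hypothetical, for illustration only.
EXAMPLES = """\
Examples:
  blink.py --pin 18               blink the LED on GPIO 18 once a second
  blink.py --pin 18 --delay 0.2   blink it with 0.2 seconds between blinks
"""


def build_parser():
    parser = argparse.ArgumentParser(
        prog="blink.py",
        description="Blink an LED (hypothetical script).",
        epilog=EXAMPLES,
        # Keep the line breaks in the examples exactly as written above
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("--pin", type=int, required=True,
                        help="GPIO pin number")
    parser.add_argument("--delay", type=float, default=1.0,
                        help="seconds between blinks (default: 1.0)")
    return parser


# Print the help text, including the examples, without parsing any arguments
print(build_parser().format_help())
```

Anyone running the script with --help then sees the exact command syntax alongside the list of options.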

Document As You Go

If you are documenting your own work, why not do it as you go rather than leaving it all for the end (where it won’t get done as it’s too big a task)?

I read an article the other day that advocated open sourcing your code from day 1. This forces you to be thorough right from the start. It might be a bit of a radical step for some, but it does make you think. I can imagine if you open source from day 1, you’ll be commenting and documenting your code well from day 1 too.

Testing, Testing, 1, 2, 3

Most importantly of all? Test everything. People will take for granted that your examples should work if they cut and paste them. Make sure you test the exact code snippets from your documentation to avoid confusing the hell out of all your users. Also try to make sure that your publishing platform doesn’t mangle or alter the commands. (I’ve noticed that WordPress mangles Python indents horribly unless you use a plug-in.)

It takes a bit more effort at the time, but you will have fewer people emailing you with support queries, so it’s well worth it.
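For Python, one way to keep examples and tests in step is the standard library’s doctest module, which runs the examples written in your docstrings and checks their output. Here’s a minimal sketch; the function is just an illustrative stand-in for your own code.

```python
import doctest


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit.

    The examples below are documentation, and doctest also runs them as tests:

    >>> celsius_to_fahrenheit(0)
    32.0
    >>> celsius_to_fahrenheit(100)
    212.0
    """
    return celsius * 9.0 / 5.0 + 32.0


if __name__ == "__main__":
    # Run every >>> example in this file and count any whose output differs
    results = doctest.testmod()
    print("%d examples tried, %d failed" % (results.attempted, results.failed))
```

If you later change the function and forget to update the examples, the failure count tells you the documentation has gone stale.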

I usually base my tutorials upon a complete run-through of a procedure. For something complex, it’s really the only way to be sure. And even then, things can change with time and go out of date.

Updates

If you get queries or feedback about a problem, remember it may be necessary to revisit your documentation to change something that no longer works. This may also happen if you change your software. Do your best to try and keep the documentation up to date as well. Then you will look good and you’ll have happy users. Many people who publish open source software get asked to write software or do other work (freelance and paid) by people who’ve seen what they can do. It’s like a viral form of CV. I can think of at least 5 people in the Pi Community (including me) who have leveraged their open source work in this way.

When you look at it that way, it kind of makes you want what you put out there to be a reflection of your very best, doesn’t it?

  27 Responses to “Documentation and Commenting Your Code”

  1. Please, developers across the globe, do read this article.

  2. Very nice tip here from Ben Croston (author of RPi.GPIO)

  3. Another tip from UKScone

  4. I’ve worked with/known many Technical Authors aka “Bloody Pedants”. They are some of the nicest, weirdest people I’ve known. They put programmers to shame in the weird stakes.

    • Pedantry is an art-form ;p

      I reckon the hard part is knowing when to back off. There’s only so far you can go before people start wanting you dead. ;p

      • >There’s only so far you can go before people start wanting you dead. ;p

        Oh, that explains why so many of the ones I’ve known lived on house/canal boats: for a fast getaway :)

  5. We still have professional technical authors. They’re very useful. The other bit everyone always missed is the importance of REALLY GOOD examples.

    The “comment every line” advice is just plain wrong.

    • Yeah I wouldn’t even have mentioned the comment every line thing if one of our young friends hadn’t told us last week that this is how they are taught to comment. That needed to be outed and challenged so people know it’s wrong.

    • Actually, the comment every line is not bad for dynamically evolving code. At least you have a comment BUT it should be about what the line of code is doing on its own. Back in the days when we wrote in assembler, the “line of code” comment was used to explain why you were using a particular register or element of memory so that other people when they came along to modify it later understood what you were doing. So the “best” way to document code is a block of text which explains what you are going to do and then a short piece of text which explains what each line is going to do. The length of the comment was restricted by the number of characters you could get on a punched card.

      In that light, the example given:

       
      # Here is a simple loop from 0 to 9
      for x in range(10):
          print(x)
      

      is a “bad” piece of documentation because it doesn’t say what the code is going to do. What would have been better is:

      # This code prints out the numbers 0 to 9
      for x in range(10):      #do it from 0-9
          print(x)             #print the number

      • If you need a comment “print the number” for “print(x)” then we have problems beyond commenting… The fact that range(10) gives 0 to 9 is worth noting for novice programmers.

        • True but the example given is only two lines long and the point is that it explains what is happening. This is about documenting the code, not writing a user manual. When writing a user manual, it doesn’t matter how the code does it, just what it does. Documenting the code makes it easy for somebody else to modify it.

  6. There is a school of thought – or was, when I was studying computer science – that the user manual should be written before the software.

    • That’s not really a good idea. With iterative development (aka Agile programming) things often change. Also, everyone makes the HUGE mistake that once something is shipped then that’s it. Products spend FAR longer in maintenance than development so continuous updates are the norm.

      Also missed out is WHO should write the manual. Pro tip: NEVER developers. They are the WORST option as they don’t understand it from a user’s point of view. That’s why we use professional manual writers. For internal use the best idea is one of the users. No, this isn’t a joke: THEY are the ones who understand what needs to be stated and how.

  7. In my work environment documentation suffers because the time allocated to the project is often absorbed by changing requirements. Changes and requests for new features by the customer often leave no time for documentation. Once the time runs out the developer has to start their next task. However, you’re right that “software creatives” like the thrill of the chase and writing it up afterwards isn’t that appealing. In general the quality of documentation is rarely given as much emphasis as the product itself.

    • This is why we have professional documenters. We view documentation as a separate part of product development from coding and testing. They even have a specific name, InfoDev (as opposed to ProductDev). Different teams in a different part of the organisation.

      Documentation is viewed as too important to be left to coders! Think about that really carefully.

      • “Documentation is viewed as too important to be left to coders! Think about that really carefully.”

        I like your style there Jim ;)

  8. Good read, thank you. About the line length, I keep it under 80 characters, in case somebody ever has to read it in basic console text mode.

  9. Simon John contributed this on twitter…

  10. In any writing project, the first thing to consider is the audience. Once you know who you are writing for, just write what your readers need to read, in a way that makes sense to them. Common-sense, yes, but unfortunately very hard to apply to software documentation, as the potential audience is so varied.

    Something written for newbies who are learning the language and the problem domain needs to be very explicit, describe things from multiple angles, and be full of complete, immediately usable examples demonstrating each concept.

    Something written to help with finding and fixing problems needs to say what each part is supposed to do, why each choice was made, and its consequences for the rest of the system.

    Something written to help with changing and extending the software needs to discuss the structure, dependencies and intended use of each part, and the consequences for other parts of the system if it is changed or replaced.

    Something to help with using the software needs loads of real-world examples, walk-throughs, scenarios as well as the meaning, type, and possible values of all parameters, options, inputs and outputs.

    My humble suggestion is that there is no single documentation approach which addresses all these kinds of readers, and even attempting to do so runs the very real risk of obscuring the code itself under a mountain of waffle. And to top it all, every word has to be completely accurate and up-to-date all the time, even while the code is rapidly changing, even when the changes are elsewhere in the codebase, otherwise readers will lose trust and the whole edifice will end up being ignored.

    In short, Peter Onion has the right idea. Write code that is so simple and obvious that comments are hardly needed, so decoupled that changes have few consequences, and so well tested that the tests both prevent unintended breakages, and serve as genuinely useful examples of every part and every scenario of the system. And don’t be afraid to throw away or re-write anything which does not meet these goals.

  11. […] an excellent piece by Alex from RasPi.TV, sharing a number of different approaches, opinions, and suggestions for how to document your code […]

  12. It is always interesting to come back to code you wrote a few years ago and see if you can determine how it works from the comments you wrote at the time. In my case I find it quite difficult as what was obvious at the time of writing the software is not obvious now.

  13. I was a technical writer for several software companies before I retired, and more than once the software developers had to go back and rewrite their code on my insistence because it just didn’t work from a user interface and logical function standpoint. Says something for proper analysis and design before one starts to spew code!

    • Yeah that backs up what Jim said above “Documentation is viewed as too important to be left to coders! Think about that really carefully.”

      I’m sure that still holds for really big important mission-critical stuff. But judging by the user guides, manuals and documentation provided with some products, the standards may have slipped over the years.

  14. I am an Engineer and I think a lot of this goes hand in hand with properly documenting engineering calculations. There is a need to say what you were trying to do and why you were doing it. Why did you choose this approach over any other approach? For many scientific and engineering applications you need to justify where the numbers came from and what units they are in.
    When new to a language and writing mainly for oneself I think there is a justification in over commenting and explaining what the code does, even if that is from the language or text book definitions. It helps you to understand, learn and remember. I think this is particularly true in many scripting type languages, as you are your own main customer for much of this. The important bit may be to note quirks or differences from another language.
    I have been working with roxygen2 recently and it is an interesting discipline to define the parameters and output systematically and automatically have formatted help generated as part of package generation.
    I have recently been maintaining and repackaging some code I wrote a number of years ago. Although I commented around the functions fairly extensively and used useful function and variable names, I have had to hunt back through the various files and functions to find original definitions for some parameters, because after that time had elapsed they were not as self-evident as I had thought at the time.
