Faster, Better, And Cheaper Software Development? Copy Someone Else's Code!

“Give me six hours to chop down a tree and I will spend the first four sharpening the axe.” – Abraham Lincoln

When faced with the job of creating an application, most programmers start by doing what they know and love best: they start coding.  This is one strategy, but it’s not a good one.  A better strategy is to get onto the Web and find out how others have solved the problem – and hopefully to get a copy of the code they used.

In some cases, we get lucky – there’s an excellent application that already does what we want, at a price that’s right.  However, even if we are not so lucky, we’ll often find pieces and parts – usually with source code – that solve parts of the problem, but not the entire thing.

We live in an incredible age of open source software.  Free code is available in abundance.  Heck, even Microsoft and IBM, the most commercial of enterprises, are giving away piles of useful and well-tested code.
 
In general, if there’s a low-level programming issue that you want to solve, then somebody has probably already solved it, and the source code is available. 

I have long been an advocate of changing the title of “programmer” or “software engineer” to something like “person who gets quality applications completed”.  Our job is not to write code – rather, it is to get applications completed in the least possible time,  at the lowest cost, and with the highest quality.  Writing code is just one tool in our toolbox.  Taking code that already exists is another.

The abundance of free software has made this distinction between writing and completing code all the more important.  Nowadays, when I write code, I picture myself as more of an integrator of functional blocks than as a writer of code.  By pulling in other people’s code, I can, in just a few lines, get something done that would have taken days or months had I started from scratch.

However, as the saying goes, “There’s no such thing as a free lunch”.  There are several important steps in successfully using other people’s software in your applications.  These steps amount to answering the following questions:

  1. What does this code actually do?
  2. How well does it work?
  3. What do I need to do to integrate it into my code?
  4. Do I run any risks by using it?

Let’s tackle these one at a time.

What Does This Code Actually Do?

This seems like a no-brainer, but it’s really easy to fall into the trap of assuming that a piece of code does X, then finding out, later in the project, that it doesn’t quite do X.  Or that it actually does Y.  Teeth are gnashed, and time and money are spent on the ensuing code rewrites.  Don’t do it!

Look, you’re going to have to understand the thing in some detail now or later – no way around it.  The Curve Of Stark Reality tells us that earlier is always better.

Spend some time with the code’s documentation and understand the code and how it might fit within the context of your application.  This seems straightforward, but it rarely is.  There is a broad spectrum of documentation out there – some is voluminous, some is sparse (much is nonexistent), some is well-written, some is awful.

I find that a picture is worth a thousand words, and a demo or walkthrough is worth ten thousand words – in other words, show me, don’t tell me.

The vast bulk of programmers hate to write documentation – most have to be dragged into writing good comments in their code, so getting them to write documents and/or make a video is a stretch.  So, in many or most cases, determining just what a project does will require a bit of investment in trying it out in some sort of a sample project.
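
Here is a minimal sketch of what such a trial might look like, in Python.  The idea is to write a few "learning tests" that pin down what a borrowed function really does before you build on it.  Python's built-in round() stands in for the borrowed code; the helper round_half_up and the function learning_tests are names I made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value: str) -> int:
    """Hypothetical helper: round a decimal string with ties going away from zero."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def learning_tests() -> dict:
    """Record what the borrowed function actually returns for a few tie cases."""
    return {
        "round(0.5)": round(0.5),
        "round(1.5)": round(1.5),
        "round(2.5)": round(2.5),
        "round_half_up('2.5')": round_half_up("2.5"),
    }


if __name__ == "__main__":
    for call, result in learning_tests().items():
        print(f"{call} -> {result}")
```

If you assumed that round() always rounds a tie upward, this is where you find out that it rounds ties to the nearest even number instead: a small case of code that "doesn't quite do X".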

In the best case, you’ll find a concise walk-through or demo video on the web site of the code you’re considering.  This is more common for larger projects, but it is starting to spread to smaller projects too.  Just the availability of things like walkthroughs and videos is an excellent sign – it means that someone involved in the project is at least thinking about the end-user, which greatly increases the odds that they’ve produced something that will satisfy end-users.

How Well Does It Work?

There are two aspects to this:

  1. Does it do what it says it does?
  2. Is it solid, i.e., does it rarely crash or otherwise fail in use?

Suppose that we’re looking for software to encrypt sensitive financial data before it’s stored in a database.  We find some block of code that claims to encrypt data.  We try it and it seems to encrypt and decrypt.  Can we actually say that it works?

In the best case, there is some sort of commonly-recognized standard that the code has been certified to meet, and the certification was performed by a third party.  Alas, this isn’t often the case for open source software, so we have to make a decision based on other criteria.

There are several factors to consider here:

  • Where does the code come from?
  • How much stuff is hidden behind the scenes?  In other words, if it doesn’t work, would we even know?
  • Who else uses the code?

    Where Does The Code Come From?

    If code is produced by a Microsoft-level company (e.g., the Cryptography Application Block of the fabulous Microsoft Enterprise Library), then you’re likely to be OK.  It probably does what it says it does – but be sure to read the documentation to see exactly what it claims to do!

    Obviously, if it’s written by some “software wizard” known only by a username on a community web site, then you’re cruisin’ for a bruisin’.

    How much stuff is hidden behind the scenes?  In other words, if it doesn’t work, would we even know?

    In the world of developing medical devices, one of the things that we take into account when examining potential failures is the visibility of a problem.  If you can see the problem, at least you know it’s there – it’s much worse not to know!

    Let’s look at the case of a function that encrypts data on its way to a disk, and decrypts it on the way back.  We could test it by using it to write data to the disk, and then again to read it back.  If the data we read matches the data we wrote, then, to a first approximation, things are working, no?

    Well, not necessarily.  Suppose the routine actually does no encryption at all – it only writes the data directly to disk, and reads it directly back.  Then our program will not crash – but the data won’t be encrypted.  Or suppose that, because of the encryption method and improper pointer use, strings that are an integral multiple of 64 bytes in length are not encrypted properly.  You might never find a problem in casual testing – or even formal testing – but you’d have a big problem.  Or suppose that the encryption method is trivial to crack?  And so forth.
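
To make the point concrete, here is a minimal sketch in Python.  The functions fake_encrypt and fake_decrypt are hypothetical stand-ins for a borrowed routine that silently does nothing, and the check functions are my own illustration, not part of any real library.  The naive round-trip check passes; a slightly stricter check, run over lengths on either side of the 64-byte multiples, does not.

```python
import os


def fake_encrypt(data: bytes) -> bytes:
    """Hypothetical broken 'encryption': returns the input unchanged."""
    return data


def fake_decrypt(data: bytes) -> bytes:
    """Hypothetical matching 'decryption': also returns the input unchanged."""
    return data


def round_trip_ok(encrypt, decrypt, plaintext: bytes) -> bool:
    """The naive check: what comes back matches what went in."""
    return decrypt(encrypt(plaintext)) == plaintext


def ciphertext_differs(encrypt, plaintext: bytes) -> bool:
    """A stricter check: the stored bytes must not simply be the plaintext."""
    return encrypt(plaintext) != plaintext


def failing_lengths(encrypt, decrypt, lengths=(1, 63, 64, 65, 128, 192)) -> list:
    """Return the input lengths for which either check fails."""
    failures = []
    for n in lengths:
        sample = os.urandom(n)  # random bytes of the length under test
        if not (round_trip_ok(encrypt, decrypt, sample)
                and ciphertext_differs(encrypt, sample)):
            failures.append(n)
    return failures
```

Even these checks say nothing about whether the encryption is strong; they only catch the case where the output is visibly the same as the input.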

    And what about things that you don’t even know enough to think about?  For example, robust symmetric encryption (where the same “key” is used both to encrypt and to decrypt) requires that key to be available on your server.  How do you protect the key?  If anyone swipes the key, they can easily decrypt all of the encrypted data.

    Something substantial and “mysterious” like an encryption package should only be “swiped” from experts, who really, really know what they’re doing.

    There are certainly other functions where a failure will be more obvious.  For example, some sort of fancy masked textedit box, used in a situation where the user will verify their input on another page, is much less of a concern.

    Who Else Uses The Code?

    The more, the merrier; the more it’s used, the more likely that it works as advertised and has been exorcised of bugs.  Good signs include:

    • An active users’ forum frequented by business users.
    • Lots of hits for the code on Google, particularly in Google Groups (i.e., Usenet).

    Do I Run Any Risks By Using It?

    The concern here is licensing – it’s important to make sure that the license agreement for any borrowed code is respected.  This is a substantial issue that requires a good bit of thought, but here’s a very brief rundown of license types (not to be taken as legal advice, of course):

    1. FreeBSD-style licenses: these allow you to use the code in pretty much any way that you want.
    2. GPL-type licenses: if you use any GPL-licensed software in your application, you are generally forced to supply all of your source code if you supply the binary code to customers.  Conversely, if you’re supplying a hosted service rather than a shipped application, you generally do not need to supply your source code.
    3. Everything else. These range from the well-conceived, well-documented, and widely-understood (Creative Commons, Microsoft Shared Source, and some others) to the poor or nonexistent.

    Note that when a person or company releases software under a given license, they may also release it under another license.  For example, if I release code under the GPL, and you like that code but don’t like the license, you and I could negotiate the sale of that same code under a different license, say, one that allows you to ship your application without source code.
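
A practical first step is simply knowing which licenses you have already pulled in.  As a rough sketch, assuming a Python project, the standard library's importlib.metadata can list the license that each installed package declares for itself.  The function name license_inventory is my own; the declared strings are whatever the package authors wrote, so they may be missing, vague, or out of date.

```python
from importlib import metadata


def license_inventory() -> dict:
    """Map each installed distribution name to its self-declared license string."""
    inventory = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        if meta is None:  # defensive: skip installs with unreadable metadata
            continue
        name = str(meta.get("Name") or "(unnamed)")
        declared = (meta.get("License")
                    or meta.get("License-Expression")
                    or "(not declared)")
        inventory[name] = str(declared)
    return inventory


if __name__ == "__main__":
    for name, declared in sorted(license_inventory().items()):
        # Some packages paste the full license text; show only the first 60 characters.
        print(f"{name}: {declared[:60]}")
```

Treat the output as a list of things to go and read, not as a verdict on what you are allowed to do.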

    What Do I Need To Do To Integrate It Into My Application?

    This is best thought through before committing to using someone else’s code.  There’s the obvious issue of what your code needs to do to accommodate the new source code.  If you’re lucky, you were able to find code in your preferred language that snaps right into your app.  The next-best thing is finding something that’s not in quite the same language but still just snaps in – for example, if you’re developing a C# application, code written in Visual Basic .NET or another CLR language can be compiled to an assembly that your application references directly.

    If the acquired source differs more significantly from your application, it may be possible to compile it down to a binary library (e.g., a DLL) that you can call.  This is a trick that I’ve used often – it’s not trivial, but not too bad either.  If you need to call C/C++ from a .NET language or from some other higher-level language, the free SWIG (Simplified Wrapper and Interface Generator) utility can build wrapper classes – overkill for small projects but a godsend when you need to hook to a library with an extensive interface.
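
For a small interface you can often call the binary library directly.  The sketch below is Python using the standard ctypes module rather than SWIG, and it calls strlen() from the C runtime as a stand-in for a function in the library you acquired.  It assumes a POSIX system such as Linux or macOS; the names load_c_library and c_strlen are mine.

```python
import ctypes
import ctypes.util


def load_c_library() -> ctypes.CDLL:
    """Load the C runtime; fall back to the current process if it can't be located."""
    name = ctypes.util.find_library("c")
    return ctypes.CDLL(name) if name else ctypes.CDLL(None)


libc = load_c_library()

# Declare the C signature so ctypes converts the argument and result correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Call the C library's strlen() on a byte string."""
    return libc.strlen(text)


if __name__ == "__main__":
    print(c_strlen(b"borrowed code"))
```

Declaring argtypes and restype by hand is fine for one or two functions.  For a library with dozens of entry points, that bookkeeping is exactly what a wrapper generator saves you.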

    Also look out for more subtle issues around the configuration and deployment of the new code.  Perhaps it requires its own configuration file to run.  Perhaps it requires some other external resources that you can’t count on being in place.
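
One cheap defense is a start-up check that fails loudly when something the borrowed code depends on is missing.  Here is a minimal sketch in Python; the function name, the file paths, and the use of environment variables as the "external resource" are all assumptions for illustration.

```python
import os
from pathlib import Path


def missing_prerequisites(config_paths, env_vars, environ=os.environ) -> list:
    """Return a description of each required file or variable that is absent."""
    problems = []
    for path in config_paths:
        if not Path(path).is_file():
            problems.append(f"missing config file: {path}")
    for var in env_vars:
        if not environ.get(var):  # unset and empty both count as missing
            problems.append(f"missing environment variable: {var}")
    return problems


if __name__ == "__main__":
    # Hypothetical requirements of a borrowed component.
    for problem in missing_prerequisites(["thirdparty.ini"], ["THIRDPARTY_HOME"]):
        print(problem)
```

Running a check like this on the deployment machine, before the application starts doing real work, turns a mysterious runtime failure into a short and readable list.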

    Another Source of Other People’s Code

    Have you searched the ’net and still can’t find the code you’d like?  Another way to get what you want is to find an application that does what you need, and then hire the developer to pull out the functions that you want and supply that code.