Saturday, October 20, 2012

PHP Assertions

I stumbled upon assertions in PHP today. Why I didn't know they existed after working with the language for so long, and what I was originally looking for when I came across them, are both mysteries. With the increasing focus on software quality in the PHP community, I wondered why I hadn't seen others use them. I decided to ask around, look into PHP's implementation of assertions, and do some tinkering.

I asked a few friends if they knew about assertions; they did. I asked if they used them; they didn't.

Remi Woler: I think nobody has found a good use case. It weaves tests into code. How are you going to recover from a failed assertion?
Davey Shafik: They kinda suck. For example: assert('mysql_query("")') It's a string of code that gets eval'd.

Indeed, PHP's assert() did not get stellar endorsements from people whose opinions I respect.

My primary experience with assertions comes from C, where assert is defined as a macro. Its argument must evaluate to true; otherwise the program terminates with an error. These checks can be stripped at compile time by defining NDEBUG (for example with -DNDEBUG) if desired for performance reasons, although there is some disagreement on the wisdom of doing so.

PHP asserts are implemented a bit differently. Assertions are configurable in php.ini or with assert_options(). A failure doesn't necessarily have to abort the script: you can bail if you want to, disable assertions entirely, convert failures to run-time warnings, or invoke a callback to handle them. This makes them very flexible and much less black-and-white than in C.
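
As a rough sketch of those options, assuming the PHP 5.x behaviour described in this post (the handler name log_failed_assertion is made up for the example):

<?php
// Sketch only, assuming PHP 5.x. The same settings are also available
// as assert.* directives in php.ini.
assert_options(ASSERT_ACTIVE, 1);    // evaluate assertions at all
assert_options(ASSERT_WARNING, 0);   // don't raise a warning on failure
assert_options(ASSERT_BAIL, 0);      // don't terminate the script on failure

// Hypothetical handler: records the failure instead of aborting.
function log_failed_assertion($file, $line, $code)
{
    error_log("Assertion failed at $file:$line: $code");
}
assert_options(ASSERT_CALLBACK, 'log_failed_assertion');

assert(1 === 2);            // fails, so the handler is invoked
echo "still running\n";     // with bail off, execution carries on under PHP 5.x

Setting ASSERT_BAIL to 1 instead would give something closer to the C behaviour, where a failed assertion ends the program.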

The actual assert() function accepts either a string or a Boolean for its condition. So, for example, you can write either:

<?php
assert(is_string($foo));

or:

<?php
assert('is_string($foo)');

In the first instance, the expression is evaluated and the resulting Boolean is passed to assert(). While perhaps a little more traditional, it may not be as efficient, as you will see momentarily.

In the second, the string is passed directly to assert(), which evals it to determine its truthiness. This is a better approach for two reasons:

  1. assert() immediately returns true if assertions are disabled. The code string is not evaluated and any performance hit from executing unnecessary statements is minimized.
  2. If the assertion fails, the code string is passed to a callback (if one is used) and can be included in any output or logging.
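
Both points can be seen in a small experiment. This is only a sketch, assuming PHP 5.x where string assertions are supported; expensive_check() and the $calls counter are invented for the demonstration:

<?php
// Sketch assuming PHP 5.x (string assertions).
$calls = 0;

// Hypothetical stand-in for an expensive check; counts its invocations.
function expensive_check()
{
    global $calls;
    $calls++;
    return true;
}

// Point 1: with assertions disabled, only the Boolean form does any work.
assert_options(ASSERT_ACTIVE, 0);
assert(expensive_check());      // argument is evaluated before assert() is called
assert('expensive_check()');    // string is not eval'd while assertions are off
echo $calls, "\n";              // on PHP 5.x this prints 1

// Point 2: with assertions enabled, the callback receives the code string.
assert_options(ASSERT_ACTIVE, 1);
assert_options(ASSERT_WARNING, 0);
assert_options(ASSERT_CALLBACK, function ($file, $line, $code) {
    echo "Failed: $code\n";
});

$foo = 42;
assert('is_string($foo)');      // callback is handed the text is_string($foo)

With the Boolean form the callback's third argument is just an empty string, so the failing expression can't be logged.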

I'm not convinced Davey's eval concerns are entirely well-founded in this instance because of the above reasons and the fact that it's static code to be evaluated by PHP. It's a controlled environment, not eval($randomUserSuppliedCode).

PHP 5.4.8 also added a second parameter to assert(): a string description to annotate the test. If present, the description is also passed to the callback as its fourth argument.
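
A minimal sketch of the description in use, assuming PHP 5.4.8 or a later 5.x release; the $items variable and the message text are just placeholders:

<?php
// Sketch assuming PHP 5.4.8+ (5.x series).
assert_options(ASSERT_ACTIVE, 1);
assert_options(ASSERT_WARNING, 0);
assert_options(ASSERT_CALLBACK, function ($file, $line, $code, $desc = null) {
    // $desc holds the description when one was given to assert()
    echo 'Assertion failed', ($desc !== null ? ": $desc" : ''), "\n";
});

$items = array();
assert('count($items) > 0', 'expected at least one item');

The default value on $desc keeps the callback usable for assertions written without a description.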

The PHP manual offers some guidance on using assertions:

Assertions should be used as a debugging feature only. You may use them for sanity-checks that test for conditions that should always be true and that indicate some programming errors if not or to check for the presence of certain features like extension functions or certain system limits and features.

Assertions should not be used for normal runtime operations like input parameter checks. As a rule of thumb your code should always be able to work correctly if assertion checking is not activated.

Both are good pieces of advice, but they contradict each other: if you use assertions to test for required features or system limits, your code may not work correctly once assertion checking is disabled.

Wikipedia explains the difference between assertions and error handling:

Assertions should be used to document logically impossible situations and discover programming errors — if the impossible occurs, then something fundamental is clearly wrong. This is distinct from error handling: most error conditions are possible, although some may be extremely unlikely to occur in practice. Using assertions as a general-purpose error handling mechanism is unwise: assertions do not allow for recovery from errors; an assertion failure will normally halt the program's execution abruptly. Assertions also do not display a user-friendly error message.

I (perhaps foolishly) disregarded the manual's and Wikipedia's advice while I was tinkering. PHP assertions don't behave like their C brethren, so perhaps the traditional C way of thinking (asserts for debugging only) is artificially restrictive? What I found was that PHP assertions, with a bit of creativity, could be used to write readable, quality code.

Consider a naive Active Record implementation. You might have code that resembles:

<?php
class User
{
    protected $id;
    // ...

    public function setId($id) {
        // The ID may only be assigned once.
        if (!is_null($this->id)) {
            throw new BadMethodCallException('ID already set for user.');
        }
        // The ID must be a positive integer.
        if (!is_int($id) || $id < 1) {
            throw new InvalidArgumentException('ID for user is invalid.');
        }
        $this->id = $id;
    }

    // ...
}

It is possible to use assert() to test the $id argument (disregarding the manual's advice) and a callback to throw the exceptions (ignoring Wikipedia).

<?php
// Convert a failed assertion into an exception. The description is
// expected in the form "ExceptionClass:message".
assert_options(ASSERT_CALLBACK, function ($file, $line, $code, $desc) {
    list($exClass, $msg) = explode(':', $desc, 2);
    throw new $exClass($msg);
});

class User
{
    protected $id;
    // ...

    public function setId($id) {
        // The ID may only be assigned once.
        assert('is_null($this->id)',
            'BadMethodCallException:ID already set for user.');
        // The ID must be a positive integer.
        assert('is_int($id) && $id > 0',
            'InvalidArgumentException:ID for user is invalid.');

        $this->id = $id;
    }

    // ...
}

No, this isn't how assertions are intended to be used, but it does address Remi's concern about recovery. One doesn't typically recover from a failed assertion, but now the failure has been converted into an exception, so recovery is possible to the same extent as for any other exception.
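
To make the recovery point concrete, a caller could catch the failure like any other exception. This is a minimal sketch, assuming PHP 5.4.8+ string assertions and a cut-down version of the User class; the $user variable and the "Recovered" message are invented for illustration:

<?php
// Sketch assuming PHP 5.4.8+ (5.x series).
assert_options(ASSERT_ACTIVE, 1);
assert_options(ASSERT_WARNING, 0);   // avoid a warning alongside the exception
assert_options(ASSERT_CALLBACK, function ($file, $line, $code, $desc) {
    list($exClass, $msg) = explode(':', $desc, 2);
    throw new $exClass($msg);
});

class User
{
    protected $id;

    public function setId($id) {
        assert('is_int($id) && $id > 0',
            'InvalidArgumentException:ID for user is invalid.');
        $this->id = $id;
    }
}

$user = new User();
try {
    $user->setId(-5);   // fails the assertion, which becomes an exception
} catch (InvalidArgumentException $e) {
    echo 'Recovered: ', $e->getMessage(), "\n";
}

The calling code can't tell whether the exception came from an explicit throw or from an assertion callback.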

If assertions have been turned off, the checks are silently skipped and setId() will accept anything, so if you wanted to go down this path it would be wise to add assert_options(ASSERT_ACTIVE, true) to your bootstrap file.

Now don't get me wrong, I'm not about to start doing this in all of my code (at least not yet anyway!). It's fun to play with, but there are still some questions worth pondering.

If you were to use assert() properly instead of something along the lines of my bastardized exception example, what type of things would be worth asserting?

Assertions are meant to identify program logic/design bugs, not to serve as a run-time error handling mechanism. Isn't this why we do unit testing? Playing devil's advocate, what's wrong with pushing unit tests directly into your code if we already have doc comments that are extracted for documentation?

Feel free to let me know your thoughts in the comments section below. Do you constrain yourself to the classical interpretation of assertions, or do you take advantage of the flexibility of PHP's implementation? Where and when do you use them in your code?

3 comments:

  1. In a benchmark that created the object and set the ID 10,000 times, the assertion version was nearly ten times slower on my PC (even when no assertion failed). It also requires PHP 5.4.8. I hadn't seen the new description parameter, though. Thanks for pointing that out.

  2. Spot on. Assertions are great for proofing parameter types for example. That's typically an error cause that ought to be found within the development stage. Using exceptions for delaying their discovery until deployment isn't sensible. (Unless of course method parameters are highly volatile due to unsettled code paths or raw user input that is).

  3. For debugging I use assert during development as well as a rewritten error handler to allow for strict types in my code.

    for example:

    error handler reports a fatal error with full debug back trace if $bar is not a string

    function foo (string $bar)
    {
    print $bar;
    }

    // assert informs me that $foo is a zero length array

    function bar (array $foo)
    {
    assert (sizeof ($foo) > 0, '$foo has no length inside bar() ' . print_r (debug_backtrace (), true));
    }

    During a release build all of these checks are removed before they are checked into the release branch in the version control.

    This has helped me develop really stable code that works as intended by providing logging of code errors that would normally bypass most debuggers and tests.

    This is how I believe these features should be used correctly and as with compiled applications, for optimization, these extra debugging tools should be stripped from your code.
