Parse Your Way Through Errors
In our code base there's a static function to log an error:
final class Debug
{
    public static function error(
        $msg,
        $obj = false,
        $category = 'user',
        $type = 'user'
    ) {
        // ...
    }
}
Debug::error('API call failed', $response, 'facebook', 'api-call-failed');
The first argument is a message, the second can be a value of any type, and the third and fourth arguments group errors by category and type. This is very useful for analytics.
Ignore for a second that we should use the PSR-3 Logger Interface. We could easily move this Debug class behind an implementation of that interface if we ever wanted to.
The problem
Did you notice that the category and type have a default value of "user"? I don't know when or why this was introduced in our code base, but it's bad. When the category and type are not set, the errors get lost in the category "user" and the type "user".
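To make the problem concrete, here is a minimal sketch. DebugStandIn is a hypothetical stand-in for the real Debug class (whose body isn't shown above); it only counts calls per category/type bucket, and the error messages are made up for illustration.

```php
<?php
// Hypothetical stand-in for the real Debug class: it only counts calls
// per "category/type" bucket so the effect of the defaults is visible.
final class DebugStandIn
{
    public static $buckets = [];

    public static function error($msg, $obj = false, $category = 'user', $type = 'user')
    {
        $key = $category . '/' . $type;
        self::$buckets[$key] = (self::$buckets[$key] ?? 0) + 1;
    }
}

DebugStandIn::error('API call failed', null, 'facebook', 'api-call-failed');
DebugStandIn::error('Payment declined');            // no category or type
DebugStandIn::error('Cache miss', ['key' => 'x']);  // no category or type

print_r(DebugStandIn::$buckets);
```

The two unrelated errors end up in the same user/user bucket, so they can no longer be told apart in the analytics.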
How can we find these Debug::error statements without a category and a type? We don't want to ignore these errors.
Use a regex?
Maybe... If the arguments are simple:
(RegExr)
Note that this is the first regex that came to my mind. We could improve that regex but let's not waste our time. We need a better way. We need something that truly understands our code.
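As a rough illustration of why "simple arguments" matters, here is a sketch with a hypothetical pattern. It is not the regex from the link above, just a naive attempt at matching a Debug::error call with exactly two arguments; the sample calls are made up as well.

```php
<?php
// Hypothetical pattern (not the one from the post): "Debug::error(" followed by
// exactly two chunks that contain no commas or parentheses.
const TWO_ARG_CALL = '/Debug::error\(\s*[^,()]+,\s*[^,()]+\)/';

function looksLikeTwoArgCall(string $code): bool
{
    return preg_match(TWO_ARG_CALL, $code) === 1;
}

$samples = [
    'Debug::error("message", $var);',                     // two simple arguments
    'Debug::error("a, b", $var);',                        // comma inside a string
    'Debug::error("message", compact("a", "b"));',        // nested function call
    'Debug::error("message", $var, "facebook", "api");',  // all four arguments
];

foreach ($samples as $sample) {
    echo str_pad($sample, 55), looksLikeTwoArgCall($sample) ? 'match' : 'no match', PHP_EOL;
}
```

Only the first sample matches. The second and third are also two-argument calls, but the comma inside the string and the nested call confuse the pattern, which is exactly the kind of case a parser handles for free.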
Nikita's PHP Parser to the rescue
A parser is useful for static analysis, manipulation of code and basically any other application dealing with code programmatically. A parser constructs an Abstract Syntax Tree (AST) of the code and thus allows dealing with it in an abstract and robust way.
Parsing a programming language is hard. This package makes it easy!
Let's see how we can use the package to find Debug::error calls without a category and type.
$ mkdir parser-demo
$ cd parser-demo
$ composer require nikic/php-parser
This package is bundled with a binary (php-parse) to play with the parser:
$ vendor/bin/php-parse '<?php Debug::error("message", $var);'
====> Code <?php Debug::error("message", $var);
==> Node dump:
array(
    0: Expr_StaticCall(
        class: Name(
            parts: array(
                0: Debug
            )
        )
        name: error
        args: array(
            0: Arg(
                value: Scalar_String(
                    value: message
                )
                byRef: false
                unpack: false
            )
            1: Arg(
                value: Expr_Variable(
                    name: var
                )
                byRef: false
                unpack: false
            )
        )
    )
)
The abstract syntax tree is a list of statements. The first statement is an Expr_StaticCall. A static call has a class, name and args. If the count of args is not equal to 4, we know that the category and/or type is missing. Let's translate that into code:
$ touch find.php
First, we'll need a way to find all PHP files in our code base:
<?php
$directory = $argv[1];
$directoryIterator = new RecursiveDirectoryIterator($directory);
$directoryIterator = new RecursiveIteratorIterator($directoryIterator);
$files = new RegexIterator($directoryIterator, '/^.+\.php$/i', RecursiveRegexIterator::GET_MATCH);
We can now pass a path to our code base as a command line argument:
$ php find.php ~/path/to/code
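To see what the iterator chain actually yields, here is a small self-contained sketch. It builds a throwaway directory with made-up file names (the paths are assumptions for illustration only) and runs the same three iterators over it.

```php
<?php
// Build a throwaway directory tree (hypothetical file names) to iterate over.
$root = sys_get_temp_dir() . '/parser-demo-' . uniqid();
mkdir($root . '/src/Api', 0777, true);
file_put_contents($root . '/src/Api/Client.php', '<?php');
file_put_contents($root . '/src/readme.txt', 'not PHP');

$iterator = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($root));
$files = new RegexIterator($iterator, '/^.+\.php$/i', RegexIterator::GET_MATCH);

$paths = [];
foreach ($files as $file) {
    // In GET_MATCH mode each element is the array of regex matches,
    // so index 0 holds the full path as a string.
    $paths[] = $file[0];
}

print_r($paths);
```

Only Client.php is collected, and each element is an array rather than a string, which is why the rest of the script reads $file[0].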
Secondly, we'll need a way to traverse all nodes of an AST. Meet the NodeTraverser.
<?php
use PhpParser\Error;
use PhpParser\NodeTraverser;
use PhpParser\ParserFactory;
use PhpParser\NodeVisitor\NameResolver;
require_once __DIR__ . '/vendor/autoload.php';
// ... directory iterator
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
foreach ($files as $file) {
    try {
        $code = file_get_contents($file[0]);
        $statements = $parser->parse($code);
        $traverser = new NodeTraverser;
        $traverser->addVisitor(new NameResolver);
        $traverser->addVisitor(new DebugErrorVisitor($file[0]));
        $traverser->traverse($statements);
    } catch (Error $e) {
        echo 'Parse Error: ', $e->getMessage(), PHP_EOL;
    }
}
The NodeTraverser will call a method (enterNode) on each visitor for each node in the AST. Let's implement DebugErrorVisitor:
<?php
use PhpParser\Node;
use PhpParser\Node\Expr;
use PhpParser\Node\Expr\StaticCall;
use PhpParser\Node\Name;
use PhpParser\NodeVisitorAbstract;
require_once __DIR__ . '/vendor/autoload.php';
final class DebugErrorVisitor extends NodeVisitorAbstract
{
    private $file;
    public function __construct($file)
    {
        $this->file = $file;
    }
    public function enterNode(Node $node)
    {
        if (!$node instanceof StaticCall) {
            return;
        }
        // A literal class name is a Name node; a dynamic one ($class::error()) is an expression.
        if (!$node->class instanceof Name || $node->class->toString() !== 'Debug') {
            return;
        }
        // Skip dynamic method names (Debug::$method()). Otherwise the name is a string
        // in php-parser 3.x and an Identifier in 4.x; both can be cast to a string.
        if ($node->name instanceof Expr || (string) $node->name !== 'error') {
            return;
        }
        if (count($node->args) !== 4) {
            echo "Found Debug::error in {$this->file} on line {$node->getLine()}", PHP_EOL;
        }
    }
}
// ... directory iterator
- If $node is not a StaticCall we'll return.
- If the class name is not Debug we'll return (the class is a Name instance for a literal class name and an expression for a dynamic one such as $class::error()).
- If the name of the method we're calling is not error we'll return.
- If the count of arguments is not 4, we found a Debug::error we're interested in.
Let's put it all together:
<?php
use PhpParser\Error;
use PhpParser\Node;
use PhpParser\Node\Expr;
use PhpParser\Node\Expr\StaticCall;
use PhpParser\Node\Name;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitor\NameResolver;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;
require_once __DIR__ . '/vendor/autoload.php';
final class DebugErrorVisitor extends NodeVisitorAbstract
{
    private $file;
    public function __construct($file)
    {
        $this->file = $file;
    }
    public function enterNode(Node $node)
    {
        if (!$node instanceof StaticCall) {
            return;
        }
        // A literal class name is a Name node; a dynamic one ($class::error()) is an expression.
        if (!$node->class instanceof Name || $node->class->toString() !== 'Debug') {
            return;
        }
        // Skip dynamic method names (Debug::$method()). Otherwise the name is a string
        // in php-parser 3.x and an Identifier in 4.x; both can be cast to a string.
        if ($node->name instanceof Expr || (string) $node->name !== 'error') {
            return;
        }
        if (count($node->args) !== 4) {
            echo "Found Debug::error in {$this->file} on line {$node->getLine()}", PHP_EOL;
        }
    }
}
$directory = $argv[1];
$directoryIterator = new RecursiveDirectoryIterator($directory);
$directoryIterator = new RecursiveIteratorIterator($directoryIterator);
$files = new RegexIterator($directoryIterator, '/^.+\.php$/i', RecursiveRegexIterator::GET_MATCH);
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
foreach ($files as $file) {
    try {
        $code = file_get_contents($file[0]);
        $statements = $parser->parse($code);
        $traverser = new NodeTraverser;
        $traverser->addVisitor(new NameResolver);
        $traverser->addVisitor(new DebugErrorVisitor($file[0]));
        $traverser->traverse($statements);
    } catch (Error $e) {
        echo 'Parse Error: ', $e->getMessage(), PHP_EOL;
    }
}
We can now call the script with a path to our code base and it will accurately find Debug::error calls without a category and a type. Perhaps we can use this script in our Jenkins setup to fail the build if there's a Debug::error call without a category and type.
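One way to wire this into a build, sketched under a couple of assumptions: the visitor would collect its messages in an array instead of echoing them, and the build step is a shell command that fails when the script exits with a non-zero status. The exitCodeFor helper and the sample finding are hypothetical.

```php
<?php
// Hypothetical helper: turns the collected findings into a process exit code.
// Assumes the visitor appends one message per offending call to an array
// instead of echoing it directly.
function exitCodeFor(array $findings): int
{
    foreach ($findings as $finding) {
        fwrite(STDERR, $finding . PHP_EOL);
    }

    // A non-zero exit status is what makes a shell build step fail.
    return count($findings) === 0 ? 0 : 1;
}

// Example data standing in for what the traversal would collect.
$findings = [
    'Found Debug::error in src/Api/Client.php on line 42',
];

$status = exitCodeFor($findings);

// In the real script this would be the last statement: exit($status);
echo "Exit status: {$status}", PHP_EOL;
```

Keeping the decision in a small function means the script can still print every finding before it signals failure.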
We could take this one step further and ask for a category and type on the command line (using readline) and replace it automatically, but I'll leave that as an exercise for the reader. 😏
Happy programming!