Tutorial: Web interface for grep and cat in PHP

As a developer, I often dream about fusing the power of the Unix shell with the elegance and ease of use of today’s web interfaces. Watch your fellow developers at the terminal and you will notice patterns in command usage: the same commands are used over and over, and on specific projects or frameworks, certain commands are more likely to be executed than others.

For instance, it may be common to grep for all templates that reference a particular CSS style, or for pages that invoke one of your JavaScript or module methods. On other projects that involve monitoring, the administrator may be tailing a server log file.

In this tutorial, we will be building a PHP web interface that will act as a wrapper around shell commands such as grep and cat. We will show how you can combine the power and flexibility of the shell with the rendering capabilities of a web browser to easily transform plain text into syntax-aware results that are easier to sort through. In essence, we will be turning this:

Ahh, remember the good ol' days of Terminal?

into this:

It's like your own personal Google Code search!

So let’s not delay any longer – let’s start this coding workout!

Warming up with exec

The entire basis of today’s exercise revolves around PHP’s exec function. exec allows you to run any external shell command or program, and can optionally capture the output of the command in an array passed as the second argument. Here is a very simple example, which you can test in the php -a interactive shell:

<?php echo exec("echo hello world"); ?>

This executes the command given in the first argument, echo hello world, and prints the last line of its output to the interactive console.
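If you want to see all three arguments of exec in action, here is a minimal sketch. The variable names ($lines, $status, $last) are my own choice, and the example assumes a Unix-like shell where printf is available.

```php
<?php
// Collects every line of output; exec() appends to this array.
$lines = array();
// Receives the exit status of the command (0 conventionally means success).
$status = -1;

// printf prints two lines; exec() itself returns only the last one.
$last = exec("printf 'one\\ntwo\\n'", $lines, $status);

echo "Last line: " . $last . "\n";
echo "All lines: " . implode(", ", $lines) . "\n";
echo "Exit status: " . $status . "\n";
```

Note that exec appends to the array it is given, so it is worth resetting the array before each call if you reuse the variable.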

It is easy to see how we can expand this to handle other shell commands as well:

// list all files in the current directory in chronological order
$command = "ls -latr";
// determines the file type of a given filename
$command = "file -i " . $filename;
// recursively perform a case-insensitive search of all files in the current directory matching the grep pattern provided
$command = "grep -Rni \"" . $grep_pattern . "\" .";

Tip: The warning sirens should be going off in the back of your head whenever you see an unescaped variable being passed to a shell command. At the very least, you should pass every variable through escapeshellarg to ensure that all user input is safely escaped.
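To see why the tip matters, here is a small and harmless sketch that runs the same input with and without escapeshellarg. The input string is made up for illustration, and the behaviour shown assumes a Unix-like shell.

```php
<?php
// Hypothetical user input that tries to smuggle in a second command.
$user_input = "hello; echo injected";

// Unsafe: the shell sees two commands, "echo hello" and "echo injected".
$unsafe = array();
exec("echo " . $user_input, $unsafe);

// Safe: escapeshellarg() quotes the input so the shell treats it as one argument.
$safe = array();
exec("echo " . escapeshellarg($user_input), $safe);

print_r($unsafe); // one array entry per command that ran
print_r($safe);   // a single entry containing the original input
```

Swap the second echo for something destructive and it becomes clear why unescaped input should never reach exec.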

Now that we know how to execute a command, we’ll want to capture the output from exec, and mould it into a format we can output to the browser.

<?php
$formatted = "";
// grep command which ignores binary files, piped to remove any search results that match .svn filenames
$command = "grep -RnIi " . escapeshellarg($_POST['grep_pattern']) . " . | grep -v \".svn\"";
$output = array();
exec($command, $output);
foreach($output as $line) {
    // append each line, but make it HTML-friendly first
    $formatted .= htmlspecialchars($line) . "\n";
}
?>

This PHP snippet takes in a submitted grep pattern, escapes it, and performs the grep command. The output of grep is iterated over line by line and HTML-encoded, ready to be placed into HTML pre tags. In this example, I have excluded any results that contain those pesky “.svn” files, common to all projects that are committed into Subversion.

Now to get a complete working example, we need to create the front end interface, with a form where users can submit their grep patterns.

php_grep.php:

<?php 
    $formatted = "";
    $output = array();
    $command = "";
    if (!empty($_POST['grep_pattern'])) {
        $command = "grep -RnIi " . escapeshellarg($_POST['grep_pattern']) . " . | grep -v \".svn\"";
        $result = -1;
        $return_code = -1;
        $result = exec($command, $output, $return_code);
        foreach($output as $line) {
            $formatted .= htmlspecialchars($line) . "\n";
        }
    }    
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
    <head>
        <meta http-equiv="Content-type" content="text/html; charset=utf-8">
        <title>PHP grep</title>
    </head>
    <body id="" onload="">
        <form action="php_grep.php" method="post" accept-charset="utf-8">
            <label for="grep_pattern">Grep for: </label><input type="text" name="grep_pattern" value="" id="grep_pattern">
            <p><input type="submit" value="grep it! &rarr;"></p>
        </form>
<?php if (!empty($command)) { ?>        
        Command: <code><?php echo htmlspecialchars($command); ?></code><br>
        Output: 
        <pre><?php echo $formatted; ?></pre>
        Last line of result was: <code><?php echo htmlspecialchars($result); ?></code><br>
        Return code is: <code><?php echo $return_code; ?></code><br>
        Number of results: <code><?php echo count($output); ?></code>
<?php } ?>
    </body>
</html>

For debugging and educational purposes, the exact command executed, last line of the resulting output, return code and the number of grepped matches are displayed to the user.
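The return code is more useful once you know how to read it. grep exits with 0 when it selected at least one line, 1 when it selected none, and a value greater than 1 on error. The helper below, describe_grep_status, is a hypothetical name of my own, and the sketch runs against a temporary file so it does not depend on your project.

```php
<?php
/**
 * Hypothetical helper: turns a grep exit status into a readable message.
 * grep exits with 0 when lines were selected, 1 when none were, and a
 * value greater than 1 when an error occurred.
 */
function describe_grep_status($return_code) {
    if ($return_code === 0) {
        return "matches found";
    }
    if ($return_code === 1) {
        return "no matches";
    }
    return "grep reported an error (status " . $return_code . ")";
}

// Demonstration against a throwaway file, so no project files are needed.
$tmp = tempnam(sys_get_temp_dir(), "grep");
file_put_contents($tmp, "alpha\nbeta\n");

$output = array();
$return_code = -1;
exec("grep -n " . escapeshellarg("beta") . " " . escapeshellarg($tmp), $output, $return_code);
echo describe_grep_status($return_code) . "\n";

$output = array();
exec("grep -n " . escapeshellarg("gamma") . " " . escapeshellarg($tmp), $output, $return_code);
echo describe_grep_status($return_code) . "\n";

unlink($tmp);
```

Keep in mind that php_grep.php pipes its results through a second grep, so the return code it displays belongs to that final grep -v rather than to the search itself.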

Add some optional stretches

Now that we have set up a wonderful interface for shell commands, it’s time to spice up our game with some search options. In this example, we’ll give the user the ability to perform case-insensitive searches or to restrict the search to only a particular location (that way, each grep won’t always take 5 minutes to return!). I am placing this PHP grep utility on top of an existing Drupal installation, so we’ll configure our search locations to the most common Drupal directories.

Add the following before line 25 of php_grep.php:

            <label for="search_path">Search in: </label>
            <select name="search_path" id="search_path">
                <option value=".">entire drupal site</option>
                <option value="modules">modules</option>
                <option value="themes">themes</option>
                <option value="includes">includes</option>
            </select>
            <input type="checkbox" name="case_insensitive" value="1" id="case_insensitive" checked="checked">
            <label for="case_insensitive"> Case Insensitive</label>

We will need to make backend changes to ensure each of the search options modifies the grep command accordingly.

The following should replace line 6 of php_grep.php:

        $case_insensitive = false;
        if (!empty($_POST['case_insensitive'])) {
            $case_insensitive = true;
        }
        $search_path = $_POST['search_path'];
        $command = "grep -RnI" . ($case_insensitive ? "i" : "") . " " . escapeshellarg($_POST['grep_pattern']) . " " . escapeshellarg($search_path) . " | grep -v \".svn\"";

The user can now perform both case-sensitive and case-insensitive searches in a location of their choice.
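Since the search path arrives from the browser, it is worth checking it against the values the form actually offers before it goes anywhere near the shell. Here is a minimal sketch of that idea. The function name build_grep_command and its $allowed_paths list are my own invention, mirroring the option values above.

```php
<?php
/**
 * Hypothetical helper: builds the grep command from the form values.
 * $allowed_paths mirrors the <option> values in the form; anything
 * else falls back to ".".
 */
function build_grep_command($pattern, $search_path, $case_insensitive) {
    $allowed_paths = array(".", "modules", "themes", "includes");
    if (!in_array($search_path, $allowed_paths, true)) {
        $search_path = ".";
    }
    $flags = "-RnI" . ($case_insensitive ? "i" : "");
    return "grep " . $flags . " " . escapeshellarg($pattern) . " "
        . escapeshellarg($search_path) . " | grep -v \".svn\"";
}

// A path from the form is used as given.
echo build_grep_command("hook_menu", "modules", true) . "\n";
// A path that is not in the list falls back to the current directory.
echo build_grep_command("hook_menu", "/etc", false) . "\n";
```

The whitelist keeps the search inside the directories offered in the form, while escapeshellarg stops the value from being read as shell syntax.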

Mix in a few syntax highlighting exercises

This is turning into quite the full-featured web-based grepper, but we’re not done yet! Let’s add some syntax highlighting to our search results, so that developers will be able to easily tell if the line is a comment or a function declaration.

To be able to do this, we will need to split up each matched grep line into three components: the filepath, the line number of the match, and the matched line to highlight.

Replace lines 15 to 17 of php_grep.php with:

        foreach($output as $line) {
            $grep_parts = explode(":", $line, 3); // split into three parts
            $filepath = $grep_parts[0];
            $lineno = $grep_parts[1];
            $matched_line = $grep_parts[2];

            $formatted .= "<span class=\"path_component\"><span class=\"filename\">" . htmlspecialchars($filepath) . "</span>:" . $lineno . "</span> <span class=\"match_component\"><pre class=\"brush: php; first-line: " . $lineno . "\">" . htmlspecialchars($matched_line) . "</pre></span><br>";
        }

That’s one doozy of a line! Instead of returning each row as plain text, we’re placing the filename and line number in one span, and displaying the matching result in another. Of course, that means instead of dumping all of the output in a pre tag at the bottom of the page, we should just print out the formatted content normally:

Replace lines 48 and 49 of php_grep.php with:

        Output: <br>
        <?php echo $formatted; ?>
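The three-way split in the formatter assumes every line of output looks like path:lineno:text. If you would rather skip lines that do not fit that shape, the split can be pulled into a small helper. This is only a sketch, and parse_grep_line is a hypothetical name.

```php
<?php
/**
 * Hypothetical helper: splits one line of "grep -n" output into its parts.
 * Returns null when the line does not look like path:lineno:text.
 */
function parse_grep_line($line) {
    // A limit of 3 keeps any further colons inside the matched text.
    $parts = explode(":", $line, 3);
    if (count($parts) < 3 || !ctype_digit($parts[1])) {
        return null;
    }
    return array(
        "filepath" => $parts[0],
        "lineno"   => (int) $parts[1],
        "match"    => $parts[2],
    );
}

// A normal match, with extra colons in the matched text.
print_r(parse_grep_line("./modules/node/node.module:42:  return array('a' => 'b::c');"));
// A line that is not a match at all.
var_dump(parse_grep_line("grep: ./private: Permission denied"));
```

One limitation to be aware of: a file path that itself contains a colon will still be split in the wrong place.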

Keen eyes may notice that there’s this php “brush” class that is used on the pre tag on line 21. This is where some JavaScript magic comes into play. Alex Gorbatchev wrote an amazing syntax highlighter, which knows how to colour lines of code based on syntax files called “brushes” (in fact, this blog you are reading now makes good use of Gorbatchev’s highlighter). There is already significant support for several languages, but if you don’t find yours there, feel free to create your own!

To get syntax highlighting working, download and install the plugin into your PHP site – I placed these files in the top-level drupal directory, so I can easily grep all Drupal files, but you may feel free to place them anywhere where you can serve PHP scripts. The installation on your development environment could be as straightforward as placing these files in the same location as php_grep.php, in which case, you should add the following lines in the header of php_grep.php:

Add the following before line 31 of php_grep.php:

        <style type="text/css" media="screen">
            .filename {
                font-family: Helvetica;
                font-size: 10pt;
            }
        </style>
        <script type="text/javascript" src="xregexp-min.js"></script>
        <script type="text/javascript" src="shCore.js"></script>
        <script type="text/javascript" src="shBrushPhp.js"></script>
        <link href="shCore.css" rel="stylesheet" type="text/css" />
        <link href="shThemeDefault.css" rel="stylesheet" type="text/css" />

and the following JavaScript snippet to the bottom of the document body to fire off the syntax highlighting:

Add the following before line 65 of php_grep.php:

        <script type="text/javascript">
             SyntaxHighlighter.all()
        </script>

Note: Another point you may have noticed is that only the php brush is used for highlighting grep results. Since Drupal is predominantly composed of PHP, this is not too big of a deal. However, for your own projects, you will want to include additional brushes in the header so that the highlighting remains as accurate as possible. When formatting, you can use another exec command to determine the precise file type (and thus render with an appropriate brush type). Parsing the output from the following:

$command = "file " . escapeshellarg($filename);
exec($command, $output);

may work for you.
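If parsing the output of file feels like overkill, a simpler alternative (not the approach described above) is to pick the brush from the file extension. The sketch below does that. The function name brush_for_file and the extension table are assumptions for a Drupal-style project, and the brush aliases should be checked against the brushes you actually install.

```php
<?php
/**
 * Hypothetical helper: picks a SyntaxHighlighter brush alias from a filename.
 * The extension-to-brush table is an assumption for a Drupal-style project;
 * adjust it to the brushes you actually load.
 */
function brush_for_file($filename) {
    $brushes = array(
        "php"     => "php",
        "module"  => "php",
        "inc"     => "php",
        "install" => "php",
        "js"      => "js",
        "css"     => "css",
        "html"    => "xml",
    );
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    // Fall back to the plain text brush for anything unrecognised.
    return isset($brushes[$extension]) ? $brushes[$extension] : "plain";
}

echo brush_for_file("modules/node/node.module") . "\n";
echo brush_for_file("misc/drupal.js") . "\n";
echo brush_for_file("README") . "\n";
```

Each alias only works once its brush script is included in the header, just as shBrushPhp.js is for php.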

Cooling down with cat

Now let’s wrap up our nifty web-based grep with a web-based cat. When a user clicks on a filename link, we’ll open up a new window containing the source code for the corresponding file, render it with the appropriate brush, and then highlight the line that matches the line number in the grep output.

Not surprisingly, php_cat.php looks very similar to php_grep.php. The differences are mostly in the PHP block at the top, which reads its input from the query string and runs cat instead of grep.

php_cat.php:

<?php 
    $formatted = "";
    $output = array();
    $command = "";
    if (!empty($_GET['filepath'])) {
        $result = -1;
        $highlight_lines = isset($_GET['lineno']) ? (int) $_GET['lineno'] : 0; // cast to int so it is safe to print into the class attribute
        $command = "cat " . escapeshellarg($_GET['filepath']);
        $return_code = -1;
        $result = exec($command, $output, $return_code);
        $highlight = "";
        if (!empty($highlight_lines)) {
            $highlight = "; highlight: " . $highlight_lines;
        }
        $formatted = "<pre class=\"brush: php" . $highlight . "\">";
        foreach($output as $line) {
            $formatted .= htmlspecialchars($line) . "\n";
        }
        $formatted .= "</pre>";
    }    
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
    <head>
        <meta http-equiv="Content-type" content="text/html; charset=utf-8">
        <title>PHP cat: <?php echo htmlspecialchars(isset($_GET['filepath']) ? $_GET['filepath'] : ''); ?></title>
        <script type="text/javascript" src="xregexp-min.js"></script>
        <script type="text/javascript" src="shCore.js"></script>
        <script type="text/javascript" src="shBrushPhp.js"></script>
        <link href="shCore.css" rel="stylesheet" type="text/css" />
        <link href="shThemeDefault.css" rel="stylesheet" type="text/css" />
    </head>
    <body id="" onload="">
<?php if (!empty($command)) { ?>
        Output: <br>
        <?php echo $formatted; ?>
<?php } ?>
        <script type="text/javascript">
             SyntaxHighlighter.all()
        </script>
    </body>
</html>

Although we format the text in a similar fashion to php_grep.php, there is additional logic to highlight the appropriate line number that was passed in. You may want to include special highlighting options in your cat output, which can be set up through the syntax highlighter configuration.

Finally, we’ll need to update the grep output formatter in php_grep.php to link to our new PHP cat function.

Replace line 21 of php_grep.php with:

$formatted .= "<span class=\"path_component\"><a href=\"php_cat.php?filepath=" . urlencode($filepath) . "&amp;lineno=" . $lineno . "\" target=\"_blank\"><span class=\"filename\">" . htmlspecialchars($filepath) . "</span></a>:" . $lineno . "</span> <span class=\"match_component\"><pre class=\"brush: php; first-line: " . $lineno . "\">" . htmlspecialchars($matched_line) . "</pre></span><br>";

And there you have it. We have built a working web interface around two widely used shell commands using fewer than 120 lines of code! The complete source code for this week’s tutorial can be found at the bottom of this blog post.

Extensions

There is no limit to where you can go from here. You can implement wrappers for the commands you use most often. As you have seen, it is straightforward to concatenate shell commands together. Save time by designing a simple easy-to-use interface that will automate your more mundane tasks so you will have more time to concentrate on the actual implementation.

Note of warning: this tutorial only covers the very basics of setting up a web interface around executed shell commands. This is intended only for development environments, as it is very easy for a user to enter malicious commands into your form fields or to modify the URLs to access data that should otherwise be local and private.
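One way to narrow the URL-tampering risk mentioned above is to refuse any file that does not live under the directory you mean to expose. Here is a minimal sketch using realpath. The function name is_path_inside is hypothetical, and the demonstration uses a throwaway directory rather than a real site.

```php
<?php
/**
 * Hypothetical helper: returns true only when $candidate resolves to an
 * existing path inside $base_dir. realpath() follows symlinks and
 * collapses ".." segments, and returns false for paths that do not exist.
 */
function is_path_inside($base_dir, $candidate) {
    $base = realpath($base_dir);
    $real = realpath($candidate);
    if ($base === false || $real === false) {
        return false;
    }
    // Compare against the base plus a separator so "/site-old" is not
    // mistaken for a child of "/site".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strpos($real, $prefix) === 0;
}

// Demonstration in a throwaway directory.
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . "php_cat_demo_" . uniqid();
mkdir($base);
file_put_contents($base . DIRECTORY_SEPARATOR . "page.txt", "demo\n");

$inside  = is_path_inside($base, $base . "/page.txt");    // a file in the base
$parent  = is_path_inside($base, $base . "/../");         // the parent directory
$missing = is_path_inside($base, $base . "/missing.txt"); // does not exist
var_dump($inside, $parent, $missing);

unlink($base . DIRECTORY_SEPARATOR . "page.txt");
rmdir($base);
```

In php_cat.php, a check like this would sit before the cat command is built, with the site root as the base directory. It reduces the exposure but does not make the tool safe for anything beyond a development environment.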

This tutorial covers a basic prototype of a possible web interface for grep and cat. You may want to expand it so that the user has a bigger set of options to choose from (for example, to exclude results containing some other criteria, or to interpret Perl regular expressions).

Furthermore, this prototype does no error reporting of any kind. Since a tool like this is largely experimental, you may want to show all warnings by setting the error reporting level at runtime:

error_reporting( E_ALL );
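Two related details are easy to miss. PHP only prints the errors it reports when display_errors is switched on, and exec only captures standard output, so a command's own error messages are lost unless you redirect them. The sketch below shows both. The path is deliberately bogus, and a Unix-like shell is assumed.

```php
<?php
// Report every notice and warning, and print them in the page.
// display_errors is often switched off in php.ini, in which case
// error_reporting() alone shows nothing in the browser.
error_reporting(E_ALL);
ini_set("display_errors", "1");

// exec() captures standard output only. Appending 2>&1 merges the
// command's error messages into $output so failures become visible.
$output = array();
$return_code = -1;
exec("ls /this/path/should/not/exist 2>&1", $output, $return_code);

echo "Return code: " . $return_code . "\n";
echo implode("\n", $output) . "\n";
```

Without the redirect, $output would stay empty here and the non-zero return code would be the only hint that something went wrong.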

Thanks for sticking with me all the way to the end. I look forward to seeing what crazy applications and interfaces you come up with!


References

  1. PHP: exec, today’s superhero, and its faithful sidekick, escapeshellarg
  2. htmlspecialchars, a handy function to convert special characters to their HTML entities
  3. SyntaxHighlighter, the code highlighting magic used by Apache, Mozilla, Yahoo!, WordPress, and Freshbooks

Source Files

As promised, here is a zip file that contains a working source for this tutorial which you can drop into your PHP environment to get you going.

  1. I just came across this blog and it is a good article, a little bit on the long end but pretty satisfactory one.

    Sep 4th, 2010
  2. Just what I wanted. But what happened to the ZIP file?

    Peter Newman
    Feb 20th, 2011
  3. Oops – should be fixed now. Try again.

    Jeremy Chan
    Feb 22nd, 2011
  4. This is absolutely gorgeous and elegant. Kudos and thanks for the script highlighting integration, very extensible.

    Oct 9th, 2011
