Sent Headers Strike Back, Episode II

An adventure of exactly one-byte proportions

This is the story of the extra byte, which inevitably took hours to track down and finally exterminate.

What Happened to Episode I?

For those of you just joining the PHP community or getting started with WordPress or Drupal, the “Cannot modify header information – headers already sent” warning is PHP’s way of telling you that some text was output before the HTTP headers were sent. The offending text could be an empty line, a single character or space, or your difficult ex-coworker’s misbehaving script that automatically adds a commit comment to the top of every file whenever you check something in and try to be at least somewhat productive!

While this might seem like an unlikely scenario, it is actually quite easy to reproduce. For instance, if we wanted to write a script, transfer_pdf.php, to transfer a PDF to the user, it might look like this:

transfer_pdf.php

 <?php
header('Content-type: application/pdf');
// open PDF for reading
?>

However, running transfer_pdf.php produces the error:

PHP Warning: Cannot modify header information - headers already sent by (output started at /Users/dchang/Desktop/transfer_pdf.php:1) in /Users/dchang/Desktop/transfer_pdf.php on line 2

Yep, someone added an extra space before the opening <?php tag, which causes it to be printed before PHP can output the Content-type header. Dang, that’s picky.

Why is PHP so uptight about sent headers? Because the first few lines of an HTTP response are the headers, which tell the client’s web browser what to expect: an HTML document, an image, an audio file, or a special instruction (like a redirect or an error status). Once any part of the body has gone out, it is too late to add or change a header. PHP is not the only one that’s fussy about this; CGI scripts written in other languages, such as Perl and Python, must also emit their headers before any other output.

As you can imagine, an error of this sort can be incredibly difficult to track down – especially when there are possibly thousands of included files to check for in content-management systems. That’s why pages upon pages of technical literature have been written about this topic, with still hundreds and thousands more of the uninformed public angry and confused with every passing day. But now you know, so you can be the enlightenment that will save the masses when the time comes to pull out that debugger.
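When the warning itself is not at hand, PHP's built-in headers_sent() function can report the file and line where output began. The sketch below wraps it in a helper; the name output_origin() is made up for illustration. Keep in mind that with output buffering enabled, stray output may sit in the buffer without the headers being sent, in which case this check reports nothing.

```php
<?php
// Hypothetical helper (not part of PHP or Drupal): returns "file:line" for
// the place where output started, or null if no headers have been sent yet.
function output_origin(): ?string
{
    $file = '';
    $line = 0;
    // headers_sent() fills $file and $line by reference when it returns true.
    if (headers_sent($file, $line)) {
        return $file . ':' . $line;
    }
    return null;
}
```

Calling output_origin() just before a header() call and logging a non-null result points at the same location the warning would name.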

Return of the Sent Header: a new Phantom Menace

Now that we’ve left sent headers for dead, we can move on to more important things in life, such as ascertaining the number of licks it’ll take to get to the Tootsie Roll center of a Tootsie Pop. Right?

Not so fast! You would’ve expected someone who has run the gauntlet and seen more than enough Drupal errors to finally be rid of this fiend once and for all. Yet, many months later, our wicked sent headers have struck again, this time in an entirely inconspicuous manner. Here is how our story goes:

We first noticed this phantom menace when we were experiencing decompression issues with our ZIP files that we generated through a custom Drupal module. Although the file that was created on the server-side unzipped properly, transferring the zip file through to the browser and then decompressing it failed. On Mac OS X Leopard, this came in the form of a recursing ZIP file structure with a .cpgz extension; attempting to open this newly unzipped archive created yet another duplicate .cpgz file.

Without a better idea of what caused this infinite loop to occur only when downloading (but not on the server, which was generating the zip files on the same machine) we headed to Terminal and tried:

unzip entries.zip

Now this produced an obscure warning:

Archive: /Users/dchang/Downloads/entries.zip
warning [/Users/dchang/Downloads/entries.zip]: 1 extra byte at beginning or within zipfile
(attempting to process anyway)
inflating: SentHeadersStrikeBack-David.doc

which very much left us flabbergasted. Not only did we know our zip file was corrupted, but it was corrupted by exactly one extra byte. But no matter how many times we looked at the zipping and file transferring code – forwards, backwards or upside-down – we could not make heads or tails of it. In the end, we resorted to using the built-in file_transfer Drupal method, which seemed to have at least handled the file transfer bit properly.
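In hindsight, the unzip warning can be checked by hand. A non-empty ZIP archive normally begins with the local file header signature, the four bytes "PK\x03\x04", so anything in front of that signature is stray output. The following is a minimal sketch under that assumption; the function name is hypothetical, and it does not handle empty or self-extracting archives.

```php
<?php
// Hypothetical helper: counts the bytes that precede the first ZIP local
// file header signature. Returns -1 if the signature is not found at all.
function stray_bytes_before_zip(string $data): int
{
    $pos = strpos($data, "PK\x03\x04");
    return $pos === false ? -1 : $pos;
}
```

Running it on the contents of a downloaded archive (for example via file_get_contents()) should return 0 for a clean file; a result of 1 would match the "1 extra byte" warning if the byte sits at the very beginning.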

Not many days later, when we were working on a CSV view, we noticed that a blank row started appearing at the top of all CSV files that were generated. We played around with style plugins and settings with no luck, completely removing the column headers and taking apart theme templates in our frustration.

After what must have seemed like years of fruitless and frustrating investigations, by sheer coincidence, we viewed the source of one of the Drupal-rendered HTML pages, and there we stumbled upon a blank line at the top of the file! Did this ever strike a chord with the past!

In overdrive mode, we knew from experience that somewhere in the vast jungle of Drupal code, there was a blank line at the top of an included file – most likely a custom module – but we didn’t know which one.

Drupal’s “Headers already sent” troubleshooter suggests that you “disable modules one by one to find out which one is causing the problem.” But with 106 enabled modules, this wasn’t only time-consuming, but infeasible as well.

Instead, we made Drupal tell us which module had the extra line by temporarily modifying the core module-loading functionality (a development-only trick; see the note at the end of this post):

drupal/includes/module.inc

function module_load_all() {
	foreach (module_list(TRUE, FALSE) as $module) {
		print "|" . $module; // output each module's name
		drupal_load('module', $module);
	}
	die(); // quit immediately after looping
}

The print statement outputs each module’s name, preceded by a pipe character. Since no line breaks are specified, all of the module names should appear on a single line. If any module file has blank lines before its opening <?php tag, they will be printed while that module is being loaded, right after its name. The die() call quits immediately after the loop, so no other content is processed or output.

Now when you load the source code for any Drupal page, a bunch of module names will be listed:
|getid3|uc_discounts|content_profile|webform|block|oops
|dblog|filter|locale|menu|node|path|php|search|system|taxonomy|upload|user...

In the example output above, the line break falls right after oops, so the oops module is the guilty party. Sure enough, one of the files in this module contained an extra line before the opening <?php tag. We blasted the little byte-sized bugger to oblivion, removed the print and die() statements from module.inc, and lo and behold, no more blank rows in our CSVs!
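The same hunt can be done without touching core by scanning the files themselves for anything that precedes the opening tag. This is only a sketch of that alternative, not what we actually ran; both function names are hypothetical, and it assumes every file in the list is supposed to start with <?php (template files that open with HTML would show up as false positives).

```php
<?php
// Hypothetical helper: number of bytes before the first "<?php" tag,
// or -1 if the source contains no opening tag at all.
function bytes_before_open_tag(string $source): int
{
    $pos = strpos($source, '<?php');
    return $pos === false ? -1 : $pos;
}

// Hypothetical helper: returns the paths whose contents begin with stray
// output (whitespace, a blank line, or anything else) before "<?php".
function files_with_leading_output(array $paths): array
{
    $offenders = [];
    foreach ($paths as $path) {
        $source = @file_get_contents($path);
        // Skip unreadable files; flag those with at least one leading byte.
        if ($source !== false && bytes_before_open_tag($source) > 0) {
            $offenders[] = $path;
        }
    }
    return $offenders;
}
```

One way to feed it is files_with_leading_output(get_included_files()) at the end of a request on a development site, which checks every file PHP has included so far.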

Now the question remains: why didn’t PHP throw a “Headers already sent” error in the first place if there was a blank line before the opening <?php tag? It would’ve saved hours of frustration if PHP threw out the warning and indicated the culprit file. That was when we realized the irony of it all. Simply said: the headers were already sent.


References

  1. Warning: “Headers already sent” or “Cannot modify header information” @ drupal.org
  2. PHP: header – Manual
  3. Solve PHP error: Cannot modify header information – headers already sent @ Tech-Recipes
  4. And in general, Error messages @ drupal.org
  1. Note: Never hack Drupal core on a live production site! This trick should only be employed on a development site. Depending on your version of Drupal, the source may vary from what you see here. Be sure to back up core files before making any changes to them.
  1. Nice work! I’m glad I didn’t have to track that one down.

    Jeremy Chan
    May 21st, 2010
