PHP Basics

In its most basic form, a command line script looks just like any other PHP script. Here’s an example:

<?php
print "Hello World!";
?>

Filename: helloworld.php

You can execute this script by typing:

php helloworld.php

You can avoid specifically having to call the php binary first on UNIX-based systems by placing a SheBang right at the start of the script. This “points” the shell at the binary that should be used to execute the script. Here’s an example:

#!/usr/local/bin/php
<?php
print "Hello World!";
?>

Note the path to your PHP executable may vary, /usr/bin/php being another common location.

I can now execute this script with:

$ ./helloworld.php

Oh wait… If you’re running on a UNIX-based system, you first need to make this script executable with:

chmod +x helloworld.php

On UNIX-based systems, you may also consider dropping the .php file extension, as it’s not required in order to execute the script. There’s a benefit to doing so: if you later replace the script with one written in another language that provides the same functionality, there’ll be no confusion.

On Windows, you can achieve almost the same thing by associating the PHP file extension with the PHP binary. You’ll find this explained at the end of Replacing Perl Scripts with PHP Scripts.

Here, I’ll stick to using a .php extension, and I’ll assume you’ll be executing the scripts by first calling the PHP binary, as this makes the most effective cross-platform approach.

When it comes to displaying output from your command line scripts, using the normal echo or print commands is not the best way to go, for two reasons.

First, on Windows, see what happens when you execute the following from a command prompt:

<?php
$i = 0;
while ( $i < 10 ) {
    print $i."\n";
    sleep(1);
    $i++;
}
?>

Filename: printproblem.php

Instead of seeing the numbers displayed at the moment they are printed, with one-second intervals between them, they’re all flushed out at once when the script terminates (note that running the same script on Linux results in the numbers being displayed as they happen).

The other reason is more a question of good practice. PHP (the underlying shell, in fact) allows you to direct normal output that’s meant for the user to what’s known as the “standard out”, while diverting any error messages to the “standard error”. The advantage in making this division is that it allows you to log errors separately from messages generated by the normal, smooth operation of the script, which can be very useful when running batch jobs.

Part of PHP’s CLI interface are three “streams” with which you can interact in more or less the same way as you would a file resource returned from fopen(). The streams are identified with the strings:

php://stdin (read)
php://stdout (write)
php://stderr (write)

With the PHP 4.3.0+ CLI binary, these three streams are automatically available, identified with the constants STDIN, STDOUT and STDERR respectively. Here’s how I can use the STDOUT to fix the above script so that it behaves correctly on Windows:

<?php
$i = 0;
while ( $i < 10 ) {
    // Write some output
    fwrite(STDOUT, $i."\n");
    sleep(1);
    $i++;
}
?>

Filename: streamout.php

The count should now be displayed at the moment at which fwrite() is called.

Note that, on Windows, PHP takes care of linefeed characters for you, so that they display correctly (you weren’t trying to use <br>, were you?).

To be able to read input from the script’s user, you can use STDIN combined with fgets(), fread(), fscanf() or fgetc(). For example:

<?php
fwrite(STDOUT, "Please enter your name\n");

// Read the input
$name = fgets(STDIN);

fwrite(STDOUT, "Hello $name");

// Exit correctly
exit(0);
?>

Filename: nameplease.php

What happens when the script is executed? Well, when the interpreter reaches the fgets() command (it sees the code reading from STDIN), execution pauses and waits for the user to hit enter. On pressing return, fgets() receives everything the user typed from the point at which execution paused, until the moment he or she pressed return. This includes the return linefeed character — see what happens if you display the contents of the $name variable after passing it through nl2br().

This “pause / continue” behaviour is actually controlled by the terminal (the command prompt). It works on a line-by-line basis and is known as Canonical Mode Input Processing.

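Note that, on UNIX only, it’s also possible to respond directly to particular key sequences, such as Ctrl+C, by installing a signal handler with the pcntl_signal() function. Strictly speaking, that is signal handling rather than Non-canonical Mode input; taking the terminal out of canonical mode is a matter of changing the terminal’s own settings (for example, with the stty utility).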
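Coming back to the linefeed that fgets() returns along with the user’s input: it’s usually worth stripping it before you use the value. Here’s a minimal sketch, assuming a helper I’ve called readLine() (my own name, not something PHP provides). It takes the stream as a parameter, so it can be tried against something other than the keyboard:

```php
<?php
// readLine() is a hypothetical helper name, not a built-in PHP function.
// It reads one line from the given stream and removes the trailing
// line ending ("\n" on UNIX, "\r\n" on Windows).
function readLine($stream) {
    $line = fgets($stream);
    if ($line === false) {
        // End of input (or a read error): nothing to return
        return false;
    }
    return rtrim($line, "\r\n");
}

// Typical use from the command line would be:
//   fwrite(STDOUT, "Please enter your name\n");
//   $name = readLine(STDIN);
//   fwrite(STDOUT, "Hello $name\n");
```

Using rtrim() with an explicit character list removes only the line ending, so any spaces the user typed on purpose are left alone.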

Note also that I’ve started to use the exit() command at the end of the script, passing it an integer value. This is another question of “good manners”. Common practice on UNIX-based systems is for a script to return a status code when execution halts, zero being the standard for denoting the “All OK”. Other codes (integers between 1 and 254) are used to identify problems that caused the script to terminate prematurely (it’s up to you to define the error codes for your script, which is commonly achieved by defining a set of constants at the start). If another command line script (perhaps written in another language) is used to execute your script, it will rely on the status code returned from exit() to determine whether your script ran successfully or not. For some more thoughts on the subject, see the Advanced Bash-Scripting Guide on Exit and Exit Status.
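To make the idea of a set of status constants concrete, here’s a small sketch. The constant names, their values (other than zero) and the checkInputFile() function are all invented for this example:

```php
<?php
// Exit status codes for a hypothetical batch script. The names and
// values are made up for this example; only 0 has a fixed meaning.
define('EXIT_OK', 0);
define('EXIT_FILE_MISSING', 1);
define('EXIT_FILE_UNREADABLE', 2);

// Decide which status code applies, without exiting, so that the
// decision is made in one place and used at the end of the script.
function checkInputFile($path) {
    if (!file_exists($path)) {
        return EXIT_FILE_MISSING;
    }
    if (!is_readable($path)) {
        return EXIT_FILE_UNREADABLE;
    }
    return EXIT_OK;
}

// At the end of a real script you might finish with:
//   exit(checkInputFile('input.csv'));   // 'input.csv' is a made-up name
```

On UNIX, the calling shell can then read the status of the last command from its $? variable.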

You can also use STDIN to handle choices:

<?php
fwrite(STDOUT, "Pick some colors (enter the letter and press return)\n");

// An array of choice to color
$colors = array (
    'a'=>'Red',
    'b'=>'Green',
    'c'=>'Blue',
);

fwrite(STDOUT, "Enter 'q' to quit\n");

// Display the choices
foreach ( $colors as $choice => $color ) {
    fwrite(STDOUT, "\t$choice: $color\n");
}

// Loop until they enter 'q' for Quit
do {
    // A character from STDIN, ignoring whitespace characters
    do {
        $selection = fgetc(STDIN);
    } while ( trim($selection) == '' );

    if ( array_key_exists($selection,$colors) ) {
        fwrite(STDOUT, "You picked {$colors[$selection]}\n");
    }

} while ( $selection != 'q' );

exit(0);
?>

Filename: pickacolor.php

The end user sees something like this:

Pick some colors (enter the letter and press return)
Enter 'q' to quit
    a: Red
    b: Green
    c: Blue
b
You picked Green
c
You picked Blue
q

Using a loop, I can get my script to continue execution until the user enters ‘q’ to quit. Another approach would be a while loop that continues forever, but contains a break instruction when some condition is met. It may take you some time to get familiar with this, particularly if you’ve been using PHP in a Web application in which execution normally halts automatically after 30 seconds, and user input is provided in a single bundle at the start of a request. The main thing is to pay close attention to your code when you insert “never ending” loops like this, so you’re not inadvertently executing a database query over and over, for example.

Now, let’s get back to the example above. Have a look at what’s happening here:

// A character from STDIN, ignoring whitespace characters
do {
    $selection = fgetc(STDIN);
} while ( trim($selection) == '' );

The fgetc() function pulls a single character off the STDIN stream that’s coming from the user. When execution reaches this point, there’s a pause while the terminal waits for the user to enter a character and press return. Both the character I enter, such as the letter ‘a’, and the new line character are placed on the STDIN stream. Both will eventually be read by fgetc(), but on consecutive loop iterations. As a result, I need to ignore the new line characters which, in this case, I have done using the trim() function, comparing what it returns to an empty string.

If that doesn’t make sense, try commenting out the do { and } while ( trim($selection) == '' ); lines in this inner loop, before and after fgetc(). Then, see what happens when you run the script.

Also, right now, users of this script can enter multiple characters between presses of the return key. See if you can work out a way to prevent that by confirming that return was pressed between the user’s choices.
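If you’d like to compare notes, here’s one possible approach, offered as a sketch and not the only answer. It reads a whole line at a time with fgets(), so anything other than a single letter followed by return is rejected. It also uses the “loop forever, then break” style mentioned earlier. The pickColors() function name and the wording of the messages are my own inventions:

```php
<?php
// pickColors() is a hypothetical function for this sketch. It takes the
// input and output streams as parameters; from a real script you would
// call pickColors(STDIN, STDOUT, $colors).
function pickColors($in, $out, $colors) {
    $picked = array();

    // Loop "forever"; the break statements below end the loop
    while (true) {
        // Read a whole line, so return must be pressed after each choice
        $line = fgets($in);
        if ($line === false) {
            break; // input was closed
        }

        $selection = trim($line);
        if ($selection == 'q') {
            break; // user asked to quit
        }

        // Reject empty lines, multiple characters and unknown letters
        if (strlen($selection) != 1 || !array_key_exists($selection, $colors)) {
            fwrite($out, "Please enter a single letter from the list\n");
            continue;
        }

        fwrite($out, "You picked {$colors[$selection]}\n");
        $picked[] = $colors[$selection];
    }

    return $picked;
}
```

Passing the streams in as parameters is a design choice made for the sketch: it lets you feed the function a prepared list of answers when trying it out, without typing at the keyboard.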

Earlier on, I mentioned that you could write errors to a different output location defined by the STDERR constant in PHP. This is achieved in the same way as writing to STDOUT:

<?php
// A constant to be used as an error return status
define ('DB_CONNECTION_FAILED',1);

// Try connecting to MySQL
if ( !@mysql_connect('localhost','user','pass') ) {
    // Write to STDERR
    fwrite(STDERR,mysql_error()."\n");
    exit(DB_CONNECTION_FAILED);
}

fwrite(STDOUT,"Connected to database\n");
exit(0);
?>

Filename: failedconnection.php

Any connection errors will now be reported using STDERR. What this script doesn’t do is demonstrate why this functionality can be useful. For that, consider the following:

<?php
// A custom error handler
function CliErrorHandler($errno, $errstr, $errfile, $errline) {
    fwrite(STDERR,"$errstr in $errfile on $errline\n");
}
// Tell PHP to use the error handler
set_error_handler('CliErrorHandler');

fwrite(STDOUT,"Opening file foobar.log\n");

// File does not exist - error is generated
if ( $fp = fopen('foobar.log','r') ) {
    // do something with the file here
    fclose($fp);
}

fwrite(STDOUT,"Job finished\n");
exit(0);
?>

Filename: outwitherrors.php

Now, when you execute this script normally (assuming the file foobar.log doesn’t exist in the same directory as the script), it produces output like this:

Opening file foobar.log
fopen(foobar.log): failed to open stream: No such file or directory in /home/harryf/outwitherrors.php on 11
Job finished

The error messages are mixed with the normal output as before. But by redirecting the output from the script, I can split the errors from the normal output:

$ php outwitherrors.php 2> errors.log

This time, you’ll only see these messages:

Opening file foobar.log
Job finished

But, if you look into the directory in which you ran the script, a new file called errors.log will have been created, containing the error message. The number 2 is the command line handle used to identify STDERR. Note that 1 is the handle for STDOUT, while 0 is the handle for STDIN. Using the > symbol from the command line, you can direct output to a particular location.
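If you’d like to see the split between the two output streams without involving the shell at all, here’s a rough sketch. The runJob() function and its messages are invented for illustration; the point is only that normal messages and error messages are written to two different handles:

```php
<?php
// runJob() is a hypothetical function: normal messages go to $out and
// error messages go to $err. A real script would call
// runJob(STDOUT, STDERR, 'foobar.log').
function runJob($out, $err, $file) {
    fwrite($out, "Opening file $file\n");

    if (!file_exists($file)) {
        // Only this line ends up on the error stream
        fwrite($err, "$file: No such file or directory\n");
    }

    fwrite($out, "Job finished\n");
}
```

Called with STDOUT and STDERR, a script built around this function behaves like outwitherrors.php: redirecting handle 2 to a file leaves only the first and last messages on screen.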

Although this may not seem very exciting, it’s a very handy tool for system administration. A simple application, running some script from cron, would be the following:

$ php outwitherrors.php >> transaction.log 2>> errors.log

Using ‘>>’, I tell the terminal to append new messages to the existing log (rather than overwrite it). The normal operational messages are now logged to transaction.log, which I might peruse once a month, just to check that everything’s OK. Meanwhile, any errors that need a quicker response end up in errors.log, which some other cron job might email me on a daily basis (or more frequently) as required.

There’s one difference between the UNIX and Windows command lines, when it comes to redirecting output, of which you should be aware. On UNIX, you can merge the STDOUT and STDERR streams to a single destination, for example:

$ php outwitherrors.php > everything.log 2>&1

What this does is re-route the STDERR to STDOUT, meaning that both get written to the log file everything.log.

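Unfortunately, the command interpreter on Windows 95, 98 and ME doesn’t support this capability, although you can find tools (like StdErr) that can help you achieve the same ends. The cmd.exe shell on Windows NT, 2000 and XP does accept the same 2>&1 syntax.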

In general, the subject of piping IO is one that I can’t do full justice to. On UNIX, it’s almost a (black?) art, and usually comes into play when you start using other command line tools like sed, awk and xargs (O’Reilly have dedicated entire books to the subject, including the Loris Book). What you should be getting a feeling for, however, is that, by conforming to the standard conventions for shell scripting, you have the chance to tap into a powerful system administration “framework” that provides all sorts of other tools.

You’ve already seen some examples of how to read user input from STDIN, once a script has already begun execution. What about being able to pass information to the script at the point it is executed?

If you’ve ever programmed in a language like C or Java, you’ll no doubt be familiar with variables called “argc” and “argv”. With PHP, the same naming convention is used, with the integer variable $argc containing the number of arguments passed to the script, while $argv contains an indexed array of the arguments, the “delimiter” between each argument being a space character.

To see how they work, try the following script:

<?php
// Correct English grammar...
$plural = $argc > 1 ? 's' : '';

// Display the number of arguments received
fwrite(STDOUT,"Got $argc argument$plural\n");

// Write out the contents of the $argv array
foreach ( $argv as $key => $value ) {
    fwrite(STDOUT,"$key => $value\n");
}
?>

Filename: arguments.php

I execute this script as normal:

$ php arguments.php

The result is:

Got 1 argument
0 => arguments.php

The first and only argument is the name of the script being executed.

Now, if I execute the following:

$ php arguments.php Hello World!

The code returns:

Got 3 arguments
0 => arguments.php
1 => Hello
2 => World!

The strings “Hello” and “World!” are treated as two separate arguments.

If, instead, I use quotes:

$ php arguments.php "Hello World!"

The output is as follows:

Got 2 arguments
0 => arguments.php
1 => Hello World!

As you can see, passing basic arguments to a PHP script in this manner is very easy. Of course, there are always a few “gotchas” to watch out for.

The variables $argc and $argv are made available in PHP’s global scope to those using the PHP 4.3.0+ version of the CLI binary. They are not available outside the global scope (such as inside a function). When you use a CGI binary with register_globals off, $argc and $argv will not be available at all. Instead, it’s best to access them via the $_SERVER array you’re well used to, using $_SERVER['argv'] and $_SERVER['argc']. They will be available here irrespective of SAPI and register_globals, so long as (wait for it…) the 'register_argc_argv' ini setting is switched on in php.ini (which, thankfully, it has been, by default, since PHP 4.0.0).
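As a small illustration of the scope issue, here’s a sketch of a helper that fetches the arguments from inside a function. The getArguments() name is hypothetical, and returning an empty array when nothing is available is just the choice I’ve made for this example:

```php
<?php
// getArguments() is a hypothetical helper. It reads the arguments from
// $_SERVER, which is a superglobal and so is visible inside functions,
// unlike the plain $argv variable.
function getArguments() {
    if (isset($_SERVER['argv']) && is_array($_SERVER['argv'])) {
        return $_SERVER['argv'];
    }
    // register_argc_argv is switched off, or no arguments are available
    return array();
}
```

A script could then call count(getArguments()) wherever it would otherwise have relied on $argc.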

The other main “gotcha” is that first argument you saw above: the name of the script itself. This happened because I named, in the command line, the PHP binary to execute the script, so arguments are calculated relative to the PHP binary, not to the script it’s executing. Meanwhile, as you saw above, it’s possible to execute a script directly by identifying the binary with the “SheBang” on UNIX, or by file association on Windows. Doing so will mean the script’s name no longer appears in the list of arguments. In other words, you need to be careful to check what the first argument contains and whether it matches the contents of $_SERVER['SCRIPT_NAME']. Of course, life would be too easy if $_SERVER['SCRIPT_NAME'] was always available, which it generally won’t be if you’re using the CGI binary. A more reliable mechanism is to compare the contents of the __FILE__ constant with the contents of $_SERVER['argv'][0]:

<?php
if ( realpath($_SERVER['argv'][0]) == __FILE__ ) {
    fwrite(STDOUT,'You executed "$ php whereami.php"');
} else {
    fwrite(STDOUT,'You executed "$ whereami.php"');
}
?>

Filename: whereami.php

If you’ve had any experience executing commands using a UNIX shell, you’ll have no doubt run into the use of command line options such as:

$ ls -lt

This lists the contents of a directory in “long” format, the files being ordered by modification time. Another example you may know is as follows:

$ mysql --user=harryf --password=secret

The above is used to log in to MySQL.

These are known as command line options, the first example having the “short option” syntax, while the second uses the “long option” syntax.

If I try passing arguments like this to my arguments.php script above:

$ php arguments.php --user=harry --password=secret

I get:

Got 3 arguments
0 => arguments.php
1 => --user=harry
2 => --password=secret

The $argv array is ignorant of the options syntax, restricting itself to breaking up arguments using the space character between them. It’s up to you to parse the contents of each argument to handle “options” like this within your script.
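To give a feel for what that parsing involves, here’s a deliberately simple sketch that handles only the long option form. The parseLongOptions() function is hypothetical, and it makes no attempt to validate option names or cope with short options. You’d pass it the arguments with the script name already removed, for example array_slice($argv, 1):

```php
<?php
// parseLongOptions() is a hypothetical, deliberately simple parser.
// It understands only the "--name=value" and "--name" forms and treats
// everything else as a plain argument.
function parseLongOptions($args) {
    $options = array();
    $arguments = array();

    foreach ($args as $arg) {
        if (strlen($arg) > 2 && substr($arg, 0, 2) == '--') {
            // Split "--user=harry" into "user" and "harry"
            $parts = explode('=', substr($arg, 2), 2);
            $name = $parts[0];
            // An option given without "=value" is recorded as true
            $value = isset($parts[1]) ? $parts[1] : true;
            $options[$name] = $value;
        } else {
            $arguments[] = $arg;
        }
    }

    return array($options, $arguments);
}
```

Even this small version has to make decisions (what a bare --name means, what to do with duplicates) that a real option parser needs to document and test.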

While, at first glance, it may not seem too hard to write a function for parsing options, why would you bother? PEAR::Console_Getopt provides a ready-rolled solution, and solves the issues with $argv vs. $_SERVER['argv'] you saw earlier.

Note the version of Console_Getopt I used here was 1.2. See Getting Started with PEAR if you’re wondering how to install it. Be warned that uninstalling Console_Getopt is a bad idea, as the package manager itself depends on it. If you need to upgrade your version, use:

$ pear upgrade Console_Getopt-stable

PEAR::Console_Getopt provides three public methods, each of which is called statically:

1. array Console_Getopt::readPHPArgv()

The array returned from readPHPArgv() is the same as the $argv array, but Console_Getopt offers better protection against different PHP versions. If it is unable to access the command line arguments (e.g. if they’re not available, such as with a CGI binary where register_argc_argv=Off), it returns a PEAR Error object.

2. array Console_Getopt::getOpt(array argv, [string short_opts], [array long_opts])

The getOpt() method takes the array of command line arguments and parses them into an array of options and arguments. It expects the first element of the argv to be the name of the script that’s being executed, and it’s meant for use when the following types of scripts are executed:

$ php some_script.php

The short_opts string lists the single-letter options that the user can provide. It allows two special markers in the string: ‘:’ specifies that the preceding letter must be followed by a “value”, while ‘::’ says that the preceding option may be followed by a value. It’s easiest to see this by example.

If $short_opts = 'lt', the following command lines are possible (no errors will be returned from getOpt()):

$ php some_script.php -lt
$ php some_script.php -tl
$ php some_script.php -l -t
$ php some_script.php -t -l
$ php some_script.php -l
$ php some_script.php -t -t -l -l
$ php some_script.php -t
$ php some_script.php

Setting $short_opts = 'lt:' means the ‘t’ option must be followed by at least one character (not a space), which will be its value. Look at this example:

$ php some_script.php -ltv

Here 't' = 'v'.

$ php some_script.php -tvalue -l

Now 't' = 'value'.

$ php some_script.php -tl

Here 't' = 'l' (l is not an option in this case).

Finally, setting $short_opts = 'lt::' means that ‘t’ can have an optional value (anything that follows it, up to the next space). Note that I can place further options after the colons in the short options string. For example, 'lt:o::B'.

This may seem a little arcane, but specifying the short options string in this way is fairly standard in many languages used on UNIX, from C to Python. Once you get used to it, this method provides a useful mechanism to get the most out of command line arguments with minimum effort.
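If it helps to see the rules spelled out, here’s a rough re-implementation of the short options logic as described above. This is not the Console_Getopt code: parseShortOptions() is a hypothetical function written purely for illustration, it ignores long options and plain arguments, and it returns false where the real package would return an error object:

```php
<?php
// parseShortOptions() is a hypothetical sketch, not the Console_Getopt code.
function parseShortOptions($args, $spec) {
    // Work out what each letter in the spec allows:
    // 'none', 'required' (letter followed by ':') or 'optional' ('::')
    $rules = array();
    $len = strlen($spec);
    for ($i = 0; $i < $len; $i++) {
        $letter = $spec[$i];
        if (substr($spec, $i + 1, 2) == '::') {
            $rules[$letter] = 'optional';
            $i += 2;
        } elseif (substr($spec, $i + 1, 1) == ':') {
            $rules[$letter] = 'required';
            $i += 1;
        } else {
            $rules[$letter] = 'none';
        }
    }

    $options = array();
    foreach ($args as $arg) {
        // Only "-x" style clusters are handled in this sketch
        if (strlen($arg) < 2 || $arg[0] != '-' || substr($arg, 0, 2) == '--') {
            continue;
        }
        $cluster = substr($arg, 1);
        for ($c = 0; $c < strlen($cluster); $c++) {
            $letter = $cluster[$c];
            if (!isset($rules[$letter])) {
                return false; // unknown option
            }
            if ($rules[$letter] == 'none') {
                $options[] = array($letter, null);
                continue;
            }
            // The rest of the cluster, if any, is this option's value
            $rest = (string) substr($cluster, $c + 1);
            if ($rest === '' && $rules[$letter] == 'required') {
                return false; // a value was required but not given
            }
            $options[] = array($letter, $rest === '' ? null : $rest);
            break;
        }
    }
    return $options;
}
```

With the spec 'lt:', the argument -ltv comes back as the pairs (l, no value) and (t, v), while -tl comes back as the single pair (t, l), matching the examples above.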

The long_opts are specified in a similar manner to the short_opts, but, instead of using a string, an array is used; also, the ‘:’ marker is replaced with an = sign to identify when values should be given. To allow the long options ‘--user=harryf --pass=secret’, I’d need an array like this:

$long_opts = array (
'user=',
'pass=',
);

In this case, both the 'user' and 'pass' options require a value. If I omit the equals sign, no value is allowed, while using ‘==’ allows the user to provide an optional value.

The array returned from getOpt() always has two top-level elements; the first contains the command line options, while the second contains the arguments. We’ll look at the structure in more detail in a moment.

The final method provided by Console_Getopt is:

3. array Console_Getopt::getOpt2(array argv, [string short_opts], [array long_opts])

This method is almost exactly the same as getOpt(), except that it expects the first element in the argv array to be a real argument, not the name of the script that’s being executed. In other words, you’d use it when executing a script such as:

$ some_script.php -tl

Here’s a simple example of PEAR::Console_Getopt in action:

<?php
// Include PEAR::Console_Getopt
require_once 'Console/Getopt.php';

// Define exit codes for errors
define('NO_ARGS',10);
define('INVALID_OPTION',11);

// Reading the incoming arguments - same as $argv
$args = Console_Getopt::readPHPArgv();

// Make sure we got them (for non CLI binaries)
if (PEAR::isError($args)) {
    fwrite(STDERR,$args->getMessage()."\n");
    exit(NO_ARGS);
}

// Short options
$short_opts = 'lt:';

// Long options
$long_opts = array(
    'user=',
    'pass=',
);

// Convert the arguments to options - check for the first argument
if ( realpath($_SERVER['argv'][0]) == __FILE__ ) {
    $options = Console_Getopt::getOpt($args,$short_opts,$long_opts);
} else {
    $options = Console_Getopt::getOpt2($args,$short_opts,$long_opts);
}

// Check the options are valid
if (PEAR::isError($options)) {
    fwrite(STDERR,$options->getMessage()."\n");
    exit(INVALID_OPTION);
}

print_r($options);
?>

Filename: getopts.php

If we invoke this script like so:

$ php getopts.php -ltr --user=harryf --pass=secret arg1 arg2

The output from print_r() looks like this:

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => l
                    [1] =>
                )

            [1] => Array
                (
                    [0] => t
                    [1] => r
                )

            [2] => Array
                (
                    [0] => --user
                    [1] => harryf
                )

            [3] => Array
                (
                    [0] => --pass
                    [1] => secret
                )

        )

    [1] => Array
        (
            [0] => arg1
            [1] => arg2
        )

)

The options array always contains two top-level elements, the first of which contains another array that represents each command line option. The second element contains an array of arguments. If you look back at how the script was invoked, you should be able to see the relationship.
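In practice, you’ll probably want to look options up by name instead of walking the nested array each time. Here’s a sketch of one way to do that. The optionsToMap() helper is hypothetical, and the sample array is the structure from the print_r() output written out by hand (I’ve assumed the empty value shown for ‘l’ is null):

```php
<?php
// optionsToMap() is a hypothetical helper that converts the first
// element of the array returned by getOpt() into an associative array.
function optionsToMap($parsed) {
    $map = array();
    foreach ($parsed[0] as $option) {
        // $option[0] is the option name, $option[1] its value (if any)
        $name = ltrim($option[0], '-');
        $map[$name] = $option[1];
    }
    return $map;
}

// The same structure as the print_r() output, written by hand
$parsed = array(
    array(
        array('l', null),
        array('t', 'r'),
        array('--user', 'harryf'),
        array('--pass', 'secret'),
    ),
    array('arg1', 'arg2'),
);

$map = optionsToMap($parsed);
// $map['user'] now holds 'harryf' and $map['t'] holds 'r'
```

Because an option without a value is stored as null here, use array_key_exists('l', $map) to test whether it was given; isset() would report false.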

You can see how PEAR::Console_Getopt might be used to parse the arguments and options used with the UNIX ‘ls’ utility when executed as follows:

$ ls -ltr --width=80 /home/harryf/scripts/

If you parsed the above command line with Console_Getopt, you would have an array of options such as:

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => l
                    [1] =>
                )

            [1] => Array
                (
                    [0] => t
                    [1] =>
                )

            [2] => Array
                (
                    [0] => r
                    [1] =>
                )

            [3] => Array
                (
                    [0] => --width
                    [1] => 80
                )
        )

    [1] => Array
        (
            [0] => /home/harryf/scripts/
        )

)

You’re now set to re-implement ‘ls’ in PHP, should you have endless hours on your hands.

As I mentioned earlier in this article, it’s possible to use the PHP CGI binary in more or less the same way as the CLI binary, but some additional “tweaking” is needed. Some of these changes are best made at runtime by including an additional PHP script. Others need to be made either in php.ini itself, or by passing additional arguments to the PHP binary.

Starting with the settings that you can change at runtime, here’s a script that fixes most of the issues for users working with the CGI binary, or CLI versions below 4.3.0:

<?php
/**
* Sets up CLI environment based on SAPI and PHP version
*/
if (version_compare(phpversion(), '4.3.0', '<') || php_sapi_name() == 'cgi') {
    // Handle output buffering
    @ob_end_flush();
    ob_implicit_flush(TRUE);

    // PHP ini settings
    set_time_limit(0);
    ini_set('track_errors', TRUE);
    ini_set('html_errors', FALSE);
    ini_set('magic_quotes_runtime', FALSE);

    // Define stream constants
    define('STDIN', fopen('php://stdin', 'r'));
    define('STDOUT', fopen('php://stdout', 'w'));
    define('STDERR', fopen('php://stderr', 'w'));

    // Close the streams on script termination
    register_shutdown_function(
        create_function('',
            'fclose(STDIN); fclose(STDOUT); fclose(STDERR); return true;')
    );
}
?>

Filename: cli_compatibility.php

Including this code in your scripts will resolve most of the problems, but some further issues remain.

The CGI executable normally sends HTTP headers when executed — even from the command line. Users are likely to see output similar to the following when they execute command line scripts using the CGI executable:

X-Powered-By: PHP/4.3.6
Content-type: text/html

To prevent this, PHP needs to be invoked with the ‘-q’ option for ‘quiet’. For example:

$ php -q some_script.php

Unfortunately, this places the burden on users. On UNIX, you can get round this issue by adding the option to the SheBang line of the script, and encouraging users to execute it directly:

#!/usr/local/bin/php -q
<?php
// Code starts here

Another issue with the CGI executable is that it automatically changes the current working directory to the one in which the script being executed resides. Imagine I use a simple script that contains the following:

<?php
print getcwd()."\n";
?>

Filename: getcwd.php

Now, I execute it like so:

$ pwd
/home/harryf
$ php ./scripts/getcwd.php

If I’m using the CLI binary, this will display “/home/harryf” — my current directory. But, if I use the CGI binary, I get “/home/harryf/scripts” — the scripts directory.

I’m not aware of a workaround that would solve this issue, so, if it’s critical for your application, your best bet is to force users to work with the CLI binary. Change the start of the compatibility script above to:

<?php
if ( php_sapi_name() == 'cgi' ) {
    die ('Unsupported SAPI - please use the CLI binary');
}
if ( version_compare(phpversion(), '4.3.0', '<') ) {

That finishes this first part of our tour of PHP’s command line interface. So far, we’ve got the basics covered. Next time, I’ll discuss the execution of external programs from PHP, we’ll explore some of the packages PEAR has to offer for sprucing up your output, and we’ll take a look at some of the (UNIX only) extensions PHP provides for the command line.
