Monday, October 12, 2009

Pop Quiz

Here's a little test to separate the serious coders from the cut-and-paste script kiddies. Given the need to generate an arbitrarily long string consisting of random alpha-numeric characters, which solution is best?

Solution A:

function randomString($len) {
$chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" .
"abcdefghijklmnopqrstuvwxyz" .
"0123456789";
$rndMax = strlen($chars) - 1;
$str = "";
while ($len-- != 0) {
$str .= $chars[rand(0, $rndMax)];
}
return $str;
}
$str = randomString(8);
echo "$str\n";

Solution B:

class RandomSequenceIterator implements Iterator
{
protected $seqMembers;
protected $key;
protected $limit;

public function __construct() {
$this->setMembers(null)
->setLimit(0)
->rewind();
}

protected function setMembers($strValue) {
$this->seqMembers = $strValue;
return $this;
}

protected function getMembers() {
return $this->seqMembers;
}

protected function setLimit($intValue) {
$this->limit = $intValue;
return $this;
}

protected function getLimit() {
if (empty($this->limit)) {
return 0;
}
else {
return $this->limit;
}
}

public function current() {
$index = rand(0, strlen($this->getMembers()) - 1);
return substr($this->getMembers(), $index, 1);
}

public function valid() {
return $this->key() < $this->getLimit();
}

public function key() {
return $this->key;
}

public function next() {
$this->key++;
}

public function reset() {
$this->rewind();
}

public function rewind() {
$this->key = 0;
}
}

class RandomCharacterSequenceGenerator extends
RandomSequenceIterator
{
public function setChars($strValue) {
$this->setMembers($strValue);
return $this;
}

public function getChars() {
return $this->getMembers();
}

public function generate($limit) {
$strBuffer = "";
$this->setLimit($limit);
foreach ($this as $char) {
$strBuffer .= $char;
}
return $strBuffer;
}
}

abstract class RandomGeneratorBase
{
protected static $instance;
protected static $generator;

protected function __construct() {
self::$generator = new RandomCharacterSequenceGenerator();
}

// abstract public static function getInstance();
public static function getInstance() {
if (empty(self::$instance)) {
self::$instance = new self();
}
return self::$instance;
}

public function generate($limit) {
return self::$generator->generate($limit);
}
}

class RandomNumericStringGenerator extends RandomGeneratorBase
{
const VALID_MEMBERS = "0123456789";

protected function __construct() {
parent::__construct();
self::$generator->setChars(self::VALID_MEMBERS);
}

public static function getInstance() {
if (empty(self::$instance)) {
self::$instance = new self();
}
return self::$instance;
}
}

class RandomUpperCaseAlphaStringGenerator extends
RandomGeneratorBase
{
const VALID_MEMBERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

protected function __construct() {
parent::__construct();
self::$generator->setChars(self::VALID_MEMBERS);
}

public static function getInstance() {
if (empty(self::$instance)) {
self::$instance = new self();
}
return self::$instance;
}
}

class RandomLowerCaseAlphaStringGenerator extends
RandomGeneratorBase
{
const VALID_MEMBERS = "abcdefghijklmnopqrstuvwxyz";

protected function __construct() {
parent::__construct();
self::$generator->setChars(self::VALID_MEMBERS);
}

public static function getInstance() {
if (empty(self::$instance)) {
self::$instance = new self();
}
return self::$instance;
}
}

class RandomMixedCaseAlphaStringGenerator extends
RandomGeneratorBase
{
protected function __construct() {
parent::__construct();
self::$generator->setChars(
RandomUpperCaseAlphaStringGenerator::VALID_MEMBERS .
RandomLowerCaseAlphaStringGenerator::VALID_MEMBERS);
}

public static function getInstance() {
if (empty(self::$instance)) {
self::$instance = new self();
}
return self::$instance;
}
}

class RandomMixedCaseAlphaNumericStringGenerator extends
RandomMixedCaseAlphaStringGenerator
{
protected function __construct() {
parent::__construct();
self::$generator->setChars(
RandomNumericStringGenerator::VALID_MEMBERS .
self::$generator->getChars());
}

public static function getInstance() {
if (empty(self::$instance)) {
self::$instance = new self();
}
return self::$instance;
}
}

$str = RandomMixedCaseAlphaNumericStringGenerator
::getInstance()
->generate(8);
echo "$str\n";
If you managed to scroll down this far, congratulations! You are a true programmer who knows the correct answer is B.

Solution A is a good example of unorganized spaghetti-code. It "works" but it's amateurish. A uses only language-primitives-- none of the advanced, enterprise architecture concepts embodied in Solution B. It is certainly not flexible. (What if one changed the requirements to generate a random string of only digits, for example?)

Solution B is much more organized, encapsulated, and extensible. Object-oriented code models the way things are in the "real world" so it's easier to conceptualize what the code is doing. The problem is broken down into several smaller, more manageable ones, and the code uses interfaces, abstract classes, and inheritance to eliminate redundancy, facilitate better organization, and foster code-reuse. Understanding existing code is an important part of programming, and the liberal use of design patterns such as Singleton and Decorator embodies solutions to common problems, giving programmers a common vocabulary with higher levels of abstraction to communicate with one another.

Of course, solution B should only serve as an example and not be used as-is in a production environment. Exceptions should be used, but are omitted here for the sake of brevity. And the actual invocation to generate the random string could probably be further encapsulated and abstracted by implementing the Factory design pattern and maybe the Registry pattern-- one would query the registry for a value object to pass to the factory, and in turn the factory would determine which type of generator to return: random uppercase alpha, random numeric, mixed case, etc.

Wednesday, October 7, 2009

Setting Up SFTP From PHP

PHP's ftp_ssl_connect() function is for SSL-FTP, whereas what I need for a client's application is SFTP. Isn't life grand! Well, it's not really too much trouble... PHP can handle that too with functions from the ssh2 PECL extension. I'm just glad I caught it early on instead of at the 11th hour. I figured, though, I might as well continue my previous post about setting up this project with a brief description of installing ssh2 and testing it to ensure everything is in working order.

Installation

The ssh2 extension provides bindings to libssh2 which must be installed first on the system. My target platform is CentOS 5.3, so I was able to install libssh2 and libssh2-devel via yum (using the RPMForge repository).
yum install libssh2 libssh2-devel
The ssh2 extension is available from the pecl.php.net website. There is a minor bug in the current version (0.11.0) which prevents it from compiling against PHP 5.3 so I needed to apply this patch.
wget http://remi.fedorapeople.org/ssh2-php53.patch
tar zxf ssh2-0.11.0.tgz
cd ssh2-0.11.0
patch < ../ssh2-php53.patch
I compiled and installed the extension after the code was patched.
phpize
./configure --with-ssh2
make
cp modules/ssh2.so /usr/local/php/lib/php/extensions/
Adding extension=ssh2.so to php.ini and restarting Apache completed the installation. I was then ready to move on and test it to make sure it worked as it should.

Testing the Extension

Using the extension to upload and retrieve files programmatically over an SFTP connection is relatively simple. First, a secure shell connection is established with the SFTP server using ssh2_connect(), and then a login is authenticated with ssh2_auth_password(). Both functions return false if they fail:
$sess = @ssh2_connect(SFTP_SERVER, SFTP_PORT);
if ($sess === false) {
echo "Connection failed.";
exit(-1);
}
$result = @ssh2_auth_password($sess, SFTP_USERNAME,
SFTP_PASSWORD);
if ($result === false) {
echo "Unable to authenticate.";
exit(-1);
}
Once a session has been established, the ssh2_sftp() function is used to retrieve an SFTP resource.
$sftp = @ssh2_sftp($sess);
if ($sftp === false) {
echo "Unable to initialize SFTP subsystem.";
exit(-1);
}
From that point on, the SFTP resource is used with the ssh2.sftp:// stream wrapper to read and write files.
file_put_contents("ssh2.sftp://" . $sftp . "/tmp/test.txt",
$data);
In writing my short automated test case, I dumped some bytes from /dev/urandom to generate a test file, wrote the data to the SFTP server, read it back, and compared the results to make sure they matched after the round-trip.
// generate random data for test file
$data = file_get_contents("/dev/urandom", FILE_BINARY, null,
0, 512);
$data = substr(convert_uuencode($data), 0, 512);

// write data to file on SFTP server
file_put_contents("ssh2.sftp://" . $sftp . "/tmp/test.txt",
$data);

// retrieve file from SFTP server
$retrieved = file_get_contents("ssh2.sftp://" . $sftp .
"/tmp/test.txt");

// compare original data with retrieved data
echo ($data === $retrieved) ? "Success!" : "Corruption!";
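The write, read, and compare steps above could be folded into one reusable check. The following is only a sketch: roundTrip() is a hypothetical helper name, random_bytes() assumes PHP 7 or later, and a local temp file stands in for the ssh2.sftp:// URL so the snippet runs without a server.

```php
<?php
// Hypothetical helper: write $data to $url, read it back, and
// report whether the retrieved bytes match the original exactly.
function roundTrip($url, $data) {
    if (file_put_contents($url, $data) === false) {
        return false;
    }
    $retrieved = file_get_contents($url);
    return $retrieved === $data;
}

// A local temp file stands in for the SFTP URL in this sketch.
$target = tempnam(sys_get_temp_dir(), "rt_");

// 256 random bytes, hex-encoded into 512 printable characters.
$data = bin2hex(random_bytes(256));

echo roundTrip($target, $data) ? "Success!" : "Corruption!";
echo "\n";
unlink($target);
```

Against a real server, the same function would presumably be called with the "ssh2.sftp://" URL built from the SFTP resource as shown earlier.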

Saturday, September 26, 2009

Creating a CentOS-Based LAMP Virtual Image

In doing some preliminary research and planning for a client's new project, I determined his current in-house deployment platform would not be sufficient given his requirements. Specifically, the project calls for a moderate amount of URL re-writing and the ability to programmatically FTP files to a remote host. The client is running IIS on Windows Server 2008; I’m not too keen on ISAPI rewrite and IIS Rewrite seems to have fallen off the face of the Internet, and the ftp_ssl_connect() function is only available in PHP if both the ftp module and OpenSSL support are statically built-in so we would have to maintain a build environment for him, too. A LAMP-stack makes more sense. Apache can rewrite URLs with mod_rewrite and compiling PHP is a more supported practice on Linux than it is on Windows.

I discussed the obstacles and some possible solutions with the client and he's okay with LAMP. Instead of bringing in more hardware, though, I suggested taking advantage of virtualization. I assured him I could create a virtual platform that would provide us with everything we need, appear as a new machine on his network, and run directly on top of Windows Server 2008.

Installing CentOS

Originally I wanted to use the new Slackware64, but VMware-Tools proved too much of a struggle to install and I didn't feel comfortable using it for a client's project because of that. I eventually settled on CentOS 5.3 instead.

I fired up the trial version of VMware Workstation to configure a basic machine image... though I have VMware Workstation 6.5, I chose to set the virtual machine's hardware compatibility to Workstation 5, which is also compatible with ESX Server. I figured this would give us some flexibility if we need to move the image to bare-metal in the future. CentOS is built from RHEL sources, so I was able to set the Guest Operating System as Red Hat Enterprise Linux 5 and use any Red Hat-specific documentation VMware has.

I tried to keep the installation small, so I unchecked everything in Anaconda-- including the Base packages. I still got packages that I feel are unnecessary dependencies (requiring wireless-tools on a server installation, for example. Seriously, Red Hat!), but I guess I can live with it and it won't matter much to the client.

Once CentOS was installed and booted and I was logged in, I needed to install some packages (and their dependencies) with yum that I didn't install during the installation:
  • autoconf

  • curl-devel

  • freetype-devel

  • gcc

  • gcc-c++

  • libjpeg-devel

  • libpng-devel

  • libxml2-devel

  • lynx

  • make

  • ncurses-devel

  • ntp

  • openssl-devel

  • patch

  • perl

  • sendmail

  • wget

  • which

  • zlib-devel
Notice I didn't install Apache, MySQL, or PHP. That's because I like to compile and install the major software from source. This way I can make sure they're up to date and configure their builds exactly how I need them.

Configuring Mapped Directories

I wanted to keep the application's data separate from the virtual image so I wouldn't be constrained by the size of the image (trying to explain to the client why he couldn't save more than a gig of data when it was running on a physical server with 100 gigs of free drive space wouldn't be fun). The next task was to create shared data directories on the host and install VMware-Tools so I could map them. I created a directory shared as apache to hold the bulk of the application's code (.php, .html, etc.), and another shared as mysql to hold the database's tables.

The VMware documentation describes the VMware-Tools installation process in detail, but it's no more difficult than selecting "VM" -> "Install VMware tools..." in VMware Workstation, and then proceeding to install the VMware-Tools RPM in CentOS.
mount /dev/cdrom /media
rpm -Uvh /media/VMwareTools-7.8.5-156735.i386.rpm
umount /media
vmware-config-tools.pl
VMware adds the following to /etc/fstab:
# Beginning of the block added by the VMware software
.host:/ /mnt/hgfs vmhgfs defaults,ttl=5 0 0
# End of the block added by the VMware software
That entry makes the shared folders from the host operating system accessible as /mnt/hgfs/apache and /mnt/hgfs/mysql. Everything within them is owned by root with global read, write, and execute permissions. There's not much that can be done about the lax permissions, but I could at least have the files owned by a more appropriate user than root. I also wanted to have them each mounted under /srv instead of /mnt/hgfs to be a little more LSB compliant (suck it, /var/www!), so I replaced the VMware entry with my own:
.host:/apache /srv/apache vmhgfs defaults,ttl=5,uid=99,gid=99 0 0
.host:/mysql /srv/mysql vmhgfs defaults,ttl=5,uid=27,gid=27 0 0
It would be nice if a future version of VMware had a more flexible HGFS driver-- but this will be sufficient for the task at hand. At last I could install Apache, MySQL, and PHP.

Compiling

There isn't anything too exciting about installing Apache, MySQL, and PHP from source to talk about, so I'll just share with you my configure options.
MySQL Enterprise 5.0.88sp2
./configure \
--prefix=/usr/local/mysql \
--localstatedir=/srv/mysql \
--with-unix-socket-path=/tmp/mysql.sock \
--with-mysqld-user=mysql \
--without-debug \
--with-archive-storage-engine \
--with-csv-storage-engine \
--with-federated-storage-engine \
--disable-maintainer-mode \
--enable-assembler \
--enable-largefile \
--enable-local-infile \
--enable-thread-safe-client
Apache 2.2.13
CFLAGS=-O3 ./configure \
--prefix=/usr/local/apache \
--with-pcre \
--disable-status \
--enable-mods-shared=all \
--enable-so \
--enable-ssl \
--enable-setenvif \
--enable-rewrite
PHP 5.3.0
CFLAGS=-O3 ./configure \
--prefix=/usr/local/php \
--with-apxs2=/usr/local/apache/bin/apxs \
--with-mysql=/usr/local/mysql \
--with-pdo-mysql=/usr/local/mysql \
--with-mysqli=/usr/local/mysql/bin/mysql_config \
--with-gd \
--with-jpeg-dir=/usr/lib \
--with-freetype-dir \
--with-curl \
--with-openssl \
--enable-ftp \
--with-openssl-dir
After that I needed to open CentOS's firewall to allow HTTPS traffic using system-config-securitylevel-tui, and change the security context of the libphp5.so module for Apache because SELinux is enabled.

Final Housekeeping

There were only a few minor housekeeping things to attend to after I had everything installed. I had to add a couple of kernel parameters and configure ntp according to VMware's Time Keeping Best Practices for Linux so the time didn't drift. It was also important that I configure logrotate to rotate Apache's and MySQL's log files, as I did not install them via RPM. Otherwise they could grow unwieldy and use up all the space I had allocated for the virtual image.

So in short order I had not only a sane platform for deployment, but one I could easily clone and use for development as well. The client only needs the free VMware Player software to use the image. The data directories are on the host operating system alongside the image so they are not constrained by the size of the image and can be backed-up independently of the image. When necessary, upgrading the virtual platform can be done independently of the data.

Update 10/04/2009: It appears the above procedure didn't install a cron daemon, though it did install crontab files-- now isn't that interesting!
rpm -qa | grep cron
crontabs-1.10-8
yum install vixie-cron resolved the issue. Don't forget to issue chkconfig crond on so it starts automatically, and /etc/init.d/crond start to start cron for the current session (so you don't have to reboot).

Tuesday, July 28, 2009

New Feature at Paste Ninja

Some of you may have noticed last week a new feature was rolled out on Paste Ninja, the premier PHP-powered paste bin-- patches! Here's how it works:
  1. Update an existing paste

  2. Click the Compare link that appears in the updated paste's header

  3. In the Compare dialog that opens, select your desired current and target revisions; a colorized unified diff is displayed so you can verify the changes

  4. Click the dialog's Download button

It’s that simple!

Tuesday, July 21, 2009

Seamless Error Highlighting

A lot of output can be generated when you compile large projects. When the build breaks, it can be difficult to identify the particular spot in the build process where things went wrong. Highlighting the error messages can help them stand out from the rest of the output.

ANSI escape sequences can be used to modify how your terminal window displays its text. For example, outputting the sequence \033[41;37mHello World\033[0m would result in "Hello World" displayed in white text against a red background (41 selects the red background, 37 the white foreground, and 0 resets the attributes). Escape sequences begin with an escape character (ASCII 27, octal 033) and a left bracket. The control values are then given (multiple values are semicolon-separated) and the entire sequence closes with m.
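To make the structure of a sequence concrete, here is a minimal sketch in PHP that assembles one from its parts. The function name ansiWrap() is made up for illustration; the codes 41, 37, and 0 are the ones from the example above.

```php
<?php
// Hypothetical helper: wrap $text in an ANSI escape sequence built
// from the given control values, then reset the attributes.
function ansiWrap($text, array $codes) {
    $esc = "\033"; // escape character (ASCII 27, octal 033)
    return $esc . "[" . implode(";", $codes) . "m" . $text . $esc . "[0m";
}

// White text (37) on a red background (41).
echo ansiWrap("Hello World", array(41, 37)), "\n";
```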

You can highlight certain messages by routing the STDOUT and STDERR streams to sed and performing a replacement.
s,(.*error.*|.*fail.*|.*undef.*),\033[41;37m\1\033[0m,gi
The values you match are of course entirely up to your discretion.
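For comparison, the same replacement can be sketched with PHP's preg_replace(), which avoids the shell quoting issues entirely. This is only an illustration: highlightErrors() is a hypothetical name, and the pattern is a line-oriented rewrite of the sed expression rather than a literal copy of it.

```php
<?php
// Hypothetical helper: wrap every line that mentions "error",
// "fail", or "undef" (case-insensitive) in white-on-red.
function highlightErrors($text) {
    $start = "\033[41;37m";
    $reset = "\033[0m";
    return preg_replace(
        '/^.*(?:error|fail|undef).*$/mi', // m: per-line anchors, i: ignore case
        $start . '$0' . $reset,           // $0 is the whole matched line
        $text
    );
}

echo highlightErrors("gcc -c foo.c\nfoo.c:3: error: 'x' undeclared\n");
```

Only the second line of the sample input is wrapped; the first passes through unchanged.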

The tricky part is quoting and escaping the expression correctly so various metacharacters aren't intercepted by the shell. And some implementations of sed won't correctly convert \033 to an escape character, so you may need to enter it directly by typing CTRL+V, CTRL+[. Depending on your terminal, an actual escape character may be displayed as ^[ or a special glyph like ESC when you enter it.
make install 2>&1 | sed -e \
's,\(.*error.*\|.*fail.*\|.*undef.*\),ESC[41;37m\1ESC[0m,gi'
If you find yourself using such highlighting often, you may want to define a function to save yourself some typing. For example, with bash you can add something like this to your ~/.bashrc file:
function make() {
/usr/bin/make "$@" 2>&1 | sed -e \
's,\(.*error.*\|.*fail.*\|.*undef.*\),ESC[41;37m\1ESC[0m,gi';
}
You can type make install at the prompt like you normally would. bash will call the new make() function, which in turn calls the actual make utility with any arguments (such as install) and colorizes the output. Error highlighting is now seamless and automatic!

Tuesday, July 14, 2009

OOP in JS

When people would ask me about Object Oriented Programming in JavaScript, I would always send them to two pages written by Gavin Kistner. The other day when I went to visit them myself to quickly check something, they were gone! What happened, Gavin?!

I promptly dug the pages up from a search engine's cache to mirror them, but all the original copyrights still belong to Gavin.

Sunday, June 28, 2009

Currying in PHP

What happens if you don't have all the arguments handy for a function, but you want to give whatever arguments you do have now and then provide the rest of them to the function later? This is called currying, and is a core concept in functional programming. It's messy, but possible to curry functions in PHP now that closures have been added.

First, let me show you how currying looks in a functional language. Here's a basic example in OCaml/F#:
let do_math op x y =
match op with
'+' -> x + y
| '-' -> x - y
| _ -> failwith "Invalid op"

let add = do_math '+'

let inc = add 1
let dec = add (-1)
;;
A function named do_math is defined that accepts an operator and two operands. The function's return value will be either the sum or difference of the operands, depending on whether the given operator is + or -. Notice how do_math is then called with a single argument. OCaml doesn't raise an error; it simply returns a function that "remembers" the first argument and accepts the remaining two arguments later (this is an over-simplified and slightly inaccurate statement, but a good enough description for our purpose here). This intermediate function can be used elsewhere, as in the bindings for inc and dec.

Now here's a version of the do_math() function in PHP:
function do_math($op, $x, $y) {
switch ($op) {
case '+':
return $x + $y;

case '-':
return $x - $y;

default:
throw new Exception("Invalid op");
}
}
Unfortunately, PHP will throw warnings if you call do_math() without the three arguments it expects.

Warning: Missing argument 2 for do_math(), called in /home/tboronczyk/curry.php on line 16 and defined in /home/tboronczyk/curry.php on line 2

Warning: Missing argument 3 for do_math(), called in /home/tboronczyk/curry.php on line 16 and defined in /home/tboronczyk/curry.php on line 2


Whereas functional languages have currying "built-in," you must explicitly code this ability in an imperative language. Doing so in PHP requires the use of closures:
function do_math($op) {
return function ($x) use ($op) {
return function ($y) use ($op, $x) {
switch ($op) {
case "+":
return $x + $y;

case "-":
return $x - $y;

default:
throw new Exception("Invalid op");
}
};
};
}
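A short usage sketch may help show how this closure-based version is called. The function is repeated here under the hypothetical name do_math_curried() so the snippet stands alone; each call supplies exactly one argument and returns the next function until the last argument arrives.

```php
<?php
// Same nested-closure structure as above, renamed for this sketch.
function do_math_curried($op) {
    return function ($x) use ($op) {
        return function ($y) use ($op, $x) {
            switch ($op) {
                case "+":
                    return $x + $y;

                case "-":
                    return $x - $y;

                default:
                    throw new Exception("Invalid op");
            }
        };
    };
}

$add = do_math_curried("+"); // remembers the operator
$inc = $add(1);              // remembers the first operand
$dec = $add(-1);

echo $inc(2), "\n";
echo $dec(6), "\n";
```

Intermediate variables are used because PHP 5.3 cannot chain the calls directly; newer versions accept do_math_curried("+")(1)(2).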
It's also possible to extend the function using func_num_args() and func_get_arg() functions, anonymous functions, and closures, so that any number of parameters can be given at a time.
function do_math() {
if (func_num_args() >= 1) $op = func_get_arg(0);
if (func_num_args() >= 2) $x = func_get_arg(1);
if (func_num_args() == 3) $y = func_get_arg(2);

switch (func_num_args()) {
case 1:
return function () use ($op) {
if (func_num_args() >= 1) $x = func_get_arg(0);
if (func_num_args() == 2) $y = func_get_arg(1);

switch (func_num_args()) {
case 1:
return function ($y) use ($op, $x) {
return do_math($op, $x, $y);
};

case 2:
return do_math($op, $x, $y);

default:
trigger_error(
"invalid argument count",
E_USER_WARNING);
}
};

case 2:
return function ($y) use ($op, $x) {
return do_math($op, $x, $y);
};

case 3:
switch ($op) {
case "+":
return $x + $y;

case "-":
return $x - $y;

default:
throw new Exception("Invalid op");
}

default:
trigger_error("invalid argument count",
E_USER_WARNING);
}
}
It's messy... but it works! Now you are able to pass one or two arguments to do_math(), capture the intermediate function that's returned, and pass the remaining argument(s) later.
$add = do_math("+");

$inc = $add(1);
$dec = $add(-1);

echo do_math("-", 3, 2);
echo do_math("+", 1, 1);
echo $inc(2);
echo $add(2, 2);
echo $dec(6);
echo $add($inc(4), $dec(2));
The switch statements are rather unmanageable and the spaghettification of code grows exponentially with the addition of each argument. The pattern is straightforward, though. You may want to consider writing a code generator to handle the dirty work of retrofitting a function into one capable of being curried rather than writing them all by hand. Of course, if you know of a better way to curry functions in PHP then let me know by leaving a comment!
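One possible alternative to hand-written switch statements is a generic wrapper that collects arguments until the target function has enough of them. The sketch below is an assumption-laden illustration, not a tested library: curry() is a hypothetical helper that uses ReflectionFunction to count the declared parameters, and it wraps the plain three-argument do_math() shown earlier.

```php
<?php
// Hypothetical generic helper: keeps returning closures until enough
// arguments have accumulated, then calls the wrapped function.
function curry($fn, array $bound = array()) {
    $ref = new ReflectionFunction($fn);
    $needed = $ref->getNumberOfParameters();

    return function () use ($fn, $bound, $needed) {
        $args = array_merge($bound, func_get_args());
        if (count($args) >= $needed) {
            return call_user_func_array($fn, $args);
        }
        return curry($fn, $args);
    };
}

// The plain, uncurried version from earlier in the post.
function do_math($op, $x, $y) {
    switch ($op) {
        case '+':
            return $x + $y;

        case '-':
            return $x - $y;

        default:
            throw new Exception("Invalid op");
    }
}

$math = curry('do_math');
$add = $math('+');
$inc = $add(1);

echo $inc(2), "\n";
echo $add(2, 2), "\n";
```

The trade-off is a reflection lookup on every partial call, and functions with optional parameters would need extra handling that this sketch does not attempt.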

Update 06/29/09: Someone asked me what the "real-world use" for all this would be. Currying is used all the time in functional programming, but the hassle of explicitly enabling the behavior in PHP makes that a valid question. My motivation was just to see if it were possible and share my results. Indeed it is. Functions can be curried in any language that supports closures. But for those who want something a little more concrete, let's consider callback functions.

In a previous post I gave the following example to illustrate the use of closures:
$userPercent = 0.5;
$userList = array_filter($percentVowels,
function($percent) use ($userPercent) {
return ($percent >= $userPercent);
});
It showed an anonymous function being used with array_filter() to filter an array. The array is filtered based on a dynamic value, and a closure is used to "inject" the threshold rather than using a global statement. The same could also be accomplished with currying.

The problem is array_filter() expects a callback function that accepts one argument--the current array element. Currying will allow us to prepare the function with the filtering threshold, and the intermediate function can be used as the callback.
function callback($userPercent) {
return function($percent) use ($userPercent) {
return ($percent >= $userPercent);
};
}
$userList = array_filter($percentVowels, callback(0.5));