Active Record Ramblings

Go to Project Site
Project Data
Language PHP
License GPLv2
Started on
Last touched on

Any fool can write code that a computer can understand. Good programmers write code that humans can understand. Martin Fowler

A Bit of History

The Active Record pattern (AR for short) was originally named such by Martin Fowler in his classic book Patterns of Enterprise Application Architecture. It’s a way to map rows in a database to objects that allows simple access to columnular data. Most ORMs are based around this pattern and the rise of development frameworks such as Ruby on Rails and Zend Framework has pushed it into common usage. Used properly, a good AR implementation can save programmers time and sanity.

Merits and Detractions of Using AR

In general, I find that AR usage improves my code by making it easier to read and understand; in my opinion, comprehension as a value in programming is often under valued. Being able to inject additional functionality (validation or debugging for instance) into all the descendant model classes is a powerful advantage as well. AR provides a clear separation of the database logic from the code that uses it and isolates and frees the developers from the monotony of repetitive SQL statements. It’s the power to be able to just use code like this to access your table full of data:

class Creature extends ActiveRecord {
  public function tableName() { return 'creatures'; }
}
$yak = new Creature();
$yak->food = 'yakalicous grass';
$yak->save();
Pros and Cons of Active Record
ProsCons
Readability of code. Complexity is hidden.
Reduction in code size and duplication. Lack of execution visibility.
Abstraction from database variant. Yet another abstraction layer.
Easy to add validation and debug. Speed tradeoffs can add up.
Caching is faster than database lookup. Memory consumption is larger.

When Shouldn't AR Be Used

There are good reasons to not use AR in every database situation. Though there are tuned AR solutions for the below, these are my common cases for not using AR:

  • When dealing with large rows and datasets.
  • Cases where database consistency and syncronization are primary.
  • High throughput cases where writes are common.
  • When learning database programming - all users should understand the consequences of their code.

A Quick Implementation and Tutorial

Let’s whip up a quick and dirty AR to see how this works. The sample files included here use SQLite as a simple backend but we can really use any SQL database with this pattern — we could adapt this code easily to NoSQL or memcache solutions as well but that’s beyond the scope of this document. Let’s setup a table to map our ActiveRecord to using the following SQL:

CREATE TABLE creatures (id INTEGER PRIMARY KEY AUTOINCREMENT, species VARCHAR, threat_level INTEGER, food VARCHAR );
INSERT INTO creatutes (id,species,threat_level,food) VALUES (1,'Moose',4,'woodland shrubs');
INSERT INTO creatutes (id,species,threat_level,food) VALUES (2,'Zombie',10,'braaainnns');
INSERT INTO creatutes (id,species,threat_level,food) VALUES (3,'Duck',1,'floaty bits');

A Simple AR in PHP

This is intended to be a stripped down Active Record implementation in PHP. This doesn’t have many optimizations, support debugging, partial fetches, schema checking on object creation, prepared statements or any other bells or whistles. See below for some ideas of where to go from here.

/** Base class for ActiveRecords. Extend this and fill in the
 * abstract methods for profit.
 *
 * @license GPLv2
 * @copyright Patrick Audley <paudley@blackcat.ca> 2010
 */
abstract class ActiveRecord {

	/** Our internal database handle.
	 *  @var PDO
	 */
	static private $_dbh = NULL;

	/** Set the PDO database handle for all ActiveRecords.
	 *  @param PDO A PDO database handle, created using "new PDO(...)".
	 *  @note Call this once per session.
	 */
	static public function setDb( PDO $dbh ) { self::$_dbh = $dbh; }

	/** Get the current table name.
	 *  @return string SQL table name this class.
	 */
	abstract public function tableName();

	/** Get the primary key name.
	 *  @return string the name of the primary key. */
	public function primaryKey() { return 'id'; }

        /** Primary key for this record, NULL if it's a new one.
	 *  @var integer (usually)
	 */
	private $_id = NULL;

	/** The cached data from the db.
	 *  @var array
	 */
	private $_data = NULL;

	/** Constructor
	 *  @param integer $id The primary key to load, or NULL if it's new.
	 */
	public function __construct( $id = NULL ) {
		// New record, don't attempt to load.
		if( $id == NULL ) return;
		$this->_load( $id );
	}

	/** Are we dirty?  (do we need writing to the database?)
	 *  @var booleon
	 */
	protected $_dirty = FALSE;

	/** General Mutator.
	 *  @param string $name  The column name to update.
	 *  @param mixed  $value The value to set it to.
	 *  @return ActiveRecord Returns self for fluent interface.
	 */
	final public function __set( $name, $value ) {
		if( $this->_id !== NULL and !array_key_exists( $name, $this->_data ) )
			throw new Exception( 'Column "'.$name.'" does not exists in table "'.$this->tableName().'" in set.' );
		$this->_data[$name] = $value;
		$this->_dirty = TRUE;
		return $this;
	}

	/** General Accessor.
	 *  @param sting $name The column name to get.
	 *  @return mixed The requested data.
	 */
	final public function __get( $name ) {
		if( $name == $this->primaryKey() )
			return $this->_id;
		if( $this->_id !== NULL and !array_key_exists( $name, $this->_data ) )
			throw new Exception( 'Column "'.$name.'" does not exists in table "'.$this->tableName().'" in get.' );
		return array_key_exists( $name, $this->_data ) ? $this->_data[$name] : NULL;
	}

	/** Load a row from the database.
	 *  @return void
	 */
	private function _load( $id ) {
		$this->_id = $id;
		// Issue the SELECT
		$rows = self::$_dbh->query( 'SELECT * FROM '.$this->tableName()
		        .' WHERE `'.$this->primaryKey().'` = \''.$id.'\'' );
		$row = $rows->fetch( PDO::FETCH_ASSOC );
		if( $row === FALSE )
			throw new Exception( 'Bad ID in ActiveRecord Fetch for '.get_called_class().':'.$id );
		$this->_data = $row;
		unset( $this->_data[ $this->primaryKey() ] );
	}

	/** Write this record to the underlying database.
	 *  @return boolean TRUE if successful.
	 */
	public function save() {
		// Skip writing clean records.
		if( !$this->_dirty ) return TRUE;
		// Handle UPDATEs
		$base = $this;
		if( $this->_id !== NULL ) {
			// This is a complex bit of PHP, I was feeling perl-ish, excuse me...
			$sql = 'UPDATE '.$this->tableName().' SET '
				. implode( ', ', array_map( function( $column ) use ( $base ) {
							return '`'.$column.'` = "'.$base->$column.'"';
						}, array_keys( $this->_data ) ) )
				. ' WHERE `'.$this->primaryKey().'` = "'.$this->_id.'"';
			return self::$_dbh->query( $sql ) !== FALSE;
		}
		// Handle INSERTs.
		$sql = 'INSERT INTO '.$this->tableName().' (`'.$this->primaryKey().'`,`'
			. implode( '`, `', array_keys( $this->_data ) ).'`) VALUES (NULL, "'
			. implode( '", "', array_values( $this->_data ) ).'")';
		$rc = self::$_dbh->query( $sql );
		if( $rc === FALSE ) return FALSE;
		$this->_id = self::$_dbh->lastInsertId();
		return TRUE;
	}

	/** Remove this record from the database.
	 *  @return void
	 */
	public function remove() {
		// Records that haven't been saved can't be removed.
		if( $this->_id == NULL )
			return;
		// Issue the DELETE SQL
		$sql = 'DELETE FROM '.$this->tableName()
			.' WHERE `'.$this->primaryKey().'` = "'.$this->_id.'"';
		$rc = ( self::$_dbh->query( $sql ) !== FALSE );
		if( $rc == TRUE )
			$this->_id = NULL;
		return $rc;
	}

}

Using in a Simple Test

<?php
error_reporting( E_ALL | E_STRICT );

// Clean up the old db.
!file_exists( './ar.db' ) ?: unlink( './ar.db' );

// Setup our database.
$dbh = new PDO('sqlite:./ar.db');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec( 'CREATE TABLE creatures ( id INTEGER PRIMARY KEY AUTOINCREMENT, species VARCHAR, threat_level INTEGER, food VARCHAR ); ');
$dbh->exec( "INSERT INTO creatures (id,species,threat_level,food) VALUES (1,'Moose',4,'woodland shrubs')" );
$dbh->exec( "INSERT INTO creatures (id,species,threat_level,food) VALUES (2,'Zombie',10,'braaainnns')" );
$dbh->exec( "INSERT INTO creatures (id,species,threat_level,food) VALUES (3,'Duck',1,'floaty bits')" );

// Load our Active Record Class.
require_once __DIR__.'/ActiveRecord.php';
ActiveRecord::setDb( $dbh );

// Create our Active Record for this Creatures table.
class Creature extends ActiveRecord {
      public function tableName() { return 'creatures'; }
}

echo "<pre>\n";
// Load and update a record.
$moose = new Creature( 1 );
$moose->food = 'pirates';
$moose->save();
$moosecheck = new Creature( 1 );
echo 'Moose Modification Test: '.( $moosecheck->food == 'pirates' ? 'PASS' : 'FAILED' )."\n";

// Create a new record.
$cat = new Creature( NULL );
$cat->threat_level = 12;
$cat->species = 'Cat';
$cat->food = 'mice';
$cat->save();
echo 'Cat Insertion Test: '.( $cat->id == 4 ? 'PASS' : 'FAILED' )."\n";

// Delete a record.
$zombie = new Creature( 2 );
$zombie->remove();
$no_zombie = FALSE;
try {      $zombieless = new Creature( 2 ); }
catch( Exception $e ) { $no_zombie = TRUE; }
echo 'Zombie Removal Test: '.( $no_zombie ? 'PASS' : 'FAILED' )."\n";

echo "</pre>\n";

This should output:

oose Modification Test: PASS
Cat Insertion Test: PASS
Zombie Removal Test: PASS

Next Steps

Expanding on the above, what are the obvious things to add? If you are the kind of person that likes to tour the inside of a cruise ship before building your Ultimate Row Boat™, have a look at something like Doctrine or Zend DB — I personally prefer Doctrine as my ORM of choice unless I’m using all the user Zend bits. Larger projects that I’ve worked on in the past have mainly used home grown AR implementations for speed and flexibility.

Debugging Injection

I’ve alluded to both caching and debugging above — both are trivial to add though caching takes a little thought to make sure that data consistency is guaranteed to your tolerances. Adding debugging can be done at many points; see the prepared statements example for how to inject query debugging. Let’s add a simple debug flag to facilitate the later examples:

/** Are we debugging?
 *  @var boolean
 *
 *  @note Turn this one with this code:  ActiveRecord::$debug = TRUE
 */
static public $debug = FALSE;

Preparent Statements

I generally recommend using prepared statements for DB interaction if possible — they are not a panacea of security but they are cheap and provide additional security against injection attacks and similar database foibles. In order to make this simpler, we can centralize the query calls into a single function in our AR class. This also makes it easier to swap out the database specific pieces later if we have to. Here is our prepared statement _query() method:

/** Centralize prepared database queries so we can inject debugging.
 *  @param string $sql    The SQL to excute.
 *  @param array  $params The parameters to bind to the above query.
 *  @return array of rows from DB.
 */
private function _query( $sql, array $params = array() ) {
	// Add some debugging since we have all the query bits available for display.
	if( self::$debug === TRUE )
		echo ' sql debug: '.$sql.'; '.preg_replace( '/\n/', '', var_export( $params, TRUE ) )."\n";
	$stmt = self::$_dbh->prepare( $sql );
	if( count( $params ) > 0 )
		foreach( $params as $column => $value )
			$stmt->bindValue( $column, $value );
	$stmt->execute();
	$rows = $stmt->fetchAll( PDO::FETCH_ASSOC );
	if( self::$debug === TRUE )
		echo '   returned: '.count( $rows )." rows\n";
	return $rows;
}

Now we make the modifications to the three places that execute queries to user our spiffy new _query() method.

/** Load a row from the database.
 *  @return void
 */
private function _load( $id ) {
	$this->_id = $id;
	// Issue the SELECT
	$rows = $this->_query( 'SELECT * FROM '.$this->tableName()
			.' WHERE `'.$this->primaryKey().'` = :id',
	array( ':id' => $id ) );
	if( count( $rows ) < 1 )
		throw new Exception( 'Bad ID in ActiveRecord Fetch for '.get_called_class().':'.$id );
	$this->_data = $rows[0];
	unset( $this->_data[ $this->primaryKey() ] );
}

Note the cunning last line about that removes our primary key from the data array; this lets up use implode and array_map with impunity on the data array later.

/** Write this record to the underlying database.
 *  @return boolean TRUE if successful.
 */
public function save() {
	// Skip writing clean records.
	if( !$this->_dirty ) return TRUE;

	// This is used to bind the local object in the anonymous functions below.
	$base = $this;
	$params = array();

	// Handle UPDATEs
	if( $this->_id !== NULL ) {
		// This is a complex bit of PHP, I was feeling perl-ish, excuse me...
		$sql = 'UPDATE '.$this->tableName().' SET '
			. implode( ', ', array_map( function( $column ) use ( $base, &amp;$params ) {
						$params[':'.$column] = $base->$column;
						return '`'.$column.'` = :'.$column;
					}, array_keys( $this->_data ) ) )
			. ' WHERE `'.$this->primaryKey().'` = "'.$this->_id.'"';
		$this->_query( $sql, $params );
		return TRUE;
	}

	// Handle INSERTs.
	$sql = 'INSERT INTO '.$this->tableName().' (`'
		. implode( '`, `', array_keys( $this->_data ) ).'`) VALUES (:'
		. implode( ',:', array_keys( $this->_data ) ).')';
	array_walk( $this->_data,
         	    function( $value, $column ) use ( $base, &amp;$params ) {
				$params[':'.$column] = $value;
		} );
	$this->_query( $sql, $params );
	$this->_id = self::$_dbh->lastInsertId();
	return TRUE;
}
/** Remove this record from the database.
 *  @return void
 */
public function remove() {
	// Records that haven't been saved can't be removed.
	if( $this->_id == NULL )
		return;
	// Issue the DELETE SQL
	$sql = 'DELETE FROM '.$this->tableName()
		.' WHERE `'.$this->primaryKey().'` = :id';
	$this->_query( $sql, array( ':id' => $this->_id ) );
	$this->_id = NULL;
}

At this point, if we turn on debug output using ActiveRecord::$debug = TRUE; we should get this (ugly but useful) output:

 sql debug: SELECT * FROM creatures WHERE `id` = :id; array (  ':id' => 1,)
   returned: 1 rows
 sql debug: UPDATE creatures SET `species` = :species, `threat_level` = :threat_level, `food` = :food WHERE `id` = "1"; array (  ':species' => 'Moose',  ':threat_level' => '4',  ':food' => 'pirates',)
   returned: 0 rows
 sql debug: SELECT * FROM creatures WHERE `id` = :id; array (  ':id' => 1,)
   returned: 1 rows
Moose Modification Test: PASS
 sql debug: INSERT INTO creatures (`id`,`threat_level`, `species`, `food`) VALUES (NULL,:threat_level,:species,:food); array (  ':threat_level' => 12,  ':species' => 'Cat',  ':food' => 'mice',)
   returned: 0 rows
Cat Insertion Test: PASS
 sql debug: SELECT * FROM creatures WHERE `id` = :id; array (  ':id' => 2,)
   returned: 1 rows
 sql debug: DELETE FROM creatures WHERE `id` = :id; array (  ':id' => 2,)
   returned: 0 rows
 sql debug: SELECT * FROM creatures WHERE `id` = :id; array (  ':id' => 2,)
   returned: 0 rows
Zombie Removal Test: PASS

Using a real debug library like FirePHP or a logfile is left as an exercise to the reader.

Query Factory Method

We need a simple way to query the database for IDs we don’t know. In the absence of full schema definitions and with the provisio that we are not validating the SQL, we can write a simple one:

/** Find the primary keys of matching objects.
 *  @param string $sql_clause The SQL WHERE clause to use.
 *  @note This method is not SQL injection safe, please wear extra safety pants when calling.
 */
static public function search( $sql_clause ) {
    $ar_class = get_called_class();
    $base = new $ar_class( NULL );
    $sql = 'SELECT *,`'.$base->primaryKey().'` FROM '.$base->tableName().' WHERE '.$sql_clause;
    $rows = $base->_query( $sql );
    $results = array();
    if( count( $rows ) > 0 )
        foreach( $rows as $row )
            $results[] = $row[$base->primaryKey()];
    return $results;
}

And here is the code to use it along with a simple example of the output:

// Search for something.
$dangerous = Creature::search( 'threat_level > 2' );
foreach( $dangerous as $dangerous_id ) {
         $danger = new Creature( $dangerous_id );
         echo $danger->species." is a dangerous species.\n";
}
Moose is a dangerous species.
Cat is a dangerous species.

Better Error Handling and Validation

Data validation is a complex topic that deserves it’s own book — there are oodles of them if you check Google. This is well beyond the scope of this article but are some tips to point you in the right direction:

  • Checking for a valid PDO connection before using it in save() and load().
  • Defining a full schema in the descendant classes and checking the data types on input.
  • Coupling the scheme with other input validation code inside your application; either in forms or perhaps even client side Javascript.

Caching Slow Data

There are a few obvious places to inject caching as well: at database load and save points. We could use something simple like APC or Memcache to do lookups against the primary key in _load() coupled with storing the data to the same cache on write in save(). Caching only makes sense to evaluate in a larger context so I’m not going to give examples here.

In Conclusion

The Active Record pattern should be part of any modern developer’s toolkit. It can speed development time, make code more readable, future proof against database changes, and offer caching and debugging shims. AR is not the perfect database hammer but it occupies a solid niche between full blown ORMs and straight SQL queries.