What is the fastest way to serialize and unserialize values in PHP?
In PHP one can serialize values in several ways: there are at least serialize, json_encode and var_export. Unserialization can be done with unserialize, json_decode and the rather ugly eval.
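To make the pairs concrete, here is a minimal round-trip with each of the three (note that json_decode needs its second argument set to true to give back an array rather than a stdClass object):

```php
<?php
$value = array('title' => 'Hello', 'tags' => array('php', 'performance'));

// serialize / unserialize: PHP's native text format
$string = serialize($value);
var_dump(unserialize($string) === $value); // bool(true)

// json_encode / json_decode: pass true to get arrays instead of stdClass objects
$json = json_encode($value);
var_dump(json_decode($json, true) === $value); // bool(true)

// var_export / eval: var_export with true returns valid PHP source code
$source = var_export($value, true);
var_dump(eval('return '.$source.';') === $value); // bool(true)
```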
While building the search functionality for this site I started wondering which of them is fastest, so I wrote this script to test it:
<?php
date_default_timezone_set('Europe/Helsinki');
ini_set('display_errors', 1);
error_reporting(E_ALL);
echo "synthetic:\n";
$data = generate_synthetic_data(0, 5);
$data = test_performance($data, 10);
show_results($data);
echo "realistic:\n";
$data = generate_realistic_data();
$data = test_performance($data, 10);
show_results($data);
// just prints out averages for each serialization function
function show_results($data) {
    echo '<pre>';
    foreach ($data as $function => $results) {
        $total = 0;
        echo $function."\n";
        foreach ($results as $iteration => $time) {
            //echo "\t".$time."\n";
            $total += $time;
        }
        $average = $total / count($results);
        echo "avg:\t".$average."\n\n";
    }
    echo '</pre>';
}
// helper function to generate a random string of the given length
function random_string($length) {
    $keys = array_merge(range(0, 9), range('a', 'z'));
    $key = '';
    for ($i = 0; $i < $length; $i++) {
        $key .= $keys[array_rand($keys)];
    }
    return $key;
}
// generates 1000 blog post like arrays
function generate_realistic_data() {
    $data = array();
    $post = array();
    for ($i = 0; $i < 1000; $i++) {
        $date = mt_rand(1262304000, 1325376000); // random timestamp between 2010-01-01 and 2012-01-01
        $title_len = mt_rand(1, 100);
        $content_len = mt_rand(100, 2000);
        $content_html_len = mt_rand($content_len - 50, $content_len + 500);
        $author_len = mt_rand(5, 20);
        $post['date'] = date("Y-m-d H:i:s", $date);
        $post['title'] = random_string($title_len);
        $post['content'] = random_string($content_len);
        $post['content_html'] = random_string($content_html_len);
        $post['author'] = random_string($author_len);
        $data[] = $post;
    }
    return $data;
}
// generates nested arrays
function generate_synthetic_data($depth, $max) {
    static $seed;
    if (is_null($seed)) {
        $seed = array('a', 2, 'c', 4, 'e', 6, 'g', 8, 'i', 10);
    }
    if ($depth < $max) {
        $node = array();
        foreach ($seed as $key) {
            $node[$key] = generate_synthetic_data($depth + 1, $max);
        }
        return $node;
    }
    return 'empty';
}
// runs tests with given data
function test_performance($data, $iterations) {
    $json_encode = array();
    $json_decode = array();
    $serialize = array();
    $unserialize = array();
    $var_export = array();
    $eval = array();
    $results = array();
    for ($i = 0; $i < $iterations; $i++) {
        // json_encode
        $json_encoded_data = array();
        $start = microtime(true);
        foreach ($data as $key => $value) {
            $json_encoded_data[] = json_encode($value);
        }
        $time_json = microtime(true) - $start;
        $json_encode[] = $time_json;
        // serialize
        $serialized_data = array();
        $start = microtime(true);
        foreach ($data as $key => $value) {
            $serialized_data[] = serialize($value);
        }
        $time_serialize = microtime(true) - $start;
        $serialize[] = $time_serialize;
        // var_export: the second argument makes it return the string
        // instead of echoing it; the output buffer is just belt and braces
        $exported_data = array();
        ob_start();
        $start = microtime(true);
        foreach ($data as $key => $value) {
            $exported_data[] = var_export($value, true);
        }
        $time_export = microtime(true) - $start;
        ob_end_clean();
        $var_export[] = $time_export;
        // json_decode
        $start = microtime(true);
        foreach ($json_encoded_data as $key => $value) {
            json_decode($value);
        }
        $time_json_decode = microtime(true) - $start;
        $json_decode[] = $time_json_decode;
        // unserialize
        $start = microtime(true);
        foreach ($serialized_data as $key => $value) {
            unserialize($value);
        }
        $time_unserialize = microtime(true) - $start;
        $unserialize[] = $time_unserialize;
        // eval
        $start = microtime(true);
        foreach ($exported_data as $key => $value) {
            eval('return '.$value.';');
        }
        $time_eval = microtime(true) - $start;
        $eval[] = $time_eval;
    }
    $results['json_encode'] = $json_encode;
    $results['serialize'] = $serialize;
    $results['var_export'] = $var_export;
    $results['json_decode'] = $json_decode;
    $results['unserialize'] = $unserialize;
    $results['eval'] = $eval;
    return $results;
}
Spage.fi runs on an Amazon EC2 micro instance, and the results there look like this:
synthetic:
json_encode
avg: 0.032392120361328
serialize
avg: 0.076668643951416
var_export
avg: 0.070131397247314
json_decode
avg: 0.11707892417908
unserialize
avg: 0.083262372016907
eval
avg: 0.16133031845093
realistic:
json_encode
avg: 0.034033703804016
serialize
avg: 0.0052637815475464
var_export
avg: 0.016560435295105
json_decode
avg: 0.040717649459839
unserialize
avg: 0.0040281295776367
eval
avg: 0.017697238922119
"Synthetic" results use deeply nested arrays where all values are short. "Realistic" results use arrays that mirror a typical blog post: no nesting, only a handful of keys, and values from 1 to roughly 2500 characters.
So based on this test, with shallow arrays serialize and unserialize are roughly five to ten times faster than json_encode and json_decode, and three to four times as fast as var_export and eval. With deep arrays json_encode is about twice as fast as serialize or var_export, while on the decoding side unserialize comes out ahead of json_decode, with eval clearly the slowest.
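One caveat about the decode numbers: the benchmark calls json_decode($value) without the second argument, so it builds stdClass objects rather than arrays, whereas unserialize and eval both reproduce the original arrays. If you need arrays back, pass true as the second argument, which may shift the timings somewhat:

```php
<?php
$json = json_encode(array('a' => 1, 'b' => 2));

$object = json_decode($json);       // stdClass: $object->a, $object->b
$array  = json_decode($json, true); // array('a' => 1, 'b' => 2)

var_dump($object->a);  // int(1)
var_dump($array['b']); // int(2)
```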