Serialize or encode data in PHP to be used in URLs

As you search the Internet, you’ll find many sources (like the BinaryTides blog) offering useful (or not so useful) tips on how to serialize or encode variables and data in PHP.

The problem is that not all of these methods are useful when we’re dealing with URLs.

The goal

We want to “convert” a variable or some data into a form that can be attached as part of a URL. The method must ensure that the URL itself won’t break (so the resulting string must not contain spaces or any other strange or unacceptable characters). Plus, it can’t be too easily readable to the end user (if a URL is easy to read, some people feel an urge to change it).

The following functions are poor choices for encoding strings or data to be used in URLs:

  • plain (see below) serialize and unserialize (long, readable string),
  • json_encode and json_decode (same as above),
  • wddx_serialize_value and wddx_deserialize (XML string),
  • convert_uuencode and convert_uudecode (string containing many unacceptable characters).

Each of the above breaks at least one of the two assumptions mentioned. This applies especially to serialize. Its output survives in a URL once it has been URL-encoded, but it stays far too easily readable (and makes the URLs far too long).
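To make the objection concrete, here is a minimal sketch, assuming a small hypothetical array as the payload. It prints how long the serialize and json_encode output is before and after rawurlencode, which has to escape the quotes, braces, colons and semicolons.

```php
<?php
// Hypothetical sample data; any small array would do.
$data = ['user' => 'anna', 'page' => 2];

$serialized = serialize($data);   // a:2:{s:4:"user";s:4:"anna";s:4:"page";i:2;}
$json       = json_encode($data); // {"user":"anna","page":2}

// Both strings stay human-readable, and rawurlencode() must escape
// the quotes, braces, colons and semicolons, which makes them longer.
$serializedInUrl = rawurlencode($serialized);
$jsonInUrl       = rawurlencode($json);

printf("serialize:   %d -> %d characters\n", strlen($serialized), strlen($serializedInUrl));
printf("json_encode: %d -> %d characters\n", strlen($json), strlen($jsonInUrl));
```

Even after escaping, the keys and values (user, anna, page) remain visible in the URL, which is exactly what we wanted to avoid.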

A slightly different approach, using var_export (which exports a variable in PHP syntax that can be eval’d to produce the same variable again) together with eval, is also wrong. Why? Let me cite, once again, a nine-year-old quote attributed to Rasmus Lerdorf, the creator of PHP: “If eval() is the answer, you’re almost certainly asking the wrong question”.

We can’t use functions based on MD5, SHA1 and similar algorithms, as these are one-way hash algorithms whose output can’t be decoded (unserialized).

What do we have left?

Serialize on steroids

We can use:

  • base64_encode to power up serialize and
  • base64_decode to power up unserialize

Just like that:

$serializedString = base64_encode(serialize($string));

$string = unserialize(base64_decode($serializedString));
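The same two calls can be wrapped into a pair of helpers. This is only a sketch: the function names, the sample array and the example.com URL are made up for illustration. Since the value comes back from the outside world, the decoder uses strict Base64 decoding and the allowed_classes option of unserialize (available since PHP 7.0), so that no objects are instantiated from it.

```php
<?php
// Hypothetical helper names, not part of PHP itself.
function encodeForUrl($value): string
{
    return base64_encode(serialize($value));
}

function decodeFromUrl(string $encoded)
{
    // Strict mode: returns false if the input contains non-Base64 characters.
    $raw = base64_decode($encoded, true);
    if ($raw === false) {
        return null;
    }
    // Never instantiate objects from user-supplied data.
    // Note that unserialize() itself returns false on malformed input.
    return unserialize($raw, ['allowed_classes' => false]);
}

// Hypothetical payload and URL, for illustration only.
$filters  = ['category' => 'books', 'sort' => 'price', 'page' => 3];
$token    = encodeForUrl($filters);
$url      = 'https://example.com/list.php?f=' . rawurlencode($token);
$restored = decodeFromUrl($token);
```

Passing the token through rawurlencode when building the URL keeps any +, / or = characters from being misread by the server.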

This is fairly good for URLs, as we’ll most likely be using it for short strings and/or simple objects. If you’re looking for a solution to encode or serialize 5 MB or more of data, then this approach is most certainly wrong: Base64 alone inflates the data by about a third, so the overhead would be huge and the speed painfully slow.

As Silver Moon of BinaryTides suggests, we can also use gzcompress and gzuncompress to further power up this process. But my short tests on an average URL showed that using base64_encode plus gzcompress produced a string only four characters shorter (129 instead of 133 characters), so this isn’t much of a saving. And since the resulting Base64 strings may contain the / character (as plain base64_encode output can, along with + and =), putting them into URLs unescaped may be a bad idea.

On a poorly (or too elaborately) configured web server, this could lead to URL malfunction. So, for URLs, use them with maximum care.
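The comparison is easy to repeat on your own data. Below is a minimal sketch, assuming the zlib extension is loaded and using a hypothetical payload. The numbers it prints depend entirely on the input, and very short inputs may even grow, because the compressed stream carries its own header and checksum.

```php
<?php
// Hypothetical payload; real savings depend entirely on the data.
$data = ['category' => 'books', 'sort' => 'price', 'page' => 3];

$plain      = base64_encode(serialize($data));
$compressed = base64_encode(gzcompress(serialize($data), 9));

printf("base64 only:         %d characters\n", strlen($plain));
printf("gzcompress + base64: %d characters\n", strlen($compressed));

// Decoding reverses the steps in the opposite order.
$restored = unserialize(
    gzuncompress(base64_decode($compressed)),
    ['allowed_classes' => false]
);
```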
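One common way around the problematic characters is the “URL-safe” Base64 variant, which swaps + and / for - and _ and drops the = padding. The sketch below assumes two hypothetical helper names. They are not built into PHP, but they rely only on the standard strtr, rtrim, str_pad and base64 functions.

```php
<?php
// Hypothetical helpers implementing the URL-safe Base64 alphabet.
function base64UrlEncode(string $binary): string
{
    // Replace the two unsafe characters and strip the trailing padding.
    return rtrim(strtr(base64_encode($binary), '+/', '-_'), '=');
}

function base64UrlDecode(string $text)
{
    $standard = strtr($text, '-_', '+/');
    // Restore the padding so the length is a multiple of four again.
    $padded = str_pad($standard, strlen($standard) + (4 - strlen($standard) % 4) % 4, '=');
    // Strict mode: returns false on characters outside the alphabet.
    return base64_decode($padded, true);
}

$token = base64UrlEncode(serialize(['page' => 3]));
$value = unserialize(base64UrlDecode($token), ['allowed_classes' => false]);
```

The resulting token consists only of letters, digits, - and _, so it can sit in a path segment or a query string without further escaping.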

An alternative

Mcrypt is a good alternative, especially for sensitive data that must stay secure. It produces fairly good results with acceptable performance. But again, it can’t be considered for large amounts of data, due to speed.

It is also much more complex, as it involves calling several functions instead of one. The documentation for the mcrypt_module_open function contains a good example of both encrypting and decrypting data.
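The Mcrypt extension was deprecated in PHP 7.1 and removed from the core in PHP 7.2, so on current versions the same idea has to be expressed differently. The sketch below swaps in the OpenSSL extension instead of Mcrypt. The cipher (AES-256-GCM), the helper names and the token layout are all assumptions made for illustration, not something taken from the sources above. Unlike plain Base64, which anyone can decode, the result is unreadable and tamper-evident without the key.

```php
<?php
// Hypothetical helpers; requires the OpenSSL extension and PHP 7.1+.
function encryptForUrl($value, string $key): string
{
    $iv  = random_bytes(12); // 96-bit nonce, the usual size for GCM
    $tag = '';               // filled in by openssl_encrypt()
    $cipherText = openssl_encrypt(
        serialize($value), 'aes-256-gcm', $key, OPENSSL_RAW_DATA, $iv, $tag
    );
    // Assumed token layout: nonce + tag + ciphertext, as URL-safe Base64.
    return rtrim(strtr(base64_encode($iv . $tag . $cipherText), '+/', '-_'), '=');
}

function decryptFromUrl(string $token, string $key)
{
    $b64 = strtr($token, '-_', '+/');
    $b64 = str_pad($b64, strlen($b64) + (4 - strlen($b64) % 4) % 4, '=');
    $raw = base64_decode($b64, true);
    if ($raw === false || strlen($raw) <= 28) {
        return null; // too short to hold nonce (12) + tag (16) + data
    }
    $plain = openssl_decrypt(
        substr($raw, 28), 'aes-256-gcm', $key, OPENSSL_RAW_DATA,
        substr($raw, 0, 12), substr($raw, 12, 16)
    );
    if ($plain === false) {
        return null; // wrong key or modified token
    }
    return unserialize($plain, ['allowed_classes' => false]);
}

// In a real application the 32-byte key would live in server-side config.
$key      = random_bytes(32);
$token    = encryptForUrl(['order' => 1042], $key);
$restored = decryptFromUrl($token, $key);
```

A token that has been altered, or decrypted with a different key, makes decryptFromUrl return null instead of data.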
