Serialize or encode data in PHP to be used in URLs
As you search the Internet, you’ll find many sources (like the BinaryTides blog) with many useful (or not so useful) tips on how to serialize or encode variables and data in PHP.
The problem is that not all of these methods are useful when we’re dealing with URLs.
The goal
We want to “convert” a variable or data into a form that can be attached as part of a URL. The method must ensure that the URL itself won’t break (so the resulting string must not contain spaces or any other strange or unacceptable characters). Plus, it can’t be too easily readable to the end user (if a URL is easy to read, some people feel an urge to change it).
The following functions are wrong for encoding strings or data to be used in URLs:
- pure (see below) serialize and unserialize (long, readable string),
- json_encode and json_decode (same as above),
- wddx_serialize_value and wddx_deserialize (XML string),
- convert_uuencode and convert_uudecode (string containing many unacceptable characters).
All of the above break at least one of the two assumptions mentioned. This goes especially for serialize. Once URL-encoded, its output can be attached to any URL without fear, but it stays too easily readable (and makes the overall URL fairly long).
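To illustrate the readability problem, here is a minimal sketch that prints what serialize and json_encode produce for a small array. The $data array is a made-up sample, not something taken from a real application.

```php
<?php
// Hypothetical sample data, standing in for whatever we want to pass in a URL.
$data = ['id' => 42, 'name' => 'john'];

$serialized = serialize($data);
$json = json_encode($data);

echo $serialized, PHP_EOL; // a:2:{s:2:"id";i:42;s:4:"name";s:4:"john";}
echo $json, PHP_EOL;       // {"id":42,"name":"john"}

// Both strings contain quotes and braces, so they need percent-encoding
// before going into a URL, which makes them longer still.
echo urlencode($serialized), PHP_EOL;
```

Anyone looking at such a URL can see the keys and values and is free to edit them.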
A slightly different approach, using var_export (which exports a variable in PHP syntax that can be eval’d to produce the same variable again) and eval, is also wrong. Why? Let me cite once again a nine-year-old quote, attributed to Rasmus Lerdorf, the creator of PHP: “If eval() is the answer, you’re almost certainly asking the wrong question”.
We can’t use functions based on MD5, SHA1 and similar algorithms, as these are one-way algorithms and their output can’t be decoded (unserialized).
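A quick way to see why a hash cannot be decoded: the digest has the same length no matter how much data goes in, so the original simply isn’t there to recover. The two inputs below are arbitrary examples.

```php
<?php
// Two arbitrary inputs of very different sizes.
$short = 'id=42';
$long = str_repeat('some much longer piece of data ', 100);

// md5() always returns 32 hex characters, sha1() always 40.
echo strlen(md5($short)), ' ', strlen(md5($long)), PHP_EOL;   // 32 32
echo strlen(sha1($short)), ' ', strlen(sha1($long)), PHP_EOL; // 40 40
```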
What do we have left?
Serialize on steroids
We can use:
- base64_encode to power up serialize,
- base64_decode to power up unserialize.
Just like that:
$serializedString = base64_encode(serialize($string));
$string = unserialize(base64_decode($serializedString));
This is fairly good for URLs, as we’ll most likely be using it for short strings and/or simple objects. If you’re looking for a solution to encode or serialize 5 MB or more of data, then this approach is most certainly wrong, as the data overhead would be huge (base64 alone adds about 33%) and the speed painfully slow.
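One caveat worth sketching: the standard base64 alphabet includes +, / and =, all of which have a special meaning in URLs. A common workaround is the “URL-safe” variant, which swaps + and / for - and _ and drops the padding. The helper names encodeForUrl and decodeFromUrl below are hypothetical, and the allowed_classes option assumes PHP 7 or newer.

```php
<?php
/**
 * Hypothetical helper: serialize a value and encode it using the
 * URL-safe base64 alphabet ("-" and "_" instead of "+" and "/").
 */
function encodeForUrl($value): string
{
    $base64 = base64_encode(serialize($value));

    // Swap the two problematic characters and strip the "=" padding.
    return rtrim(strtr($base64, '+/', '-_'), '=');
}

/**
 * Hypothetical helper: reverse encodeForUrl().
 * Returns false on failure (ambiguous if the encoded value was false itself).
 */
function decodeFromUrl(string $encoded)
{
    // Restore the standard alphabet and the padding removed during encoding.
    $base64 = strtr($encoded, '-_', '+/');
    $base64 = str_pad($base64, strlen($base64) + (4 - strlen($base64) % 4) % 4, '=');

    $serialized = base64_decode($base64, true);
    if ($serialized === false) {
        return false;
    }

    // The string comes from the user, so refuse to instantiate any objects.
    return @unserialize($serialized, ['allowed_classes' => false]);
}

// Usage with made-up data and a placeholder domain.
$state = ['page' => 3, 'filter' => 'a?b>c~~'];
$token = encodeForUrl($state);
echo 'https://example.com/list.php?state=' . $token, PHP_EOL;
var_dump(decodeFromUrl($token) === $state); // bool(true)
```

Keep in mind that this only hides the data from casual readers. Anyone who recognizes base64 can still decode and change it.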
As Silver Moon of BinaryTides suggests, we can also use gzcompress and gzuncompress to further power up this process. But my short tests on an average URL showed that adding gzcompress before base64_encode produced a string only four characters shorter (129 instead of 133 characters), so the saving isn’t much. And since gzcompress’ed strings, once base64-encoded, may contain the / character (as may any base64 output, along with + and =), using them in URLs may be a bad idea.
On a poorly (or too extensively) configured web server, this could lead to URL malfunction. So, for URLs, use them with maximum care.
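If you want to repeat this comparison on your own data, a minimal sketch could look like the one below. The $data array is a hypothetical payload and the zlib extension is assumed to be available. The actual numbers depend entirely on what you encode, so none are claimed here.

```php
<?php
// Hypothetical sample payload; real savings depend entirely on the data.
$data = ['category' => 'books', 'page' => 3, 'sort' => 'price', 'order' => 'asc'];

$plain = base64_encode(serialize($data));
$compressed = base64_encode(gzcompress(serialize($data), 9));

printf("base64 only:         %d characters\n", strlen($plain));
printf("gzcompress + base64: %d characters\n", strlen($compressed));

// Decoding applies the inverse functions in reverse order.
$restored = unserialize(
    gzuncompress(base64_decode($compressed)),
    ['allowed_classes' => false]
);
var_dump($restored === $data); // bool(true)
```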
An alternative
Mcrypt is a good alternative, especially for sensitive and secure data. It produces fairly good results with acceptable performance. But again, it can’t be considered for large amounts of data, due to speed.
It is also much more complex, as it involves calling several functions instead of one. The PHP manual page for the mcrypt_module_open function contains a good example of both encrypting and decrypting data.
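The mcrypt extension was deprecated in PHP 7.1 and removed from the core in PHP 7.2, so the sketch below swaps in the OpenSSL extension instead. It is not the mcrypt example from the manual, only an illustration of the same idea under stated assumptions: PHP 7.1 or newer, AES-256-GCM as the cipher, and a 32-byte secret key kept on the server. The names encryptForUrl and decryptFromUrl are hypothetical.

```php
<?php
// Hypothetical helper: encrypt a value into a URL-safe token (OpenSSL, not mcrypt).
function encryptForUrl($value, string $key): string
{
    $cipher = 'aes-256-gcm';
    $iv = random_bytes(openssl_cipher_iv_length($cipher));
    $tag = '';

    $ciphertext = openssl_encrypt(serialize($value), $cipher, $key, OPENSSL_RAW_DATA, $iv, $tag);
    if ($ciphertext === false) {
        throw new RuntimeException('Encryption failed');
    }

    // Pack IV, 16-byte authentication tag and ciphertext, then URL-safe base64.
    return rtrim(strtr(base64_encode($iv . $tag . $ciphertext), '+/', '-_'), '=');
}

// Hypothetical helper: returns false if the token is malformed or was tampered with.
function decryptFromUrl(string $token, string $key)
{
    $cipher = 'aes-256-gcm';
    $ivLength = openssl_cipher_iv_length($cipher);

    $base64 = strtr($token, '-_', '+/');
    $base64 = str_pad($base64, strlen($base64) + (4 - strlen($base64) % 4) % 4, '=');
    $raw = base64_decode($base64, true);
    if ($raw === false || strlen($raw) <= $ivLength + 16) {
        return false;
    }

    $iv = substr($raw, 0, $ivLength);
    $tag = substr($raw, $ivLength, 16);
    $ciphertext = substr($raw, $ivLength + 16);

    $serialized = openssl_decrypt($ciphertext, $cipher, $key, OPENSSL_RAW_DATA, $iv, $tag);
    if ($serialized === false) {
        return false; // authentication failed: wrong key or modified token
    }

    return unserialize($serialized, ['allowed_classes' => false]);
}

// Usage: in a real application the key would be stored, not generated per request.
$key = random_bytes(32);
$secret = ['user' => 7, 'role' => 'editor'];
$token = encryptForUrl($secret, $key);
var_dump(decryptFromUrl($token, $key) === $secret); // bool(true)
```

Unlike plain base64, a token built this way can’t be read or changed by the end user without the key, since any modification makes the authentication check fail.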