PHP generate a UUID with ease in 2024

How to make PHP generate UUIDs tutorial

To make PHP generate a UUID, the most basic way would be the native „uniqid“ function. Keep in mind, however, that its output isn’t a UUID in the sense of the known standard (RFC 4122, versions 1-5). If you want to generate real UUIDs, you should use a well-known and tested library like ramsey/uuid.
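
To make that difference visible, here is a minimal sketch that compares the output of „uniqid“ with the usual 8-4-4-4-12 UUID layout. The variable names and the regular expression are my own choices for illustration.

```php
<?php
// uniqid() derives its value from the current time in microseconds.
$id = uniqid();

// Canonical UUID layout: 8-4-4-4-12 hexadecimal characters.
$uuidPattern = '/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/iD';

echo $id, PHP_EOL;                              // 13 hexadecimal characters
echo strlen($id), PHP_EOL;                      // 13
var_dump(preg_match($uuidPattern, $id) === 1);  // bool(false)
```

So the value is handy as a quick, time-based ID, but it neither looks like a UUID nor follows its rules.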

You want to make PHP generate UUIDs for you!?

I welcome you to today’s post on „How to make PHP generate a UUID for you“. In this post I will dive into universally unique identifiers. As the name suggests, these „special“ identifiers are for uniquely identifying „things“.

If you would like to skip the „blabla“, jump to the section on generating a UUID manually. If you would like a short library-based example, jump to the ramsey/uuid section. I would encourage you, though, to study the different versions of UUIDs.

There are different versions of UUIDs, commonly numbered from 1 to 5. Many projects nowadays need a new way of generating unique identifiers for their entities. It doesn’t matter whether they’re stored in a good old MySQL database or in one of the document-based alternatives.

Using a different approach than those normal, incremental „1, 2, ..“ IDs has become more and more popular. This doesn’t only apply to the developers’ side; customers and other involved parties actually want something else, too. So let’s get into the background story and later into how we can actually generate one.

The background story – PHP generate UUIDs

Before we actually start thinking about code, we could also start from a little different angle. So imagine having a graphical user interface, like most of the time, with some kind of list. You got that usual screen with something like a table, which displays some data about the listed entities.

So far this isn’t really exciting, I know, but what comes to mind when you think about the first column? The first column of that table will usually display the unique identifier of the entity. Basically, this will be an incrementing number which starts at 1 and grows linearly.

Sure, there are some cases where people define another starting point, like 5000, but here we are, already facing one of the common problems of „normal“ numbers.

The thing with guessability

No matter if typical numbers start at 300 or at 300000, you can always more or less guess what the next one will be. When you have one ID, you can increment it yourself and test „Hmm, what’s the next one about!?“. This is one point where some customers and developers argue for doing it differently.

The main reason for the customer is mostly to hide actual counts from the end user. End users could guess from order number „5“ that this shop has maybe just started. This could lead them to read more, possibly negative, things into it.

Thoughts like „Hmm, this shop did just start, maybe it’s a scam..“ could arise.

Being able to generate offline

Usually the approach of working with the database or entities in general, is like the following. You have that basic overview screen, where you click this sweet „+“-button and voila. After that, you will have your detail screen displayed, which is ready to get filled out.

You will most likely fill out the form in front of you and then press something like „confirm“ or „save“. This will call some kind of API in the background, which then creates the new entity for you. During that process, the database is contacted as the central unit responsible for generating the new ID.

Now think about what happens when you can’t contact the database because you’re offline. Well, I think you are correct: usually nothing will happen. At this point, UUIDs come in handy, as they can be generated offline and still be unique!

Uniqueness across systems & replications

Imagine you have that one online shop, where you already have thousands of orders, bills and more data. Now imagine building on that shop by exporting the data and importing it into another system. Well, if you’re using normal, incremental numbers, you could easily run into ID clashes.

With UUIDs, on the other hand, this most likely won’t be the case (mind the different versions here). The word „universally“ in the full name „universally unique identifier“ makes this clearer. As mentioned, I won’t go too deep into this aspect of UUIDs; maybe I’ll do so in a later post!

You can just keep in mind that UUIDs make clashes extremely unlikely, even when generated on different systems.
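
Here is a small sketch of such a clash. The two shops, their orders and the UUID values are all made up for this illustration; the point is only what happens to the keys when both data sets are merged.

```php
<?php
// Hypothetical order exports from two independent shops, keyed by incremental ID.
$shopA = [1 => 'Shop A, first order', 2 => 'Shop A, second order'];
$shopB = [1 => 'Shop B, first order', 2 => 'Shop B, second order'];

// The "+" operator keeps the left-hand entry for duplicate keys,
// so both orders of shop B are silently dropped.
$mergedIncremental = $shopA + $shopB;
echo count($mergedIncremental), PHP_EOL; // 2 instead of 4

// The same orders keyed by (made-up) version 4 UUIDs.
$shopAUuid = [
    'cc4f998b-29d8-4163-8bbe-2a44a1ccb2ac' => 'Shop A, first order',
    '0b6f1a52-7c3e-4d0a-9e21-5f8a3c7d1e90' => 'Shop A, second order',
];
$shopBUuid = [
    '3e2d9c41-8a7b-4f65-b0c3-1d2e4f6a8b70' => 'Shop B, first order',
    'a4c8e0f2-1b3d-4a59-8c7e-6f5d4b3a2c10' => 'Shop B, second order',
];

$mergedUuid = $shopAUuid + $shopBUuid;
echo count($mergedUuid), PHP_EOL; // 4, nothing got lost
```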

What does the trend say?

As I mentioned above, the typical UUID spectrum consists of versions 1 to 5. They all have different aspects and considerations you have to be aware of. I would therefore encourage you to do some research before actually using them, not to mention generating them yourself.

Here’s a graphic which gives you a small overview of which UUID versions are relevant / trending. This trend applies in general, not only to PHP. By the way, the graphic is from today’s date (29.01.2024) and considers data from the last 12 months:

PHP generate UUID – Google Trends for versions 1-5

Looking at the image, you can see that the most searched version is UUID4, followed by versions 1 and 5.

The different versions of UUIDs

UUIDs (Universally Unique Identifiers) can be generated in different versions, each having a specific purpose. Of course, you need to be sure, that the underlying library (or even yourself) is correctly implementing these aspects. Here are the commonly recognized versions of UUIDs:

UUID1 (Time-based)

This version is generated from the current time and the MAC address of the computer generating the UUID. That combination makes it unique for a particular machine and point in time.

UUID2 (DCE Security)

It’s pretty similar to UUID version 1, but it adds support for DCE security. In my experience it’s rarely used, which the data shown above supports as well.

UUID3 (Name-based, MD5 hash)

This UUID version is generated from a namespace identifier and an arbitrary name. It uses MD5 hashing to create a UUID based on the namespace and name.

UUID4 (Random)

This is – in my experience and according to the above data in the image – the most common UUID version. It’s generated using random or pseudo-random numbers. It therefore provides a high probability of uniqueness.

UUID5 (Name-based, SHA-1 hash)

The version 5 UUID is pretty similar to version 3, but it uses SHA-1 (Secure Hash Algorithm) hashing instead of MD5 (the message digest algorithm).
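
The two name-based versions can be sketched in a few lines of plain PHP. This is only a minimal illustration under my own assumptions: the function name „uuid_from_name“ and its signature are made up, and for real projects a tested library is the safer choice. The DNS namespace value is the predefined one from RFC 4122.

```php
<?php
/**
 * Sketch of a name-based UUID (version 3 = MD5, version 5 = SHA-1).
 * Function name and signature are made up for this example.
 */
function uuid_from_name(string $namespace, string $name, int $version = 5): string
{
    if ($version !== 3 && $version !== 5) {
        throw new InvalidArgumentException('Only versions 3 and 5 are name-based.');
    }

    // The namespace is itself a UUID; hash its 16 raw bytes, not its text form.
    $nsBytes = hex2bin(str_replace('-', '', $namespace));
    $algo = $version === 3 ? 'md5' : 'sha1';

    // Keep the first 16 bytes of the raw hash (SHA-1 yields 20).
    $data = substr(hash($algo, $nsBytes . $name, true), 0, 16);

    $data[6] = chr(ord($data[6]) & 0x0f | ($version << 4)); // Set version to 3 or 5
    $data[8] = chr(ord($data[8]) & 0x3f | 0x80);            // Set bits 6-7 to 10

    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}

// Predefined DNS namespace from RFC 4122.
$dns = '6ba7b810-9dad-11d1-80b4-00c04fd430c8';

echo uuid_from_name($dns, 'example.com', 3), PHP_EOL;
echo uuid_from_name($dns, 'example.com', 5), PHP_EOL;
```

Unlike version 4, the result is deterministic: the same namespace and name always produce the same UUID.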

Disadvantages of using UUIDs

I think every upside of a specific topic mostly comes with a downside as well, so let’s talk about that. On the one side, UUIDs add value through the aspects mentioned above. On the other side, things like performance, indexability, etc. count as well.

Indexability and performance

When using UUIDs as keys inside your database, there’s one big thing which primarily comes to mind. I’m talking about the „indexability“ and performance of actually storing the UUIDs inside your database tables. I can’t go too deep here, as there are already so many opinions on the web, so I would recommend doing further research.

One of the most common reasons I’ve heard for not using UUIDs is a hard drop in performance when they are used in multiple joins. You should read up on the web with search queries like „index fragmentation“.

Storage size

I’ve seen it many times that people store UUIDs as „CHAR(36)“, meaning strings consisting of 36 characters. Actually, a UUID can be stored as a 128 bit = 16 byte unsigned number, which is more space-efficient. Speaking about storage size in general, I think storage is much cheaper nowadays.

Especially if you can provide better performance, I would prefer that over storage savings. For sure, this also depends on your use case and on things like budget, etc. Anyway, I’m a big fan of optimization where it actually makes sense.

As long as the effort doesn’t massively outweigh the result, it makes sense to me. Generally you could say that a UUID needs 4 times as much space as a 4 byte integer. It’s also really nice that MySQL added more support for UUIDs in version 8.
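
To illustrate the 36 vs. 16 byte difference, here is a minimal sketch that converts between the text form and the raw bytes you could put into a BINARY(16) column. The helper names are made up; they just mirror what MySQL 8 offers natively with UUID_TO_BIN() and BIN_TO_UUID().

```php
<?php
// Hypothetical helpers for a BINARY(16) column; the names are made up.
function uuid_to_bin(string $uuid): string
{
    // Strip the dashes: 32 hexadecimal characters become 16 raw bytes.
    return hex2bin(str_replace('-', '', $uuid));
}

function bin_to_uuid(string $bytes): string
{
    // Re-insert the dashes in the 8-4-4-4-12 layout.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

$uuid  = 'cc4f998b-29d8-4163-8bbe-2a44a1ccb2ac';
$bytes = uuid_to_bin($uuid);

echo strlen($uuid), PHP_EOL;       // 36
echo strlen($bytes), PHP_EOL;      // 16
echo bin_to_uuid($bytes), PHP_EOL; // cc4f998b-29d8-4163-8bbe-2a44a1ccb2ac
```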

Readability

Another important aspect, which has its downside as well as its upside (as mentioned in „guessability“), is readability. Some folks out there like that you can’t read those identifiers as easily as simple numbers. Others stress the problems of non-readable identifiers, like asking your customer in a support request: „Hey, do you have an ID for me?“.

For sure, this wouldn’t be a good experience, having your customer tell you 36 characters. There’s another thing coming along with that readability point, especially when talking about URLs. You can tell by just looking at these URLs:

/api/users/cc4f998b-29d8-4163-8bbe-2a44a1ccb2ac
/api/users/55
/api/users/my_nice_nickname

PHP generate a UUID – Manually

After discussing some aspects of using UUIDs, let’s now actually talk about generating a UUID with PHP. As I already mentioned, we can’t dive too deep into the different versions here, so I will focus on UUID version 4. Many online code examples use the mt_rand PHP function, which (as its own docs state) isn’t suitable for cryptographic purposes.

After finding different examples, I think this one here is the best, being a mix of different answers. In the first step we need to choose the „correct“ function for generating 16 random bytes. Below PHP’s major version 7, this is done with the „openssl_random_pseudo_bytes“ function.

Starting with PHP major version 7 we can use the „random_bytes“ function. With a simple ternary operation, it could look like this:

$data = PHP_MAJOR_VERSION < 7 ? openssl_random_pseudo_bytes(16) : random_bytes(16);

In the next step we are arranging the data, so that the version of the UUID, etc. actually matches:

$data[6] = chr(ord($data[6]) & 0x0f | 0x40);    // Set version to 0100
$data[8] = chr(ord($data[8]) & 0x3f | 0x80);    // Set bits 6-7 to 10
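
If the bit masks look a bit cryptic, this small worked example shows what they do to two arbitrary sample bytes (the values 0xAB and 0xCD are just picked for illustration):

```php
<?php
$byte6 = 0xAB; // 1010 1011, arbitrary sample value
$byte8 = 0xCD; // 1100 1101, arbitrary sample value

// Clear the upper four bits, then write the version bits 0100.
$versioned = $byte6 & 0x0f | 0x40;
printf("%02x\n", $versioned); // 4b

// Clear the upper two bits, then write the variant bits 10.
$variant = $byte8 & 0x3f | 0x80;
printf("%02x\n", $variant);   // 8d
```

That is why the third group of a version 4 UUID always starts with „4“ and the fourth group always starts with 8, 9, a or b.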

In the last step we are displaying the actual UUID by returning it as formatted string:

$uuid = vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));

You could also split up that last step into multiple calls, but I don’t think this would really help readability here.

So the complete function could look like this:

function get_guid() {
    $data = PHP_MAJOR_VERSION < 7 ? openssl_random_pseudo_bytes(16) : random_bytes(16);
    $data[6] = chr(ord($data[6]) & 0x0f | 0x40);    // Set version to 0100
    $data[8] = chr(ord($data[8]) & 0x3f | 0x80);    // Set bits 6-7 to 10
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}

You could also expand further by going the „Microsoft way“ with the „com_create_guid“ function. This function will only (as far as I know) work if you’re on a Windows-based server. It generates a UUID wrapped in curly braces, hence the trim statement.

function get_guid() {
    // Prefer the native COM function where it is available
    if (function_exists('com_create_guid') === true) {
        return trim(com_create_guid(), '{}');
    }
    $data = PHP_MAJOR_VERSION < 7 ? openssl_random_pseudo_bytes(16) : random_bytes(16);
    $data[6] = chr(ord($data[6]) & 0x0f | 0x40);    // Set version to 0100
    $data[8] = chr(ord($data[8]) & 0x3f | 0x80);    // Set bits 6-7 to 10
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}
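
To double-check what such a function returns, a small validation helper can be useful. The following is just a sketch with a made-up function name; it only checks the text layout plus the version and variant positions of a version 4 UUID.

```php
<?php
// Hypothetical helper; the name is made up for this example.
function is_uuid_v4(string $value): bool
{
    // Third group starts with the version digit 4, fourth group with 8, 9, a or b.
    // The "D" modifier stops "$" from matching before a trailing newline.
    $pattern = '/^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/iD';

    return preg_match($pattern, $value) === 1;
}

var_dump(is_uuid_v4('cc4f998b-29d8-4163-8bbe-2a44a1ccb2ac')); // bool(true)
var_dump(is_uuid_v4('55'));                                   // bool(false)
```

The case-insensitive match also accepts uppercase values, as long as the surrounding curly braces have been trimmed.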

Generating a UUID with ramsey/uuid

Before doing your own experiments and implementing the UUID versions yourself, it’s probably better to use an existing library. Usually, I’m a fan of „doing things myself“, but when it comes to security and randomness, I prefer to rely on widely used projects that have been tested by many people.

Of course you could run into typical problems like the author abandoning the project, etc., but in the end you have to weigh the common factors like:

Maintainability
Security
Testing
Time-savings
….

So let’s now take a look at some example code for the ramsey/uuid library. Feel free to visit the project on GitHub.

Installing the library

As with most PHP-based libraries, we will install this one with Composer as well. Open a terminal in your project folder and run the following installation command:

composer require ramsey/uuid

Import and usage

After you have installed the library, you can import the corresponding classes. Put the import statement in the usual upper part of your code file. You can then just call the static helper functions to generate a UUID of your choice.

We will generate a UUID4 in this case:

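<?php
namespace yournamespace;

// load Composer's autoloader (assumed default location, adjust the path to your project)
require __DIR__ . '/vendor/autoload.php';

// here's the important part...
use Ramsey\Uuid\Uuid;

// here's the generation of the UUID4 itself
$myUuid4 = Uuid::uuid4();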

Formatting and stuff

Keep in mind, that the previous example created an instance of the following class:

Ramsey\Uuid\Rfc4122\UuidV4

If you want to do some formatting, you can use things like this, according to the documentation:

<?php
namespace yournamespace;

// import
use Ramsey\Uuid\Uuid;

// creation
$uuid = Uuid::uuid4();

// output
printf(
    "UUID: %s\nVersion: %d\n",
    $uuid->toString(),
    $uuid->getFields()->getVersion()
);