The problems with arrays you know

PHP Notice: Undefined index: name in ... on line 10

or:

PHP Warning: array_key_exists() expects parameter 2 to be array, null given in ... on line 15

and another day it is:

PHP Notice:  Undefined offset: 1 in ... on line 35

Your PHP error log probably fills up with errors like these every now and then. You can do a few things about it.

You can disable error logging and continue calling yourself a professional WordPress developer, find me on Fiverr.

You can also fix those errors, which is a good start. Before you refer to any array element, make sure that the variable really is an array and that the value you want to fetch exists at the given offset or key:

<?php

$languages = (array) $languages;

echo $languages['en']['name'] ?? 'No language found';

if ( isset( $languages[1]['name'] ) ) {
    echo "next language is {$languages[1]['name']}";
} else {
    echo "no more languages found";
}

Now no error should be logged, but first: are we sure? And second: could we do this more simply?

I had a reason not to show you what $languages is, because this is what your work often looks like: you operate on data that was created many files, functions, and filters ago. You can check the place where the data was set and see its structure, but you still can’t be sure the structure is the same tens of files, functions, and filters later.

This is a perennial problem: the bigger the system you build, the more often you will see such errors. The first error could happen in a case like this:

<?php

// the code you wrote months ago.
$languages = [
    'en' => [
        'lang_id' => 'en',
        'name' => 'English',
        'flag' => 'us',
        'locale' => 'en_US',
    ],
    'fr' => [
        'lang_id' => 'fr',
        'name' => 'Français',
        'flag' => 'fr',
        'locale' => 'fr_FR',
    ],
];

$languages = apply_filters( 'wp_languages', $languages );

// Far far away, in another galaxy... I mean in another file,
// you are working on right now:

foreach ( $languages as $language ) {

    _e( 'Available languages', 'wp-languages' );
    ?>
    <ul>
        <?php
        foreach ( $languages as $language ) {
            ?>
            <li><?php echo esc_html( $language['name'] ); ?></li>
            <?php
        }
        ?>
    </ul>
    <?php
}

echo "Please expect " . $languages[1] . " is not fully available yet";

The code looks valid (and ugly, I know; to focus on one aspect I did not spend much time polishing it, sorry about that), but we already know that if it produced those errors, something is wrong with it.

$languages has passed through the filter, and a filter can completely change its value: remove some elements and their keys, unset $languages entirely, or replace it with the string "ladies and gentlemen, będę mówił po polsku" (Polish for "I will speak Polish").

You could check the presence and format of the data the way I proposed in the first code snippet, and there is nothing wrong with this approach, except… there is a better way: do not use arrays.

Arrays are mutable, which means that every time you try to get a value from one, you don’t know whether the value exists at all. Arrays passed through a stack of tens, if not hundreds, of functions can lose their values, get new values with a completely different meaning, or even stop being arrays. This means you have to do such checks not just once, but EVERY TIME you refer to the array.

There is no way to create an enforceable contract that describes the array format. I mean, you can do this:

/**
 * List of available languages
 *
 * The array contains the list of available languages, indexed by language code.
 * Each language is an array with the following keys:
 * - lang_id: the language code (same as array key)
 * - name: the language name
 * - flag: the language flag
 * - locale: the locale code
 *
 * The key is always two letters long, lowercase.
 * The name is capitalized... oh never mind, no one will read this.
 */
$languages = [

But do we have any guarantee that any other developer (or even you, after a few months) will follow this contract when working with the data? Not to mention whether anyone would read this contract before starting to play with the $languages array (even you, right now: did you notice the last line of the comment?).

I hope I have convinced you that arrays are not the best tool. They are good for some use cases, but certainly not for transporting data between different parts of the system.

So what should we use instead? Let me show you how to replace them with objects and what the benefits of such a change are.

Our $languages was an array where every element was also an array representing a single language. Let’s start by replacing this single language with a dedicated…

Language class:

class Language {

    private $name;

    private $locale;

    private $lang_id;

    private $flag;

    public function __construct( $name, $locale ) {

        $args = func_get_args();

        if ( ! $this->validate( $args ) ) {
            // InvalidArgumentException is PHP's built-in class for this case.
            throw new InvalidArgumentException( 'At least one of the arguments is invalid' );
        }

        $args = $this->sanitize( $args );

        $this->name = $args[0];
        $this->locale = $args[1];
        $this->lang_id = strtolower( explode( '_', $this->locale )[0] );
        $this->flag = strtolower( explode( '_', $this->locale )[1] );
    }

    public function __get( $key ) {

        // Return null instead of raising a notice for unknown properties.
        return $this->$key ?? null;
    }

    private function validate( $args ) {

        // @todo
        // check if name is a string.
        // check if locale is a string 2 letters underscore and 2 letters.

        return true;
    }

    private function sanitize( $args ) {

        // We can be forgiving here, and sanitize some mistakes.

        $args[0] = ucwords( $args[0] );
        $args[1] = str_replace( '-', '_', $args[1] );

        return $args;
    }
}

Now our $languages could look like this:

$languages = [
    'en' => new Language( 'English', 'en_US' ),
    'fr' => new Language( 'Français', 'fr_FR' ),
];

The Language class might look like overkill, but its benefits can outweigh the drawback of its size. The first thing you can see is that constructing the $languages array is much simpler than it was. We are just initializing two objects of the Language class.

What’s more, we pass only two arguments to each object: the human-readable language name and the locale, built from ISO language and country codes. The language id and the flag are derived on the fly, since they are simply the two parts of the locale code. A plain array cannot do this for you.

More importantly, each language is now immutable. Once created, it stays the same forever. Is there someone on your team who would like to replace the English locale with pt_BR? Sorry, can’t do this. All properties of Language are private and can only be read through the magic method __get. Whoever wants such a quirky language has to create a new instance of the class with those values.

Or maybe not even that. Using objects can ensure that weird instances are never created at all. For this purpose we have the validate method, which lets the constructor throw an exception when someone tries to do something that is not part of the contract. Once the validation rules are filled in, no one will be able to create new Language( 'Polish', [ 'pl', 'PL' ] ); because the second parameter must be a string. You can add as many validation rules as you want: restrict language names to existing ones, do the same for ISO codes, and so on.
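Here is a minimal sketch of both guarantees, immutability and validation, in one runnable piece. StrictLanguage is a hypothetical, trimmed-down variant of the Language class, and the rules inside validate (a non-empty name, a locale made of two letters, an underscore or hyphen, and two letters) are my assumptions based on the @todo comments, not the only possible contract:

```php
<?php

// Hypothetical, trimmed-down variant of Language with validate() filled in.
class StrictLanguage {

    private $name;

    private $locale;

    public function __construct( $name, $locale ) {

        if ( ! $this->validate( $name, $locale ) ) {
            throw new InvalidArgumentException( 'At least one of the arguments is invalid' );
        }

        $this->name   = $name;
        $this->locale = str_replace( '-', '_', $locale );
    }

    public function __get( $key ) {
        return $this->$key ?? null;
    }

    private function validate( $name, $locale ) {

        // The name must be a non-empty string.
        if ( ! is_string( $name ) || '' === trim( $name ) ) {
            return false;
        }

        // The locale must be a string such as en_US (en-US is tolerated
        // because the constructor normalizes the hyphen afterwards).
        return is_string( $locale )
            && 1 === preg_match( '/^[a-z]{2}[_-][a-z]{2}$/i', $locale );
    }
}

$english = new StrictLanguage( 'English', 'en-US' );

// 1. Writing to a private property from outside the class throws an Error.
try {
    $english->locale = 'pt_BR';
    $write_refused   = false;
} catch ( Error $e ) {
    $write_refused = true;
}

// 2. An array instead of a locale string is rejected by validate().
try {
    new StrictLanguage( 'Polish', [ 'pl', 'PL' ] );
    $invalid_refused = false;
} catch ( InvalidArgumentException $e ) {
    $invalid_refused = true;
}
```

Both attempts end up in the catch branch, and $english keeps its en_US locale.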

You can’t achieve any of the above with arrays. But have we already solved the problem with PHP errors?

The languages collection

All the Language objects are kept in $languages, which is still an array. All will be fine if someone tries to get:

echo $languages['en']->name;

and even an attempt to read a non-existent property will not produce any errors:

echo $languages['en']->vocabulary;

There could still be an undefined key error if someone does something like this:

echo $languages['de']->name; // There is no language like 'de'.

How do we solve this problem? Yes, with an object, but of a different kind.

class LanguagesCollection {

    private $languages = [];

    public function store( Language $language ) {

        $this->languages[$language->lang_id] = $language;
    }

    public function get( $lang_id ) {

        if ( ! isset( $this->languages[$lang_id] ) ) {
            return new NullLanguage;
        }

        return $this->languages[$lang_id];
    }
}

Now our languages will be created this way:

$languages = new LanguagesCollection;
$languages->store( new Language( 'English', 'en_US' ) );
$languages->store( new Language( 'Français', 'fr_FR' ) );

And if you want to, let’s say, display the English language name, you can do this:

echo $languages->get('en')->name;

If you try to get a non-existent language, like:

echo $languages->get('de')->name;

it will just return null. Why? Because the get method checks whether the language exists, and if it has not been stored yet (you can always add another language with the store method), it returns a NullLanguage. This is not a built-in class, so let’s define it. It is simple:

class NullLanguage {

    public function __get( $key ) {
        return null;
    }

    public function __call( $name, $arguments ) {
        return null;
    }
}

This class always returns null when you try to read any of its properties or call any of its methods. No rocket science, but it protects us from errors like "Trying to get property 'name' of non-object".

Do you want to iterate over $languages as if it were an array? Just add this method to LanguagesCollection:

    public function to_array() {
        return $this->languages;
    }

And call it this way:

foreach ( $languages->to_array() as $lang_id => $language ) {
    echo esc_html( $language->name );
}
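If you would rather skip the to_array() call, one alternative is to let the collection implement PHP's built-in IteratorAggregate and Countable interfaces. The sketch below is only an illustration of that idea: IterableLanguagesCollection and LanguageStub are hypothetical names, and the stub stands in for the Language class just to keep the example runnable on its own.

```php
<?php

// Minimal stand-in for the Language class, only to keep the sketch runnable.
class LanguageStub {

    private $name;

    private $lang_id;

    public function __construct( $name, $lang_id ) {
        $this->name    = $name;
        $this->lang_id = $lang_id;
    }

    public function __get( $key ) {
        return $this->$key ?? null;
    }
}

// Hypothetical variant of LanguagesCollection usable directly in foreach.
class IterableLanguagesCollection implements IteratorAggregate, Countable {

    private $languages = [];

    public function store( LanguageStub $language ) {
        $this->languages[ $language->lang_id ] = $language;
    }

    // Called by foreach; ArrayIterator walks a copy of the internal array.
    public function getIterator(): Traversable {
        return new ArrayIterator( $this->languages );
    }

    // Called by count( $collection ).
    public function count(): int {
        return count( $this->languages );
    }
}

$languages = new IterableLanguagesCollection;
$languages->store( new LanguageStub( 'English', 'en' ) );
$languages->store( new LanguageStub( 'Français', 'fr' ) );

$names = [];
foreach ( $languages as $lang_id => $language ) {
    $names[ $lang_id ] = $language->name;
}
```

The internal array stays private, so callers can loop over the languages but still cannot replace or remove them behind the collection's back.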

Our code is ready. Is it perfect? No, but it is much safer than when we use arrays:

  • You don’t need to check whether a language has lang_id, name, flag, or locale set. It always does, because no one is able to store any other type of data inside the LanguagesCollection. If someone tries to do:
$languages = new LanguagesCollection;
$languages->store( new Language( 'English', 'en_US' ) );
$languages->store( new Language( 'Français', 'fr_FR' ) );
$languages->store( '<span>Other languages</span>' );
$languages->store( new Language( 'Deutsch', 'de_DE' ) );


this will not be allowed. Whoever tries it will get a PHP TypeError about the wrong data type passed on the penultimate line.

  • $languages is mutable, so you (or other developers) can extend the list (or narrow it, if you add a delete() method) depending on the use case. Whether the list gets extended or not, you don’t need to check that newly added languages have all their properties set. LanguagesCollection takes care of this.
    There is only one big problem. If we pass the languages through the filter:
$languages = apply_filters( 'wp_languages', $languages );

we can’t be sure that someone who hooks into this filter will not replace $languages with something else, for example with the plain string "English, French and German". So in that case we need to check that $languages is still an object of the LanguagesCollection class. To avoid repeating this check all the time, I propose creating a wrapper around the apply_filters function that checks whether the type has been altered and returns the filtered result only if it has not (otherwise it falls back to the original value):

if ( ! function_exists( 'apply_filters_strict' ) ) {

    function apply_filters_strict( $tag, $type, $value ) {
        $filtered = apply_filters( $tag, $value );

        return is_a( $filtered, $type ) ? $filtered : $value;
    }
}

Now use it this way:

$languages = apply_filters_strict( 'wp_languages', 'LanguagesCollection', $languages );

Clever, huh?

  • Each language is immutable. No need to check again and again whether what we get from $languages->get( 'en' ) still has "English" as its name. It does, guaranteed.
  • You don’t need to worry about whether other developers will follow the data structure you planned for languages. No matter whether $languages is used right after it is set or tens of files and hundreds of methods later, it will always look internally the way you wanted. Is there a bug in the code? No need to var_dump the languages or inspect their internals in any other way. Just check whether it is still an object of the LanguagesCollection type, and only if it is not, trace back why (and if you follow the type-checking rules described above, there is almost no chance $languages will lose its original type).
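The apply_filters_strict wrapper from the list above can be tried outside WordPress with a small experiment. In the sketch below, the add_filter and apply_filters definitions are simplified stand-ins that I assume only for this demo (inside WordPress the real ones exist and the stand-ins are skipped), and DemoLanguagesCollection and the demo_languages hook are hypothetical names:

```php
<?php

// Simplified stand-ins for WordPress hooks so the sketch runs on its own.
// Inside WordPress, add_filter() and apply_filters() already exist
// and this whole block is skipped.
if ( ! function_exists( 'add_filter' ) ) {

    $GLOBALS['demo_filters'] = [];

    function add_filter( $tag, $callback ) {
        $GLOBALS['demo_filters'][ $tag ][] = $callback;
    }

    function apply_filters( $tag, $value ) {
        foreach ( $GLOBALS['demo_filters'][ $tag ] ?? [] as $callback ) {
            $value = $callback( $value );
        }

        return $value;
    }
}

// The wrapper from the article: keep the filtered value only if its type survived.
function apply_filters_strict( $tag, $type, $value ) {
    $filtered = apply_filters( $tag, $value );

    return is_a( $filtered, $type ) ? $filtered : $value;
}

// Empty placeholder class; only its type matters here.
class DemoLanguagesCollection {}

// A misbehaving callback that replaces the object with a string.
add_filter( 'demo_languages', function ( $languages ) {
    return 'English, French and German';
} );

$original = new DemoLanguagesCollection;
$result   = apply_filters_strict( 'demo_languages', 'DemoLanguagesCollection', $original );
```

The string returned by the callback is discarded and $result is still the original collection object. Note the trade-off: the wrapper silently ignores the bad filter, so you may also want to log such cases.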

Congratulations! You just learned two design patterns.

I am sorry for tricking you, but I know what happens when you start a lecture with "Today I will describe design pattern X". People become defensive and withdrawn, because this fancy term sounds like something way above their level, especially when you take into account how much experience varies among WordPress developers. So I started with an example.

Each language, as an instance of the Language class, follows the Value Object design pattern, and $languages, as a LanguagesCollection, is an implementation of the Data Transfer Object design pattern.

Value Object

is an object that stores a composed value, and this composition must be super strict. Data can be stored as an instance of this class only if they meet the validation criteria. The other important rule is that value objects are immutable. They represent final units of data. A common example of the Value Object pattern is an object that represents money. You can create a new Coin( 0.50, 'USD' ) and let your 50 cents go, but can someone who takes it from you change this particular coin into 20 cents? Nope.

Values do not change. Once set, they stay like that forever.
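To make the coin example concrete, here is one way such a class could look. Treat it as a sketch under my own assumptions: the Coin class is not part of any library, the validation rules are illustrative, and storing the amount as whole cents assumes a currency with two decimal places.

```php
<?php

// Hypothetical Coin value object: validated on creation, never modified afterwards.
final class Coin {

    private $cents;

    private $currency;

    public function __construct( $amount, $currency ) {

        if ( ! is_numeric( $amount ) || $amount < 0 ) {
            throw new InvalidArgumentException( 'Amount must be a non-negative number.' );
        }

        if ( ! is_string( $currency ) || 1 !== preg_match( '/^[A-Z]{3}$/', $currency ) ) {
            throw new InvalidArgumentException( 'Currency must be a three-letter uppercase code.' );
        }

        // Stored as whole cents to avoid float comparison surprises.
        $this->cents    = (int) round( $amount * 100 );
        $this->currency = $currency;
    }

    public function amount() {
        return $this->cents / 100;
    }

    public function currency() {
        return $this->currency;
    }

    // Two coins are equal when their values match, even if they are different instances.
    public function equals( Coin $other ) {
        return $this->cents === $other->cents && $this->currency === $other->currency;
    }

    // "Changing" a coin means creating a new one; the original stays untouched.
    public function with_amount( $amount ) {
        return new self( $amount, $this->currency );
    }
}

$fifty  = new Coin( 0.50, 'USD' );
$twenty = $fifty->with_amount( 0.20 );
```

After this runs, $fifty is still worth 50 cents; $twenty is a separate object with its own value.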

Data Transfer Objects

are containers for transferring data. Wherever arrays could cause problems or require a ton of structure validation, a DTO comes to help. DTOs act as contracts that make sure different parts of the system (or different systems) always operate on the same data structure.

Where else?

I said different systems because DTOs are usually a way to prepare (and transfer) data that leave one system (server) and are passed to another via some API. Are you building a payment solution that sends payment requests to Stripe? Suppose the data are collected in a form that is part of your theme, then logged into your database by some functionality in the functions.php file, and, to avoid sending the request directly from the form, queued in Action Scheduler or in some other way, and only then sent. Check what data format Stripe expects (if you look into the Stripe API documentation, or that of any other API, you will notice that every payload field has its data type listed; it is in a light grey font, but it was always there), prepare your DTO for it, and use it from the very beginning of processing the payment: from the moment you obtain the data from the form to the moment wp_remote_post says goodbye to it.
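A rough sketch of such a DTO could look like the one below. Everything in it is an assumption made for illustration: the PaymentRequest class, its field names (amount_cents, currency, customer_email), and the validation rules are hypothetical and are not taken from the Stripe API, so check the real documentation before modelling your own.

```php
<?php

// Hypothetical DTO; the field names are illustrative, not Stripe's actual payload.
final class PaymentRequest {

    private $amount_cents;

    private $currency;

    private $customer_email;

    public function __construct( $amount_cents, $currency, $customer_email ) {

        if ( ! is_int( $amount_cents ) || $amount_cents <= 0 ) {
            throw new InvalidArgumentException( 'Amount must be a positive integer.' );
        }

        if ( ! is_string( $currency ) || 1 !== preg_match( '/^[a-z]{3}$/', $currency ) ) {
            throw new InvalidArgumentException( 'Currency must be a three-letter lowercase code.' );
        }

        if ( false === filter_var( $customer_email, FILTER_VALIDATE_EMAIL ) ) {
            throw new InvalidArgumentException( 'Customer e-mail is invalid.' );
        }

        $this->amount_cents   = $amount_cents;
        $this->currency       = $currency;
        $this->customer_email = $customer_email;
    }

    // The only place where the object turns back into an array:
    // right before it is serialized and sent.
    public function to_array() {
        return [
            'amount_cents'   => $this->amount_cents,
            'currency'       => $this->currency,
            'customer_email' => $this->customer_email,
        ];
    }
}

$request = new PaymentRequest( 1999, 'usd', 'jane@example.com' );
$payload = $request->to_array();
```

The form handler, the logger, and the queue can all pass $request around as is; only the code that finally sends the request needs to call to_array().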

How do you feel now, you, design pattern newbie? Or maybe you have been using those design patterns unknowingly for years? If not Value Object or Data Transfer Object, I guarantee you are quite well experienced with other ones already, without knowing their names.

