IDN Conversion

To safely translate a domain name into its Unicode representation and vice versa, we need a tool that correctly reports the conversion results. To do so, the package provides an enhanced OOP wrapper around PHP’s idn_to_ascii and idn_to_utf8 functions: the League\Uri\Idna\Converter class.

With vanilla PHP you have to do the following:

<?php

$flags = IDNA_NONTRANSITIONAL_TO_UNICODE | // the to-Unicode flag is the one idn_to_utf8 honours
    IDNA_CHECK_BIDI |
    IDNA_USE_STD3_RULES |
    IDNA_CHECK_CONTEXTJ;
    
$res = idn_to_utf8('www.xn--85x722f.xn--55qx5d.cn', $flags, INTL_IDNA_VARIANT_UTS46, $result);

$res;               // returns 'www.食狮.公司.cn'
$result['result'];  // returns 'www.食狮.公司.cn'
$result['errors'];  // returns 0
$result['isTransitionalDifferent'];  // returns false

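which means remembering the right flag constants, the INTL_IDNA_VARIANT_UTS46 variant, and the keys of the by-reference $result array.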
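For instance, the errors entry is a bitmask that has to be decoded by hand against the intl extension's IDNA_ERROR_* constants. The sketch below assumes ext-intl is loaded; decodeIdnaErrors is a hypothetical helper name, not part of PHP or of this package.

```php
<?php

/**
 * Hypothetical helper: lists the names of the IDNA_ERROR_* constants
 * whose bit is set in the bitmask reported by idn_to_ascii/idn_to_utf8.
 *
 * @return list<string>
 */
function decodeIdnaErrors(int $bitmask): array
{
    $names = [
        'IDNA_ERROR_EMPTY_LABEL',
        'IDNA_ERROR_LABEL_TOO_LONG',
        'IDNA_ERROR_DOMAIN_NAME_TOO_LONG',
        'IDNA_ERROR_LEADING_HYPHEN',
        'IDNA_ERROR_TRAILING_HYPHEN',
        'IDNA_ERROR_HYPHEN_3_4',
        'IDNA_ERROR_LEADING_COMBINING_MARK',
        'IDNA_ERROR_DISALLOWED',
        'IDNA_ERROR_PUNYCODE',
        'IDNA_ERROR_LABEL_HAS_DOT',
        'IDNA_ERROR_INVALID_ACE_LABEL',
        'IDNA_ERROR_BIDI',
        'IDNA_ERROR_CONTEXTJ',
    ];

    $found = [];
    foreach ($names as $name) {
        // Each constant is a single bit; keep the name when that bit is set.
        if (0 !== ($bitmask & constant($name))) {
            $found[] = $name;
        }
    }

    return $found;
}

// A label longer than 63 bytes should be reported through the bitmask.
idn_to_ascii(str_repeat('a', 64).'.com', IDNA_NONTRANSITIONAL_TO_ASCII, INTL_IDNA_VARIANT_UTS46, $info);
print_r(decodeIdnaErrors($info['errors'] ?? 0));
```

The Converter class described next spares you this decoding step.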

In contrast, when performing a conversion with a method from League\Uri\Idna\Converter a League\Uri\Idna\Result instance is returned with information regarding the outcome of the conversion.

<?php

use League\Uri\Idna\Converter;

/** @var League\Uri\Idna\Result $result */
$result = Converter::toUnicode('www.xn--85x722f.xn--55qx5d.cn');
$result->domain();                  // returns 'www.食狮.公司.cn'
$result->isTransitionalDifferent(); // returns false
$result->hasErrors();               // returns false
 
$result = Converter::toAscii('www.食狮.公司.cn');
$result->domain();                  // returns 'www.xn--85x722f.xn--55qx5d.cn'
$result->isTransitionalDifferent(); // returns false
$result->hasErrors();               // returns false

In case of errors, the Result::hasErrors method returns true and you can inspect the errors found using the Result::errors method, which returns a list of Error enum cases.

<?php

use League\Uri\Idna\Converter;
use League\Uri\Idna\Error;

$result = Converter::toAscii('aa'.str_repeat('A', 64).'.%00.com');
$result->hasErrors(); // returns true
$result->hasError(Error::LABEL_TOO_LONG); // returns true
foreach ($result->errors() as $error) {
    echo $error->name, ': ', $error->description(), PHP_EOL;
}
//displays
//LABEL_TOO_LONG: a domain name label is longer than 63 bytes
//DISALLOWED: a label or domain name contains disallowed characters

In case of error the return value of Result::domain may not be the same as the submitted value and may highlight the host part that triggered the error as per the specifications.

The League\Uri\Idna\Error enum provides the official name of the error as well as its description via the Error::description method.
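The error-reporting API above can be wrapped in a small guard. The following is a minimal sketch, assuming the Composer autoloader is already loaded; asciiHostOrNull is a hypothetical helper name and is not provided by the package.

```php
<?php

// Assumption: vendor/autoload.php has been required by the calling script.
use League\Uri\Idna\Converter;

/**
 * Hypothetical helper: returns the ASCII form of a host, or null after
 * logging every error reported for the conversion.
 */
function asciiHostOrNull(string $host): ?string
{
    $result = Converter::toAscii($host);
    if (!$result->hasErrors()) {
        return $result->domain();
    }

    foreach ($result->errors() as $error) {
        // $error->name is the official error name, description() its explanation.
        error_log($host.': '.$error->name.' - '.$error->description());
    }

    return null;
}

var_dump(asciiHostOrNull('bébé.be'));
var_dump(asciiHostOrNull('%00.com'));
```

Returning null rather than Result::domain on failure is a design choice of this sketch: on error the domain may differ from the submitted value, so it should not be reused silently.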

Both static methods Converter::toAscii and Converter::toUnicode expect a host string and some IDN related options.

Since version 7.2.0, the Converter methods also accept a Stringable object as domain input.
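As a short illustration of Stringable input, the sketch below wraps a host in a value object. The Hostname class is hypothetical and exists only for this example; version 7.2.0 or later and a loaded Composer autoloader are assumed.

```php
<?php

// Assumption: vendor/autoload.php has been required by the calling script.
use League\Uri\Idna\Converter;

// Hypothetical value object, used only for illustration.
final class Hostname implements Stringable
{
    public function __construct(private readonly string $value)
    {
    }

    public function __toString(): string
    {
        return $this->value;
    }
}

// The object is passed directly; no explicit string cast is needed.
$result = Converter::toAscii(new Hostname('bébé.be'));
echo $result->domain(), PHP_EOL;
```

The outcome should match what the same call yields for the plain string 'bébé.be'.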

You can provide PHP’s own constants or, if you want a more readable API, use the League\Uri\Idna\Option immutable object.

<?php

use League\Uri\Idna\Converter;
use League\Uri\Idna\Option;

$flags = IDNA_NONTRANSITIONAL_TO_ASCII |
    IDNA_CHECK_BIDI |
    IDNA_USE_STD3_RULES |
    IDNA_CHECK_CONTEXTJ;
//can be rewritten as

$option1 = Option::new(IDNA_NONTRANSITIONAL_TO_ASCII)
    ->add(IDNA_CHECK_BIDI)
    ->add(IDNA_USE_STD3_RULES)
    ->add(IDNA_CHECK_CONTEXTJ);

//can be rewritten as

$option2 = Option::new()
    ->nonTransitionalToAscii()
    ->checkBidi()
    ->useSTD3Rules()
    ->checkContextJ();

//can be rewritten as

$option3 = Option::forIDNA2008Ascii();

echo idn_to_ascii('bébé.be', $flags);
echo idn_to_ascii('bébé.be', $option1->toBytes());
echo idn_to_ascii('bébé.be', $option2->toBytes());
echo idn_to_ascii('bébé.be', $option3->toBytes());

echo Converter::toAscii('bébé.be')->domain();
echo Converter::toAscii('bébé.be', $flags)->domain();
echo Converter::toAscii('bébé.be', $option1)->domain();
echo Converter::toAscii('bébé.be', $option2)->domain();
echo Converter::toAscii('bébé.be', $option3)->domain();

//all the above calls will produce the same result 'xn--bb-bjab.be'

If you provide an Option instance, the Option::toBytes method will be called inside the conversion method when appropriate.

Contrary to PHP’s functions, if no option is provided, both methods use the correct basic options to validate domain names.

Last but not least, if you prefer methods that throw exceptions instead of having to check the Result::hasErrors method, you can use the related methods Converter::toAsciiOrFail and Converter::toUnicodeOrFail.

Both methods directly return the converted domain string or throw a League\Uri\Exceptions\ConversionFailed exception on error. From the exception you can still access the Result object via getResult and the submitted host via getHost.

The exception message will contain a concatenation of all the error descriptions available for the conversion.

<?php

use League\Uri\Idna\Converter;
use League\Uri\Exceptions\ConversionFailed;

try {
    $domain = Converter::toAsciiOrFail('%00.com');
} catch (ConversionFailed $exception) {
    $result = $exception->getResult(); // returns the `League\Uri\Idna\Result` object
    echo $exception->getHost();        // displays "%00.com"
    echo $result->domain();            // displays "�00.com"
    echo $exception->getMessage(); 
    //displays "Host `%00.com` is invalid: a label or domain name contains disallowed characters."
}
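The exception-based variant lends itself to batch processing. The sketch below is an assumption-laden illustration: normalizeHosts is a hypothetical helper that is not part of the package, and the Composer autoloader is assumed to be loaded.

```php
<?php

// Assumption: vendor/autoload.php has been required by the calling script.
use League\Uri\Exceptions\ConversionFailed;
use League\Uri\Idna\Converter;

/**
 * Hypothetical helper: converts each host to ASCII and separates the
 * hosts that failed, keyed by the submitted value.
 *
 * @param list<string> $hosts
 *
 * @return array{valid: list<string>, rejected: array<string, string>}
 */
function normalizeHosts(array $hosts): array
{
    $valid = [];
    $rejected = [];
    foreach ($hosts as $host) {
        try {
            $valid[] = Converter::toAsciiOrFail($host);
        } catch (ConversionFailed $exception) {
            // getHost() is the submitted host, getMessage() the combined error descriptions.
            $rejected[$exception->getHost()] = $exception->getMessage();
        }
    }

    return ['valid' => $valid, 'rejected' => $rejected];
}

print_r(normalizeHosts(['bébé.be', '%00.com']));
```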

since version 7.2.0

It is possible to determine whether a given domain is or can be internationalized using the new Converter::isIdn method. The method returns true if the submitted domain is valid and is, or can be converted to, its Unicode form.

<?php

use League\Uri\Idna\Converter;

Converter::isIdn('www.example.com'); // returns false - the Unicode form is identical
Converter::isIdn('%00.com'); // returns false - the host is invalid
Converter::isIdn('bébé.be');   // returns true
Converter::isIdn('www.xn--85x722f.xn--55qx5d.cn');  // returns true - an IDN host in its ASCII form
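One possible use of this check is choosing how to display a host. The following is a minimal sketch, assuming version 7.2.0 or later and a loaded Composer autoloader; displayHost is a hypothetical helper name, not part of the package.

```php
<?php

// Assumption: vendor/autoload.php has been required by the calling script.
use League\Uri\Idna\Converter;

/**
 * Hypothetical helper: returns the Unicode form of an IDN host for
 * display, and any other host unchanged.
 */
function displayHost(string $host): string
{
    // isIdn() is false for invalid hosts too, so they are returned as submitted.
    if (!Converter::isIdn($host)) {
        return $host;
    }

    return Converter::toUnicode($host)->domain();
}

echo displayHost('www.xn--85x722f.xn--55qx5d.cn'), PHP_EOL;
echo displayHost('www.example.com'), PHP_EOL;
```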