Setting Up a Multilingual Website

© 2007, Geert De Deckere

You want to create a multilingual site. Each translation should be available via a URL whose first segment is a two-letter language code, like this: example.com/en/page, example.com/nl/page, etc. This may seem a daunting task. However, this tutorial shows you how to tackle it using the power and flexibility of Kohana, in just four steps.

1. Force language in every URL via .htaccess

Start by creating a .htaccess file in the document root of your website.

RewriteEngine On
RewriteBase /

# Force EVERY URL to contain a language in its first segment.
# Redirect URLs without a language to the invalid xx language.
RewriteCond $2 !^([a-z]{2}(/|$)) [NC]
RewriteRule ^(index\.php/?)?(.*)$ xx/$2 [R=301,L]

# Silently prepend index.php to EVERY URL.
RewriteCond $1 !^(index\.php)
RewriteRule ^(.*)$ index.php/$1 [L]
 

At this point you know that every URL that gets to your Kohana app will have a language as its first segment. No exceptions. If a URL has no language in it (e.g. example.com/page), it will be redirected to a temporary language placeholder (e.g. example.com/xx/page), which we take care of in step two.
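
To see what the first rule does to a given request path, here is a minimal sketch in PHP that mirrors its two regexes. The function name force_lang_segment is hypothetical and exists only for illustration; on the server Apache does this work, not PHP.

```php
<?php
// Hypothetical helper: returns the redirect target the first rewrite rule
// would produce, or NULL when the path already starts with a language.
function force_lang_segment($path)
{
    // Mirrors: RewriteRule ^(index\.php/?)?(.*)$
    preg_match('#^(index\.php/?)?(.*)$#', $path, $m);
    $rest = $m[2];

    // Mirrors: RewriteCond $2 !^([a-z]{2}(/|$)) [NC]
    if (preg_match('#^[a-z]{2}(/|$)#i', $rest))
    {
        return NULL; // language present, no redirect
    }

    // Mirrors the substitution: xx/$2
    return 'xx/'.$rest;
}

echo force_lang_segment('page'), "\n";           // xx/page
echo force_lang_segment('index.php/page'), "\n"; // xx/page
echo force_lang_segment('english/page'), "\n";   // xx/english/page
var_dump(force_lang_segment('en/page'));         // NULL
```

Note that 'english/page' is redirected as well: two letters only count as a language when they are followed by a slash or the end of the path.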

2. Dynamically set locale config via a hook

The core of the language system is a hook that I call site_lang. Note that you need to enable hooks via application/config/hooks.php. You will also need to set allow_config_set to TRUE, since the hook sets config items at runtime. You will find that option in application/config/config.php.

This hook looks at the language key found in the URL and sets the locale config values accordingly. If the language key in the URL is invalid (e.g. 'xx'), the hook looks for the most appropriate alternative, taking into account (in order of precedence): a language cookie set on a previous visit, the HTTP_ACCEPT_LANGUAGE request header, and the default language chosen by you. Finally, the visitor is redirected to the corrected URL.

application/hooks/site_lang.php

<?php

// This hook sets the locale.language and locale.lang config values
// based on the language found in the first segment of the URL.

Event::add('system.routing', 'site_lang');

function site_lang()
{
        // Array of allowed languages
        $locales = Config::item('locale.allowed_locales');

        // Extract language from URL
        $lang = strtolower(substr(url::current(), 0, 2));

        // Invalid language is given in the URL
        if ( ! array_key_exists($lang, $locales))
        {
                // Look for default alternatives and store them in order
                // of importance in the $new_langs array:
                //  1. cookie
                //  2. http_accept_language header
                //  3. default lang

                // Look for cookie (this also initializes the array)
                $new_langs = array((string) cookie::get('lang'));

                // Look for HTTP_ACCEPT_LANGUAGE
                if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE']))
                {
                        foreach(explode(',', $_SERVER['HTTP_ACCEPT_LANGUAGE']) as $part)
                        {
                                // Trim first: browsers often send a space after each comma
                                $new_langs[] = substr(trim($part), 0, 2);
                        }
                }

                // Lowest priority goes to default language
                $new_langs[] = 'nl';

                // Now loop through the new languages and pick out the first valid one
                foreach(array_unique($new_langs) as $new_lang)
                {
                        $new_lang = strtolower($new_lang);

                        if (array_key_exists($new_lang, $locales))
                        {
                                $lang = $new_lang;
                                break;
                        }
                }

                // Redirect to URL with valid language
                url::redirect($lang.substr(url::current(), 2));
        }

        // Store locale config values
        Config::set('locale.lang', $lang);
        Config::set('locale.language', $locales[$lang]);

        // Overwrite setlocale which has already been set before in Kohana::setup()
        setlocale(LC_ALL, Config::item('locale.language').'.UTF-8');

        // Finally set a language cookie for 6 months
        cookie::set('lang', $lang, 15768000);
}
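
The fallback order used by the hook can be tried outside Kohana. The following is a minimal sketch, not part of the hook: pick_language and its parameters are hypothetical names, and the cookie value and header are passed in as plain strings. Like the hook, it ignores the q weights in the Accept-Language header and simply keeps the order in which the browser lists the languages.

```php
<?php
// Hypothetical standalone version of the hook's fallback logic.
// $allowed is a list of two-letter codes, e.g. array('nl', 'en').
function pick_language($allowed, $cookie, $accept_header, $default)
{
    // Candidates in order of precedence: cookie, header, default
    $candidates = array((string) $cookie);

    foreach (explode(',', (string) $accept_header) as $part)
    {
        // " en-US;q=0.8" becomes "en"
        $candidates[] = substr(trim($part), 0, 2);
    }

    $candidates[] = $default;

    // Return the first candidate that is an allowed language
    foreach (array_unique($candidates) as $candidate)
    {
        $candidate = strtolower($candidate);

        if (in_array($candidate, $allowed, TRUE))
        {
            return $candidate;
        }
    }

    return $default;
}

$allowed = array('nl', 'en', 'fr', 'de');

echo pick_language($allowed, '', 'ja, FR;q=0.5', 'nl'), "\n";   // fr
echo pick_language($allowed, 'de', 'en-US,en;q=0.9', 'nl'), "\n"; // de
echo pick_language($allowed, '', '', 'nl'), "\n";               // nl
```

The second call shows the cookie winning over the header, which is what lets a returning visitor keep a language that differs from the browser setting.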
 

This hook works in combination with the locale config file to which I added two items: allowed_locales and lang.

application/config/locale.php

<?php

$config = array
(
        // Array of locales your site is available in
        'allowed_locales' => array
        (
                'nl' => 'nl_NL',
                'en' => 'en_US',
                'fr' => 'fr_FR',
                'de' => 'de_DE',
        ),

        // Long version of language (name of i18n folder)
        'language'        => 'nl_NL',

        // Short version of language (for use in URLs)
        'lang'            => 'nl',
);
 

Note that from this point on, Kohana::lang() will pull text from the i18n folder of the locale that matches the language in the URL (e.g. i18n/nl_NL). The current language is also available via Config::item('locale.lang').
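
As an illustration, a translation file could look like the sketch below. It assumes the Kohana 2.x convention that each i18n file defines a $lang array; the file name site.php and the welcome key are made up for this example.

```php
<?php
// application/i18n/nl_NL/site.php (hypothetical file name)
// A matching application/i18n/en_US/site.php would hold the English text.

$lang = array
(
        'welcome' => 'Welkom op onze website',
);

// In a controller or view, the same call then returns the Dutch text on
// example.com/nl/... and the English text on example.com/en/...:
//   echo Kohana::lang('site.welcome');
```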

3. Catch-all route

application/config/routes.php

<?php

// Collision check
isset($lang) and die('Variable collision in '.__FILE__);

// Regex part for URL language
$lang = '[a-zA-Z]{2}';

$config = array
(
        // '_default' => 'home',
        $lang => 'home',

        // Catch-all language route
        $lang.'/(.*)' => '$1',
);

// Clean up
unset($lang);
 

Because of these routes you can put all your controllers straight into the application/controllers folder. No need to create subfolders like application/controllers/en for every language.
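
To make the effect of the two routes concrete, here is a rough standalone sketch. It assumes that the router treats each key as a regex anchored at both ends and tries the routes in order; resolve_route is a hypothetical function written for this example, not Kohana's own router code.

```php
<?php
// Hypothetical stand-in for the router: the first route whose anchored
// regex matches the URI rewrites it, otherwise the URI is left alone.
function resolve_route($uri, $routes)
{
    foreach ($routes as $pattern => $target)
    {
        $regex = '#^'.$pattern.'$#';

        if (preg_match($regex, $uri))
        {
            // '$1' in the target is filled in by preg_replace
            return preg_replace($regex, $target, $uri);
        }
    }

    return $uri;
}

$lang_regex = '[a-zA-Z]{2}';

$routes = array
(
    $lang_regex        => 'home',
    $lang_regex.'/(.*)' => '$1',
);

echo resolve_route('en', $routes), "\n";                 // home
echo resolve_route('fr/products/view/3', $routes), "\n"; // products/view/3
```

The language segment is stripped before a controller is chosen, so 'fr/products/view/3' and 'en/products/view/3' both end up at the same products controller.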

4. A url_lang helper

Finally, a simple helper makes it a bit easier to link to pages on your site. Calling url_lang::site('aboutus') creates a URL with the current language automatically prepended (e.g. example.com/fr/aboutus).

application/helpers/url_lang.php

<?php defined('SYSPATH') or die('No direct script access.');
/*
 * Class: url_lang
 *  URL language helper class.
 */

class url_lang {

        /*
         * Method: site
         *  Creates a site URL based on the given URI string and
         *  automatically prepends the language segment.
         *
         * Parameters:
         *  uri      - URI string
         *  lang     - non-default language
         *  protocol - non-default protocol
         *
         * Returns:
         *  A URL string.
         */

        public static function site($uri = '', $lang = FALSE, $protocol = FALSE)
        {
                if ($lang === FALSE)
                {
                        $lang = Config::item('locale.lang');
                }

                return url::site($lang.'/'.trim($uri, '/'), $protocol);
        }

        /*
         * Method: current
         *
         * Returns:
         *  The current URI string without the lang part
         */

        public static function current()
        {
                return substr(url::current(), 3);
        }

        /*
         * Method: redirect
         *  Sends a page redirect header and
         *  automatically prepends the language segment.
         *
         * Parameters:
         *  uri    - site URI to redirect to (the language segment is
         *           prepended, so do not pass a full URL)
         *  lang   - non-default language
         *  method - HTTP method of redirect
         *
         * Returns:
         *  An HTML anchor, but sends HTTP headers. The anchor should never be seen
         *  by the user, unless their browser does not understand the headers sent.
         */

        public static function redirect($uri = '', $lang = FALSE, $method = '302')
        {
                if ($lang === FALSE)
                {
                        $lang = Config::item('locale.lang');
                }

                return url::redirect($lang.'/'.trim($uri, '/'), $method);
        }

}
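
A typical use of the helper is a language switcher that links to the current page in every available language, i.e. url_lang::site(url_lang::current(), $lang) for each allowed language. The sketch below reproduces only the string handling outside Kohana so it can be run on its own; prepend_lang, strip_lang and language_links are hypothetical names, and the returned values are URIs without the domain that url::site() would add.

```php
<?php
// Same string handling as url_lang::site(), minus the url::site() call
function prepend_lang($uri, $lang)
{
    return $lang.'/'.trim($uri, '/');
}

// Same string handling as url_lang::current(): drop "xx/" from the front
function strip_lang($current_uri)
{
    return (string) substr($current_uri, 3);
}

// Build one URI per allowed language for the page currently shown
function language_links($current_uri, $allowed)
{
    $links = array();

    foreach ($allowed as $lang)
    {
        $links[$lang] = prepend_lang(strip_lang($current_uri), $lang);
    }

    return $links;
}

// Prints the nl, en and fr variants of products/view/3
print_r(language_links('nl/products/view/3', array('nl', 'en', 'fr')));
```

Because strip_lang() always removes exactly three characters, it relies on the guarantee from step one that every URI starts with a two-letter language segment.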
 

Final thoughts

The attentive reader might remark that the .htaccess file is not strictly necessary. That is right: you could move its functionality into the site_lang hook. I prefer the .htaccess approach, though, because it is processed before any PHP code runs. It may save you some regexes in the hook as well.

If you are wondering why I go through the hassle of including the language in every URL, here are some arguments. Number one is SEO: search engines such as Google need a distinct URL for each translation in order to index them all. Also, language-specific URLs allow visitors to bookmark pages in the language of their choice.