I've been struggling with this question for quite a few months now, but I haven't been in a situation where I needed to explore all possible options before. Right now, I feel like it's time to get to know the possibilities and settle on my own personal preference to use in my upcoming projects.
Let me first sketch the situation I'm looking for
I'm about to upgrade/redevelop a content management system which I've been using for quite a while now. However, I feel multi-language support would be a great improvement to this system. Before, I did not use any frameworks, but I'm going to use Laravel 4 for the upcoming project. Laravel seems the best choice for a cleaner way to code PHP.
Sidenote: Laravel 4 should be no factor in your answer. I'm looking for general ways of translation that are platform/framework independent.
What should be translated
As the system I am looking for needs to be as user-friendly as possible, the method of managing the translations should be inside the CMS. There should be no need to start up an FTP connection to modify translation files or any HTML/PHP-parsed templates.
Furthermore, I'm looking for the easiest way to translate multiple database tables perhaps without the need of making additional tables.
What did I come up with myself
I've been searching, reading and trying things myself already, and there are a couple of options I have. But I still don't feel like I've reached a best-practice method for what I am really seeking. Right now, this is what I've come up with, but this method also has its side effects.
- PHP Parsed Templates: the template system should be parsed by PHP. This way I'm able to insert the translated parameters into the HTML without having to open the templates and modify them. Besides that, PHP-parsed templates give me the ability to have one template for the complete website instead of having a subfolder for each language (which I've had before). The method to reach this target can be either Smarty, TemplatePower, Laravel's Blade or any other template parser. As I said, the solution should be independent of the template parser used.
- Database Driven: perhaps I don't need to mention this again, but the solution should be database driven. The CMS is aimed to be object oriented and MVC, so I would need to think of a logical data structure for the strings. As my templates would be structured as `templates/Controller/View.php`, perhaps this structure would make the most sense: `Controller.View.parameter`. The database table would have these fields along with a `value` field. Inside the templates we could use some sort of method like `echo __('Controller.View.welcome', array('name' => 'Joshua'))`, where the parameter contains `Welcome, :name`, thus the result being `Welcome, Joshua`. This seems a good way to do this, because parameters such as `:name` are easy for the editor to understand.
- Low Database Load: Of course the above system would cause a lot of database load if these strings are loaded on the go. Therefore I would need a caching system that re-renders the language files as soon as they are edited/saved in the administration environment. Because files are generated, a good file system layout is also needed. I guess we can go with `languages/en_EN/Controller/View.php` or .ini, whatever suits you best. Perhaps an .ini is even parsed quicker in the end. This file should contain the data in the format `parameter=value;`. I guess this is the best way of doing this, since each View that is rendered can include its own language file if it exists. Language parameters should then be loaded for a specific view and not in a global scope, to prevent parameters from overwriting each other.
- Database Table translation: this in fact is the thing I'm most worried about. I'm looking for a way to create translations of News/Pages/etc. as quickly as possible. Having two tables for each module (for example `News` and `News_translations`) is an option, but it feels like too much work to get a good system. One of the things I came up with is based on a data versioning system I wrote: there is one database table named `Translations`, and this table is keyed on a unique combination of language, table name and primary key. For instance: en_EN / News / 1 (referring to the English version of the News item with ID=1). But there are two huge disadvantages to this method: first of all, this table tends to get pretty long with a lot of data in the database, and secondly, it would be a hell of a job to use this setup to search the table. E.g. searching for the SEO slug of the item would be a full-text search, which is pretty dumb. On the other hand, it's a very quick way to create translatable content in every table, but I don't believe this pro outweighs the cons.
- Front-end Work: The front-end would also need some thinking. Of course we would store the available languages in a database and (de)activate the ones we need. This way the script can generate a dropdown to select a language, and the back-end can decide automatically what translations can be made using the CMS. The chosen language (e.g. en_EN) would then be used when getting the language file for a view or to get the right translation for a content item on the website.
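To make the string-translation part of the list above more concrete (dot-separated keys, `:name` placeholders, and a per-view language file that is regenerated on save), here is a minimal sketch. The function names `write_language_file`, `load_language_file` and `translate`, the temp-directory path, and the choice of a PHP array file written with `var_export` are all my own assumptions for illustration, not an existing API or a settled design.

```php
<?php
// Hypothetical helpers; names and file layout are assumptions for illustration.

// Regenerate the cached language file for one view, e.g. right after an
// editor saves a string in the CMS. $strings maps parameter => value,
// as it would come out of the database.
function write_language_file($baseDir, $locale, $controller, $view, array $strings)
{
    $dir = $baseDir . '/' . $locale . '/' . $controller;
    if (!is_dir($dir)) {
        mkdir($dir, 0777, true);
    }
    $file = $dir . '/' . $view . '.php';
    // A PHP file that returns an array can be loaded with a plain include.
    file_put_contents($file, '<?php return ' . var_export($strings, true) . ';', LOCK_EX);
    return $file;
}

// Load the strings for one view; returns an empty array if no file exists yet.
function load_language_file($baseDir, $locale, $controller, $view)
{
    $file = $baseDir . '/' . $locale . '/' . $controller . '/' . $view . '.php';
    if (!is_file($file)) {
        return array();
    }
    return include $file;
}

// Look up 'Controller.View.parameter' and fill in the :placeholders.
// Assumes the key always has exactly these three dot-separated parts.
function translate($baseDir, $locale, $key, array $params = array())
{
    list($controller, $view, $parameter) = explode('.', $key, 3);
    $strings = load_language_file($baseDir, $locale, $controller, $view);
    // If the string is unknown, show the key so the gap is visible.
    $text = isset($strings[$parameter]) ? $strings[$parameter] : $key;
    $replace = array();
    foreach ($params as $name => $value) {
        $replace[':' . $name] = $value;
    }
    return strtr($text, $replace);
}

$baseDir = sys_get_temp_dir() . '/languages_demo';
write_language_file($baseDir, 'en_EN', 'Controller', 'View', array('welcome' => 'Welcome, :name'));
echo translate($baseDir, 'en_EN', 'Controller.View.welcome', array('name' => 'Joshua')), "\n";
```

In this sketch the strings are scoped to one view because each lookup only reads that view's file, which is one way to avoid parameters overwriting each other in a global scope.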
So, there they are. My ideas so far. They don't even include localization options for dates etc. yet, but as my server supports PHP 5.3.2+ the best option is to use the intl extension as explained here: http://devzone.zend.com/1500/internationalization-in-php-53/ - but this would only be of use at a later stage of development. For now the main issue is how to apply best practices for translating the content of a website.
Besides everything I explained here, there is still another thing which I haven't decided yet. It looks like a simple question, but in fact it's been giving me headaches:
URL Translation? Should we do this or not? and in what way?
So, if I have this URL: `http://www.domain.com/about-us` and English is my default language, should this URL be translated into `http://www.domain.com/over-ons` when I choose Dutch as my language? Or should we go the easy road and simply change the content of the page visible at `/about`? The latter doesn't seem a valid option, because that would generate multiple versions of the same URL, so the content would not be indexed the right way.
Another option is using `http://www.domain.com/nl/about-us` instead. This generates at least a unique URL for each piece of content. It would also make it easier to go to another language, for example `http://www.domain.com/en/about-us`, and the URL provided is easier to understand for both Google and human visitors. Using this option, what do we do with the default language? Should the default language drop the language identifier from the URL? So redirecting to `http://www.domain.com/about-us` ... In my eyes this is the best solution, because when the CMS is set up for only one language there is no need to have this language identification in the URL.
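To illustrate what the prefix option would mean in code, here is a rough sketch of splitting a request path into a language and a slug, with the default language losing its prefix via a redirect. The function name `resolve_language`, the returned array layout, and the short codes `en` and `nl` are assumptions I made up for this example.

```php
<?php
// Hypothetical helper: split a request path into a language code and the
// remaining slug. $available and $default would come from the CMS language
// settings (assumed here).
function resolve_language($path, array $available, $default)
{
    $segments = explode('/', trim($path, '/'));
    if ($segments[0] !== '' && in_array($segments[0], $available, true)) {
        $language = array_shift($segments);
        $slug = implode('/', $segments);
        return array(
            'language' => $language,
            'slug'     => $slug,
            // The default language should not keep its prefix in this
            // scheme, so signal a redirect to the prefix-less URL.
            'redirect' => $language === $default ? '/' . $slug : null,
        );
    }
    // No known language prefix: treat the whole path as a default-language slug.
    return array(
        'language' => $default,
        'slug'     => implode('/', $segments),
        'redirect' => null,
    );
}

$available = array('en', 'nl');
print_r(resolve_language('/nl/about-us', $available, 'en'));
print_r(resolve_language('/en/about-us', $available, 'en'));
print_r(resolve_language('/about-us', $available, 'en'));
```

With these inputs, only the `/en/about-us` request comes back with a redirect target (`/about-us`); the other two resolve directly to Dutch and English respectively. One thing this sketch does not handle is a page whose slug happens to equal a language code.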
And a third option is a combination of both options: using the "language-identification-less" URL (`http://www.domain.com/about-us`) for the main language, and a URL with a translated SEO slug for sublanguages:
I hope my question gets your heads cracking, it cracked mine for sure! It already helped me to work things out by writing them down as a question here. It gave me the chance to review the methods I've used before and the ideas I have for my upcoming CMS.
I would like to thank you already for taking the time to read this bunch of text!
// Edit #1:
I forgot to mention: the `__()` function is an alias to translate a given string. Within this method there obviously should be some sort of fallback where the default text is loaded when no translation is available yet. If the translation is missing, it should either be inserted or the translation file should be regenerated.
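The fallback I have in mind could look roughly like this. It is only a sketch: the function name `translate_with_fallback`, the in-memory `$catalog` layout, the `nl_NL` locale and the `$missing` list are assumptions for illustration; in the real CMS the missing keys would be inserted into the database or trigger a regeneration of the language file.

```php
<?php
// Hypothetical sketch of a fallback lookup; the array layout and names are
// assumptions. $catalog[locale][key] holds strings already loaded from cache.
function translate_with_fallback(array $catalog, $locale, $defaultLocale, $key, array &$missing)
{
    if (isset($catalog[$locale][$key])) {
        return $catalog[$locale][$key];
    }
    // Remember the gap so the CMS can insert the row or regenerate the file later.
    $missing[] = $locale . ':' . $key;
    if (isset($catalog[$defaultLocale][$key])) {
        // Fall back to the text of the default language.
        return $catalog[$defaultLocale][$key];
    }
    // Last resort: show the key itself so the gap is visible to the editor.
    return $key;
}

$catalog = array(
    'en_EN' => array('Controller.View.welcome' => 'Welcome, :name'),
    'nl_NL' => array(), // no Dutch strings yet
);
$missing = array();
echo translate_with_fallback($catalog, 'nl_NL', 'en_EN', 'Controller.View.welcome', $missing), "\n";
print_r($missing);
```

Here the Dutch lookup returns the English text and records `nl_NL:Controller.View.welcome` as missing, so the editor can be prompted to translate it.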