While working on a recent project, I came across a problem with the charset of a web page. The site has content in Russian, and the ISO-8859-1 charset cannot handle it because it contains no Cyrillic characters.
Old charset
Whenever I have to build a new web site, I use my own custom-built CMS. The problem is that in its default template I had hard-coded the page charset to iso-8859-1 using the <meta> tag. I did this at the very beginning of the development of my web site framework.
Improvement
It was only recently, when I read Anne's post about UTF-8, that I realized the importance of this charset: it can encode every character in Unicode, which covers practically all writing systems in use today, and it leaves room for extension. For more details about UTF-8 and its effect, read Anne's post. But since I had never had to build a site with content in a language that needed this charset, I had silently copied the charset declaration from one site template to the next.
When I got to this project (with Russian content), I ran into trouble because the characters were messed up in the browser. They would not display as they should. Once I remembered the charset issue and replaced the charset on the site, the content displayed correctly.
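To see why the text looked garbled, it may help to look at the bytes. The following is only a small sketch, not the site's actual code. It assumes PHP 7 or later with the mbstring extension, and the sample word and variable names are purely illustrative. It takes a Russian word encoded as UTF-8 and reinterprets its bytes as ISO-8859-1, which is roughly what a browser does when the page declares the wrong charset.

```php
<?php
// "Привет" written with Unicode escapes so the example does not depend
// on the encoding of this source file (illustrative sample word).
$word = "\u{041F}\u{0440}\u{0438}\u{0432}\u{0435}\u{0442}";

// In UTF-8 every Cyrillic letter in this word takes two bytes.
$byteCount = strlen($word);              // counts bytes
$charCount = mb_strlen($word, 'UTF-8');  // counts characters

// Treat each byte as one ISO-8859-1 character, as happens when the page
// is declared as iso-8859-1 but actually contains UTF-8 data.
$misread = mb_convert_encoding($word, 'UTF-8', 'ISO-8859-1');
$misreadCount = mb_strlen($misread, 'UTF-8');

printf(
    "%d characters, %d bytes, %d characters when misread\n",
    $charCount,
    $byteCount,
    $misreadCount
);
```

Each Cyrillic letter ends up displayed as two unrelated characters, which matches the kind of mess described above.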
I also stopped using the <meta> tag to declare the page's charset. Since the site is dynamic and generated by PHP, I use PHP's header function to send the charset to the browser in the HTTP header:
header('Content-Type: text/html; charset=utf-8');
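One thing to watch for is that header() only works before any output has been sent to the browser. A minimal sketch of a guard around the call might look like this, assuming PHP 7 or later. The helper names contentTypeHeader and sendCharsetHeader are hypothetical and not part of my CMS.

```php
<?php
// Hypothetical helper: builds the header line for a given charset.
function contentTypeHeader(string $charset = 'utf-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Hypothetical helper: sends the header only if that is still possible.
// Returns false when output has already started and headers are locked.
function sendCharsetHeader(string $charset = 'utf-8'): bool
{
    if (headers_sent()) {
        return false;
    }
    header(contentTypeHeader($charset));
    return true;
}

// Call this before any echo, print, or HTML outside the PHP tags.
sendCharsetHeader();
```

Keeping the string building separate from the sending makes it easy to check the header line on its own.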
Update: It seems that just changing the charset from ISO-8859-1 to UTF-8 does not quite solve it. Additionally, in several places I had to use the utf8_encode function, which converts an ISO-8859-1 string to UTF-8, on the data retrieved from the database before sending it to the browser.
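Note that utf8_encode only converts from ISO-8859-1 and is deprecated as of PHP 8.2. On newer PHP versions the same step could be sketched with mb_convert_encoding, as below. This assumes the mbstring extension is available and that the database returns ISO-8859-1 strings. The helper name toUtf8 and the sample row are hypothetical.

```php
<?php
// Hypothetical helper: converts a value from the database to UTF-8.
// The default source charset is an assumption about how the data is stored.
function toUtf8(string $value, string $sourceCharset = 'ISO-8859-1'): string
{
    // Heuristic: if the bytes already form valid UTF-8, leave them alone
    // so that data is not converted twice. This can misjudge rare inputs.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', $sourceCharset);
}

// Hypothetical database row holding "café" as ISO-8859-1 bytes.
$row = ['title' => "caf\xE9"];

// Convert every field before it is written to the page.
$row = array_map('toUtf8', $row);
```

Converting at one place, right after reading from the database, avoids scattering conversion calls across the templates.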