Fixing existing data when switching character set in MySQL

by Martin Westin in


When you alter a database, table or field from one character set to another, existing data will probably look garbled, because MySQL now expects it to be in the new character set. "Converting" existing data is not always easy. This method (found in the comments of a blog entry whose URL I have lost) works as long as you can keep the database "locked" while you run it. Otherwise any new data entered in the meantime will be converted as well and will end up garbled.
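The method itself is not reproduced in this excerpt. As a minimal sketch of one commonly used approach (the binary round trip described in the MySQL manual for columns whose bytes are already in the target encoding but are labelled with the wrong character set), the column can be passed through a binary type so that MySQL relabels the bytes without converting them. The names my_table and my_field, the length 255 and the utf8 target are assumptions for illustration, and this may or may not be the method from the lost blog comment.

```sql
-- Assumption: my_field holds utf8 bytes but is declared with another
-- character set (for example latin1), so the text looks garbled.

-- Step 1: switch to a binary type. The stored bytes are kept as they are.
ALTER TABLE my_table MODIFY my_field VARBINARY(255);

-- Step 2: switch back to a text type, declaring the character set
-- the bytes are really in.
ALTER TABLE my_table MODIFY my_field VARCHAR(255)
  CHARACTER SET utf8 COLLATE utf8_swedish_ci;
```

Both statements should run while no other client can write to the table, and it is worth trying them on a copy of the table first.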


Altering character set and collation of tables in MySQL

by Martin Westin in


Altering the character set and collation of a table is sometimes not enough. In some cases you also have to alter the individual fields in the table before MySQL complies. I don't know why or when MySQL does this.

ALTER TABLE my_table DEFAULT CHARSET=utf8 COLLATE=utf8_swedish_ci;

ALTER TABLE my_table MODIFY my_field varchar(255) CHARACTER SET utf8 COLLATE utf8_swedish_ci;
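As far as the MySQL documentation describes it, DEFAULT CHARSET only sets the default for columns added later, which would explain why existing fields keep their old settings. The sketch below reuses the my_table name from the statements above to inspect each column, and then uses the CONVERT TO CHARACTER SET clause as a possible alternative to modifying every field by hand. That clause converts the stored data, so it assumes the data is correctly encoded in its current character set.

```sql
-- List the character set and collation of each column in my_table.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'my_table';

-- Change the table default and convert all character columns in one step.
ALTER TABLE my_table CONVERT TO CHARACTER SET utf8 COLLATE utf8_swedish_ci;
```

Running the SELECT again afterwards should show whether any column was left behind.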

Preferred character set and collation in MySQL

by Martin Westin in


My preferred character set is utf8. One big drawback in many cases is that the default collation ignores accents on characters. The result is that "halla" and "hallå" are interpreted as the same word. This is of course bad for storing general data, such as names of users or products, that may very well contain accented characters that should be respected.

The solution for me is to not use "utf8_general_ci".

For generic data that has no particular language I prefer "utf8_bin". For data that will be sorted and that can be given a "language" I prefer the most appropriate collation, such as "utf8_swedish_ci".
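The difference between these collations can be checked directly with the "halla" / "hallå" pair mentioned above. This is a small sketch, assuming the client connection sends utf8 (or utf8mb4) text; the column aliases are only illustrative.

```sql
-- The _utf8 introducer marks each literal as utf8 so that the
-- utf8 collations named in COLLATE can be applied to it.

-- Accent-insensitive: 'a' and 'å' compare as equal.
SELECT _utf8'halla' = _utf8'hallå' COLLATE utf8_general_ci AS general_ci;  -- expected: 1

-- Binary comparison: any difference in the characters counts.
SELECT _utf8'halla' = _utf8'hallå' COLLATE utf8_bin AS bin;                -- expected: 0

-- Swedish rules: 'å' is a letter of its own, distinct from 'a'.
SELECT _utf8'halla' = _utf8'hallå' COLLATE utf8_swedish_ci AS swedish_ci;  -- expected: 0
```

Note that utf8_bin is also case-sensitive, which affects lookups on values such as names, while utf8_swedish_ci remains case-insensitive.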