Fixing existing data when switching character set in MySQL

by Martin Westin in


When altering a database, table or field from one character set to another, existing data will probably look garbled since it is expected to be in the new character set. To "convert" existing data is not always easy. This method (found in the comments of a blog entry I have lost the url for) works as long as you can keep the database "locked" when you do this. Otherwise you will convert any new data entered and make that data look garbled. If your content is latin1 but should be read as utf8 you would do the following to convert existing data.

ALTER TABLE my_table CHANGE my_field my_field TEXT CHARACTER SET latin1;

ALTER TABLE my_table CHANGE my_field my_field BLOB;

ALTER TABLE my_table CHANGE my_field my_field TEXT CHARACTER SET utf8;

The important bit is the use of the BLOB datatype. Since it is a binary datatype it has no character set. The datatype you start and end with can be any char, varchar, text or similar datatype.

A full test case to play with:

# case where data was enterd correctly as latin1 and later needs conversion to utf8

SET NAMES latin1;

drop table people;

CREATE TABLE people (
  id int(11) unsigned NOT NULL auto_increment,
  name varchar(64) default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO people VALUES 
  (1,'user_a'),
  (2,'user_å'),
  (3,'user_ä');

select * from people;

SET NAMES utf8;
ALTER TABLE people engine=MyISAM default character set utf8;
ALTER TABLE people CHANGE name name BLOB;
ALTER TABLE people CHANGE name name varchar(64) CHARACTER SET utf8;

select * from people;
# case where utf8 data was enterd into a table with latin1 and later needs conversion to utf8

SET NAMES latin1;

drop table people;

CREATE TABLE people (
  id int(11) unsigned NOT NULL auto_increment,
  name varchar(64) default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

SET NAMES utf8;

INSERT INTO people VALUES 
  (1,'user_a'),
  (2,'user_å'),
  (3,'user_ä');

select * from people;

ALTER TABLE people engine=MyISAM default character set utf8;
ALTER TABLE people CHANGE name name BLOB;
ALTER TABLE people CHANGE name name varchar(64) CHARACTER SET latin1;
ALTER TABLE people CHANGE name name varchar(64) CHARACTER SET utf8;

select * from people;