My preferred character set is utf8. One big drawback in many cases is that the default collation, "utf8_general_ci", ignores accents on characters. As a result, "halla" and "hallå" are treated as the same word. This is of course bad for storing general data such as the names of users or products, which may very well contain accented characters that should be respected.
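To see the effect, the same comparison can be run under different collations. This is a minimal sketch, assuming a MySQL server and a client session that sends its text as UTF-8 (hence the SET NAMES line, which is my addition and not part of the original setup):

```sql
-- Assumption: the client sends UTF-8, so tell the server to expect it.
SET NAMES utf8;

-- Accent-insensitive: 'a' and 'å' compare as equal, so this should return 1.
SELECT 'halla' = 'hallå' COLLATE utf8_general_ci AS general_ci;

-- Byte-wise comparison: the strings differ, so this should return 0.
SELECT 'halla' = 'hallå' COLLATE utf8_bin AS bin;

-- Swedish rules treat 'å' as its own letter, so this should also return 0.
SELECT 'halla' = 'hallå' COLLATE utf8_swedish_ci AS swedish_ci;
```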
The solution for me is to not use "utf8_general_ci".
For generic data that has no particular language I prefer "utf8_bin". For data that will be sorted and that can be given a "language" I prefer the most appropriate collation, such as "utf8_swedish_ci".
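In practice this choice can be made per column. The following is only a sketch: the table and column names (product, sku, name) are hypothetical and chosen for illustration. Keep in mind that "utf8_bin" compares raw bytes, so it is also case-sensitive.

```sql
-- Hypothetical table: a language-neutral identifier and a Swedish display name.
CREATE TABLE product (
  sku  VARCHAR(64)  CHARACTER SET utf8 COLLATE utf8_bin        NOT NULL,
  name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_swedish_ci NOT NULL
);

-- Sorting uses the column's collation, here Swedish alphabetical order.
SELECT name FROM product ORDER BY name;

-- A collation can also be overridden for a single query.
SELECT name FROM product ORDER BY name COLLATE utf8_bin;
```

Setting the collation on the column rather than on the whole database keeps the decision close to the data it applies to, which helps when one table mixes language-neutral and language-specific text.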