The scene: a document of pharmaceutical data keeps displaying a capital A with circumflex (Â) after each major drug name but before a generated trademark sign.
The problem: the character meant something in the original data, but what? You can never afford to ignore strange characters in mission-critical data: they may be significant in themselves, or they may expose an underlying transcoding problem you were not aware of. The encoding information for the original 8-bit data has been lost.
The approach: First we look at the text and figure out what it could be: a non-breaking space or zero-width non-joiner (ZWNJ) is typographically likely, and becomes the working theory.
Â is the Unicode character U+00C2. However, characters in the Latin-1 block often appear as artifacts when bytes from some other 8-bit character set are read by a system using the default encodings of most PCs (CP-1252, ISO 8859-1). So we look up what the byte 0xC2 represents in common character sets, using the handy mapping tables at the Unicode Consortium.
We start with MacRoman. Data from around 1985 to 1995 could have used that encoding. But in MacRoman 0xC2 is the not sign (¬). No good.
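The single-byte lookups described so far can be reproduced with Python's built-in codecs. This is only a minimal sketch: the candidate list and the helper name `interpretations` are assumptions chosen for illustration, not part of the original investigation.

```python
import unicodedata

# The suspect byte seen in the data.
SUSPECT = bytes([0xC2])

# Candidate legacy encodings (an assumed shortlist; extend as needed).
# These are standard Python codec names.
CANDIDATES = ["latin-1", "cp1252", "mac_roman"]


def interpretations(raw, codecs=CANDIDATES):
    """Decode the same bytes under each codec and describe the result."""
    results = {}
    for codec in codecs:
        char = raw.decode(codec)
        results[codec] = (char, f"U+{ord(char):04X}", unicodedata.name(char))
    return results


for codec, (char, code_point, name) in interpretations(SUSPECT).items():
    print(f"{codec:10} {code_point}  {name}  {char}")
```

Both PC encodings give Â, while MacRoman gives the not sign, which matches the table lookups.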
If it isn't a Mac issue, the next most likely cause is Adobe-related, since Adobe products are also very popular in publishing. In Dingbats, 0xC2 is a circled digit 3. In Standard Encoding, 0xC2 is an acute accent. In the Symbol font, 0xC2 is a bold Fraktur R. Aha... probably not a spacing issue at all.
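Python ships no codecs for the Adobe encodings, so the comparison above can only be sketched with a small hand-entered table. The dictionary below is an assumption built for illustration: it records the three glyphs named above as their usual Unicode equivalents, and the names `ADOBE_0XC2` and `describe` are hypothetical.

```python
import unicodedata

# What byte 0xC2 stands for in three Adobe encodings, entered by hand
# from the published mapping tables (not produced by a real codec).
ADOBE_0XC2 = {
    "ZapfDingbats": "\u2782",      # circled digit three
    "StandardEncoding": "\u00b4",  # acute accent
    "Symbol": "\u211c",            # Fraktur (black-letter) capital R
}


def describe(table):
    """Return 'U+XXXX NAME' for each encoding's reading of the byte."""
    return {
        encoding: f"U+{ord(char):04X} {unicodedata.name(char)}"
        for encoding, char in table.items()
    }


for encoding, description in describe(ADOBE_0XC2).items():
    print(f"{encoding:17} {description}")
```

Only the Symbol reading yields a letter R, which is the one candidate that plausibly belongs next to a drug name.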
The solution: A bold Fraktur R was sometimes being used as the registered trademark symbol. The programmer confirmed that the current transformations didn't cope with this, but that they would be transferred. So it looks like the circled R ® (U+00AE) should be used.