Spin Rewriter now recognizes even the weirdest encodings

You might have noticed I picked the "Programming" category for this blog post. The reason for this is quite simple - this post won't be very interesting to 95% of you who aren't programmers. 😃

You see, out of thousands of active users, we receive about 2-3 customer tickets each month saying that Spin Rewriter somehow garbled the original text. We've been looking into this for a while now, and we found out that:

99.5% of all submitted texts are processed normally
texts that begin with the byte sequence EF BB BF are encoded in the standard UTF-8 format (works well with Spin Rewriter)
texts that begin with the byte sequence FE FF are encoded in the UTF-16/UCS-2, big endian format (some issues)
texts that begin with the byte sequence FF FE are encoded in the UTF-16/UCS-2, little endian format (some issues)
texts that begin with the byte sequence FF FE 00 00 are encoded in the UTF-32/UCS-4, little endian format (sporadic issues)
texts that begin with the byte sequence 00 00 FE FF are encoded in the UTF-32/UCS-4, big endian format (sporadic issues)
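
For the programmers among you, here is a rough idea of how these leading bytes (the "byte order mark", or BOM) can be checked. This is only a minimal Python sketch, not Spin Rewriter's actual code: the names detect_bom and decode_submission are made up for illustration, and falling back to UTF-8 when no BOM is present is an assumption.

```python
import codecs

# Check the 4-byte marks first: the UTF-32 LE mark (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def detect_bom(raw: bytes):
    """Return (encoding name, BOM length), or (None, 0) if no BOM is found."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_submission(raw: bytes, default: str = "utf-8") -> str:
    """Strip the BOM, if any, and decode the rest (hypothetical helper)."""
    name, skip = detect_bom(raw)
    # Assumption: text without a BOM is treated as UTF-8.
    return raw[skip:].decode(name or default, errors="replace")
```

The order of the checks matters: if FF FE were tested before FF FE 00 00, every UTF-32 little endian text would be mistaken for UTF-16. The flip side is that a UTF-16 little endian text whose first real character is U+0000 would be read as UTF-32, which should be rare in practice.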

For instance, if our user entered "It ?s nev?r a ?onven?ent tim? t? h?v? ?our v?hicle qu?t ?n ??u." in the UTF-16/UCS-2, little endian format, Step 2 of the spinning process appeared fine, but Step 3 showed this: "It
