You might have noticed I picked the "Programming" category for this blog post. The reason is quite simple: this post won't be very interesting to the 95% of you who aren't programmers. 😃

You see, out of thousands of active users, we have received about 2-3 customer tickets each month saying that Spin Rewriter somehow garbled up the original text. We've been looking into this for a while now, and we found out that:
  • 99.5% of all submitted texts are processed normally
  • texts that begin with the byte sequence EF BB BF (the byte order mark, or BOM) are encoded in the standard UTF-8 format (works well with Spin Rewriter)
  • texts that begin with the byte sequence FE FF are encoded in the UTF-16/UCS-2, big endian format (some issues)
  • texts that begin with the byte sequence FF FE are encoded in the UTF-16/UCS-2, little endian format (some issues)
  • texts that begin with the byte sequence FF FE 00 00 are encoded in the UTF-32/UCS-4, little endian format (sporadic issues)
  • texts that begin with the byte sequence 00 00 FE FF are encoded in the UTF-32/UCS-4, big endian format (sporadic issues)
For instance, if a user entered "It ?s nev?r a ?onven?ent tim? t? h?v? ?our v?hicle qu?t ?n ??u." in the UTF-16/UCS-2, little endian format, Step 2 of the spinning process appeared fine. Step 3, however, showed this: "It ?s nevеr a ?onven?ent timе tо hаvе ?our vеhicle qu?t оn ?оu."
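
For the programmers among you, here is a minimal Python sketch of how text could be normalized based on its leading byte order mark. This is an illustration only, not Spin Rewriter's actual code: the function name `decode_submitted_text` and the UTF-8 fallback for texts without a BOM are assumptions made for the example.

```python
import codecs

# Known byte order marks, longest first, so that UTF-32 LE (FF FE 00 00)
# is checked before UTF-16 LE (FF FE), which shares the same first two bytes.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def decode_submitted_text(raw: bytes, default: str = "utf-8") -> str:
    """Hypothetical helper: strip a leading BOM and decode accordingly.

    Texts without a BOM are decoded with `default` (an assumption).
    """
    for bom, encoding in BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(encoding)
    return raw.decode(default)


if __name__ == "__main__":
    sample = codecs.BOM_UTF16_LE + "never a convenient time".encode("utf-16-le")
    print(decode_submitted_text(sample))
```

Once every submitted text has been turned into one internal representation like this, the later processing steps no longer have to care which encoding the user's editor happened to save in. One caveat with this simple approach: a UTF-16 little endian text whose first character is U+0000 would be mistaken for UTF-32, which is a known ambiguity of BOM sniffing.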

We have now resolved all these issues and Spin Rewriter will process all articles that you can throw at it. 😃
