
August 11, 2006

NewsFire OPML Exports

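If you use NewsFire to export OPML, you may run into some trouble because there are two things wrong with the export: the declared character encoding does not match the actual one, and a required element is missing. To start with the encoding, even though the XML declaration indicates that the character encoding is ISO-8859-1: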


<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Generated by NewsFire 67 -->
<!-- http://www.NewsFireRSS.com/ -->
<opml version="1.1">
...

the file is, in fact, encoded as UTF-16 (or some other two-byte encoding). You can see this in the output of od -a on the OPML file:


0000000   ff  fe   < nul   ? nul   x nul   m nul   l nul  sp nul   v nul
0000020    e nul   r nul   s nul   i nul   o nul   n nul   = nul   " nul
0000040    1 nul   . nul   0 nul   " nul  sp nul   e nul   n nul   c nul

The two bytes "ff fe" at the start are the BOM, or byte order mark, of the file, and in that order they indicate little-endian UTF-16. That's the first clue that this is a two-byte encoding. Next, you'll see that every other byte is a NUL. That's because UTF-16 stores each ASCII character as a two-byte unit whose high byte is zero. Anyway, all of this doesn't mean much other than that this file is not, as the header claims, ISO-8859-1, which is a one-byte encoding. To fix it, make use of the lovely utility "iconv", which comes standard on most Unixes (and the Mac). "-f" means "from this encoding" and "-t" means "to this encoding".


stechert@kirin:~/Desktop [1040] $ iconv -f UTF-16 -t UTF-8 My\ NewsFire\ Feeds.opml > My\ NewsFire\ Feeds2.opml
stechert@kirin:~/Desktop [1041] $ od -a My\ NewsFire\ Feeds2.opml
0000000    <   ?   x   m   l  sp   v   e   r   s   i   o   n   =   "   1
0000020    .   0   "  sp   e   n   c   o   d   i   n   g   =   "   I   S
0000040    O   -   8   8   5   9   -   1   "   ?   >  nl   <   !   -   -
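
If you'd rather not read od output by eye, the BOM check can be scripted. The following is only a minimal sketch: the function name utf16_bom is made up for this example, and it looks at nothing but the first two bytes, so a UTF-16 file written without a BOM would be reported as "none".

```shell
# utf16_bom is a hypothetical helper name, not part of iconv or NewsFire.
# Print which UTF-16 byte order mark, if any, starts the given file.
utf16_bom() {
    # Dump the first two bytes as hex and strip od's spacing, e.g. "fffe".
    bom=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    case "$bom" in
        fffe) echo "UTF-16LE" ;;   # ff fe: little-endian, as in the export above
        feff) echo "UTF-16BE" ;;   # fe ff: big-endian
        *)    echo "none" ;;       # no UTF-16 BOM found
    esac
}
```

Run against the original export (utf16_bom My\ NewsFire\ Feeds.opml), this should report UTF-16LE, given the "ff fe" bytes shown earlier.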

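As the od output shows, the converted file still declares ISO-8859-1 in its header, which is still wrong since the bytes are now UTF-8. Changing the indicated encoding by replacing the string "ISO-8859-1" in the header with "UTF-8" gets us a properly encoded XML file.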
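
A text editor is enough for that replacement, but it can also be done with sed. This is a sketch under the assumption that the XML declaration sits on the first line of the converted file, as it does in the output above; fix_declaration is a made-up name for this example.

```shell
# fix_declaration is a hypothetical helper name.
# Print the file with the encoding named on line 1 changed to UTF-8.
# Only the first line is touched, so the same string appearing later
# in a feed title or URL is left alone.
fix_declaration() {
    sed '1s/encoding="ISO-8859-1"/encoding="UTF-8"/' "$1"
}
```

For example: fix_declaration My\ NewsFire\ Feeds2.opml > My\ NewsFire\ Feeds3.opml, where the third filename is just a placeholder for wherever you want the result to go.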

Now the only remaining problem is that OPML requires a head element inside the opml element, before the body, so go add that. An empty one will do. The head of your file should now look like this:


<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated by NewsFire 67 -->
<!-- http://www.NewsFireRSS.com/ -->
<opml version="1.1">
        <head/>
        <body>
...
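
Adding the head element by hand is easy enough, but here is a sketch of how it might be scripted. It assumes the opening opml tag is on a line of its own, as in the listing above, and add_head is a made-up name for this example.

```shell
# add_head is a hypothetical helper name.
# Print the file with an empty <head/> inserted right after the line
# holding the opening <opml ...> tag, unless a head element already exists.
add_head() {
    if grep -q '<head' "$1"; then
        cat "$1"    # already has a head element; pass through unchanged
    else
        # Echo every line; after the first line with the opml tag, add the head.
        awk '{ print } /<opml[ >]/ && !done { print "        <head/>"; done = 1 }' "$1"
    fi
}
```

Like the other steps, it writes to standard output, so redirect it to a new file rather than over the one being read.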

Having come this far, your OPML file should now validate and you could, e.g., use it to upload a news filter to TailRank, if you wanted to give it a try (instead of getting error messages about how brokenly formatted your OPML is). It's annoying that David Watanabe gets this wrong. And it was a missed opportunity-to-impress that the burtonator's code didn't handle this kind of stuff automatically. But then again, neither does Google Reader or Bloglines.
