Here I describe how I converted my old website (www.flexman.homeip.net), which used SnipSnap, to an offline copy (see here) and to DokuWiki (this page).

Making an offline copy of SnipSnap

Of course the tool of choice is wget. However, it is not entirely trivial, because SnipSnap uses the same name for a page and for the folder holding its subpages. The hierarchical structure is similar to that of DokuWiki, and access is easy because the URL contains only the name of the page. If you use wget right away it will fail, because it cannot create a folder and a file with the same name. Fortunately wget comes with the appropriate option, --html-extension, which appends an .html ending to all pages (newer wget releases call this option --adjust-extension and still accept the old name). Here is the entire call:

wget --mirror --convert-links --backup-converted \
     --html-extension www.flexman.homeip.net
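
The name clash that --html-extension works around can be reproduced locally. The following is only an illustrative sketch: the temporary directory and the snip name Projects are made up, and nothing is downloaded.

```shell
# Illustrative only: "Projects" is a made-up snip name.
demo=$(mktemp -d)
cd "$demo" || exit 1

# Without the option, the page "Projects" is saved as a plain file ...
echo "<html>page</html>" > Projects
# ... so the folder for its subpages cannot be created any more.
if mkdir Projects 2>/dev/null; then
  clash="no"
else
  clash="yes"
fi
echo "clash without extension: $clash"

# With the option the page becomes Projects.html and the folder fits next to it.
mv Projects Projects.html
mkdir Projects
echo "<html>sub</html>" > Projects/Sub.html
ls -R
```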

This gives you a complete copy of the pages, the raw content and the diffs. The raw content is the content of your snips as you typed them, that is, in wiki syntax. It is available through the view button, although somehow I don't see that button on my mirrored pages now. Anyway, if it does not work for you, you can still access the raw content at /raw/snipname, so there is no need to fiddle around with cookies to get the logged-in pages. By the way, I tried but did not manage to fool SnipSnap with a stored cookie from my logged-in session. It probably checks the browser ID or something else as well.
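
Fetching the raw snips directly could look roughly like the sketch below. It only builds the URLs from the /raw/snipname pattern mentioned above. The helper names raw_url and raw_urls, the list file snips.txt and the assumption that the snip names are already URL-safe are all assumptions, not something SnipSnap provides.

```shell
# Hypothetical helper: print the raw-content URL of one snip.
# $1 = base URL of the SnipSnap site, $2 = snip name (assumed URL-safe)
raw_url() {
  printf '%s/raw/%s\n' "${1%/}" "$2"
}

# Print one URL per snip name read from standard input; skip empty lines.
raw_urls() {
  while IFS= read -r snip; do
    if [ -n "$snip" ]; then
      raw_url "$1" "$snip"
    fi
  done
}

# Example use (snips.txt is a hypothetical file with one snip name per line):
#   raw_urls http://www.flexman.homeip.net < snips.txt > urls.txt
#   wget --directory-prefix=raw --input-file=urls.txt
```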

Now, I only had to change a few things like the logo and so on.

Convert SnipSnap to Dokuwiki

Since we now have all the pages in raw format in one folder (which should be called raw) we can process them with a small script to convert them to dokuwiki syntax. Before we can start we have to convert all files to unix file format:

mkdir raw_unix
cd raw
# split the output of find on newlines only;
# this avoids trouble with spaces in file names
export IFS='
'
for F in `find . -type f`; do dos2unix < "$F" > "../raw_unix/$F"; done
cd ..
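
If dos2unix is not installed, or the file names contain unusual characters, a variant along these lines may be easier to live with. It is a sketch, not the commands used above. It lets find hand over the file names itself instead of splitting on IFS, and it uses tr to delete the carriage returns, which assumes the files contain no carriage returns other than the DOS line endings. The function name strip_cr is made up.

```shell
# Sketch: copy every file below SRC to DST with CR characters removed.
# strip_cr is a made-up name; SRC is assumed to be flat, like raw/.
strip_cr() {
  src=$1
  dst=$2
  mkdir -p "$dst"
  dst=$(cd "$dst" && pwd)   # absolute path, since we change directory below
  (
    cd "$src" || exit 1
    # find passes the names directly to sh, so spaces need no IFS tricks
    find . -type f -exec sh -c '
      dst=$1; shift
      for f in "$@"; do
        tr -d "\r" < "$f" > "$dst/${f#./}"
      done
    ' sh "$dst" {} +
  )
}

# Example: strip_cr raw raw_unix
```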

Now you can use the following Perl script (download snipsnap2dokuwiki.pl) to convert the Radeox wiki syntax used by SnipSnap to DokuWiki syntax. The script does not cover everything and also has some bugs, but it works for the majority of the content.

These lines apply the script to all files in raw_unix and change their file names, because DokuWiki does not allow special characters (except - and _, AFAIK) and requires lower case:

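mkdir raw_doku
cd raw_unix
# IFS must still be set to a newline (see above) for names with spaces.
# The script is expected one level up, next to raw_unix.
for F in `find . -type f`; do
  K=`echo "$F" | tr -d "()+" | perl -e "print lc <>;"`; # lower case, without ( ) +
  ../snipsnap2dokuwiki.pl < "$F" > "../raw_doku/$K";
done
cd ..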
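
The renaming step on its own can be sketched as a small function, which makes it easy to check what a given snip name turns into. The function name doku_name is made up. It removes the same three characters as the loop above and lower-cases the rest with tr instead of perl.

```shell
# Hypothetical helper: derive the DokuWiki file name from a snip file name.
doku_name() {
  printf '%s\n' "$1" | tr -d '()+' | tr '[:upper:]' '[:lower:]'
}

doku_name "./My+Snip(old)"   # prints ./mysnipold
```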

Unfortunately all files now sit in one directory and the hierarchical structure is lost. I was restructuring anyway, so I rebuilt the hierarchy by hand.

Now you can copy the files to your DokuWiki webspace under dokuwiki/data/pages, into the appropriate subfolder. Make sure the permissions are the same as those of the other files there, so that the pages can be edited. That's it! You don't have to register the pages anywhere. DokuWiki is really great in this respect!
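
The copy step might be scripted along the following lines. This is a sketch under assumptions: the function name install_pages and the namespace folder archive in the example are only illustrations. DokuWiki stores pages as files ending in .txt, so the sketch adds that ending where it is missing.

```shell
# Sketch: copy converted files into a DokuWiki namespace folder.
# install_pages is a made-up name.
# $1 = source folder, $2 = target folder below dokuwiki/data/pages
install_pages() {
  mkdir -p "$2"
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    # DokuWiki page files end in .txt
    case $name in
      *.txt) ;;
      *) name=$name.txt ;;
    esac
    cp "$f" "$2/$name"
  done
}

# Example: install_pages raw_doku dokuwiki/data/pages/archive
# Afterwards compare owner and mode with an existing page, e.g. using ls -l.
```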

Finally, here is the code of the converter snipsnap2dokuwiki.pl:

#!/usr/bin/perl -w
use strict;
 
# usage as a filter
my $codeblock=0;
while(<>){    
  #inline code elements (possibly several per line)
  s/\{code:?([^}]*)\}(.*?)\{code\}/<code $1>$2<\/code>/g;
  # are we in codeblock?
  if($codeblock){
    if(/\{code\}/){
      print "</"."code>\n";
      $codeblock=0;
      next;
    }
    # print code lines unchanged, so that the rules below do not mangle them
    print $_;
    next;
  }
  #codeblock
  if(/\{code[:]?(.*)\}/){
    $codeblock=1;
    $_=$1;
    s/none//;
    print "<code ". $_ . ">";
    next;
  }
  # headings
  s/^\s*1\.1\.1 (.*)/===$1===/;
  s/^\s*1\.1 (.*)/====$1====/;
  s/^\s*1 (.*)/=====$1=====/;
 
  #bold, italics
  s/__(.*?)__/**$1**/g;  
  s/~~(.*?)~~/\/\/$1\/\//g;  
  s/--(.*?)--/<del>$1<\/del>/g;  
 
  #lists
  s/^([\s]*)[-\*] (.*)/  * $2/;
  s/^([\s]*)[1aAiI]\. (.*)/  - $2/;
  s/^([\s]*)[-\*]{2}? (.*)/    * $2/;
  s/^([\s]*)[1aAiI]{2}?\. (.*)/    - $2/;
 
  #anchors (do not exist in DokuWiki; sections get them automatically)
  s/\{anchor:.*?\}//g;
 
  #links
  # Internal Link with name
  s/\[(.*?)\|(.*?)\/(.*?)\]/\[\[$2:$3|$1\]\]/g;
  s/\[([^:]*?)\|(.*?)\]/\[\[$2|$1\]\]/g;
  s/\[([^|]*?)\/([^|]*?)\]/\[\[$1:$2\]\]/g;
  s/\[([^|^:]*?)\]/\[\[$1\]\]/g;
 
  #external links (character classes keep several links per line apart)
  s/\{link:([^|}]*)\|([^}]*)\}/\[\[$2|$1\]\]/g;
  s/\{link:([^}]*)\}/$1/g;
 
 
  print $_;
}
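
A quick way to see which rules work on your own content is to feed the converter single lines and compare the result. The helper below is a sketch. check_rule is a made-up name, and the expected strings in the example are simply what the substitutions above should produce for these inputs.

```shell
# Hypothetical helper: run CONVERTER as a filter on one line and compare.
# $1 = converter command, $2 = input line, $3 = expected output line
check_rule() {
  actual=$(printf '%s\n' "$2" | "$1")
  if [ "$actual" = "$3" ]; then
    echo "ok:   $2"
  else
    echo "FAIL: $2 gave '$actual', expected '$3'"
    return 1
  fi
}

# Example:
#   check_rule ./snipsnap2dokuwiki.pl '__bold__' '**bold**'
#   check_rule ./snipsnap2dokuwiki.pl '[Some Page]' '[[Some Page]]'
```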

archive/snipsnapconvert.txt · Last modified: 17.01.2009 14:37 (external edit)