survex to therion conversion script
Wookey
wookey at aleph1.co.uk
Wed Jan 12 03:54:53 CET 2005
A couple of people asked about this, so here it is under a sensible title so
people can find it in the archives.
Below is a Perl script, written by Olly, which does a reasonable job of
converting .svx files to .th files.
I have noticed a few problems, which I might as well document here, as it is
as good a place as any. (I was going to tidy them up and send them to Ol, as I
was given the script for 'testing'.)
*calibrate declination comes out wrong. It should be:
*calibrate declination n -> declination n degrees
Therion doesn't understand 'ignoreall'; it needs commenting out.
It doesn't try to deal with the different 'team' syntax, but could be fixed to
get most of them right (lower-case them all, 'pics'->'pictures',
'disto'->'length #disto', etc.).
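Until the script itself is fixed, the first two items above can be patched up in the converter's output. What follows is only a rough sketch in shell: the function name fix_th is made up for illustration, it assumes the keywords are lower case and the declination is a bare number with no units, and it takes 'commenting out' ignoreall to mean hiding the trailing keyword behind a '#'. The mapping is the one given above; whether the sign convention matches between Survex and Therion is not checked here.

```shell
# fix_th: hypothetical post-processing filter for svx2th output.
# Reads converted .th text on stdin and writes the patched text to stdout.
fix_th() {
    sed \
        -e 's/^\([[:space:]]*\)calibrate[[:space:]]\{1,\}declination[[:space:]]\{1,\}\([-+0-9.]\{1,\}\)/\1declination \2 degrees/' \
        -e 's/[[:space:]]\{1,\}ignoreall[[:space:]]*$/ #ignoreall/'
}
```

It would sit in a pipeline after the converter, e.g. svx2th foo.svx | fix_th > foo.th. An 'ignoreall' that is followed by a comment on the same line is not matched and would still need fixing by hand.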
The only significant problem was with 'overview' files, which include a number of others.
* All the equates need enclosing in 'centreline'/'endcentreline'
* The converter reads in any 'included' files and inserts them. It stopped
after doing two of these and truncated the file. It should probably just
convert '*include foo' to 'input foo.th'
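The suggested '*include foo' -> 'input foo.th' mapping can be sketched as a line filter. This is shell rather than a patch to the Perl script, the function name include_to_input is invented for the example, and it assumes the '*include' keyword is written in lower case. Like the script, it turns backslashes into forward slashes and copes with an optional '.svx' extension and optional double quotes.

```shell
# include_to_input: hypothetical filter that rewrites Survex "*include" lines
# as Therion "input" lines and passes every other line through unchanged.
include_to_input() {
    sed '/^[[:space:]]*\*[[:space:]]*include[[:space:]]/{
        s,\\,/,g
        s/^\([[:space:]]*\)\*[[:space:]]*include[[:space:]]\{1,\}"\{0,1\}\([^"[:space:]]*\)"\{0,1\}/\1input \2.th/
        s/\.[sS][vV][xX]\.th/.th/
    }'
}
```

For example, printf '%s\n' '*include caves\entrance.svx' | include_to_input gives 'input caves/entrance.th'. Each included file would then be converted on its own rather than being pulled into the overview file.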
It once inserted a top-level 'dummy' survey where one wasn't needed. This is
a feature of converting files individually rather than as a coherent dataset,
I suspect. It could probably do a better job if it 'spidered' the dataset
from a top-level overview file, but that's a lot more work :-)
I found one case where the ';' remained as the comment char instead of
getting converted to '#'. I need to look at that or send Ol the offending
file.
Then you can do this to convert a whole directory:
for FILE in *.svx; do FILE="${FILE%.svx}"; echo "Converting $FILE.svx";
svx2th "$FILE.svx" > "$FILE.th"; done
#!/usr/bin/perl -w
use strict;

# svx2th v0.1
# Copyright (C) Olly Betts 2004

sub convert_file($);

my $in_survey = 0;
my $had_fix = 0;

print "encoding iso8859-1\n";
for my $filename (@ARGV) {
    convert_file($filename);
}
if ($in_survey) {
    print "endsurvey dummy\n";
}

sub convert_file($) {
    my $filename = shift;
    open F, "<", $filename or die "$filename: $!\n";
    my @lines = <F>;
    close F;
    my $in_centre_line = 0;
    my $lineno = 0;
    # Nesting depth of the open unnamed *begin, or -1 if there isn't one.
    my $dummy_survey = -1;
    foreach $_ (@lines) {
        ++$lineno;

        # Replace ; with # as comment separator.
        # FIXME won't cope with ; in a filename or *title
        s/;/#/;

        # Comment out "*export" and "*entrance" as there seems to be no
        # equivalent of either.
        if (/^\s*\*\s*(?:export|entrance)\b/i) {
            print "#$_";
            next;
        }

        if ($in_centre_line) {
            if (/^\s*\*/ && !/^\s*\*\s*(?:date|calibrate|fix|equate|data|instrument|units|sd|infer|flags|team)\b/i) {
                print "endcentreline\n";
                $in_centre_line = 0;
            }
        } else {
            # Match case-insensitively, as in the test above.
            if (/^\s*[^\s*#]/ || /^\s*\*\s*(?:date|calibrate|fix|equate|data|instrument|units|sd|infer|flags|team)\b/i) {
                if (!$in_survey) {
                    # Therion can't handle these outside a centreline which
                    # must be inside a survey, so we have to add a dummy
                    # top-level survey.
                    print "survey dummy -title \"Therion is crap\"\n";
                    ++$in_survey;
                }
                print "centreline\n";
                $in_centre_line = 1;
            }
        }

        # *begin <survey> -> survey <survey>
        if (s/^(\s*)\*(\s*)begin\b(\s*)(\S+)/$1$2survey$3$4 -title "$4" /i) {
            ++$in_survey;
            print $_;
            next;
        }

        # *begin -> <nothing>
        # The *begin will cause an endcentreline / centreline pair to be
        # output which hopefully prevents settings from escaping. However
        # this doesn't restore the old settings, just undoes any new ones.
        # FIXME: Need to address this somehow...
        if (/^(\s*)\*(\s*)begin\b[ \t]*$/i) {
            ++$in_survey;
            # -1 is the "none open" value, so test against that (a test
            # against 0 would be true for every unnamed *begin).
            if ($dummy_survey != -1) {
                # FIXME: doesn't cope with nested *begin with no arguments...
                die "This convertor doesn't currently handle nested *begin with no arguments\n";
            }
            $dummy_survey = $in_survey;
            print "#$_";
            next;
        }

        # *end [<survey>] -> endsurvey [<survey>]
        if (s/^(\s*)\*(\s*)end\b/$1$2endsurvey/i) {
            if ($dummy_survey == $in_survey) {
                $_ = "#$_";
                $dummy_survey = -1;
            }
            --$in_survey;
            if ($in_centre_line) {
                print "endcentreline\n";
                $in_centre_line = 0;
            }
            print $_;
            next;
        }

        # *title <title> -> # -title <title>
        # FIXME: just comment out for now - should really convert to -title on
        # the "survey" line.
        if (s/^(\s*)\*\s*title\b\s*/$1# -title /i) {
            print $_;
            next;
        }

        # *team and *instrument format is unspecified in Survex, and they're
        # just informational so comment them out for now...
        # NB therion seems to be case sensitive so "Compass" isn't a valid role
        # ("compass" is)...
        # NB in *team pics -> pictures
        if (s/^(\s*)\*(\s*(?:team|instrument))\b/#$1$2/i) {
            print $_;
            next;
        }

        # *include -> literal text inclusion.
        # Note that the *include means an implicit *begin, but the output
        # may not reflect this correctly (since we can't handle *begin
        # with no survey name anyway...)
        if (/^\s*\*\s*include\s*"?([^"\s]*)/i) {
            # Use Unix path separators (/ not \) - Survex understands either on
            # either platform.
            my $filename = $1;
            $filename =~ s!\\!/!g;
            $filename .= '.svx' unless $filename =~ /\.svx$/i;
            convert_file($filename);
            next;
        }

        # survey.subsurvey.12 -> 12@subsurvey.survey
        if (s/^(\s*)\*(\s*equate)\b/$1$2/i) {
            # Ensure that a comment separator doesn't get eaten by station name.
            s/(\S)#/$1 #/;
            s/(\S+)\.(\S+)/"$2\@".join(".",reverse split m!\.!, $1)/ge;
            print $_;
            next;
        }

        if (/^\s*\*\s*fix\b/i) {
            $had_fix = 1;
        }

        # Anything else: just drop the leading "*".
        s/^(\s*)\*/$1/;
        print;
    }
}
Wookey
--
Aleph One Ltd, Bottisham, CAMBRIDGE, CB5 9BA, UK Tel +44 (0) 1223 811679
work: http://www.aleph1.co.uk/ play: http://www.chaos.org.uk/~wookey/