[Therion] samples processing failing

Wookey wookey at wookware.org
Thu Sep 5 04:20:26 CEST 2013


+++ Olly Betts [2013-09-02 03:07 +0100]:
> On Fri, Aug 30, 2013 at 05:47:30PM +0100, Wookey wrote:
> > +++ Olly Betts [2013-08-30 06:04 +0100]:
> > > The error given by therion here would be less confusing if it at least
> > > reported the filename that img failed to open, and ideally if it also
> > > reported the reason as given by img_error().
> > 
> > Indeed. I've already patched therion to report the filename. I'll have
> > a go at getting it to report the error code too. It would indeed have
> > been much less confusing.
> > 
> > OK. so I tried this and added this code:
> >  +   img_errcode imgerr;
> > ...
> >  -    ththrow(("unable to open file"))    
> >  +    imgerr = img_error();
> >  +    ththrow(("unable to open file %s, error code: %u", this->fname, imgerr))


> > So I'm not sure what's going on here. Are those numbers getting
> > translated to strings, and I should call some function to do
> > that before trying to print the result? If so what function? Does this
> > machinerey only happen inside survex, not when using img.c/h
> > externally? i.e am I HOSTED or not? Is this stuff written down anywhere?
> 
> You're not HOSTED.
> 
> I don't understand why you're getting 9152480 though.  If I try patching
> this myself I get 8 (which is IMG_TOONEW), though that's with the latest
> img.c from survex git, so maybe it's something that's already fixed?

With a bit of help from Ol pointing out where I was hacking way after
I should have gone to bed, this is all sorted now.

> 
> > > > That lets it build, but now processing samples/survex/thconfig.1
> > > > causes a segfault:

> > > > Can someone who groks C++ take a look at this and see what's needed to
> > > > make it work with survex 1.2.7

> That narrowed it down somewhat, but lack of debug symbols isn't helping
> here.  Building it myself, I see the problem, and it's fixed in the
> latest version of img.c in survex git.

I've updated the patches to use this and everything builds and
processes nicely now.

> > The build is now acceptably noisy but would benefit for a bit more tidyup.
> 
> The attached patch fixes all the rest which I see with GCC 4.7.2.

That's all lovely except this bit:

> --- therion-5.3.11/thepsparse.cxx	Sun May 30 10:11:10 2010
> +++ therion-5.3.11/thepsparse.cxx	Mon Sep  2 13:57:29 2013
> @@ -146,6 +146,12 @@
>    closed=false;
>    fillstroke = -1;
>    transformation.clear();
> +  transf[0] = 0;
> +  transf[1] = 0;
> +  transf[2] = 0;
> +  transf[3] = 0;
> +  transf[4] = 0;
> +  transf[5] = 0;
>  }

which causes: 
thepsparse.cxx:149:3: error: ‘transf’ was not declared
in this scope

I tried changing it to 
+  transformation.transf[0] = 0;
+  transformation.transf[1] = 0;
+  transformation.transf[2] = 0;
+  transformation.transf[3] = 0;
+  transformation.transf[4] = 0;
+  transformation.transf[5] = 0;

Which stops the whinge but doesn't stop this error:
thepsparse.cxx:812:35: warning: ‘fntmatr.MP_transform::transf[5]’ may
be used uninitialized in this function [-Wmaybe-uninitialized]

I took a stab in the C++ dark and ched the patch to this:
Index: therion-5.3.11/thepsparse.cxx
===================================================================
--- therion-5.3.11.orig/thepsparse.cxx	2013-09-05 01:12:43.630272317 +0100
+++ therion-5.3.11/thepsparse.cxx	2013-09-05 01:40:29.410532475 +0100
@@ -117,6 +117,12 @@
 
 void MP_transform::clear () {
   command = MP_notransf;
+  transf[0] = 0;
+  transf[1] = 0;
+  transf[2] = 0;
+  transf[3] = 0;
+  transf[4] = 0;
+  transf[5] = 0;
 }
 
 MP_transform::MP_transform () {

(i.e set them to zero in the MP_transform clear member)
and that seemed to do the trick, and is presumably what Ol meant to
type in the first place.

So we have a warning-free build!

Now the only issue remaining is that every therion run has this at the
end:
compilation time: 2 sec
/home/wookey/packages/therion/therion-5.3.11/therion: warning --
trying to run: p(U+001A)(U+0001)/tmp/th1544
sh: 1: p(U+001A)(U+0001) not found
/home/wookey/packages/therion/therion-5.3.11/therion: warning -- error
deleting temporary directory -- /tmp/th1544

(The odd chars are shown as little square boxes, not the above unicode
representation I've used - i.e. it's the actual odd chars)

This looks like garbage getting into the string which should remove
the therion tmpdir. I think this has been broken in the Debian
packages from some time (because I've been collecting thNNNN tmpdirs
for ages), but then I've been using 5.3.11 for some time on this
machine...

I presume the relevant code is: void thtmpdir::remove() in thtmpdir.cxx
some debug suggests that 
    if (strlen(thini.tmp_remove_script.get_buffer()) > 0) {
      thbuffer tmpfname;
      tmpfname = thini.tmp_remove_script.get_buffer();
      tmpfname += " ";
      tmpfname += this->name;
      std::cout<<"trying to run special rmcommand:"<<tmpfname.get_buffer()<<std::endl;
      system(tmpfname.get_buffer());

is being run and thini.tmp_remove_script.get_buffer() has garbage in
it. Where does this come from?
(my thconfig.ini file has:
# tmp-remove  ""
in it. 

Uncommenting that and giving it a value makes no difference to the
above behaviour.

tmp_remove_script only seems to get set in thinit.cxx, either to "" or
reading in the file. So I don't know why it's coming out to be p and a
couple of control chars. Could a different therion.ini files be being
used (from the one in /etc/therion.ini)?

commenting out the enum entry:
static const thstok thtt_initcmd[] = {
...
  {"tmp-path",TTIC_TMP_PATH},
//  {"tmp-remove",TTIC_TMP_REMOVE_SCRIPT},
  {"units",TTIC_UNITS},
    
means it now tries to run 
trying to run special rmcommand:p/tmp/th5702
sh: 1: p not found
(i.e the control chars are gone)

Simply removing all the code for 
if (strlen(thini.tmp_remove_script.get_buffer()) > 0) {
in thtmpdir.cxx, thtmpdir::remove()
makes it behave properly, which seems the best plan for now.

Anyone got any idea what's going on here?

Oh yes another little clue might be that zero-length files are created in 
samples/xelevation/
samples/map-offset/
samples/u-symbols/
with control chars for names, i.e.
samples/xelevation/(U+0002)
samples/map-offset/(U+0002)
samples/u-symbols/(U+0001)

Maybe that's a different issue, but we clearly have a problem with
spurious control chars.

Current code is at:
http://wookware.org/software/therion/therion-5.3.11-wook3.tar.gz

I plan to upload this to debian after a couple of checks so we at
least have something newer than 5.3.10 and basically working in the
archive.

Wookey
-- 
Principal hats:  Linaro, Emdebian, Wookware, Balloonboard, ARM
http://wookware.org/



More information about the Therion mailing list