Non-POD to variadic functions

Recently I ran into some strange behavior in gcc that confused me for a few hours until I figured out exactly what was wrong. I'm going to attempt to explain it here, so that maybe someone else will benefit from my lack of understanding about how computers work :-). The problem was exacerbated by the fact that I am compiling a system which produces literally thousands of compiler warnings, and short of going through and fixing them all, there's really no way to read them by hand. Since gcc only warns here (although I really think this should be an error), the problem is easy to miss and will bite you at runtime if it is ignored. Our working example will be the following code:

#include <stdio.h>

class Foo
{
  public:
    Foo() {}
};

int main()
{
  printf("%s\n", Foo());
  return 0;
}

First, some background. C++ has things which C compilers (and libraries) don't understand, and one of them happens to be non-POD types. POD stands for Plain Old Data, and it is basically a type that could have been written in C: no user-defined constructors or destructors, no virtual functions, and so on (in the above example, Foo is non-POD because it declares a constructor). printf() is what's known as a variadic function (i.e. a function which can take a variable number of arguments). In C, these functions are written using the stdarg.h header and its associated macros (va_start, va_arg and va_end). Naturally, the macros in stdarg.h know nothing about non-POD types, and thus it is not valid to pass a non-POD object to a variadic function. What gcc does when you do, though, is rather strange. Consider the following terminal output:

tycho@mittens:~/playground$ g++ variadic.cpp -o variadic
variadic.cpp: In function 'int main()':
variadic.cpp:11: warning: cannot pass objects of non-POD type
'class Foo' through '...'; call will abort at runtime
variadic.cpp:11: warning: format '%s' expects type 'char*', but
argument 2 has type 'int'
tycho@mittens:~/playground$ ./variadic 
Illegal instruction

What's going on here? My first thought was that it was some 32/64-bit nuance that I didn't understand, but it turns out that isn't the case. When gcc encounters a call to a variadic function which has been passed a non-POD object, it generates a warning and emits a ud2 instruction in place of the call. If (like me) you're forced to ignore all compiler warnings due to the sheer number of them, you won't see the above warning. Then when you run your binary, it crashes with SIGILL! Why does it crash? From the Intel x86 manual:

[ud2] Generates an invalid opcode. This instruction is provided for software testing to explicitly generate an invalid opcode. The opcode for this instruction is reserved for this purpose. Other than raising the invalid opcode exception, this instruction is the same as the NOP instruction. This instruction's operation is the same in non-64-bit modes and 64-bit mode.

So, the generated binary has a ud2 sitting in it where the call should be, which guarantees that it will crash with a SIGILL as soon as execution reaches that point. Why does gcc do this instead of aborting compilation? I have no idea, but it's good to know that this behavior exists so that if you come across it you don't have to spend several hours hunting down what's going on.
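For the curious, here is a rough sketch of what a variadic function looks like on the inside, and of one way to sidestep the problem. This is not the code from the system I was debugging. The name() accessor on Foo and the sum_ints() and format() helpers are made up purely for illustration. The idea is simply to hand the variadic function something it understands (here a const char *) rather than the object itself.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical variant of Foo: still non-POD (it declares a constructor),
// but it exposes a plain C string that is safe to pass through '...'.
class Foo
{
  public:
    Foo() {}
    const char *name() const { return "Foo"; }
};

// A minimal variadic function: adds up `count` ints fetched with va_arg.
// va_arg just reads raw values of the type you name; it has no idea
// how to copy or destroy a C++ object.
int sum_ints(int count, ...)
{
  va_list args;
  va_start(args, count);
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += va_arg(args, int);
  va_end(args);
  return total;
}

// A printf-style helper (illustrative only) that formats into a
// std::string. Output longer than the buffer is truncated.
std::string format(const char *fmt, ...)
{
  assert(fmt != NULL);
  char buf[256];
  va_list args;
  va_start(args, fmt);
  std::vsnprintf(buf, sizeof(buf), fmt, args);
  va_end(args);
  return std::string(buf);
}
```

With this, format("%s\n", Foo().name()) passes only a pointer through the '...', so there is nothing for the compiler to complain about, and sum_ints(3, 1, 2, 3) returns 6.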

Lastly, I'd like to make a plug for IDA Pro. I have used it fairly extensively while at UW, and it works very well. It handles large (300MB) binaries well (things are naturally slower, but it's not unbearable), and it models the rewriting done by static and dynamic linkers for a variety of formats very well. I used it to help me track down this bug.

Spam on my blog!

In an interesting twist, you may have noticed that my blog received some spam posts the other day. This seems pretty amazing to me, since I wrote this page myself, and there is exactly one deployment of it (which is here). I can see writing spam bots for popular frameworks like Wordpress, Joomla, Drupal, etc. but this isn't exactly a popular framework ;-). The fact that I got spammed leads me to believe that spammers are applying some sort of machine learning techniques to figure out what looks like a comment form, and what doesn't. It never ceases to amaze me the lengths people will go to in order to spam.

How did I remove it? Well naturally I've been too lazy to code an admin interface (and why would I? I write the blog posts in vi). So, I had to fire up the sqlite driver (I use a database to store the comments) and manually delete them myself. If this keeps occurring, I'll probably try to build in some sort of simple spam filtering AI or at least an interface which makes it easy to delete the spam.

Either way, I'm flattered that people think my website is popular enough that they have to post their spam links on it ;-)

First post to the blog!

This is my first post using the blog mechanism of the GTFO. Hopefully everything will work as advertised ;-).

Dane County Farmers' Market

One of the things I like best about living in downtown Madison is the farmers' market. It's very handy to be able to get up, walk a few blocks and get stuff for breakfast, and come home and cook it. If you're ever in town, I highly recommend dropping by! They have all kinds of things there including a multitude of Wisconsin cheese :-).

First public deployment of GTFO!

Ahoy! This is the first public deployment of the new web framework "GTFO". I'll explain the name, design rationale, and other such things in a later post (suffice it to say that this is the framework which stays the hell out of your way ;-). For now, I believe I've got a reasonably stable and secure (i.e. protected against sql/html injection) version of it.

Of course with a new framework comes a new content format. This means I had to port all of my content from my old web page to this one. Although most of it was pretty scriptable, I've scrapped a few old and unnecessary pages in favor of the new format. In the process, I likely broke a few things that I'm unaware of. Thus, if you experience any broken links (or get any unexpected errors) /please/ send me an e-mail to let me know, so that I can fix them.

I'm also interested in folks' thoughts on the new layout. I'm experimenting with a few different layouts, and any thoughts would be welcome. I enjoy the simplicity of this one, but I don't like the fact that it is a fixed width. Anyway, I'll probably try a few more layouts before I settle on the right one, so don't be surprised if you show up and everything looks totally different. There should be more new content soon too (where soon is as soon as I can get the energy up to write it ;-).

Lastly, I hope to have more features (e.g. photo galleries) integrated into the framework. I'm developing it pretty actively right now, so it's likely that these things will happen in the near future. Please don't hesitate to suggest any features that you might enjoy using. I'm eventually going to release the code under a BSD-style (or perhaps beerware?) license, although it's not clean enough for me to do that right now. If you're interested in trying out the code, though, shoot me an e-mail and I'll be happy to send it your way.

Cycling in Madison

Now that I have this fancy blog, I might as well use it :-).

For my birthday I was given a fancy bike computer (Garmin Edge 705, if you're interested). I take it on all of my rides, and record them. I publish the results on runsaturday.com, which is a pretty cool page for managing things like this. If you're interested in fitness at all, this site has a plethora of tools which will automatically analyze your fitness for you. It's very interesting!

Panoramic off of the capitol

This weekend I went up to the observation deck on the capitol, which was the first time I'd ever done that. I shot a panoramic with the camera on my phone. It's not a great camera, but the results turned out pretty well for a camera phone. If you're interested you can check out the original and a cropped version (warning: these images are huge -- around 25 MB -- and will probably take a while to load). Enjoy!

Oh, and the software I used to stitch together the photos was hugin. Not exactly an intuitive interface, but it works pretty well, and it has more buttons and whistles than I could ever want. If you know something about photo stitching that I don't (which is highly likely), I'm happy to furnish the originals for a better attempt.