Hacktoberfest 2015

Sunday, January 10, 2016 - No comments

During the month of October, I participated in DigitalOcean's Hacktoberfest. The goal is to foster Open Source programming, and the rule is easy: you just have to submit 3 pull requests on GitHub. No matter the size of the pull request, no matter which project, big, small, even your own project, and the pull request doesn't even have to be accepted. It's easy to "cheat", but I played the game fairly and prepared three pull requests with actual features/improvements.

Selfoss RSS Reader

Add support for private (@) and hidden (#) tags

Selfoss is a PHP feed reader, with a per-source tagging mechanism. I needed two things:

1/ private tags (with a @ in the tag name): items and sources with such tags are not visible if the user is not logged in.

2/ "not important" tags (with a # in the tag name): items are not visible in the "all tags" feed, but only when you click directly on the tag or source.

Orochi

Add Python 3 support

Orochi is a command-line client for the 8tracks online music player. It didn't support Python 3, which is the default on my Archlinux system. Instead of fixing the shebang, I updated the code to support Python 3.

PyQuadStick

Add a keyboard controller

PyQuadStick/PyQuadSim is a quadcopter simulator written in Python for the Virtual Robot Experimentation Platform (V-REP). PyQuadStick worked with R/C transmitters, joysticks and PS3 controllers, but I had none of them, so I wrote and contributed a keyboard controller.

Shaarli

Allow setting custom thumbnails

Shaarli is a micro-blogging engine, focused on link sharing. For my online library, I wanted to be able to provide custom thumbnails for the book covers. Unfortunately (for my patch), they are in the middle of a refactoring, with a focus on plugins, so as not to clobber the core of the engine and to keep it minimal. So I rewrote my original patch to fit that design, but it won't be included until the refactoring is done.

Simple GDB Extensions with Python

Friday, January 08, 2016 - No comments

On Stack Overflow, I noticed that some people miss features in GDB. These features are nothing complicated, just combinations of existing commands that would be useful to them, but that are not part of the native set of commands.

The GDB/Python interface is perfect to solve that kind of problem :-) It just requires simple Python skills and a little bit of documentation reading:

Does GDB have a `step-to-next-call` instruction?

SO question

Simple answer: no, step-to-next-call is not part of GDB commands.

GDB/Python-aware answer: no, it's not part of GDB commands, but it's easy to implement!

To stop before, you need to stepi/nexti (next assembly instruction) until you see call in the current instruction:

import gdb

class StepBeforeNextCall (gdb.Command):
    def __init__ (self):
        super (StepBeforeNextCall, self).__init__ ("step-before-next-call",
                                                   gdb.COMMAND_OBSCURE)

    def invoke (self, arg, from_tty):
        arch = gdb.selected_frame().architecture()

        while True:
            current_pc = addr2num(gdb.selected_frame().read_register("pc"))
            disa = arch.disassemble(current_pc)[0]
            if "call" in disa["asm"]: # or startswith ?
                break

            SILENT=True
            gdb.execute("stepi", to_string=SILENT)

        print("step-before-next-call: next instruction is a call.")
        print("{}: {}".format(hex(int(disa["addr"])), disa["asm"]))

def addr2num(addr):
    try:
        return int(addr)  # Python 3
    except:
        return long(addr) # Python 2

StepBeforeNextCall()
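
The "# or startswith ?" comment in the loop above points at a real subtlety: a substring test also matches an operand that happens to contain "call" (a symbol named callback, for instance). Here is a minimal sketch of a stricter test, in plain Python so it can be tried outside GDB. The helper name is_call_instruction is hypothetical, and it assumes x86-style output where the mnemonic is the first token of disa["asm"]:

```python
def is_call_instruction(asm):
    """Return True if the mnemonic (first token) of a disassembled instruction is a call."""
    tokens = asm.split()
    # Look only at the mnemonic, so operands such as <callback+4> do not match
    return bool(tokens) and tokens[0].startswith("call")
```

In the command above, the test would then read if is_call_instruction(disa["asm"]). Other architectures name their call instructions differently, so the prefix would have to be adapted.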

To stop after the call, you compute the current stack depth, then step until it's deeper:

import gdb

def callstack_depth():
    depth = 1
    frame = gdb.newest_frame()
    while frame is not None:
        frame = frame.older()
        depth += 1
    return depth

class StepToNextCall (gdb.Command):
    def __init__ (self):
        super (StepToNextCall, self).__init__ ("step-to-next-call", 
                                               gdb.COMMAND_OBSCURE)

    def invoke (self, arg, from_tty):
        start_depth = current_depth =callstack_depth()

        # step until we're one step deeper
        while current_depth == start_depth:
            SILENT=True
            gdb.execute("step", to_string=SILENT)
            current_depth = callstack_depth()

        # display information about the new frame
        gdb.execute("frame 0")

StepToNextCall()

Relevant documentation is there:

Debug a source file using GDB without stepping into library calls?

SO question

You can see my answer to Does GDB have a “step-to-next-call” instruction? : there is no native GDB command for that (as far as I know, they may have worked on that), but it's easy to do in Python:

import gdb

class StepNoLibrary (gdb.Command):
    def __init__ (self):
        super (StepNoLibrary, self).__init__ ("step-no-library",
                                              gdb.COMMAND_OBSCURE)

    def invoke (self, arg, from_tty):
        step_msg = gdb.execute("step", to_string=True)

        fname = gdb.newest_frame().function().symtab.objfile.filename

        if fname.startswith("/usr"):
            # inside a library
            SILENT=False
            gdb.execute("finish", to_string=SILENT)
        else:
            # inside the application
            print(step_msg[:-1])

StepNoLibrary()

It's easy to read what it does: it goes one step forward, and if the step ends up in a file stored in /usr/*, it finishes the function to come back to the application.

How to set skipping of uninteresting functions from gdbinit script?

SO question

Problem: if in .gdbinit you write skip uninteresting_function, gdb complains No function found named ... because the symbols files are not loaded yet.

Python solution: new command pending_skip

import gdb

to_skip = []

def try_pending_skips(evt=None):
    for skip in list(to_skip): # make a copy for safe remove
        try:
            # test if the function (aka symbol) is defined
            symb, _ = gdb.lookup_symbol(skip)
            if not symb:
                continue
        except gdb.error:
            # no frame ?
            continue
        # yes, we can skip it
        gdb.execute("skip {}".format(skip))
        to_skip.remove(skip)

    if not to_skip:
        # no more functions to skip
        try:
            gdb.events.new_objfile.disconnect(try_pending_skips) # event fired when the binary is loaded
        except ValueError:
            pass # was not connected

class cmd_pending_skip(gdb.Command):
    self = None

    def __init__ (self):
        gdb.Command.__init__(self, "pending_skip", gdb.COMMAND_OBSCURE)

    def invoke (self, args, from_tty):
        global to_skip

        if not args:
            if not to_skip:
                print("No pending skip.")
            else:
                print("Pending skips:")
                for skip in to_skip:
                    print("\t{}".format(skip))
            return

        new_skips = args.split()
        to_skip += new_skips

        for skip in new_skips:
            print("Pending skip for function '{}' registered.".format(skip))

        try:
            gdb.events.new_objfile.disconnect(try_pending_skips) 
        except ValueError: pass # was not connected

        # new_objfile event fired when the binary and libraries are loaded in memory
        gdb.events.new_objfile.connect(try_pending_skips)

        # try right away, just in case
        try_pending_skips()

cmd_pending_skip()

Save this code into a Python file pending_skip.py (or surrounded with python ... end in your .gdbinit), then:

source pending_skip.py
pending_skip fct1
pending_skip fct2 fct3
pending_skip # to list pending skips

The Python code will automatically check if the function can be skipped (i.e., if it is defined) whenever a symbol file is loaded. Running the command with no argument lists the remaining pending skips.
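
The retry logic itself does not depend on GDB, so it can be tried in plain Python. The sketch below is only a model: apply_pending, is_defined and apply are hypothetical names standing in for try_pending_skips, the gdb.lookup_symbol test and gdb.execute("skip ..."), and each loop iteration plays the role of one new_objfile event.

```python
def apply_pending(pending, is_defined, apply):
    """Apply every pending name that is now defined; return the names still pending."""
    remaining = []
    for name in pending:
        if is_defined(name):
            apply(name)             # stands in for gdb.execute("skip <name>")
        else:
            remaining.append(name)  # keep it for the next symbol-file load
    return remaining

# Simulate two symbol-file loads, each one defining new functions
loaded_symbols = set()
skipped = []
pending = ["fct1", "fct2", "fct3"]

for new_symbols in (["fct2"], ["fct1", "main"]):
    loaded_symbols.update(new_symbols)
    pending = apply_pending(pending, loaded_symbols.__contains__, skipped.append)

print(skipped)  # ['fct2', 'fct1']
print(pending)  # ['fct3']
```

fct3 is never defined in this simulation, so it stays in the pending list, which is what running pending_skip with no argument would display.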

Documentation references:

How can I use gdb to catch the moment when a function returns false?

SO question

import gdb
class FunctionFinishBreakpoint (gdb.FinishBreakpoint):
    def __init__ (self):
        gdb.FinishBreakpoint.__init__(self, gdb.newest_frame(), 
                                      internal=True)
        self.silent = True 

    def stop(self):
        #print("after: {}".format(int(self.return_value)))
        return not int(self.return_value)

class FunctionBreakpoint(gdb.Breakpoint):
    def __init__ (self, spec):
        gdb.Breakpoint.__init__(self, spec)
        self.silent = True

    def stop (self):
        #print("before")
        FunctionFinishBreakpoint() # set breakpoint on function return

        return False # do not stop at function entry

FunctionBreakpoint("test")

Save that in a finish.py file, edit it to your needs and source it from GDB, or run it between python ... end or in python-interactive (pi).

This code creates a FunctionBreakpoint, which triggers FunctionBreakpoint.stop each time the function test is hit. The callback is silent, and only creates a FunctionFinishBreakpoint, which stops at the end of the current frame (i.e., at the end of your function). That second stop calls FunctionFinishBreakpoint.stop, which tests whether the return value evaluates to true or false. If it is "not true", it tells GDB to stop the execution.

Documentation references:

(gdb.FinishBreakpoint was added to GDB Python interface for that very purpose, by myself :-)

(last time I checked, there was an efficiency problem with these FinishBreakpoint, you may notice it if your function is called very often)

Quitting ZSH not too quickly

Friday, December 18, 2015 - No comments

This post is for the zsh shell only.

If you often use command-line tools such as GDB, you certainly know the hotkey ^d (EOF) to quickly quit the CLI. But sometimes, that's too sensitive! If you hit it twice in GDB, you do quit GDB, but also its parent shell!

set -o ignoreeof  # 10*^d exits zsh

Okay, we're a bit better now, we won't quit zsh by mistake ... but we cannot close it rapidly on purpose either. So let's improve it: 3 is a better threshold (and zle is the zsh line editor):

set -o ignoreeof  # 10*^d exits zsh
function zle_quit () {exit}
zle -N zle-quit zle_quit
bindkey "^d^d^d" zle-quit

We simply bind the key sequence ^d^d^d to the quit function! (you have to do it quickly enough, otherwise it won't work)

For Emacs fans, this works as well:

bindkey "^x^d" zle-quit

"Unrelated" problems: it only works with my unused variable!

Monday, November 30, 2015 - No comments

A situation that occurred recently to a colleague of mine:

I don't understand what's the problem, I never use that variable, but if I comment it out, my program crashes! If I let it, it runs fine!

Of course, when you're used to C programming and know a bit about memory layout, you already know that "it runs fine" is subjective and that there is a buffer overflow somewhere in the code.

Nonetheless, I think the situation is interesting to study, just to remember that this can lead to tricky incomprehensible behaviors.

The problem

Consider this small C code:

 #include <stdio.h>

 #define SIZE 4

 //#define DO_NOT_CRASH

 int *i_ptr;

 #ifdef DO_NOT_CRASH
 int *not_used;
 #endif
 int array[SIZE];

 int main() {
   int i;
   i_ptr = &i;

   for (i = 0; i <= SIZE; i++) {
     array[i] = -1;
   }

   printf("*i_ptr   is ...");
   fflush(stdout);
   printf(" %d\n", *i_ptr);
 }

Run it:

gcc test.c -g -O0 && ./a.out; \
echo ==============================; \
gcc test.c -g -O0 -DDO_NOT_CRASH && ./a.out

&i_ptr   is     0x7ffeac4708d0
*i_ptr   is ...[1]    8127 segmentation fault (core dumped)  ./a.out
==============================
&i_ptr   is     0x7ffd5056f508
*i_ptr   is ... 5

(I must admit that it was harder than expected to reproduce the bug. I first put all the variables on the stack (contrary to my colleague), but did not manage to get a clean and buggy behavior! Certainly memory alignment constraints that I don't understand well.)

Surprise! (or not)

Surprise, an unused variable can trigger a segmentation fault!

Last year I presented the definitions of a bug, according to Andreas Zeller. In this definition, he makes the distinction between

a defect (an invalid piece of code),
an infection (the execution of this defect, leading to an invalid memory area)
the propagation of that infection (the growth of the invalid memory area)
and the failure, the externally observable error.

Here we've got an illustration of the purpose of these definitions: with -DDO_NOT_CRASH, the program doesn't crash, but we know it's buggy. Yep, totally clear :-)

So what we really have is a program with a defect, whose memory space gets infected, but the infection does not propagate enough to lead to a failure.

How to detect it: with Valgrind

 valgrind ./a.out
 ==17545== Memcheck, a memory error detector
 ==17545== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
 ==17545== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
 ==17545== Command: ./a.out
 ==17545== 
 *i_ptr   is ... 5
 ==17545== 
 ==17545== HEAP SUMMARY:
 ==17545==     in use at exit: 0 bytes in 0 blocks
 ==17545==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
 ==17545== 
 ==17545== All heap blocks were freed -- no leaks are possible
 ==17545== 
 ==17545== For counts of detected and suppressed errors, rerun with: -v
 ==17545== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Nope! I expected to see something with Valgrind, but apparently it's not illegal enough! (or I missed an option ...?) Memcheck does not do bounds checking on global or stack arrays, which would explain the silence.

How to detect it: with GDB

Now that we understand the situation, we know that there is a buffer overflow, but we need to find the infection point!

(gdb) start
(gdb) watch not_used 
Hardware watchpoint 2: not_used
(gdb) cont
Continuing.
Hardware watchpoint 2: not_used

Old value = (int *) 0x0
New value = (int *) 0xffffffff
main () at overflow-long.c:18
18    for (i = 0; i <= SIZE; i++) {
19      array[i] = -1;
(gdb) print i
$1 = 4

There we are, for i=4, array[i] overflows into not_used. Our defect contaminates a memory area that is never read, so it never propagated to a failure.
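
The value reported by GDB (0xffffffff rather than a full 64-bit -1) can be reproduced with a toy model of the memory. This is only a sketch under assumptions: 4-byte int, 8-byte little-endian pointers, and not_used placed immediately after array, which is what the watchpoint suggests but which the compiler and linker are free to change. The names write_array and read_not_used are hypothetical.

```python
import struct

SIZE = 4
INT_SIZE = 4   # sizeof(int), assumed
PTR_SIZE = 8   # sizeof(int *), assumed (x86-64)

# Toy memory: array[SIZE] immediately followed by the not_used pointer (assumed layout)
memory = bytearray(SIZE * INT_SIZE + PTR_SIZE)
NOT_USED_OFFSET = SIZE * INT_SIZE

def write_array(index, value):
    """Model of `array[index] = value` in C: no bounds check on index."""
    struct.pack_into("<i", memory, index * INT_SIZE, value)

def read_not_used():
    """Read the 8-byte pointer stored right after the array."""
    return struct.unpack_from("<Q", memory, NOT_USED_OFFSET)[0]

for i in range(SIZE + 1):  # same off-by-one as `i <= SIZE`
    write_array(i, -1)

print(hex(read_not_used()))  # 0xffffffff
```

Only the lower four bytes of the pointer are overwritten by the int store, hence the half-filled value.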

Unexpected behavior of GDB watchpoint

At the beginning of the execution, the value of not_used is 0. In the overflow of the for loop, I set it to -1, so the watchpoint is triggered.

But in the first code I wrote, I set it to 0, and the watchpoint was ... not triggered. That's a bit unexpected to me, a write is a write, so I wanted the watchpoint to trigger!

So, just to confirm, I tried with rwatch, to set a read watchpoint ... and it worked!

 (gdb) rwatch not_used 
 Hardware read watchpoint 2: not_used
 (gdb) cont
 Continuing.
 Hardware read watchpoint 2: not_used

 Value = (int *) 0x0
 main () at overflow-long.c:18
 18   for (i = 0; i <= SIZE; i++) {
 (gdb)

That's also surprising to me, as my code is not supposed to read anything at this address!

Just to make it stranger, using rwatch with 0 <- -1 and watch with 0 <- 0 (the reverse of what works) doesn't work! (For the record, it's always the mov instruction that triggers my watchpoints).

Starting systemd-nspawn Container from ArchLinux

Thursday, November 19, 2015 - 1 comment

I recently discovered the systemd-nspawn utility ("chroot on steroids"), and I wanted to share how I set it up on Archlinux. I plan to use it as a sandbox environment, although it's not the recommended usage.

I mainly want to prevent script-kiddie (and accidental) destruction of my environment: instead of running code from the Internet directly in my user session (or even sometimes as root, I must admit), I'll run it inside the container. Hence, the possibilities of an untargeted attack are, I believe, quite limited.

This tutorial is completely inspired by the ArchWiki systemd-nspawn page.

Also, keep in mind that this is not a script, don't copy and paste it directly, run it step by step and pay attention to what you validate ;-)

Step one: setup a container filesystem

MACHINE_NAME=sandbox
CONTAINER=/usr/lib/machines/$MACHINE_NAME
sudo mkdir $CONTAINER -p

On Archlinux, pacstrap can bootstrap a new filesystem architecture. debootstrap on Debian and dnf on Fedora can certainly do the same.

sudo pacman -S arch-install-scripts

BASE_PACKAGES="base firefox gdb vim sudo"
PKG_TO_EXCLUDE=linux # see the archwiki
sudo pacstrap -i -c -d $CONTAINER $BASE_PACKAGES --ignore $PKG_TO_EXCLUDE

And that's it for a minimal system !

Step two: prepare the container environment

Let's boot it, configure it, then we'll come back on some more host-side configuration.

sudo systemd-nspawn --boot --directory $CONTAINER --network-veth

# systemd boots the system ...
# login as root ...

# get these values on the HOST
UID=$(echo $UID from HOST)
GID=$(echo $GID from HOST)
USER=$(echo $USER from HOST)

# add and configure the container user:
groupadd --gid $GID $USER
useradd --uid $UID --create-home --gid $GID $USER
passwd $USER # ...
visudo # add something like '$USER ALL=(ALL) ALL'

# setup the network DHCP (through the virtual-nic)
systemctl enable systemd-networkd
systemctl start systemd-networkd
networkctl # check that everything is prepared:
# host0            ether              routable    configur[ing|ed]

# setup the DNS resolver
systemctl enable systemd-resolved.service
systemctl start systemd-resolved.service

rm -f /etc/resolv.conf
ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf

# and quit (the *right* machine ;-)
poweroff

Step three: finish the host configuration

We're back on the host for some more configuration:

# just in case
sudo systemctl enable systemd-networkd
sudo systemctl start systemd-networkd
# or
sudo systemctl restart systemd-networkd

# then check:
networkctl status ve-sandbox
# if State: no-carrier (unmanaged)
sudo cp /usr/lib/systemd/network/80-container-ve.network /etc/systemd/network/80-container-ve.network
# and comment: Driver=veth
# elif State: routable (configur[ing|ed])
# we're good !

# enable Linux port forwarding
sudo sysctl -w net.ipv4.ip_forward=1
# and make it permanent across reboots
echo net.ipv4.ip_forward = 1 | sudo tee /etc/sysctl.d/40-portforwarding.conf

We're almost done for the configuration! Now we need to setup the container start and access:

sudo systemctl enable machines.target
sudo systemctl enable systemd-nspawn@$MACHINE_NAME.service
sudo systemctl status systemd-nspawn@$MACHINE_NAME.service
# it should be up and running

# force nic reconfiguration
# (do it every time you boot a container)
sudo systemctl restart systemd-networkd

Step four: easy access to the container

And now your container is easy to access:

sudo machinectl
sudo machinectl login $MACHINE_NAME
sudo machinectl shell $USER@$MACHINE_NAME
sudo machinectl status $MACHINE_NAME
sudo machinectl poweroff $MACHINE_NAME

Bonus step: convenient access to the container

Sharing Xorg

sudo cp /usr/lib/systemd/system/systemd-nspawn@.service /etc/systemd/system/machines.target.wants/systemd-nspawn@$MACHINE_NAME.service
# then add --bind /tmp/.X11-unix to the container option
# then (in the container)
echo export DISPLAY=:0 >> ~/.profile

That should be enough to run X applications in the sandbox. Keep in mind that sharing Xorg between the two environments weakens the sandboxing: Xorg is old and buggy ...

If you want to share anything else, the option syntax is:

--bind /path/on/host[:/path/on/container]

Passwordless connection

# add to /etc/sudoers.d/$USER something like
$USER ALL= NOPASSWD: /usr/bin/systemctl restart systemd-networkd
$USER ALL= NOPASSWD: /usr/bin/machinectl shell $USER@$MACHINE_NAME*

LD_PRELOAD interposition and variadic functions

Tuesday, November 17, 2015 - No comments

As part of my work on OpenMP debugging, I had to implement interposition functions to capture the beginning and end of some library functions. In general, that's easy:

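// library function we want to intercept:
// int test(int a, int b);

#define _GNU_SOURCE   // for RTLD_NEXT
#include <dlfcn.h>
#include <stddef.h>

typedef int (*fct_t)(int, int);

int test(int a, int b) {
   static fct_t real_test = NULL;
   if (!real_test)   // resolved lazily: a static initializer must be a constant in C
      real_test = (fct_t) dlsym(RTLD_NEXT, "test");

   //before test
   int ret = real_test(a, b);
   //after test

   return ret;
}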

I didn't test this code, but that should work. Now, what if the function you want to intercept looks like this:

void myprintf_real(const char *fmt, ...);

Portable Answer

You can't intercept it!

ASM Hardcore Answer

//tested and certainly only working on x86-64
void myprintf_asm_interpo(const char *fmt, ...) {
   //before myprintf

  // unroll the frame
  asm volatile("mov    -0xb8(%rbp),%rdi\n\t"
               "mov    -0xa8(%rbp),%rsi\n\t"
               "mov    -0xa0(%rbp),%rdx\n\t"
               "mov    -0x98(%rbp),%rcx\n\t"
               "mov    -0x90(%rbp),%r8\n\t"
               "mov    -0x88(%rbp),%r9\n\t"
               "add     $0xc0,%rsp\n\t");

  //jump to myprintf_real;

  volatile register void ** rsp asm ("rsp");

  //movq    $[addrs], -8(%%rsp) // I can't compile that ...

  *(rsp-1) = myprintf_real;

  asm volatile(
    "mov    %rbp,%rsp\n\t"
    "pop     %rbp\n\t"
    "mov     (%rax), %rax\n\t"
    "jmpq    *%rax");
}

I disassembled the function prolog with GDB, then reversed it and put it before jumping to myprintf_real. With that technique I can't insert code after the function execution (because it returns directly to the caller frame), but that was already a good start.

GCC-Specific Clean Answer

With __builtin_apply_args() and __builtin_apply()! Easy peasy! I prefer my code to be gcc-specific rather than architecture-specific, especially with such a hardcoded blob of assembly!

int myprintf_gcc_interpo(char *fmt, ...) {
    // before 
    void *arg = __builtin_apply_args();
    void *ret = __builtin_apply((void*)printf, arg, 100);
    //after
    __builtin_return(ret);
}

From stackoverflow with love ;-) Test it with this file.

Automatic SSL Certification with LetsEncrypt (and Bind9 zones)

Friday, November 13, 2015 - 1 comment

Update: Letsencrypt is live!!! Since Fri. Dec 4th, 2015, I have valid SSL certificates :-) (Until 2016-03-03 apparently, then I'll just have to rerun the script at the bottom of this article.)

Since I launched the VPS hosting this blog, I've only used self-signed certificates. Partly because of the price of CA-signed certificates, partly because of the setup difficulty. But today, LetsEncrypt is about to enter public beta (December 3, 2015), so it's time to get a valid HTTPS connection!

For quite a while, I thought that my setup required a "wildcard" certificate (*.0x972.info), to cover the different subdomains I manage, as well as the ones I'll add tomorrow. So I was very disappointed when I read that LetsEncrypt would not allow them! But then I checked how LetsEncrypt works: a simple shell command :-) That means that I can easily create a script that lists my subdomains and passes them to letsencrypt. And tomorrow, with new subdomains, I'll just relaunch the script.

letsencrypt -d example.com auth

Listing VPS subdomains

So, where can I get the list of the subdomains currently set up? (I use AlternC to manage my VPS.) I should be able to query the database for that ... but I couldn't find where AlternC stores that information. I found another quick and easy solution: the DNS configuration files! It makes sense :-)

$ ls /var/lib/alternc/bind/zones/
pouget.me
0x972.info
$ sudo cat /var/lib/alternc/bind/zones/0x972.info
....
@ IN A 62.4.19.144
blog IN A 62.4.19.144
www IN A 62.4.19.144

With a bit of bash-script around it, we get:

get_subdomains() {
  domain=$1
  cat $BIND_ZONES/$domain \
           | grep "IN A"  \
           | cut -d" " -f1 \
           | while read subdomain
    do
       if [[ $subdomain == '@' ]]; then
         echo $domain
       else
         echo $subdomain.$domain
       fi
   done
}
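
For comparison, here is a minimal sketch of the same parsing in Python, working on the text of a zone file rather than on a path. It is an illustration and not part of my setup. Unlike grep "IN A", it matches the record type exactly, so an "IN AAAA" line is left out.

```python
def get_subdomains(domain, zone_text):
    """Yield the fully qualified names of the A records found in a zone file's text."""
    for line in zone_text.splitlines():
        fields = line.split()
        # keep only lines of the form "<name> IN A <address>"
        if len(fields) < 4 or fields[1:3] != ["IN", "A"]:
            continue
        name = fields[0]
        # '@' stands for the zone itself
        yield domain if name == "@" else "{}.{}".format(name, domain)

zone = """@ IN A 62.4.19.144
blog IN A 62.4.19.144
www IN A 62.4.19.144
"""
print(list(get_subdomains("0x972.info", zone)))
# ['0x972.info', 'blog.0x972.info', 'www.0x972.info']
```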

Configuring LetsEncrypt

LetsEncrypt generates SSL certificates that authenticate the website you're communicating with, and encrypt the communication channel between the webserver and your computer. So the first step of the generation process is to make sure that the certificate is delivered to the website owner. To that end, LetsEncrypt uses a challenge/response protocol: the certification server tries to access a URL on the domain to be certified: https://blog.0x972.info/.well-known/acme-challenge/$CHALLENGE, and the webserver should answer that request with the right answer, which it is the only one to know.

LetsEncrypt fully automates this process, but it needs help. The default method asks you to stop any webserver listening on port :80, and starts its own server. This method means that you need to shut down your webserver during the certificate generation. For a one-shot that may be acceptable, but LetsEncrypt certificates are only valid for 90 days, so you have to renew them ~every two months. So we need to find a better way...

... and the solution already exists, it's called the webroot authenticator. Instead of letting LetsEncrypt start its own webserver, you give it a path where it will store its challenge-response tokens, and you're in charge of putting them online. I did it this way:

$ cat /etc/apache2/conf.d/letsencrypt.conf 
Alias /.well-known /var/www/path/to/letsencrypt/.well-known
<Directory "/var/www/path/to/letsencrypt/.well-known">
    AllowOverride All
</Directory>

I tell Apache to create on every virtual host an alias directory, named /.well-known that points to /var/www/path/to/letsencrypt/.well-known. This directory has to be reachable and readable by Apache, and LetsEncrypt needs a read-write access (this part is easy, you run it as root !).

Then, just run LetsEncrypt with the following command (note that --webroot-path is not exactly the alias path):

sudo letsencrypt $DOMAINS auth --email $EMAIL  -a webroot --webroot-path /var/www/path/to/letsencrypt --renew-by-default
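
Why the two paths differ can be checked with a small model. This is a sketch under one assumption: as far as I understand it, the webroot authenticator writes each token under <webroot-path>/.well-known/acme-challenge/, so --webroot-path must be the parent of the aliased directory. The function names are hypothetical, and resolve_alias is a simplified model of the Apache Alias, not Apache's actual algorithm.

```python
import posixpath

WEBROOT_PATH = "/var/www/path/to/letsencrypt"   # value passed to --webroot-path
ALIAS_URL = "/.well-known"                      # first argument of the Apache Alias
ALIAS_DIR = WEBROOT_PATH + "/.well-known"       # second argument of the Apache Alias

def challenge_file(token):
    """File assumed to be written by the webroot authenticator for a given token."""
    return posixpath.join(WEBROOT_PATH, ".well-known", "acme-challenge", token)

def resolve_alias(url_path):
    """Simplified model of how the Alias maps a URL path to a file on disk."""
    if not url_path.startswith(ALIAS_URL + "/"):
        raise ValueError("URL path is not covered by the alias")
    return ALIAS_DIR + url_path[len(ALIAS_URL):]

# The file written for a token is the one served for the challenge URL
print(resolve_alias("/.well-known/acme-challenge/TOKEN") == challenge_file("TOKEN"))  # True
```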

Configuring Apache

LetsEncrypt puts everything you need into /etc/letsencrypt/live/$DOMAIN:

$ sudo ls /etc/letsencrypt/live/0x972.info
cert.pem  chain.pem  fullchain.pem  privkey.pem

In Apache configuration, you'll need to add the following lines, either in the global configuration or in the virtual host parts:

SSLEngine on
# SSLCACertificatePath is not part of LetsEncrypt
SSLCACertificatePath /etc/ssl/certs
SSLCertificateFile    /etc/letsencrypt/live/0x972.info/cert.pem
SSLCertificateKeyFile /etc/letsencrypt/live/0x972.info/privkey.pem
SSLCertificateChainFile /etc/letsencrypt/live/0x972.info/chain.pem

Finally reload Apache and your certificate should be live! (These certificates are not valid, don't forget that, you'll have to regenerate them after Dec, 3rd.)

Automating Everything

Finally, we need to script all of that for the automatic renewal. Nothing to do in Apache, the alias can stay there. I just have to skip some of the subdomains of the DNS that I don't use anymore:

#! /bin/bash

# Make sure only root can run our script
if [[ $EUID -ne 0 ]]; then
   echo "This script must be run as root" 1>&2
   exit 1
fi

TO_SKIP=("to_skip.0x972.info" "to_skip_2.0x972.info")
BIND_ZONES=/var/lib/alternc/bind/zones/
WEBROOT_PATH=/var/www/path/to/letsencrypt

letsencrypt() {
  # "command" bypasses this function, which would otherwise call itself forever
  command letsencrypt "$@" auth --agree-dev-preview --renew-by-default -a webroot --webroot-path $WEBROOT_PATH
}

get_subdomains() {
  domain=$1
  cat $BIND_ZONES/$domain \
              | grep "IN A" \
              | cut -d" " -f1 \
              | while read subdomain
    do
       if [[ $subdomain == '@' ]]; then
         echo $domain
       else
         echo $subdomain.$domain
       fi
   done
}

get_all_subdomains() {
  for domain in $(ls $BIND_ZONES)
  do
    get_subdomains $domain | while read subdomain
    do
      echo $subdomain
    done
  done
}

subdomains_to_letsencrypt_opt() {
  while read subdom
  do
    if [[ " ${TO_SKIP[@]} " =~ " ${subdom} " ]]
    then
      continue
    fi
    echo "-d $subdom"
  done
}

letsencrypt $(get_all_subdomains | subdomains_to_letsencrypt_opt)

Tested in Debian GNU/Linux 7 (wheezy).
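
The filtering step of the script is easy to check in isolation. Here is a minimal Python sketch of what subdomains_to_letsencrypt_opt does; it is an illustration with a hypothetical function name, not the code I run:

```python
def letsencrypt_domain_options(subdomains, to_skip):
    """Build the list of `-d <name>` arguments, leaving out the names listed in to_skip."""
    skipped = set(to_skip)
    options = []
    for name in subdomains:
        if name in skipped:
            continue  # same role as the TO_SKIP test in the bash version
        options += ["-d", name]
    return options

print(letsencrypt_domain_options(
    ["0x972.info", "to_skip.0x972.info", "blog.0x972.info"],
    ["to_skip.0x972.info", "to_skip_2.0x972.info"]))
# ['-d', '0x972.info', '-d', 'blog.0x972.info']
```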

GDB and Frame-Filters: a bug and a quick fix

Tuesday, November 10, 2015 - No comments

With frame filters and decorators, GDB lets you rewrite the output of the where command. That's quite convenient, except that it doesn't work well in one situation (that I could not clearly understand ...):

(gdb) where no-filters # <--- notice the option here
#0  do_spin () at .../gcc-5.2.0/libgomp/config/linux/wait.h:55
#1  do_wait () at ...gcc-5.2.0//libgomp/config/linux/wait.h:64
#2  gomp_team_barrier_wait_end (...) at .../libgomp/config/linux/bar.c:112
#3  0x00007ffff7bd8966 in GOMP_barrier () at gomp_preload.c:49
#4  0x0000000000400a19 in main._omp_fn.0 () at parallel-demo.c:10
#5  0x00007ffff7bd89e4 in GOMP_parallel_trampoline (...) at gomp_preload.c:62
#6  0x00007ffff79c442e in gomp_thread_start () at .../libgomp/team.c:118
#7  0x00007ffff7bd8ce8 in pthread_create_trampoline () at pthread_preload.c:33
#8  0x00007ffff779f4a4 in start_thread () from /usr/lib/libpthread.so.0
#9  0x00007ffff74dd13d in clone () from /usr/lib/libc.so.6

becomes:

(gdb) where
#0  gomp_team_barrier_wait_end () at .../libgomp/config/linux/wait.h:55
#1  gomp_team_barrier_wait_end () at .../libgomp/config/linux/wait.h:64
#2  #pragma omp barrier () at parallel-demo.c:10
#4  #parallel zone #1 of main () at parallel-demo.c:10

Many frames are gone, that's my cleanup; some function names have been changed, that's my OpenMP work ... but the function names of frames #0 and #1 are inconsistent. They should not read gomp_team_barrier_wait_end but rather do_spin and do_wait, respectively.

I don't know what's special about these functions, they're inlined, but that's not enough to explain and recreate the problem ...

Anyway, I found that the inconsistency boils down to two lines:

(gdb) frame 0
(gdb) pi print(gdb.selected_frame().function())
gomp_team_barrier_wait_end # wrong
(gdb) pi print(gdb.selected_frame().name())    
do_spin # right

So to solve my problem, I add a frame decorator that picks up the frame name instead of its function symbol:

class BugFixFrame(gdb.frames.FrameDecorator):
    def function(self): 
        return self.inferior_frame().name() 

class BugFixFrameFilter:
    def __init__(self):
        self.enabled = True
        self.priority = 99999

    def filter(self, frames):
        for frame in frames:
            yield BugFixFrame(frame)

gdb.frame_filters["Bug fix frame filter"] = BugFixFrameFilter()

and I now have my clean and correct callstack:

(gdb) where                                                          
#0  do_spin (val=0, addr=0x602104) at .../libgomp/config/linux/wait.h:55
#1  do_wait (val=0, addr=0x602104) at .../libgomp/config/linux/wait.h:64
#2  #pragma omp barrier () at parallel-demo.c:10
#4  #parallel zone #1 of main () at parallel-demo.c:10

I've submitted a bug report as PR/19225.

Looking up Source-Code Lines from GDB/Python (and OpenMP complications)

Monday, November 02, 2015 - No comments

In the GDB Python bindings, there is currently no direct way to translate a function symbol into its source file and corresponding lines. But that's possible with the gdb command-line, and some more juggling:

(gdb) disassemble [function_name|*address]
Dump of assembler code for function .omp_ptask.:
   0x00000000004024a0 <+0>: push   %rbp
   0x00000000004024a1 <+1>: mov    %rsp,%rbp
   0x00000000004024a4 <+4>: sub    $0x20,%rsp
   0x00000000004024a8 <+8>: mov    %edi,-0x8(%rbp)
   0x00000000004024ab <+11>:    mov    %rsi,-0x10(%rbp)
   0x00000000004024af <+15>:    mov    -0x8(%rbp),%edi
   0x00000000004024b2 <+18>:    mov    %edi,-0x14(%rbp)
   0x00000000004024b5 <+21>:    mov    -0x10(%rbp),%rsi
   0x00000000004024b9 <+25>:    mov    (%rsi),%rsi
   0x00000000004024bc <+28>:    mov    (%rsi),%rdi
   0x00000000004024bf <+31>:    callq  0x4009f0 <foo>
=> 0x00000000004024c4 <+36>:    mov    -0x4(%rbp),%eax
   0x00000000004024c7 <+39>:    add    $0x20,%rsp
   0x00000000004024cb <+43>:    pop    %rbp
   0x00000000004024cc <+44>:    retq   
End of assembler dump.

With disassemble, we know (gdb tells us) where a function begins and ends ... in memory. In theory, we just have to parse the second and penultimate lines of gdb.execute("disassemble {addr}"). But in practice, compilers may reorganize (for optimization) the binary instructions, so it's safer to iterate through all of them. Then, gdb.find_pc_line(pc) tells us the source-code line matching that PC. There we are:

import logging

import gdb

log = logging.getLogger(__name__)

def get_function_fname_and_lines(fct_symb):
    # start address of the function (int() rather than the Python 2-only long())
    fct_addr = int(fct_symb.value().address)
    disa = gdb.execute("disassemble {}".format(fct_addr), to_string=True)

    filename = fct_symb.symtab.filename
    from_line = fct_symb.line
    to_line = 0
    for disa_line in disa.split("\n"):
        if not disa_line.strip():
            continue # skip empty lines
        if "Dump of assembler code" in disa_line:
            continue # skip first line
        if "End of assembler dump." in disa_line:
            break # we're at the end
        try:
            # parse the PC value
            # => 0x004009c1 <+32>: jmpq   0x401464 <main._omp_fn.0+2755>
            pc = int(disa_line.replace("=>", "").split()[0], 16)
        except (IndexError, ValueError):
            log.warning("Could not parse disassembly line ...")
            log.warning(disa_line)
            continue

        sal = gdb.find_pc_line(pc)
        if sal.symtab is None:
            continue # hum, nothing known about that PC

        # check for consistency that PC is in the right file
        if not sal.symtab.filename == fct_symb.symtab.filename:
            log.info("not the right file, inlined ?")
            continue

        # if function symbol doesn't specify its line
        if fct_symb.line == 0:
            if from_line == 0 or sal.line < from_line:
                from_line = sal.line

        # PCs may not be in order
        if sal.line > to_line:
            to_line = sal.line

    return filename, from_line, to_line

which gives:

(gdb) python print(get_function_fname_and_lines(gdb.lookup_symbol("main")[0]))
('minimal_omp_threads.c', 26, 76)
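
The address-parsing step can also be tried outside of GDB, on a captured dump. What follows is only a minimal sketch, assuming the `disassemble` output format shown above; `PC_RE` and `iter_pcs` are hypothetical names that are not part of the original code, and the regular expression is an alternative to the `replace`/`split` parsing, not what the function above does.

```python
import re

# Address at the start of a disassembly line, with or without the "=>"
# marker that GDB puts in front of the current instruction.
# (PC_RE and iter_pcs are illustrative names, not part of the original code.)
PC_RE = re.compile(r"^\s*(?:=>)?\s*(0x[0-9a-fA-F]+)")

def iter_pcs(disassembly):
    """Yield every instruction address found in `disassemble` output."""
    for line in disassembly.splitlines():
        match = PC_RE.match(line)
        if match:  # header, footer and empty lines simply do not match
            yield int(match.group(1), 16)

sample = """Dump of assembler code for function .omp_ptask.:
   0x00000000004024a0 <+0>: push   %rbp
   0x00000000004024bf <+31>:    callq  0x4009f0 <foo>
=> 0x00000000004024c4 <+36>:    mov    -0x4(%rbp),%eax
   0x00000000004024cc <+44>:    retq
End of assembler dump.
"""

pcs = list(iter_pcs(sample))
```

Inside GDB, each of these addresses would then be passed to gdb.find_pc_line, as in the function above.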

OpenMP complications

I wrote that function as part of my work on OpenMP (OMP) debugging. In OMP, compilers do "outlining", that is, the reverse of inlining:

#pragma omp task
    foo1(&i, &j, &k);

becomes with GCC/GOMP:

main._omp_fn.3 (...) {
    foo1 (...);
}

GOMP_task (main._omp_fn.3, ...);

Everything is okay here, my code works well. But Intel OpenMP and LLVM/Clang didn't implement outlining the same way: instead of naming the outlined functions something like <parent>._omp_fn.<id>, they name all of them ... .omp_microtask.! Thanks guys, now gdb.lookup_symbol(".omp_microtask.") always returns the same symbol (probably the first one), and so does my source-code lookup function.

We do have the address of the function

(Pdb) print fn
0x402340 <.omp_ptask.>

but gdb.lookup_symbol cannot do the lookup by address ...

So let's get back to GDB/Python documentation and see how we can fix that:

Function: gdb.lookup_symbol (name [, block [, domain]])

This function searches for a symbol by name. The search scope can be restricted to the parameters defined in the optional domain and block arguments.

That block argument looks good (that's more or less the equivalent of a C scope). But where should I get it from?

I remember that [gdb.Frame](https://sourceware.org/gdb/current/onlinedocs/gdb/Frames-In-Python.html#Frames-In-Python) has a block() method:

(gdb) pi gdb.lookup_symbol(".omp_microtask.", gdb.selected_frame().block())[0]
<gdb.Symbol object at 0x7fc96e0883c8>
(gdb) pi get_function_fname_and_lines(...)
('minimal_omp_threads.c', 38, 39)

but that doesn't work as I wanted (that is, from the task allocator function), because we are in the scope of the task allocator function, which is here equivalent to the global one. The lookup always resolves to the first task ...

So, how to get the right block? Let's get back to the documentation, maybe the block page ...

Function: gdb.block_for_pc (pc)

Return the innermost gdb.Block containing the given pc value. If the block cannot be found for the pc value specified, the function will return None.

Interesting ! Furthermore:

Variable: Block.function

The name of the block represented as a gdb.Symbol. If the block is not named, then this attribute holds None. This attribute is not writable.

For ordinary function blocks, the superblock is the static block. However, you should note that it is possible for a function block to have a superblock that is not the static block – for instance this happens for an inlined function.

Indeed:

(Pdb) gdb.block_for_pc (0x402340).function
<gdb.Symbol object at 0x7f824e346300> (.omp_ptask.)

so the final code for Intel OpenMP looks like this:

fct_addr = ... # "0x402340"
fct_symb = gdb.block_for_pc(int(fct_addr, 16)).function
my_gdb.get_function_fname_and_lines(fct_symb)

and that works well :-)
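
One case the snippet above does not cover: gdb.block_for_pc returns the innermost block, and the documentation quoted earlier says that an unnamed block has its function attribute set to None. A PC that falls inside a nested scope (a `{ ... }` block in C) would presumably hit that case. Here is a hedged sketch that walks up Block.superblock until a function is found; `function_symbol_at` and its `block_for_pc` parameter are hypothetical names, added so that the logic can be tried without GDB.

```python
def function_symbol_at(pc, block_for_pc=None):
    """Return the gdb.Symbol of the function containing pc, or None.

    block_for_pc is an illustrative hook (not a GDB API): it defaults to
    gdb.block_for_pc, and can be replaced by a fake outside of GDB.
    """
    if block_for_pc is None:
        import gdb  # only importable from within GDB
        block_for_pc = gdb.block_for_pc

    block = block_for_pc(pc)
    # The innermost block may be an unnamed nested scope: climb through
    # the enclosing blocks until one of them belongs to a function.
    while block is not None and block.function is None:
        block = block.superblock
    return block.function if block is not None else None
```

From GDB, it would be called as function_symbol_at(int(fct_addr, 16)), and the result passed to get_function_fname_and_lines as before (after checking that it is not None).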

Install Archlinux Package without Internet Connection

Tuesday, October 27, 2015 - No comments

If for some reason you have an Archlinux box without Internet access (like a Qemu system that is not completely set up?), but still want to install pacman packages, here is a little help:

Synchronize `pacman` database

# 'EOF' is quoted so that $MIRROR and friends are expanded when the
# script runs, not when it is written
cat > ./pacman_update.sh << 'EOF'
#! /bin/sh

MIRROR=http://mirror.archlinuxarm.org/aarch64/

ROOT_FS=/path/to/archlinux/filesystem/root
PAC_SYNC=$ROOT_FS/var/lib/pacman/sync

set -x
wget -q $MIRROR/alarm/alarm.db -O $PAC_SYNC/alarm.db
wget -q $MIRROR/aur/aur.db -O $PAC_SYNC/aur.db
wget -q $MIRROR/community/community.db -O $PAC_SYNC/community.db
wget -q $MIRROR/core/core.db -O $PAC_SYNC/core.db
wget -q $MIRROR/extra/extra.db -O $PAC_SYNC/extra.db
EOF
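
The five wget lines all follow the same pattern: `<mirror>/<repo>/<repo>.db` goes to `<root>/var/lib/pacman/sync/<repo>.db`. As an illustration only, here is a small Python sketch that builds these (URL, destination) pairs without downloading anything. `REPOS` and `sync_db_jobs` are hypothetical names; the repository list is the one used in the script above.

```python
import posixpath

# Repositories synchronized by the script above
REPOS = ["alarm", "aur", "community", "core", "extra"]

def sync_db_jobs(mirror, root_fs, repos=REPOS):
    """Return one (url, destination) pair per repository database."""
    base = mirror.rstrip("/")  # avoid a double slash in the URL
    sync_dir = posixpath.join(root_fs, "var/lib/pacman/sync")
    return [("{}/{}/{}.db".format(base, repo, repo),
             posixpath.join(sync_dir, repo + ".db"))
            for repo in repos]

jobs = sync_db_jobs("http://mirror.archlinuxarm.org/aarch64/",
                    "/path/to/archlinux/filesystem/root")
```

Each pair can then be handed to whatever download tool is available on the connected machine.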

Retrieve the packages to install

(archlinux) pacman -S zsh  
resolving dependencies...
looking for conflicting packages...

Packages (1) zsh-5.1.1-2.1

Total Download Size:   1.68 MiB  # nooo, we can't download that .... :(
Total Installed Size:  5.03 MiB

:: Proceed with installation? [Y/n] y
:: Retrieving packages ...
error: failed retrieving file 'zsh-5.1.1-2.1-aarch64.pkg.tar.xz' from mirror.archlinuxarm.org : Could not resolve host: mirror.archlinuxarm.org
warning: failed to retrieve some files
error: failed to commit transaction (download library error)
Errors occurred, no packages were upgraded.

yep, that's a good start, but that's not very convenient ...

(archlinux) pacman -Sp zsh
http://mirror.archlinuxarm.org/aarch64/extra/zsh-5.1.1-2.1-aarch64.pkg.tar.xz

yes, that's better !

Download the packages

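# 'EOF' is quoted so that $url and friends are expanded when the
# script runs, not when it is written
cat > ./pacman_download.sh << 'EOF'
#! /bin/sh

# run ./pacman_download.sh then paste `pacman -Sp` urls to stdin

MIRROR=http://mirror.archlinuxarm.org/aarch64/

ROOT_FS=/home/kevin/travail/sample/juno-qemu/linaro/juno-fs
PAC_CACHE=$ROOT_FS/var/cache/pacman/pkg

while read -r url
do
    # an empty line ends the list
    if [ -z "$url" ]
    then
        break
    fi
    echo "Downloading $url into $PAC_CACHE ..."
    # -nc: do not download again a package that is already in the cache
    wget -nc -q "$url" -P "$PAC_CACHE"
    echo Done
done
echo Bye bye.
EOF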
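
The logic of that loop is simple: read URLs until the first empty line, and store each file under its own name in the pacman cache of the target filesystem. For illustration, here is the same logic as a Python sketch, without the download itself; `read_urls` and `cache_target` are hypothetical names, not part of the script.

```python
import posixpath
from urllib.parse import urlsplit

def read_urls(lines):
    """Collect URLs until the first empty line, like the shell loop."""
    urls = []
    for line in lines:
        url = line.strip()
        if not url:
            break
        urls.append(url)
    return urls

def cache_target(url, root_fs):
    """Path of the package inside the pacman cache of root_fs."""
    name = posixpath.basename(urlsplit(url).path)
    return posixpath.join(root_fs, "var/cache/pacman/pkg", name)

urls = read_urls([
    "http://mirror.archlinuxarm.org/aarch64/extra/zsh-5.1.1-2.1-aarch64.pkg.tar.xz\n",
    "\n",
    "ignored, because it comes after the empty line\n",
])
targets = [cache_target(url, "/path/to/archlinux/filesystem/root") for url in urls]
```

Once the files are at these locations, pacman finds them in its cache and does not try to download them again, which is what the last step relies on.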

Install the packages

(archlinux) pacman -S zsh
resolving dependencies...
looking for conflicting packages...

Packages (1) zsh-5.1.1-2.1

Total Installed Size:  5.03 MiB

:: Proceed with installation? [Y/n] 
(1/1) checking keys in keyring                     [######################] 100%
(1/1) checking package integrity                   [######################] 100%
(1/1) loading package files                        [######################] 100%
(1/1) checking for file conflicts                  [######################] 100%
(1/1) checking available disk space                [######################] 100%
(1/1) installing zsh                               [######################] 100%

And zsh is ready :-)
