(gdb) break *0x972

Debugging, GNU± Linux and WebHosting and ... and ...

Understanding why Git Annex is busy with Strace

I'm getting into git annex for synchronizing my photos and videos between two computers and two backup disks, and sometimes commands take forever to complete. As they're not very verbose, let's try to understand what's going on:

terminal #1 $ git annex direct
commit

terminal #2 $ top
Tasks: 137 total,   2 running, 134 sleeping,   0 stopped,   1 zombie
%Cpu(s): 11.5 us,  3.8 sy,  0.0 ni, 34.7 id, 49.7 wa,  0.0 hi,  0.3 si,  0.0 st
GiB Mem :    3.830 total,    0.456 free,    0.818 used,    2.556 buff/cache
GiB Swap:    1.862 total,    1.832 free,    0.030 used.    2.711 avail Mem 

  PID USER      PR  NI    VIRT    RES  %CPU %MEM     TIME+ S COMMAND                                                                                                     
 9770 kevin     20   0 1046.6m  31.8m  21.2  0.8  66:17.35 S git

OK, our target PID is 9770, or $(pidof git) if only one git process is running.

$ sudo strace -p 9770
Process 9770 attached
read(4, "\32\331QJ....36\31#\327\321\361vr\246\326{y"..., 16384) = 16384
write(5, "\273\3203....\232\337\227\310\233b"..., 8192) = 8192
write(5, "F\231C\24...73<=7\34479\222\342\327\233:"..., 8192) = 8192
read(4, "M\223\36...\271\264\327\321l\260h&\36\226"..., 16384) = 16384
^C
Process 9770 detached
 <detached ...>

Git is reading from fd=4 and presumably writing that data to fd=5. What are these files?

$ llh /proc/9770/fd
....
lr-x------ 1 kevin users 64 Dec 21 12:06 4 -> .../2013-03-10 Ski de rando au Grand Colon/100_0355.MP4
lrwx------ 1 kevin users 64 Dec 21 12:06 5 -> /media/sdb1/data/.git/objects/pack/tmp_pack_0vBSc4

OK, that's why it takes time: it's copying the content of my files into git's internal objects (a temporary pack file) ... I'm not sure I wanted git annex to do that, in fact ...
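
The same lookup can be scripted. The sketch below is only an illustration, assuming a Linux /proc filesystem; the function name list_fds is made up for this example. It prints the target of every file descriptor of a process, which is the information the listing above shows. Reading /proc/PID/fd of a process you don't own requires root.

```shell
#!/bin/sh
# list_fds PID: print "fd -> target" for every open file descriptor of PID.
# (list_fds is a hypothetical helper name, not an existing tool.)
list_fds() {
    pid=$1
    for link in /proc/"$pid"/fd/*; do
        # Skip the unexpanded glob if the directory is empty or unreadable.
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink "$link")"
    done
}

# Example: inspect the shell running this script.
list_fds $$
```

For the session above, that would be list_fds 9770, run as root or as the owner of the process.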

Subsidiary question: is it a good idea to leave strace attached to my process, for instance to follow the progress of git annex by tracking the files it opens?

terminal #1 $ sudo strace -e open -p 9770
terminal #2 $ sudo strace -e open -p $(pidof strace)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=20819, si_uid=1000, si_status=SIGTRAP, si_utime=62, si_stime=1355} ---
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == 133}], __WALL, NULL) = 20819
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE TERM], NULL, 8) = 0
ptrace(PTRACE_GETREGSET, 20819, NT_PRSTATUS, [{0x66a480, 216}]) = 0
ptrace(PTRACE_SYSCALL, 20819, 0, SIG_0) = 0
# and again and again

Answer: No! strace receives a ptrace notification each time the traced application makes a syscall, no matter what we asked it to print (-e open). And each time, strace asks for the CPU registers (PTRACE_GETREGSET). So no, don't leave strace attached to your application, it will slow it down a lot!
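
If the goal is only to follow progress, one possible alternative is to poll /proc instead of attaching a tracer. This is a rough sketch, not something I measured: fd_target and follow_fd are hypothetical names, and it assumes the process keeps reading its input through the same descriptor number (fd 4 in the session above), which may not hold. Polling can also miss files that are opened and closed between two checks.

```shell
#!/bin/sh
# fd_target PID FD: print the file that descriptor FD of process PID points to.
fd_target() {
    readlink "/proc/$1/fd/$2"
}

# follow_fd PID FD [INTERVAL]: print the target of FD each time it changes,
# until the process exits. It only reads /proc, so the target is never
# stopped the way it is under ptrace.
follow_fd() {
    pid=$1
    fd=$2
    interval=${3:-1}
    last=
    # /proc/PID/exe resolves only while the process is alive and visible to
    # us; it fails once the process has exited (or is a zombie).
    while readlink "/proc/$pid/exe" >/dev/null 2>&1; do
        current=$(fd_target "$pid" "$fd" 2>/dev/null)
        if [ -n "$current" ] && [ "$current" != "$last" ]; then
            printf '%s\n' "$current"
            last=$current
        fi
        sleep "$interval"
    done
}
```

With these functions loaded in a root shell, follow_fd 9770 4 would print a new line whenever fd 4 points at a different file, at the cost of one readlink per second rather than a stop at every syscall.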