Unix C library

libc is a C library shipped with Unix which provides access to most of its facilities like system calls in a portable manner. Linux glibc (one of the implementations of libc which is most popular) and libc shipped with Solaris contain some USDT probes. We will discuss them in this section.

On Solaris USDT probes are limited to userspace mutexes and read-write locks which available as plockstat provider (similar to lockstat provider we discussed earlier). glibc, however implements wider set of probes: along with various pthread operations which include not only mutexes and rwlocks but also condition variables and threads operations, it supports tracing of setjmp/longjmp and dynamic linker ld.so.

Lets see how mutexes are traced in Solaris and Linux (in this section we will assume glibc by saying "Linux"). Solaris provides them through plockstat provider:

plockstatpid:::probe-name

SystemTap will use standard USDT notation for it:

probe process("libpthread.so.0").mark("probe-name")

Note that libpthread.so.0 will vary in different distributions. We will use macro-definitions for paths in our scripts.

Userspace programs have to explicitly ask kernel to block thread that is waiting on condition variable or mutex. Linux provides futex system call for it which is wrapped into so-called low-level-locks in glibc (they are seen by probes those name start with lll). Solaris provides multiple lwp_* system calls for it like lwp_park which "parks" thread (stops its execution).

Here are list of probes available for userspace mutexes (use them as probe name). First argument (arg0 in DTrace or $arg1 in SystemTap) would be address of pthread mutex. Some probes can contain more arguments, i.e. DTrace will pass number of spinning loops to mutex-spun probe. Check documentation for them.

Action DTrace SystemTap
Creation - mutex_init
Destruction - mutex_destroy
Attempt to acquire - mutex_entry
Busy waiting (spinning) mutex-spin
mutex-spun
-
Attempt to block mutex-block lll_lock_wait
Acquired mutex mutex-blocked
mutex-acquire
mutex-error
mutex_release
lll_futex_wake

Here are an example of pthread tracer in SystemTap:

  Script file pthread.stp

#!/usr/bin/env stap

@define libpthread %( "/lib64/libpthread.so.0" %)
@define libc %( "/lib64/libc.so.6" %)

probe process(@libpthread).mark("pthread_create") {
    if(pid() != target()) next;

    thread_id = user_long($arg1);
    thread_caller = usymname($arg3);
    thread_arg = $arg4;
    
    printf("[%d] pthread_create %x %s(%x)\n", tid(),
            thread_id, thread_caller, thread_arg);
}

probe process(@libpthread).mark("pthread_start") {
    if(pid() != target()) next;

    thread_id = $arg1;
    printf("[%d] pthread_start %x\n", tid(), thread_id);
}

probe process(@libpthread).mark("pthread_join") {
    if(pid() != target()) next;

    thread_id = $arg1;
    printf("[%d] pthread_join %x\n", tid(), thread_id);
}

probe process(@libpthread).mark("pthread_join_ret") {
    if(pid() != target()) next;

    thread_id = $arg1;
    printf("[%d] pthread_join %x return -> %d/%d \n", tid(),
            thread_id, $arg2, $arg3);
}

probe process(@libpthread).mark("mutex_*"),
      process(@libpthread).mark("cond_*"),
      process(@libpthread).mark("rdlock_*"),
      process(@libpthread).mark("wrlock_*"),
      process(@libpthread).mark("rwlock_*") {
    if(pid() != target()) next;

    printf("[%d] %s %p\n", tid(), pn(), $arg1);
    print_ustack(ucallers(5));
}

probe process(@libpthread).mark("lll_*"),
      process(@libc).mark("lll_*") {
    if(pid() != target()) next;

    printf("[%d] %s\n", tid(), pn());
    print_ustack(ucallers(5));
}

If we set tsexperiment process as a target, we can see how request is passed from control thread to a worker thread (some output is omitted):

[8972] process("/lib64/libpthread.so.0").mark("mutex_entry") 0xe1a218
  0x7fbcf890fa27 : tpd_wqueue_put+0x26/0x6a [/opt/tsload/lib/libtsload.so]
[8972] process("/lib64/libpthread.so.0").mark("mutex_acquired") 0xe1a218
 0x7fbcf890fa27 : tpd_wqueue_put+0x26/0x6a [/opt/tsload/lib/libtsload.so]
[8972] process("/lib64/libpthread.so.0").mark("cond_broadcast") 0xe1a240
[8972] process("/lib64/libpthread.so.0").mark("mutex_release") 0xe1a218
 0x7fbcf890fa27 : tpd_wqueue_put+0x26/0x6a [/opt/tsload/lib/libtsload.so]
[8971] process("/lib64/libpthread.so.0").mark("mutex_entry") 0xe1a628
 0x7fbcf9148fed : cv_wait+0x2d/0x2f [/opt/tsload/lib/libtscommon.so]
 0x7fbcf890f93f : tpd_wqueue_pick+0x44/0xbc [/opt/tsload/lib/libtsload.so]
[8971] process("/lib64/libpthread.so.0").mark("mutex_acquired") 0xe1a628

Note that thread with TID=8972 will acquire mutex in tpd_wqueue_put function and then send a broadcast message to all workers. One of them (one with TID=8971) wakes up, re-acquires mutex and gets request through tpd_wqueue_pick.

plockstat doesn't support many probes that glibc do, put we can easily replace them with pid provider and function boundary tracing:

  Script file pthread.d

#!/usr/bin/dtrace
#pragma D option bufsize=8m
#pragma D option switchrate=100hz

pid$target::pthread_create:entry {
    self->thread_id_ptr = (uintptr_t) arg0;
    self->thread_func = arg2;
    self->thread_arg = arg3;
}
pid$target::pthread_create:return {
    this->thread_id = * (uint_t*) copyin(self->thread_id_ptr, sizeof(uint_t));
    printf("[%d] pthread_create %x ", tid, this->thread_id);
    usym(self->thread_func);
    printf("(%x)\n", self->thread_arg);
}
pid$target::pthread_join:entry {
    self->thread_id = (uint_t) arg0;
    printf("[%d] pthread_join %x\n", tid, self->thread_id);
}
pid$target::pthread_join:return {
    printf("[%d] pthread_join:return %x -> %d\n", tid, self->thread_id, arg1);
}

plockstat$target:::, 
pid$target::pthread_cond_*wait*:entry,
pid$target::pthread_cond_*wait*:return,
pid$target::pthread_cond_signal:entry,
pid$target::pthread_cond_broadcast:entry {
    printf("[%d] %s:%s ", tid, probefunc, probename);
    usym(arg0);
    printf("[%p]\n", arg0);
    ustack(6);
}

That script yields similar results on Solaris:

[7] mutex_lock_impl:mutex-acquire   0x46d4a0                        [46d4a0]
              libtsload.so`tpd_wqueue_put+0x26
[7] cond_signal:entry   0x46d4e0                                    [46d4e0]
[7] mutex_unlock_queue:mutex-release   0x46d4a0                     [46d4a0]
[7] mutex_unlock_queue:mutex-release   0x46d4a0                     [46d4a0]
[6] mutex_lock_impl:mutex-acquire   0x46d4a0                        [46d4a0]
              libtsload.so`tpd_wqueue_pick+0xb6
[6] pthread_cond_wait:return   0x15                                 [15]

References