GTK 2 and general Linux graphics performance analysis

Written by Gionatan Danti. Posted in Linux & Unix

GTK 2 CPU time analysis

Now we will try to go a little deeper. From the benchmarks above we saw that text rendering uses the lion's share of execution time. In a way, this is to be expected: Gtkperf is quite intensive in its text-rendering benchmark. On the other hand, text rendering is actually performed via Pango, a library that is certainly not known for its speed: in the past, there were considerable complaints about its slowness.

Using OProfile, we can monitor which libraries are the most CPU intensive, from both a clock perspective (which one uses the most CPU clock cycles?) and an instruction-count perspective (which one issues the largest number of instructions?).

Let's see the CPU clock share first:

Most CPU-clock intensive libraries

The most CPU-clock-intensive application/library is the kernel (the no-vmlinux entry). This is normal, because it has a large number of things to do: scheduling processes, scheduling I/O, allocating and deallocating memory, executing system calls requested by userspace applications, etc.

Next we have the libpixman library: considering the large quantity of pixels drawn in the GtkDrawingArea test, I'm not surprised to see this library here. After it, we find the libpangoft2 library, which is used for, you guessed it, FreeType font rendering.

The X server, which is often cited as the most serious speed bottleneck, uses only a relatively small share of CPU resources (about 4.3%), proving that today's machines (but also yesterday's!) can manage it without problems. If we exclude the kernel itself, the pixman and pangoft2 libraries use the greatest share of CPU cycles, by far. Obviously, any GTK 2 optimization attempt should focus on these two libraries.
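The "exclude the kernel" comparison can be reproduced from a per-image sample report with a few lines of Python. This is only a sketch: it assumes a text report with one `samples percent image-name` line per binary image (modelled on the summary OProfile prints), and the names `parse_report`, `share_excluding` and the values in `EXAMPLE_REPORT` are hypothetical placeholders, not the measurements behind the chart above.

```python
from typing import List, NamedTuple


class ImageShare(NamedTuple):
    name: str
    samples: int
    percent: float


def parse_report(text: str) -> List[ImageShare]:
    """Parse lines shaped like '<samples> <percent> <image name>'.

    Header lines and anything else that does not match are skipped.
    The result is sorted by sample count, largest first.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        try:
            samples = int(parts[0])
            percent = float(parts[1])
        except ValueError:
            continue
        rows.append(ImageShare(parts[2].strip(), samples, percent))
    return sorted(rows, key=lambda r: r.samples, reverse=True)


def share_excluding(rows: List[ImageShare], excluded: str) -> List[ImageShare]:
    """Recompute percentages after dropping one image (e.g. the kernel)."""
    kept = [r for r in rows if r.name != excluded]
    total = sum(r.samples for r in kept)
    if total == 0:
        return []
    return [ImageShare(r.name, r.samples, 100.0 * r.samples / total) for r in kept]


# Placeholder data (assumed layout, made-up numbers).
EXAMPLE_REPORT = """\
  samples|      %|
------------------
     5000 50.0000 no-vmlinux
     3000 30.0000 libexample_a.so
     2000 20.0000 libexample_b.so
"""

if __name__ == "__main__":
    rows = parse_report(EXAMPLE_REPORT)
    for row in share_excluding(rows, "no-vmlinux"):
        print(f"{row.name:20s} {row.percent:6.2f}%")
```

Renormalising after removing the kernel entry makes the userspace ranking easier to read, since each remaining percentage is then a share of userspace samples only.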

But what about issued CPU instructions? Which libraries are the most “verbose” from a CPU standpoint?

Most instructions intensive libraries

Wow, now the libpangoft2 library is on top! It issues the greatest number of instructions but, as it isn't the top CPU cycle burner, its instruction and data streams probably fit better into the processor's cache and/or can be executed faster by the CPU's out-of-order logic.

In the next two positions we find the kernel and the libpixman library, while the X server again contributes only a small percentage (<4%) of the instruction stream.

So, from an instruction stream standpoint as well, it is clear that any optimization effort should focus on Pango and pixman performance.
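The reasoning about instructions versus cycles can be made concrete with a small calculation. Dividing a library's share of instructions by its share of CPU cycles gives its instructions-per-cycle rate relative to the whole-system average, because (I_lib / I_total) / (C_lib / C_total) = (I_lib / C_lib) / (I_total / C_total). A value above 1 means the code gets more instructions through per cycle than average. The sketch below assumes both profiles cover the same run; the function name and the figures in the usage example are hypothetical, not read from the charts.

```python
def relative_ipc(instr_share: float, cycle_share: float) -> float:
    """Return a library's IPC relative to the system-wide average.

    Both arguments are shares of the same kind (percentages or fractions)
    taken from profiles of the same run.
    """
    if cycle_share <= 0:
        raise ValueError("cycle_share must be positive")
    return instr_share / cycle_share


if __name__ == "__main__":
    # Made-up example: 30% of instructions issued in 20% of the cycles.
    print(relative_ipc(30.0, 20.0))
```

A library that ranks higher by instructions than by cycles, as libpangoft2 does here, ends up with a ratio above 1, which is consistent with the cache and out-of-order explanation but does not prove it.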
