I have a number of questions about Intel PT (I have been trying to decode the manual, but it is very difficult). My questions are:
These are the patches that enable the use of Intel PT in kernel 4.3:
https://lkml.org/lkml/2013/12/11/233
https://lkml.org/lkml/2015/9/24/181
https://lkml.org/lkml/2015/9/27/45
This one covers the interaction of PT with other Intel features such as LBR:
https://lkml.org/lkml/2014/7/31/572
For usage instructions, read the documentation at tools/perf/Documentation/intel-pt.txt.
Andi Kleen from Intel is the originator of the patch for Skylake/Broadwell (only these two processor families and the Atom series support Intel PT), and he has a userspace tool that demonstrates its use for debugging:
https://github.com/andikleen/simple-pt
For more details please see:
https://tthtlc.wordpress.com/2016/01/26/intel-processor-trace-how-to-use-it/
This question is five years old as of this writing, but it still comes up in searches, so here is a 2020 update.
Answers:
1) It depends on your OS. Any x86-64 OS should be able to support it as long as it runs on an Intel CPU of the Broadwell generation or later. In practice you want Skylake or later, since those add finer-grained timing and address filtering.
For Linux the answer these days is yes, as native support was added to the kernel.
For Microsoft Windows, unofficial, semi-documented support was added to Windows 10 via the ipt.sys driver.
See: https://github.com/ionescu007/winipt
There are also a few (mostly abandoned) Windows IPT driver projects on GitHub, including a working one for "CheatEngine".
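For the Linux case in answer 1, here is a minimal sketch of how you might check for support from userspace. It assumes the kernel reports the feature as the `intel_pt` flag in /proc/cpuinfo and exposes the perf PMU under /sys/bus/event_source/devices/intel_pt. The function names are my own, not part of any library.

```python
import os

def cpu_has_intel_pt(cpuinfo_text):
    """Return True if any 'flags' line of /proc/cpuinfo text lists intel_pt."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "intel_pt" in value.split():
            return True
    return False

def kernel_exposes_intel_pt(sysfs_root="/sys/bus/event_source/devices"):
    """Return True if the perf 'intel_pt' PMU directory exists (assumed path)."""
    return os.path.isdir(os.path.join(sysfs_root, "intel_pt"))
```

On a real machine you would pass the contents of /proc/cpuinfo to `cpu_has_intel_pt` and call `kernel_exposes_intel_pt()` with its default argument.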
2) Download the "Intel 64 and IA-32 Architectures Software Developer’s Manual" and start at section "CHAPTER 35 INTEL PROCESSOR TRACE", page 35-1. As the others state, you set up and control the IPT feature through a series of 9 MSRs starting with IA32_RTIT_CTL. Unlike the forerunner Last Branch Record (LBR) feature, the MSRs are at least constant across all CPUs that support the feature (although some generations support more features than others).
The manual lacks a good overall overview, but it does lay out how to control IPT and read the trace data, which you can then decode with libipt, the Intel reference decoder.
3) See my answer #2. Again, the manual mostly tells you how to do these things, and you can look at the few GitHub driver projects to see how they do it. You can set the feature up to use a circular buffer, or to trigger an interrupt when its physical memory buffer (which you set up) gets full, and so on.
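To make the circular buffer versus interrupt choice in answer 3 a bit more concrete: as far as I can tell from the manual, the output buffer is described by a Table of Physical Addresses (ToPA), where each 64-bit entry holds a region's physical address plus a few flag bits. The sketch below builds such a table as plain integers. The bit positions (END = bit 0, INT = bit 2, STOP = bit 4, size in bits 9:6) are from my reading of the manual, and the addresses and function names are made up for illustration.

```python
TOPA_END = 1 << 0    # entry points to the next table instead of an output region
TOPA_INT = 1 << 2    # raise an interrupt when this region fills up
TOPA_STOP = 1 << 4   # stop tracing when this region fills up
PAGE_SHIFT = 12      # smallest region is 4 KiB

def topa_entry(phys_addr, size_order=0, flags=0):
    """Encode one ToPA entry; the region size is 4 KiB << size_order."""
    if not 0 <= size_order <= 15:
        raise ValueError("size_order must fit in 4 bits")
    if phys_addr & ((1 << (PAGE_SHIFT + size_order)) - 1):
        raise ValueError("region must be aligned to its own size")
    return phys_addr | (size_order << 6) | flags

def circular_topa(table_addr, region_addrs, interrupt_on_last=False):
    """Build a table whose final END entry points back at the table itself."""
    if table_addr & ((1 << PAGE_SHIFT) - 1):
        raise ValueError("table must be 4 KiB aligned")
    entries = []
    for i, addr in enumerate(region_addrs):
        is_last = i == len(region_addrs) - 1
        flags = TOPA_INT if (interrupt_on_last and is_last) else 0
        entries.append(topa_entry(addr, 0, flags))
    entries.append(table_addr | TOPA_END)  # wrap around: circular buffer
    return entries
```

A real driver would write these values into physically contiguous memory and point the output base MSR at the table; this only shows the encoding.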
I'm also currently figuring out how to use Intel PT. As far as I know:
IA32_RTIT_CTL, at address 570H, is the primary enable and control MSR for trace packet generation. Bit positions are listed in Table 36-5.
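To give a feel for what goes into that register, here is a small sketch that composes and decodes an IA32_RTIT_CTL value. The bit positions are the ones I read out of the control-register table in the manual and only cover a subset of the fields, so double-check them against your copy. The class and function names are mine.

```python
from enum import IntFlag

IA32_RTIT_CTL = 0x570  # MSR address from the manual

class RtitCtl(IntFlag):
    """Subset of IA32_RTIT_CTL bits (positions assumed from the manual)."""
    TRACE_EN = 1 << 0     # master enable
    CYC_EN = 1 << 1       # cycle-accurate timing packets
    OS = 1 << 2           # trace ring 0
    USER = 1 << 3         # trace rings above 0
    CR3_FILTER = 1 << 7   # only trace when CR3 matches
    TOPA = 1 << 8         # use ToPA output
    MTC_EN = 1 << 9       # mini time counter packets
    TSC_EN = 1 << 10      # timestamp packets
    DIS_RETC = 1 << 11    # disable return compression
    BRANCH_EN = 1 << 13   # generate branch packets

def describe(value):
    """List the names of the known bits set in a raw register value."""
    return [flag.name for flag in RtitCtl if value & flag]

def user_space_trace_value():
    """Example value: trace user-space branches with timestamps via ToPA."""
    return int(RtitCtl.TRACE_EN | RtitCtl.USER | RtitCtl.TOPA
               | RtitCtl.TSC_EN | RtitCtl.BRANCH_EN)
```

For example, `describe(user_space_trace_value())` lists the five bits set above, and clearing TRACE_EN in that value is what disabling tracing amounts to.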
You can clear or set the IA32_RTIT_CTL MSR to disable or enable PT tracing. This can be done from within the system PT is providing a trace of. In fact, I don't think it can be done any other way.
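On Linux, one way to look at the register from inside the traced system is the `msr` driver, which exposes each CPU's MSRs as /dev/cpu/N/msr, with the MSR address used as the file offset. The sketch below reads an MSR that way. It assumes the `msr` module is loaded and you are root. The `path` parameter is my own addition so the function can be pointed at an ordinary file for testing.

```python
import os
import struct

def read_msr(address, cpu=0, path=None):
    """Read one 64-bit MSR through the Linux msr device (or a stand-in file)."""
    if path is None:
        path = f"/dev/cpu/{cpu}/msr"
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.pread(fd, 8, address)  # the MSR address is the file offset
    finally:
        os.close(fd)
    if len(data) != 8:
        raise OSError(f"short read from {path} at offset {address:#x}")
    return struct.unpack("<Q", data)[0]

def tracing_enabled(ctl_value):
    """TraceEn is bit 0 of IA32_RTIT_CTL."""
    return bool(ctl_value & 1)
```

With the driver available, `tracing_enabled(read_msr(0x570))` tells you whether PT is currently on for CPU 0.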
Yes. A Paging Information Packet (PIP) is created when modifications to the CR3 register happen. Not sure about IDTR and others, though. Furthermore, the CR3 register can be used for trace filtering.
The whole idea behind Intel PT is packet encoding and decoding. When event x happens, packet y is generated. It's your job to "decode" this CPU-provided data and make some high-level sense out of it. Additionally, you can "encode" packets and feed them into the system doing the decoding. Again, decoder (and, optionally, encoder) functionality is your job. You can check out Intel's open-source decoder/encoder library reference implementation here. I'd recommend trying it out under Linux, with the latest stable kernel (4.1.3 as of this writing). It's worth noting that PT stores its data where you tell it to, generally a reserved memory region or a debugging port.
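As a tiny illustration of the decoding side: a decoder cannot start at an arbitrary byte, so the first thing it does is look for a PSB (packet stream boundary) packet, which, as I understand the manual, is the byte pair 02 82 repeated eight times. The sketch below just finds those sync points in a raw trace buffer. It is nowhere near a real decoder (use Intel's library for that), and the function name is mine.

```python
# PSB packet: 16 bytes, the pair 0x02 0x82 repeated 8 times (per the manual).
PSB = bytes([0x02, 0x82]) * 8

def find_psb_offsets(trace):
    """Return the byte offsets of every non-overlapping PSB in the buffer."""
    offsets = []
    pos = trace.find(PSB)
    while pos != -1:
        offsets.append(pos)
        pos = trace.find(PSB, pos + len(PSB))
    return offsets
```

A full decoder would start at one of these offsets and then walk the packets that follow, one by one.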