PMELib 1.0
This library is an interface to the Performance Monitoring Events (PME) that are available in the Pentium P5, P6 and P4 processors. There are 18 counters that let you gather information about what the processor is doing during execution. They are described in these manuals:
IA-32 Intel Architecture Software Developer's Manual Volume 1: Basic Architecture
IA-32 Intel Architecture Software Developer's Manual Volume 2A: Instruction Set Reference, A-M
IA-32 Intel Architecture Software Developer's Manual Volume 2B: Instruction Set Reference, N-Z
IA-32 Intel Architecture Software Developer's Manual Volume 3: System Programming Guide
See chapter 15, Appendix A and Appendix B
Pentium 4 documentation is available here
This library is an extension to the utilities from the Game Developer's Magazine article by Robert Wyatt in May 1998. The library is now wrapped in a class and includes support for the Pentium 4 processor. Only Intel processors are currently supported and tested.
Send feedback to doug@nvidia.com
Windows NT and Windows XP are supported and tested. Win98 may also work.
You need to install a driver, set some registry settings and reboot. If you installed the GDPerf.sys from the GD magazine article, you can skip the installation step. It uses the same driver. Let me know if you have problems.
In the Installation directory
Copy GDPerf.sys to the Windows driver directory:
copy GDPerf.sys C:\windows\system32\drivers
A batch file is included as an example.
Run the PMELib.reg file to set the registry settings
Reboot
For P5 and P6 processors (anything before Pentium 4), you use the same interfaces that are described in the Game Developer article. They have been incorporated into PMELib as is.
In the Pentium 4, there are 18 performance monitoring counters and more than 40 Event Modes that can be captured. Each Mode has a bit mask that indicates which tests to perform. These are described in Appendix A of IA-32 Intel Architecture Software Developer's Manual Volume 3: System Programming Guide. Each of these event modes has a class dedicated to it. The event modes are listed below.
See PerfTest2 for an example.
Step 1) Choose an Event Mode class
Example:
Event_branch_retired event;
Step 2) Set the Event Mask for the selected Event Mode
Example:
event.eventMask->MMNP = 1;
event.eventMask->MMTP = 1;
Step 3) Use the SetCaptureMode method to set the privilege levels from which data is captured
OS_Only, // ring 0, driver level only
USR_Only, // app level, privilege levels 1 2 and 3
OS_and_USR, // all levels 0, 1, 2 and 3
Optionally, you can enable tagging in the SetCaptureMode method.
Example:
SetCaptureMode(OS_and_USR, TagEnable, 34);
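To make Steps 2 and 3 more concrete, here is a rough sketch of how an event mask, a capture mode and a tag could end up in a single Pentium 4 ESCR (event selection control register) value. It is not PMELib's code: PackEscr and the CaptureMode enum below are hypothetical stand-ins, and the bit positions are taken from my reading of the System Programming Guide, so check them against the manual before relying on them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the capture modes listed in Step 3.
// This is NOT PMELib's declaration.
enum class CaptureMode { OS_Only, USR_Only, OS_and_USR };

// branch_retired event mask bits as described in Appendix A of the manual
// (assumed positions; verify against the manual).
const std::uint32_t MMNP = 1u << 0;  // branch not taken, predicted
const std::uint32_t MMNM = 1u << 1;  // branch not taken, mispredicted
const std::uint32_t MMTP = 1u << 2;  // branch taken, predicted
const std::uint32_t MMTM = 1u << 3;  // branch taken, mispredicted

// Pack an ESCR value using the assumed layout:
//   bit 2       USR (count at privilege levels 1, 2 and 3)
//   bit 3       OS  (count at privilege level 0)
//   bit 4       tag enable
//   bits 5..8   tag value
//   bits 9..24  event mask
//   bits 25..30 event select
inline std::uint32_t PackEscr(CaptureMode mode, bool tagEnable,
                              std::uint32_t tagValue,
                              std::uint32_t eventMask,
                              std::uint32_t eventSelect)
{
    std::uint32_t value = 0;
    if (mode != CaptureMode::OS_Only)  value |= 1u << 2;   // USR bit
    if (mode != CaptureMode::USR_Only) value |= 1u << 3;   // OS bit
    if (tagEnable)                     value |= 1u << 4;
    value |= (tagValue    & 0xFu)    << 5;   // truncated to the 4-bit field
    value |= (eventMask   & 0xFFFFu) << 9;   // truncated to the 16-bit field
    value |= (eventSelect & 0x3Fu)   << 25;  // truncated to the 6-bit field
    return value;
}
```

The sketch truncates each field to the width given in the manual. How the library itself validates or stores these values is not covered here.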
Step 4) Optional Configuration
At this point you can configure Tagging, Filtering, Overflow and Cascading options. You can also select one of the legal counters for the selected Event Mode.
Step 5) Set the process priority to high
This reduces the noise from other processes interfering. If you have an infinite loop in your code while the priority is set to high, the machine may hang and you may need to reboot.
Example:
PME * pme = PME::Instance();
pme->SetProcessPriority(ProcessPriorityHigh);
Step 6) Start using the counters
Each Event Mode counter supports the following operations:
Stop
Start
Clear - set to 0
Read
Write - write a 64 bit counter value
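One common way to combine these operations is to clear the counter, start it, run the code of interest, stop it and then read the result. The sketch below shows that pattern with a mock counter so it compiles without the driver. MockCounter, Tick and MeasureRegion are hypothetical names for illustration only; the real event classes may use different signatures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in exposing the five operations listed above.
// A real counter is advanced by the hardware; Tick() simulates events here.
class MockCounter {
public:
    void Start() { running_ = true; }
    void Stop()  { running_ = false; }
    void Clear() { value_ = 0; }
    std::uint64_t Read() const { return value_; }
    void Write(std::uint64_t v) { value_ = v; }

    // Test helper only: counts n events if the counter is running.
    void Tick(std::uint64_t n = 1) { if (running_) value_ += n; }

private:
    bool running_ = false;
    std::uint64_t value_ = 0;
};

// Count the events that occur while `work` runs.
// Works with any type that provides Stop, Clear, Start and Read.
template <typename Counter, typename Work>
std::uint64_t MeasureRegion(Counter& counter, Work work)
{
    counter.Stop();    // make sure nothing is counted while resetting
    counter.Clear();   // start from zero
    counter.Start();
    work();            // the code being measured
    counter.Stop();
    return counter.Read();
}
```

Keeping the Start and Stop calls as close as possible to the measured code limits how much unrelated work is counted.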
Step 7) Set the process priority to normal
pme->SetProcessPriority(ProcessPriorityNormal);
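Steps 5 and 7 have to be paired, and it is easy to miss Step 7 on an early return or an exception. One possible way to handle that is a small scope guard, sketched below. MockPME and HighPriorityScope are hypothetical: MockPME only mirrors the names used in the steps above so the snippet compiles without the library, and the guard is not part of PMELib.

```cpp
#include <cassert>

// Hypothetical stand-ins mirroring the names used in Steps 5 and 7.
enum ProcessPriority { ProcessPriorityNormal, ProcessPriorityHigh };

class MockPME {
public:
    static MockPME* Instance() { static MockPME pme; return &pme; }
    void SetProcessPriority(ProcessPriority p) { priority_ = p; }
    ProcessPriority Priority() const { return priority_; }  // mock-only accessor

private:
    MockPME() = default;
    ProcessPriority priority_ = ProcessPriorityNormal;
};

// Raises the priority on construction and restores it to normal on
// destruction, so the restore also runs on early returns and exceptions.
template <typename Pme>
class HighPriorityScope {
public:
    explicit HighPriorityScope(Pme* pme) : pme_(pme)
    {
        pme_->SetProcessPriority(ProcessPriorityHigh);
    }
    ~HighPriorityScope()
    {
        pme_->SetProcessPriority(ProcessPriorityNormal);
    }
    HighPriorityScope(const HighPriorityScope&) = delete;
    HighPriorityScope& operator=(const HighPriorityScope&) = delete;

private:
    Pme* pme_;
};
```

A guard like this does not help with the infinite-loop case mentioned in Step 5, because the destructor never runs if the measured code never finishes.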
Event_TC_deliver_mode
Event_BPU_fetch_request
Event_ITLB_reference
Event_memory_cancel
Event_memory_complete
Event_load_port_replay
Event_store_port_replay
Event_MOB_load_replay
Event_page_walk_type
Event_BSQ_cache_reference
Event_IOQ_allocation
Event_IOQ_active_entries
Event_FSB_data_activity
Event_BSQ_allocation
Event_BSQ_active_entries
Event_SSE_input_assist
Event_packed_SP_uop
Event_packed_DP_uop
Event_scalar_SP_uop
Event_scalar_DP_uop
Event_64bit_MMX_uop
Event_128bit_MMX_uop
Event_x87_FP_uop
Event_x87_SIMD_moves_uop
Event_TC_misc
Event_global_power_events
Event_tc_ms_xfer
Event_uop_queue_writes
Event_retired_mispred_branch_type
Event_retired_branch_type
Event_resource_stall
Event_WC_Buffer
Event_b2b_cycles
Event_bnr
Event_snoop
Event_response
Event_front_end_event
Event_execution_event
Event_replay_event
Event_instr_retired
Event_uops_retired
Event_uop_type
Event_branch_retired
Event_mispred_branch_retired
Event_x87_assist
Event_machine_clear
http://www.gamasutra.com/features/wyatts_world/19990528/pentium3_08.htm
Used some tables from Mikael Pettersson's perfctr
Used the detect code from Kamen Yotov's ia32lib library