This section describes the derived events for Instruction-Based Sampling (IBS). IBS is available on AMD Family 10h processors.
Abbreviation: IBS fetch
The number of all IBS fetch samples that were collected, including IBS-killed fetch samples.
Abbreviation: IBS fetch killed
The number of IBS sampled fetches that were killed. A fetch operation is killed if the fetch did not reach ITLB or IC access. The number of killed fetch samples is not generally useful for analysis, so killed fetch samples are filtered out of the other derived IBS fetch events (except Event Select 0xF000, which counts all IBS fetch samples including IBS-killed fetch samples).
Abbreviation: IBS fetch attempt
The number of IBS sampled fetches that were not killed (i.e., fetch attempts). This derived event measures the number of useful fetch attempts and does not include IBS-killed fetch samples. Use this event as the denominator when computing ratios such as the ratio of IBS fetch IC misses to attempted fetches.
The number of attempted fetches should equal the sum of the number of completed fetches and the number of aborted fetches.
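The relationships among the fetch sample counts can be checked with simple arithmetic. The Python sketch below is illustrative only: the function name, its parameters, and the sample values are hypothetical, and the counts are assumed to have been exported from the profiler by some other means.

```python
def fetch_attempt_summary(all_samples, killed, completed, aborted):
    """Summarize IBS fetch sample counts (hypothetical helper, not a tool API)."""
    # Attempted fetches are all fetch samples minus the killed ones.
    attempted = all_samples - killed
    return {
        "attempted": attempted,
        # Attempted fetches should equal completed plus aborted fetches.
        "consistent": attempted == completed + aborted,
        # Share of attempted fetches that delivered data to the decoder.
        "completed_fraction": completed / attempted if attempted else 0.0,
    }


# Made-up counts, for illustration only.
summary = fetch_attempt_summary(all_samples=1000, killed=50, completed=900, aborted=50)
print(summary["attempted"], summary["consistent"])
```

A result with "consistent" set to False suggests that the counts came from different sampling runs or were filtered differently.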
Abbreviation: IBS fetch comp
The number of IBS sampled fetches that completed. A fetch is completed if the attempted fetch delivers instruction data to the instruction decoder. Although the instruction data was delivered, it may still not be used (e.g., the instruction data may have been on the "wrong path" of an incorrectly predicted branch).
Abbreviation: IBS fetch abort
The number of IBS sampled fetches that aborted. An attempted fetch is aborted if it did not complete and deliver instruction data to the decoder. An attempted fetch may abort at any point in the process of fetching instruction data. An abort may be due to a branch redirection as the result of a mispredicted branch.
The number of IBS aborted fetch samples is a lower bound on the amount of unsuccessful, speculative fetch activity. It is a lower bound since the instruction data delivered by completed fetches may not be used.
Abbreviation: IBS L1 ITLB hit
The number of IBS attempted fetch samples where the fetch operation initially hit in the L1 ITLB (Instruction Translation Lookaside Buffer).
Abbreviation: IBS ITLB L1M L2H
The number of IBS attempted fetch samples where the fetch operation initially missed in the L1 ITLB and hit in the L2 ITLB.
Abbreviation: IBS ITLB L1M L2M
The number of IBS attempted fetch samples where the fetch operation initially missed in both the L1 ITLB and the L2 ITLB.
Abbreviation: IBS IC miss
The number of IBS attempted fetch samples where the fetch operation initially missed in the IC (instruction cache).
Abbreviation: IBS IC hit
The number of IBS attempted fetch samples where the fetch operation initially hit in the IC.
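The ITLB and IC events are counted over attempted fetch samples, so they are most informative as rates per attempted fetch. The following Python sketch assumes the counts are held in a dictionary keyed by the event abbreviations used in this section; the dictionary layout, the function name, and the sample values are hypothetical.

```python
def fetch_rates(counts):
    """Return ITLB and IC event rates per attempted fetch (hypothetical helper)."""
    attempted = counts.get("IBS fetch attempt", 0)
    if attempted == 0:
        return {}  # No attempted fetch samples, so no rate can be computed.
    events = [
        "IBS IC miss",
        "IBS L1 ITLB hit",
        "IBS ITLB L1M L2H",
        "IBS ITLB L1M L2M",
    ]
    # Killed fetches are already excluded from the denominator.
    return {name: counts.get(name, 0) / attempted for name in events}


# Made-up counts, for illustration only.
sample_counts = {
    "IBS fetch attempt": 2000,
    "IBS IC miss": 100,
    "IBS L1 ITLB hit": 1900,
    "IBS ITLB L1M L2H": 80,
    "IBS ITLB L1M L2M": 20,
}
rates = fetch_rates(sample_counts)
print(rates["IBS IC miss"])
```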
Abbreviation: IBS 4K page
The number of IBS attempted fetch samples where the fetch operation produced a valid physical address (i.e., address translation completed successfully) and used a 4-KByte page entry in the L1 ITLB.
Abbreviation: IBS 2M page
The number of IBS attempted fetch samples where the fetch operation produced a valid physical address (i.e., address translation completed successfully) and used a 2-MByte page entry in the L1 ITLB.
Abbreviation: IBS fetch lat
The total latency of all IBS attempted fetch samples. Divide the total IBS fetch latency by the number of IBS attempted fetch samples to obtain the average latency of the attempted fetches that were sampled.
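As a minimal sketch of the division described above, the Python function below computes the average latency per attempted fetch. The function and parameter names are hypothetical; both inputs are assumed to come from the same sampling run.

```python
def average_fetch_latency(total_fetch_latency, attempted_samples):
    """Average latency per attempted fetch sample (hypothetical helper)."""
    if attempted_samples <= 0:
        return None  # Undefined when no attempted fetches were sampled.
    return total_fetch_latency / attempted_samples


# Made-up totals, for illustration only.
print(average_fetch_latency(25000, 2000))
```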
Abbreviation: IBS all ops
The number of all IBS op samples that were collected. These op samples may be branch ops, resync ops, ops that perform load/store operations, or undifferentiated ops (e.g., those ops that perform arithmetic operations, logical operations, etc.).
IBS collects data for retired ops. No data is collected for ops that are aborted due to pipeline flushes, etc. Thus, all sampled ops are architecturally significant and contribute to the successful forward progress of executing programs.
Abbreviation: IBS tag-to-ret
The total number of tag-to-retire cycles across all IBS op samples. The tag-to-retire time of an op is the number of cycles from when the op was tagged (selected for sampling) to when the op retired.
Abbreviation: IBS comp-to-ret
The total number of completion-to-retire cycles across all IBS op samples. The completion-to-retire time of an op is the number of cycles from when the op completed to when the op retired.
Abbreviation: IBS BR
The number of IBS retired branch op samples. A branch operation is a change in program control flow and includes unconditional and conditional branches, subroutine calls and subroutine returns. Branch ops are used to implement AMD64 branch semantics.
Abbreviation: IBS misp BR
The number of IBS samples for retired branch operations that were mispredicted. This event should be used to compute the ratio of mispredicted branch operations to all branch operations.
Abbreviation: IBS taken BR
The number of IBS samples for retired branch operations that were taken branches.
Abbreviation: IBS misp taken BR
The number of IBS samples for retired branch operations that were mispredicted taken branches.
Abbreviation: IBS RET
The number of IBS retired branch op samples where the operation was a subroutine return. These samples are a subset of all IBS retired branch op samples.
Abbreviation: IBS misp RET
The number of IBS retired branch op samples where the operation was a mispredicted subroutine return. This event should be used to compute the ratio of mispredicted returns to all subroutine returns.
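The two misprediction ratios mentioned for the branch events can be sketched as follows. This Python example is illustrative: the function name, parameter names, and sample values are hypothetical, and the four inputs stand for the IBS BR, IBS misp BR, IBS RET, and IBS misp RET sample counts.

```python
def misprediction_ratios(branches, mispredicted, returns, mispredicted_returns):
    """Branch and return misprediction ratios (hypothetical helper)."""

    def ratio(numerator, denominator):
        # Report 0.0 rather than dividing by zero when nothing was sampled.
        return numerator / denominator if denominator else 0.0

    return {
        # Mispredicted branch ops relative to all sampled branch ops.
        "branch": ratio(mispredicted, branches),
        # Mispredicted returns relative to all sampled subroutine returns.
        "return": ratio(mispredicted_returns, returns),
    }


# Made-up counts, for illustration only.
ratios = misprediction_ratios(branches=400, mispredicted=20, returns=50, mispredicted_returns=1)
print(ratios)
```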
Abbreviation: IBS resync
The number of IBS resync op samples. A resync op is only found in certain microcoded AMD64 instructions and causes a complete pipeline flush.
Abbreviation: IBS load/store
The number of IBS op samples for ops that perform a load operation, a store operation, or both.
An AMD64 instruction may be translated into one ("single fastpath"), two ("double fastpath"), or several ("vector path") ops. Each op may perform a load operation, a store operation or both a load and store operation (each to the same address). Some op samples attributed to an AMD64 instruction may perform a load/store operation while other op samples attributed to the same instruction may not. Further, some branch instructions perform load/store operations. Thus, a mix of op sample types may be attributed to a single AMD64 instruction depending upon the ops that are issued from the AMD64 instruction and the op types.
Abbreviation: IBS load
The number of IBS op samples for ops that perform a load operation.
Abbreviation: IBS store
The number of IBS op samples for ops that perform a store operation.
Abbreviation: IBS L1 DTLB hit
The number of IBS op samples where either a load or store operation initially hit in the L1 DTLB (data translation lookaside buffer).
Abbreviation: IBS DTLB L1M L2H
The number of IBS op samples where either a load or store operation initially missed in the L1 DTLB and hit in the L2 DTLB.
Abbreviation: IBS DTLB L1M L2M
The number of IBS op samples where either a load or store operation initially missed in both the L1 DTLB and the L2 DTLB.
Abbreviation: IBS DC miss
The number of IBS op samples where either a load or store operation initially missed in the data cache (DC).
Abbreviation: IBS DC hit
The number of IBS op samples where either a load or store operation initially hit in the data cache (DC).
Abbreviation: IBS misalign acc
The number of IBS op samples where either a load or store operation caused a misaligned access (i.e., the load or store operation crossed a 128-bit boundary).
Abbreviation: IBS bank conf load
The number of IBS op samples where either a load or store operation caused a bank conflict with a load operation.
Abbreviation: IBS bank conf store
The number of IBS op samples where either a load or store operation caused a bank conflict with a store operation.
Abbreviation: IBS forwarded
The number of IBS op samples where data for a load operation was forwarded from a store operation.
Abbreviation: IBS cancelled
The number of IBS op samples where data forwarding to a load operation from a store was cancelled.
Abbreviation: IBS UC mem acc
The number of IBS op samples where a load or store operation accessed uncacheable (UC) memory.
Abbreviation: IBS WC mem acc
The number of IBS op samples where a load or store operation accessed write combining (WC) memory.
Abbreviation: IBS locked op
The number of IBS op samples where a load or store operation was a locked operation.
Abbreviation: IBS MAB hit
The number of IBS op samples where a load or store operation hit an already allocated entry in the Miss Address Buffer (MAB).
Abbreviation: IBS L1 DTLB 4K
The number of IBS op samples where a load or store operation produced a valid linear (virtual) address and a 4-KByte page entry in the L1 DTLB was used for address translation.
Abbreviation: IBS L1 DTLB 2M
The number of IBS op samples where a load or store operation produced a valid linear (virtual) address and a 2-MByte page entry in the L1 DTLB was used for address translation.
Abbreviation: IBS L1 DTLB 1G
The number of IBS op samples where a load or store operation produced a valid linear (virtual) address and a 1-GByte page entry in the L1 DTLB was used for address translation.
Abbreviation: IBS L2 DTLB 4K
The number of IBS op samples where a load or store operation produced a valid linear (virtual) address, hit the L2 DTLB, and used a 4-KByte page entry for address translation.
Abbreviation: IBS L2 DTLB 2M
The number of IBS op samples where a load or store operation produced a valid linear (virtual) address, hit the L2 DTLB, and used a 2-MByte page entry for address translation.
Abbreviation: IBS DC load lat
The total DC miss latency (in processor cycles) across all IBS op samples that performed a load operation. The miss latency is the number of clock cycles from when the data cache miss was detected to when data was delivered to the core. Divide the total DC miss latency by the number of sampled load operations to obtain the average DC miss latency.
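One way to use the average is to compare code regions. The Python sketch below applies the division described above to several regions and ranks them by average DC miss latency. The row layout (label, total latency in cycles, sampled load count), the function name, and the sample data are all hypothetical.

```python
def rank_by_average_dc_miss_latency(rows):
    """Rank regions by average DC miss latency, highest first (hypothetical helper).

    rows: iterable of (label, total_dc_load_latency_cycles, load_samples).
    """
    averages = []
    for label, total_cycles, loads in rows:
        if loads > 0:  # Skip regions with no sampled load operations.
            averages.append((label, total_cycles / loads))
    return sorted(averages, key=lambda item: item[1], reverse=True)


# Made-up per-region totals, for illustration only.
rows = [("parse", 1200, 100), ("copy", 9000, 300), ("idle", 0, 0)]
ranking = rank_by_average_dc_miss_latency(rows)
print(ranking)
```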
Abbreviation: IBS NB local
The number of IBS op samples where a load operation was serviced from the local processor.
Northbridge IBS data is only valid for load operations that miss in both the L1 data cache and the L2 data cache. If a load operation crosses a cache line boundary, then the IBS data reflects the access to the lower cache line.
Abbreviation: IBS NB remote
The number of IBS op samples where a load operation was serviced from a remote processor.
Abbreviation: IBS NB local L3
The number of IBS op samples where a load operation was serviced by the local L3 cache.
Abbreviation: IBS NB local cache
The number of IBS op samples where a load operation was serviced by a cache (L1 data cache or L2 cache) belonging to a local core that is a sibling of the core making the memory request.
Abbreviation: IBS NB remote cache
The number of IBS op samples where a load operation was serviced by a remote L1 data cache, L2 cache or L3 cache after traversing one or more coherent HyperTransport™ links.
Abbreviation: IBS NB local DRAM
The number of IBS op samples where a load operation was serviced by local system memory (local DRAM via the memory controller).
Abbreviation: IBS NB remote DRAM
The number of IBS op samples where a load operation was serviced by remote system memory (after traversing one or more coherent HyperTransport links and through a remote memory controller).
Abbreviation: IBS NB local other
The number of IBS op samples where a load operation was serviced from local MMIO, configuration or PCI space, or from the local APIC.
Abbreviation: IBS NB remote other
The number of IBS op samples where a load operation was serviced from remote MMIO, configuration or PCI space.
Abbreviation: IBS NB cache M
The number of IBS op samples where a load operation was serviced from local or remote cache, and the cache hit state was the Modified (M) state.
Abbreviation: IBS NB cache O
The number of IBS op samples where a load operation was serviced from local or remote cache, and the cache hit state was the Owned (O) state.
Abbreviation: IBS NB local lat
The total data cache miss latency (in processor cycles) for load operations that were serviced by the local processor.
Abbreviation: IBS NB remote lat
The total data cache miss latency (in processor cycles) for load operations that were serviced by a remote processor.
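The local and remote totals can be turned into per-load averages to compare the cost of local and remote accesses. The Python sketch below rests on an assumption not stated in this section: that each latency total is divided by the matching sample count (IBS NB local or IBS NB remote). The function name, parameter names, and sample values are hypothetical.

```python
def northbridge_latency_summary(local_samples, remote_samples,
                                local_latency, remote_latency):
    """Compare local and remote load servicing (hypothetical helper).

    Assumption: each latency total is averaged over the matching
    IBS NB local or IBS NB remote sample count.
    """
    total = local_samples + remote_samples
    return {
        # Share of Northbridge-serviced load samples that went remote.
        "remote_fraction": remote_samples / total if total else 0.0,
        "avg_local_latency": local_latency / local_samples if local_samples else None,
        "avg_remote_latency": remote_latency / remote_samples if remote_samples else None,
    }


# Made-up counts and cycle totals, for illustration only.
nb = northbridge_latency_summary(300, 100, 30000, 25000)
print(nb)
```

A high remote fraction combined with a large gap between the two averages may indicate that data placement is worth investigating.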