2010-07-19 09:54:50
This enables support for sparse IRQs. This is useful for distro kernels that want to define a high CONFIG_NR_CPUS value but still want to keep a low kernel memory footprint on smaller machines.
(Sparse IRQs can also be beneficial on NUMA boxes, as they spread out the irq_desc[] array in a more NUMA-friendly way.)
Impact: new CONFIG_SPARSE_IRQ feature, which makes irq_desc[] a sparse array
To support kernels with very large NR_CPUS and NR_IRQS settings,
we need to reduce the size of irq_desc[]. On x86, when NR_CPUS is
set to 4096, the irq_desc[] array will waste megabytes of RAM,
which is not an acceptable overhead for generic distro kernels.
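As a rough illustration of that arithmetic, the sketch below compares a dense array of descriptors with a table of pointers plus a few allocated descriptors. The descriptor layout, the IRQs-per-CPU scaling rule and every name in it are assumptions made up for the example; they are not the kernel's real structure or its real NR_IRQS formula.

```c
#include <stddef.h>

/* Hypothetical stand-in for a descriptor; the real struct irq_desc differs. */
struct fake_irq_desc {
    unsigned int irq;
    unsigned int status;
    void *handler;
    void *chip;
    void *action;
    unsigned char pad[192];   /* assumed padding, mimics a large aligned struct */
};

/* Assumed scaling rule for the example: a fixed number of IRQs per CPU. */
#define FOOTPRINT_IRQS_PER_CPU 32

static size_t footprint_nr_irqs(size_t nr_cpus)
{
    return nr_cpus * FOOTPRINT_IRQS_PER_CPU;
}

/* Dense model: one full descriptor for every possible IRQ number. */
static size_t dense_bytes(size_t nr_cpus)
{
    return footprint_nr_irqs(nr_cpus) * sizeof(struct fake_irq_desc);
}

/* Sparse model: one pointer per possible IRQ, descriptors only for IRQs in use. */
static size_t sparse_bytes(size_t nr_cpus, size_t irqs_in_use)
{
    return footprint_nr_irqs(nr_cpus) * sizeof(void *)
         + irqs_in_use * sizeof(struct fake_irq_desc);
}
```

Under these assumptions dense_bytes(4096) scales with the whole IRQ number space, while sparse_bytes(4096, 64) is dominated by the pointer table and only pays for the descriptors that actually exist.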
In v2.6.28 we already introduced a generic API to make access to
the irq_desc[] array more abstract - and to allow a different
data structure to underlie it. This patch finishes that process.
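The idea of abstracting the access can be sketched in user-space C as follows. The accessor is modeled on the kernel's irq_to_desc(), but the struct, the array, the limit and the function names here are simplified, hypothetical stand-ins.

```c
#include <stddef.h>

#define ACC_NR_IRQS 16   /* assumed small limit for the example */

/* Hypothetical descriptor with a single counter field. */
struct acc_irq_desc {
    unsigned int count;
};

/* Backing store; callers are not supposed to index it directly. */
static struct acc_irq_desc acc_descs[ACC_NR_IRQS];

/* Accessor: the only place that knows how descriptors are stored. */
static struct acc_irq_desc *acc_irq_to_desc(unsigned int irq)
{
    if (irq >= ACC_NR_IRQS)
        return NULL;
    return &acc_descs[irq];
}

/* A caller written against the accessor, including the check for !desc. */
static int acc_count_irq(unsigned int irq)
{
    struct acc_irq_desc *desc = acc_irq_to_desc(irq);

    if (!desc)
        return -1;
    desc->count++;
    return 0;
}
```

Because acc_count_irq() never touches acc_descs[] itself, the storage behind acc_irq_to_desc() could be swapped for a different data structure without changing the caller.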
Core kernel changes:
- fix missing sparseirq API changes in various bits of core kernel code
(missing for_irq_desc primitives, missing checks for !desc, etc.)
- introduce a new data type in the IRQ code: irq_desc_ptrs[] and its
handling in the core IRQ code
- detach the IRQ statistics counters from kernel_stat and
attach them to irq_desc->kstat_irqs[], a dynamically allocated
array of pointers. (This can use percpu_alloc() in the
future, once percpu_alloc() becomes generic enough.)
- detach the NR_IRQS array in random.c.
- interrupt remapping: when moving an IRQ on NUMA, reallocate the irq
descriptor so that we get proper NUMA-local memory for the
descriptor, for the irq_cfg entry and for the kstat_irqs array.
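A minimal user-space sketch of the pointer-table model in the list above: descriptors are allocated on first use, each carries its own counter array, and iteration skips empty slots. All names and sizes are hypothetical, the counters are a plain array of unsigned int rather than the kernel's layout, and NUMA-local allocation is not modeled.

```c
#include <stdlib.h>

#define SPARSE_NR_IRQS 64   /* assumed upper bound on IRQ numbers */
#define SPARSE_NR_CPUS 4    /* assumed CPU count */

struct sparse_desc {
    unsigned int irq;
    unsigned int *kstat_irqs;   /* per-CPU counters, allocated with the descriptor */
};

/* Sparse table: a NULL slot means the IRQ has no descriptor yet. */
static struct sparse_desc *sparse_desc_ptrs[SPARSE_NR_IRQS];

/* Lookup only; may return NULL, so callers must check for !desc. */
static struct sparse_desc *sparse_irq_to_desc(unsigned int irq)
{
    return irq < SPARSE_NR_IRQS ? sparse_desc_ptrs[irq] : NULL;
}

/* Lookup that allocates the descriptor and its counters on first use. */
static struct sparse_desc *sparse_irq_to_desc_alloc(unsigned int irq)
{
    struct sparse_desc *desc;

    if (irq >= SPARSE_NR_IRQS)
        return NULL;
    if (sparse_desc_ptrs[irq])
        return sparse_desc_ptrs[irq];

    desc = calloc(1, sizeof(*desc));
    if (!desc)
        return NULL;
    desc->kstat_irqs = calloc(SPARSE_NR_CPUS, sizeof(*desc->kstat_irqs));
    if (!desc->kstat_irqs) {
        free(desc);
        return NULL;
    }
    desc->irq = irq;
    sparse_desc_ptrs[irq] = desc;
    return desc;
}

/* Walk the table, skip empty slots, and sum the counters of every IRQ. */
static unsigned long sparse_total_interrupts(void)
{
    unsigned long total = 0;
    unsigned int irq, cpu;

    for (irq = 0; irq < SPARSE_NR_IRQS; irq++) {
        struct sparse_desc *desc = sparse_irq_to_desc(irq);

        if (!desc)
            continue;
        for (cpu = 0; cpu < SPARSE_NR_CPUS; cpu++)
            total += desc->kstat_irqs[cpu];
    }
    return total;
}
```

The loop in sparse_total_interrupts() shows why code that used to index a dense array needs the extra !desc check once slots can be empty.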
Architectures can enable this by setting the CONFIG_SPARSE_IRQ
config switch. The x86 architecture is extended/fixed to deal
with such an irq_desc[] model:
- io_apic irq_cfg[NR_IRQS] array is re-attached to desc->irq_chip
- MSI virtual IRQ numbering is sanitized to go from the max upper
end of the physical IRQ range up towards NR_IRQS - instead of
coming down from the end of NR_IRQS.
- the max NR_IRQS calculations are re-tuned
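The change in MSI numbering direction can be illustrated with two toy allocators over one shared IRQ number space. This is only a sketch of the numbering policy described above: the names, the size of the number space and the bookkeeping array are all hypothetical and do not reflect the actual x86 code.

```c
#include <stdbool.h>

#define MSI_EXAMPLE_NR_IRQS 256   /* assumed total IRQ number space */

static bool msi_example_used[MSI_EXAMPLE_NR_IRQS];

/* Old policy as described: search downward from the end of the range. */
static int msi_example_alloc_top_down(int nr_physical_irqs)
{
    int irq;

    if (nr_physical_irqs < 0)
        return -1;
    for (irq = MSI_EXAMPLE_NR_IRQS - 1; irq >= nr_physical_irqs; irq--) {
        if (!msi_example_used[irq]) {
            msi_example_used[irq] = true;
            return irq;
        }
    }
    return -1;
}

/* New policy: search upward, starting just above the physical IRQ range. */
static int msi_example_alloc_bottom_up(int nr_physical_irqs)
{
    int irq;

    if (nr_physical_irqs < 0)
        return -1;
    for (irq = nr_physical_irqs; irq < MSI_EXAMPLE_NR_IRQS; irq++) {
        if (!msi_example_used[irq]) {
            msi_example_used[irq] = true;
            return irq;
        }
    }
    return -1;
}
```

With the upward policy the numbers handed out stay clustered just above the physical range, so the result no longer depends on how large NR_IRQS happens to be.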
Architectures that do not specify CONFIG_SPARSE_IRQ do not need
to change anything - this is a transparent feature that is not
supposed to break any existing code.