Category: LINUX
2011-11-09 12:41:21
Direct memory access (DMA) is a feature of modern computers that allows certain hardware subsystems within the computer to access system memory independently of the central processing unit (CPU).
Without DMA, a CPU using programmed input/output is typically fully occupied for the entire duration of the read or write operation, and is thus unavailable to perform other work. With DMA, the CPU initiates the transfer, performs other operations while the transfer is in progress, and receives an interrupt from the DMA controller once the operation is done. This is useful any time the CPU cannot keep up with the rate of data transfer, or when the CPU can perform useful work while waiting for a relatively slow I/O data transfer. Many hardware systems use DMA, including disk drive controllers, graphics cards, network cards and sound cards. DMA is also used for intra-chip data transfer in multi-core processors. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without a DMA channel. Similarly, a processing element inside a multi-core processor can transfer data to and from its local memory without occupying its processor time, allowing computation and data transfer to proceed in parallel.
DMA can also be used for "memory to memory" copying or moving of data within memory. DMA can offload expensive memory operations, such as large copies or scatter-gather operations, from the CPU to a dedicated DMA engine.
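As a picture of what a scatter-gather operation hands to such an engine, the sketch below builds a two-entry descriptor chain in C. The structure layout and field names are hypothetical, not those of any particular controller, and the program only prints the chain that a real DMA engine would walk in hardware on its own.

```c
#include <stdint.h>
#include <stdio.h>

/* A minimal sketch of one scatter-gather descriptor.  Field names and
 * layout are hypothetical; a real controller defines its own format and
 * stores bus/physical addresses rather than C pointers. */
struct sg_descriptor {
    uint64_t src_addr;   /* source address of this chunk            */
    uint64_t dst_addr;   /* destination address of this chunk       */
    uint32_t length;     /* bytes to copy for this chunk            */
    uint64_t next;       /* address of the next descriptor, 0 = end */
};

int main(void)
{
    /* Two non-contiguous source buffers gathered into one destination. */
    static char src_a[64], src_b[128], dst[192];

    struct sg_descriptor d1, d0 = {
        .src_addr = (uintptr_t)src_a, .dst_addr = (uintptr_t)dst,
        .length = sizeof src_a,       .next = (uintptr_t)&d1,
    };
    d1 = (struct sg_descriptor){
        .src_addr = (uintptr_t)src_b, .dst_addr = (uintptr_t)dst + sizeof src_a,
        .length = sizeof src_b,       .next = 0,
    };

    /* A DMA engine would be handed the address of d0 and would walk the
     * chain itself; here we just print it to show the structure. */
    for (const struct sg_descriptor *d = &d0; d; d = (void *)(uintptr_t)d->next)
        printf("copy %u bytes from %#llx to %#llx\n", (unsigned)d->length,
               (unsigned long long)d->src_addr, (unsigned long long)d->dst_addr);
    return 0;
}
```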
A DMA controller can generate addresses and initiate memory read or write cycles. It contains several registers that can be written and read by the CPU. These include a memory address register, a byte count register, and one or more control registers. The control registers specify the I/O port to use, the direction of the transfer (reading from the I/O device or writing to the I/O device), the transfer unit (byte at a time or word at a time), and the number of bytes to transfer in one burst. [1]
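A purely illustrative model of such a register set is sketched below in C; the register names, widths, and control bit assignments are assumptions made for the example, not those of any real controller.

```c
#include <stdint.h>

/* Hypothetical memory-mapped register block of a simple DMA controller.
 * Register names, widths, and bit meanings are illustrative only. */
struct dma_controller_regs {
    volatile uint32_t mem_addr;    /* memory address register              */
    volatile uint32_t byte_count;  /* bytes remaining in the transfer      */
    volatile uint32_t control;     /* see DMA_CTRL_* bits below            */
    volatile uint32_t status;      /* completion / error flags             */
};

/* Assumed control-register layout: which I/O port to use, the transfer
 * direction, the transfer unit, and the burst length. */
#define DMA_CTRL_PORT(n)     ((uint32_t)(n) & 0xFu)          /* I/O port select   */
#define DMA_CTRL_DIR_READ    (0u << 4)                       /* device -> memory  */
#define DMA_CTRL_DIR_WRITE   (1u << 4)                       /* memory -> device  */
#define DMA_CTRL_UNIT_BYTE   (0u << 5)                       /* byte at a time    */
#define DMA_CTRL_UNIT_WORD   (1u << 5)                       /* word at a time    */
#define DMA_CTRL_BURST(n)    (((uint32_t)(n) & 0xFFu) << 8)  /* bytes per burst   */
#define DMA_CTRL_START       (1u << 31)                      /* begin the transfer*/
```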
To carry out an input, output or memory-to-memory operation, the host processor initializes the DMA controller with a count of the number of words to transfer, and the memory address to use. The CPU then sends commands to a peripheral device to initiate transfer of data. The DMA controller then provides addresses and read/write control lines to the system memory. Each time a word of data is ready to be transferred between the peripheral device and memory, the DMA controller increments its internal address register until the full block of data is transferred.
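Using the hypothetical register layout from the previous sketch, that host-side sequence might look roughly as follows. The peripheral-specific command step is omitted, and the polling loop stands in for the completion interrupt a real driver would normally use.

```c
/* Program the hypothetical DMA controller from the previous sketch for one
 * block transfer: load address and count, then start and wait. */
static void dma_start_read(struct dma_controller_regs *dma,
                           uint32_t mem_addr, uint32_t nbytes, unsigned port)
{
    dma->mem_addr   = mem_addr;               /* where the block goes      */
    dma->byte_count = nbytes;                 /* how much to move          */
    dma->control    = DMA_CTRL_PORT(port)     /* which peripheral          */
                    | DMA_CTRL_DIR_READ       /* device -> memory          */
                    | DMA_CTRL_UNIT_WORD
                    | DMA_CTRL_BURST(64)
                    | DMA_CTRL_START;

    /* The CPU would now command the peripheral to begin sending data and
     * then go do other work; the controller raises an interrupt (or sets a
     * status bit) once byte_count reaches zero. */
    while (dma->byte_count != 0)
        ;                                     /* simple polling fallback   */
}
```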
DMA transfers can occur one word at a time, allowing the CPU to access memory on alternate bus cycles; this is called cycle stealing, since the DMA controller and CPU contend for memory access. In burst-mode DMA, the CPU can be put on hold while the DMA transfer occurs, and a full block of possibly hundreds or thousands of words can be moved.[2] Where memory cycles are much faster than processor cycles, an interleaved DMA cycle is possible, in which the DMA controller uses memory during the cycles the CPU cannot use it.
In a bus mastering system, both the CPU and peripherals can be granted control of the memory bus. Where a peripheral can become bus master, it can write directly to system memory without involvement of the CPU, providing the memory address and control signals as required. Some mechanism must be provided to put the processor into a hold condition so that bus contention does not occur.
For performance reasons, the OHCI 1394 specification allows devices to bypass the operating system and access physical memory directly, without any security restrictions.[1][2] SBP2 devices can be spoofed, allowing an operating system to be tricked into letting an attacker both read and write physical memory.[3]
In addition to the nefarious uses mentioned above, there are some beneficial uses too, as the same DMA features can be used for kernel debugging purposes.[4]
Systems may be vulnerable to a DMA attack by an external device if they have a FireWire port, or if they have a PCMCIA or ExpressCard slot that allows an expansion card with a FireWire port to be installed and the operating system supports plug and play. Systems with a Thunderbolt port may also be vulnerable.[5]
An IOMMU (such as Intel's VT-d) can be used to confine a device to only a part of memory and to give it a virtual address space. The technology was developed mainly for use in virtualization, but it can also be used to prevent DMA attacks and to contain other device malfunctions. However, this technique is not used in any system specifically to prevent DMA attacks.
In computing, an input/output memory management unit (IOMMU) is a memory management unit (MMU) that connects a DMA-capable I/O bus to the main memory. Like a traditional MMU, which translates CPU-visible virtual addresses to physical addresses, the IOMMU takes care of mapping device-visible virtual addresses (also called device addresses or I/O addresses in this context) to physical addresses. Some units also provide memory protection from misbehaving devices.
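One way to picture this translation is as a page-table lookup performed on the device's behalf. The toy model below is entirely hypothetical (a single-level table with 4 KiB pages), but it shows both halves of the idea: mapping a device-visible I/O address to a physical address, and refusing accesses that fall outside the mapping, which is where the protection from misbehaving devices comes from.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define IO_PAGE_SHIFT 12u                 /* 4 KiB I/O pages (an assumption) */
#define IO_PT_ENTRIES 256u                /* toy single-level table          */

/* One I/O page-table entry: where the page really lives, plus permissions. */
struct io_pte {
    uint64_t phys_page;                   /* physical page number            */
    bool     present;
    bool     writable;
};

static struct io_pte io_page_table[IO_PT_ENTRIES];

/* Translate a device-visible I/O address the way an IOMMU conceptually
 * would: index the table, check permissions, and either return the
 * physical address or fault the access. */
static bool iommu_translate(uint64_t io_addr, bool is_write, uint64_t *phys)
{
    uint64_t vpn = io_addr >> IO_PAGE_SHIFT;

    if (vpn >= IO_PT_ENTRIES || !io_page_table[vpn].present)
        return false;                     /* device touched unmapped memory  */
    if (is_write && !io_page_table[vpn].writable)
        return false;                     /* blocked: page is read-only      */

    *phys = (io_page_table[vpn].phys_page << IO_PAGE_SHIFT)
          | (io_addr & ((1u << IO_PAGE_SHIFT) - 1));
    return true;
}

int main(void)
{
    uint64_t phys;

    /* Map I/O page 2 to physical page 0x1234, read-only. */
    io_page_table[2] = (struct io_pte){ .phys_page = 0x1234, .present = true };

    if (iommu_translate(0x2010, false, &phys))
        printf("read  of 0x2010 -> physical %#llx\n", (unsigned long long)phys);
    if (!iommu_translate(0x2010, true, &phys))
        printf("write of 0x2010 blocked by the IOMMU\n");
    return 0;
}
```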
The advantages of having an IOMMU, compared to direct physical addressing of the memory, include memory protection from misbehaving devices (a device cannot read or write memory that has not been explicitly mapped for it), the ability to present a device with a large contiguous address range even when the underlying physical memory is fragmented, and the ability of devices with limited addressing capabilities to reach all of physical memory through translation.
For system architectures in which port I/O is a distinct address space from the memory address space, an IOMMU is not used when the CPU communicates with devices via I/O ports. In system architectures in which port I/O and memory are mapped into a suitable address space, an IOMMU can translate port I/O accesses.
When an operating system is running inside a virtual machine, including systems that use paravirtualization, such as Xen, it does not usually know the host-physical addresses of memory that it accesses. This makes providing direct access to the computer hardware difficult, because if the guest OS tried to instruct the hardware to perform a direct memory access (DMA) using guest-physical addresses, it would likely corrupt the memory, as the hardware does not know about the mapping between the guest-physical and host-physical addresses for the given virtual machine. The corruption is avoided because the hypervisor or host OS intervenes in the I/O operation to apply the translations; unfortunately, this delays the I/O operation.
An IOMMU can solve this problem by re-mapping the addresses accessed by the hardware according to the same (or a compatible) translation table that is used to map guest-physical addresses to host-physical addresses.[9]
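Continuing the toy model from the IOMMU sketch above, the hypervisor's role then amounts to filling the I/O page table from its own guest-physical-to-host-physical map for the virtual machine that owns the device; the names and single-level layout below are illustrative only.

```c
#include <stdint.h>
#include <stdbool.h>

/* Reuses struct io_pte, io_page_table and IO_PT_ENTRIES from the previous
 * sketch.  gpa_to_hpa[] stands in for the hypervisor's guest-physical ->
 * host-physical mapping for one virtual machine; 0 means "not backed". */
static uint64_t gpa_to_hpa[IO_PT_ENTRIES];

/* Install the guest's view into the IOMMU table for a device assigned to
 * that guest.  Afterwards the guest OS can hand the device guest-physical
 * addresses and the IOMMU applies the translation on every access, without
 * the hypervisor having to intercept each I/O operation. */
static void iommu_mirror_guest_mapping(bool allow_writes)
{
    for (uint64_t gpn = 0; gpn < IO_PT_ENTRIES; gpn++) {
        if (gpa_to_hpa[gpn] == 0)
            continue;                          /* guest page not backed */
        io_page_table[gpn] = (struct io_pte){
            .phys_page = gpa_to_hpa[gpn],      /* host-physical page    */
            .present   = true,
            .writable  = allow_writes,
        };
    }
}
```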