2012-09-04 08:00:40


The OpenVZ blog reports the release of CRtools 0.1, which implements checkpoint/restore [CPT] functionality in user space.



  • "Suspend to both" support allows the system to be suspended after writing a hibernation image to disk. Then, should power run out before the suspended system is resumed, it can be restarted from the disk image instead.

  • The TCP small queues patch set, another piece of the solution to the bufferbloat problem, has been merged.

  • The TCP fast open protocol extension has been merged. TCP fast open is a patch out of Google that reduces the overhead of TCP connection setup, hopefully making protocols like HTTP go faster.

  • A long effort to remove the IPv4 routing cache from the networking subsystem has come to its conclusion. David Miller wrote:

The ipv4 routing cache is non-deterministic, performance wise, and is subject to reasonably easy to launch denial of service attacks. The routing cache works great for well behaved traffic, and the world was a much friendlier place when the tradeoffs that led to the routing cache's design were considered.  What it boils down to is that the performance of the routing cache is a product of the traffic patterns seen by a system rather than being a product of the contents of the routing tables. The replacement code simplifies the networking subsystem and, hopefully, gives better performance on high-volume systems.

  • Some initial work has been done to separate the dynamic tick code from the idle task, setting the ground for stopping the timer tick on non-idle CPUs.

  • The power domains subsystem has seen some integration with the cpuidle code to handle situations where devices share power lines with CPU cores.

  • The VFS layer has seen some significant changes. There is a new atomic_open() inode operation that combines the process of looking up, possibly creating, and opening a file into a single, atomic operation. The whole "open intents" mechanism has been removed. Numerous other operations have had prototype changes. Patches that simplify the process of cleaning up file structures have also been merged.

  • There is new support for I/O memory management units, intended to help enable safe device access for virtualized guests.



Goal: split out the user-space API content of the kernel header files in the include and arch/xxxxxx/include directories, placing that content into corresponding headers created in new uapi/ subdirectories that reside under each of the original directories.

Benefit: it simplifies and reduces the size of the kernel-only headers. More importantly, splitting out the user-space APIs into separate headers has the desirable consequence that it "simplifies the complex interdependencies between headers that are [currently] partly exported to userspace".

Core implementation: everything inside that block that is not nested within a block governed by #ifdef __KERNEL__ should move to the corresponding uapi/ header file. The content inside the #ifdef __KERNEL__ block remains in the original header file, but the #ifdef __KERNEL__ and its accompanying #endif are removed.
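The splitting rule above can be illustrated with a small Python sketch. This is not the script used for the real kernel split; the function name `split_header` is hypothetical, and the sketch makes simplifying assumptions: it treats the whole file as the block being split (so the header guard lines land on the uapi side), and it does not handle an `#else` branch of `#ifdef __KERNEL__`.

```python
import re


def split_header(text):
    """Split header text into (uapi_lines, kernel_lines).

    Lines inside a top-level `#ifdef __KERNEL__` ... `#endif` pair stay on the
    kernel side, with the pair itself removed. Everything else goes to uapi.
    """
    uapi, kernel = [], []
    depth = 0  # preprocessor nesting depth inside the __KERNEL__ block
    for line in text.splitlines():
        stripped = line.strip()
        if depth == 0:
            if re.match(r"#\s*ifdef\s+__KERNEL__\b", stripped):
                depth = 1  # drop the #ifdef __KERNEL__ line itself
            else:
                uapi.append(line)
            continue
        if re.match(r"#\s*if", stripped):  # matches #if, #ifdef, #ifndef
            depth += 1
        elif re.match(r"#\s*endif\b", stripped):
            depth -= 1
            if depth == 0:
                continue  # drop the #endif that closes __KERNEL__
        kernel.append(line)
    return uapi, kernel
```

Calling `split_header()` on the text of a header returns two lists of lines: the first would go into the new uapi/ header, the second would remain in the original kernel-only header.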



The 3.5 kernel was released one day faster than the 3.4 kernel was, in 62 days. The last time a kernel was released this quickly was back in 2005 with the 2.6.14 kernel release (61 days).

571987 lines added

358836 lines removed

135848 lines modified


1 Random numbers for embedded devices

The root of the insecurity in embedded devices is a lack of entropy sources early in boot. As Zakir Durumeric, Nadia Heninger, J. Alex Halderman, and Eric Wustrow have documented, many of the latter class of systems are at risk, mostly as a result of keys generated with insufficient randomness and predictable initial conditions.


Solution: Ted Ts'o has put together a patch set designed to improve the amount of randomness available in the system from when it first boots.


  • One change is to fix the internal add_interrupt_randomness() function: adding randomness from interrupts should be fast and effective, so it is done by default for all interrupts.

  • Next, the patch set adds a new function: void add_device_randomness(const void *buf, unsigned int size); The purpose is to allow drivers to mix in device-specific data that, while not necessarily random, is system-specific and unpredictable. Examples include serial, product, and manufacturer information from attached USB devices.

  • Ted's patch set also changes the use of the hardware random number generator built into a number of CPUs. Rather than return random numbers directly from the hardware, the code now mixes hardware random data into the kernel's entropy pool and generates random numbers from there. His reasoning is that using hardware random numbers directly requires placing a lot of trust in the manufacturer.
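The data flow in the last two points (mix device-specific and hardware-generated bytes into a pool, then generate output from the pool instead of handing hardware bytes straight to the caller) can be sketched in Python. This is only a toy: the class `ToyPool` and its methods are hypothetical names, and the SHA-256 construction is an assumption chosen for brevity, not the kernel's actual mixing or extraction algorithm.

```python
import hashlib


class ToyPool:
    """Toy entropy pool showing the data flow only, not the kernel's algorithm."""

    def __init__(self):
        self._state = bytes(32)
        self._counter = 0

    def mix(self, data: bytes) -> None:
        # Fold new input into the pool state; the input is never returned as-is.
        self._state = hashlib.sha256(self._state + data).digest()

    def extract(self, n: int) -> bytes:
        # Derive output from the pool state rather than from any single source.
        out = b""
        while len(out) < n:
            self._counter += 1
            block = self._state + self._counter.to_bytes(8, "big")
            out += hashlib.sha256(block).digest()
        # Stir the state so later output does not repeat earlier output.
        self._state = hashlib.sha256(b"stir" + self._state).digest()
        return out[:n]
```

A caller would first `mix()` in something like a device serial number, then `mix()` in bytes from the hardware generator, and only then call `extract()`. Two systems that receive identical hardware bytes but have different device data still end up with different output.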



The goal of TSQ (TCP Small Queues) is to reduce the number of TCP packets in the transmit queues (qdisc and device queues), in order to reduce RTT and cwnd bias, which are part of the bufferbloat problem.


A number of bloat-fighting changes have gone into the kernel over the last year. One of them works to prevent packets from building up in router queues over time. At a much lower level, byte queue limits put a cap on the amount of data that can be waiting to go out a specific network interface. Byte queue limits work only at the device queue level, though, while the networking stack has other places, such as the queueing discipline level, where buffering can happen. So there would be value in an implementation that could limit buffering at levels above the device queue.

Eric Dumazet's TCP small queues patch looks like it should be able to fill at least part of that gap. It limits the amount of data that can be queued for transmission by any given socket regardless of where the data is queued, so it shouldn't be fooled by buffers lurking in the queueing, traffic control, or netfilter code. That limit is set by a new sysctl knob found at:


The default value of this limit is 128KB.
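The idea of a per-socket cap can be modelled in a few lines of Python. This is a toy, not the kernel's logic: `ToySocket`, `send`, and `tx_complete` are hypothetical names, and the throttling rule used here (hold new packets in the socket once the bytes already handed down reach the limit, release them as transmissions complete) is an assumption made for illustration.

```python
class ToySocket:
    """Toy model of a per-socket limit on bytes queued below the socket."""

    def __init__(self, limit=128 * 1024):
        self.limit = limit
        self.queued = 0    # bytes handed to lower layers, not yet transmitted
        self.backlog = []  # packet sizes held back inside the socket

    def send(self, size):
        """Return True if the packet went down the stack, False if held back."""
        if self.queued >= self.limit:
            self.backlog.append(size)
            return False
        self.queued += size
        return True

    def tx_complete(self, size):
        """Called when the lower layers finish transmitting `size` bytes."""
        self.queued -= size
        # Release held packets while the socket is back under its limit.
        while self.backlog and self.queued < self.limit:
            self.queued += self.backlog.pop(0)
```

With a small limit such as `ToySocket(limit=3000)`, two 1500-byte sends go straight down, a third is held in the socket, and it is released once `tx_complete(1500)` reports that one of the earlier packets has left.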




Torvalds has put out a proposal to add distribution-specific kernel configuration options. See the link for the proposed solution; it has not been implemented yet.



Current shortcoming: while btrfs makes the creation and management of snapshots easy, it currently lacks the ability to efficiently determine what the differences are between two snapshots and save that information for future use. With Alexander Block's patches, btrfs can be instructed to calculate the set of changes made between two snapshots and serialize them to a file. That file can then be replayed elsewhere, possibly at some future time, to regenerate one snapshot from the other.
The primary use case for this feature (which is clearly patterned after the ZFS send/receive functionality) is backups in various forms.
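The send/receive idea (compute the changes between two snapshots, serialize them, replay them elsewhere) can be sketched in Python with snapshots modelled as dictionaries that map paths to file contents. All names here (`diff_snapshots`, `send`, `receive`) and the JSON stream format are hypothetical; the real btrfs stream format and commands are not shown in the text and are not reproduced here.

```python
import json


def diff_snapshots(old, new):
    """Return the list of operations that turns snapshot `old` into `new`."""
    ops = []
    for path in sorted(old.keys() - new.keys()):
        ops.append({"op": "unlink", "path": path})
    for path in sorted(new):
        if old.get(path) != new[path]:
            ops.append({"op": "write", "path": path, "data": new[path]})
    return ops


def send(old, new):
    """Serialize the changes between two snapshots to a string."""
    return json.dumps(diff_snapshots(old, new))


def receive(base, stream):
    """Replay a serialized change stream on top of a copy of `base`."""
    result = dict(base)
    for op in json.loads(stream):
        if op["op"] == "unlink":
            result.pop(op["path"], None)
        else:
            result[op["path"]] = op["data"]
    return result
```

For a backup, the stream produced by `send(old, new)` would be stored or shipped to another machine that already holds a copy of `old`; `receive()` there regenerates `new` without transferring unchanged files.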


"Mobile" and "embedded" no longer mean "tiny."
The current name is not ARM 64 but AArch64 ("AArch64" is ARMv8's 64-bit operating mode) or arm64.

3 Linux power management: The documentation I wanted to read

The main imposition made by the PM core is the over-all sequencing of suspend and resume.


1 A UEFI secure boot and TianoCore info page

James Bottomley has distilled his hard-earned knowledge of how to set up UEFI secure boot with QEMU and the TianoCore system and placed it into a web page.


The D-Bus interprocess communication mechanism has, over the years, become a standard component of the Linux desktop. Characteristics of D-Bus: multicast functionality is inherently a part of the protocol; one message can be sent to multiple recipients. D-Bus promises reliable delivery, where "reliable" means that messages arrive in the order in which they were sent and multicast messages will either be delivered to all recipients or, if that is not possible, to none.

The current D-Bus implementation uses Unix-domain sockets and a central routing daemon. It works, but the routing daemon adds context switches, overhead, and latency to each message it handles. The kernel is unable to help get high-priority messages delivered first, so all messages cause wakeups that slow down the processing of the most important ones; see the linked description of how these problems can affect a running system.

Current improvement: implement a new address family designed to meet the needs of D-Bus. It provides the reliable delivery that D-Bus requires; it also has the ability to pass file descriptors and credentials from one process to another. The security mechanism is built in, with the netfilter code (augmented with a new D-Bus message parser) used to control which messages can actually be delivered to any specific process. However, there are currently objections, so a merge is unlikely in the short term.

3 Better documentation: the window of naive interest

Discusses how to write better documentation: start from a "big picture" understanding, and explain "why", not "what".




