Category: LINUX
2012-08-05 15:05:20
1
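A discussion about using QR codes to display kernel crash information; its prospects are unclear. The appeal of a QR code is that it can compress a fair amount of data into a form that can be digested elsewhere.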
2
The kernel logging code has run into the following problem: when the kernel outputs a partial message (by passing a string to printk() that does not end with a newline), the logging system will buffer the text until the rest of the message arrives.
If a driver does
printk("testing the frobnozzle ...");
do_test();
printk(" OK\n");
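and do_test() hangs, the buffered "testing the frobnozzle ..." text is never emitted, so the log gives no hint of where the system stopped.
There is not yet a consensus on how to handle the trouble caused by this buffering.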
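The buffering behavior described above can be modeled in user space. The following is a minimal sketch, not kernel code: `struct line_logger`, `logger_init()`, and `logger_print()` are hypothetical names invented for illustration, and the fixed buffer sizes are arbitrary assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model of a line-buffering logger; not kernel code. */
struct line_logger {
    char pending[256];   /* partial text still waiting for a newline */
    char emitted[1024];  /* complete lines that reached the "log" */
};

static void logger_init(struct line_logger *lg)
{
    assert(lg != NULL);
    lg->pending[0] = '\0';
    lg->emitted[0] = '\0';
}

/* Append msg; text moves to the log only once a newline completes the line. */
static void logger_print(struct line_logger *lg, const char *msg)
{
    assert(lg != NULL && msg != NULL);
    strncat(lg->pending, msg, sizeof(lg->pending) - strlen(lg->pending) - 1);

    char *nl;
    while ((nl = strchr(lg->pending, '\n')) != NULL) {
        size_t len = (size_t)(nl - lg->pending) + 1;
        size_t room = sizeof(lg->emitted) - strlen(lg->emitted) - 1;
        strncat(lg->emitted, lg->pending, len < room ? len : room);
        /* Shift whatever follows the newline to the front of the buffer. */
        memmove(lg->pending, nl + 1, strlen(nl + 1) + 1);
    }
}
```

If the caller never delivers the closing `" OK\n"`, the first fragment stays in `pending` and nothing reaches `emitted`, which mirrors the lost message in the hang scenario.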
3 Tightening security: not for the impatient
The difficult journey of security patches into the kernel, which has lasted more than ten years.
Consider the classic symbolic link vulnerability, wherein an attacker fools a privileged program into writing to a file behind an attacker-controlled symbolic link. Such vulnerabilities can be exploited to overwrite files that the attacker would not otherwise have access to.
Kees Cook has proposed a way to deal with this class of vulnerabilities. It is based on the observations that symbolic link vulnerabilities almost always involve links placed in /tmp, and that /tmp has the "sticky" bit set in any contemporary distribution. Given that:
The solution is to permit symlinks to only be followed when outside a sticky world-writable directory, or when the uid of the symlink and follower match, or when the directory owner matches the symlink's owner.
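The rule above reduces to three checks. The following is a minimal sketch of that decision as a pure function; `struct link_follow` and `may_follow_link()` are hypothetical names for illustration and do not reflect how the actual patch is structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical description of one attempt to follow a symlink. */
struct link_follow {
    bool dir_sticky_world_writable; /* parent directory is sticky and world-writable */
    uid_t dir_uid;                  /* owner of the parent directory */
    uid_t link_uid;                 /* owner of the symlink itself */
    uid_t follower_uid;             /* uid of the process following the link */
};

/* Returns true if the rule quoted above would permit following the link. */
static bool may_follow_link(const struct link_follow *f)
{
    assert(f != NULL);
    if (!f->dir_sticky_world_writable)
        return true;                 /* outside a sticky world-writable directory */
    if (f->link_uid == f->follower_uid)
        return true;                 /* follower owns the link */
    if (f->dir_uid == f->link_uid)
        return true;                 /* directory owner created the link */
    return false;
}
```

In the classic attack, an unprivileged user plants a link in /tmp and a root-owned process follows it: the directory is sticky, the link owner differs from both the follower and the directory owner, so the function returns false.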
So Kees thinks that his current patch set should be considered for merging, finally. The patches implement the symbolic link restrictions, but also add a new rule for hard links: a hard link to a file can only be created if the user owns the file or has write access to it. Once again, this change eliminates a class of attacks, but at a small cost: older versions of the "at" daemon break unless a small patch is applied.
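The hard link rule is simple enough to state as a predicate. This sketch encodes only the condition described above; `struct hardlink_request` and `may_create_hardlink()` are hypothetical names, and any further conditions the real patches may apply are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical description of one attempt to create a hard link. */
struct hardlink_request {
    uid_t user_uid;       /* uid of the user calling link() */
    uid_t file_uid;       /* owner of the link target */
    bool user_can_write;  /* whether the user has write access to the target */
};

/* Allowed only if the user owns the file or could write to it. */
static bool may_create_hardlink(const struct hardlink_request *r)
{
    assert(r != NULL);
    return r->user_uid == r->file_uid || r->user_can_write;
}
```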
Another vulnerability: on Linux systems, there is a sysctl knob (suid_dumpable) that controls whether a crashing setuid process generates a core dump or not. Setting it to a non-zero value allows core dumps to happen; setting it to two applies certain restrictions that are intended to make it safe. But, Kees says, that's not the case.
1
A new mechanism for storage devices: the OS can use it to give the firmware hints for optimization.
"Contexts" are small numbers added to I/O requests that are intended
to help the device optimize the execution of those requests. They are meant to
differentiate different types of I/O, keeping large, sequential operations
separate from small, random requests. I/O can be placed into a "large
unit" context, where the operating system promises to send large requests
and, possibly, not attempt to read the data back until the context has been
closed.
But there is no consensus on how to implement this.
For a flash device, the effect
of such an implementation would be to concentrate data written under any one
context into the same erase block(s).
2
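Paolo Bonzini recently posted a patch making a couple
of changes to msync(), but getting it merged will not be easy, because it may change the behavior of applications, even though those applications are not necessarily correct.
At present, MS_ASYNC is effectively the kernel's default behavior; the patch wants to start I/O immediately.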
There are a few options to msync(), one of which (MS_ASYNC) asks that the writeback
of modified pages be "scheduled," but not necessarily completed
immediately. It is meant to be a non-blocking system call that sets the
necessary actions in motion, but does not wait for them to complete. Current
kernels will write back dirty pages as part of the normal writeback process;
the system behaves, in other words, as if msync(MS_ASYNC) were being called on
a regular basis on every mapping. Writeback of dirty pages is already scheduled
as soon as the page is dirtied. Given that, there's not much work for an
explicit MS_ASYNC call from user space to do, and, indeed, the kernel
essentially ignores such calls.
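For reference, this is what an MS_ASYNC call looks like from user space. It is a minimal sketch: `dirty_and_msync_async()` is a hypothetical helper that uses a temporary file as a stand-in for real application data.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Dirty one page of a shared file mapping, then ask for asynchronous
 * writeback. Returns 0 on success, -1 on failure.
 */
static int dirty_and_msync_async(void)
{
    FILE *fp = tmpfile();               /* scratch file, removed on close */
    if (fp == NULL)
        return -1;

    int fd = fileno(fp);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || ftruncate(fd, page) != 0) {
        fclose(fp);
        return -1;
    }

    char *map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        fclose(fp);
        return -1;
    }

    strcpy(map, "dirty data");          /* the page is now dirty */

    /* Non-blocking request; it does not wait for the I/O to complete. */
    int rc = msync(map, (size_t)page, MS_ASYNC);

    munmap(map, (size_t)page);
    fclose(fp);
    return rc;
}
```

A caller that needs the data on disk before continuing would pass MS_SYNC instead, which blocks until writeback completes.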
The following change will not be easy to merge either:
msync() takes two parameters indicating the starting address and length of the memory
area to be written back. But the kernel has always ignored those parameters,
choosing instead to just write back all modified pages in the file, and the
related metadata as well. Paolo's patch changes the implementation to only
synchronize the specific pages requested by the user.
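If only the requested pages are synchronized, the range an application passes starts to matter. msync() requires a page-aligned start address, so a caller that wants to flush an arbitrary byte range has to widen it to page boundaries first. The helper below is a sketch of that arithmetic; `struct page_range` and `page_align_range()` are hypothetical names, and the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A page-aligned range suitable for passing to msync(). */
struct page_range {
    uintptr_t start;  /* rounded down to a page boundary */
    size_t length;    /* covers through the end of the last touched page */
};

/* Widen [addr, addr + len) outward to page boundaries. */
static struct page_range page_align_range(uintptr_t addr, size_t len,
                                          size_t page_size)
{
    /* The masking below only works for power-of-two page sizes. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uintptr_t mask = (uintptr_t)page_size - 1;
    uintptr_t start = addr & ~mask;
    uintptr_t end = (addr + len + mask) & ~mask;

    struct page_range r = { start, (size_t)(end - start) };
    return r;
}
```

A 10-byte range that straddles a page boundary widens to two full pages, which is the smallest region the narrower msync() behavior could write back for that request.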
3 Proposals for Kernel Summit discussions
How Kernel Summit attendees are selected:
Those interested in attending are being asked to describe the technical
expertise they will bring to the meeting, as well as to suggest topics for
discussion.
So far, the proposed topics focus more on the ecosystem of Linux kernel development, with fewer technical topics.
There tends to be a focus on more process and social aspects of the kernel at
the summit, mostly because the hardcore technical topics are generally better
handled by a more focused group. The summit tries to address global concerns,
and there seem to be plenty to choose from.
1
A new string-handling interface.
2
A long-standing problem: the "holy grail" is a single kernel binary that will boot on any ARM device.
There are four areas of effort. Cleaning up and consolidating the header files within the various ARM sub-architectures is one, while consolidating ARM drivers is another. In addition, device tree will provide a way to specify the differences between ARM SoCs at runtime. Finally, doing active maintenance of the ARM tree, keeping in mind the big picture, will also help.
3
ARM's big.LITTLE architecture is an example of asymmetric multiprocessing where all CPUs are instruction-set compatible, but where different CPUs have very different performance and energy-efficiency characteristics.
Earlier work: (a) an approach termed the "big.LITTLE Switcher"; (b) a scheduler minisummit on big.LITTLE systems at last February's Linaro Connect in the Bay Area.
One important task is to completely eliminate the overhead of per-kthread creation, teardown, and migration. Thomas posted a patchset that moves idle-task creation to common code. This patchset has been well received thus far, and went into mainline during the 3.5 merge window. Thomas has since followed up with a new patchset that allows kthreads to be parked and unparked. The new kthread_create_on_cpu(), kthread_should_park(), kthread_park(), and kthread_unpark() APIs can be applied to the per-CPU kthreads that are now created and destroyed on each CPU-hotplug cycle.
Another interesting item is adding minimal support to the scheduler for asymmetric cores. There has been great progress in a number of areas. First, Paul Turner posted a new version of his patchset. This patchset should allow the scheduler to make better (and more power-aware) task-placement and load-balancing decisions. Second, Morten Rasmussen ran some experiments (including experimental patches) on top of Paul Turner's patchset. See below for more information. Third, Peter Zijlstra proposed removing sched_mc and also proposed increasing scheduler awareness of hardware topology. This should allow the scheduler to better handle asymmetric architectures such as ARM's big.LITTLE architecture. Finally, Juri Lelli posted about a prototype.
1
2
Users of the kernel's red-black trees must provide their own functions for inserting nodes into the tree and performing searches. There is some appeal to being able to hand-code the search and insertion functions, but there would also be value in generic implementations. There are currently two competing proposals.
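The trade-off between hand-coded and generic tree functions can be illustrated in user space. The sketch below is not the kernel's rbtree API and does no rebalancing; `struct node`, `search_int()`, `search_generic()`, `cmp_int_key()`, and `insert_int()` are all hypothetical names. It only contrasts a search with the comparison written inline against one that takes a comparison callback.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a tree node; a real red-black tree also
 * stores a color and rebalances on insertion. */
struct node {
    struct node *left, *right;
    int key;
};

/* Hand-coded search: the comparison is written inline for an int key. */
static struct node *search_int(struct node *root, int key)
{
    while (root != NULL) {
        if (key < root->key)
            root = root->left;
        else if (key > root->key)
            root = root->right;
        else
            return root;
    }
    return NULL;
}

typedef int (*cmp_fn)(const void *key, const struct node *n);

/* Generic search: one implementation for every user, with an indirect
 * call to the comparison function at each level of the tree. */
static struct node *search_generic(struct node *root, const void *key,
                                   cmp_fn cmp)
{
    while (root != NULL) {
        int c = cmp(key, root);
        if (c < 0)
            root = root->left;
        else if (c > 0)
            root = root->right;
        else
            return root;
    }
    return NULL;
}

/* Comparison callback for int keys: negative, zero, or positive. */
static int cmp_int_key(const void *key, const struct node *n)
{
    int k = *(const int *)key;
    return (k > n->key) - (k < n->key);
}

/* Plain binary-tree insertion, without the rebalancing step. */
static void insert_int(struct node **root, struct node *new_node)
{
    assert(root != NULL && new_node != NULL);
    struct node **link = root;
    while (*link != NULL)
        link = (new_node->key < (*link)->key) ? &(*link)->left
                                              : &(*link)->right;
    new_node->left = new_node->right = NULL;
    *link = new_node;
}
```

The two searches walk the tree identically; the generic one gives up the inlined comparison in exchange for not having to be rewritten for each key type.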
3
A "volatile range" is a set of pages in memory containing data that
might be useful to an application at some point in the future; a key point is
that, if the need arises, the application is able to reacquire (or regenerate)
that data from another source.
Usage:
To give up the range (mark it volatile):
fallocate(fd, FALLOCATE_FL_MARK_VOLATILE, offset, len);
After the call completes, the kernel is not obligated to keep that range in
memory, and is not obligated to write that range to backing store before
reclaiming it.
When the range is actually needed again:
fallocate(fd, FALLOCATE_FL_UNMARK_VOLATILE, offset, len);
If the indicated range is still present in memory, the call will return zero
and the application can proceed to work with the data. If, instead, any part of
the range has been purged by the kernel since it was marked volatile, a
non-zero return value will inform the application that it needs to find that
data somewhere else.
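The application-side logic implied by these two calls can be sketched without the proposed fallocate() flags, which cannot be assumed to exist in any installed header. The model below is purely a user-space simulation: `struct vrange` and every function in it are hypothetical, and `vrange_simulate_purge()` stands in for the kernel reclaiming the pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one volatile range; real state would live in the kernel. */
struct vrange {
    bool is_volatile;  /* marked volatile and not yet unmarked */
    bool purged;       /* contents were discarded while volatile */
};

/* Models the "mark volatile" call. */
static void vrange_mark_volatile(struct vrange *r)
{
    assert(r != NULL);
    r->is_volatile = true;
}

/* Models the kernel reclaiming the range; only legal while it is volatile. */
static void vrange_simulate_purge(struct vrange *r)
{
    assert(r != NULL);
    if (r->is_volatile)
        r->purged = true;
}

/* Models the "unmark volatile" call: 0 if the data survived, 1 if purged. */
static int vrange_unmark_volatile(struct vrange *r)
{
    assert(r != NULL);
    int was_purged = r->purged ? 1 : 0;
    r->is_volatile = false;
    r->purged = false;
    return was_purged;
}

/* Application pattern: reuse the data if intact, otherwise regenerate it.
 * Returns true when regeneration was needed. */
static bool reuse_or_regenerate(struct vrange *r, int *regenerations)
{
    assert(regenerations != NULL);
    if (vrange_unmark_volatile(r) != 0) {
        (*regenerations)++;   /* stand-in for refetching or recomputing */
        return true;
    }
    return false;
}
```

The return value of the unmark step is the only signal the application gets, so every access to a previously volatile range has to go through a check like `reuse_or_regenerate()`.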