
Category: C/C++

2010-09-04 20:14:55

Software optimization resources

See my new blog

Optimization manuals

This series of five manuals describes everything you need to know about optimizing code for x86 and x86-64 family microprocessors, including optimization advice for C++ and assembly language, details about the microarchitecture and instruction timings of Intel, AMD and VIA processors, and details about different compilers and calling conventions.

Intel microprocessors covered: Intel Pentium 1 through Pentium 4, Pentium D, Pentium M, Core Duo, Core 2, Core i7, Atom, but not Itanium. AMD microprocessors covered: Athlon 64, Opteron. VIA microprocessors covered: Nano. Operating systems covered: DOS, Windows, Linux, BSD, Intel-based Mac OS X. Includes coverage of 64-bit systems.

Note that these manuals are not for beginners.

1. Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms
This is an optimization manual for advanced C++ programmers. Topics include: The choice of platform and operating system. Choice of compiler and framework. Finding performance bottlenecks. The efficiency of different C++ constructs. Multi-core systems. Parallelization with vector operations. CPU dispatching. Efficient container class templates. Etc.
 
File name: optimizing_cpp.pdf, size: 871305, last modified: 2010-Feb-16.
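
To illustrate the "CPU dispatching" topic listed above, here is a minimal sketch and not code taken from the manual. It assumes GCC or Clang on x86, where the compiler builtin `__builtin_cpu_supports` is available; on other compilers or platforms it falls back to the generic version. The function names `sum_generic`, `sum_unrolled`, `select_sum` and `sum_dispatch` are hypothetical, and `sum_unrolled` is only a plain C++ stand-in for a real vectorized code path.

```cpp
#include <cstddef>

// Generic fallback that runs on any CPU.
static long sum_generic(const int* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Stand-in for a CPU-specific version (hypothetical). A real one would
// use vector instructions; this one is just unrolled by four.
static long sum_unrolled(const int* data, std::size_t n) {
    long t0 = 0, t1 = 0, t2 = 0, t3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        t0 += data[i];
        t1 += data[i + 1];
        t2 += data[i + 2];
        t3 += data[i + 3];
    }
    for (; i < n; ++i) t0 += data[i];  // remaining elements
    return t0 + t1 + t2 + t3;
}

using SumFn = long (*)(const int*, std::size_t);

// Pick an implementation based on what the running CPU reports.
static SumFn select_sum() {
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();  // assumption: GCC/Clang builtin, x86 only
    if (__builtin_cpu_supports("avx2")) return sum_unrolled;
#endif
    return sum_generic;
}

// The selection runs once, on the first call; later calls reuse the pointer.
long sum_dispatch(const int* data, std::size_t n) {
    static const SumFn impl = select_sum();
    return impl(data, n);
}
```

Callers only see `sum_dispatch`, so both code paths must return the same result for the same input.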
 
2. Optimizing subroutines in assembly language: An optimization guide for x86 platforms
This is an optimization manual for advanced assembly language programmers and compiler makers. Topics include: C++ intrinsic functions, inline assembly and stand-alone assembly. Linking optimized assembly subroutines into high level language programs. Making subroutine libraries compatible with multiple compilers and operating systems. Optimizing for speed or size. Memory access. Loops. Vector programming (XMM, YMM, SIMD). CPU-specific optimization and CPU dispatching.
 
File name: optimizing_assembly.pdf, size: 854707, last modified: 2010-Feb-16.
 
3. The microarchitecture of Intel, AMD and VIA CPUs: An optimization guide for assembly programmers and compiler makers
This manual contains details about the internal workings of various microprocessors from Intel, AMD and VIA. Topics include: Out-of-order execution, register renaming, pipeline structure, execution unit organization and branch prediction algorithms for each type of microprocessor. Describes many details that cannot be found in manuals from microprocessor vendors or anywhere else. The information is based on my own research and measurements rather than on official sources. This information will be useful to programmers who want to make CPU-specific optimizations as well as to compiler makers and students of microarchitecture.
 
File name: microarchitecture.pdf, size: 1294739, last modified: 2010-Feb-16.
 
4. Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs
Contains detailed lists of instruction latencies, execution unit throughputs, micro-operation breakdown and other details for all application instructions of most microprocessors from Intel, AMD and VIA. Intended as an appendix to the preceding manuals.
 
File name: instruction_tables.pdf, size: 1352052, last modified: 2010-Feb-16.
 
5. Calling conventions for different C++ compilers and operating systems
This document contains details about data representation, function calling conventions, register usage conventions, name mangling schemes, etc. for many different C++ compilers and operating systems. Discusses compatibilities and incompatibilities between different C++ compilers. Includes information that is not covered by the official Application Binary Interface standards (ABIs). The information provided here is based on my own research and is therefore descriptive rather than normative. Intended as a source of reference for programmers who want to make function libraries compatible with multiple compilers or operating systems and for makers of compilers and other development tools who want their tools to be compatible with existing tools.
 
File name: calling_conventions.pdf, size: 366154, last modified: 2010-Feb-16.
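
As a small illustration of why name mangling matters for cross-compiler libraries, the sketch below (not taken from the manual) declares the same function twice: once with C linkage and once with ordinary C++ linkage. The names `add_ints` and `demo` are hypothetical. With `extern "C"`, the compiler exports the function under a C-style symbol name; the C++ version gets a compiler-specific mangled name that encodes the namespace and the parameter types.

```cpp
// C linkage: the exported symbol is not mangled by the C++ compiler,
// so code built with a different compiler can link to it by name.
extern "C" int add_ints(int a, int b) {
    return a + b;
}

namespace demo {
// C++ linkage: the symbol name is mangled in a compiler-specific way.
int add_ints(int a, int b) {
    return a + b;
}
}  // namespace demo
```

Inspecting the resulting object file with a symbol-listing tool should show the two different symbol names. Note that `extern "C"` only affects the symbol name and linkage; the calling convention itself is still determined by the platform and compiler settings.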
 
All five manuals
Download all the above manuals together in one zip file.
 
File name: optimization_manuals.zip, size: 3801138, last modified: 2010-Feb-16.
 

The manuals are in .pdf format; you need a PDF reader to open them.


These are some of the test programs I have used for my research. You can use them for testing how many clock cycles a piece of assembly or C++ code takes. They can also count cache misses, branch mispredictions, resource stalls, etc. They support Intel, AMD and VIA processors and include different versions for 16, 32 and 64 bit mode, Windows and Linux.

File name: testp.zip, size: 293164, last modified: 2010-Jan-17.
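
For readers who only need a rough timing and not the cycle and event counts described above, the following is a minimal portable sketch. It is not part of testp.zip: it measures wall-clock time with `std::chrono::steady_clock` instead of reading CPU performance counters, and the helper name `min_time_ns` is hypothetical. Taking the minimum over several runs reduces the influence of interrupts and cold caches.

```cpp
#include <chrono>
#include <cstdint>

// Run `work` `repeats` times and return the shortest observed duration
// in nanoseconds. Returns -1 if `repeats` is zero or negative.
template <typename F>
std::int64_t min_time_ns(F&& work, int repeats = 10) {
    using clock = std::chrono::steady_clock;
    std::int64_t best = -1;
    for (int i = 0; i < repeats; ++i) {
        const auto start = clock::now();
        work();
        const auto stop = clock::now();
        const std::int64_t ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        if (best < 0 || ns < best) best = ns;
    }
    return best;
}
```

A call such as `min_time_ns([&] { result = my_function(input); })` times one call of a hypothetical `my_function`. Make sure the result is actually used afterwards, otherwise the compiler may optimize the measured work away.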


This utility can be used for converting object files between COFF/PE, OMF, ELF and Mach-O formats for all 32-bit and 64-bit x86 platforms. It can modify symbol names in object files, build, modify and convert function libraries across platforms, and dump object files and executable files. It also includes a very good disassembler supporting the SSE4, AVX, FMA and XOP instruction sets. Source code included (GPL).

File name: objconv.zip, size: 710609, last modified: 2009-Jul-16.


This is a library of optimized subroutines coded in assembly language. The functions in this library can be called from C++ and other compiled high-level languages. Supports many different compilers under Windows, Linux, BSD and Mac OS X operating systems, 32 and 64 bits. The library contains faster versions of common C/C++ functions such as memcpy, memmove, memset, strcpy, strcat, strlen, as well as round functions, CPU identification functions, etc.

The package contains library files in many different file formats, a C++ header file and assembly language source code. The GNU General Public License applies.

File name: asmlib.zip, size: 226920, last modified: 2009-Jan-22.


This is a program that can change the CPUID vendor string, family and model number on VIA Nano processors. See my blog for a discussion of the purpose of this program.

File name: cpuidfake.zip, size: 67593, last modified: 2010-Aug-08.
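
For context, the CPUID vendor string that this program changes can be read from user code. The sketch below is not part of cpuidfake.zip; it assumes GCC or Clang on x86, where the header `<cpuid.h>` provides `__get_cpuid`. The function name `cpu_vendor_string` and the macro `DEMO_HAVE_CPUID_H` are hypothetical. On other compilers or platforms the function returns an empty string.

```cpp
#include <cstring>
#include <string>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>  // assumption: GCC/Clang header, x86 only
#define DEMO_HAVE_CPUID_H 1
#endif

// Returns the 12-character vendor string from CPUID leaf 0,
// or an empty string if CPUID is not available in this build.
std::string cpu_vendor_string() {
#ifdef DEMO_HAVE_CPUID_H
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) return std::string();
    char text[12];
    // The vendor string is stored in the order EBX, EDX, ECX.
    std::memcpy(text + 0, &ebx, 4);
    std::memcpy(text + 4, &edx, 4);
    std::memcpy(text + 8, &ecx, 4);
    return std::string(text, sizeof text);
#else
    return std::string();
#endif
}
```

Software that selects a code path by comparing this string against a known vendor name behaves differently when the string is changed, which is what makes such a tool useful for testing.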


Agner's CPU blog www.agner.org/optimize/blog

Masm Forum

ASM Community Messageboard

Linux Assembly

Hutch's Assembly pages

Iczelion's Win32 Assembly Homepage

CPU-id tools and information

Programmer's heaven assembler zone

X-bit Labs articles on microprocessors

Virtual sandpile x86 Processor information

intel-assembler programmers guides and manuals

Online computer books

FASM assembler and messageboard

YASM assembler

NASM assembler

JWASM assembler

Intel resources

Reference manuals and other documents can be found at Intel's web site. Intel's web site is reorganized so often that any link I could provide here to specific documents would be broken after a few months. I therefore recommend that you use the search facilities at developer.intel.com and search for "Software Developer's Manual" and "Optimization Reference Manual".

AMD resources

Microsoft resources

MASM manuals
