2012-05-07 10:15:30

Original article: M5 simulator overview. Author: wuweidan

I will complete this soon...

Overview
    M5 is a full-system simulator written in C++ and Python. Major system components are represented as C++ objects, and the Python configuration allows these components to be composed flexibly into different system configurations. M5 supports 3 CPU models, ranging from a simple, functional one-CPI CPU to a detailed out-of-order model that supports SMT.
    M5 is an event-driven simulator and supports two types of simulation: syscall emulation and full system. The former supports ALPHA, MIPS, SPARC, ARM, X86 and POWER; it runs statically compiled programs and emulates any system calls they make. The latter is best supported on ALPHA, where it runs Linux. M5 needs to be compiled and installed before use.
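Before turning to the build, here is a minimal sketch of what "event-driven" means. It is a toy written for this overview, not M5's actual event queue: the class name EventQueue, the two callbacks and the 50-tick latency are all assumptions made for illustration. The point is that simulated time jumps from one scheduled event to the next instead of advancing tick by tick.

```python
import heapq


class EventQueue:
    """Toy discrete-event queue: events run in order of their scheduled tick."""

    def __init__(self):
        self.now = 0      # current simulated time, in ticks
        self._heap = []   # entries are (tick, sequence number, callback)
        self._seq = 0     # tie-breaker so same-tick events run in FIFO order

    def schedule(self, delay, callback):
        heapq.heappush(self._heap, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self):
        while self._heap:
            tick, _, callback = heapq.heappop(self._heap)
            self.now = tick   # jump straight to the next event
            callback()


log = []
queue = EventQueue()


def cache_miss():
    log.append((queue.now, "miss"))
    queue.schedule(50, memory_reply)   # hypothetical 50-tick memory latency


def memory_reply():
    log.append((queue.now, "reply"))


queue.schedule(10, cache_miss)
queue.run()
print(log)   # [(10, 'miss'), (60, 'reply')]
```

Nothing happens between tick 10 and tick 60, so the loop does no work for those ticks; that is the main attraction of the event-driven style for a simulator with long idle latencies.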
scons build/ALPHA_FS/m5.opt builds M5 as an ALPHA full-system simulator in opt mode. SCons is a build tool similar to GNU Make. There are 3 other modes: fast, debug and prof. The differences are as follows (quoted from the M5 documentation):
  • m5.debug - A binary used for debugging without any optimizations. Since no optimizations are done this binary is compiled the fastest, however since no optimizations are done it executes very slowly. (-g3 -gdwarf-2 -O0).
  • m5.opt - A binary with debugging and optimization. This binary executes much faster than the debug binary and still provides all the debugging facility of the debug version. However when debugging source code it can be more difficult to use than the debug target. (-g -O3)
  • m5.prof - This binary is like the opt target, however it also includes profiling support suitable for use with gprof. (-O3 -g -pg).
  • m5.fast - This binary is the fastest binary and all debugging support is removed from the binary (including trace support). (-O3 -DNDEBUG) 
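The four variants can be summarized as a small lookup table. This is only a convenience sketch: the flags are copied from the list above, while the helper name scons_target and the idea of generating the command line from Python are my own assumptions, not something M5 provides.

```python
# Compiler flags per binary variant, copied from the list above.
VARIANT_FLAGS = {
    "debug": ["-g3", "-gdwarf-2", "-O0"],
    "opt": ["-g", "-O3"],
    "prof": ["-O3", "-g", "-pg"],
    "fast": ["-O3", "-DNDEBUG"],
}


def scons_target(config, variant):
    """Return the scons target path, e.g. build/ALPHA_FS/m5.opt."""
    if variant not in VARIANT_FLAGS:
        raise ValueError("unknown variant: %s" % variant)
    return "build/%s/m5.%s" % (config, variant)


for variant in ("debug", "opt", "prof", "fast"):
    flags = " ".join(VARIANT_FLAGS[variant])
    print("scons %s   # compiled with %s" % (scons_target("ALPHA_FS", variant), flags))
```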
    After building, scons will have created a directory build/ALPHA_FS/ under the top-level directory of the M5 source tree. It is almost a copy of the source code, including the directories arch, base, cpu, config, dev, enums, kern, mem, params, python, sim, etc. Most files are copied directly from the source, but some are generated at compile time and are not part of the source tree, for example the ISA-specific support for the decode stage. This will be described in detail later.
    M5 has a highly modular design. All system components are represented as C++ objects and connected in Python configuration files. M5 is started by instantiating a Root object, root = Root(system=test_sys). A root contains a system, test_sys = makeLinuxAlphaSystem(test_mem_mode, bm[0]). The system object contains almost everything, including the CPUs, physical memory, I/O devices and, optionally, caches.
 
if options.l2cache:
    test_sys.l2 = L2Cache(size = '2MB')
    test_sys.tol2bus = Bus()
    test_sys.l2.cpu_side = test_sys.tol2bus.port
    test_sys.l2.mem_side = test_sys.membus.port

test_sys.cpu = [TestCPUClass(cpu_id=i) for i in xrange(np)]

if options.caches:
    test_sys.bridge.filter_ranges_a=[AddrRange(0, Addr.max)]
    test_sys.bridge.filter_ranges_b=[AddrRange(0, size='8GB')]
    test_sys.iocache = IOCache(mem_side_filter_ranges=[AddrRange(0, Addr.max)],
                               cpu_side_filter_ranges=[AddrRange(0x8000000000, Addr.max)])
    test_sys.iocache.cpu_side = test_sys.iobus.port
    test_sys.iocache.mem_side = test_sys.membus.port

for i in xrange(np):
    if options.caches:
        test_sys.cpu[i].addPrivateSplitL1Caches(L1Cache(size = '32kB'),
                                                L1Cache(size = '64kB'))
    if options.l2cache:
        test_sys.cpu[i].connectMemPorts(test_sys.tol2bus)
    else:
        test_sys.cpu[i].connectMemPorts(test_sys.membus)

    if options.fastmem:
        test_sys.cpu[i].physmem_port = test_sys.physmem.port

From this segment of source code, we can see that most system components are optional and are connected to the system via buses designated by the user. Caches are connected to the CPU via buses, while a bridge connects different buses. For example, the L1 caches are contained in the CPU, while the L2 is connected to the CPU through "tol2bus". The function that performs this task (adding the L1 caches) is in build/ALPHA_FS/cpu/BaseCPU.py, which declares the BaseCPU object (C++). Different system components communicate through ports, which are connected to buses. Parameters such as memory size and address ranges are all specified as attributes of objects. Components are added by instantiating C++ objects using Python syntax. SWIG is used as the glue that connects the Python part and the C++ part.
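To make the "ports connected to buses" idiom concrete, here is a toy model in plain Python. It does not use or reproduce M5's real classes: Bus, BusPort and Component below are hypothetical stand-ins that only mimic the assignment style of the configuration (component.some_port = bus.port) and record which ports ended up on which bus.

```python
class BusPort:
    """Connection point handed out by a toy bus."""

    def __init__(self, bus):
        self.bus = bus


class Bus:
    """Toy bus: remembers every component port attached to it."""

    def __init__(self, name):
        self.name = name
        self.attached = []

    @property
    def port(self):
        # Each read returns a fresh connection point, so several
        # components can be assigned bus.port in turn.
        return BusPort(self)


class Component:
    """Toy component: parameters are keyword arguments, ports are attributes."""

    def __init__(self, name, **params):
        # Bypass __setattr__ for ordinary bookkeeping fields.
        object.__setattr__(self, "name", name)
        object.__setattr__(self, "params", params)
        object.__setattr__(self, "links", {})

    def __setattr__(self, attr, value):
        if isinstance(value, BusPort):
            # Assigning a bus port to an attribute wires the component to that bus.
            self.links[attr] = value.bus
            value.bus.attached.append((self.name, attr))
        else:
            object.__setattr__(self, attr, value)


membus = Bus("membus")
tol2bus = Bus("tol2bus")

l2 = Component("l2", size="2MB")
l2.cpu_side = tol2bus.port
l2.mem_side = membus.port

physmem = Component("physmem")
physmem.port = membus.port

print(tol2bus.attached)   # [('l2', 'cpu_side')]
print(membus.attached)    # [('l2', 'mem_side'), ('physmem', 'port')]
```

The L2 cache sits between the two buses, exactly as in the l2cache branch of the configuration: its CPU side faces "tol2bus" and its memory side faces "membus".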
    The system used is LinuxAlphaSystem, which is defined in arch/alpha. This class inherits from the System class defined in arch/alpha/system.hh, which declares the major components of a system, including the PAL code and the console. These are basic components, included in any system. Other components are declared in Python configuration files.

configs/common/FSConfig.py
def makeLinuxAlphaSystem(mem_mode, mdesc = None):

    class BaseTsunami(Tsunami):
        ethernet = NSGigE(pci_bus=0, pci_dev=1, pci_func=0)
        ide = IdeController(disks=[Parent.disk0, Parent.disk2],
                            pci_func=0, pci_dev=0, pci_bus=0)

    self = LinuxAlphaSystem()
    if not mdesc:
        # generic system
        mdesc = SysConfig()
    self.readfile = mdesc.script()
    self.iobus = Bus(bus_id=0)
    self.membus = Bus(bus_id=1)
    self.bridge = Bridge(delay='50ns', nack_delay='4ns')
    self.physmem = PhysicalMemory(range = AddrRange(mdesc.mem()))
    self.race_v = Race(pio_addr=0x80140000000,devicename = "Race")
    self.race_v.pio = self.iobus.port
    self.bridge.side_a = self.iobus.port
    self.bridge.side_b = self.membus.port
    self.physmem.port = self.membus.port
    self.disk0 = CowIdeDisk(driveID='master')
    self.disk2 = CowIdeDisk(driveID='master')
    self.disk0.childImage(mdesc.disk())
    self.disk2.childImage(disk('linux-bigswap2.img'))
    self.tsunami = BaseTsunami()
    self.tsunami.attachIO(self.iobus)
    self.tsunami.ide.pio = self.iobus.port
    self.tsunami.ethernet.pio = self.iobus.port
    self.simple_disk = SimpleDisk(disk=RawDiskImage(image_file = mdesc.disk(),read_only = True))
    self.intrctrl = IntrControl()
    self.mem_mode = mem_mode
    self.terminal = Terminal()
    self.kernel = binary('vmlinux')
    self.pal = binary('ts_osfpal')
    self.console = binary('console')
    self.boot_osflags = 'root=/dev/hda1 console=ttyS0'

    return self

image_file is the precompiled Linux image used for booting. The original one could not support my need to compile a device driver (I don't know why, it just fails), so I had to build one myself (see also alpha linux compilation). Race is a component I added for data race detection. Currently it is only a hardware memory that the application can communicate with directly. I set its address range to start at 0x80140000000; the size is specified in the Race class declaration. It is a C++ class defined in dev/race.hh. Applications typically cannot control hardware directly, but I want this capability because the application has some information that needs to be passed down to the hardware, and I want data race detection to be performed at the hardware level. The solution is to attach such a device to the system and write a device driver to act as the intermediary between the application and the hardware memory. A detailed description will be given below.
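The idea of a device that claims an address range on the I/O bus can be sketched as follows. This is a toy in plain Python, not the Race device or M5's bus code: AddrRange, ScratchDevice and IOBus are hypothetical names, and RACE_SIZE is an assumed value, since the real size lives in the Race class declaration and is not given here. Only the base address 0x80140000000 comes from the configuration above.

```python
class AddrRange:
    """Half-open address range [start, start + size)."""

    def __init__(self, start, size):
        self.start = start
        self.end = start + size

    def __contains__(self, addr):
        return self.start <= addr < self.end


class ScratchDevice:
    """Toy memory-mapped device backed by a bytearray."""

    def __init__(self, pio_addr, pio_size):
        self.range = AddrRange(pio_addr, pio_size)
        self.data = bytearray(pio_size)

    def write(self, addr, value):
        self.data[addr - self.range.start] = value & 0xFF

    def read(self, addr):
        return self.data[addr - self.range.start]


class IOBus:
    """Toy bus that routes an access to whichever device claims the address."""

    def __init__(self):
        self.devices = []

    def attach(self, device):
        self.devices.append(device)

    def find(self, addr):
        for device in self.devices:
            if addr in device.range:
                return device
        raise LookupError("no device claims address %#x" % addr)


RACE_BASE = 0x80140000000   # pio_addr from the configuration above
RACE_SIZE = 0x1000          # assumption: the real size is set in dev/race.hh

iobus = IOBus()
iobus.attach(ScratchDevice(RACE_BASE, RACE_SIZE))

iobus.find(RACE_BASE + 4).write(RACE_BASE + 4, 0x2A)
print(iobus.find(RACE_BASE + 4).read(RACE_BASE + 4))   # 42
```

In the real system the application never issues such accesses itself; the device driver does, which is why the driver is needed as the intermediary.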

The CPU model
    M5 simulates 3 CPU models. They all inherit from the BaseCPU class, which contains basic CPU components and fields, including operations on contexts, a pointer to the system, initialization and startup functions, serialization functions, etc. I attach my Race class pointer here; this is not strictly necessary, since I could attach it to the system instead, but it is one possible implementation.
    The O3 CPU model is the most detailed: an out-of-order CPU that models the pipeline in detail, including the fetch, decode, rename, IEW (issue, execute, writeback) and commit stages. It also contains context operations, the ITLB and DTLB, the register file, and a list of register file and thread operations. O3CPU.py defines DerivO3CPU, which inherits from the basic O3 CPU model.
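As a rough picture of the stage order only, the sketch below pushes instructions through the five stages named above, one stage per cycle. It is deliberately simplistic and is not how the O3 model works internally: it is in-order, fetches one instruction per cycle, never stalls, and models no renaming or out-of-order issue. The function name simulate and the instruction labels are made up for the example.

```python
STAGES = ["fetch", "decode", "rename", "iew", "commit"]


def simulate(instructions):
    """Advance instructions one stage per cycle; return {inst: commit cycle}."""
    in_flight = {}               # instruction -> index of the stage it occupies
    pending = list(instructions)
    committed = {}
    cycle = 0
    while pending or in_flight:
        cycle += 1
        # Move every in-flight instruction to its next stage.
        for inst in list(in_flight):
            in_flight[inst] += 1
        # Fetch one new instruction per cycle.
        if pending:
            in_flight[pending.pop(0)] = 0
        # Instructions that reached the last stage commit and leave the pipeline.
        for inst, stage in list(in_flight.items()):
            if stage == len(STAGES) - 1:
                committed[inst] = cycle
                del in_flight[inst]
    return committed


print(simulate(["i0", "i1", "i2"]))   # {'i0': 5, 'i1': 6, 'i2': 7}
```

The first instruction needs five cycles to cross the five stages; after that one instruction commits per cycle, which is the basic benefit of pipelining that the detailed model builds on.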
  
   To be continued...