Study Hard, Reach for the SUN Every Day
Contact: leiyu530@163.com
penguinstorm.cublog.cn
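About the author
Name: 雷宇 (Lei Yu) Nickname: storm Occupation: IT Age: 26 Location: Beijing About me: nothing special; I do not chat on MSN/QQ. In the spirit of resource sharing, all articles may be freely reposted.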
Article list - ST350
Diagnostic Files(from ST350)
<DIV>/etc/defaultdomain:<BR>Name the current domain; read and set at each boot by the script /etc/init.d/inetinit<BR>/etc/default/cron:<BR>Determine logging activity for the cron daemon through specification of the cronlog variable<BR>/etc/default/login:<BR>Control root logins at the console through specification of the console variable, and other defaults<BR>/etc/default/su:<BR>Determine logging activity for the su command through specification of the sulog variable<BR>/etc/dfs/dfstab:<BR>List which distributed file systems will be shared at boot time<BR>/etc/dfs/sharetab:<BR>List currently shared NFS files and directories<BR>/etc/hosts:<BR>Associate Internet Protocol (IP) addresses to particular system names (symbolically linked to /etc/inet/hosts)<BR>/etc/hostname.le0:<BR>Assign a system name, and through cross-referencing the /etc/hosts file, add an IP address to a particular network interface<BR>/etc/inetd.conf:<BR>List information for network services that can be invoked by the inetd daemon<BR>/etc/inittab:<BR>Read by the init daemon at startup to determine which rc scripts to execute; also contains the default run level<BR>/etc/minor_perm:<BR>Specify permissions to be assigned to device files<BR>/etc/mnttab:<BR>Display a list of currently mounted file systems<BR>/etc/name_to_major:<BR>Display a list of configured major device numbers<BR>/etc/netconfig:<BR>Display the network configuration database read during network initialization and use<BR>/etc/netmasks:<BR>Display the netmask value; read at boot time during network initialization<BR>/etc/nsswitch.conf:<BR>List the database configuration file for the name service switch engine<BR>/etc/path_to_inst:<BR>List the contents of the system device tree using the format of physical device names and instance numbers<BR>/etc/protocols:<BR>List known protocols used in conjunction with the Internet<BR>/etc/rmtab:<BR>List current remotely mounted file systems<BR>/etc/rpc:<BR>List available RPC programs<BR>/etc/ser……
Posted: 2006-08-17
Diagnostic Commands and Tools(from ST350)
<DIV>Pinned; review daily:</DIV> <DIV>adb:<BR>Analyze dumps and a running system<BR>AnswerBook:<BR>Display online reference manuals in areas of hardware, user, system administration, and other<BR>arch:<BR>Display architecture and kernel architecture information<BR>arp:<BR>Display the address resolution protocol table<BR>aset:<BR>Use the automated security enhancement tool<BR>catman -w:<BR>Create the /usr/share/man/windex database for use with the index function available through the apropos command<BR>crash:<BR>Analyze crash dumps<BR>devlinks:<BR>Create symbolic links in /dev using information in /devices<BR>df -k:<BR>Display disk space usage in Kbytes, including free space<BR>dfmounts:<BR>Display remote file system mount information<BR>dfshares:<BR>Display shared file system information<BR>diff:<BR>Compare file contents<BR>dmesg:<BR>Analyze recent log messages<BR>disks:<BR>Create symbolic links in /dev/dsk and /dev/rdsk<BR>drvconfig:<BR>Configure the devices directory and the device information tree<BR>eeprom:<BR>Analyze and change programmable read-only memory (PROM) settings<BR>file:<BR>Determine a file's type<BR>find:<BR>Search for specific files in the file system structure<BR>format:<BR>Analyze or modify disk partition information<BR>fsck:<BR>Check UFS file systems for inconsistencies<BR>fsdb:<BR>Use the file system debugger (see fsdb_ufs in the man pages)<BR>fstyp:<BR>Display extensive file system parameters for a specified file system<BR>grep:<BR>Analyze file contents, and search for specific patterns<BR>groups:<BR>Display group definitions for a given user<BR>grpck:<BR>Check the /etc/group file for syntax errors or inconsistencies<BR>ifconfig:<BR>Analyze the status of network interfaces<BR>infocmp:<BR>Compare tic formatted files<BR>iostat:<BR>Analyze I/O performance issues<BR>kadb:<BR>Trap kernel and low-level faults<BR>last:<BR>Display history of system login information<BR>ls:<BR>Analyze file properties<BR>ndd:<BR>Get and set named device driver parameters<BR>netstat (-i, -……
Posted: 2006-08-17
ST350-7 Kernel core dump analysis
<DIV>Upon completion of this module, you should be able to:<BR>1,describe how a process creates a system dump<BR>2,configure a system to collect and store core files<BR>3,differentiate a system panic condition from a system hang<BR>4,list the steps to perform an initial core dump analysis<BR>5,use the adb and crash debuggers to analyze crash dumps</DIV> <DIV> </DIV> <DIV>The following events occur in response to a system panic:<BR>display a panic message at the console <BR>perform a stack trace, listing routines that led to the panic<BR>dump "interesting" portions of memory to the swap device<BR>reboot and copy the core file from swap to a file system</DIV> <DIV> </DIV> <DIV>The severity of a hang needs to be assessed by trying different methods of system access, and varies depending on the cause of the hang. Common causes include:<BR>an application has hung<BR>a terminal or window has hung<BR>the system has hung<BR>the system is overloaded and has a resource bottleneck</DIV> <DIV> </DIV> <DIV>The options available with dumpadm to change the crash dump and savecore configuration details are:<BR>-c content_type<BR>-d dump_device<BR>-m min<BR>-n suppress the automatic execution of savecore on reboot<BR>-s savecore_dir<BR>-u update the configuration details based on the contents of /etc/dumpadm.conf</DIV> <DIV> </DIV> <DIV>The following command collects the core file on swap, populates the file with kernel pages only, and saves the core file to /opt/crash/system1 on reboot:<BR># dumpadm -c kernel -d swap -s /opt/crash/system1</DIV> <DIV> </DIV> <DIV>To invoke adb for crash dump analysis, type<BR># cd crash_directory<BR># adb -k unix.n vmcore.n<BR>To start adb on a live system, run the following command:<BR># adb -kw /dev/ksyms /dev/mem</DIV>
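The dumpadm options listed above can be assembled programmatically. The following is a minimal Python sketch, not part of the course material: the function name `build_dumpadm_command` and its parameter names are hypothetical, and only the flags named in the notes (-c, -d, -m, -n, -s, -u) are used. It builds the argument list only and does not run the command.

```python
def build_dumpadm_command(content_type=None, dump_device=None, min_free=None,
                          no_savecore=False, savecore_dir=None, update=False):
    """Assemble a dumpadm argument list from the options in the notes.

    All parameter names are illustrative; only the flags themselves
    come from the notes above.
    """
    args = ["dumpadm"]
    if content_type is not None:
        args += ["-c", content_type]      # which pages to dump, e.g. "kernel"
    if dump_device is not None:
        args += ["-d", dump_device]       # dump device, e.g. "swap"
    if min_free is not None:
        args += ["-m", str(min_free)]     # the "-m min" option
    if no_savecore:
        args.append("-n")                 # suppress automatic savecore on reboot
    if savecore_dir is not None:
        args += ["-s", savecore_dir]      # directory where savecore writes files
    if update:
        args.append("-u")                 # re-read /etc/dumpadm.conf
    return args


# Reproduce the example command from the notes.
cmd = build_dumpadm_command(content_type="kernel", dump_device="swap",
                            savecore_dir="/opt/crash/system1")
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` by the superuser on a Solaris host; printing it first makes it easy to review before changing the dump configuration.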
Posted: 2006-04-19
ST350-6 SunVTS System Diagnostics
<DIV>Upon completion of this module, you should be able to:<BR>1,install the SunVTS package on a system<BR>2,select, set up, and run SunVTS diagnostic tests<BR>3,run SunVTS over a network<BR>4,run SunVTS in TTY mode without a frame buffer<BR>5,analyze SunVTS test results</DIV> <DIV> </DIV> <DIV>SunVTS is Sun's online validation test suite. It can verify the functionality of most Sun hardware devices. The SunVTS tests can also be used to stress certain areas of the system as needed for diagnostic and troubleshooting purposes.</DIV> <DIV> </DIV> <DIV>The SunVTS architecture is divided into the:<BR>user interfaces<BR>SunVTS kernel<BR>hardware tests</DIV> <DIV> </DIV> <DIV>The SunVTS program is run when the superuser types one of the following commands. The /opt/SUNWvts/bin directory needs to be defined as part of the PATH variable<BR>1,sunvts - runs the SunVTS kernel and default graphical interface (CDE) on the local machine<BR>2,sunvts -l - runs the SunVTS kernel and OpenLook graphical interface on the local machine<BR>3,sunvts -t - runs the SunVTS kernel in TTY mode, vtstty<BR>4,sunvts -h host_name - runs the graphical interface on the local machine while connecting to and testing a remote machine</DIV>
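The four invocation forms and the PATH requirement above can be sketched in Python as follows. This is an illustrative sketch only: the mode names ("cde", "openlook", "tty", "remote") and both function names are hypothetical, and the code only builds strings and argument lists without launching SunVTS.

```python
SUNVTS_BIN = "/opt/SUNWvts/bin"  # directory named in the notes


def ensure_sunvts_on_path(path_value):
    """Return path_value with the SunVTS bin directory appended if missing."""
    entries = [entry for entry in path_value.split(":") if entry]
    if SUNVTS_BIN not in entries:
        entries.append(SUNVTS_BIN)
    return ":".join(entries)


def sunvts_command(mode, host_name=None):
    """Map a hypothetical mode name to one of the four commands in the notes."""
    if mode == "cde":
        return ["sunvts"]                   # default graphical interface
    if mode == "openlook":
        return ["sunvts", "-l"]             # OpenLook graphical interface
    if mode == "tty":
        return ["sunvts", "-t"]             # TTY mode, no frame buffer needed
    if mode == "remote":
        if not host_name:
            raise ValueError("remote mode needs a host name")
        return ["sunvts", "-h", host_name]  # local GUI, remote machine under test
    raise ValueError("unknown mode: " + mode)


print(ensure_sunvts_on_path("/usr/bin:/usr/sbin"))
print(" ".join(sunvts_command("tty")))
```

Keeping the modes in one function makes the TTY case easy to pick when the machine under test has no frame buffer.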
Posted: 2006-04-18
ST350-5 SunSolve Database Information
<DIV>Upon completion of this module, you should be able to:<BR>1,use the SunSolve database for fault analysis purposes<BR>2,differentiate between the SunSolve CD-ROM and SunSolve online databases<BR>3,describe how to apply for a SunSolve online account<BR>4,install the SunSolve software and patches software on a server and share them correctly to the network<BR>5,configure and use SunSolve software from an installed server or from the CD-ROM<BR>6,display the installed patches on a Solaris system<BR>7,display the current patch report for a given operating system<BR>8,install and remove patches as needed on a Solaris system<BR>9,solve a workshop exercise using SunSolve database information</DIV> <DIV> </DIV> <DIV>You can apply for a SunSolve online account by using a Web browser and visiting one of the following web sites:<BR><A href="http://sunsolve.sun.com">http://sunsolve.sun.com</A><BR><A href="http://sunsolve1.sun.com">http://sunsolve1.sun.com</A><BR><A href="http://www.sun.com">http://www.sun.com</A></DIV> <DIV> </DIV> <DIV>The following procedure shows the steps needed to share the SunSolve software, and if needed, the CD-ROM from a server. 
The version number of the SunSolve software under the /cdrom directory changes with progressive releases<BR>1,insert the SunSolve CD-ROM into the CD-ROM drive on the server and verify that the vold daemon has mounted the software using the mount command<BR>2,if needed, perform the installation procedure described previously, or run the software from the CD-ROM<BR>3,share the software and, if needed, the mounted CD-ROM<BR># share -o ro sunsolve_install_dir<BR># share -o ro /cdrom/cdrom0<BR>4,add the following lines to the /etc/dfs/dfstab file:<BR>share -o ro sunsolve_install_dir<BR>share -o ro /cdrom/cdrom0<BR>5,start the NFS server<BR># /etc/init.d/nfs.server start<BR>6,check to see if the share command was successful<BR># dfshares -F nfs server_name<BR>7,on the clients, remotely mount the software or provide clients with the URL <A href="http://servername">http://servername</A><BR># mount server_name:/cdrom/cdrom0 /cdrom</DIV> <DIV> </DIV> <DIV>Removing a patch<BR>1,display a list of installed patches on your system<BR># showrev -p<BR>2,find the installed location of the patch<BR># find / -name 102044-01 -print<BR>3,change directory to the location of the patch<BR># cd /var/sadm/patch/102044-01<BR>4,run the script to remove the patch<BR># ./backoutpatch<BR>5,reboot the system<BR># reboot</DIV>
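Steps 1 to 3 of the patch removal procedure can be supported by a small script that extracts patch IDs from saved `showrev -p` output. The following Python sketch rests on an assumption: that each installed patch is reported on a line beginning with `Patch:` followed by the patch ID. The sample text and both function names are hypothetical, so check real output on your own system before relying on the pattern.

```python
import re

# Assumption: each line of `showrev -p` output starts with "Patch: <id>".
PATCH_LINE = re.compile(r"^Patch:\s+(\d+-\d+)")


def installed_patch_ids(showrev_output):
    """Return the patch IDs found in saved `showrev -p` output, in order."""
    ids = []
    for line in showrev_output.splitlines():
        match = PATCH_LINE.match(line.strip())
        if match:
            ids.append(match.group(1))
    return ids


def patch_directory(patch_id):
    """Directory used in the notes for the backout step (step 3)."""
    return "/var/sadm/patch/" + patch_id


# Hypothetical sample output, for illustration only.
sample = (
    "Patch: 102044-01 Obsoletes:  Packages: SUNWcsu\n"
    "Patch: 101945-13 Obsoletes:  Packages: SUNWcsr\n"
)
for patch_id in installed_patch_ids(sample):
    print(patch_id, patch_directory(patch_id))
```

The script only reports where a patch directory is expected; running `./backoutpatch` and rebooting remain manual steps.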
Posted: 2006-04-18
ST350-4 OBP Diagnostics and commands
<DIV>Upon completion of this module, you should be able to use OBP commands to do the following:<BR>1,gather general information about the system<BR>2,define the meaning of the non-volatile random access memory (NVRAM) variables<BR>3,display and capture the names of the devices in the system device tree, and display their attributes<BR>4,test devices using the device path, node name, and device alias<BR>5,generate and test a PROM device alias<BR>6,alter and display NVRAM settings, and reset to the defaults<BR>7,use the eeprom command to examine and define NVRAM</DIV> <DIV> </DIV> <DIV>The OBP consists of two chips on each system board:<BR>the boot PROM itself<BR>a non-volatile random access memory (NVRAM)</DIV> <DIV> </DIV> <DIV>The OBP has the following features:<BR>the ability to read plug-in device drivers and diagnostics from probed devices<BR>a FORTH code interpreter to facilitate writing and downloading drivers, diagnostics, and parameters<BR>a device tree with a data structure hierarchy, similar to UNIX, for locating device addresses<BR>diagnostic and informational commands, and system configuration parameters (PROM variables)<BR>a restricted monitor<BR>system initialization<BR>power-on self tests (POSTs)</DIV> <DIV> </DIV> <DIV>You can use the printenv command at the monitor prompt to see the various NVRAM parameters and default values</DIV> <DIV> </DIV> <DIV>The superuser can display and change PROM variable settings using the eeprom command<BR># /usr/sbin/eeprom diag-switch?=true</DIV> <DIV>The default boot sequence:<BR>ok boot->execute primary boot-OBP->load bootblk program->load and start secondary boot(/platform/`uname -m`/ufsboot)->load and start kernel(/platform/`uname -m`/kernel/unix)->kernel reads /etc/system->kernel initialized->kernel starts the init process->read /etc/default/init and /etc/inittab->execute rc scripts</DIV> <DIV> </DIV>
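When comparing NVRAM settings between systems, it can help to turn saved `eeprom` output into a dictionary. The Python sketch below assumes the output consists of `name=value` lines, with any line lacking `=` skipped; the sample text and the function names are hypothetical and not taken from the course material.

```python
def parse_eeprom_output(output):
    """Parse saved `eeprom` output (assumed name=value lines) into a dict."""
    settings = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # lines without "=" are skipped
            settings[name.strip()] = value.strip()
    return settings


def eeprom_set_command(name, value):
    """Build the argument list for setting one variable, as in the notes."""
    return ["/usr/sbin/eeprom", name + "=" + value]


# Hypothetical sample output, for illustration only.
sample = (
    "diag-switch?=false\n"
    "auto-boot?=true\n"
    "boot-device=disk net\n"
    "nvramrc: data not available.\n"
)
settings = parse_eeprom_output(sample)
if settings.get("diag-switch?") != "true":
    print(" ".join(eeprom_set_command("diag-switch?", "true")))
```

The sketch prints the command rather than running it, because changing NVRAM variables requires superuser rights and affects the next boot.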
Posted: 2006-04-18
ST350-3 POST Diagnostics
<DIV>Upon completion of this module, you should be able to:<BR>1,describe the capabilities and limitations of the POSTs in identifying and resolving system faults<BR>2,describe different ways to view the POST<BR>3,configure the file /etc/remote on a console server to enable the use of tip in a remote diagnostic session<BR>4,view and interpret POST output<BR>5,describe the functionality of the prtdiag command</DIV> <DIV> </DIV> <DIV>You can use a null modem cable or a modem with TIP(Terminal Interface Protocol) to remotely troubleshoot a faulty system</DIV> <DIV> </DIV> <DIV>To send a break through the tip window(stop-a or L1-a key remote equivalent),type<BR>~#<BR></DIV> <DIV>To interrupt a test, press Control-c<BR></DIV> <DIV>To exit from tip,type<BR>~. or ~^D<BR></DIV> <DIV>To see a list of tip commands, type<BR>~?<BR></DIV>
Posted: 2006-04-18
ST350-2 Diagnostic Tools
<DIV>Upon completion of this module, you should be able to:<BR>1,differentiate watchdog resets, panics, and system hangs<BR>2,differentiate hardware and software problems<BR>3,provide examples of fatal and non-fatal error conditions<BR>4,identify a comprehensive set of Solaris commands and utilities which are useful in fault analysis<BR>5,describe the syntax, function, and relevance of each command or system file<BR>6,use Solaris commands and files to determine system configuration and status information<BR>7,solve workshop problems using Solaris utilities and system files</DIV> <DIV>error categories - software, hardware-corrected, recoverable, fatal, and critical<BR></DIV> <DIV>error reporting mechanisms - bus errors, interrupts, and resets</DIV> <DIV>Recoverable errors caused by hardware are usually signaled by a bus error posted to the requesting device and a specified interrupt, which could broadcast the error. Error recovery in such cases is normally handled by the trap routines, while error logging is done by the interrupt handler.</DIV> <DIV> </DIV> <DIV>Critical errors require immediate attention, system shutdown, and power-off. They are signaled through a high-level broadcast interrupt if at all possible. </DIV> <DIV> </DIV> <DIV>A fatal error is a hardware error in which proper system operation cannot be guaranteed. All fatal errors initiate a system-watchdog reset. Parity errors on backplanes are an example of a fatal error.</DIV> <DIV> </DIV> <DIV>Bus errors are one of the mechanisms for error reporting on the system. Bus errors are issued to the processor when the processor references a virtual or physical location that cannot be satisfied for hardware reasons. Some typical bus errors that occur are:<BR>illegal address or internal hardware failure<BR>instruction fetch or data load<BR>on an SBus, direct virtual memory access (DVMA) operations<BR>synchronous/asynchronous data store<BR>memory management unit (MMU) operations</DIV> <DIV>&n……
Posted: 2006-04-18
ST350-1 Fault analysis and diagnosis
<DIV>Upon completion of this module, you should be able to <BR>1,use an organized total system approach for fault analysis and diagnosis<BR>2,write accurate problem statements<BR>3,describe a system problem in terms of error messages, symptoms, relative comparisons and technical conditions<BR>4,identify and use commonly available resources to solve technical problems<BR>5,generate and test a list of likely causes on a per fault basis<BR>6,communicate and document information gathered during fault analysis<BR>7,use the fault analysis worksheet to gather and document facts</DIV> <DIV> </DIV> <DIV>fault analysis-identify the problem and organize fact gathering and comparisons<BR>diagnosis-organize the actual discovery, testing, repair, and reporting of the problem</DIV> <DIV> </DIV> <DIV>Eight steps of fault analysis and diagnosis:<BR>state the problem<BR>describe the problem<BR>identify differences<BR>list relevant changes<BR>generate likely causes<BR>test likely causes<BR>verify the most likely cause<BR>take action to correct the fault<BR>NOTE:most bugs that become a disaster happen because the original problem is not identified correctly</DIV> <DIV> </DIV> <DIV>Describing the problem<BR>Listing All Observed Facts<BR>Identify the sources of the observed facts you listed<BR>1,Customer complaints-Use the original message from the customer<BR>2,Customer interviews-Use the list of questions shown previously to interview customers about the problem. 
Expand and customize the question list for your own style and environment<BR>3,Interviews of others involved-Include other colleagues such as administrators, programmers, and technical support staff<BR>4,Diagnostics-Consider changed environments and operating system levels<BR>5,Dumps-Evaluate the results of crash analysis if a core file is generated and available</DIV> <DIV> </DIV> <DIV>Testing Likely Causes<BR>Verification<BR>Three approaches used to verify the most likely cause of a problem are:<BR>1,Factual and logical-In this approach, your conclusions about likely causes are based on information gathered on the fault analysis worksheet and on past experience. This results in likely causes that make the most sense<BR>2,Realistic-In this approach, the most likely cause must pass an experiment to show conclusively that it is or is not the cause. For example, try a new driver without overwriting the old one. This provides a quick, non-disruptive verification with good, but not complete, conclusiveness<BR>3,Result-oriented-In this approach, you assume, without proof, that the most likely cause you choose is the actual cause, and take the indicated corrective action. This is the least conclusive verification, and can be disruptive, expensive, and time-consuming, especially if your assumptions are not correct.</DIV> <DIV> </DIV> <DIV>Taking Corrective Action<BR>1,Complete the repair<BR>2,Test and verify the repair<BR>3,Document results<BR>4,Obtain confirmation and acceptance</DIV> <DIV> </DIV> <DIV><BR> </DIV> <DIV></DIV>
Posted: 2006-04-18
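ST350 Course Introduction
<h4>Sun Systems Fault Analysis Workshop(ST-350)<br />Sun Systems Fault Analysis</h4><li>ST-350 <p>Duration: 5 days Lecture: 20% Hands-on: 80% Price: RMB9,800 <br /><br />Course description:<br /><br />This course teaches students how to resolve the more deep-seated system faults that occur on Sun systems. It is aimed at system administrators and system maintenance staff. Faults are deliberately set up on Sun SPARCstation machines running Solaris 2.X, and students are guided through troubleshooting them, so that they learn how to maintain and manage systems more effectively.</p><p>Objectives: after completing this course, you will be able to </p></li><li>Distinguish and repair faults in selected hardware, system administration, and software </li><li>Use a systematic……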
Posted: 2005-09-15