AIX 4.x How-To, Tricks and Tips v2.00
=====================================
Last updated: 2/10/1998

==============================================================================
System Administration Tasks and Tools
System Installation and Maintenance
System Configuration and Customization
Network Configuration and Customization
Distributed Services
User Management and Security
Backup and Recovery
System Performance and Tuning
Miscellaneous
==============================================================================

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+ System Administration Tasks and Tools
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

==============================================================================
Browsing the bootup messages
==============================================================================

    alog -f /var/adm/ras/bootlog -o

==============================================================================
Finding out the amount of real memory installed
==============================================================================

    lsattr -E -l sys0 -a realmem | awk '{ print $2 "K bytes" }'
    bootinfo -r
    lsdev -Cc memory

==============================================================================
Controlling "at" jobs
==============================================================================

To list "at" jobs:

    at -l
    atq [user]

To cancel "at" job:

    at -r JobNumber
    atrm [jobNumber | user]

==============================================================================
Setting the initial priority of a process
==============================================================================

    nice -n Number CommandString

where Number is in the range of 0 to 39

==============================================================================
Reporting CPU consumption of running processes in the next xx seconds
==============================================================================

    tprof -ksex sleep 60

The ./__prof.all file reports CPU consumption in ticks (10 msec).

==============================================================================
Setting up InfoExplorer
==============================================================================

To set up InfoExplorer on a disk:

    mkdir /cdrom
    mount -v cdrfs -p -r /dev/cd0 /cdrom
    cd /cdrom
    cp -r * /usr/lpp/info/lib/en_US
    cd /usr/lpp/info/lib/en_US
    chmod -R a+r .
    find . -type d -exec chmod a+x {} \;

/usr/lib/X11/app-defaults/info_gr is the application defaults file; it
contains the system resource definitions.

InfoExplorer searches the following environment variables, in this order, to
determine the language to read databases from the /usr/lpp/info/lib directory:
INFOLANG, INFOLOCALE, LC_MESSAGES, LANG. If no libraries are found, then
InfoExplorer defaults to using the libraries installed in
/usr/lpp/info/lib/en_US.

==============================================================================
Creating a man page
==============================================================================

Commonly used nroff tags:

    .TH    Title heading: man page name and section number
    .SH    Section heading (identifies a subsection)
    .PP    Start a new filled paragraph
    .TP    Indent the paragraph by the given amount, except its first line
    .IP    Indented paragraph
    .nf    Stop autofilling (adjusting words on lines)
    .fi    Start autofilling again
    .B     Use bold type for text given as its argument
    .I     Italicize text given as its argument
    .R     Use roman type for text given as its argument

Sample man page with tags inserted:

    .TH mycommand 1
    .SH NAME
    mycommand \- Does just what I want to do.
    .SH SYNOPSIS
    mycommand [will|wont] work
    .SH DESCRIPTION
    .B mycommand
    is always used as a last resort. It can be expected either to work or fail.
    .B mycommand
    contains no surprises at all.
    .SH OPTIONS
    .TP 3
    will
    Completes the work at hand.
    .TP 3
    wont
    Take a vacation.
    .SH BUGS
    .B mycommand
    is always error free!
    .SH SEE ALSO
    myprogram(1), myshell(1)
    .SH AUTHOR
    .B Me.
    Who did you think it was?

Output of "nroff -man mycommand.1 | more":

    mycommand(1)

    NAME
         mycommand - Does just what I want to do.

    SYNOPSIS
         mycommand [will|wont] work

    DESCRIPTION
         mycommand is always used as a last resort. It can be expected
         either to work or fail. mycommand contains no surprises at all.

    OPTIONS
         will  Completes the work at hand.

         wont  Take a vacation.

    BUGS
         mycommand is always error free!

    SEE ALSO
         myprogram(1), myshell(1)

    AUTHOR
         Me. Who did you think it was?

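A man page skeleton like the sample can also be generated from the shell. The
following is only a minimal sketch: the function name make_manpage and its
three arguments (name, section, one-line summary) are hypothetical and not
part of AIX, and only the NAME, SYNOPSIS and DESCRIPTION sections are emitted.

```shell
# make_manpage NAME SECTION SUMMARY
# Prints a minimal "nroff -man" source file for NAME on standard output.
# Hypothetical helper for illustration; extend the here-document as needed.
make_manpage() {
    name=$1
    section=$2
    summary=$3
    # In an unquoted here-document "\\-" is written out as "\-",
    # the nroff sequence used between the name and its summary.
    cat <<EOF
.TH $name $section
.SH NAME
$name \\- $summary
.SH SYNOPSIS
.B $name
.SH DESCRIPTION
$summary
EOF
}
```

Typical use would be "make_manpage mycommand 1 'Does just what I want to do.'
> mycommand.1", followed by "nroff -man mycommand.1 | more" to check the
result.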
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+ System Installation and Maintenance
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

==============================================================================
Commands on multiprocessing kernel
==============================================================================

To display if MP capable:

    bootinfo -z

To create a boot image with MP lock support:

    bosboot -a -L

To display processor state and mapping:

    cpu_state -l

To display system lock usage statistics:

    lockstat -a

To bind process to processor:

    bindprocessor process_id processor_number

To display which processors are available:

    bindprocessor -q

To unbind a bound process:

    bindprocessor -u process_id

==============================================================================
VSM commands
==============================================================================

    xinstallm    Installation manager
    xmaintm      Software maintenance
    xdevicem     Device manager
    xprintm      Print manager
    xlvm         Storage manager
    xuserm       User manager
    xdat         Date, time, job manager

==============================================================================
Using the instfix command
==============================================================================

To list which PTFs are installed:

    instfix -i

To list the filesets of all the applied PTFs:

    instfix -ic

To check whether fixes for specific APARs are installed:

    instfix -ick "IXnnnnn IXnnnnn"

To list fixes on a PTF tape:

    instfix -Td /dev/rmt0

To obtain more information on fixes, use the -a option together with either
-i for the installed fixes or -d file/device for the fixes on a file or
distribution medium.

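When several APARs have to be checked, the instfix query can be wrapped in a
loop. The sketch below is hedged: the function name missing_apars is
hypothetical, and it assumes that "instfix -ik" reports an installed fix with
a line beginning "All filesets for". Verify that wording on your level of AIX
before relying on it.

```shell
# missing_apars APAR...
# Prints every APAR number that instfix does not report as fully installed.
# Assumption: an installed fix produces a line starting "All filesets for".
missing_apars() {
    for apar in "$@"; do
        if ! instfix -ik "$apar" 2>&1 | grep -q '^ *All filesets for'; then
            echo "$apar"
        fi
    done
}
```

For example, "missing_apars IXnnnnn IXnnnnn" prints only the APARs that still
need to be applied, one per line.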
==============================================================================
Collecting PTFs
==============================================================================

To transfer updates in backup format from tape (must be a non-rewinding
device) to disk:

    bffcreate -qd /dev/rmt0.1 -t /SaveDir all

To list the packages on the tape:

    bffcreate -d /dev/rmt0.1 -l

To list the contents of the .bff file created:

    restore -Tqf bff_filename

Always run the inutoc command after a fix is added to the directory to create
the table of contents file .toc.

==============================================================================
System boot procedure
==============================================================================

    (1) Initialize the hardware
    (2) Load the boot image and execute it
    (3) Configure devices
    (4) Start the init process
    (5) Run the commands in /etc/inittab

==============================================================================
Non-volatile RAM (NVRAM)
==============================================================================

Apart from storing the boot list, the NVRAM contains device information and
bit steering information for memory hardware defect correction. It can be
reset by disconnecting the battery for half an hour when needed.

If present in the NVRAM, a device extension code can be loaded to support a
boot device that is not supported through the system read-only storage (ROS).

==============================================================================
Using fsck to validate a file system
==============================================================================

fsck can find the following file system problems:

    One block belonging to several files (inodes)
    Blocks marked as free but in use
    Blocks marked as used but free
    Incorrect link counts in inodes (indicating missing or excess directory
      entries)
    Inconsistencies between inode size values and the amount of data blocks
      referenced in address fields
    Illegal blocks (e.g., system tables) within files
    Inconsistent data in the filesystem's tables
    Lost files (non-empty inodes not listed in any directory)
    Illegal or unallocated inode numbers in directories

If fsck is run with the -p option, the following problems will be silently
fixed:

    Lost files will be placed in the file system's lost+found directory,
      named for their inode number
    Link counts in inodes too large
    Missing blocks in the free list
    Blocks in the free list also in files
    Incorrect counts in the file system's tables
    Unreferenced zero-length files are deleted

==============================================================================
Invalid boot list or hardware problems
==============================================================================

If the system is stuck at BIST (LED codes below 200), then you should check
the hardware. LED codes in the range 200-299 can be caused by hardware or
wrong configuration information. A steady LED code of 201 indicates hardware
trouble with the system planar.

Should the system alternate between 223/229, 225/229, 221/229 and 233/235 or
be stuck at 221, 222 or 721, you can try to clean up the default boot list, as
those errors indicate a missing boot device or the system ROM being unable to
work with the configured boot device. Go into maintenance mode and execute:

    getrootfs hdisk0
    bootlist -m normal -i
    sync;sync

Then turn the key to normal and reboot the system.
If this still does not help, you can try to disconnect the NVRAM battery for
half an hour and then boot again. The problem may also have been caused by a
full / or /tmp file system.

==============================================================================
Corrupted boot images
==============================================================================

If the LED display alternates between 201 and 299 or hangs on 517, then the
boot image is probably corrupt. The LED sequences 888-103-207-299 and
888-103-208-299 may also indicate a corrupted boot image. You can try to
generate a new boot image with the following commands in maintenance mode:

    getrootfs hdisk0
    syncvg -v rootvg
    synclvodm -v rootvg
    bosboot -d /dev/hdisk0 -a
    sync;sync

Then turn the key to normal and reboot the system.

A corrupted boot image could be caused by a full / or /tmp file system during
the bosboot run.

==============================================================================
Corrupted file systems
==============================================================================

A corrupted file system is indicated by the LED 518 display. Use fsck to
repair the file system damage in maintenance mode:

    getrootfs hdisk0 sh
    fsck -y /dev/hd1
    fsck -y /dev/hd2
    fsck -y /dev/hd3
    fsck -y /dev/hd4
    fsck -y /dev/hd9var
    fsck -y application_filesystems
    .
    .
    sync;sync

If a file system cannot be fixed by fsck but block 8 is readable, try to copy
a backup block of the superblock to block 1:

    dd count=1 bs=4k skip=31 seek=1 if=/dev/hdN of=/dev/hdN

Should you get errors indicating problems with the JFS log, reformat the log
logical volume:

    /usr/sbin/logform /dev/hd8

==============================================================================
Varyon problems
==============================================================================

The system displays code 551 when it starts to activate the root volume group.
If it hangs there or hangs/alternates with 552, 554, 555, 556 or 557, then it
has problems activating the root volume group because of a corrupted file
system, a bad JFS log or a bad disk.

If the root volume group is activated in maintenance mode (getrootfs hdisk0),
it is likely that the / or /tmp file system is full, /dev is missing or /bin
does not exist. You should also check files in /etc for their integrity,
especially /etc/filesystems.

If the getrootfs step failed, reboot and use "getrootfs hdisk0 sh" instead,
and then run "fsck" to repair the file systems. If they fail, then the ODM may
be corrupted. It is possible to create a non-corrupted but stripped down
version of the ODM:

    mount /dev/hd4 /mnt
    mount /dev/hd2 /usr
    mkdir /mnt/etc/objrepos/bak
    cp /mnt/etc/objrepos/Cu* /mnt/etc/objrepos/bak
    cp /etc/objrepos/Cu* /mnt/etc/objrepos
    /etc/umount all
    exit

Now the system should be up with the file systems mounted and the mini ODM
active. Run "lslv -m hd5" to find the disk where the boot logical volume
resides. Then create a new boot image:

    savebase -d /dev/hdiskN
    bosboot -a -d /dev/hdiskN

==============================================================================
Problems with /etc/inittab
==============================================================================

If the system hangs with LED 553, this could indicate problems with
/etc/inittab, a full / or /tmp file system, or that /bin has been removed. A
missing shell or shell profile could also be the cause.

Run "getrootfs hdisk0" in maintenance mode and use "df" to see if any of the
file systems are full.

Check the following files and links for correctness: /bin, /etc/inittab,
/etc/environment, /bin/sh, /bin/bsh, /etc/fsck, /etc/profile and
/sbin/rc.boot.

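Picking full file systems out of the "df" listing by eye is easy to get wrong
on a busy screen. The following sketch filters the listing instead. The
function name full_filesystems and the default threshold of 95 percent are
assumptions made for illustration; the filter takes the first percentage
column of each line as the space usage and the last column as the mount point,
which matches the usual "df -k" layout.

```shell
# full_filesystems [THRESHOLD]
# Reads "df -k" style output on standard input and prints the mount point of
# every file system whose space usage is at or above THRESHOLD percent
# (default 95). The header line is skipped.
full_filesystems() {
    awk -v limit="${1:-95}" '
        NR > 1 {
            for (i = 1; i <= NF; i++) {
                # The first field of the form "NN%" is the space usage.
                if ($i ~ /^[0-9]+%$/) {
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 >= limit + 0) print $NF
                    break
                }
            }
        }'
}
```

Typical use is "df -k | full_filesystems 90". Mount points that contain spaces
are not handled by this sketch.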
==============================================================================
Problems with the console
==============================================================================

If the system hangs with a code of c31, then it has problems accessing the
console device. Either it is not properly configured or a device on one of the
integrated serial ports makes the system think there is a console even though
there is none. Remove all non-terminal devices from the serial ports, turn on
all connected terminals and try a reboot.

If the system has a directly attached console but no serial terminals, then
either the device driver for the graphics card is missing or the console
configuration is damaged. In a maintenance shell use "smit chcons" to set up
the console. If you do not have the right driver loaded, copy it from the
operating system CD, create a new boot image and reboot.

==============================================================================
Where did the program crash?
==============================================================================

Run dbx on the core file. When dbx complains that there is no program matching
the core image, it also tells you which program created the core file.

==============================================================================
LED values
==============================================================================

100-199              Built-in self tests
-----------------------------------------------------------------
200-299              Progress of first phase of boot
  218-219            Power-on self tests
  222                Booting via standard device list
  229                Can't find a valid boot device
  299                Control passing to loaded boot program
-----------------------------------------------------------------
500-599              Status messages from boot program
  516-518            Beginning network boot
  551                Activating the root volume group
  555                fsck failed on the root partition
-----------------------------------------------------------------
600-625              Network boot status indicators
-----------------------------------------------------------------
700-732              Configuration of "unknown" devices
  731                PTY configuration
-----------------------------------------------------------------
800-998              Configuration of standard devices
  812                Memory configuration
  821                Keyboard configuration
  823                Mouse configuration
  826                First serial port configuration
  827                First parallel port configuration
  828                Diskette drive configuration
  830-848            Additional port configuration
  868-869            SCSI adapter configuration
  870-879            Configuring various graphics adapters
  951-968,989-990    Configuring various types of SCSI disks
  970-973,991-994,998  Configuring various tape drives
  974,987            CD-ROM configuration

==============================================================================
National language support
==============================================================================

Naming convention for a locale definition source file:

    language[_territory][.codeset][@modifier]

Examples:

    En_US or En_US.IBM850
    en_US or en_US.ISO8859-1

Locale environment variable hierarchy:

    Priority Class    Environment Variables
    --------------    ---------------------
    High              LC_ALL
    Medium            LC_COLLATE
                      LC_CTYPE
                      LC_MESSAGES
                      LC_MONETARY
                      LC_NUMERIC
                      LC_TIME
    Low               LANG

A high
priority variable overrides a medium priority variable, which overrides a low
priority variable.

The Message Facility includes the following commands. Two of them display
messages:

    dspcat       displays all or part of a message catalog.
    dspmsg       displays a selected message from a message catalog.

The others create and maintain message catalogs:

    gencat       creates and modifies a message catalog.
    mkcatdefs    preprocesses a message source file for input to gencat.
    runcat       pipes output from the mkcatdefs command to gencat.

The NLSPATH environment variable specifies the search path for locating
message catalog files.

The LOCPATH environment variable specifies the search path for localized
information, including binary locale files, input methods, and code-set
converters.

==============================================================================
System run levels
==============================================================================

Currently defined run levels:

    0-9      All processes at the current run levels are killed, then any
             processes associated with the new run level are started.
      0-1    Reserved.
      2      Default run level.
      3-9    User-defined.
    a,b,c    Processes at the current run levels are not killed; any
             processes assigned with the new run level are started.
    Q,q      The /etc/inittab file is reexamined by the init command.

To display a history of previous run levels:

    /usr/lib/acct/fwtmp < /var/adm/wtmp | grep run-level

bosext2.acct.acct.obj must be installed to use this command.

The file /etc/.init.state contains the current run-level.

==============================================================================
Creating a boot image on a boot logical volume
==============================================================================

The bosboot command does the following:

    Checks the file system to see if there is enough room to create the boot
      image.
    Creates a RAM file system using the mkfs command and a prototype file.
    Calls the mkboot command, which merges the kernel and the RAM file system
      into a boot image.
    Writes the boot image to the boot logical volume.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+ System Configuration and Customization
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

==============================================================================
Structure of ODM
==============================================================================

The databases in /etc/objrepos contain the actual configuration, often called
customized objects, and the root part of the LPP. /usr/lib/objrepos contains
the default configuration, or predefined objects, and the usr part of the LPP.
The shareable part of the LPP is in /usr/share/lib/objrepos. All the files
from /usr/lib/objrepos are accessible in /etc/objrepos via symbolic links.

==============================================================================
File types
==============================================================================

    -    Plain file (hard link)
    d    Directory
    l    Symbolic link
    b    Block special file
    c    Character special file
    s    Socket
    p    Named pipe

==============================================================================
Hard and symbolic (soft) links
==============================================================================

A hard link associates two (or more) filenames with the same inode. Hard links
all share the same disk data blocks while functioning as independent directory
entries. Hard links may not span disk partitions, since inode numbers are only
unique within a given device.

Symbolic links are pointer files that name another file elsewhere in the file
system. Symbolic links may span physical devices, since they point to a UNIX
pathname, not to an actual disk location.

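The difference between the two kinds of link shows up in the inode numbers
printed by "ls -i". The following is a minimal sketch; the function name
same_inode is hypothetical. It compares two names on the same file system and
succeeds only if they share an inode, which is true for hard links but not for
a symbolic link and its target.

```shell
# same_inode FILE1 FILE2
# Succeeds (exit status 0) when both names refer to the same inode.
# "ls -id" prints the inode number first and does not follow a symbolic
# link named on the command line, so a symbolic link reports its own inode.
# Only meaningful for two names on the same file system.
same_inode() {
    a=$(ls -id "$1" | awk '{ print $1 }')
    b=$(ls -id "$2" | awk '{ print $1 }')
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

After "ln orig hard" and "ln -s orig soft", the call "same_inode orig hard"
succeeds while "same_inode orig soft" fails.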
==============================================================================
Sockets and named pipes
==============================================================================

A socket is a special type of file used for communications between processes.
A socket may be thought of as a communications end point, tied to a particular
system port, to which processes may attach.

Named pipes are pipes opened by name by applications. Named pipes often reside
in the directory called /dev, and they serve as another mechanism to
facilitate interprocess communication.

==============================================================================
Duplicating an entire directory tree
==============================================================================

Using tar:

    mkdir /newdir
    cd /newdir
    tar -cvf - -C / olddir | tar -xvpf -
    mv olddir newdir

Using cpio:

    mkdir /newdir
    cd /olddir
    find . -print | cpio -pdvm /newdir

==============================================================================
savebase/restbase
==============================================================================

    savebase - save master ODM custom device data to the boot image.
    restbase - restore custom ODM data from the boot image to master ODM
               database.

A small ODM database representing device configuration information is
maintained as part of the AIX boot image. This information can be updated from
the master ODM database using the savebase command. ODM information from the
boot image can be restored to the master ODM database by invoking the restbase
command.

==============================================================================
RAID levels
==============================================================================

    RAID 0    Striped data, no parity
    RAID 1    Mirrored data
    RAID 2    Striped data with parity
    RAID 3    Striped data with parity drive
    RAID 4    Striped data with parity and asynchronous read access
    RAID 5    Striped data with striped parity

==============================================================================
inode numbers
==============================================================================

    ls -ailF
    istat InodeNumber /dev/hd1
    istat FileName
    find /home -xdev -inum InodeNumber -print

==============================================================================
Data blocks
==============================================================================

Data blocks are used to store the actual file data in the file system. Each
inode contains 13 data block address slots. The first eight address slots
point at the first eight file data blocks of the file. The ninth address
points to an incore inode structure. The disk inode information is copied to
this structure when the file is opened. The tenth through thirteenth addresses
point to indirect data blocks, which are used to address data blocks for large
files. Each indirect block supports 1024 addresses. Because file addressing is
restricted to 32 bits, third-level indirection is not used.

==============================================================================
Location code format for non-SCSI devices
==============================================================================

AA-BB-CC-DD

    AA-  First position identifies I/O bus, usually 0.
         Second position identifies slot number on bus in CPU drawer; 0 on
         workstations
    BB-  First position is the system I/O bus identifier (0 indicates
         Microchannel or PCI; 1 indicates ISA)
         Second position is the slot number of the adapter, memory card or
         adapter for the identified device
    CC-  Connector on an adapter or planar
         0P, 0S, S1, S2, 0D, 0K, 0M, 0T for built-in devices
    DD-  Async port number or FRU location on a card or planar

==============================================================================
Location code format for SCSI devices
==============================================================================

AA-BB-CC-S,L

    AA-  Usually 00
    BB-  First position identifies I/O bus
         Second position identifies the adapter card slot on bus
    CC-  00  For a card that provides a single SCSI bus or a device attached
             to the internal bus for a dual SCSI
         01  Device attached to an external bus on a dual SCSI
         0S  External bus connector of an integrated SCSI controller
    S-   SCSI address of the device
    L-   Logical unit number of the device

==============================================================================
Location code format for PCI RS/6000
==============================================================================

AA-BB-CC-DD

    AA-  Always 00
    BB-  The first digit indicates the bus type
             0 = PCI
             1 = ISA
             2 = PCMCIA
         The second digit indicates slot. For PCI, 1 indicates the first
         identifying adapter, which is the integrated SCSI, 2 indicates the
         lower PCI slot and 3 indicates the upper PCI slot. With the ISA
         adapters, the second digit will be an x.
    CC-  Will be 00 or it will indicate connector designation (i.e., 0D, 0M,
         0K for diskette, mouse and keyboard connectors respectively).
    DD-  Port number for async device, address for SCSI device.

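All three location code formats are dash-separated, so a code can be split
into its fields with plain shell word splitting. This is a sketch only: the
function name split_location is hypothetical, and the "name=value" output
format is an arbitrary choice made for illustration. A last field that
contains a comma is treated as the S,L pair of a SCSI device.

```shell
# split_location CODE
# Prints the fields of a location code, one "name=value" pair per line.
# A last field of the form S,L is split into SCSI address and logical unit.
split_location() {
    old_ifs=$IFS
    IFS=-
    set -- $1          # split the code on "-" into $1 .. $4
    IFS=$old_ifs
    echo "AA=$1"
    echo "BB=$2"
    echo "CC=$3"
    case $4 in
        *,*) echo "S=${4%%,*}"
             echo "L=${4#*,}" ;;
        *)   echo "DD=$4" ;;
    esac
}
```

For example, "split_location 00-08-00-4,0" prints AA=00, BB=08, CC=00, S=4 and
L=0 on separate lines.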
==============================================================================
SCSI I/O controller types
==============================================================================

    Type #    Interface Type
    ------    --------------
    4-1       Single-ended, Narrow
    4-2       Differential, Narrow, Fast
    4-4       Single-ended, Narrow, Fast
    4-6       Differential, Wide, Fast
    4-7       Single-ended, Wide, Fast
    4-C       Differential, Wide, Fast

==============================================================================
Creating a file system on diskettes
==============================================================================

To create the floppy file system, use a clean directory and copy all the files
and directories that you want on the diskette into it. Use the "proto" command
to generate a prototype file:

    proto /tmp/fdfs > /tmp/fd.p

The header of the file needs to be adjusted:

    echo ' 0 0' > /tmp/fd.proto
    sed '1d' < /tmp/fd.p >> /tmp/fd.proto

Now you can use "mkfs" to generate the file system (on a previously formatted
diskette):

    mkfs -p /tmp/fd.proto -V jfs /dev/fd0

Now the floppy can be mounted like any other file system. You only need to
specify that it is readonly and removable:

    mount -p -r /dev/fd0 /mnt

==============================================================================
Making the primary paging space smaller
==============================================================================

Create a temporary paging space in rootvg (e.g. 40 MB paging space paging00):

    mkps -a -n -s 10 rootvg

Deactivate the primary paging space:

    chps -a n hd6

Edit /sbin/rc.boot and change the line that contains "swapon /dev/hd6" so that
it activates the temporary paging space instead:

    [ ! -f /needcopydump ] && swapon /dev/paging00

Build a new boot image:

    bosboot -a -l/dev/hd5 -d/dev/hdisk0

Make sure the dump device no longer points to hd6 (if applicable):

    sysdumpdev -Pp /dev/sysdumpnull

Reboot the system:

    shutdown -Fr

Remove hd6:

    rmps hd6

Create a new paging space:

    mkps -s newsize -a rootvg hdisk0

Rename the new paging space from paging01 to hd6:

    chlv -n hd6 paging01

Edit /etc/swapspaces to reflect the change made in the previous step.

Edit /sbin/rc.boot and reactivate the original "swapon" command:

    [ ! -f /needcopydump ] && swapon /dev/hd6

Rebuild the boot logical volume again:

    bosboot -a -l/dev/hd5 -d/dev/hdisk0

Deactivate the temporary paging space:

    chps -a n paging00

Reboot the machine:

    shutdown -Fr

Remove the temporary paging space:

    rmps paging00

Reset the dump device (if applicable):

    sysdumpdev -Pp /dev/hd6

==============================================================================
Cleaning up the spool directories
==============================================================================

Kill all current print jobs with "qcan -X".

Stop qdaemon with "stopsrc -s qdaemon". If there are still qdaemon processes
or children of it (any pio processes), kill them manually with kill -9.

If you need to save print jobs, copy the spooled files from /var/spool/qdaemon
and /var/spool/lpd to /tmp.

Remove the contents of the spool and control directories:

    rm /var/spool/lpd/*
    rm /var/spool/qdaemon/*
    rm /var/spool/lpd/qdir/*
    rm /var/spool/lpd/stat/*

Activate the queuing daemon again with "startsrc -s qdaemon".

Now you can use your favorite print command to print the saved spool files.

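The step that saves the spooled files before the directories are emptied can
be scripted. The sketch below is an illustration under assumptions: the
function name save_spool and the destination directory are hypothetical, only
regular files directly inside each spool directory are copied, and files with
the same name in different directories overwrite each other in the
destination.

```shell
# save_spool DEST DIR...
# Copies every regular file found directly in each DIR into DEST,
# creating DEST if necessary. Subdirectories such as qdir and stat
# are skipped.
save_spool() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for dir in "$@"; do
        for f in "$dir"/*; do
            if [ -f "$f" ]; then
                cp -p "$f" "$dest/" || return 1
            fi
        done
    done
    return 0
}
```

A call matching the procedure above would be "save_spool /tmp/spoolsave
/var/spool/qdaemon /var/spool/lpd", run after qdaemon has been stopped and
before the rm commands.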
==============================================================================
Starting/stopping qdaemon
==============================================================================

The following method for starting and stopping qdaemon is preferred over using
startsrc/stopsrc:

    chssys -s qdaemon -O    (Does not restart if qdaemon dies)
    enq -G                  (Ends qdaemon gracefully)
    chssys -s qdaemon -R    (Immediately restarts if qdaemon dies)

==============================================================================
Controlling queues and print jobs
==============================================================================

To submit a print job:

    enq -P queue_name file_name

To display the status of a queue:

    enq -q -P queue_name            or  lpstat -Pqueue_name

To display the status of all queues:

    enq -A                          or  lpstat

To delete a job in a queue:

    enq -x job_number               or  cancel job_number

To delete all jobs in a queue:

    enq -X -P queue_name            or  cancel queue_name

To bring a device down, allowing current jobs to finish:

    enq -P queue_name -D            or  disable queue_name
    enq -P queue_name:device_name -D
        (needed if more than 1 device for queue)

To bring a device down, killing current jobs:

    enq -P queue_name -K
    enq -P queue_name:device_name -K
        (needed if more than 1 device for queue)

To bring a device up:

    enq -P queue_name -U            or  enable queue_name
    enq -P queue_name:device_name -U
        (needed if more than 1 device for queue)

To move a pending print job:

    enq -Q new_queue -# job_number  or  qmov -m new_queue -# job_number
    enq -Q new_queue -P old_queue   or  qmov -m new_queue -P old_queue
    enq -Q new_queue -u user_name   or  qmov -m new_queue -u user_name

To hold print jobs in queues:

    enq -h -# job_number            or  qhld -# job_number
    enq -h -P queue_name            or  qhld -P queue_name

To release a previously held job:

    enq -p -# job_number            or  qhld -r -# job_number
    enq -p -P queue_name            or  qhld -r -P queue_name

To submit and hold a print job:

    enq -H -P queue_name file_name

==============================================================================
Canceling a print job to a remote printer
==============================================================================

    qcan -x JobNumber -P PrintQueue

Note: The print queue must be specified.

==============================================================================
Resetting the terminal
==============================================================================

From terminal:

    Try the start key
    Reset terminal from setup menu ()
    Try interrupt, quit keys
    reset
    reset
    stty sane

From another terminal:

    stty -a < /dev/ttyn
    stty sane echo < /dev/ttyn
  or
    kill -9 pid_of_login_shell

==============================================================================
Terminfo
==============================================================================

To compile terminfo source:

    tic terminfo_source

To list source for a compiled terminfo entry:

    infocmp [terminfo_entry]

To list the equivalent termcap entry for a compiled terminfo entry:

    infocmp -C terminfo_entry

To translate a termcap entry into terminfo source:

    captoinfo [termcap_entry]

==============================================================================
Creating the queuing system for batch services
==============================================================================

Define one shell device for each concurrent batch job in /etc/qconfig:

    bsh:
            device = bshdev
            discipline = fcfs
    bshdev:
            backend = /usr/bin/ksh

==============================================================================
Retrieving the logical volume control block
==============================================================================

    getlvcb -ATl lvname

==============================================================================
Synchronizing the ODM and VGDA
==============================================================================

    varyonvg -m vgname     uses VGDA to update ODM.
    varyonvg -m1 vgname    forces VGDA to update ODM.
    varyonvg -m2 vgname    uses VGDA to update ODM only if errors are detected.

==============================================================================
Activating a volume group
==============================================================================
When varyonvg vgname is executed:

(1) The VGDA of each physical volume is read.
(2) The header and trailer time stamps of each VGDA are compared. If they
    match, the VGDA is accepted as valid.
(3) If the number of usable VGDAs reaches the quorum value, the volume group
    is varied on; otherwise varyonvg fails.
(4) The most recent VGDA is used to update the other VGDAs.

==============================================================================
Limiting the volume reorganization to specific physical volumes
==============================================================================
    echo hdisk1 | reorgvg -i rootvg hd1 hd2

Note: hd1 is favored over hd2 during the reorgvg operation.

==============================================================================
Defragmenting a file system
==============================================================================
    defragfs /home1

The -r option generates a report only, and no defragmentation actually takes
place. This command also increases contiguous free space in nonfragmented
file systems.

==============================================================================
Files that grow
==============================================================================
    /var/spool
    /var/spool/mail
    /var/spool/lpd
    /var/spool/mqueue
    /var/spool/qdaemon
    /var/spool/rwho
    /var/spool/uucp
    /var/tmp
    /var/log
    /tmp
    /var/adm
    /var/adm/cron/log
    /var/adm/sulog
    /var/adm/wtmp
    /var/adm/ras/errlog
    /etc/utmp
    /etc/security/failedlogin
    /audit
    ~/smit.script
    ~/smit.log
    /dev/*

==============================================================================
What is the clocal STTY attribute?
==============================================================================
If you are configuring the console tty, then you should include the clocal
STTY attribute in both the RUN TIME and LOGIN fields. This will ensure that
output to the tty is not blocked even when the tty is turned off. However, a
session on that terminal will not be terminated when the terminal is turned
off. One needs to log out explicitly.

==============================================================================
What is a ttyhog error?
==============================================================================
If there is much noise on an unconnected tty line, then there may be many
gettys respawning. init will notice this and stop respawning gettys. If this
happens on your system you will see ttyhog errors in the error report.

==============================================================================
LVM informational commands
==============================================================================
To display all disks on the system                       lspv
To display all volume groups                             lsvg
To display all logical volumes                           lsvg -l `lsvg`
To display all file systems                              lsfs
To display all file systems of a given type              lsfs -v type
To display what logical volumes are in a volume group    lsvg -l vg_name
To display file systems in a volume group                lsvgfs vg_name
To display disks in a volume group                       lsvg -p vg_name
To display volume group a disk is in                     lsvg -n hdiskn
To display disk characteristics and settings             lspv hdiskn
To display volume group settings                         lsvg vg_name
To display logical volume characteristics                lslv lv_name
To display size of an unmounted local file system        lsfs file_system
To compare lv_size and fs_size                           lsfs -q file_system
To display disk usage summary map by region              lspv -p hdiskn
To display free physical partition location on a disk    lspv hdiskn
To display free physical partition location in a vg      lsvg -p vg_name
To display logical volumes in a given disk               lspv -l hdiskn
To display disk region distribution of logical volume    lslv -l lv_name
To display physical-to-logical partition mapping         lslv -m lv_name
To display physical partition usage for a disk           lspv -M hdiskn

==============================================================================
Making screen dumps
==============================================================================
    xwd | xpr -device -s | lp -d ps0

where ps0 is a PostScript printer.

For higher quality screen dumps use pbmplus, ImageMagick, xgrabsc and xv.

==============================================================================
Managing AIX Common Desktop Environment
==============================================================================
To enable the desktop auto-start:
    dtconfig -e
    shutdown -r

To disable the desktop auto-start:
    dtconfig -d
    shutdown -r

To start the desktop login manager manually:
    /usr/dt/bin/dtlogin -daemon

To stop the desktop login manager manually:
    cat /var/dt/Xpid         ---> process_id of the login manager
    kill -term process_id

To modify the desktop profile:
    cp /usr/dt/config/sys/dtprofile $HOME/.dtprofile
    vi $HOME/.dtprofile

Note: When a user logs in to the desktop, the shell environment file
(.profile or .login) is not automatically read.
==============================================================================
Managing power management
==============================================================================
To enable power management:
    pmctrl -e -a enable

To disable power management:
    pmctrl -e -a full_on

To configure power management:
    mkdev -l pmc0

To unconfigure power management:
    rmdev -l pmc0

To start system suspend state:
    pmctrl -e -a suspend

==============================================================================
Limitations for logical storage management
==============================================================================
    Volume group          255 per system
    Physical volume       32 per volume group
    Physical partition    1016 per physical volume, up to 256 MB each in size
    Logical volume        256 per volume group
    Logical partition     32512 per logical volume

==============================================================================
Switching back to the primary disk
==============================================================================
If you have booted from a device that is a mirrored copy of the boot device,
and later corrected the primary disk, brought down the system and connected
the primary disk back to boot from again, you may get errors while attempting
to reboot. The system will continue to boot. You may encounter other problems
with logical volumes on the primary boot device.
To make the primary available again, do the following before you boot from
the primary:
    chpv -va primary_disk
    syncvg -v rootvg

==============================================================================
Making an available disk a physical volume
==============================================================================
    chdev -l hdiskn -a pv=yes

==============================================================================
Making a physical volume an available disk
==============================================================================
    chdev -l hdiskn -a pv=clear

==============================================================================
Migrating boot logical volume from hdisk0 to hdisk1
==============================================================================
    extendvg rootvg hdisk1
    migratepv -l hd5 hdisk0 hdisk1
    bosboot -a -d /dev/hdisk1
    bootlist -m normal hdisk1
    mkboot -c -d /dev/hdisk0

==============================================================================
Changing the name of a volume group
==============================================================================
    umount ...
    varyoffvg old_vgname
    exportvg old_vgname
    importvg -f -y new_vgname hdiskX
    mount all

==============================================================================
Copying a logical volume with the cplv command
==============================================================================
To copy to a new logical volume:
    cplv [ -v VolumeGroup ] [ -y NewLogicalVolume | -Y Prefix ]
         SourceLogicalVolume

To copy to an existing logical volume:
    cplv -e DestinationLogicalVolume [ -f ] SourceLogicalVolume

Do not copy from a larger logical volume containing data to a smaller one.
Doing so results in a corrupted file system because some data (including the
superblock) is not copied.

Copying a logical volume to an existing logical volume will overwrite any
data on that volume without requesting user confirmation.
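
The size warning for cplv -e can be turned into a check that runs before the
copy. This is a minimal sketch under stated assumptions: both logical volumes
use the same physical partition size, and the caller has already looked up
each volume's logical partition count (for example with lslv). The function
name cplv_size_ok is hypothetical.

```shell
#!/bin/sh
# cplv_size_ok SOURCE_LPS DEST_LPS
# Hypothetical guard: succeeds only when the destination logical volume has
# at least as many logical partitions as the source. Assumes both volumes
# share the same physical partition size.
cplv_size_ok() {
    src_lps=$1
    dst_lps=$2
    if [ "$src_lps" -gt "$dst_lps" ]; then
        echo "refusing: source ($src_lps LPs) is larger than" \
             "destination ($dst_lps LPs)" >&2
        return 1
    fi
    return 0
}
```

A caller could then write "cplv_size_ok 58 64 && cplv -e dest_lv src_lv",
where dest_lv and src_lv stand for the real logical volume names.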
==============================================================================
Types of storage
==============================================================================
Working storage:
    Segments are used to implement the data areas of processes and shared
    memory segments. The pages for working storage segments are stored in
    the paging spaces configured in the system.

Persistent storage:
    Segments are used to manipulate files and directories. When a persistent
    storage segment is accessed, the pages are read and written from its
    file system.

Client storage:
    Segments are used to implement some virtual file systems like Network
    File System (NFS) and the CD-ROM file system. The storage for client
    segment pages can be in a local or remote computer.

==============================================================================
Paging space allocation policies
==============================================================================
When the number of free paging-space blocks falls below a threshold known as
the paging-space warning level, the system informs all processes of the
condition by sending the SIGDANGER signal. If the shortage continues and
falls below a second threshold known as the paging-space kill level, the
system sends the SIGKILL signal to processes that are the major users of
paging space.

Processes executed in the early allocation environment mode (PSALLOC=early)
will not be sent the SIGKILL signal should a low paging space condition
occur. The early allocation algorithm guarantees as much paging space as
requested by a memory allocation request. Any processes running under the
default late allocation mode become highly vulnerable to the SIGKILL signal
mechanism.

The paging space required for early allocation mode will almost always be
greater than the paging space required for the default late allocation mode.
A good starting point for determining the right mix for your system is to
define a paging space four times greater than the amount of physical memory.

==============================================================================
Before removing the failed drive
==============================================================================
To look at the contents of the failing drive (hdisk2), use one of the other
drives (hdisk3) in the same volume group:
    lspv -M -n hdisk3 hdisk2

Back up all single-copy logical volumes, if possible.

Unmount all single-copy file systems from the failing physical volume:
    umount /directory

Remove all single-copy file systems from the failed physical volume:
    rmfs /directory

Remove all mirrored logical volumes located on the failing disk:
    rmlvcopy lv_name 2 hdisk2

Remove the dump device and any paging spaces located on the disk.

Remove any other logical volumes:
    rmlv -f lv_name

Reduce the size of the volume group to omit the failed drive:
    reducevg -df vg_name hdisk2

Shut down the system:
    shutdown -F

==============================================================================
After reformatting a drive
==============================================================================
If you were unable to reducevg the disk from the old volume group before the
disk was reformatted, the following procedure can help clean up the VGDA/ODM
information:

If the volume group consisted of only one disk, enter:
    exportvg vg_name

If the volume group consists of more than one disk, first run the command:
    varyonvg vg_name            --> hdiskX pv_id PVNOTFND
Enter:
    varyonvg -f vg_name         --> hdiskX pv_id PVREMOVED
Enter the command:
    reducevg -df vg_name pv_id

==============================================================================
After adding a reformatted or replacement disk drive
==============================================================================
Reboot or run the following commands:
    cfgmgr
    mkdev -l hdiskX

List the disks:
    lsdev -C -c disk

Make the disk available:
    chdev -l hdiskX -a pv=yes

Add the new disk drive to the volume group:
    extendvg vg_name hdiskX

Recreate the single-copy logical volumes on the new disk drive:
    mklv -y lv00 vg_name 1 hdiskX

Recreate the file systems on the logical volume:
    crfs -v jfs -d lv_name -m /directory

Restore single-copy file system data from backup media:
    cd /directory
    restore -rqf /dev/rmt0

Recreate the mirrored copies of logical volumes:
    mklvcopy lv_name 3 hdiskX

Synchronize the new mirror with the data on the current mirrors:
    syncvg -p hdisk3

==============================================================================
Fixing a damaged file system
==============================================================================
Unmount the damaged file system.

Assess file system damage:
    fsck /dev/hd1

If the file system cannot be repaired, restore it from backup:
    mkfs /dev/hd1
    mount /dev/hd1 /filesys
    cd /filesys
    restore -r

==============================================================================
Maximum journaled file system size
==============================================================================
Logical partition size of the volume group:  32 x 1016 x partition_size
Fragment size:                               ( 2 ** 28 ) x fragment_size
Number of i-nodes:     number of bytes per i-node (NBPI) x ( 2 ** 24 )

+-------------+--------------------+-----------------------+-----------------+
| NBPI Ratio  | Fragment Size      | Partition Size (MB)   | Max FS Size(GB) |
+-------------+--------------------+-----------------------+-----------------+
| 512         | 512,1024,2048,4096 | >=2                   | 8               |
+-------------+--------------------+-----------------------+-----------------+
| 1024        | 512,1024,2048,4096 | >=2                   | 16              |
+-------------+--------------------+-----------------------+-----------------+
| 2048        | 512,1024,2048,4096 | >=2                   | 32              |
+-------------+--------------------+-----------------------+-----------------+
| 4096        | 512,1024,2048,4096 | >=4                   | 64              |
+-------------+--------------------+-----------------------+-----------------+

==============================================================================
Log size issues for greater than 2GB file systems
==============================================================================
The default log size (4 MB) may not be sufficient when file systems exceed
2GB. JFS log sizes must be scaled upward as file system size increases.

To increase the log size beyond the usual 4MB (or 1 partition):
    crfs ... -l
    or
    mklv -t jfslog ...; logform
    or
    umount all; extendlv log_lv; logform; mount all

==============================================================================
Reducing the /usr file system size
==============================================================================
Remove any files in /usr that you do not want.

Make sure all file systems in the rootvg volume group are mounted.

Type the command:
    mkszfile

Edit the /image.data file:

    lv_data:
            VOLUME_GROUP= rootvg
            .
            .
            LPs= 58
            .
            .
            LV_MIN_LPS= 51

    fs_data:
            FS_NAME= /usr
            .
            .
            FS_SIZE= 475136
            .
            .
            FS_MIN_SIZE= 417792

The FS_SIZE value is calculated:
    FS_SIZE = PP_SIZE (in KB) * 2 (512-byte blocks per KB) * LPs

Unmount all file systems that are NOT in the rootvg volume group.

Varyoffvg and exportvg any user-defined volume groups.

With a tape in the tape drive, type the following command:
    mksysb /dev/rmt0

Restore the operating system from the tape in service mode.

Reboot the system.

Import all user-defined volume groups.

Mount all file systems.
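
The FS_SIZE arithmetic above can be checked with a few lines of shell. This
is a minimal sketch; the function name fs_size is hypothetical, and the 4 MB
(4096 KB) physical partition size used in the example is inferred from the
sample values (475136 / 58 / 2 = 4096), not stated in the original.

```shell
#!/bin/sh
# fs_size PP_SIZE_KB LPS
# Hypothetical helper: prints the file system size in 512-byte blocks for a
# logical volume of LPS logical partitions, each PP_SIZE_KB kilobytes large.
fs_size() {
    pp_kb=$1
    lps=$2
    # One kilobyte holds two 512-byte blocks.
    echo $(( pp_kb * 2 * lps ))
}

fs_size 4096 58    # matches FS_SIZE= 475136 in the sample image.data
fs_size 4096 51    # matches FS_MIN_SIZE= 417792 (LV_MIN_LPS= 51)
```

When shrinking /usr, the new LPs value and the new FS_SIZE value edited into
/image.data should satisfy the same relation.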
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+  Network Configuration and Customization
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

==============================================================================
Setting up anonymous FTP
==============================================================================
Run /usr/samples/tcpip/anon.ftp as root user to create the ftp and anonymous
user id's, and to set up the correct directory structure in ftp's $HOME.

If you want to have a different directory for the anonymous FTP server,
create the ftp user yourself and set the directory you desire.

The ~ftp/pub directory can be written by everyone. To tighten security, files
or directories in the ~ftp directory tree should not be writable by ftp, and
all profiles should be removed. Install fake group and passwd files in the
~ftp/etc directory.

Modify /etc/inetd.conf and change the ftp configuration line from
    ftp  stream  tcp  nowait  root  /usr/sbin/ftpd  ftpd
to
    ftp  stream  tcp  nowait  root  /usr/sbin/ftpd  ftpd -l
and then execute "refresh -s inetd" to log all requests via syslogd.

==============================================================================
Network address classes
==============================================================================
    1-127.h.h.h            Class A
    128-191.n.h.h          Class B
    192-223.b.b.h          Class C
    224-239.x.x.x          Class D, multicast addresses
    240.x.x.x-255.x.x.x    Reserved for the Internet Activities Board (IAB)

The following networks are reserved for private use (RFC 1918) and are never
routed on the Internet:
    10                     (formerly the ARPANET)
    172.16-172.31
    192.168.0-192.168.255

The following addresses are reserved:
    0.0.0.0                Old-style broadcast address
    255.255.255.255        New-style broadcast address
    127.0.0.1              Loopback address

==============================================================================
Network routing options
==============================================================================
If the LAN consists of a single network, no explicit routing is usually
needed. The ifconfig commands used to configure the network interfaces will
provide them with enough information for them to route packets to their
destination.

Static routing may be used for small to medium-sized networks not
characterized by many redundant paths to most destinations. This is set up by
issuing explicit route commands at boot time. To add a static route to a
separate LAN:
    chdev -l inet0 -a route=net,-hopcount,1,192.1.6,192.1.6.1

Dynamic routing, in which optimal paths to destinations are determined at
packet transmission time, may be used via the routed or gated daemon.

==============================================================================
OSI and TCP/IP network architectures
==============================================================================
+--------------------------------------+-------------------------------------+
|                 OSI                  |               TCP/IP                |
|--------------------------------------|-------------------------------------|
| APPLICATION LAYER                    | APPLICATION LAYER                   |
| Specifies how application programs   | Handles everything else; TCP/IP     |
| interface to the network and         | network daemons and applications    |
| provides services to them.           | have to perform the jobs of the OSI |
|                                      | Presentation Layer and part of its  |
|--------------------------------------| Session Layer themselves (many      |
| PRESENTATION LAYER                   | protocols and services).            |
| Specifies data representation to     |                                     |
| applications.                        |                                     |
|--------------------------------------|                                     |
| SESSION LAYER                        |-------------------------------------|
| Creates, manages and terminates      | TRANSPORT LAYER                     |
| network connections.                 | Manages all aspects of data routing |
|--------------------------------------| and delivery, including session     |
| TRANSPORT LAYER                      | initialization, error control and   |
| Handles error control and sequence   | sequence checking (TCP and UDP      |
| checking for data moving over the    | protocols).                         |
| network.                             |-------------------------------------|
|--------------------------------------| INTERNET LAYER                      |
| NETWORK LAYER                        | Responsible for data addressing,    |
| Responsible for data addressing and  | transmission, and packet fragmenta- |
| routing and communications flow      | tion and reassembly (IP protocol).  |
| control.                             |                                     |
|--------------------------------------|-------------------------------------|
| DATA LINK LAYER                      | NETWORK ACCESS LAYER                |
| Defines access methods for the       | Specifies procedures for transmit-  |
| physical medium.                     | ting data across the network,       |
|--------------------------------------| including how to access physical    |
| PHYSICAL LAYER                       | medium (many protocols including    |
| Specifies the physical medium's      | Ethernet and FDDI)                  |
| physical and procedural operating    |                                     |
| characteristics.                     |                                     |
+--------------------------------------+-------------------------------------+

==============================================================================
Automounter
==============================================================================
The automounter uses configuration files known as maps, which are of two
types:

Direct maps hold entries for remote directories to be mounted on demand by
the automounter:
    automount /- /etc/map.direct

The contents of /etc/map.direct are really just abbreviated versions of
traditional NFS entries:
    /metal    -intr    dalton:/metal

Indirect maps are used for local directories whose subdirectories are each
NFS-mounted, most likely from several to many different remote hosts:
    automount /homes /etc/auto.homes

Contents of /etc/auto.homes:
    chavez    -rw,intr    dalton:/home/chavez
    harvey    -rw,intr    iago:/home/harvey

Never terminate the automounter with kill -9:
    kill `ps -ef | grep [a]utomount | awk '{print $2}'`

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+  Distributed Services
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

==============================================================================
How X applications search for resources
==============================================================================
Usually X applications search for resources in the following sequence:

(1)  Command line
(2)  A resource file that is specified via $XENVIRONMENT
(3)  $HOME/.Xdefaults-
(4)  Resource manager properties in the X server, which are the resources
     loaded via the xrdb command
(5)  $HOME/.Xdefaults
(6)  $XAPPLRESDIR/
(7)  $HOME/$LANG/
(8)  $HOME/
(9)  /usr/lib/X11/$LANG/app-defaults/
(10) /usr/lib/X11/app-defaults/
(11) Predefined resources in the application

==============================================================================
Using the fontserver
==============================================================================
(1) Install X11.fnt.fontServer on
    the font server
(2) Edit /usr/lib/X11/fs/config to configure the font server
(3) Run fsconf to activate the font server
(4) To use the font server execute "xset fp+ tcp/YourFontServer:7500" or
    start the X server with the option -fn tcp/YourFontServer:7500

Note: A standard font server listens on port 7000, but IBM's default is 7500.

==============================================================================
Remapping keys for aixterm
==============================================================================
Include the following in your ~/.Xdefaults file:

    aixterm.Translations: #override \
        F1: string("vi ~/.Xdefaults\^M") \n\
        F2: string("/usr/bin/smitty\^M") \n\
        .
        .

Note: Use to enter ^M in vi.

To make the backspace key emit a DELETE code instead of CTRL-H:
    xmodmap -e "keysym Backspace = Delete"

==============================================================================
Using the right $DISPLAY variable
==============================================================================
    :0             connect to the X server with the fastest possible method -
                   shared memory transport
    unix:0         connect to the X server with local UNIX sockets
    localhost:0    connect to the X server with IP sockets
    :0             connect to the X server with the network layer

==============================================================================
Installing aixterm on a remote system
==============================================================================
Perform the following one-time operation on the remote system:

(1)  su
(2)  cd /tmp
(3)  mkdir Xxxxx
(4)  cd Xxxxx
(5)  ftp localSystemName
(6)  cd /usr/share/lib/terminfo
(7)  get ibm.ti
(8)  quit
(9)  TERMINFO=/tmp/Xxxxx
(10) export TERMINFO
(11) tic ibm.ti
(12) ls
(13) ls a
(14) mkdir /usr/share/lib/terminfo/a
(15) cp a/aixterm* /usr/share/lib/terminfo/a
(16) cd /tmp
(17) rm -r /tmp/Xxxxx
(18) exit

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+  User Management and Security
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

==============================================================================
Logging off a user after a period of inactivity
==============================================================================
In /etc/profile:
    TMOUT=xxx;export TMOUT        (for Korn shell)
    TIMEOUT=xxx;export TIMEOUT    (for bsh shell)

where xxx is in seconds.

==============================================================================
Using access control list (ACL)
==============================================================================
ACLs extend the permission bits with three fields: permit, deny and specify.
Example:

    attributes:
    base permissions
        owner(zaphod):  rw-
        group(staff):   r--
        others:         ---
    extended permissions
        enabled
        permit   rw-   u:artur
        deny     rwx   g:hackers
        specify  ---   u:boss,g:system

Every time chmod is used with an absolute numeric argument, the extended
permissions are disabled. ACLs even work across NFS.

"ls -e" shows '-' for no ACL and '+' for files with ACLs.

To find files on the system that have ACLs applied:
    find / -perm -200000000 -print

==============================================================================
Caution using recursive chown
==============================================================================
Using the -R flag makes chown follow symbolic links. To get around this
problem use an additional flag -h, or

    find /directory \( -type f -o -type d \) -print | xargs chown user_name

==============================================================================
Automatic log in
==============================================================================
Create the file /etc/autolog containing a user name. At the next reboot the
system will automatically start the login shell for this user on the console
without any authentication. Set the shell for this id to the application you
want to run on the machine.
==============================================================================
File/Directory Permissions
==============================================================================
        File                              Directory
--------------------------------------------------------------------
r       user can read contents of file    user can list contents of dir

w       user can change contents of       user can create and remove
        file                              files within directory

x       user can use the file name as     user can "cd" to directory and
        a command                         can use it in PATH

SUID    program runs with effective UID   -
        of owner

SGID    program runs with effective GID   files created in directory have
        of owner                          the same group as the directory

SVTX    -                                 user must own file or directory
                                          to delete a file in directory

It is not required to have write access on a file to delete it; write access
to the directory where the file resides is sufficient.

You only need read access to a directory in order to do a simple "ls"; any
operation that involves more than simply reading the list of filenames from
the directory is going to require execute access.
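
The point about deleting a file without write access to it can be tried out
in a scratch directory. This is a minimal sketch; the function name
demo_delete_readonly and the scratch directory name are hypothetical, and
the directory is created under $TMPDIR (or /tmp if that is unset).

```shell
#!/bin/sh
# demo_delete_readonly
# Hypothetical demonstration: creates a read-only file in a writable scratch
# directory, removes it, and prints whether the file is gone.
demo_delete_readonly() {
    dir="${TMPDIR:-/tmp}/permdemo.$$"
    mkdir "$dir" || return 1
    touch "$dir/file"
    chmod 444 "$dir/file"     # no write permission on the file itself
    rm -f "$dir/file"         # allowed: the directory is writable by us
    if [ -e "$dir/file" ]; then
        result="still present"
    else
        result="removed"
    fi
    rm -rf "$dir"             # clean up the scratch directory
    echo "$result"
}
```

Removing write permission from the directory instead (chmod a-w on the
directory) is what prevents an unprivileged user from deleting the file.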
==============================================================================
Validating the user environment
==============================================================================
pwdck verifies the validity of local authentication information:
    pwdck [options] [ALL | username]

usrck verifies the validity of a user definition:
    usrck [options] [ALL | username]

grpck verifies the validity of a group:
    grpck [options] [ALL | groupname]

Options:
    -n    reports errors but does not fix them
    -p    fixes errors but does not report them
    -t    reports errors and asks if they should be fixed
    -y    fixes errors and reports them

==============================================================================
The "X" access type
==============================================================================
The "X" access type grants execute access to the specified access classes
only when execute access is already set for some access class.

    $ ls -lF
    -rw-------
    $ chmod go+rX *
    $ ls -lF
    -rw-r--r--

==============================================================================
Using the pwdadm command
==============================================================================
To list a user's current status:
    pwdadm -q user_name

To disable default password restrictions of a user:
    pwdadm -f NOCHECK user_name

To clear all flags for a user:
    pwdadm -c user_name

To force every user on the system to change their password:

    #!/bin/ksh
    users="`lsuser -a id ALL | grep \"id=[2-7][0-9][0-9]\" | awk '{print $1}'`"
    for u in $users
    do
        pwdadm -f ADMCHG $u
    done

==============================================================================
User account data files
==============================================================================
    /etc/passwd
    /etc/security/passwd
    /etc/security/user
    /etc/security/limits
    /etc/security/environ

==============================================================================
Restricted shell
==============================================================================
/bin/Rsh is a hard link to /bin/bsh. For a restricted Korn shell, create a
hard link /bin/rksh to /bin/ksh.

Users of a restricted shell may NOT:
    use the "cd" command
    set or change the value of the PATH variable
    specify a command or filename containing a slash (/)
    use output redirection (> or >>)

==============================================================================
UNIX groups
==============================================================================
Although it is not required except under AIX, all groups are generally listed
in the /etc/group file.

To display a user's group memberships:
    groups

To display the currently active group:
    id

To change the primary group after login:
    newgrp new_group

To add the listed groups to the group set:
    setgroups -a group1,group2,...

To delete the listed groups from the group set:
    setgroups -d group1,group2,...

To set the group set to the specified list of groups:
    setgroups -s group1,group2,...

To add a group to the current group set (if necessary) and designate it as
the real group ID (group owner of new files and processes, etc.):
    setgroups -r group_name

==============================================================================
Trusted computing base checking
==============================================================================
A trusted computing base (TCB) is a system environment whose security is
verifiably trustworthy and that includes the capability of ensuring its
continued integrity.

Users interact with the system in a trusted mode via a trusted path (TCP),
which eliminates any untrusted applications and operating system components
before allowing access to the TCB. Communication with the TCB is initiated by
pressing the Secure Attention Key (SAK) sequence (CTRL-X CTRL-R by default).
To report the security state of the system:
    -n ALL

==============================================================================
Monitoring unsuccessful login attempts
==============================================================================
To display the username and number of unsuccessful logins when this value is
greater than 3:

    egrep '^[^*].*:$|gin_coun' /etc/security/user | \
    awk '{if (NF>1 && $3>3) {print s,$0}} ; NF==1 {s=$0}'

==============================================================================
Host/account level equivalence
==============================================================================
When a user from a remote host attempts an access (with rlogin, rsh, or rcp),
and if the host requesting access is not listed in /etc/hosts.equiv, a
password will be required. If the /etc/passwd file in the target machine
contains an account with the same username as the user on the remote system,
then the remote access is permitted without requiring the user to enter that
account's password. If the user is trying to log in under a different
username or is the superuser, the /etc/hosts.equiv file is not used.

Account-level equivalence uses a file called .rhosts in the home directory of
the target account:

    hostname [username1 username2 ...]

Each line means that usernameX is allowed to log in to this account from
hostname. If usernameX is not present, then only the same username as the
owner of the .rhosts file can log in from hostname.

If a remote access is attempted and the access does not pass the host level
equivalence test, the remote host then checks the .rhosts file in the home
directory of the target account. If it finds the hostname and username of the
person making the attempted access, the remote host allows the access to take
place without requiring the user to enter a password.

Account-level equivalence should never be used for the superuser. There
should be no .rhosts file in the root directory.
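
The account-level rule described above can be sketched as a small shell
function, useful for auditing what a given .rhosts file would permit. This
is an illustration of the rule as this document states it (one hostname
followed by zero or more usernames per line), not a reimplementation of what
rlogind or rshd actually do; the function name rhosts_allows is hypothetical.

```shell
#!/bin/sh
# rhosts_allows FILE OWNER HOST USER
# Hypothetical audit helper: succeeds if USER connecting from HOST would be
# accepted by the .rhosts-style FILE belonging to the account OWNER.
rhosts_allows() {
    awk -v owner="$2" -v host="$3" -v user="$4" '
        $1 == host {
            # Hostname alone: only the same username as the owner may log in.
            if (NF == 1 && user == owner) ok = 1
            # Hostname followed by names: any listed username may log in.
            for (i = 2; i <= NF; i++) if ($i == user) ok = 1
        }
        END { exit (ok ? 0 : 1) }
    ' "$1"
}
```

For example, "rhosts_allows /home/chavez/.rhosts chavez dalton harvey"
succeeds only if the file has a line for host dalton that lists harvey.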
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+ Backup and Recovery
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

==============================================================================
Backup devices - diskette
==============================================================================
For a 3 1/2" (1.44 MB) diskette drive:
    /dev/fdxl      720 KB
    /dev/fdxh      1.44 MB
    /dev/fdx.9     720 KB
    /dev/fdx.18    1.44 MB

The format command formats a diskette with the highest capacity by default:
    format                   (high capacity)
    format -d /dev/rfd1.9    (low capacity)
    format -l                (low capacity)

The fdformat command can only format a diskette in drive /dev/fd0:
    fdformat       (low capacity)
    fdformat -h    (high capacity)

You can copy a diskette using the flcopy command. You can read and write to
PC-DOS diskettes using the dosdir, dosread, and doswrite commands.

==============================================================================
Backing up and restoring rootvg volume group
==============================================================================
To create a bootable image of the rootvg volume group:
    mksysb /dev/rmt0

The -i option creates /image.data. The /bosinst.data file will also be
created unless there is already a user-generated one. The default for
/bosinst.data can be found in /usr/lpp/bosinst/bosinst.template.
The -e option excludes files listed in /etc/exclude.rootvg.

To generate the /image.data file for customization before backing up rootvg:
    mkszfile

Before backing up the rootvg volume group:
    Mount all the file systems you want to backup
    Unmount any local directories that are mounted over another local
    directory
    Make at least 88 MB of free disk space available in the /tmp directory

To restore a file /usr/bin/vi, for example, from mksysb backup:
    cd /
    tctl -f /dev/rmt0 rewind
    tctl -f /dev/rmt0 fsf 3
    restore -xqf /dev/rmt0.1 -s 1 ./usr/bin/vi

The last 3 steps can be combined into 1 step as follows:
    restore -xqf /dev/rmt0.1 -s 4 ./usr/bin/vi

==============================================================================
Using savevg/restvg
==============================================================================
To backup the volume group homevg:
    savevg -f /dev/rmt0 homevg

The -i option creates /tmp/vgdata/homevg/homevg.data.
The -e option excludes files listed in /etc/exclude.homevg.

To generate the /tmp/vgdata/homevg/homevg.data file for customization before
backing up homevg:
    mkvgdata homevg

To restore a backup created by savevg:
    restvg -qf /dev/rmt0 -s

The -s option specifies that the logical volumes be created at the minimum
size possible to accommodate the file systems. The disk for the volume group
needs to be unallocated for restvg to work. The restored file systems will be
automatically shrunk to their minimum size.

To restore the volume group to different hard disks make sure they are in the
available state and do not currently belong to a volume group:
    restvg -qf /dev/rmt0 hdiskN ...
To list the contents of the tape: restore -Tqf /dev/rmt0 To restore a file /home/xxx/.profile: restore -xqf /dev/rmt0 ./home/xxx/.profile ============================================================================== Backup/restore by file ============================================================================== To backup by file: cd / find /home -print | backup -iqf /dev/rmt0 To restore from this tape: cd / restore -xqf /dev/rmt0 To list contents of this tape: restore -Tqf /dev/rmt0 Only backup/restore will preserve ACL. If the backup was created with relative pathnames then the files will be restored relative to the current directory. ============================================================================== Backup/restore by inode (file system) ============================================================================== To backup by inode: [[ ! -a /etc/dumpdates ]] && touch /etc/dumpdates backup -nuf /dev/rmt0 /home or backup -nuf /dev/rmt0 /dev/homelv where n = 0-9 To restore a file system: mkdir /home cd /home restore -rqf /dev/rmt0 (Level 0 backup) . . restore -rqf /dev/rmt0 (Level 9 backup) To restore an individual file: cd /home restore -xqf /dev/rmt0 ./YourFile To list contents of this tape: restore -Tqf /dev/rmt0 Only backup/restore will preserve ACL. You should unmount the file system before you use backup by inode. This is not required for /. Always delete the restoresymtable file when the restore is finished. It is also very important when you are restoring that the file system is mounted. Backing up by I-node does not work properly for files that have UID or GID greater than 65535. These files are backed up with UID or GID truncated and will have the wrong UID or GID attributes when restored. 
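
Because backup by inode mishandles UIDs and GIDs above 65535, it can help to
check for such accounts before choosing this method. The following is a
minimal sketch, assuming a standard colon-separated passwd layout (name in
field 1, UID in field 3, GID in field 4); the function name list_large_ids is
hypothetical. It reports accounts, not files, so it is only a first
indication of whether affected files may exist.

```shell
#!/bin/sh
# list_large_ids PASSWD_FILE   (hypothetical helper)
# Prints "name uid gid" for every entry whose UID or GID exceeds 65535,
# the limit named above for backup by inode.
list_large_ids() {
    awk -F: '$3 > 65535 || $4 > 65535 { print $1, $3, $4 }' "$1"
}
```

Typical use would be "list_large_ids /etc/passwd"; no output means no
account exceeds the limit.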
==============================================================================
Using rdump/rrestore for remote backup
==============================================================================
These commands are the remote versions of backup in file system mode. The
target machine requires a proper /.rhosts file to make it work.

If the rmt command on the remote server is not in /usr/sbin/rmt then a link
will need to be created on the remote server from /usr/sbin/rmt to its actual
location (usually /etc/rmt):
    ln -s /etc/rmt /usr/sbin/rmt

To backup /home to the remote server:
    [[ ! -a /etc/dumpdates ]] && touch /etc/dumpdates
    rdump -nuf -d 6250 -s 33000 server:/dev/rmt0 /home  (8mm 2.3GB)
    rdump -nuf -d 6250 -s 80000 server:/dev/rmt0 /home  (8mm 5GB compressed)
where n = 0-9

To restore from the remote tape:
    mkdir /home
    cd /home
    rrestore -rqf server:/dev/rmt0    (Level 0 backup)
    .
    .
    rrestore -rqf server:/dev/rmt0    (Level 9 backup)

==============================================================================
Using cpio for backup
==============================================================================
Advantages of using cpio over tar:
    Designed to easily back up completely arbitrary sets of files
    Packs data on tape much more efficiently than tar
    Skips over bad spots on the tape on restore
    Can span tapes

To backup a file system:
    cd /
    find ./home -print | cpio -o > /dev/rmt0
or
    find ./home -cpio /dev/rmt0

To restore a file system:
    cd /
    cpio -idv < /dev/rmt0

To list the contents of an archive:
    cpio -itv < /dev/rmt0

==============================================================================
Using tar for backup
==============================================================================
To create a relative backup of a file system:
    cd /
    tar -cf /dev/rmt0 ./home

To restore:
    cd /
    tar -xf /dev/rmt0

When using tar in pipes use the -B flag for 512-byte blocking which is
necessary on pipes:
    cd /
    tar -cBf - ./home | compress | dd of=/dev/rmt0

To restore such an archive:
    cd /
    dd
if=/dev/rmt0 | uncompress | tar -xBf - Note: The default tape drive can be set with the TAPE variable. ============================================================================== Using pax for backup ============================================================================== Pax can handle both cpio and tar formats. It defaults to tar format but with the added error recovery of cpio. To backup a file system: pax -wf /dev/rmt0 /home To restore from the archive: pax -rpe -f /dev/rmt0 To list the archive: pax -f /dev/rmt0 To copy files to an alternate directory: pax -rw /olddir /newdir When using pax it is not necessary to use relative pathnames. They can be changed when restoring file, regardless of whether the archive was created with pax, tar or cpio: pax -rpe -f /dev/rmt0 -s":^/olddir:/newdir:g" Note: The -pe option is used to preserve not only the modification time but also the ownership of the files. ============================================================================== Remote backup using rsh ============================================================================== To backup without compression: cd /home tar -cBf - . | rsh server "dd ibs=512 of=/dev/rmt0 obs=16k" To restore without compression: cd /home rsh server "dd if=/dev/rmt0 ibs=16k obs=512" | tar -xf - To backup with compression: cd /home tar -cBf - . | rsh server "compress | dd of=/dev/rmt0 obs=16k conv=sync" To restore with compression: cd /home rsh server "dd if=/dev/rmt0 bs=16k" | uncompress | tar -xf - To create a remote copy of a directory: tar -cBf - . 
| rsh server "(cd /target; tar -xf -)"

==============================================================================
Restoring libc.a
==============================================================================
Get to a maintenance shell without mounting the file systems
("getrootfs hdisk0 sh"):
    tctl -f /dev/rmt0 rewind
    tctl -f /dev/rmt0 fsf 3
    mkdir /mnt/usr
    mount /dev/hd2 /mnt/usr
    cd /mnt
    restbyname -xqf /dev/rmt0.1 ./usr/ccs/lib/libc.a
    sync;sync;reboot -q

==============================================================================
Replacing a failing disk
==============================================================================
Assuming that hdisk3 containing mirrored copies of datalv1 and datalv2 went
bad, but it is still possible to access it:

(1)  Make a backup.
(2)  Make the physical volume inaccessible for normal file system operation:
         chpv -v r hdisk3
(3)  Remove the copies for datalv1 and datalv2 from hdisk3:
         rmlvcopy datalv1 2 hdisk3
         rmlvcopy datalv2 2 hdisk3
(4)  Remove any non-mirrored logical volumes on this physical volume:
         rmlv -p hdisk3 datalv3
(5)  If there is data spread over several disks including hdisk3 then use
     migratepv to move the data off the disk to make sure that the spread
     file system is at least consistent on the remaining disks even if you
     lose data.
(6)  Remove the physical volume from the volume group:
         reducevg datavg hdisk3
(7)  Remove the disk from the ODM:
         rmdev -l hdisk3 -d
(8)  Remove the disk from the SCSI bus.
(9)  Add the new SCSI disk.
(10) Run cfgmgr.
(11) Add the disk to the volume group (disk name may have changed):
         extendvg datavg hdisk3
(12) Add new copies to the new disk:
         mklvcopy datalv1 3 hdisk3
         mklvcopy datalv2 3 hdisk3
(13) Synchronize the mirrored copies:
         syncvg -v datavg
(14) Restore any non-mirrored logical volumes from backup.

Note: syncvg should always be run in the foreground. Running syncvg in the
background could prevent you from mounting a file system.
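
For review before touching a real system, the command-line steps of the
procedure above can be printed out for a given volume group, disk and list of
mirrored logical volumes. This is only a sketch: the function name
print_replace_plan and its arguments are hypothetical, it prints the commands
from steps (2), (3), (6), (7) and (10) to (13) without running anything, and
it assumes every listed logical volume has the same number of copies. Steps
(1), (4), (5), (8), (9) and (14) remain manual.

```shell
#!/bin/sh
# print_replace_plan VG DISK COPIES LV...   (hypothetical helper)
# Prints, but does not execute, the LVM commands from the steps above.
# COPIES is the mirrored copy count each LV should end up with.
print_replace_plan() {
    vg=$1; disk=$2; copies=$3
    shift 3
    echo "chpv -v r $disk"
    for lv in "$@"; do
        # Drop to one copy fewer, removing the copy on the failing disk.
        echo "rmlvcopy $lv $((copies - 1)) $disk"
    done
    echo "reducevg $vg $disk"
    echo "rmdev -l $disk -d"
    echo "cfgmgr"
    echo "extendvg $vg $disk"
    for lv in "$@"; do
        echo "mklvcopy $lv $copies $disk"
    done
    echo "syncvg -v $vg"
}
```

Running "print_replace_plan datavg hdisk3 3 datalv1 datalv2" lists the same
commands as the worked example; remember that the disk name may change after
cfgmgr.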
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+ System Performance and Tuning
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

==============================================================================
Improving throughput of high-bandwidth connections (FDDI, ATM, etc.)
==============================================================================
    no -o rfc1323=1

RFC 1323 describes additional TCP options that extend the usual TCP windowing
facilities to support larger window sizes.

==============================================================================
General performance tuning diagnostics
==============================================================================
Is system CPU-bound?
    Run iostat -> %user + %system CPU utilization > 90%
    Run ps auxw to locate CPU dominant processes
    Run tprof to gather data over a time period
    Optimize application

Is system disk-bound?
    iostat/vmstat/sar indicates iowait > 40%
    filemon shows high utilization of PV > 70%
    I/O to paging space dominates
        Use svmon to determine which programs use the most real memory,
        reschedule programs to off-peak hours
        Add new paging space with the mkps command
        Add new paging space to another PV
    I/O to local files dominates
        Add more memory, reschedule jobs or spread I/O over multiple PVs
        Reorganize file systems and logical volumes
    I/O to remote files dominates
        Add memory, reschedule jobs or localize "hot" files

Is system memory-bound?
    vmstat shows free frames < 2 per MB real memory and
    Pagein rate > 5/sec
    Run svmon to detect memory leaks and misuse of pinned pages

Is system network-bound?
Goals Balance demands of users against resource constraints to ensure acceptable network performance Steps Characterize workload, configuration, bandwidth Measure performance: Run tools Identify bottlenecks Tune network parameters Monitoring tools netstat (availability of network) nfsstat (retransmissions) netpmon (network CPU and I/O) Tuning tools no chdev ifconfig ============================================================================== Review of I/O performance factors ============================================================================== LV/file system organization Allocate hot LVs to different PVs Spread hot LVs across multiple PVs Place hottest LVs in center of PVs Make LVs contiguous Defrag file systems File system utilization Move hot files to local vs. Remote node Consider buffered or asynchronous I/O strategy Consider rescheduling to off peak hours Disk/SCSI hardware Consider adding disk drives Use the fastest disk drives Consider adding a second SCSI adapter Other memory to improve file mapping Consider adding another paging space I/O pacing ============================================================================== When to add a SCSI adapter ============================================================================== For maximum aggregate performance the total of the transfer rates (Kbps) from iostat output should be 70% of the SCSI adapter throughput rating. 
    SCSI-1 rating:  4 MB/sec    Optimum throughput <  2.8 MB/sec
    SCSI-2 rating: 10 MB/sec    Optimum throughput <  7.0 MB/sec
                                (fast mode)
    SCSI-2 rating: 20 MB/sec    Optimum throughput < 14.0 MB/sec
                                (fast/wide mode)

==============================================================================
Achieving highest LVM performance
==============================================================================
Number of copies of each LP: 1 (no mirroring)
If mirroring is needed, set:
    A) Scheduling policy: parallel ("closest read")
    B) Allocation policy: strict (each copy on separate PV)
Write verify: no
Intra-disk policy:
    A) Center: for "hot" LVs
    B) Middle: for "moderate" LVs
    C) Edge: for "cold" LVs
Inter-disk policy: maximum (R/W shared among PVs)
One adapter for each PV

Note: The inter-disk policy will take precedence over the intra-disk policy.

==============================================================================
For highest LVM availability
==============================================================================
Use 3 LP copies (mirroring twice)
Write verify: yes
Inter policy: minimum (mirroring copies = # of PVs)
Scheduling policy: sequential
Allocation policy: strict (no mirroring on the same PV)
Include at least 3 PVs in a VG
Mirror the copies on PVs attached to separate buses, adapters and power
supplies
One adapter for each PV

==============================================================================
System resource control mechanisms
==============================================================================
Resource    Control Mechanisms
--------    ------------------
CPU         Nice numbers
            Process priorities
            Batch queues
            Scheduler parameters
Memory      Paging (swap) space
            Process resource limits
            Memory management-related parameters
Disk I/O    File system organization across physical disks and controllers
            File placement on disk
            I/O-related parameters

==============================================================================
Process resource limits
============================================================================== User soft resource limit (resource currently applied by default): $ ulimit -a time(seconds) unlimited file(blocks) 2097151 data(kbytes) 131072 stack(kbytes) 32768 memory(kbytes) 32768 coredump(blocks) 2048 nofiles(descriptors) unlimited User hard resource limit (system-wide resource limit which can be further increased by superuser only): $ ulimit -aH time(seconds) unlimited file(blocks) 2097151 data(kbytes) unlimited stack(kbytes) unlimited memory(kbytes) unlimited coredump(blocks) unlimited nofiles(descriptors) unlimited ============================================================================== Configuring the scheduler ============================================================================== AIX uses a priority-based round-robin scheduling algorithm to distribute CPU resources among multiple competing processes. Once a process begins running, it will continue to execute until it needs to wait for an I/O operation to complete, receives an interrupt, or exhausts the maximum execution time slice defined on that system (10 ms is a common value). 
Process execution priorities change over time according to the formula:

    PRI = min + NI + (0.5 * recent)

where
    min (C) = minimum process priority level, normally 40
    NI      = process's nice number (0 to 39)
    recent  = 0-120, starting with 0, increased by 1 if currently in control
              of the CPU at the end of each 10 millisecond time slice

To increase the nice number of a process (lower priority) by 10:
    nice -10 CommandString

To decrease the nice number of a process (higher priority) by 10:
    nice --10 CommandString

To change a running process's nice number by 10 (the range is -20 to 20):
    renice -n 10 PID

To double the length of the time slice, setting it to 20 milliseconds:
    schedtune -t 1

==============================================================================
Processes that won't die
==============================================================================
1) Zombies
2) Processes waiting for unavailable NFS resources (e.g., trying to write to
   a remote file on a system that has crashed)
3) Processes waiting for a device to complete an operation before exiting
   (e.g., waiting for a tape to finish rewinding)

==============================================================================
Interpreting vmstat statistics
==============================================================================
"w" indicates the number of swapped out runnable processes, which should
be 0.

"pi" is the number of pages paged-in. Since a page-in in AIX always means
that a page was previously paged-out, and does not include process startup,
it is a better indicator than "po".

"po/fr" > 1/6 indicates that the system is thrashing.

==============================================================================
Thrashing
==============================================================================
The VMM decides that the system is thrashing when the fraction of page steals
(pages grabbed while they were still in use) that are actually paged-out to
disk exceeds one sixth by default.
When this happens, the VMM begins suspending processes until thrashing stops.
It chooses processes based on their own repage rate: when the fraction of a
process's page faults that are for previously paged-out pages rises above one
fourth by default, the process becomes a candidate for suspension. Suspended
processes are resumed once system conditions have improved and remained
stable for a certain period of time (by default, 1 second).

==============================================================================
Using schedtune to configure virtual memory manager (VMM)
==============================================================================
Option  Label  Meaning
------  -----  -------
-h      SYS    Memory is defined as overcommitted when page writes/total
               page steals > 1/-h_value.
-p      PROC   A process may be suspended during thrashing conditions when
               its repages/page faults > 1/-p_value. This parameter defines
               when an individual process is thrashing.
-m      MULTI  Minimum number of processes to remain running even when the
               system is thrashing.
-w      WAIT   Number of seconds to wait after thrashing ends (as defined by
               -h) before reactivating any suspended processes.
-c      GRACE  Number of seconds after reactivation before a process may be
               suspended again.

Note: All changes will last until the next system reboot.

==============================================================================
Using vmtune to configure virtual memory manager (VMM)
==============================================================================
        Output   Default
Option  Label    Value    Meaning
------  ------   -------  -------
-f      minfree  120      Minimum size of the free list, a set of memory
                          pages set aside for use by new pages required by
                          processes (used to satisfy page faults). When the
                          free list falls below this threshold, the VMM must
                          steal pages from running processes to replenish it.
-F      maxfree  128      Page stealing stops when the free list reaches or
                          exceeds this size.
-p      minperm  ~18      Threshold value which forces both computational and
                          file pages to be stolen (expressed as a percentage
                          of the system's total real memory).
-P      maxperm  ~75      Threshold value which forces only file pages to be
                          stolen (expressed as a percentage of the system's
                          total real memory).

AIX distinguishes between computational memory pages, which consist of
program working storage (non file-based data) and program text segments (the
executable's in-memory image). File pages are all other kinds of memory pages
(all of which are backed by disk files). By default, the VMM attempts to
slightly favor computational pages over file pages when selecting pages to
steal, according to the following scheme:

%File Pages              Repage Rates             Kinds of Pages stolen
-----------              ------------             ---------------------
< minperm                N/A                      Both types
minperm < % < maxperm    file rate < comp. rate   File pages only
                         file rate > comp. rate   Both types
> maxperm                N/A                      File pages only

Repage rates are the fraction of page faults which reference stolen or
replaced memory pages rather than new pages.

Note: All changes will last until the next system reboot.

==============================================================================
Using vmtune to configure virtual memory manager (VMM)
==============================================================================
        Output   Default
Option  Label    Value    Meaning
------  ------   -------  -------
-w      npswarn  512      When only this many pages of paging space are left,
                          a DANGER signal (33) is sent to all processes.
-k      npskill  128      If available paging space reaches this level,
                          processes will be killed to prevent a system crash.

Note: All changes will last until the next system reboot.

==============================================================================
Disk I/O performance issues
==============================================================================
Place disks on multiple disk controllers.
Limit the total maximum disk transfer rate to 75-80% of the top adapter
speed.
Distribute the anticipated disk I/O across controllers and disks as evenly as
possible.
Use a separate disk for the operating system.
Place heavily accessed files on local rather than network drives.

General considerations for physical placement of files on disk:
    File system fragmentation degrades I/O performance.
    Sequential access of large files is most efficient when the files are
    contiguous and if disk striping is used.
    Placing large randomly accessed files in the center portion of disk
    drives will yield the best performance.
    Mirrored logical volumes with Mirror Write Consistency (MWC) set to ON
    should be at the edge because that is where the system writes MWC data.
    If mirroring is not in effect, MWC does not apply and does not affect
    performance.

==============================================================================
Sequential read-ahead parameters
==============================================================================
When AIX decides that a process is accessing data files in a sequential
manner, it attempts to aid the process by performing read-ahead operations.
By default, it begins by retrieving two pages instead of one. As long as
sequential access of the file continues, it doubles the number of pages read
with each operation before leveling at a maximum of eight pages.

The default threshold values of 2 and 8 pages can be altered with the
following vmtune options:

        Output      Default
Option  Label       Value    Meaning
------  ------      -------  -------
-r      minpgahead  2        Starting number of pages for sequential
                             read-aheads.
-R      maxpgahead  8        Maximum number of pages to read ahead.

Note: All changes will last until the next system reboot.
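
The doubling behaviour described above can be illustrated with a little shell
arithmetic. This is a simplified model of the text, not of the actual VMM
code; the function name readahead_sequence is hypothetical. It starts at
minpgahead, doubles on each sequential read, and levels off at maxpgahead.

```shell
#!/bin/sh
# readahead_sequence MINPGAHEAD MAXPGAHEAD READS   (hypothetical helper)
# Prints the number of pages read ahead on each of READS consecutive
# sequential reads: start at MINPGAHEAD, double, cap at MAXPGAHEAD.
readahead_sequence() {
    pages=$1; max=$2; reads=$3
    seq_out=""
    i=0
    while [ "$i" -lt "$reads" ]; do
        seq_out="$seq_out${seq_out:+ }$pages"
        pages=$((pages * 2))
        if [ "$pages" -gt "$max" ]; then
            pages=$max
        fi
        i=$((i + 1))
    done
    echo "$seq_out"
}

readahead_sequence 2 8 5    # prints: 2 4 8 8 8
```

Trying other values, such as "readahead_sequence 2 32 6", shows how a larger
maxpgahead lets long sequential reads fetch more pages per operation.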
============================================================================== Disk I/O pacing parameters ============================================================================== Write requests are serviced by the operating system in the order in which they are made (queued). A very large I/O operation can accumulate many pending I/O requests, and users needing disk access can be forced to wait for it to complete. Disk I/O pacing is designed to prevent this from happening: chdev -l sys0 -a maxpout=33 -a minpout=16 maxpout should be one more than a multiple of 4; minpout should be a multiple of 4 and at least 4 less than maxpout. If a process tries to write to a file for which there are already maxpout or more pending write operations, then the process is suspended until the number of pending requests falls below minpout. ============================================================================== How are process ids allocated? ============================================================================== Process ids are 32 bit numbers that are a combination of a process slot and a generation count. Each process has a process slot that is identified by the bits 7-23 (0-6 are not used) of the process id. The generation count uses bits 24-31 and is incremented for each new process. As each process slot would occupy pinned memory if used, AIX tries to minimize the number of process slots used by reusing them as much as possible. Therefore process ids reappear much more quickly on AIX and this is where the generation count comes in: it makes sure that process ids do not reappear too quickly. 
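
The slot and generation fields can be pulled out of a process id with shifts
and masks. The sketch below rests on an assumption that the text does not
state: that bits are numbered IBM-style, with bit 0 the most significant and
bit 31 the least significant, so the generation count is the low 8 bits and
the slot is the 17 bits above it. The function names are hypothetical.

```shell
#!/bin/sh
# ASSUMPTION: bit 0 is the most significant bit of the 32-bit process id,
# so bits 24-31 are the low 8 bits and bits 7-23 are the next 17 bits.

# pid_slot PID: process slot number (bits 7-23, 17 bits wide).
pid_slot() {
    echo $(( ($1 >> 8) & 131071 ))
}

# pid_generation PID: generation count (bits 24-31, 8 bits wide).
pid_generation() {
    echo $(( $1 & 255 ))
}
```

Under that assumption, a process id of 4130 (16 * 256 + 34) would decode to
slot 16 with generation count 34.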
==============================================================================
Shared libraries
==============================================================================
To identify which shared libraries a given executable uses and its library
search path:
    dump -H executable_name

"genkld" lists the currently loaded shared library objects, "genld" generates
a listing of the shared objects loaded by different programs, and "genkex"
displays all the loaded kernel extensions.

All the system libraries are shared and stay cached in memory. To clean up
this cache use the "slibclean" command.

==============================================================================
The life cycle of a process
==============================================================================
The ultimate ancestor for every process is the process with PID 1, init,
created during the boot process. To create a new process, init makes an exact
copy of itself (forking). The child process has the same environment as its
parent process, although it is assigned a different process ID. Then, the
image in the child process's address space is overwritten by the one it will
run via the exec system call.

Among the fork-and-exec processes created by init are usually one or more
executing the getty program. Each getty program is assigned to a different
terminal and displays the login prompt. When someone logs in, getty exec's
the login program, which validates user logins among other activities. Once
the username and password are verified, login exec's the user's shell.

When a user logs out, the login shell sends a signal to init. Init will fork
again and start the getty, and the whole cycle will repeat itself again and
again as different users use that terminal.
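
The fork-then-exec pattern can be observed from any shell. In this small
sketch (the function name exec_keeps_pid is hypothetical), the calling shell
forks a child to run sh, that child prints its process ID, and then it exec's
a second sh, which prints its own process ID. Because exec overwrites the
image inside the same process rather than creating a new one, both lines
show the same number, and it differs from the PID of the calling shell.

```shell
#!/bin/sh
# exec_keeps_pid   (hypothetical helper)
# Prints two lines: the PID of a forked child shell, then the PID seen
# by the program that child exec's. exec replaces the process image
# without creating a new process, so the two lines match.
exec_keeps_pid() {
    sh -c 'echo $$; exec sh -c "echo \$\$"'
}

exec_keeps_pid
```

This mirrors how getty, login and the user's shell succeed one another on a
terminal: each exec's the next program instead of forking a new process.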
==============================================================================
Using traceon/traceoff
==============================================================================
To turn on tracing of a subsystem group:
    traceon [-h host] -g subsystem_group

To turn on tracing of a subsystem:
    traceon [-h host] -s subsystem

To turn off tracing of a subsystem group:
    traceoff [-h host] -g subsystem_group

To turn off tracing of a subsystem:
    traceoff [-h host] -s subsystem

==============================================================================
Performance cost of file system compression
==============================================================================
Increased time to compress and decompress data.
Additional cost of 4096-byte allocation and reallocation for update in place.
50 CPU cycles per byte for compression, and 10 CPU cycles per byte for
decompression.
Same performance cost as for file system fragmentation.

==============================================================================
Performance cost of file system fragmentation
==============================================================================
Increased allocation activity.
Free space fragmentation.
Increased fragment allocation map size.

Note: Only partial logical blocks of files or directories smaller than 32KB
can be allocated in fragments of less than 4096 bytes.