Dear All,
It's been a long time since I posted here, as I haven't worked with Informix for a while. However, I'm the 'last man standing' and there's a problem at a customer site.
The system has been installed and working for many years, but now the primary drive of their mirrored pair is beginning to show soft errors and could do with replacing. Regrettably, I've forgotten how to do this, and I don't have the luxury of a development system to practise on. Nightmare scenario...
This is how the server is configured on Solaris 2.6:
server# onstat -d
Informix Dynamic Server Version 7.31.UC6 -- On-Line -- Up 20 days 19:43:44 -- 189696 Kbytes
Dbspaces
address  number   flags    fchunk   nchunks  flags    owner    name
1104a150 1        2        1        2        M        informix rootdbs
 1 active, 2047 maximum

Chunks
address  chk/dbs offset   size     free     bpages   flags pathname
1104a210 1   1   0        1026000  457502            PO-   /INFORMIX_DBSPACE_CHUNK0
1104a2f0 1   1   0        1026000  0                 MO-   /.INFORMIX_INSTANCE_0.m0
11073798 2   1   0        852120   105896            PO-   /INFORMIX_DBSPACE_CHUNK1
11073878 2   1   0        852120   0                 MO-   /.INFORMIX_INSTANCE_0.m1
 2 active, 2047 maximum
server# ls -ld /INFORMIX_DBSPACE_CHUNK*
lrwxrwxrwx 1 root other 18 May 29 2010 /INFORMIX_DBSPACE_CHUNK0 -> /dev/rdsk/c0t2d0s3
lrwxrwxrwx 1 root other 18 May 29 2010 /INFORMIX_DBSPACE_CHUNK1 -> /dev/rdsk/c0t2d0s4
server# ls -ld /.INFORMIX_INSTANCE_0.m*
lrwxrwxrwx 1 root other 18 May 29 2010 /.INFORMIX_INSTANCE_0.m0 -> /dev/rdsk/c0t1d0s3
lrwxrwxrwx 1 root other 18 May 29 2010 /.INFORMIX_INSTANCE_0.m1 -> /dev/rdsk/c0t1d0s4
server# prtvtoc /dev/rdsk/c0t1d0s2
* /dev/rdsk/c0t1d0s2 partition map
*
* Dimensions:
* 512 bytes/sector
* 135 sectors/track
* 16 tracks/cylinder
* 2160 sectors/cylinder
* 3882 cylinders
* 3880 accessible cylinders
*
* Flags:
* 1: unmountable
* 10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      0    00          0      4320      4319
       1      3    01       4320    864000    868319
       2      5    01          0   8380800   8380799
       3      0    00     868320   4104000   4972319
       4      0    00    4972320   3408480   8380799
server# prtvtoc /dev/rdsk/c0t2d0s2
* /dev/rdsk/c0t2d0s2 partition map
*
* Dimensions:
* 512 bytes/sector
* 135 sectors/track
* 16 tracks/cylinder
* 2160 sectors/cylinder
* 3882 cylinders
* 3880 accessible cylinders
*
* Flags:
* 1: unmountable
* 10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      0    00          0      4320      4319
       1      3    01       4320    864000    868319
       2      5    01          0   8380800   8380799
       3      0    00     868320   4104000   4972319
       4      0    00    4972320   3408480   8380799
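As a sanity check on the numbers above, here is a small sketch that compares the slice sizes from prtvtoc with the chunk sizes from onstat -d. It assumes a 2 KB Informix page size, which I believe is right for this platform but which should be confirmed from the last line of 'onstat -b'. The function names are just ones made up for this illustration.

```shell
#!/bin/sh
# Assumption: 2 KB Informix page size (confirm with 'onstat -b').
PAGE_KB=2
# Sector size taken from the prtvtoc "Dimensions" header.
SECTOR_BYTES=512

# slice_kb SECTORS -> slice size in Kbytes
slice_kb() { echo $(( $1 * SECTOR_BYTES / 1024 )); }

# chunk_kb PAGES -> chunk size in Kbytes
chunk_kb() { echo $(( $1 * PAGE_KB )); }

# check NAME SECTORS PAGES: report whether the chunk fits in the slice
check() {
    s=$(slice_kb "$2")
    c=$(chunk_kb "$3")
    if [ "$c" -le "$s" ]; then verdict=fits; else verdict="TOO SMALL"; fi
    echo "$1: slice ${s} KB, chunk ${c} KB, $verdict"
}

# Sector counts from prtvtoc, page counts from the onstat -d chunk list.
check s3 4104000 1026000
check s4 3408480 852120
```

Under that assumption both slices come out exactly the size of their chunks (2052000 KB and 1704240 KB), and the first figure agrees with ROOTSIZE in the onconfig, so a replacement disk would need slices 3 and 4 at least as large as these.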
server# iostat -En
<..snip..>
c0t1d0 Soft Errors: 230 Hard Errors: 0 Transport Errors: 0
Vendor: IBM Product: DDRS34560SUN4.2G Revision: S98E Serial No: 9848310333
RPM: 5400 Heads: 16 Size: 4.29GB <4292075520 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 230
Illegal Request: 0 Predictive Failure Analysis: 0
c0t2d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: IBM Product: DDRS34560SUN4.2G Revision: S98E Serial No: 99285G2252
RPM: 5400 Heads: 16 Size: 4.29GB <4292075520 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
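To tie the iostat output back to the chunk links, here is a rough sketch that lists which links resolve to a given disk. The link targets are copied by hand from the ls -ld listings above rather than read from the live system, and links_on_disk is only a name made up for this illustration. It assumes a POSIX awk (on Solaris 2.6 that probably means nawk rather than the default awk).

```shell
#!/bin/sh
# Link name and target, copied from the 'ls -ld' listings in this post.
links='/INFORMIX_DBSPACE_CHUNK0 /dev/rdsk/c0t2d0s3
/INFORMIX_DBSPACE_CHUNK1 /dev/rdsk/c0t2d0s4
/.INFORMIX_INSTANCE_0.m0 /dev/rdsk/c0t1d0s3
/.INFORMIX_INSTANCE_0.m1 /dev/rdsk/c0t1d0s4'

# links_on_disk DISK: print every link whose target slice is on DISK.
links_on_disk() {
    printf '%s\n' "$links" | awk -v d="$1" '
        {
            n = split($2, parts, "/")     # last path component, such as c0t1d0s3
            disk = parts[n]
            sub(/s[0-9]+$/, "", disk)     # drop the slice suffix, leaving c0t1d0
            if (disk == d) print $1 " -> " $2
        }'
}

# c0t1d0 is the disk that iostat -En reports with 230 soft errors.
links_on_disk c0t1d0
```

Run against the listings as posted, this prints the links that resolve to c0t1d0, which is worth comparing with the P/M flags in the onstat -d chunk list before deciding which side of the pair to take down.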
server# cat onconfig.INSTANCE_0
#**************************************************************************
#
# INFORMIX SOFTWARE, INC.
#
# Title: onconfig.ows
# Description: INFORMIX-OnLine Configuration Parameters
#
#**************************************************************************
# Root Dbspace Configuration
ROOTNAME rootdbs # Root dbspace name
ROOTPATH /INFORMIX_DBSPACE_CHUNK0
ROOTOFFSET 0 # Offset of root dbspace into device (Kbytes)
ROOTSIZE 2052000 # Size of root dbspace (Kbytes)
# Disk Mirroring Configuration Parameters
MIRROR 1 # Mirroring flag (Yes = 1, No = 0)
MIRRORPATH /.INFORMIX_INSTANCE_0.m0
MIRROROFFSET 0 # Offset into mirrored device (Kbytes)
# Physical Log Configuration
PHYSDBS rootdbs # Location (dbspace) of physical log
PHYSFILE 10000 # Physical log file size (Kbytes)
# Logical Log Configuration
LOGFILES 20 # Number of logical log files
LOGSIZE 2000 # Logical log size (Kbytes)
# Diagnostics
MSGPATH /usr/informix/etc/online.log # System message log file path
CONSOLE /dev/console # System console message path
ALARMPROGRAM /tanit/tcl/log_warn/log_full.sh # Alarm program path
# System Archive Tape Device
TAPEDEV /dev/rmt/0
TAPEBLK 128 # Tape block size (Kbytes)
TAPESIZE 4000000 # Maximum amount of data to put on tape (Kbytes)
# Log Archive Tape Device
LTAPEDEV /dev/rmt/1
LTAPEBLK 128 # Log tape block size (Kbytes)
LTAPESIZE 4000000 # Max amount of data to put on log tape (Kbytes)
# Optical
STAGEBLOB # INFORMIX-OnLine/Optical staging area
# System Configuration
SERVERNUM 0 # Unique id corresponding to a OnLine instance
DBSERVERNAME INSTANCE_0 # Name of default database server
DBSERVERALIASES INSTANCE_0_SHM,INSTANCE_0_TCP # List of alternate dbservernames
NETTYPE ipcstr,1,200,NET
NETTYPE ipcshm,1,200,CPU
NETTYPE tlitcp,1,100,NET
# increase tcp connections from 5 to 50. AL 1-mar-00
# read tasks failed to start correctly, error -25553 in database selection
# AL 23-nov-01
# increase again: str 100 -> 200
# shm 100 -> 200
# tcp 50 -> 100
# as a result of error -25560 from read tasks
DEADLOCK_TIMEOUT 60 # Max time to wait of lock in distributed env.
# AL 15-oct-00 change RESIDENT from 0 to 1
RESIDENT 1 # Forced residency flag (Yes = 1, No = 0)
MULTIPROCESSOR 0 # 0 for single-processor, 1 for multi-processor
NUMCPUVPS 1 # Number of user (cpu) vps
SINGLE_CPU_VP 1 # If non-zero, limit number of cpu vps to one
# AL 30-aug-2000
# disable process ageing
NOAGE 1 # Process aging
AFF_SPROC 0 # Affinity start processor
AFF_NPROCS 0 # Affinity number of processors
# Shared Memory Parameters
LOCKS 2000 # Maximum number of locks
#BUFFERS 1000 # Maximum number of shared buffers
# buffers increased from 1000 to 10000. al 22/12/98
# buffers increased to 20000. al 28/07/00
# buffers increased to 30000. al 17/4/02
# buffers increased to 40000. al 23/4/02
# buffers increased to 50000. al 16/10/02
BUFFERS 50000 # Maximum number of shared buffers
NUMAIOVPS # Number of IO vps
PHYSBUFF 32 # Physical log buffer size (Kbytes)
LOGBUFF 32 # Logical log buffer size (Kbytes)
LOGSMAX 20 # Maximum number of logical log files
#CLEANERS 1 # Number of buffer cleaner processes
# cleaners incread in line with LRUS. AL 16/10/02
CLEANERS 63 # Number of buffer cleaner processes
SHMBASE 0xa000000 # Shared memory base address
# SHMVIRTSIZE increased from 8000 to 37000, 5 jun 2000 by AL
# from 37000 to 75000, 16 Jun 2000 by AL
SHMVIRTSIZE 75000 # initial virtual shared memory segment size
SHMADD 8192 # Size of new shared memory segments (Kbytes)
SHMTOTAL 0 # Total shared memory (Kbytes). 0=>unlimited
CKPTINTVL 300 # Check point interval (in sec)
#LRUS 8 # Number of LRU queues
# increase LRUs from 8 to 32. al 22/12/98
# increase LRUs to 64. al 28/07/00
# decrease LRUs to 63. suspicion that even numbers are bad... al 19 apr 02
LRUS 63 # Number of LRU queues
#LRU_MAX_DIRTY 60 # LRU percent dirty begin cleaning limit
#LRU_MIN_DIRTY 50 # LRU percent dirty end cleaning limit
# reduce max/min to 20/10. al 16/10/02
LRU_MAX_DIRTY 20 # LRU percent dirty begin cleaning limit
LRU_MIN_DIRTY 10 # LRU percent dirty end cleaning limit
LTXHWM 50 # Long transaction high water mark percentage
LTXEHWM 60 # Long transaction high water mark (exclusive)
TXTIMEOUT 0x12c # Transaction timeout (in sec)
STACKSIZE 32 # Stack size (Kbytes)
# System Page Size
# BUFFSIZE - OnLine no longer supports this configuration parameter.
# To determine the page size used by OnLine on your platform
# see the last line of output from the command, 'onstat -b'.
# Recovery Variables
# OFF_RECVRY_THREADS:
# Number of parallel worker threads during fast recovery or an offline restore.
# ON_RECVRY_THREADS:
# Number of parallel worker threads during an online restore.
OFF_RECVRY_THREADS 10 # Default number of offline worker threads
ON_RECVRY_THREADS 1 # Default number of online worker threads
# Data Replication Variables
# DRAUTO: 0 manual, 1 retain type, 2 reverse type
DRAUTO 0 # DR automatic switchover
DRINTERVAL 30 # DR max time between DR buffer flushes (in sec)
DRTIMEOUT 30 # DR network timeout (in sec)
DRLOSTFOUND /usr/informix/etc/dr.lostfound # DR lost+found file path
# Read Ahead Variables
RA_PAGES 10 # Number of pages to attempt to read ahead
#RA_THRESHOLD 4 # Number of pages left before next group
# AL 22 may 02 increase threshold from 4 to 8 to try to get bufwaits down
# read ahead util is around 100%, so just try to get the
# pages a bit sooner
RA_THRESHOLD 8 # Number of pages left before next group
# DBSPACETEMP:
# OnLine equivalent of DBTEMP for SE. This is the list of dbspaces
# that the OnLine SQL Engine will use to create temp tables etc.
# If specified it must be a colon separated list of dbspaces that exist
# when the OnLine system is brought online. If not specified, or if
# all dbspaces specified are invalid, various ad hoc queries will create
# temporary files in /tmp instead.
DBSPACETEMP # Default temp dbspaces
# DUMP*:
# The following parameters control the type of diagnostics information which
# is preserved when an unanticipated error condition (assertion failure) occurs
# during OnLine operations.
# For DUMPSHMEM, DUMPGCORE and DUMPCORE 1 means Yes, 0 means No.
DUMPDIR /tmp # Preserve diagnostics in this directory
DUMPSHMEM 1 # Dump a copy of shared memory
DUMPGCORE 0 # Dump a core image using 'gcore'
DUMPCORE 0 # Dump a core image (Warning:this aborts OnLine)
DUMPCNT 1 # Number of shared memory or gcore dumps for
# a single user's session
FILLFACTOR 90 # Fill factor for building indexes
# method for OnLine to use when determining current time
USEOSTIME 0 # 0: use internal time(fast), 1: get time from OS(slow)
DS_MAX_QUERIES # Maximum number of decision support queries
DS_TOTAL_MEMORY # Decision support memory (Kbytes)
DS_MAX_SCANS 1048576 # Maximum number of decision support scans
DATASKIP off # List of dbspaces to skip
# OPTCOMPIND
# 0 => Nested loop joins will be preferred (where
# possible) over sortmerge joins and hash joins.
# 1 => If the transaction isolation mode is not
# "repeatable read", optimizer behaves as in (2)
# below. Otherwise it behaves as in (0) above.
# 2 => Use costs regardless of the transaction isolation
# mode. Nested loop joins are not necessarily
# preferred. Optimizer bases its decision purely
# on costs.
OPTCOMPIND 2 # To hint the optimizer
ONDBSPACEDOWN 2 # Dbspace down option: 0 = CONTINUE, 1 = ABORT, 2 = WAIT
# AL 15-oct-00 change LBU_PRESERVE from 0 to 1
LBU_PRESERVE 1 # Preserve last log for log backup
OPCACHEMAX 0 # Maximum optical cache size (Kbytes)
# HETERO_COMMIT (Gateway participation in distributed transactions)
# 1 => Heterogeneous Commit is enabled
# 0 (or any other value) => Heterogeneous Commit is disabled
HETERO_COMMIT 0
#
# al 22/12/98
DD_HASHMAX 50
DD_HASHSIZE 53
#
# al 23-feb-2000
# added from onconfig.std on upgrade to v7.31.uc4
# Optimization goal: -1 = ALL_ROWS(Default), 0 = FIRST_ROWS
OPT_GOAL -1
# Optimizer DIRECTIVES ON (1/Default) or OFF (0)
DIRECTIVES 1
# Status of restartable restore
RESTARTABLE_RESTORE off
SYSALARMPROGRAM /usr/informix/etc/evidence.sh # System Alarm program path
TBLSPACE_STATS 1
CDR_LOGBUFFERS 2048 # size of log reading buffer pool (Kbytes)
CDR_EVALTHREADS 1,2 # evaluator threads (per-cpu-vp,additional)
CDR_DSLOCKWAIT 5 # DS lockwait timeout (seconds)
CDR_QUEUEMEM 4096 # Maximum amount of memory for any CDR queue (Kbytes)
CDR_LOGDELTA 30 # % of log space allowed in queue memory
CDR_NUMCONNECT 16 # Expected connections per server
CDR_NIFRETRY 300 # Connection retry (seconds)
CDR_NIFCOMPRESS 0 # Link level compression (-1 never, 0 none, 9 max)
BAR_ACT_LOG /tmp/bar_act.log
BAR_MAX_BACKUP 4
BAR_RETRY 1
BAR_NB_XPORT_COUNT 10
BAR_XFER_BUF_SIZE 31
ISM_DATA_POOL ISMData # If the data pool name is changed, be sure to
ISM_LOG_POOL ISMLogs
MAX_PDQPRIORITY 100 # Maximum allowed pdqpriority
There is absolutely no prospect of upgrading or anything like that, and in any case the system will hopefully be pensioned off soon.
What I'd greatly appreciate is the most reliable way to replace that failing drive, given that it is the primary of the mirrored pair. The server can be taken down to replace the drive. If anyone can give me the process and the commands, I'll be very grateful.
Many, many thanks,
Andy.