I would take a level 0 archive of each primary and reestablish replication
from scratch. It would also be a good idea to run oncheck on the primaries
first, to make sure that the inconsistencies did not originate there.
The oncheck commands that are complaining about ownership should be run as
user informix.
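
As a rough sketch of that sequence for the first server, assuming ontape is
the backup tool (the TAPESIZE warning in the log suggests it is) and that the
level 0 archive is carried to the secondary by hand: the script below only
prints the commands so they can be reviewed before anything is run. The
primary name ifmx00174_repl and the index name come from the log; the
secondary name is a made-up placeholder, and the function name is only
illustrative.

```shell
#!/bin/sh
# Dry-run sketch: prints the planned commands, executes nothing.
# Assumption: ontape is used for the archive and the physical restore.

PRIMARY=ifmx00174_repl    # primary's server name as reported in the secondary's log
SECONDARY=secondary_repl  # hypothetical placeholder for the secondary's server name

# Print the rebuild plan for one primary/secondary pair.
hdr_rebuild_plan() {
    primary=$1
    secondary=$2
    printf '%s\n' "# as user informix, on the primary:"
    # Verify the index flagged in the log before trusting the primary's copy.
    printf '%s\n' "oncheck -cI 'campaign:\"dbaparts\".camptss_ses'"
    printf '%s\n' "ontape -s -L 0"
    printf '%s\n' "onmode -d primary $secondary"
    printf '%s\n' "# as user informix, on the secondary:"
    # Physical restore of the level 0 archive, then rejoin the pair.
    printf '%s\n' "ontape -p"
    printf '%s\n' "onmode -d secondary $primary"
}

hdr_rebuild_plan "$PRIMARY" "$SECONDARY"
```

If the primary is already in primary mode, the onmode -d primary step may not
be needed; check onstat -g dri on both sides before and after.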
Art
Art S. Kagel
Advanced DataTools (www.advancedatatools.com)
Blog: http://informix-myview.blogspot.com/
Disclaimer: Please keep in mind that my own opinions are my own opinions
and do not reflect on my employer, Advanced DataTools, the IIUG, nor any
other organization with which I am associated either explicitly,
implicitly, or by inference. Neither do those opinions reflect those of
other individuals affiliated with any entity with which I am affiliated nor
those of the entities themselves.
On Fri, Jan 25, 2013 at 8:30 PM, Schleicher, Keith <
Keith.Schleicher@searshc.com> wrote:
> I have two separate servers with two similar but separate issues. Both
> involve secondary servers in a DR setup. Both are due to a power outage
> that caused these servers to power down without Informix being shut down.
>
> For the first server, here is the error log:
>
> 16:19:21 Event alarms enabled. ALARMPROG =
> '/ifmx/ifmx00175/etc/log_full.sh'
> 16:19:22 Booting Language <c> from module <>
> 16:19:22 Loading Module <CNULL>
> 16:19:22 Booting Language <builtin> from module <>
> 16:19:22 Loading Module <BUILTINNULL>
> 16:19:22 Could not disable priority aging: errno = 1
> 16:19:22 Could not disable priority aging: errno = 1
> 16:19:22 Could not disable priority aging: errno = 1
> 16:19:22 Could not disable priority aging: errno = 1
> 16:19:22 Could not disable priority aging: errno = 1
> 16:19:22 Could not disable priority aging: errno = 1
> 16:19:28 DR: DRAUTO is 0 (Off)
> 16:19:28 Requested shared memory segment size rounded from 544KB to 1024KB
> 16:19:29 IBM Informix Dynamic Server Version 10.00.UC6 Software Serial
> Number
> AAA#B000000
> 16:19:29 WARNING: Next backup of DBspace llogdbs must be level-0 backup.
> 16:19:29 WARNING: Next backup of DBspace plogdbs must be level-0 backup.
> 16:19:29 IBM Informix Dynamic Server Initialized -- Shared Memory
> Initialized.
>
> 16:19:29 Physical Recovery Started at Page (3:29800).
> 16:19:29 Physical Recovery Complete: 18 Pages Examined, 18 Pages Restored.
> 16:19:29 DR: Trying to connect to primary server = ifmx00174_repl
> 16:19:30 Dataskip is now OFF for all dbspaces
> 16:19:30 Restartable Restore has been ENABLED
> 16:19:30 Recovery Mode
> 16:19:33 DR: Secondary server connected
> 16:19:34 DR: Configuration parameter values of the paired servers do not
> match:
> Parameter: TAPESIZE
> Current server's value: 20480000
> Paired server's value: 30720000
> WARNING: The parameter values must match if ontape is used as
> backup/recovery tool.
> Please adjust this parameter value in onconfig file to correctly
> reflect the setting of the primary server.
> 16:19:34 DR: Secondary server needs failure recovery
>
> 16:19:34 DR: Failure recovery from disk in progress ...
> 16:19:34 Logical Recovery Started.
> 16:19:34 10 recovery worker threads will be started.
> 16:19:34 Start Logical Recovery - Start Log 54282, End Log ?
> 16:19:34 Starting Log Position - 54282 0x10018
> 16:19:34 Assert Failed: Page Check Error in page_reorg: slots overlap
> 16:19:34 IBM Informix Dynamic Server Version 10.00.UC6
> 16:19:34 Who: Session(26, informix@t5240-03-z02, 0, 3fc324e0)
>
> Thread(129, xchg_1.8, 3fc1d548, 3)
>
> File: rsdebug.c Line: 1081
> 16:19:34 Results: Possible inconsistencies in an index of
> 'campaign:"dbaparts".camptss_ses'
> 16:19:34 Action: Run 'oncheck -cI campaign:"dbaparts".camptss_ses'
> 16:19:34 stack trace for pid 5543 written to
> /dbengines/informix/tmp/af.46904f6
> 16:19:34 See Also: /dbengines/informix/tmp/af.46904f6, shmem.46904f6.0
> 16:19:34 Assert Warning: Error during recovery left index inconsistent.
> 16:19:34 IBM Informix Dynamic Server Version 10.00.UC6
> 16:19:34 Who: Session(26, informix@t5240-03-z02, 0, 3fc33520)
>
> Thread(130, xchg_1.9, 3fc1da74, 5)
>
> File: rskey.c Line: 1516
> 16:19:34 Results: Index 'campaign:"dbaparts".camptss_ses# 120_163' is now
> unusable
> 16:19:34 Action: Run 'oncheck -cI campaign:"dbaparts".camptss_ses# 120_163'
> 16:19:34 stack trace for pid 5559 written to
> /dbengines/informix/tmp/af.46a04f6
> 16:19:34 See Also: /dbengines/informix/tmp/af.46a04f6
> 16:19:35 Rollforward of log record failed. iserrno = 126
> 16:19:35 Log Record: log = 54282, pos = 0x1906c, type =
> OLDRSAM:HINSERT(40),
> trans = 91
> 16:19:35 Error during recovery left index inconsistent.
> 16:19:35 Rollforward of log record failed. iserrno = 105
> 16:19:35 Log Record: log = 54282, pos = 0x1917c, type =
> OLDRSAM:ADDITEM(28),
> trans = 91
> 16:19:35 Rollforward of log record failed. iserrno = 105
> 16:19:35 Log Record: log = 54282, pos = 0x1917c, type =
> OLDRSAM:ADDITEM(28),
> trans = 91
> 16:19:44 Assert Warning: Chunk 8 is being taken OFFLINE.
> 16:19:44 IBM Informix Dynamic Server Version 10.00.UC6
> 16:19:44 Who: Session(26, informix@t5240-03-z02, 0, 3fc33520)
>
> Thread(126, xchg_1.5, 3fc1c5c4, 1)
>
> File: rsmirror.c Line: 1846
> 16:19:44 Results: Dynamic Server will block at next checkpoint
> 16:19:44 Action: Shutdown (onmode -k) or override (onmode -O)
> 16:19:44 stack trace for pid 5462 written to
> /dbengines/informix/tmp/af.46604f6
> 16:19:44 See Also: /dbengines/informix/tmp/af.46604f6
> 16:19:45 Chunk 8 is being taken OFFLINE.
> 16:19:45 Rollforward of log record failed. iserrno = 126
> 16:19:45 Log Record: log = 54282, pos = 0x1906c, type =
> OLDRSAM:HINSERT(40),
> trans = 91
> 16:19:45 Rollforward of log record failed. iserrno = 101
> 16:19:45 Log Record: log = 54282, pos = 0x1b044, type = OLDRSAM:UNIQID(17),
> trans = 93
> 16:19:45 Rollforward of log record failed. iserrno = 101
> 16:19:45 Log Record: log = 54282, pos = 0x1b044, type = OLDRSAM:UNIQID(17),
> trans = 93
> 16:21:38 Page Check Error in page_reorg: slots overlap
> 16:21:38 Rollforward of log record failed. iserrno = 105
> 16:21:38 Log Record: log = 54282, pos = 0x1b140, type =
> OLDRSAM:ADDITEM(28),
> trans = 93
> 16:21:38 Rollforward of log record failed. iserrno = 105
> 16:21:38 Log Record: log = 54282, pos = 0x1b140, type =
> OLDRSAM:ADDITEM(28),
> trans = 93
> 16:21:38 Checkpoint blocked by down space, waiting for override or shutdown
>
> I tried to run 'oncheck -cI campaign:"dbaparts".camptss_ses', but it tells
> me that I can't run it because I don't own that table. The chunk that is
> down includes all of the indexes for this instance. Is it possible to drop
> the chunk, recreate the chunk, and then recreate the indexes, or will we
> need to restore the data from a backup?
>
> For the second server, here is the error log:
>
> 17:13:34 Event alarms enabled. ALARMPROG = '/ifmx/ifmx00804/etc/event.pl'
> 17:13:34 Booting Language <c> from module <>
> 17:13:34 Loading Module <CNULL>
> 17:13:34 Booting Language <builtin> from module <>
> 17:13:34 Loading Module <BUILTINNULL>
> 17:13:34 Could not disable priority aging: errno = 1
> 17:13:34 Could not disable priority aging: errno = 1
> 17:13:35 Could not disable priority aging: errno = 1
> 17:13:35 Could not disable priority aging: errno = 1
> 17:13:35 Could not disable priority aging: errno = 1
> 17:13:35 Could not disable priority aging: errno = 1
> 17:13:35 Could not disable priority aging: errno = 1
> 17:13:35 Could not disable priority aging: errno = 1
> 17:13:35 Could not disable priority aging: errno = 1
> 17:13:35 Could not disable priority aging: errno = 1
> 17:13:35 Could not disable priority aging: errno = 1
> 17:13:41 DR: DRAUTO is 0 (Off)
> 17:13:41 Requested shared memory segment size rounded from 544KB to 1024KB
> 17:13:43 IBM Informix Dynamic Server Version 10.00.UC6 Software Serial
> Number
> AAA#B000000
> 17:13:44 IBM Informix Dynamic Server Initialized -- Shared Memory
> Initialized.
>
> 17:13:44 Physical Recovery Started at Page (3:206904).
> 17:13:44 Physical Recovery Complete: 36 Pages Examined, 36 Pages Restored.
> 17:13:44 DR: Trying to connect to primary server = ifmxrepla
> 17:13:44 Dataskip is now OFF for all dbspaces
> 17:13:44 Restartable Restore has been ENABLED
> 17:13:44 Recovery Mode
> 17:13:49 DR: Secondary server connected
> 17:13:50 DR: Secondary server needs failure recovery
>
> 17:13:50 DR: Failure recovery from disk in progress ...
> 17:13:51 Logical Recovery Started.
> 17:13:51 10 recovery worker threads will be started.
> 17:13:51 Start Logical Recovery - Start Log 128356, End Log ?
> 17:13:51 Starting Log Position - 128356 0x1fb018
> 17:13:52 Rollforward of log record failed. iserrno = 126
> 17:13:52 Log Record: log = 128356, pos = 0x215044, type =
> OLDRSAM:HINSERT(40),
> trans = 843
> 17:14:01 Assert Warning: Chunk 5 is being taken OFFLINE.
> 17:14:01 IBM Informix Dynamic Server Version 10.00.UC6
> 17:14:01 Who: Session(44, informix@t5240-02-z00, 0, 3fc4cb38)
>
> Thread(142, xchg_1.4, 3fc34b6c, 8)
>
> File: rsmirror.c Line: 1846
> 17:14:01 Results: Dynamic Server will block at next checkpoint
> 17:14:01 Action: Shutdown (onmode -k) or override (onmode -O)
> 17:14:01 stack trace for pid 9979 written to
> /dbengines/informix/tmp/af.47611b0
> 17:14:01 See Also: /dbengines/informix/tmp/af.47611b0
> 17:14:02 Chunk 5 is being taken OFFLINE.
> 17:14:03 Rollforward of log record failed. iserrno = 126
> 17:14:03 Log Record: log = 128356, pos = 0x215044, type =
> OLDRSAM:HINSERT(40),
> trans = 843
> 17:14:03 Rollforward of log record failed. iserrno = 101
> 17:14:03 Log Record: log = 128356, pos = 0x226098, type =
> OLDRSAM:ADDITEM(28),
> trans = 645
> 17:14:03 Rollforward of log record failed. iserrno = 101
> 17:14:03 Log Record: log = 128356, pos = 0x226098, type =
> OLDRSAM:ADDITEM(28),
> trans = 645
> 17:14:03 Rollforward of log record failed. iserrno = 101
> 17:14:03 Log Record: log = 128356, pos = 0x22b098, type =
> OLDRSAM:ADDITEM(28),
> trans = 152
> 17:14:03 Rollforward of log record failed. iserrno = 101
> 17:14:03 Log Record: log = 128356, pos = 0x22b098, type =
> OLDRSAM:ADDITEM(28),
> trans = 152
> 17:14:03 Rollforward of log record failed. iserrno = 101
> 17:14:03 Log Record: log = 128356, pos = 0x22f098, type =
> OLDRSAM:ADDITEM(28),
> trans = 986
> 17:14:03 Rollforward of log record failed. iserrno = 101
> 17:14:03 Log Record: log = 128356, pos = 0x22f098, type =
> OLDRSAM:ADDITEM(28),
> trans = 986
> 17:14:03 Rollforward of log record failed. iserrno = 101
> 17:14:03 Log Record: log = 128356, pos = 0x22d0a4, type =
> OLDRSAM:DELITEM(29),
> trans = 986
> 17:14:03 Rollforward of log record failed. iserrno = 101
> 17:14:03 Log Record: log = 128356, pos = 0x22d0a4, type =
> OLDRSAM:DELITEM(29),
> trans = 986
> 17:14:04 Rollforward of log record failed. iserrno = 101
> 17:14:04 Log Record: log = 128356, pos = 0x233098, type =
> OLDRSAM:ADDITEM(28),
> trans = 149
> 17:14:04 Rollforward of log record failed. iserrno = 101
> 17:14:04 Log Record: log = 128356, pos = 0x233098, type =
> OLDRSAM:ADDITEM(28),
> trans = 149
> 17:14:04 Rollforward of log record failed. iserrno = 101
> 17:14:04 Log Record: log = 128356, pos = 0x22e0a4, type =
> OLDRSAM:ADDITEM(28),
> trans = 986
> 17:14:04 Rollforward of log record failed. iserrno = 101
> 17:14:04 Log Record: log = 128356, pos = 0x22e0a4, type =
> OLDRSAM:ADDITEM(28),
> trans = 986
> 17:14:04 Rollforward of log record failed. iserrno = 101
> 17:14:04 Log Record: log = 128356, pos = 0x2310a4, type =
> OLDRSAM:DELITEM(29),
> trans = 149
> 17:14:04 Rollforward of log record failed. iserrno = 101
> 17:14:04 Log Record: log = 128356, pos = 0x2310a4, type =
> OLDRSAM:DELITEM(29),
> trans = 149
> 17:14:04 Rollforward of log record failed. iserrno = 101
> 17:14:04 Log Record: log = 128356, pos = 0x2320a4, type =
> OLDRSAM:ADDITEM(28),
> trans = 149
> 17:14:04 Rollforward of log record failed. iserrno = 101
> 17:14:04 Log Record: log = 128356, pos = 0x2320a4, type =
> OLDRSAM:ADDITEM(28),
> trans = 149
> 17:14:04 Checkpoint blocked by down space, waiting for override or shutdown
>
> The chunk that is down does contain data. Is there any way of avoiding
> restoring the entire database to this server, or can we follow some other
> steps?
>
> Both are running Informix 10.00.UC6 on Sun Solaris 10.
>
> Keith Schleicher
> IT Database Administrator
> Cell: 224-210-8358
> Blackberry: 2242108358@messaging.sprintpcs.com
> Page: 2242108358@sprint.skytel.com
>
> For more information, use our DBA Wiki page link below:
>
>
> http://wiki.intra.sears.com/confluence/display/TechStrag/Database+Management#DatabaseManagement
>
> This message, including any attachments, is the property of Sears Holdings
> Corporation and/or one of its subsidiaries. It is confidential and may
> contain
> proprietary or legally privileged information. If you are not the intended
> recipient, please delete it without reading the contents. Thank you.
>