Database Forum / Oracle / Oracle Server / February 2006

Connection refused during opmnctl startall

Nick - 24 Feb 2006 19:29 GMT
I am trying to restart AS10g on SPARC Solaris 9.  opmnctl command hangs
and only starts 1 of 4 processes (HTTP_Server only)... the output of
the command is below.

$ ./opmnctl startall
opmnctl: starting opmn and all managed processes...
================================================================================
opmn id=XXXX:6200
   0 of 3 processes started.

ias-instance id=XXXXXXX
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ias-component/process-type/process-set:
   OC4J/oca/default_island

Error
--> Process (pid=1620)
   time out while waiting for a managed process to start
   Log:
   $ORACLE_HOME/opmn/logs/OC4J~oca~default_island~1

--------------------------------------------------------------------------------
ias-component/process-type/process-set:
   OC4J/OC4J_SECURITY/default_island

Error
--> Process (pid=1621)
   time out while waiting for a managed process to start
   Log:
   $ORACLE_HOME/opmn/logs/OC4J~OC4J_SECURITY~default_island~1

--------------------------------------------------------------------------------
ias-component/process-type/process-set:
   OID/OID/OID

Error
--> Process (pid=0)
   database dependency failed
   SID
   failed to start a managed process because a dependency check failed
   Log:
   none

The sqlnet.log file (below) shows the following entry over and over
again, as the connection is being refused.  If I log in to the metadata
repository I receive an error indicating the listener failed to start a
dedicated process.  I have tweaked my kernel and oracle settings
(semaphores, rlim_max, etc.) and they should be more than adequate. I
initially thought it was the oidmon process crashing, but I do not
think that's it.

Can anyone offer some insight here?

   TNS for Solaris: Version 10.1.0.4.0 - Production
       TCP/IP NT Protocol Adapter for Solaris: Version 10.1.0.4.0 - Production
 Time: 24-FEB-2006 14:16:12
 Tracing not turned on.
 Tns error struct:
   ns main err code: 12564
   TNS-12564: TNS:connection refused

TIA...
Frank van Bortel - 24 Feb 2006 19:50 GMT
>     TNS for Solaris: Version 10.1.0.4.0 - Production
>         TCP/IP NT Protocol Adapter for Solaris: Version 10.1.0.4.0 -
[quoted text clipped - 4 lines]
>     ns main err code: 12564
>     TNS-12564: TNS:connection refused

Oh yeah, I type the magic words for you:

[oracle10@csdb01 oracle10]$ oerr tns 12564
12564, 00000, "TNS:connection refused"
// *Cause: The connect request was denied by the remote user (or TNS
// software).
// *Action: Not normally visible to the user.  For further details, turn on
// tracing and reexecute the operation.

More details, please

Regards,
Frank van Bortel

Top-posting is one way to shut me up...

Nick - 24 Feb 2006 20:01 GMT
Thanks for the response, Frank.  I have retried this a few times now,
and rebooted the machine.  Still no luck.  I have turned on tracing, and
here is the error I see...

nsglbgetRSPidx: returning ecode=0
sntpcall: only 0 bytes read
sntpcall: Can't read from pipe; err[1] = 32
nserror: nsres: id=6, op=72, ns=12547, ns2=12560; nt[0]=517, nt[1]=32, nt[2]=0; ora[0]=0, ora[1]=0, ora[2]=0

Does that shed some more light?
Frank van Bortel - 24 Feb 2006 20:13 GMT
> Thanks for the response, Frank.  I have retried this a few times now,
> and rebooted the machine.  Still no luck, I have turned on tracing, and
[quoted text clipped - 7 lines]
>
> Does that shed some more light?

12547, 00000, "TNS:lost contact"
// *Cause: Partner has unexpectedly gone away, usually during process
// startup.
// *Action: Investigate partner application for abnormal termination. On an
// Interchange, this can happen if the machine is overloaded.

The 12560 is the generic error reported back. You can do this, too, btw,
oerr tns 12547
or
oerr ora 1401
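
The ns= and ns2= fields of the nserror trace line carry the TNS codes that oerr explains. A minimal sketch for pulling them out, assuming the trace line has the same comma-separated layout as the one posted in this thread; extract_ns_codes is a hypothetical helper name, not an Oracle tool:

```shell
#!/bin/sh
# extract_ns_codes: read a Net trace "nserror:" line on stdin and print
# the values of its ns= and ns2= fields, one per line.
# (Hypothetical helper; assumes the field layout seen in this thread.)
extract_ns_codes() {
    # Put every field on its own line, then keep only ns=NNN and ns2=NNN.
    tr ',; ' '\n\n\n' | sed -n 's/^ns2\{0,1\}=\([0-9][0-9]*\)$/\1/p'
}

# Example: the first line of the nserror entry from the trace.
line='nserror: nsres: id=6, op=72, ns=12547, ns2=12560; nt[0]=517, nt[1]=32,'
printf '%s\n' "$line" | extract_ns_codes
```

Each printed code can then be looked up with oerr tns <code> on a host where oerr is installed.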

As you state all system (and kernel?) parameters are correct, I don't
quite know where to take it from here.
Did you try starting the services manually, one-by-one?
iirc, opmnctl verbose status will show you what (or s/status/getstate/g)

See if one of the others fails with a more meaningful error message

Regards,
Frank van Bortel

Top-posting is one way to shut me up...

Nick - 24 Feb 2006 21:20 GMT
I have found some processes that appear to be hung... the following is
output from ps -fu oracle:

    UID   PID  PPID  C    STIME TTY      TIME CMD
 oracle  3845     1  0 15:29:18 ?        0:00 oraclempris10g (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
 oracle  3859  3855  0 15:33:08 pts/2    0:00 -ksh
 oracle   418     1  0 14:02:12 ?        0:00 ora_pmon_mpris10g
 oracle   420     1  0 14:02:12 ?        0:00 ora_mman_mpris10g
 oracle   422     1  0 14:02:12 ?        0:01 ora_dbw0_mpris10g
 oracle   424     1  0 14:02:12 ?        0:01 ora_lgwr_mpris10g
 oracle   426     1  0 14:02:13 ?        0:04 ora_ckpt_mpris10g
 oracle   428     1  0 14:02:13 ?        0:05 ora_smon_mpris10g
 oracle   430     1  0 14:02:13 ?        0:00 ora_reco_mpris10g

The ? leads me to believe these processes are hung; the main issue is
that I cannot kill them b/c their parent process is pid 1 - init.

Any advice on how to proceed?
Jim Smith - 25 Feb 2006 06:51 GMT
>I have found some processes that appear to be hung... the following is
>output from ps -fu oracle:
[quoted text clipped - 15 lines]
>
>Any advice on how to proceed?

These are mostly oracle database background processes and are almost
certainly not hung. The ? just means they are not attached to a
terminal. Under no circumstances should they be killed.
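
To make the TTY column concrete, here is a rough sketch that labels each row of a ps -f listing by whether it has a controlling terminal. classify_tty is a hypothetical helper, and it assumes the eight-column layout shown in this thread with no space inside STIME:

```shell
#!/bin/sh
# classify_tty: read `ps -f` style output on stdin (header on line 1,
# columns UID PID PPID C STIME TTY TIME CMD) and report, per PID,
# whether the process has a controlling terminal.
# Assumes STIME contains no embedded space.
classify_tty() {
    awk 'NR > 1 && NF >= 8 {
        if ($6 == "?") print $2 ": no controlling terminal"
        else           print $2 ": attached to " $6
    }'
}

# Sample rows taken from the listing in this thread.
classify_tty <<'EOF'
    UID   PID  PPID  C    STIME TTY      TIME CMD
 oracle  3859  3855  0 15:33:08 pts/2    0:00 -ksh
 oracle   418     1  0 14:02:12 ?        0:00 ora_pmon_mpris10g
EOF
```

On a live system the input would come from ps -fu oracle. A ? here says nothing about whether the process is hung; the process state is reported separately, in the S column of ps -el.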

The first one (PID 3845) is an oracle client shadow process and its
parent ought to be a sqlplus session or something similar and might be
hung.

These are probably not related to your problem.

If you want to get rid of the hung process, kill -9 3845 as root ought
to get rid of it and you can then bounce the database if you want.

Jim Smith
I'm afraid you've mistaken me for someone who gives a damn.

Nick - 27 Feb 2006 14:46 GMT
Thanks all for your reply.  I've still found no resolution for this
issue.  I've examined every potential resource shortfall I can think
of, and everything appears to be fine.  I have noticed, however, that
an effective group ID has been assigned to my oracle user.  I do not
recall seeing this in the past.

Below...

$ id
uid=101(oracle) gid=100(dba) egid=2(bin)

Could this be causing the problems I am having?
Frank van Bortel - 27 Feb 2006 18:15 GMT
> Thanks all for your reply.  I've still found no resolution for this
> issue.  I've examined every potential resource shortfall I can think
[quoted text clipped - 8 lines]
>
> Could this be causing the problems I am having?

Yes
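
For anyone checking the same thing: id prints an egid= field only when the effective group differs from the real group, so its presence alone is the warning sign. A small sketch, with egid_of as a hypothetical helper name:

```shell
#!/bin/sh
# egid_of: read `id` output on stdin and print the egid=... value, if any.
# Prints nothing when id reported no separate effective group.
egid_of() {
    sed -n 's/.*[[:space:]]egid=\([^[:space:]]*\).*/\1/p'
}

# Check the current session.
out=$(id | egid_of)
if [ -n "$out" ]; then
    echo "effective group differs from real group: egid=$out"
else
    echo "effective group matches real group"
fi
```

Fed the id output quoted in this thread, egid_of prints 2(bin).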


Regards,
Frank van Bortel

Top-posting is one way to shut me up...

Nick - 28 Feb 2006 15:26 GMT
Thank you all for your help and insight.  Turns out that the
permissions have gotten out of whack on this box.  The oracle user was
using /bin/ksh - which somehow had a setgid bit in its permissions,
and the group owner was bin - hence my egid of 2 (bin).  I switched the
oracle user over to /bin/sh - and was able to bring AS10g up with no
issues.
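
A setgid bit on the login shell can be spotted with a check along these lines. This is only a sketch: has_setgid and report_shell are hypothetical helper names, and /bin/ksh is used because it is the shell named above.

```shell
#!/bin/sh
# has_setgid FILE: succeed if FILE exists and has the setgid bit set.
has_setgid() {
    [ -g "$1" ]
}

# report_shell FILE: show owner, group and mode, then flag a setgid bit.
report_shell() {
    f=$1
    if [ ! -e "$f" ]; then
        echo "$f: not found"
        return 0
    fi
    ls -lL "$f"    # -L follows a symlink to the real binary
    if has_setgid "$f"; then
        echo "$f: setgid bit is set"
    else
        echo "$f: setgid bit is not set"
    fi
}

report_shell /bin/ksh
```

A setgid binary runs with the effective group of the file's group owner, which matches the egid=2(bin) seen here. If the bit is set unexpectedly, compare against a known-good host before changing the mode (for example with chmod g-s as root).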

I was able to circumvent the issue - but still can't figure out why the
permissions went bad.

Thanks again....

//NC
Nick - 24 Feb 2006 21:23 GMT
I have found some processes that appear to be hung... the following is
output from ps -fu oracle:

    UID   PID  PPID  C    STIME TTY      TIME CMD
 oracle  3845     1  0 15:29:18 ?        0:00 oraclempris10g (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
 oracle  3859  3855  0 15:33:08 pts/2    0:00 -ksh
 oracle   418     1  0 14:02:12 ?        0:00 ora_pmon_XXXX
 oracle   420     1  0 14:02:12 ?        0:00 ora_mman_XXXX
 oracle   422     1  0 14:02:12 ?        0:01 ora_dbw0_XXXX
 oracle   424     1  0 14:02:12 ?        0:01 ora_lgwr_XXXX
 oracle   426     1  0 14:02:13 ?        0:04 ora_ckpt_XXXX
 oracle   428     1  0 14:02:13 ?        0:05 ora_smon_XXXX
 oracle   430     1  0 14:02:13 ?        0:00 ora_reco_XXXX

The ? leads me to believe these processes are hung; the main issue is
that I cannot kill them b/c their parent process is pid 1 - init.

Any advice on how to proceed?
Frank van Bortel - 24 Feb 2006 19:50 GMT
> I am trying to restart AS10g on SPARC Solaris 9.  opmnctl command hangs
> and only starts 1 of 4 processes (HTTP_Server only)... the output of
[quoted text clipped - 16 lines]
>     Log:
>     $ORACLE_HOME/opmn/logs/OC4J~oca~default_island~1

What if you just retry? Those Java processes are not known
for their speed...


Regards,
Frank van Bortel

Top-posting is one way to shut me up...

 