IDS Forum
Re: How to Monitor Open Connection Time???
Posted By: Cesar Inacio Martins Date: Friday, 3 December 2010, at 12:44 p.m.
In Response To: Re: How to Monitor Open Connection Time??? (Fernando Nunes)
Damn!!
Sorry, this is my "Homer Simpson" moment... :P
I marked the wrong lines on the strace output... let's try again...
###14:01:48 accept(7, {sa_family=AF_INET6, sin6_port=htons(29315),
inet_pton(AF_INET6, "::ffff:172.18.0.104", &sin6_addr), sin6_flowinfo=0,
sin6_scope_id=0}, [28]) = 4
14:01:48 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644, st_size=230,
...}) = 0
14:01:48 open("/etc/hosts.equiv", O_RDONLY|O_LARGEFILE) = 3
14:01:48 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644, st_size=230,
...}) = 0
14:01:48 read(257, "#\n# hosts.equiv This file desc"..., 4096) = 230
14:01:48 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
14:01:48 read(3, "#\n# hosts This file desc"..., 4096) = 3957
###14:01:48 stat64("/etc/resolv.conf", {st_mode=S_IFREG|0644,
st_size=850, ...}) = 0
###14:01:48 connect(3, {sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("8.8.8.8")}, 28) = 0
14:01:48 send(3,
"x8\1\0\0\1\0\0\0\0\0\0\003104\0010\00218\003172\7in-add"..., 43,
MSG_NOSIGNAL) = 43
14:01:48 recvfrom(3,
"x8\201\203\0\1\0\0\0\0\0\0\003104\0010\00218\003172\7in-add"..., 1024,
0, {sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("8.8.8.8")}, [16]) = 43
###14:02:09 accept(7, {sa_family=AF_INET6, sin6_port=htons(46297),
inet_pton(AF_INET6, "::ffff:172.18.0.105", &sin6_addr), sin6_flowinfo=0,
sin6_scope_id=0}, [28]) = 4
14:02:09 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
14:02:09 read(3, "#\n# hosts This file desc"..., 4096) = 3957
###14:02:09 stat64("/etc/resolv.conf", {st_mode=S_IFREG|0644,
st_size=871, ...}) = 0
###14:02:09 open("/etc/resolv.conf", O_RDONLY) = 3
###14:02:09 read(3, "### /etc/resolv.conf file autoge"..., 4096) = 871
###14:02:09 connect(3, {sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("1.1.1.1")}, 28) = 0
14:02:09 send(3,
"\204H\1\0\0\1\0\0\0\0\0\0\003105\0010\00218\003172\7in-add"..., 43,
MSG_NOSIGNAL) = 43
14:02:14 send(3,
"\204H\1\0\0\1\0\0\0\0\0\0\003105\0010\00218\003172\7in-add"..., 43,
MSG_NOSIGNAL) = 43
14:02:19 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644, st_size=230,
...}) = 0
14:02:19 open("/etc/hosts.equiv", O_RDONLY|O_LARGEFILE) = 3
14:02:19 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644, st_size=230,
...}) = 0
14:02:19 read(257, "#\n# hosts.equiv This file desc"..., 4096) = 230
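A trace like the one above can also be read mechanically. The following is only a minimal sketch, not a tool used in this thread: the function names and the SAMPLE excerpt are made up for illustration (the DNS payload strings are cut down to "..."), and it assumes strace output with HH:MM:SS timestamps, optionally prefixed by the "###" marks used here. It lists which DNS server each connect() to port 53 went to, and where the clock jumped between consecutive system calls.

```python
import re
from datetime import datetime

# Start of a timestamped strace line, with or without the "###" marks.
STAMP = re.compile(r"^(?:#+)?(\d{2}:\d{2}:\d{2}) (\w+)\(", re.MULTILINE)

# connect() to UDP/TCP port 53; the address may be wrapped onto the next line.
DNS_CONNECT = re.compile(
    r'(\d{2}:\d{2}:\d{2}) connect\(\d+, \{sa_family=AF_INET, '
    r'sin_port=htons\(53\),\s*sin_addr=inet_addr\("([0-9.]+)"\)'
)

# Illustrative excerpt in the shape of the trace above (payloads shortened).
SAMPLE = '''###14:02:09 connect(3, {sa_family=AF_INET, sin_port=htons(53),
sin_addr=inet_addr("1.1.1.1")}, 28) = 0
14:02:09 send(3,
"...", 43,
MSG_NOSIGNAL) = 43
14:02:14 send(3,
"...", 43,
MSG_NOSIGNAL) = 43
14:02:19 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644, st_size=230,
...}) = 0
'''


def dns_servers_contacted(trace):
    """Return (time, server_ip) for every connect() to port 53."""
    return DNS_CONNECT.findall(trace)


def syscall_gaps(trace, min_seconds=1):
    """Return (prev_time, time, syscall, gap) for gaps of at least min_seconds.

    Timestamps have one-second resolution and a rollover past midnight
    is not handled (a negative gap is simply ignored).
    """
    events = STAMP.findall(trace)
    gaps = []
    for (t0, _), (t1, call) in zip(events, events[1:]):
        delta = (datetime.strptime(t1, "%H:%M:%S")
                 - datetime.strptime(t0, "%H:%M:%S")).total_seconds()
        if delta >= min_seconds:
            gaps.append((t0, t1, call, int(delta)))
    return gaps


if __name__ == "__main__":
    print(dns_servers_contacted(SAMPLE))
    print(syscall_gaps(SAMPLE))
```

On the excerpt, the two 5-second gaps (the resend at 14:02:14 and the move on to hosts.equiv at 14:02:19) are the delay a client would feel while opening the connection.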
I forgot to mention in the last message...
The code does not appear to reread resolv.conf every time. It appears to get
the status of the file (the stat() function) and, if something is
different (probably the last modification date), reread the file...
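The stat-then-reread pattern guessed at here can be sketched as follows. This is not glibc's resolver code, only an illustration of the idea: the class name, the helper names and the choice of comparing modification time plus size are all assumptions, and the demo works on a temporary file rather than the real /etc/resolv.conf.

```python
import os
import tempfile


class ReloadingConfig:
    """Cache a file's contents and re-read only when stat() shows a change."""

    def __init__(self, path):
        self.path = path
        self._signature = None  # (mtime in ns, size) seen at the last read
        self._text = None
        self.reads = 0          # how many times the file was actually opened

    def get(self):
        st = os.stat(self.path)
        signature = (st.st_mtime_ns, st.st_size)
        if signature != self._signature:
            # Something changed since the last read: open and read again.
            with open(self.path) as fh:
                self._text = fh.read()
            self._signature = signature
            self.reads += 1
        return self._text


def nameservers(resolv_text):
    """Extract the nameserver addresses from resolv.conf-style text."""
    servers = []
    for line in resolv_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def demo():
    """Simulate the 8.8.8.8 -> 1.1.1.1 edit on a temporary file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "resolv.conf")
        with open(path, "w") as fh:
            fh.write("nameserver 8.8.8.8\n")
        cfg = ReloadingConfig(path)
        before = nameservers(cfg.get())
        cfg.get()  # unchanged file: only stat(), no second read
        with open(path, "w") as fh:
            fh.write("# edited\nnameserver 1.1.1.1\n")
        after = nameservers(cfg.get())
        return before, after, cfg.reads


if __name__ == "__main__":
    print(demo())
```

This matches the shape of the trace: the first connection shows only a stat64() on resolv.conf, while the connection after the edit shows stat64() followed by open() and read().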
On 12/03/2010 03:32 PM, Cesar Inacio Martins wrote:
> Oops, sorry, I confused myself: I meant resolv.conf and wrote
> hosts.equiv...
>
> The correct statement is: "- Each open connection always rereads resolv.conf...".
>
> So I ran some tests here (on my netbook), now using the same
> version as our production (11.50 xC7W1GE).
> And it works too!
>
> I started the database under strace and kept a tail running over the files.
> Between these two blocks I changed the DNS server in /etc/resolv.conf from
> 8.8.8.8 to 1.1.1.1 and opened a new connection from a different host.
>
> (check lines marked with "###" by me)
> -----------------------------------
> $tail -n +0 -f ifx* | egrep "add|nscd|host|resol"
>
> ###14:01:48 accept(7, {sa_family=AF_INET6, sin6_port=htons(29315),
> inet_pton(AF_INET6, "::ffff:172.18.0.104",&sin6_addr),
> sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 4
> ###14:01:48 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644,
> st_size=230, ...}) = 0
> ###14:01:48 open("/etc/hosts.equiv", O_RDONLY|O_LARGEFILE) = 3
> ###14:01:48 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644,
> st_size=230, ...}) = 0
> 14:01:48 read(257, "#\n# hosts.equiv This file desc"..., 4096) = 230
> 14:01:48 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
> 14:01:48 read(3, "#\n# hosts This file desc"..., 4096) = 3957
> 14:01:48 stat64("/etc/resolv.conf", {st_mode=S_IFREG|0644,
> st_size=850, ...}) = 0
> ###14:01:48 connect(3, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("8.8.8.8")}, 28) = 0
> 14:01:48 send(3,
> "x8\1\0\0\1\0\0\0\0\0\0\003104\0010\00218\003172\7in-add"..., 43,
> MSG_NOSIGNAL) = 43
> 14:01:48 recvfrom(3,
> "x8\201\203\0\1\0\0\0\0\0\0\003104\0010\00218\003172\7in-add"...,
> 1024, 0, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("8.8.8.8")}, [16]) = 43
>
> ###14:02:09 accept(7, {sa_family=AF_INET6, sin6_port=htons(46297),
> inet_pton(AF_INET6, "::ffff:172.18.0.105",&sin6_addr),
> sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 4
> 14:02:09 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
> 14:02:09 read(3, "#\n# hosts This file desc"..., 4096) = 3957
> ###14:02:09 stat64("/etc/resolv.conf", {st_mode=S_IFREG|0644,
> st_size=871, ...}) = 0
> ###14:02:09 open("/etc/resolv.conf", O_RDONLY) = 3
> ###14:02:09 read(3, "### /etc/resolv.conf file autoge"..., 4096) = 871
> ###14:02:09 connect(3, {sa_family=AF_INET, sin_port=htons(53),
> sin_addr=inet_addr("1.1.1.1")}, 28) = 0
> 14:02:09 send(3,
> "\204H\1\0\0\1\0\0\0\0\0\0\003105\0010\00218\003172\7in-add"..., 43,
> MSG_NOSIGNAL) = 43
> 14:02:14 send(3,
> "\204H\1\0\0\1\0\0\0\0\0\0\003105\0010\00218\003172\7in-add"..., 43,
> MSG_NOSIGNAL) = 43
> 14:02:19 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644,
> st_size=230, ...}) = 0
> 14:02:19 open("/etc/hosts.equiv", O_RDONLY|O_LARGEFILE) = 3
> 14:02:19 stat64("/etc/hosts.equiv", {st_mode=S_IFREG|0644,
> st_size=230, ...}) = 0
> 14:02:19 read(257, "#\n# hosts.equiv This file desc"..., 4096) = 230
> -----------------------------------
>
> This behavior, rereading resolv.conf, occurs only the first time
> a given IP makes a connection. If it closes and reopens, the server does not
> try to resolve the name again until a few minutes later (probably some kind
> of cache timeout).
>
> Adding these IPs to /etc/hosts was the first thing our Linux
> admin did, but there are a lot of them and they can change dynamically, so
> it isn't an option.
>
> Now I'm starting to believe Informix is innocent :) and the O.S. is guilty,
> but I need some way to prove this.
>
> After some research on the functions gethostbyaddr / gethostbyname,
> I found they are part of the "resolver" library, which comes from glibc.
>
> On my OpenSuse 11.2 (recently updated), glibc is 2.10 (kernel 2.6.31).
> In the production environment (updated Aug/2010), Red Hat 5.5,
> glibc is 2.5.
>
> Fernando, what version is your glibc?
>
> I'm looking through the changelogs / patches for these functions to
> identify whether they are the problem and which version of glibc
> fixes it.
> Looking at the changelog of my glibc (rpm -q --changelog glibc), there are a
> few changes around resolv.conf, but nothing in particular for this
> situation.
>
> Regards
> Cesar
>
>
> On 12/03/2010 01:47 PM, Fernando Nunes wrote:
>> On Fri, Dec 3, 2010 at 12:44 PM, Cesar Inacio Martins<
>> cesar_inacio_martins@yahoo.com.br> wrote:
>>
>>> Hi Fernando,
>>>
>>> I wrote a little program here and confirmed what you said about not
>>> "seeing" the call to gethostbyaddr...
>>>
>>> But... I got new variables...
>>> Testing on my netbook, IFX 11.70 xC1 (not the same version used in
>>> our production) + OpenSuse 11.2, opening connections while changing
>>> resolv.conf and tracing with strace:
>>> - Each new connection always rereads hosts.equiv. (I believed this was
>>> the res_init() function working inside gethostbyaddr.)
>>>
>> hosts.equiv has nothing to do with DNS. It's for trusted connections, and
>> yes, it's read every time (or maybe when it changes, depending on the OS).
>>
>>> - When I change resolv.conf, without bouncing the instance, the next
>>> connection already tries to resolve DNS using the new configuration.
>>>
>> Not on my system (with the test program). I was using Fedora 13.
>>
>>> So now the question: is this an Informix or a Linux problem?
>>> I will install the same version that has the problem on my netbook,
>>> ifx 11.50 uc7w1ge (but 32 bits...) and test.
>>>
>> On my system, I tested without Informix, so it was definitely a problem
>> with the gethostbyaddr() function.
>> Note that calling this a "problem" is a bit simplistic... I'm not sure we
>> would want it to re-read the file on each request...
>>
>>> Just out of curiosity, read this thread:
>>> http://fixunix.com/redhat/17199-etc-resolv-conf-how-reload.html
>>>
>>>
>> Too long! :) Sorry. Only later.
>>
>>> It is a similar situation, not with Informix and on RH 4 (not 5.5).
>>>
>>> Now my suspicion is the glibc used on this RH.
>>> When I finish the test on my netbook with the same version we use
>>> here in production, I will post the results here.
>>>
>>>
>> What about the possibility of adding the hosts to your /etc/hosts file? Are
>> there many connection points in this situation?
>>
>>> On 12/01/2010 11:07 PM, Fernando Nunes wrote:
>>>> On Wed, Dec 1, 2010 at 6:01 PM, Cesar Inacio Martins<
>>>> cesar_inacio_martins@yahoo.com.br> wrote:
>>>>
>>>>> Hi Fernando!
>>>>> Thanks for your message!
>>>>> Sorry it took so long to answer; I don't know why the last 2 days of
>>>>> messages from IIUG arrived only just now.
>>>>>
>>>>> About gethostbyaddr(), I'm not sure about that because it isn't
>>>>> what we see in the strace.
>>>>>
>>>> I wrote a simple program in Fedora to test your situation... Several
>>>> interesting things came up. They're not very helpful, but they prove
>>>> Informix's innocence :)
>>>>
>>>> On strace you will not see the reference to gethostbyaddr. Only in a
>>>> debugger or if you force a stack trace in the precise moment...
>>>>
>>>>> The DNS reverse lookup request appears to be executed "manually" by
>>>>> Informix, or strace just traced the gethostbyaddr() call too.
>>>>> Check the output marked with "##" by me,
>>>>> where 172.18.0.57 and 172.18.0.119 are the OLD DNS servers from when
>>>>> the server was started. When we ran this strace, resolv.conf had
>>>>> already held the new values for at least 2 days:
>>>>>
>>>>> 11:24:28 munmap(0x2abeb6d75000, 4096) = 0
>>>>> 11:24:28 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
>>>>> ##11:24:28 connect(3, {sa_family=AF_INET, sin_port=htons(53),
>>>>> sin_addr=inet_addr("172.18.0.57")}, 28) = 0
>>>>> 11:24:28 fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
>>>>> 11:24:28 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>>>>> 11:24:28 poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3,
>>>>> revents=POLLOUT}])
>>>>> ##11:24:28 sendto(3,
>>>>> "\242;\1\0\0\1\0\0\0\0\0\0\003185\00248\00222\003172\7in-ad"..., 44,
>>>>> MSG_NOSIGNAL, NULL, 0) = 44
>>>>> ##11:24:28 poll([{fd=3, events=POLLIN}], 1, 5000) = 0 (Timeout)
>>>>> 11:24:33 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
>>>>> 11:24:33 connect(4, {sa_family=AF_INET, sin_port=htons(53),
>>>>> sin_addr=inet_addr("172.18.0.119")}, 28) = 0
>>>>> 11:24:33 fcntl(4, F_GETFL) = 0x2 (flags O_RDWR)
>>>>> ... it continues on to the third DNS server...
>>>>> (check the timestamps: a 5-second timeout delay)
>>>>>
>>>>> Before you ask, the Linux admin already tried changing the timeout
>>>>> (an option in resolv.conf) and it had no effect...
>>>>>
>>>>>
>>>> 2nd interesting observation... I put the program in a loop, reading an IP
>>>> from the console and trying to reverse-resolve it.
>>>> Between loops I changed my resolv.conf... It didn't matter for the running
>>>> process. I believe gethostbyaddr creates some static structures. The first
>>>> time it runs it reads resolv.conf, but that doesn't happen again, probably
>>>> for performance reasons. But it's inconvenient in your situation:
>>>>
>>>> open("/etc/resolv.conf", O_RDONLY) = 3
>>>> fstat64(3, {st_mode=S_IFREG|0644, st_size=55, ...}) = 0
>>>> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb776e000
>>>> read(3, "# Generated by NetworkManager\nna"..., 4096) = 55
>>>> read(3, "", 4096) = 0
>>>> close(3)
>>>>
>>>> This is why it keeps trying the old DNS...
>>>>
>>>>> Sorry, yes, it is 11.50 FC7 GE on Linux Red Hat 5.5.
>>>>> The option of creating a group is nice, but I'm not sure it is viable
>>>>> for this environment, because there are a lot of applications spread
>>>>> across the company (.net/windows, java/web, C/Linux) for which we would
>>>>> need to change the SQLHOSTS...
>>>>>
>>>>> The nsswitch.conf is ok; it hasn't been modified from its default
>>>>> values. The problem is with connections that come from outside our
>>>>> network (which don't exist in /etc/hosts) and have consequences for
>>>>> local connections (which do exist in hosts) when the instance has
>>>>> trouble resolving the hostnames of the non-local clients.
>>>>>
>>>> Are there many client IPs from outside the network? If the number is low
>>>> you could put them in /etc/hosts...
>>>>
>>>>> And the problem we detected is in the MSC, because that is what
>>>>> freezes...
>>>>> The weirdest thing is that when the MSC "freezes" (stuck running in the
>>>>> active threads: onstat -g act), the stack dump (onstat -g stk #threadid)
>>>>> doesn't change at all; it still shows the yield processing, and only
>>>>> the status changes from sleep to running...
>>>>>
>>>>>
>>>> Now... none of this is good news. I really can't see a good solution
>>>> besides a restart... You're a victim of how TCP/IP name resolution works.
>>>> Apparently nscd will not work either, because gethostbyaddr only tries it
>>>> the first time...
>>>>
>>>> Some other thoughts:
>>>>
>>>> 1- You could launch more MSC VPs. The new ones should pick up the correct
>>>> DNS. And before you ask, no, you cannot remove MSC VPs...
>>>> But with more MSC VPs your chances of getting stuck should be lower.
>>>> 2- (this is very weird...) You could use iptables to hijack connections to
>>>> the OLD DNS and redirect them to the new one... Not sure if this is
>>>> possible, but you would have to create a rule for the old IP, port 53 and
>>>> protocol UDP.
>>>> 3- You could "hack" the network to make the old IP available. This could
>>>> be done by putting a machine on the network with that IP, or eventually by
>>>> hacking the ARP table.
>>>> 4- You could create another TCP interface on the machine (how?) with the
>>>> IP of the old DNS server. I tried putting my own IP in the
>>>> /etc/resolv.conf file and the reverse DNS returns to "quick" (although I
>>>> don't have anything listening on port 53)...
>>>> The problem you're facing comes from the fact that the IP cannot be
>>>> reached. If you make it go to an existing IP it will not resolve, but it
>>>> will be quick.
>>>>
>>>> Naturally 2, 3 and 4 are "crazy" suggestions... But I don't see any "sane"
>>>> alternative to an instance stop.
>>>>
>>>> Regards.
>>>>