A short while ago, I noticed that logging into crimson via ssh from other machines on my home network was occasionally very slow. I knew that DNS issues were the most common cause of delays in ssh, but (irrationally) I had a hard time believing this was the cause. I thought it more likely that my recent upgrade of crimson to FreeBSD 5.3 was the root of the problem. Well, that opinion was complete bollocks...
I suspected the upgrade because of changes I had made to the kernel configuration. I had stripped everything I did not need out of the kernel and world by setting the following make variables in /etc/make.conf:
    # make.conf for CRIMSON
    #
    # Modification History:
    # Date      Who
    # 20050311  mpw
    #   Added options to prevent unused functionality being built
    #   Added BATCH=true

    # build for i586
    CPUTYPE=i586

    # standard CFLAGS
    CFLAGS= -O -pipe

    # no profiled libraries
    NOPROFILE=true

    # just need a kernel - crimson uses no modules
    NO_MODULES=true

    # omit these elements from buildkernel/buildworld
    NO_BIND=true
    NO_IPFILTER=true
    NO_PF=true
    NO_AUTHPF=true
    NOATM=true
    NO_USB=true
    NO_LPR=true
    NO_ACPI=true
    NO_VINUM=true
    NO_BLUETOOTH=true
    NO_I4B=true
    NO_OBJC=true
    NO_FORTRAN=true
    NO_GPIB=true
    NO_SHAREDOCS=true
    NO_NIS=true

    # can't use AAAA queries against the Alcatel DNS
    NOINET6=true

    # Build ports in unattended mode
    BATCH=true

    # added by use.perl 2005-03-09 12:06:02
    PERL_VER=5.8.6
    PERL_VERSION=5.8.6
I had then found the following sshd error in the system log file for each ssh login:
sshd[n]: login_getclass: unknown class 'root'
Didn't seem to stop me logging in, but annoying just the same.
After a little digging, I found someone else had reported the problem, which turned out to be due to the removal of NIS from the build via a custom /etc/make.conf. The default system /etc/nsswitch.conf settings assumed that NIS was present. After I changed /etc/nsswitch.conf to the contents shown below, the sshd errors no longer occurred.
    # Modified by mpw
    # 18/05/2005 - remove nis from settings, as NO_NIS is specified in
    # /etc/make.conf
    group: files
    hosts: files dns
    networks: files
    passwd: files
    shells: files

    # original settings
    #group: compat
    #group_compat: nis
    #hosts: files dns
    #networks: files
    #passwd: compat
    #passwd_compat: nis
    #shells: files
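For anyone wanting to check their own file, here is a minimal Python sketch that flags nsswitch.conf entries still referring to nis or compat. The function names parse_nsswitch and nis_dependent are made up for this illustration, and the parsing is deliberately simple (one "database: sources" entry per line, "#" starting a comment), so treat it as a rough check rather than a full parser.

```python
def parse_nsswitch(text):
    """Return {database: [sources]} from nsswitch.conf-style text."""
    entries = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        database, sources = line.split(":", 1)
        entries[database.strip()] = sources.split()
    return entries


def nis_dependent(entries):
    """List the databases whose sources mention nis or compat."""
    return sorted(
        database
        for database, sources in entries.items()
        if any(source in ("nis", "compat") for source in sources)
    )


if __name__ == "__main__":
    # The original (pre-fix) settings quoted in this post, as sample input.
    sample = """\
group: compat
group_compat: nis
hosts: files dns
networks: files
passwd: compat
passwd_compat: nis
shells: files
"""
    for database in nis_dependent(parse_nsswitch(sample)):
        print(f"{database}: still refers to nis/compat")
```

To run it against a real system, read /etc/nsswitch.conf into a string and pass that to parse_nsswitch instead of the sample.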
Back to the topic in hand... One evening, the delays in ssh were really significant, and continued to be reproducible for a long period of time. At last I had a chance to perform some diagnostics. Firstly, to convince myself that the problems were not DNS related, I put the IP addresses of the two machines I was testing into the /etc/hosts file, in order to ensure that DNS would not be used. Hmm, no delays when connecting via ssh. OK, I was convinced it must be a DNS issue.
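The same before-and-after comparison can be made without ssh by timing the resolver directly. The sketch below is only an illustration: timed is a hypothetical helper, and it assumes the delay comes from the kind of name lookup sshd can perform on the connecting client's address (OpenSSH's UseDNS option controls this). Running it before and after editing /etc/hosts should show whether name resolution is where the time goes.

```python
import socket
import time


def timed(func, *args):
    """Call func(*args) and return (result_or_exception, elapsed_seconds)."""
    start = time.monotonic()
    try:
        result = func(*args)
    except OSError as exc:  # socket.herror and socket.gaierror are OSErrors
        result = exc
    return result, time.monotonic() - start


# Example use (192.0.2.10 is a placeholder for a client's address):
#   result, elapsed = timed(socket.gethostbyaddr, "192.0.2.10")
#   print(f"{elapsed:.3f}s -> {result}")
```

A lookup answered from /etc/hosts should return almost instantly, while one that has to wait on a slow or unreachable DNS server will take seconds.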
What puzzled me was that using nslookup to check the DNS servers seemed to work fine - no delays or errors were discernible. What was going on?
I thought I ought to check that the DNS addresses provided by my ISP were correct. What do you know? They'd changed the addresses and omitted to tell me. Talk about impolite... To check that this was the problem, I hacked the new addresses into /etc/resolv.conf and, as if by magic, the delays disappeared. Obviously the servers at the old DNS addresses still worked, but were a lot slower to respond to the queries from sshd.
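One way to compare old and new servers side by side is to time a single query against each address listed in resolv.conf. This is a rough Python sketch, not what I actually ran: the helper names are invented, it sends a bare A query over UDP built by hand, and it assumes IPv4 name servers listening on port 53. A single query may well look fine (as it did with nslookup), so repeat it a few times before drawing conclusions.

```python
import socket
import struct
import time


def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    )
    # Question: name, terminating zero byte, QTYPE=A (1), QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def time_query(server, hostname, timeout=5.0):
    """Return seconds until a UDP reply arrives, or None on timeout/error."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_query(hostname), (server, 53))
            sock.recvfrom(512)
        except OSError:  # includes socket.timeout
            return None
        return time.monotonic() - start


# Example use:
#   with open("/etc/resolv.conf") as fh:
#       for server in nameservers(fh.read()):
#           print(server, time_query(server, "www.freebsd.org"))
```

The reply is not decoded here; the sketch only measures how long each server takes to answer at all.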
All I had to do was change the DHCP server (an Alcatel Speedtouch 510) to send the new DNS addresses when giving out a DHCP address to the local network machines. See this article for details.