Troubleshooting NetCon

Overview

We have tried to make NetCon as simple as possible to install and use. Experience has shown that most problems can be traced to either the wrong software version or improper responses to the configuration questions during installation.

Before calling technical support, please read this section and follow the "NetCon Troubleshooting Chart".

The "NetCon Troubleshooting Chart" is designed to guide you through the various steps in troubleshooting NetCon. Start at the top and work down, completing one step before going on to the next. In other words, fix the installation errors before checking for the boot screen printout, and debug and fix the boot screen printout before looking for the "New IPX/SPX address:". Make sure you have the correct "New IPX/SPX address:" before you attempt to test the client or server.

Troubleshooting Chart

Debug Installation

If errors occur during the installation process, it is most likely that either the "System Requirements" have not been met (e.g., the wrong software version) or there is not enough free disk space. Please make the following checks, correct the problem and re-install NetCon.

Make sure you are logged in as "root".

Check the "System Requirements" section of the Installation Guide

Verify that:

Your CPU type is supported.

Your operating system version is supported, is fully loaded and includes networking support. Make sure TCP/IP is functioning.

You have the necessary hardware/network card and driver installed and functioning.

Check disk free space

Make sure you have enough space to install NetCon. 2 MB of free disk space is required in "/" and "/usr".

Example:

# df / (Enter)

# df /usr (Enter)
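The 2 MB requirement can also be checked mechanically. The following is a minimal sketch, assuming a POSIX shell and a "df" that accepts the POSIX "-Pk" options (older systems may not). The name enough_space is a hypothetical helper, not part of NetCon.

```shell
# Hypothetical helper: reads a "df -Pk" listing on standard input and
# succeeds when the "Available" column (field 4 of the second line)
# is at least the number of KB given as the first argument.
enough_space() {
    awk -v need="$1" '
        NR == 2 { ok = ($4 + 0 >= need + 0) }
        END     { exit(ok ? 0 : 1) }
    '
}
```

Usage, for example: df -Pk /usr | enough_space 2048 || echo "less than 2 MB free in /usr"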

Check that the "/usr" directory is mounted.

NetCon installs into the "/usr/bin", "/usr/lib/netcon" and "/" directories.

Example:

# ls /usr/bin (Enter)

# mount (Enter)

Debug Boot

If errors occurred during the boot process and the NetCon banner "NetCon TFS File-System Loaded" did not appear on the boot screen or in the "/usr/adm/messages" file, examine the installation and make sure the kernel has been rebuilt (SCO) or the system was booted with the "-r" option (Solaris).

Examine and Verify the installation

If you do not see "NetCon TFS File-System Loaded", check for the existence of the following:

ALL Platforms

/dev/str*_* 3 to 48 files

/dev/netcon* 4 files

/usr/bin/netc* see distribution for complete list

/usr/lib/netcon/* see distribution for complete list

SCO see distribution list for "/etc/conf/*"

AIX

/etc/inittab Check that the "rcnetcon" line has been added.

Solaris 2.X

/kernel/fs/TFS

/kernel/drv/Str

/kernel/drv/Str.conf

/kernel/drv/netc

/kernel/drv/netc.conf

/devices/pseudo/netc*

/devices/pseudo/str*

/etc/devlink.tab Check that the NetCon devices "netc" and "Str" have been added to the end of the file.
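The existence checks listed above lend themselves to a small script. The following is a minimal sketch, assuming a POSIX shell. The name count_glob is a hypothetical helper, not part of the NetCon distribution. It counts the paths that match a pattern so that the result can be compared with the expected counts (for example, 4 for "/dev/netcon*").

```shell
# Hypothetical helper: print how many existing paths match the
# glob pattern passed (quoted) as the first argument.
count_glob() {
    n=0
    # $1 is left unquoted on purpose so that the shell expands the pattern.
    for f in $1; do
        # An unmatched pattern stays literal and fails the -e test.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}
```

Usage, for example: count_glob "/dev/netcon*"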

Solaris reboot with the reconfigure option -r

Example:

ok> boot -r (Enter)

or, on older ROM versions:

ok> b -r (Enter)

Alternatively, check for the file "/reconfigure", touch it if it is missing, and reboot.

SCO Rerun "netconfig" and Relink the SCO kernel

Example:

# netconfig (Enter) #remove and add netcon chain

or

# cd /etc/conf/cf.d (Enter)

# ./link_unix (Enter)

Debug "New IPX/SPX address:"

Check the UNIX boot screen or the end of the "/usr/adm/messages" file for

New IPX Address:

<net:XXXXXXXX, host:XXXXXXXXXXXX, port:0000>

This is NetCon's basic sanity check to verify that the network is functioning properly and is able to transmit and receive packets. Verify that "net:" is your correct network number (or "00000000" for a stand-alone system) and that "host:" is your correct network card address. If either one is incorrect, your network card is not installed or configured properly, or the wrong network type was selected. Verify the network card configuration and check for any conflicts with other cards. Then check and verify the network type.

Check that all the NetCon modules are loaded: TFS, Str and netc.
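To compare the "net:" and "host:" values with your own records, the fields can be pulled out of the address line. The following is a minimal sketch, assuming a POSIX shell and the line format shown above. The name ipx_field is a hypothetical helper, not a NetCon command.

```shell
# Hypothetical helper: print the hexadecimal value of one field
# (net, host or port) from a "<net:..., host:..., port:...>" line.
# $1 = field name, $2 = the address line
ipx_field() {
    printf '%s\n' "$2" | sed -n "s/.*$1:\([0-9A-Fa-f]*\).*/\1/p"
}
```

Usage, for example: ipx_field host "$line", where $line holds the address line copied from "/usr/adm/messages".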

SUN

Example:

# modstat (Enter)

Check the network card operation

The following commands should produce a statistics listing that shows input and output packets, with errors no greater than 1% of the packets.

SCO

Example:

# hwconfig (Enter)

# llistat (Enter)

# netstat -i (Enter)

SUN

Example:

# netstat -i (Enter)

AIX

Example:

# netstat -i (Enter)
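Once the packet and error counts have been read off the listing, the 1% limit can be checked with integer arithmetic. The following is a minimal sketch, assuming a POSIX shell. The name error_rate_ok is a hypothetical helper, and the column layout of "netstat -i" differs between systems, so the counts are passed in by hand.

```shell
# Hypothetical helper: succeed when the error count is no more than
# 1% of the packet count. $1 = packets, $2 = errors
error_rate_ok() {
    pkts=$1
    errs=$2
    # errs / pkts <= 1%  is the same test as  errs * 100 <= pkts.
    # A packet count of zero is treated as a failure, since the
    # listing is expected to show traffic.
    [ "$pkts" -gt 0 ] && [ $((errs * 100)) -le "$pkts" ]
}
```

Usage, for example: error_rate_ok 120000 35 || echo "error rate above 1%"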

Check and verify the correct Network type

Check the NetWare server "autoexec.ncf" file

Check the Client "net.cfg" file

FRAME=ETHERNET_II # TYPEII Ethernet

FRAME=ETHERNET_802.3 # 802.3

If the wrong network type was selected during installation, you can either re-install NetCon or edit the "/usr/bin/netcon.rc" file and change the line following the "###link" line.

Example:

SUN Solaris 2.x

###link
     /usr/bin/netclink /dev/le /dev/str0_ethernetII          #TYPEII

or

     /usr/bin/netclink /dev/le /dev/str0_802.3          #802.3

AIX

###link
     /usr/bin/netcload -c                              #TYPEII
###nconfig
     /usr/bin/netcconfig en0 "X0" up                    #TYPEII
or
###link
     /usr/bin/netcload -c -e /dev/ent0                    #802.3
###nconfig
     /usr/bin/netcconfig et0 "X0" up                    #802.3

SCO

###link
     /usr/bin/netclink /dev/le /dev/str0_ether2          #TYPEII

or

     /usr/bin/netclink /dev/le /dev/str0_ether               #802.3
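Before editing, it can help to display the line that currently follows "###link". The following is a minimal sketch, assuming a POSIX shell and awk. The name show_link_line is a hypothetical helper. It only reads the file and does not change it.

```shell
# Hypothetical helper: print the line that follows each "###link"
# marker in the file named by $1, with leading blanks removed.
show_link_line() {
    awk '/^###link/ {
        if ((getline line) > 0) {
            sub(/^[ \t]+/, "", line)
            print line
        }
    }' "$1"
}
```

Usage, for example: show_link_line /usr/bin/netcon.rc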

TEST and Debug Client Functions

Verify the Activation Key

Example:

# netcbrand (Enter)

If you don't see "Address verified" after each program, you have a problem with the brand. Check the network card configuration. You may have a bad activation key or a bad adapter address. Verify the adapter address and the activation key. Re-brand and reboot, then run:

Example:

# netcbrand (Enter)

# netcon (Enter)

# netcmount server_name:sys:/ /mnt (Enter)

Enter user name for server SERVER_NAME : supervisor (Enter)

Enter the password for user supervisor on server SERVER_NAME : xxxxxxx (Enter)

Re-enter supervisor password : xxxxxxx (Enter)

NETCMOUNT SERVER_NAME:SYS: /mnt

If errors are produced by any of the programs, first verify that the SUPERVISOR password is correct and that a default user-entry has been added for the server with the "netcon" menu utility. Also, you MUST have a PASSWORD for SUPERVISOR on your NetWare Server.

Now try to restart the password mapping daemon "netcpass":

# netcpass (Enter)

If errors are still displayed when "netcpass" is run, try deleting the bindery and object databases and rebooting the NetCon server and then rerun "netcpass".

# rm /usr/lib/netcon/*.dat (Enter)

# reboot (Enter)

If "netcpass" still displays errors call for tech support!

TEST and Debug Server Functions

Verify that all the NetCon processes are running.

# ps -ef | grep netc (Enter)

You should see "netclink", "netcpass", "netcvt" and at least 4 "netcserv". If not, check the brand.

# netcbrand (Enter)

If the brand is ok try to reboot the NetCon server and re-check if the NetCon processes are running.
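The process check above can be scripted so that missing daemons are reported by name. The following is a minimal sketch, assuming a POSIX shell. The name check_netcon_procs is a hypothetical helper. It reads a process listing on standard input and applies the rule stated above (one each of "netclink", "netcpass" and "netcvt", and at least 4 "netcserv").

```shell
# Hypothetical helper: read a "ps -ef" listing on standard input and
# report which of the expected NetCon processes are missing.
check_netcon_procs() {
    listing=$(cat)
    status=0
    for p in netclink netcpass netcvt; do
        if ! printf '%s\n' "$listing" | grep -q "$p"; then
            echo "missing: $p"
            status=1
        fi
    done
    # grep -c prints 0 but exits non-zero when nothing matches.
    n=$(printf '%s\n' "$listing" | grep -c "netcserv" || true)
    if [ "$n" -lt 4 ]; then
        echo "netcserv count: $n (expected at least 4)"
        status=1
    fi
    return $status
}
```

Usage, for example: ps -ef | check_netcon_procs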

Test the File server from a DOS client

On stand-alone systems NETX.COM or VLM.COM should load without error and find the NetCon server.

Try "slist" or "ncslist" and "map" or "ncmap" from a DOS client to the NetCon server.

J:> slist (Enter)

J:> map k:=netcon/sys:\ (Enter)

or for stand-alone

J:> ncslist (Enter)

J:> ncmap k:=netcon/sys:\ (Enter)

If "slist" shows the NetWare and/or NetCon server and you are able to map a drive to the NetCon server your hardware configuration is OK!

Otherwise you have a problem with the network type or network card.

First verify the network type by re-checking the NetWare server's "AUTOEXEC.NCF" file. If your network is Ethernet and you see:

"frame=ethernet_II"

at the end of the driver's load line, you have a TYPEII Ethernet; otherwise you have a NOVELL STD 802.3 or Token-Ring network. Edit the "/usr/bin/netcon.rc" file as shown in the previous section.
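The rule above for an Ethernet network can be written as a small test on the driver load line. The following is a minimal sketch, assuming a POSIX shell. The name frame_type is a hypothetical helper, and the driver names used with it are illustrations only.

```shell
# Hypothetical helper: classify an Ethernet driver load line from
# AUTOEXEC.NCF. Prints "TYPEII" when the line ends with
# frame=ethernet_II (in any letter case), otherwise "802.3".
frame_type() {
    # Lower-case the line and strip trailing blanks or carriage returns.
    line=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[[:space:]]*$//')
    case $line in
        *frame=ethernet_ii) echo "TYPEII" ;;
        *)                  echo "802.3" ;;
    esac
}
```

Usage, for example: frame_type "load NE2000 port=300 int=3 frame=ETHERNET_II"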

If the network type is correct and you still can't see the NetCon server with "slist" the problem is in the network card.

SCO and SUN

Check the last boot screen at the end of the "/usr/adm/messages" file to see if the network card is listed and if there are any errors. If there are errors listed for the card, the IO address, DMA, SHARED MEMORY or INTERRUPT vector setting is wrong.

AIX

Use Smit to verify that the card is installed and TCP/IP is running on the card. You can also use "Smit", "Problem Determination" to verify that the network card is functioning properly.

Verify the "System Requirements"

Check especially the versions of "LLI", "LSL.COM", "IPXODI.COM" and "VLM.COM" or "NETX.COM". Double check the "System Requirements" on page 1-1 of the "INSTALLATION GUIDE".

Start the server with debugging on

You must first kill all the server daemons, then restart them with the "-d" debug option. The stdout can be redirected.

# netcrestart -s (Enter)

SCO and AIX

# netcserv -d (Enter)

SUN Solaris 2.x

# netcserv -d -a /dev/le -i aafe (Enter)

# netcvt -d (Enter)
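Because the debug output can be redirected, it may be convenient to capture it in a file. The following is a minimal sketch, assuming a POSIX shell. The name run_logged is a hypothetical helper, and the log file path in the usage line is an assumption.

```shell
# Hypothetical helper: run a command with its standard output and
# standard error redirected to a log file.
# $1 = log file, remaining arguments = command to run
run_logged() {
    log=$1
    shift
    "$@" >"$log" 2>&1
}
```

Usage, for example: run_logged /tmp/netcserv.log netcserv -d &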

Test the terminal server

From a DOS client

Example:

C:> ncterm server_name (Enter)

A "login: " should appear.

Other Common Errors

ERROR: /usr/bin/netcpass: Cannot connect to SERVER_NAME , error 117 and /usr/bin/netcmount: error 116.

Cause: Network Type is wrong. Re-check the AUTOEXEC.NCF file. If the "load DRIVER_NAME......" line ends with "frame=ethernet_II", you have a TYPE II Ethernet network; otherwise you have an 802.3 network. Change the configuration in the "/usr/bin/netcon.rc" file: edit the line starting "/usr/bin/netclink", changing "str_ether" to "str_ether2" for TYPE II Ethernet, or "str_ether2" to "str_ether" for 802.3.

ERROR: Adapter address does not match key.

Cause: Network card not functioning or bad activation key. Check the network with "netstat -i", re-check your Ethernet address and get the correct activation key from your NetCon supplier.

ERROR: /usr/bin/netcmount: No such file or directory.

or

ERROR: netcmount IO error

or

ERROR: ls -al /netware

total 0

Cause: Wrong default password for supervisor. Use the "netcon" menu utility to correct it.

Third Level Support

Introduction

Some extraordinary problems require a higher level of technical support and sometimes debugging of the kernel or other programs. The following describes some advanced debugging techniques that may be used by NetCon's technical support staff and is provided for reference only.

Re-configuring kernel parameters

Always refer to the System Administration Guide before re-configuring the kernel.

SUN

Edit and add new parameters to "/etc/system"

SCO

# cd /etc/conf/cf.d (Enter)

# ./configure (Enter)

AIX

# smit (Enter)

Debugging a kernel PANIC

A UNIX kernel "panic" and the subsequent halting of the system result from a kernel function calling "panic()", generally because it has found inconsistent data. The "panic()" routine should do a clean shutdown and copy an image of memory to the system dump/swap device. After the system reboots, the memory image should be saved to a regular file or tape. On some newer UNIX operating systems the kernel absolute debugger "kadb" is available. This utility simplifies kernel debugging and does not require a core file. We will examine both methods of debugging a kernel panic.

Debugging a kernel with kadb

SUN

To debug a kernel panic with kadb, the kadb program must be loaded before the Unix kernel so when the kernel halts it will drop into kadb rather than dumping a core image and halting.

To start kadb you must halt the system and boot with kadb which will load and eventually boot Unix.

Example:

ok> boot kadb -rv (Enter)

You may also want to modify the boot ROM settings to always boot kadb if you are developing, porting or testing a new software version.

To examine a back trace of the calls leading up to the panic:

Examples:

kadb> $c (Enter)

to check registers:

kadb> $r (Enter)

to disassemble code:

kadb> nsinput/i (Enter)

to exit kadb:

kadb> $q (Enter)

AIX

See "Kernel debugging" in InfoExplorer. To build a boot image with debugging turned on:

# bosboot -d /dev/hdisk0 -a -D (Enter)

When a crash occurs and the system won't boot

Turn the key to "Secure" and press reset. When the display shows "200", insert the tape, turn the key to "Service" and press reset twice.

Select 4, "Start a limited function maintenance shell".

# getrootfs hdisk0 (Enter)

# vi /bin/netcon.rc (Enter)

Set up a serial terminal on the RS/6000 serial port "S0" for kernel debugging: 9600 baud, 8 data bits, 1 stop bit, no parity. Windows Terminal works fine.

Serial terminal dialog with debugging on after a crash

NetCon Token-Ring ADDR<10:0:5A:A8:57:E> speed 1=16Mbps
ns_opentoken: open 180000 err 2
GPR0  00000000 2FF7FEA0 BADFCA11 00000000 00000001 0599FE98 00000000 40000000
GPR8  FFFFFFFF 0059B000 0000103D 00115731 2FF98000 DEADBEEF DEADBEEF DEADBEEF
GPR16 DEADBEEF DEADBEEF DEADBEEF DEADBEEF DEADBEEF DEADBEEF DEADBEEF DEADBEEF
GPR24 DEADBEEF DEADBEEF DEADBEEF 00000001 00000000 00038DA0 00000000 00000000

MSR 000890B0  CR   20008024  LR   000C6CC0  CTR   00000000  MQ  DEAC2400
XER 00000006  SRR0 00000000  SRR1 000890B0  DSISR 40000000  DAR 0059B004

IAR 00000000  (ORG-00000000)  ORG=00000000   Mode: VIRTUAL
00000000   00000000 BADFCA11 00115731 00116000   |..........W1..`.|
           |        ?
00000010   0057B000 00000001 0057B000 0059B000   |.W.......W...Y..|

           |
00000000   00000000 BADFCA11 00115731 00116000   |..........W1..`.|
00000010   0057B000 00000001 0057B000 0059B000   |.W.......W...Y..|
00000020   00002DDA 001FA2FC 037C1000 84C20004   |..-......|......|
00000030   94C10004 7C020000 4180FFF4 76260800   |....|...A...v&..|
00000040   40820028 7CAE0426 388000F8 7C852010   |@..(|..&8...|. .|
00000050   7C0428EC 7C040000 4180FFF8 7C0004AC   ||.(.|...A...|...|
00000060   4C00012C 48000038 388000F8 7C0020AC   |L..,H..88...|. .|

Invalid operation interrupt.
>

 ******* Please define the System Console. *******

If you want this display to be the System Console.
 Type a 1 at this terminal and press <Enter>
 
INIT: EXECUTING /sbin/rc.boot 

AIX Base Operating System Installation...
Restoring BOS installation files from tape: rmt0.

                        AIX 3.2 INSTALLATION AND MAINTENANCE




Select the number of the task you want to perform.
  > 1)  Install AIX.
     2)  Install a system that was created with the SMIT
         "Backup The System" function or the "mksysb" command.
     3) Install this system for use with a "/usr" server.
     4) Start a limited function maintenance shell.

Type the number for your selection, then press "Enter" :  4

Type `exit' to return to the main menu

Use the getrootfs command to access the file systems
that reside on the root volume group.

# getrootfs hdisk0 (Enter)
Importing Volume Group ...
rootvg
log redo processing for /dev/rhd4
syncpt record at eeacc
end of log 10f00c
syncpt record at eeacc
syncpt address eeacc
number of log records  = 1310
number of do blocks = 80
number of nodo blocks = 1
/dev/rhd4 (/): ** Unmounted cleanly - Check suppressed
/dev/rhd2 (/usr): ** Unmounted cleanly - Check suppressed
Checking all mounts and the existance of df
Saving special files and device configuration information.
/dev/rhd3 (/tmp): ** Unmounted cleanly - Check suppressed
/dev/rhd9var (/var): ** Unmounted cleanly - Check suppressed
Files systems mounted for maintenance work.
# vi /bin/netcon.rc (Enter)
I don't know what kind of terminal you are on - all I have is `unknown'.
[Using open mode]
"/bin/netcon.rc" 87 lines, 1757 characters
:
Q gets ex command mode, :q leaves vi
:
#
#       Copyright 1995 NetCon Corporation
:q
#
# pwd
/
# vi /bin/netcon.rc
I don't know what kind of terminal you are on - all I have is `unknown'.
[Using open mode]
"/bin/netcon.rc" 87 lines, 1757 characters
:
#
#       Copyright 1995 NetCon Corporation
#       All Rights Reserved
#
#       NetCon start/stop script
#       Usage: netcon [ start | stop ]
#
PROCS="netcpass netcconfig netclink netcvt"

PATH=/bin:/usr/bin:/etc:/usr/etc:/usr/local/bin:/usr/lib/netcon:

case "$1" in
start)
        set `who -r`
        if [ $9 != "S" ]; then
                exit 0
        fi
        echo "Starting NetCon Network Client and Server, Release 5.1"
        echo "Copyright (c) 1992, 1995 NetCon Corp."
        echo "Copyright (c) 1984, 1985, 1986, 1987 Regents of the University of California."
        echo "All rights reserved."
###link
#/usr/bin/netcload -c
#/usr/bin/netcload -c -t /dev/tok0
:x
"/bin/netcon.rc" 87 lines, 1758 characters
# reboot (Enter)




Setting up a savecore (saving a core dump) from a crash

SUN

Edit the "/etc/init.d/sysetup" file and uncomment the "savecore" lines. Make sure you have enough disk space to hold the core in the directory you specify.

SCO

To examine a core image dumped to "/dev/swap", you must have more than 16 MB of memory; otherwise "fsck" will overwrite the core image in the swap area. If this occurs, you must save the core image to tape and reload it after the reboot.

Debugging a PANIC with crash (Examining a core image)

Examples (see the manual pages first; "crash" is different on each system):

SUN

# cd savecoredir (Enter)

# crash -d vmcore.x -n unix.x (Enter)

SCO

# crash -d /dev/swap (Enter)

AIX

# crash (Enter)

To get a back trace:

> t (Enter)

To check stream statistics:

> strstat (Enter)

To examine a process first find the process slot:

> p (Enter)

Then print the user area for the slot number.

> u (Enter)

To see a list of crash functions:

> ? (Enter)

To get help on a function:

> help trace (Enter)

To print a back trace to a file named "/tmp/trace":

> t -w /tmp/trace (Enter)

Debugging NetCon Kernel Code

Use adb or kadb to examine the kernel or turn on debugging code. When debugging is turned on, the output is sent to the system console and to "/usr/adm/messages". The output is voluminous and will slow the system to a crawl, so use it sparingly.

Example:

SUN

# adb -kw /dev/ksyms /dev/mem (Enter)

SCO

# adb -w /unix (Enter)

AIX

# adb -kw /usr/lib/boot/unix /dev/mem (Enter)

To turn on NetCon vnode operation debugging.

vnodeopsdebug/W 1 (Enter)

vnodeopsdebug 0x0 1

To turn off.

vnodeopsdebug/W 0 (Enter)

vnodeopsdebug 0x1 0

Other NetCon debugging symbols.

nsinputdebug/W 1 # IPX/SPX input from the network

nsoutputdebug/W 1 # IPX/SPX output to the network

vnodeopsdebug/W 1 # file system operation entry points

tfs_trace_mask/W 0xffffffff # Client operations

Check system configuration

Example:

# sysdef -i (Enter)

Check module loading with modinfo

Example:

SUN

# modinfo (Enter)

System won't boot after NetCon Removal/Installation

SUN

If NetCon is running and the system is shut down to single-user mode (not RE-BOOTED into single-user mode) and an attempt is made to remove or install NetCon, the kernel may panic and not reboot. This is caused by corruption of the kernel configuration files in the "/etc/" directory. If this should occur, restore the corrupt files from CDROM.

SUN Restoring /etc configuration files from Solaris distribution.

Boot from the cdrom and select "Command Tool" from the "Menu" button.

Copy the files listed below from the cdrom "/etc" to the root file system "/etc" directory.

Example:

# fsck -y /dev/dsk/c0t0d0s0 (Enter)

# mount /dev/dsk/c0t0d0s0 /mnt (Enter)

# cp /etc/xxx /mnt/etc/xxx (Enter)

Copy the following files:

name_to_major

devlink.tab

minor_perm

path_to_inst

driver_aliases
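The copy step can be repeated for each file in the list with a short loop. The following is a minimal sketch, assuming a POSIX shell. The name restore_etc_files is a hypothetical helper, and the source and destination directories are passed as arguments so that nothing is assumed about where the root file system is mounted.

```shell
# Hypothetical helper: copy the five kernel configuration files from
# one "etc" directory to another, stopping at the first failure.
# $1 = source directory, $2 = destination directory
restore_etc_files() {
    src=$1
    dst=$2
    for f in name_to_major devlink.tab minor_perm path_to_inst driver_aliases; do
        cp "$src/$f" "$dst/$f" || return 1
    done
}
```

Usage, for example, after mounting the root file system on "/mnt" as shown above: restore_etc_files /etc /mnt/etc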