For Windows-related problems, see WindowsDisasterRecovery.

Prerequisites for our network

  • Power
  • Network connectivity (from CAC, watch CAC Alerts).
    • The uplink to the campus network (u.washington.edu servers) and to the Internet also matters for some services (email, web, etc.).
    • If the network is dead, call the CAC NOC at 206-221-6000.
  • DNS (from department, non-local forwarded to CAC)

Our locations

Note that some areas may be OK while others are out:

  • 128.95.66 subnet - Annex, Jim's office
  • 128.95.180 subnet - Harris 321 server room, shared with PBIO
  • 128.95.228 subnet - T170B/165

Wireless is on a separate UW CAC subnet.
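
When an alert or a log line gives only an address, a small lookup can show which area it belongs to. The sketch below is a hypothetical helper (the name `location_for_ip` is not part of any existing SIG script) and assumes each subnet listed above is a /24. Anything else, including wireless addresses, is reported as unknown.

```shell
#!/bin/sh
# Hypothetical helper: map an IPv4 address to one of the areas listed above.
# Assumption: each listed subnet is a /24, so matching the first three octets is enough.
location_for_ip() {
    case $1 in
        128.95.66.*)  echo "Annex, Jim's office" ;;
        128.95.180.*) echo "Harris 321 server room" ;;
        128.95.228.*) echo "T170B/165" ;;
        *)            echo "unknown" ;;
    esac
}
```

For example, `location_for_ip 128.95.180.12` prints the Harris 321 line, which suggests starting in the server room.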

Critical SIG Linux Servers

vagal

Symptoms: SIG website down, NFS data down (`ls /usr/local/data/` hangs), CVS down, SIG mailing lists down
  • Location: T165, near the floor to the left of the door; large custom-built rack-mounted server with 16 disk slots
  • What to do: check console for obvious messages, check network connectivity, reboot (unfortunately sometimes the best way to clear NFS hangs)
  • Note: hardware is
    • two 3ware Escalade 8506 HardwareRaid controllers (sda, sdb)
    • boot drive (hde) is SATA connected with HighPoint 1520 PCI card
    • motherboard is a Tyan S2721-533 "Thunder i7501 Pro"
  • When rebooting, always turn the power off and then back on (a "cold boot") rather than resetting, because of a problem in this model of 3ware controller
  • There is a known issue in the Linux 2.6 kernel with NFS and automount that causes infrequent panics; see Bugzilla #117357
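
Running `ls` by hand against a hung NFS mount can leave your own shell stuck. The sketch below is one way to probe a path without blocking. It runs the listing in the background and polls for it, because a process blocked on a hard NFS mount may not respond to signals. The function name `nfs_probe` and the default 5-second limit are assumptions, not part of any existing SIG tooling.

```shell
#!/bin/sh
# Hypothetical helper: report whether listing PATH finishes within LIMIT seconds.
# Usage: nfs_probe PATH [LIMIT]
nfs_probe() {
    path=$1
    limit=${2:-5}
    # Background the listing so a hung mount cannot block this script.
    ls "$path" >/dev/null 2>&1 &
    pid=$!
    waited=0
    # Poll once per second until the listing exits or the limit is reached.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "HUNG $path"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # "OK" only means the listing returned; it does not check the contents.
    echo "OK $path"
    return 0
}
```

A call such as `nfs_probe /usr/local/data/ 10` prints `HUNG /usr/local/data/` if the listing is still running after ten seconds. The stuck `ls` is left behind in that case and goes away when the mount recovers or the machine is rebooted.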

axon

Symptoms: main LDAP server down (if both LDAP servers are down, no one can log in), sig-imap server down

Important SIG Linux Servers

sphenoid

Symptoms: DXBrain down, Interactive Atlases are down, various Wirm repos are down (bmap, eyelab celo)
  • Location: Harris 321, one of the Dell PowerEdge 1750 servers in the rack; clearly labeled "sphenoid"
  • What to do: check console for obvious messages, check network connectivity, reboot
  • Note: this is a production server and does not depend on LDAP or NFS for startup

cuboid

Symptoms: DXBrain CSM queries down, SIG Publications website is down
  • Location: Harris 321, one of the Dell PowerEdge 1750 servers in the rack; clearly labeled "cuboid"
  • What to do: check console for obvious messages, check network connectivity, reboot
  • Note: WIX CSM depends on NFS at startup

xiphoid

Symptoms: FME down, FMA OWL queries don't work
  • Location: Harris 321, one of the Dell PowerEdge 1750 servers in the rack; clearly labeled "xiphoid"
  • What to do: check console for obvious messages, check network connectivity, reboot
  • Note: FME depends on NFS at startup

cuboid

Symptoms: SIG Publications website is down
  • Location: Harris 321, one of the Dell PowerEdge 1750 servers in the rack; clearly labeled "cuboid"
  • What to do: check console for obvious messages, check network connectivity, reboot
  • Note: this is a production server and does not depend on LDAP or NFS for startup

various /home/ exports

Symptoms: `ls /home/someuser` hangs; http://sig.biostr.washington.edu/~someuser/ website down

  • axon:/home/detwiler (FME)
  • stylus:/home/andrew (brain browser)
  • axon:/home/brinkley (Jim's webpage)

  • Note: each user's home directory is kept on the machine that user theoretically uses most often (most Windows users use lamina) and is exported via NFS. Other machines get automount information from LDAP about the location of each home directory.
  • For a complete list, see HomesList or run `ldapsearch -xLLL | grep automountI`
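
When one server is down, it helps to know which home directories it exports. The sketch below filters the `ldapsearch` output for a single host. It is a hypothetical helper (`homes_on_host` is an invented name) and assumes that each `automountInformation` value ends with a `host:/path` field and is neither base64-encoded nor wrapped across lines in the LDIF output.

```shell
#!/bin/sh
# Hypothetical helper: read LDIF on stdin and print the host:/path entries for one host.
# Assumption: the last whitespace-separated field of each
# automountInformation line is the NFS location, e.g. axon:/home/detwiler.
homes_on_host() {
    host=$1
    awk -v host="$host" '
        /^automountInformation:/ {
            loc = $NF
            # Keep the entry only when it starts with "HOST:".
            if (index(loc, host ":") == 1) print loc
        }'
}
```

Usage would look like `ldapsearch -xLLL | homes_on_host axon`, which lists the homes that become unreachable when axon is down.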

Less important

incus, lamina, uvula

Symptoms: several smaller services are down
  • Locations:
    • T165, Precision 530 clearly labeled "incus" (live FMA database)
    • T170B, under back left corner desk (Andrew's), Dell Precision 530s clearly labeled "lamina" and "uvula" (SVN and DSG)
  • What to do: check console for obvious messages, check network connectivity, reboot (unfortunately sometimes the best way to clear NFS hangs)