Jan 20

I set up dhcpd and tftp just infrequently enough to forget the details. I’m putting my gotchas here so I don’t forget them.

syslinux package ‘pxelinux’:
pxelinux loads and gets the right IP, then it fails with
the error “tftp server does not support tsize option”

Fix:

in file /etc/dhcpd.conf:

# Absolutely critical to have the next-server line for tftp booting.
# When you get the "tftp server does not support tsize option" error,
# it's because you're missing this config line. Double check with:
#          grep next-server     /etc/dhcpd.conf
#    - Tony 10/17/08
next-server 192.168.0.50;
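
The grep in the comment above also matches a next-server line that has been commented out. A slightly stricter check might look like the sketch below. The function name has_next_server is made up for illustration, and the config path is passed in as an argument rather than assumed.

```shell
# Hypothetical helper: succeed only if the file contains an active
# (uncommented) next-server line.
has_next_server () {
    # Leading whitespace is allowed; lines that start with '#' do not match.
    grep -Eq '^[[:space:]]*next-server[[:space:]]+[^;[:space:]]+;' "$1"
}

# Usage: has_next_server /etc/dhcpd.conf || echo "next-server is missing"
```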

Troubleshooting:

1] When setting up tftpd, make sure there are no entries like
this in the /etc/hosts file:

127.0.1.1      joust.famemobile.com joust

If there are, change them to this:

192.168.1.155   joust.famemobile.com joust
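
A quick way to spot such entries, as a rough sketch: the helper name loopback_entries is hypothetical, and it only looks for lines whose address starts with 127. and that list the given host name.

```shell
# Hypothetical helper: print hosts-file lines that map the given
# host name to a 127.x.x.x address.
loopback_entries () {
    # $1 = host name to look for, $2 = hosts file (defaults to /etc/hosts)
    awk -v host="$1" '
        $1 ~ /^127\./ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break            # rest of the line is a comment
                if ($i == host) { print; break }
            }
        }' "${2:-/etc/hosts}"
}
```

Running `loopback_entries joust` on the machine above should print nothing once the entry has been changed.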

2] Using tcpdump for tftp troubleshooting

The fact that loading pxelinux.0 succeeds made me think everything else should work.

The pxelinux.0 loads fine, but the config file ‘pxelinux.cfg/01-00-0c-29-c4-b0-5a’ does not.

05:27:20.882329 IP (tos 0x0, ttl 20, id 2, offset 0, flags [none], proto: UDP (17), length: 55) 192.168.0.51.ah-esp-encap > 192.168.0.50.tftp: [udp sum ok] 27 RRQ "pxelinux.0" octet tsize 0
05:27:20.893400 IP (tos 0x0, ttl 20, id 4, offset 0, flags [none], proto: UDP (17), length: 60) 192.168.0.51.acp-port > 192.168.0.50.tftp: [udp sum ok] 32 RRQ "pxelinux.0" octet blksize 1456
05:27:20.953322 IP (tos 0x0, ttl 20, id 29, offset 0, flags [none], proto: UDP (17), length: 91) 192.168.0.51.57089 > 0.0.0.0.tftp: 63 RRQ "pxelinux.cfg/01-00-0c-29-c4-b0-5a" octet tsize 0 blks
… stuff cut out…
05:27:20.972168 IP (tos 0x0, ttl 18, id 44911, offset 0, flags [none], proto: UDP (17), length: 54) 0.0.0.0.tftp > 192.168.0.51.57089: [udp sum ok] 26 ERROR tftp-err-#8 " tsize option required"

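The “0.0.0.0.tftp” is the indicator that something is wrong: the request for the config file goes to 0.0.0.0 instead of the tftp server at 192.168.0.50, which is what the missing next-server line causes.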
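
A capture like this can be saved to a file and searched for that symptom. The sketch below assumes the interface is called eth0 and that the capture was saved as text. The helper name requests_to_unset_server is made up for illustration.

```shell
# Capture TFTP requests verbosely (run as root; eth0 is an assumed interface name):
#   tcpdump -v -i eth0 udp port 69 > /tmp/tftp.log

# Hypothetical helper: print captured lines in which a client sends a
# request to 0.0.0.0 instead of a real TFTP server address.
# Matches both the named port (.tftp) and the numeric one (.69).
requests_to_unset_server () {
    grep -E '> 0\.0\.0\.0\.(tftp|69):' "$1"
}
```

Any output from `requests_to_unset_server /tmp/tftp.log` is a hint to go back and check the next-server line.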

written by admin

Jan 07

I always seem to need a tmp file. I used to do ‘vi /tmp/foo’, but it usually had something in it from last time. This function opens a new file and stores the file name in $f.

I use it like:

vt
<paste some stuff, clean it up>
perl -pe 's/foo/bar/' "$f"

####
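function vt () {
    for i in $(seq 0 255); do
        FILE=/tmp/$USER-foo-$i
        if [ -f "$FILE" ]; then
            # Slot already in use: print a dot and try the next one.
            echo -n '.'
        else
            # First free slot: remember it in $f, edit it, then print its name.
            f=$FILE
            vi "$FILE"
            echo "$FILE"
            return
        fi
    done
}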

###### Cleanup
function cleanvt () {
    for i in $(seq 0 255); do
        FILE=/tmp/$USER-foo-$i
        if [ -f "$FILE" ]; then
            # Remove each file vt created, printing a dot per file.
            echo -n '.'
            rm "$FILE"
        else
            # Stop at the first unused slot.
            echo
            return
        fi
    done
    echo
}
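
A variation on the same idea, sketched with mktemp instead of the seq loop. This is a swapped-in technique, not the function above: the name vtm and the VT_EDITOR variable are hypothetical. mktemp picks an unused name itself, so there is no scan for a free slot.

```shell
# Hypothetical variant of vt: let mktemp choose an unused file name.
vtm () {
    # Create the file first so the name cannot collide with an old one.
    f=$(mktemp "/tmp/${USER:-user}-foo-XXXXXX") || return 1
    # VT_EDITOR is an assumed override; the default is vi, as in vt.
    "${VT_EDITOR:-vi}" "$f"
    echo "$f"
}
```

Files created this way have random suffixes, so cleanvt will not find them; they have to be removed by hand.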

written by admin

Sep 28

No matter how good an admin you are, you’ll eventually delete something by accident. I don’t like ‘rm -i’; it’s too much. Instead I type ‘#’ before the rm command. Tab completion still works, but if I mess up and hit tab-enter (like I do a lot) the ‘#’ saves me. When the line looks the way I want, I ‘ctrl-a’ to the beginning, delete the ‘#’ and execute the line.

# rm -rf  /etc/sysconfig/network-hold

Once I got used to it, it became reflexive. It adds a couple of extra characters, but it doesn’t break my admin flow.

written by admin

Aug 11

It’s challenging to hire a great ops person. How do you judge in an interview?

If I want an A player, I ask hard questions, set high expectations and request their commitment to the team’s needs. Even then I make mistakes, but at least I have their commitment. I can work with that if their performance has to be brought up.

Here’s who I have in mind when I’m interviewing:

  1. They know they can’t win the first time, so they’ll keep trying.
  2. They are not looking to change careers or directions – they should have been customer-service-oriented technical people for the last 7 years at least.
  3. They hate work and dedicate themselves to eliminating it.
  4. They love people.
  5. They are students of their work.
  6. They are disciplined and self-motivated.
  7. They accept work – this is an all-day, every-day job. We have a “No Slashdot” policy.
    1. If you don’t have something else to do, you are required to ask the NOC if you can help with a problem or same-day ticket.
    2. Do customer follow-up mails.
    3. Eliminate work for someone.
      i. Write a script.
      ii. Write a tool that lets business users see data they’ve never seen.
    4. “Widen the Moat” so we can make gains against downtime.
    5. Attempt to reduce the complexity of something to the appliance level.
    6. Read customer service and management books.
    7. Make (or review) a list of people you owe things.
      i. Ask them if they are getting what they need.
      ii. Ask them how you’ll know when they have exactly what they need.
      iii. If there is no clear way to deliver what they want – tell them so.
      iv. If you are going to drop their concern – let them know.
    8. Put goals, meetings and other important work things on the calendar for the next month, quarter and year.
    9. Invent drills to make sure we are where we should be in emergency response and disaster recovery.
    10. Write a new monitor and figure out how to make it supportable forever.
  8. Bad uses of time:
    1. Changing the degree of transparency of your xterms for the 100th time.
    2. Second-guessing management any more than one level above you.
    3. Pretending you can’t affect the direction of the management one level above you.
    4. Internet reading – even if it’s “background” research on technical things (digg, Slashdot, boing-boing, etc. have nothing to do with what we do).
    5. Writing documentation no one but you can interpret.

written by admin