Server 2003 memory leak horror...

Discussion in 'Operating Systems' started by Linuxtim, Jan 30, 2006.

  1. Linuxtim

    Linuxtim Limp Gawd

    Feb 26, 2003
    Got a big problem with Server 2003 randomly crashing. It just stops talking to the network and requires a reboot. The event log is full of 1009 errors, then a few 2019 errors (cannot page from non-paged area, I think), and then it stops responding to network traffic.

    Thing is, this is affecting at least 15 of 53 servers and it seems f*cking random. Also, about one new server a week starts doing this after it has been fine for 3 or so months. All are Server 2003, not SP1 though (I could not see any reason to upgrade to SP1 - correct me if needed!).

    Servers run DHCP, DNS, AD, McAfee 8.0 Enterprise (with patch 11) and Veritas 9.1 or 10 (fully patched). I updated to the latest Intel Pro 1000 MT drivers (10.2 I think) but I still see 1009 errors (although a few fewer of the 2019s that hang boxes).

    So - anyone else seen these sorts of errors on server 2003 and any idea how to fix it? It's starting to get desperate! We just upgraded from NT4 and find that server 2003 seems more unstable! Our users are starting to get concerned.

  2. SJConsultant

    SJConsultant 2[H]4U

    Jan 14, 2004
    If you want any real help you're going to need to post more details.

    Helpful information would include *all* details when referring to event logs and that includes errors and warnings from application and system logs.

    What makes you think it is a memory leak?

    Have you even considered contacting Microsoft PSS?

    I would think if this is happening on so many machines you would want to solve the issue ASAP rather than rely on forums which could take days.
  3. pigster

    pigster [H]ard|Gawd

    Jul 24, 2004
    I agree. If 15 of 50-some servers are affected, you really should be talking to Microsoft.
  4. Erich in Az

    Erich in Az [H]Lite

    Oct 24, 2005
    If you know it's on a cycle, watch resources as the time goes on and look for the memory hog (lsass?). Try restarting some services and see what reduces the memory usage. When you find the one, it should be pretty obvious. ;)

    I've seen a couple of Symantec products do this (ITA and AV) and we ended up setting up a script to bounce the service every night until a fix was made. Good luck!
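
    As a rough sketch of the "watch resources as time goes on" idea: if you log per-process memory at intervals, you can rank processes by how much they grew between the first and last reading. The sample format, the function name rank_by_growth and the process name backup_agent.exe are all made up for illustration; how you collect the numbers on your servers is up to you.

```python
from typing import Dict, List, Tuple

# Hypothetical samples: one dict per polling interval, mapping a process
# name to its memory use in KB. All numbers are invented for illustration.
Sample = Dict[str, int]


def rank_by_growth(samples: List[Sample]) -> List[Tuple[str, int]]:
    """Return (process, growth in KB) pairs, largest growth first.

    Growth is the last reading minus the first reading, counting only
    processes that appear in both the first and the last sample.
    """
    if len(samples) < 2:
        return []
    first, last = samples[0], samples[-1]
    growth = [(name, last[name] - first[name]) for name in first if name in last]
    return sorted(growth, key=lambda pair: pair[1], reverse=True)


samples = [
    {"lsass.exe": 20_000, "svchost.exe": 15_000, "backup_agent.exe": 30_000},
    {"lsass.exe": 21_000, "svchost.exe": 15_200, "backup_agent.exe": 90_000},
    {"lsass.exe": 20_500, "svchost.exe": 15_100, "backup_agent.exe": 180_000},
]
ranked = rank_by_growth(samples)
print(ranked[0])  # the process that grew the most over the sampling window
```

    A process that keeps climbing across many samples is the one to try restarting first.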
  5. oakfan52

    oakfan52 [H]ard|Gawd

    Oct 5, 2003
    can you post the two errors?

    on one of the questionable servers wait a day. then open task manager.
    customize the view to include:

    User Name
    Paged Pool
    Non-Paged Pool
    Handle Count
    Thread Count

    Now look at most services.
    svchost typically uses 1424 handles and my Outlook 1333. I'd be looking for the process that's taking way too many thread handles. I'd say you might see one that is over 100,000 before the server crashes. You should be able to identify the source of the leak and move on from there.
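
    If you copy those Task Manager columns into a list, a minimal sketch like this would flag the outliers. The ProcessStats type, the flag_suspects function, the 10,000 handle limit and the leaky_service.exe entry are assumptions for the example, not anything Windows gives you directly; only the svchost and Outlook handle counts come from the numbers above.

```python
from typing import List, NamedTuple


class ProcessStats(NamedTuple):
    """One row of a hypothetical Task Manager snapshot."""
    name: str
    handles: int
    threads: int
    nonpaged_pool_kb: int


def flag_suspects(snapshot: List[ProcessStats], handle_limit: int = 10_000) -> List[str]:
    """Return names of processes above handle_limit, highest handle count first.

    handle_limit is an arbitrary assumption; tune it to what is normal
    on a healthy server of the same role.
    """
    over = [p for p in snapshot if p.handles > handle_limit]
    over.sort(key=lambda p: p.handles, reverse=True)
    return [p.name for p in over]


# Invented snapshot; thread and pool figures are placeholders.
snapshot = [
    ProcessStats("svchost.exe", 1424, 60, 90),
    ProcessStats("outlook.exe", 1333, 25, 40),
    ProcessStats("leaky_service.exe", 120_000, 30, 55_000),
]
suspects = flag_suspects(snapshot)
print(suspects)
```

    Comparing a snapshot from right after a reboot with one from a day later makes the limit easier to pick.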
  6. oakfan52

    oakfan52 [H]ard|Gawd

    Oct 5, 2003
    ... (double post =\)
  7. MorfiusX

    MorfiusX 2[H]4U

    Feb 13, 2004
    I would try updating one of them to SP1 to see if it fixes the problem. SP1 fixed a number of small bugs. It's definitely worth a shot.
  8. mbrewthx

    mbrewthx [H]Lite

    Feb 29, 2004
    I would look at your backup in addition to some of the other recommendations.
    Check the logs in Backup Exec for errors, and also see whether your backups are running while people are trying to use the network.

    Let us know, and also give Microsoft a call.