stumped on email delivery problem- need help bad

okay, this has been plaguing me for a week now and i thought it was fixed. perhaps not though.

last weekend i changed our email infrastructure to include an smtp relay through which incoming mail would be filtered. the flow of outgoing mail was not altered- our exchange server still connects directly to the internet to send mail. there were, however, changes made to the firewall to accommodate the new setup so perhaps i botched something there.

so, mail comes in to the firewall where it is redirected to a windows server running fluffy the smtp guard dog. that does some spam blocking, etc. and passes the mail on to our exchange server. the exchange server connects straight out to the internet to send. there is symantec AV and ihatespam running on the exchange server itself.

those changes were made 10 days ago, and all last week i was having problems with outgoing email. everything coming in worked great. going out the problem was tricky. we were able to send email to a lot of people. however, it seems that certain entire domains became unreachable. shell.com and hotmail.com to name 2. also, i could send email to my gmail account, but it would take like 15-20 minutes to arrive.

in any case the problem seemed to rest with domains so i thought it could possibly be a DNS issue. i changed the DNS servers used by our mail server from local, root, root, and local to local, ISP, ISP, local. also, i noticed that the firewall was blocking any DNS requests from all but our 2 local DNS servers anyway so i added an allow for the email server. so this morning i come in and the queue on my email server has gone down to a normal level. it was huge on friday. i send my gmail account an email and it arrives in about 3 minutes. i think "great, all fixed."

now i see that we are still having trouble reaching a few domains. i tried a certain domain about 30 minutes ago and the msg has not gotten there yet. now here comes the kick in the pants. from the email server i am able to run nslookup to find the mx records for basically ANY domain that we cannot get email through to. then i can telnet to port 25 of the primary mx record and get their smtp banner, send a helo, and get a hello back. so it seems as if the connection is completely fine. and yet, i am unable to actually get an email through to them.
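For anyone repeating the check above, here is a minimal Python sketch that picks the primary MX out of pasted `nslookup -type=mx` output. The lowest preference value is the host a sending server tries first. The sample text and the example.com hostnames are hypothetical, and the line format is assumed to be the one the Windows nslookup prints.

```python
import re

# Matches lines such as:
#   example.com    MX preference = 10, mail exchanger = mx1.example.com
MX_LINE = re.compile(
    r"MX preference\s*=\s*(\d+),\s*mail exchanger\s*=\s*(\S+)", re.IGNORECASE
)


def parse_mx_records(nslookup_output):
    """Return (preference, host) pairs sorted so the primary MX comes first."""
    records = []
    for line in nslookup_output.splitlines():
        match = MX_LINE.search(line)
        if match:
            records.append((int(match.group(1)), match.group(2).rstrip(".")))
    return sorted(records)


# Hypothetical output pasted from `nslookup -type=mx example.com`
sample = (
    "example.com\tMX preference = 20, mail exchanger = mx2.example.com\n"
    "example.com\tMX preference = 10, mail exchanger = mx1.example.com\n"
)
primary = parse_mx_records(sample)[0]
print(primary)
```

The first entry is the host to telnet to on port 25. If the primary answers but mail still queues, it may be worth trying the secondary hosts too.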

i'm baffled. i just sent an email to my gmail and wildmail accounts. gmail was fine, but i got an unable to send to that recipient msg for my wildmail.com account. running some google searches on the "You do not have permission to send to this recipient." msg i see a lot of posts about our mail server possibly being on an RBL and the recipient mail servers are blocking us using that list. well, i went to rbl.org and MAPS and ran our IP through their RBL checks and they come back clean.
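The web forms on those sites do a DNS lookup that can also be run by hand: reverse the octets of the sending IP, append the blocklist's zone, and see whether the name resolves. A rough Python sketch follows. The zone `dnsbl.example.org` is a placeholder, not a real list, and 192.0.2.25 is a documentation address standing in for the public IP the exchange server sends from.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: IPv4 octets reversed, then the list's zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """True if the zone returns an A record for the reversed IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No answer: not listed on this list, or the lookup itself failed.
        return False


# Placeholder zone and documentation IP, for illustration only.
print(dnsbl_query_name("192.0.2.25", "dnsbl.example.org"))
```

Each blocklist is independent, so coming back clean on two of them does not rule out a listing on a third that the recipient happens to use.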

so again i say it, i'm baffled. can anyone provide some insight? further troubleshooting ideas? i'd greatly appreciate it.

thanks,
billy ocean
 
We had almost the exact same problem with our Exchange 2000 Pro server this summer at the school I work at... we were running a WatchGuard Firebox 1000... I will give my boss an IM and talk to him about what he eventually did to get things working again :confused:
 
are you sure your outgoing is not being filtered?
My only guess is that for some reason your outgoing email is getting sent through the filter.
 
only outgoing filter i have set up (that i know of at least) is that my symantec AV installation is supposed to delete any msgs trying to go out addressed from postmaster.
 
Try routing your outgoing through your ISP and see if it changes anything. Maybe it's the way people are reacting towards your exchange server (some IT admins are extremely paranoid about smaller smtp servers and either block or filter them).
 
okay, i spoke with a tech at another organization we were having trouble reaching a couple times today. he says that he checked his smtp logs and sees that there was a communication this morning between his server and mine (i tried to send him an email). the conversation goes as follows:
my.email: opens connection
his.email: returns banner
my.email: sends helo
his.email: returns helo
my.email: ****does nothing****
his.email: closes connection after waiting for timeout period to pass
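Since the manual telnet test only went as far as HELO, one way to narrow this down is to walk the session a step further from the mail server itself and record where it stops. Below is a minimal sketch using Python's standard `smtplib`. The function name `probe_smtp`, the `smtp_factory` parameter, and the host and addresses in the usage comment are all made up for illustration. No message body is sent.

```python
import smtplib


def probe_smtp(host, sender, recipient, port=25, timeout=30,
               smtp_factory=smtplib.SMTP):
    """Step through HELO, MAIL FROM and RCPT TO, recording each reply."""
    steps = []
    server = smtp_factory(host, port, timeout=timeout)  # connects, reads banner
    try:
        commands = (
            ("HELO", server.helo),
            ("MAIL FROM", lambda: server.mail(sender)),
            ("RCPT TO", lambda: server.rcpt(recipient)),
        )
        for name, send in commands:
            try:
                code, reply = send()
            except (smtplib.SMTPException, OSError) as exc:
                # Timeout or dropped connection: record where it happened.
                steps.append((name, None, str(exc)))
                break
            steps.append((name, code, reply.decode(errors="replace")))
            if code >= 400:  # 4xx/5xx: the remote server refused this step
                break
    finally:
        try:
            server.quit()
        except (smtplib.SMTPException, OSError):
            pass  # the connection may already be gone
    return steps


# Usage (hypothetical host and addresses):
# for step in probe_smtp("mx1.example.com", "me@example.org", "you@example.com"):
#     print(step)
```

If this gets clean replies all the way through RCPT TO while exchange still stalls after HELO, that would point at the exchange server (or something hooked into its smtp service) rather than at the firewall or the remote end.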

i am running exchange 2000 on a windows 2000 server. anyone have an idea of what would possibly be causing this? the processor on this machine is fairly busy at times, but not to the point where i'm concerned about it. most times i've been checking taskmanager today it's sitting pretty much idle. there is roughly 512MB of free RAM at any given moment (out of 2GB total).

thanks again
 
okay, i found an event log entry saying something about the exchange DBs being fragmented and that i need to stop and restart all exchange services. so as a result i thought i'd check the disk to see if it was fragmented as well. it was. horribly. so i spent yesterday afternoon and most of last night running defrags on the sys partition and the partition with the IS. finally got the exchange partition to a very good condition.

since last night i do not see the major event warnings i saw before. exchange services have been restarted as the event log asked. i don't know if this fragmentation is enough to cause the problem i saw though so i'm not ready to call this a solution.

if anyone has ever seen their exchange server open connections to destination mail servers, but then not send any data please post here because that seems to be exactly my problem.

thanks,
big daddy fatsacks
 
big daddy fatsacks,

Not sure if you have done this already, but defragging the Exchange database requires taking it offline and running a CLI util. Check out this MSKB.
 
yeah, i may have to do that. do you have any idea how frequently the offline defrag is necessary? all it says is if you move users off the server it is a good idea. but is offline defrag also just a part of good periodic maintenance?

just need to decide when i can do this. i've run it when i had to do a restore before and it takes forever.
 
Depends on how heavily your Exchange services are utilized and information is added vs deleted. The Exchange store does not shrink automatically as information is deleted, thus the need for the offline defrag. However, as information is added, it will eventually fill up that void.

Heh.. it's one of those topics that is very specific to the environment you're working in.

As a sidenote, I am evaluating a product called Perfectdisk for Exchange which is supposed to have the ability of automating the process of the offline defrag. I just installed it about a week ago, but I haven't tested the Exchange portion of it yet.
 
well i'm 60% through the mailbox store DB defrag atm. then i'll do the public folder store, and then tomorrow i'll find out if that actually helped at all.

SJConsultant, how is perfectdisk? regardless of the exchange portion i need a better defragger than the windows one. it's telling me disks need defragging, and then when i run it it does nothing. this is after running it like twice in a row and it actually cleaned things up a little. i guess it's just not that great.
 
So far I have tried Diskeeper, O&O, and now Perfectdisk.

Diskeeper - Licensing implications can be very expensive depending on Volume sizes and OSs involved.

Diskeeper has Server Standard and Enterprise Edition @ $249 and $999 respectively. Enterprise edition is required if the environment is running 2k Advanced server *or* if the logical volume sizes are 100GB or more.

Centralized deployment and management requires an additional $99 admin software for each workstation that will be used to manage the network defrag. Pricing alone, IMO, excludes the possibility of using this software for my clients or me.

O&O - Scheduler sometimes fails on network deployment and is very clumsy to work with. I have also run into problems with the O&O defrag service terminating unexpectedly on either large volumes (100GB or more), heavily fragmented drives, or sometimes for no apparent reason. The service terminations happened on both servers and workstations and required a restart. Not even setting the service to restart on failure works.

Admin capabilities are built into the server and workstation versions. Remote deployment can be done through the software, but I have found it to be a bit unreliable.

Because of the termination errors without any resolution from O&O, I can't recommend this product.

Perfectdisk - By far this defragger seems to do the most "work" on initial and subsequent defrags compared to the others. Scheduling network defrags is pretty straightforward, plus I like the ability to watch a defrag in progress from a *remote* system. I have several systems with a combined total of over 600GB in storage space (and growing!), and Perfectdisk has not faltered on any of my systems doing online or offline defrags.

Two things I have yet to test are the Exchange defrags and the ability to install and control Perfectdisk using group policies. Some of the policies that can be defined are:

Computer settings: defrag threshold, network enabled or not, logging of messages, configuring of autoupdate

User settings: whether a user can analyze, defrag, change schedules. You can configure it so users cannot even run PerfectDisk.

All defraggers above are comparable in price, with the two exceptions for Diskeeper (the separate admin console and the Enterprise Edition).

I've only been utilizing PerfectDisk for about two weeks on my own Servers and Workstations, but I'm impressed with the capabilities provided in the package and so far am leaning towards this software as the winner for the best combination of price / performance / and features.

I realize that some people have probably run these products with no problems. However, all of the above is the result of testing these products during their full trial periods, and my results are not necessarily indicative of how well they will perform in other environments.

I wish I had the time to go into a much fuller detail and make actual "lab quality" tests, but for time being, my client demands are taking up much of my time.
 
well that's a pretty thorough review. thanks a lot.

okay, i seem to have gotten this problem under control. i'm wary of declaring it "fixed" but i have been able to get emails through to some addresses i was not able to just a few days ago.

on a related note though- there are messages in my mail queues that should not be there. i see destination domains that no one has business sending mail to. like:
funimg349.biz
imgwtfj32.biz
mx.121mailoffers.org
etc

i also see some messages in the queues from either postmaster or blank senders. i always check my symantec system center console, but no one shows up as having a virus. so that shouldn't be the source of this issue. i'm not really sure how to track down the source of these messages so i can eliminate it with extreme prejudice.

thanks again
 
One thought comes to mind: in my experience, I have noticed an increase in "reverse NDR" spam. If your email system is configured to send out NDRs then this might explain some of them.

Can you check the content of some messages to get a better idea what's coming in or going out that is suspicious?
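
As a starting point for that check, here is a small Python sketch that sorts queued messages into unexpected destination domains and NDR-like senders. NDRs normally go out with a blank (null) sender, which matches the blank and postmaster senders described above. The input format is an assumption: (sender, recipient) pairs copied by hand from the queue viewer. The addresses in the sample are hypothetical, built on the domains mentioned in the post.

```python
from collections import Counter


def triage_queue(entries, expected_domains):
    """Split queued messages into unexpected destinations and NDR-like senders.

    entries: iterable of (sender, recipient) address pairs.
    expected_domains: set of lowercase domains users normally mail.
    """
    unexpected = Counter()
    ndr_like = []
    for sender, recipient in entries:
        domain = recipient.rsplit("@", 1)[-1].lower()
        if domain not in expected_domains:
            unexpected[domain] += 1
        local_part = sender.split("@", 1)[0].lower()
        # A blank sender, "<>", or postmaster suggests a bounce (NDR).
        if sender in ("", "<>") or local_part == "postmaster":
            ndr_like.append((sender, recipient))
    return unexpected, ndr_like


# Hypothetical queue contents, hand-copied for illustration.
queue = [
    ("", "someone@funimg349.biz"),
    ("postmaster@example.org", "someone@imgwtfj32.biz"),
    ("user@example.org", "friend@gmail.com"),
]
unexpected, ndr_like = triage_queue(queue, {"gmail.com"})
print(unexpected.most_common())
print(len(ndr_like), "NDR-like messages")
```

If most of the odd destinations also turn out to be NDR-like, that would fit the reverse NDR explanation better than an infected workstation.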
 