Network Authentication issue?

Mackintire

OK so here's a real weird one.

We have three AD DCs: two in the datacenter and one locally.


A couple of our users cannot:

  • Install a program that authenticates against AD
  • Create Mapped drives on their workstations

All of these users have full local Admin rights on their workstations.

I've restarted all the AD/DCs

I've checked and modified local security policy a couple of different ways.

I've checked the Kerberos authentication logs for errors.

On the static machines I've individually tested each of the DNS servers with no issues.



Here's the weird part:

Take those same machines and move them into our static IP range, 10.10.1.2-254, and boom: things start working.


WTH!!!!



Example of our network

10.10.1.1 /24 (256 addresses): static, Win 2003 R2

10.10.2.1 /23 (512 addresses): DHCP via an ASA 5505 at a remote site

10.10.4.4 /23 (512 addresses): DHCP, Win 2003 R2. Some of the addresses are excluded at the datacenter and the rest are excluded from distribution locally, so that machines don't go nuts when an outage cuts them off from the DHCP server at the datacenter.

10.10.8.1 /22 (1024 addresses): DHCP, Win 2003 R2. Not in use currently.
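To sanity-check the ranges listed above, here is a minimal Python sketch using the standard `ipaddress` module. It assumes each listed address is a host inside its subnet (the post gives host-style addresses such as 10.10.1.1 rather than network addresses), and the helper name `subnet_of` is purely illustrative.

```python
import ipaddress

# Addresses and prefix lengths as listed in the post; strict=False lets
# ipaddress derive the enclosing network from a host-style address.
LISTED = ["10.10.1.1/24", "10.10.2.1/23", "10.10.4.4/23", "10.10.8.1/22"]
NETWORKS = [ipaddress.ip_network(entry, strict=False) for entry in LISTED]

def subnet_of(address):
    """Return the listed subnet containing address, or None if none does."""
    ip = ipaddress.ip_address(address)
    for net in NETWORKS:
        if ip in net:
            return net
    return None

# Print each network with its dotted mask and total address count.
for net in NETWORKS:
    print(net, net.netmask, net.num_addresses)

print(subnet_of("10.10.5.200"))   # a DHCP client address
print(subnet_of("10.12.44.20"))   # a datacenter DC address
```

Running an address through `subnet_of` is a quick way to see which of these ranges, if any, a given machine actually sits in.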
 
There is a lot of data missing from your post, but the above makes me think your firewall is blocking a port or three, causing the issues.
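One way to test the blocked-port theory is to try plain TCP connections from an affected workstation to a DC. The sketch below is only an illustration: the port list covers ports commonly cited for AD clients and is not exhaustive (RPC also uses dynamic high ports, and several of these services use UDP as well), and the function names are made up for this example.

```python
import socket

# Commonly cited TCP ports for AD clients. Not exhaustive.
AD_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC endpoint mapper",
    389: "LDAP",
    445: "SMB",
}

def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered (timed out), or unreachable all count as closed.
        return False

def check_dc(host):
    """Map each service name to whether its port accepted a connection."""
    return {name: tcp_port_open(host, port) for port, name in AD_TCP_PORTS.items()}
```

Calling `check_dc("10.10.1.10")` once from a DHCP address and once from a static address, then comparing the two results, would show whether any of these ports behave differently between the subnets.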
 
Well, you say that the problems go away when you put the workstation on the "static" network. What are they on otherwise? Do they have a DC on the network segment with them? Is dhcp properly configured with your somewhat odd masks?

What I'm driving at is that the symptoms lead me to suspect a layer 2 or 3 issue; you are having a communication issue with your DCs.
 


The local network has one domain controller, but I'm not sure if I should explain it that way. We are connected to the datacenter by a 100Mb fiber ring. The other two DCs are located there, but it is logically one network.


The static (subnet) network includes all three DCs. 10.10.1.x

The normal DHCP address range for the workstations in question is 10.10.4.100-10.10.5.254, assuming the local DC is online and giving out addresses (which it appears to be).

The physically local DC has one NIC with two IP addresses, 10.10.1.10 and 10.10.4.3.
Both addresses are set to serve DNS requests.

This local DC is in both the 10.10.1.x range (statically assigned addresses) and the 10.10.4.x range (DHCP addresses).
The address pool is 10.10.4.11 to 10.10.5.254, with 10.10.4.11 to 10.10.4.99 excluded from distribution.
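As a rough check on how many leases that pool can actually hand out, here is a small Python sketch. It assumes both ranges are inclusive and that there is only the one exclusion range mentioned; `lease_count` is an illustrative helper, not part of any DHCP tool.

```python
import ipaddress

def lease_count(pool_start, pool_end, excl_start, excl_end):
    """Count addresses in an inclusive pool minus one inclusive exclusion range."""
    def to_int(addr):
        return int(ipaddress.ip_address(addr))

    pool = range(to_int(pool_start), to_int(pool_end) + 1)
    excluded = range(to_int(excl_start), to_int(excl_end) + 1)
    return sum(1 for addr in pool if addr not in excluded)

# Pool and exclusion as described in the post.
print(lease_count("10.10.4.11", "10.10.5.254", "10.10.4.11", "10.10.4.99"))  # 411
```

The result lines up with the 10.10.4.100-10.10.5.254 working range quoted earlier in the post.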



The datacenter DCs are 10.12.44.20 and 10.12.44.21 and are routed via a private VLAN using Juniper/Cisco equipment controlled by the ISP.

One of the DCs at the datacenter holds the PDC role.

Can they ping the DCs when they are on DHCP?

All the machines can ping all the DCs when on DHCP. nslookup also appears to be working in both configurations.
 
What's really strange.... Try to use the mapped network drive function with alternate credentials and it will fail. Try to connect to the same share using cached credentials....no problem.

Log in as user B, delete the credentials of user A, log off user B, log user A back on, connect via UNC to a network share... no problem.

Try to connect again using mapped drives and alternate credentials....fails (even using domain admin credentials!)

Domain Admins are set to have full local Admin access on these boxes.
 
I'm getting a messy idea of how the network is set up; maybe you should expound upon that. It sounds like you have two networks onsite: your static /24 network and your DHCP /23 network. Are you trunking both from the datacenter? Or is the /24 local only?

Are your workstations ( when doing dhcp ) in the same vlan/broadcast domain as a DC? I assume from your other data that there is one, but I want to be clear on this point.
 
Both are trunked to and from the datacenter.

The workstations are in 10.10.4.xx
One of the DCs is 10.10.4.3 so yes, we are in the same broadcast domain.

We only have 3 VLANs:

VLAN 1 = Default
VLAN 2 = IP phones
VLAN 3 = guest wireless
 
I think this may be an AD/DNS integration issue.

When trying to authenticate via InstallShield (using known good credentials) we receive the message "the server can not be found".

I'm going to go through some replication verification steps tomorrow... just to be sure.
 
The clients might be trying to connect to the .1 side of the local DC. Can they talk to the .1 side while on the .4 network?

This network design is "interesting"
 
How would you test this?

I know the clients can ping the .1 side and nslookup works. Changing the DNS to the .1 address still lets things route somewhat normally... "normally" meaning the behaviors I previously explained do not change.


Last time this issue occurred, it disappeared 3 days later.

We're going to be migrating our domain from Server 2003 R2 to Server 2008 R2 in a week, and I'm praying that after the migration these stupid issues disappear.

I plan on posting another thread on suggestions for DNS setup when you have 3 DCs.
 
This network design is "interesting"
Agreed, and I don't think it's a coincidence that we have an "interesting" network topology AND the symptoms being described suggest a root cause of connectivity.

OP, the problem is there is just so much going on that it's hard to isolate. In order to fully diagnose, we'd have to get our fingers in there to get real data about what's going on. However, as I mentioned above, the symptoms you have relayed suggest we are looking at a connectivity issue, so I'd start there. Look at firewalls (software or otherwise) and routers. Given your non-standard subnets, I'd check your DHCP parameters as well, to ensure it's handing out the correct subnet mask.
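Checking the mask a client received can be done by hand from ipconfig output, or with a few lines of Python. This is a minimal sketch; the client address and masks below are example values chosen to match the /23 DHCP subnet described in the thread, not readings from a real machine.

```python
import ipaddress

def mask_matches(client_ip, client_mask, expected_network):
    """True if the leased IP and mask place the client in expected_network."""
    iface = ipaddress.ip_interface(f"{client_ip}/{client_mask}")
    return iface.network == ipaddress.ip_network(expected_network)

# A client holding the correct /23 mask.
print(mask_matches("10.10.5.20", "255.255.254.0", "10.10.4.0/23"))
# The same client if the scope handed out a /24 mask by mistake.
print(mask_matches("10.10.5.20", "255.255.255.0", "10.10.4.0/23"))
```

A client with the wrong mask would believe 10.10.4.x hosts are on another network and send that traffic to its gateway instead of directly, which is the kind of subtle misrouting worth ruling out.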

Given what you've described, merely upgrading the DCs probably won't help.

As far as DNS setup with DCs, it's fairly straightforward. Just make sure they are AD integrated (not required, but it does make things easier to work with), and you're set.
 
This local DC is in both the 10.10.1.x range (of static assigned addresses) and the 10.10.4.x range of DHCP addresses

So your server is acting as a router between subnets or you have something else that does routing? Or do you do no routing and just have dual homed servers?

That sounds like a problem, if that's the case. You need a normal router.
 
When I looked at Microsoft's best practices, there were two different ways to set up DNS. In one method, all DNS servers have the same single DNS server set as the primary in their TCP/IP settings. In the other, each DNS server has its own IP set as the preferred (primary) DNS.
 
We have Alcatel 6850s with L3 turned on acting as the gateway for each subnet.


Currently, on the local DC, the ONE physical adapter has TWO IP addresses set up on it.
 
Rule of thumb: DCs point to themselves for DNS services. However, things can go strange in a hurry with corporate policies and infrastructure demands. The overriding concept to keep in mind is to push the data sources as close to the data consumers as possible.

And your local DC has a single network card with two IP addresses? Both IPs are in the same subnet, right? So why is it set up like that?

EDIT: Ok, just reread the thread; is the 10.10.4.0/23 subnet being delivered to the local DC? I have to ask why it's set up like this. There is no technical reason why the local DC needs to be on both subnets; it seems needlessly complex and further reinforces my belief that we are seeing a connectivity issue... somewhere.

EDIT2: Ok, I just caught that all DCs are in the static subnet (10.10.1.0/24), as well as, apparently, their respective subnets. Why was it set up like this? I think the answer to this question will go a long way in explaining the problems you are having.
 
ok...

So I'll answer your question and then I'll tell you what I just modified.

10.10.4.0/23 is the DHCP range that is mostly being delivered by the local DC. The reason is: twice in the past 4 months we experienced a connectivity issue with the data center. 250+ machines in this building depend on DHCP for the 10.10.4.0/23 network. Want to guess what happened when the DHCP at the data center went away? The local DC's DHCP got maxed out and about 60 machines went to 169.254.x.x with no connectivity. Now we have most of the 10.10.4.0/23 subnet being addressed locally. The DCs in the datacenter are addressed 10.12.44.xx and routed by the ISP.


Now I just found both addresses of the local DC registered in DNS (which is a big no-no). I've deleted the 10.10.4.x IP from the machine and taken it out of DNS on all three DCs.
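A check like this can be scripted so a stray record doesn't creep back in. The sketch below resolves a name and flags any address outside the subnet the DC is supposed to live in; it assumes the DC should only be registered in 10.10.1.0/24, and both function names are illustrative.

```python
import ipaddress
import socket

def resolve_ipv4(hostname):
    """Return the sorted set of IPv4 addresses the resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

def stray_addresses(addresses, expected_subnet):
    """Return the addresses that fall outside expected_subnet."""
    net = ipaddress.ip_network(expected_subnet)
    return [a for a in addresses if ipaddress.ip_address(a) not in net]

# The situation described above: the local DC registered under both of its IPs.
print(stray_addresses(["10.10.1.10", "10.10.4.3"], "10.10.1.0/24"))
```

In practice the address list would come from `resolve_ipv4()` called with the DC's real hostname, run against each of the three DNS servers in turn.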


I guess the current Admin (my boss) believed that without a DC existing in every subnet, AD would time out and/or there would be some sort of DNS-related issues. So far everything I've tested is now working again. I'm awaiting confirmation from the development guys that everything is fixed.


Evidently sometimes AD is dependent on DNS...and sometimes DNS is dependent on AD... it all depends on how you structure your DNS server.

If DNS is dependent on AD and you misconfigure DNS, you occasionally get what is referred to as DNS islands.

If AD is dependent on DNS and you misconfigure DNS, you occasionally get... and I'm making this name up, "AD islands", where authentication on some machines goes off in the weeds and the DCs don't respond or don't receive requests. (I'm not sure which occurs, but the result is the same.)
 
Yikes. Um. Hmm, how can I say this in the nicest way possible....

I can't. Sorry, this is going to seem mean, and I apologize for that, but your environment is a mess and needs serious professional attention. I maintain that the issues you are seeing here are communication based, but given everything else I wouldn't even know where to start looking for them. Everything from your network topology to your AD configuration would need a serious going over to ensure reliability.

Good luck. You'll need it.
 
Oh I'm digging up everything. Next week we are updating new DCs to 2008 R2. I'm coming up with a plan for DNS setup that follows Microsoft's best practices.

One thing at a time.
 
Honestly? Were I you, I'd put off upgrading until you have your network fixed and the DNS scheme already in place. Upgrading introduces a large number of variables to the equation; it's not something you do when your network is, essentially, a huge question mark. And it's certainly not a "fix" you implement to resolve that.

Look at it like this: Let's say you upgrade, and suddenly NO ONE can authenticate. Where's the problem? Anyone know? With your current network, it would be a nightmare to diagnose.

Fix your topology first, get your DNS issues resolved, then upgrade.
 
Unfortunately I'm not the boss. We have two contracted network engineers assisting in the migration. Worst thing that could happen would be the DC with the PDC emulator role gets hosed.

I think the plan is to bring the new DCs online in 2003 functionality level and elevate the DCs to 2008 mode after everything has been hashed out.

I'm watching, learning and fixing things as I come to understand what it is they are doing and why. We have some fantastic guys who know a ton about administration and system architecture. The problem is none of those guys run the network. They are all in development and get paid far more to be there. I can consult with them over lunch, but they have other things to worry about, as the network is not their problem unless it directly affects their roles.

Enterprise 151....class begins, fun fun fun.
 
You're not the boss.

But you know DNS is broken. So you do your due diligence and run dcdiag and all the pre-upgrade checks Microsoft recommends. They will fail.

Then you write a (polite) email to the boss, with a BCC to your personal address.

To Boss,
Based upon Microsoft's best practices, we have run tests X, Y, and Z. Test X has failed with # errors. If we proceed with the upgrade, we will be operating against the manufacturer's recommendations, and the results of the upgrade cannot be predicted.

Should we proceed with the upgrade against Microsoft's recommendations?

Signed,
You

That's it, you don't need a bunch of details unless you're asked. And you don't do anything until you get an answer.

And the worst thing is not that you have a broken PDC. The worst thing is that your active DNS is broken, none of your DCs will boot properly, and no one can log in. You'll call Microsoft and spend a minimum of 3 hours on the phone until it's fixed.
 
Actually it's not as bad as you would think.
dcdiag passes.

I got my hands on a bat file over at petri.co.il that includes dcdiag -v and other functions and dumps them out to a txt file.
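For anyone sifting through a dump like that, a few lines of Python can pull the pass/fail lines out of the text file. This is only a sketch: it assumes the report contains dcdiag's usual "passed test" / "failed test" summary lines, and the sample text is made up for illustration, not taken from this domain.

```python
import re

# Assumed shape of a dcdiag summary line: "... <target> passed test <Name>"
RESULT_LINE = re.compile(r"(\S+) (passed|failed) test (\S+)")

def summarize(report_text):
    """Return (passed, failed) lists of (target, test_name) tuples."""
    passed, failed = [], []
    for match in RESULT_LINE.finditer(report_text):
        target, outcome, test = match.groups()
        (passed if outcome == "passed" else failed).append((target, test))
    return passed, failed

# Made-up sample in the assumed format.
SAMPLE = """\
......................... DC1 passed test Connectivity
......................... DC1 failed test Replications
......................... DC1 passed test NetLogons
"""

passed, failed = summarize(SAMPLE)
print(len(passed), "passed;", "failed:", failed)
```

Reading the real report with `open(path).read()` and passing it to `summarize` gives a short list of failures to work through one at a time.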

I found 18 errors and fixed 16 of them. The two remaining errors are above our domain. Since our provider has enterprise domain control and we are only domain admins, there's not much I can do to resolve those two issues. I will let them know the issues exist at our Friday morning meeting.


As far as I can tell, two of the DNS servers are configured functionally correctly.

The third is now mostly correct.

The DHCP scope issues are going to go away.

I talked him into creating a synchronized local (backup) DHCP server with AD installed in read-only mode on our virtual cluster. It's cheap insurance and it's there for DHCP redundancy and a read-only GC and nothing more.

I also found out our Alcatel switches have some sort of intelligent DHCP relay that forwards DHCP calls to the data center only if the local DHCP server does not respond within a certain timeout period. That would be why I haven't seen any conflicts of users showing up on the other DHCP.

All that is going away anyhow when we get rid of the superscope and put one DHCP on each network.

I don't accept fair or ok... I want this configured correctly and I will continue to be a squeaky wheel until it gets resolved correctly.
 