NetApp Authentication / Encryption problem

lcpiper

Alright my mates, time to shine.

Here is the Background:

FAS2040-4, ONTAP 8.2.3P3 in 7-Mode.

It was up and running fine, serving CIFS and NFS, with a pair of Win 2008 R2 DCs. The build and the entire enterprise are actually only a couple of weeks old, so there are sure to be some unknowns and not-done-yets.

Here is the problem:

Last week the team built two new Win 2012 DCs. They are AGM builds, so really locked down tight security-wise. On a scan they are pulling over 90% on a STIG eval out of the box.

But when they demoted the old 2008 R2 DCs my filers went to hell. No connections for CIFS or NFS. Turns out, for one thing, I wasn't configured for LDAP, but the new DCs require it, and they are running NTLMv2.

So after working with NetApp all day I think I have the LDAP issue worked out. Problem is, I can connect to a share by hostname, but when I try to connect by IP, the password comes back incorrect for the first 3 attempts, and as the attempts run their course to the full 10, it locks my user account in AD and I have to reset it.

Kerberos looks OK, no time mismatch. And I am under the gun for a solution.

Any ideas?

OH, and it's a military dev network so no connections to the world so no tech support except what I can get over the phone.
 
Is CIFS any better than it used to be? I have a 2020 at work and can only get a max of about 30-40MB/s with CIFS.

Sounds to me like NetApp is still just as crappy as it used to be. I am hoping that is not the case, as my location is going to be getting a new NetApp setup next year.

The old HP setup we had worked really well and was way quicker than the NetApp 2020 they gave me to replace it with when I ran out of space. Oh, and it was a dream to set up compared to the NetApp.

I've got my current NetApp set up so that the shares are done through a 2012 R2 fileserver. So much faster than any other way I tried.
 
We moved all the real file shares to other storage, but CIFS still works just fine for the little jobs, like moving the next version of ONTAP onto the filer or logfile storage. That being said, I do like managing NetApp filers for NFS exports to host VM storage, a task I feel they do just fine at.
 
OK, it's taken me longer than it should have, but I have found some resolution and I am thinking the rest isn't a problem with my NetApp filers.

First off, early in the troubleshooting process we identified that LDAP needed to be configured, and as part of this we created a new service account for the NetApp to use for such things. At that time I decided to run a fresh CIFS setup command and use this new service account. That is where I think I suffered a dumb attack, because I believe I set the Security Style to NTFS instead of Unix, probably because I had CIFS on the brain. What I didn't realize is that this resets the Security Style on all existing storage volumes to NTFS, and that is what killed the vCenter connections for the datastores and caused the NFS disconnect.

Now, although the problem with NFS was the more serious one, NetApp support had convinced me to stay with fixing CIFS because they thought fixing one would probably fix the other. Not the case as it turns out, but it sounded reasonable at the time. So as we worked through the CIFS and LDAP issue we reached the point above, where we could connect to CIFS shares by hostname but not by IP. I spent much of yesterday trying to chase down some sort of NTLM/Kerberos authentication ghost, and I have reached a point today where I just no longer think that everything is correctable from my NetApp. I think we have another issue with the domain controllers that is creating this odd behavior.

Just thought I would keep those who are interested updated on this. Call it professional discourtesy if you will :ROFLMAO:
 
With the hostname (FQDN), authentication is probably using Kerberos rather than NTLM. When you use the IP it will use NTLM. It sounds like the array isn't using NTLMv2, since it sounds like the domain controllers are set to only send and receive NTLMv2. I would work with NetApp support to make sure that the array is only using NTLMv2 for NTLM authentication. You could also get a packet capture from the array side to verify which version of NTLM is being used.
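
To make the hostname-versus-IP point concrete, here is a minimal Python sketch of the rule of thumb described above. It assumes default Windows client behavior, where Kerberos needs a service principal name built from a hostname and an IP literal therefore falls back to NTLM. The function name and the example targets are made up for illustration, and a hostname can still fall back to NTLM if no matching SPN is registered.

```python
import ipaddress


def likely_auth_protocol(target: str) -> str:
    """Guess which protocol a default Windows client would try first.

    Rule of thumb only: an IP literal gives the client no hostname to
    build a Kerberos service principal name from, so it uses NTLM.
    """
    try:
        ipaddress.ip_address(target)  # raises ValueError for hostnames
    except ValueError:
        return "Kerberos"
    return "NTLM"


# Hypothetical targets, not taken from the thread.
print(likely_auth_protocol("filer01.example.mil"))
print(likely_auth_protocol("10.0.0.5"))
```

This is why a share can open fine by name while the same credentials fail by IP: the two paths can be exercising two different authentication protocols.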
 
I suspect this is correct, or that although the DC is enforcing NTLMv2, something required for NTLMv2 to function correctly is not configured properly.

Two new discoveries.

The LDAP command getXXbyYY ................. can be used to check what LDAP is able to retrieve from the domain controller regarding hostnames, IPs, and test if passwords are being pulled for user accounts properly.

I get inconsistent results from these commands, which again point to problems when IPs are being used for communications with the DC. Furthermore, we found that when logging in to the vCenter server, if the user enters the system hostname and user domain account and checks the box to use Windows session credentials, the login will fail, but if you enter username@domainname and the password, or domainname\username and the password, the login is successful.

So we must have some underlying issues with the DCs and until those are resolved I think I am spinning my wheels on the NetApp filer.


XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX Whoot

Alright, the DC admin tracked down his problem: it was an NTLM setting that was refusing requests that are not NTLMv2. As soon as he backed that off so that the DCs didn't refuse the lower requests, we were in, and the vCenter server as well. Once he got that changed, all I had to do was go through both my controllers and make sure all my settings and config files matched, and then both were good to go.
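
The thread never names the exact setting. One plausible candidate, and this is an assumption, is the "Network security: LAN Manager authentication level" policy, stored as the LmCompatibilityLevel registry value. The Python sketch below only tabulates what a domain controller accepts at each documented level; the helper name is hypothetical.

```python
# Protocols a domain controller accepts at each LmCompatibilityLevel (0-5).
# Assumption: this is the kind of setting the DC admin changed; the thread
# does not say which one it actually was.
ALL = frozenset({"LM", "NTLM", "NTLMv2"})
DC_ACCEPTS = {
    0: ALL,
    1: ALL,
    2: ALL,
    3: ALL,
    4: frozenset({"NTLM", "NTLMv2"}),  # refuses LM
    5: frozenset({"NTLMv2"}),          # refuses LM and NTLM
}


def dc_accepts(level: int, protocol: str) -> bool:
    """Return True if a DC at this level accepts the given protocol."""
    return protocol in DC_ACCEPTS[level]


# A client that only speaks NTLM (v1) is refused at level 5 but not below 5.
for level in sorted(DC_ACCEPTS):
    print(level, dc_accepts(level, "NTLM"))
```

Under that assumption, a DC at the strictest level would reject anything that is not NTLMv2, which would match the symptoms here: Kerberos by hostname works, NTLM by IP gets refused and eventually locks the account.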

Good to Go


Sometimes it's the little victories

Oh, and it's just a little satisfying to know that the same guy who is the AD admin was acting in the Boss's shoes when he chewed me out over my approach to troubleshooting so the fact that it was his shit that was weak, well anyway.


XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
Usually you have to add an @domain to log in, and you have to make sure your domain controller is where all the DNS entries are resolved by modifying the hosts file. There is an easy way to fix it, but you will have to look through TechNet for the easy solution. Don't forget that Unix file systems use Unicode and Windows needs to be set to ASCII, Unicode or UTF-8, and the default US install is ASCII long file names for 2008. The later ones swap to UTF-8 or Unicode. My guess is that the next major update after the Windows 10 Anniversary Update will attempt to build the code path off a Unicode-based compiler. Intel has a few, Microsoft has one that they use for the international builds, and I am sure AMD and Google have one. Google's default compilers used UTF-8, not Unicode, and toss too many characters for memory space reasons.

Yes, before someone whines, I know what a resolved domain name is [email protected]. Your database will still see user @ hosting server domain, not client @ hosting server name, because it needs an entry in the SQL database, where the client and the logged-on user are two different entries. Hopefully this helps add to what the other guys said. grin.
 