SQL Event ID 18456: Login failed for user Reason: Token-based server access validation failed

If you get event ID 18456 with Source MSSQLSERVER in your application and SQL logs with the verbiage of "Login failed for user 'domain\service.account.you.use'. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors" somebody may have messed with your "CONNECT SQL" permissions.
Everybody who reads this blog knows that I'm not a SQL guy, but a lot of advice on the internet suggests that this is UAC related so some of you Platform and AD folks might get asked about the error. Especially considering that the error suggests an issue with tokens and authentication, rather than permissions.
Though this isn't the only potential root cause, a deny or revoke or lack of grant can cause this. Especially on 2008 R2. In 2012 a lot of rules changed in regards to grouping but in 08R2 the loss of this would potentially cause various outages.
In our case what was happening was the configuration server was no longer accessible by the various SQL servers and this was appearing on that server every minute.
Try this:
select * from sys.server_permissions
and
select * from sys.server_principals
This may turn up that builtin\administrators or the service account (or some other important security principle) has a specific "deny" on "connect sql" or… revoke, or even a simple lack of grant.

Event ID 1057 – The Terminal Server has failed to create a new self signed certificate

If you receive Event ID 1057 – "The Terminal Server has failed to create a new self signed certificate to be used for Terminal Server authentication on SSL connections. The relevant status code was Key not valid for use in specified state" from source TerminalServices-RemoteConnectionManager in the System event log, you may have an issue with a lot of strange advice. For me, none of which worked. I finally figured out the problem.

The conditions you'll probably also notice is that you can't remote desktop into the server until you remove the "Allow connection only from computers running Remote Desktop with Network Level Authentication" checkbox in the Remote Desktop Session Host Configuration's RDP-Tcp properties General Tab or from the System settings under the Remote tab by changing the radio button back to "Allow connections from computers running any version of Remote Desktop (less secure)".

In my case I had already tried a lot of the advice like deleting the self-signed certificate and rebooting (MMC/Certificates/Local Computer/Remote Desktop) And deleting these keys and restarting:
“HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server\RCM”  > Certificate “HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server\RCM” > CertificateOld “HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server\WinStations” > SelfSignedCertificate

I also deleted the Host Configuration's RDP-Tcp connection object all together and restarted the Remote Desktop Services service.

What did finally work, I noticed that we had a bunch of crypto keys that looked like this:
C:\ProgramData\Microsoft\Crypto\RSA\MachineKeys\f686aace6942fb7f7ceb231212eef4a4_XXXXXXXX

I moved them all to a subfolder so there were none left in the MachineKeys folder. I then opened the MachineKeys and re-applied the full-control permission to the local server administrators group. (Security/Advanced/Change Permissions/Replace all child object permissions) and applied this.

I then restarted the Remote Desktop Services service and this time I didn't get the error about the certificate. I changed the security setting for RDP back to secure and was able to log on through Remote Desktop.

Mitigating the Risk of Lateral Movement

In the news recently, you’ve probably read a lot about Pass the Hash.

I wanted to talk about another similar security threat that is often outlined in these whitepapers, and what we did at my customer to solve the problem.

In just about every large organization, servers and workstations are built from a standard image to save time. The problem is, those "standard" images come with a "standard" password for the local Administrator account. Oh I'm certain each of you reading this protect your Domain Admin accounts, right? Why? Because they have (in most environments) administrative rights to every computer on your network.

Guess what else has admin access to every computer on your network if you use a standardized password for your local Administrator accounts? You guessed it. All it takes is that one password which is probably the oldest one in your environment. A password which everyone who deploys servers or troubleshoots them when they're offline probably has access to. From one machine a malicious user with that information can jump laterally to every other server in your network.

There are three things all of us should do to eliminate this risk. Two of which you're probably already familiar with and perhaps are already doing. First, rename the built-in local administrator account. Second, restrict these accounts from network logons and RDP access (just use your DRAC, iLO, or crash cart). But the third is more complex, you need to make sure every single machine on your network has a unique and non-stale password for the local Administrator account.

So that's a cinch, right? Just log one-by-one into every computer in your environment. Make up some random password. Write it down on your super, um, secret-and-undoubtedly-secure Excel spreadsheet and hope nobody ever loses or misplaces that document. Riiiight. Well, what if you already had a tool that was secure, standardized, scalable,highly available, distributed, and with the capability to store stuff just like this? Well, I know one and it is called Active Directory.

A solution that I recently ran across was developed by one of our own MCS family members, and it elegantly solves this issue. It is called AdmPwd. It automatically scrambles and stores all your local Administrator passwords and stores them in AD in case you need them later. It then manages the passwords for you, changing them every 30 days (configurable). It is easy (for the right person) to find the passwords when needed and lets you force the computers to change their passwords should that become necessary. For example a local hardware admin out in a remote branch is working a server outage and needs login access. You can look up and give them that password and then easily flip a "switch" that will force the computer to change its password again the next time it is back on the network.

AdmPwd is written by MCS engineer Jiri Formacek. It is an open source GPO/CSE/PowerShell solution that is very simple to deploy, but nonetheless a very powerful tool with very few moving parts. I had this deployed in my lab in under an hour, and into production at my customer in a day.

The idea is ingenious in its simplicity. Every account on your domain exists in Active Directory as a computer security principle. And every computer in your domain has a local Administrator account, right? So that's the premise of this idea – to hook these two up.

At a high level all you need to do to deploy this solution is:

  • Download and read the AdmPwd manual from code.msdn.microsoft.com

  • Download and install the application on a management server (very small footprint)

  • Run a simple PowerShell script that automatically extends your schema *(1)

  • Create a GPO for this solution, attach it to the OU where your computers are, and run a simple PowerShell script to register that GPO with AdmPwd *(2)

  • Edit the GPO with your preferences (password complexity, etc)

  • Run a simple PowerShell script that gives users who need access to view/reset the password the appropriate rights

  • Deploy the client side extension (CSE) to your servers and workstations *(3)

    *(1) The two schema attributes which are added are confidential, meaning they are restricted from just any low-level admin or network user with LDP to query. You will, obviously, need to be a Schema Admin for a few minutes while you run this PowerShell script.

    *(2) If you're using a Central Store for your ADMX files, don't forget to copy the template and language file over.

    *(3) There are a lot of ways to do this. In reality it is a simple DLL file that needs registered. You could either do this via script, GPO scheduled task (if you need it fast), or use the same GPO your registered to silently deploy the included installer on the next reboot of the computer. I recommend the latter as any new computers added to your network will pick this up going forward.

     

    Some extra geek'y info FAQ. Feel free to read on if you're the one who will hands-on implement this:

     

    Q: Does it encrypt the passwords in AD or is it stored in the evil clear text?

    A: It is "evil" clear text. But it is stored in an attribute with the confidential searchFlags attribute set (specifically 0x288).

     

    Q: OMG WHY?!?!? Why is this thing in evil clear text in a directory database? (where's my tinfoil?)

    A: Because if your standard users can query confidential-marked attributes in your AD environment then you have a much bigger problem than lateral movement attacks. Actually I make fun but encrypted passwords will eventually be added to this solution to protect against offline attacks on the database. It is low priority however because again, you have MUCH bigger problems at this point.

     

    Q: We renamed the local Administrator account on all our computers via GPO (or manually). Will this solution still work?

    A: Yes, it uses then well-known GUID, not the display name of the account.

     

    Q: We have fourteen local admin accounts on all of our servers. Will this work for all of them?

    A: OMG why do you have fourteen local admin accounts? Your servers don't store any of my personal information on them do they? No, this is a solution for the local Administrator account's password. You might check into Managed Service Accounts and Group Managed Service Accounts.

     

    Q: We keep our computer accounts in the Computers container, will this still work?

    A: Yes, but you can't attach the GPO to the Computers container because… well, it's a container. Not an OU. So you'll need to link the GPO to the domain and that should work just fine.

     

    Q: Will this work on Domain Controllers?

    A: Sort of. Domain Controllers don't have local Administrator accounts, they have the Directory Service Restore Mode account. So it will "appear" to work, in that AdmPwd will populate a value. But you can't laterally attack anything with a DSRM account so it is MOOT.

     

    Q: Why create a new GPO rather than leveraging an existing one?

    A: This didn't work for me and I don't know why yet. The developer and I are working on it. But it worked great as soon as I created a new GPO so I'd start there.

     

    Q: It isn't working. Where do I go to see if it is trying to register?

    A: The Application Log on the client. If there aren’t events under source: AdmPwd saying things like "unable to register password, check permissions" it is possible the CSE is sitting idle. Change this registry key to beef up logging: HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\GPExtensions\{D76B9641-3288-4f75-942D-087DE603E3EA}\ExtensionDebugLevel=2  With this on you can see it running through its execution path about every five minutes. If it isn’t, your GPO likely isn’t registered. Also run a gpresult /scope computer /H PolicyResults.htm and then view that document to be sure you’re picking up the policies you set.