Azure Stack – Identifying and Deleting False Alerts in 1.1807.0.76

Issue: After updating from 1.1805.7.57 (the 1805 hotfix), you may see alerts in the portal that are not actually true. Upon further investigation the alerts appear to be false, but they cannot be closed from the portal.
Version Impacted: 1.1807.0.76 (only seems to be present when upgrading from 1.1805.7.57)
Microsoft Response: Being unable to clear the alerts after updating to 1807 is currently a known issue and fixes are under investigation by PG. There is no resolution at this time. Once we have an action plan or patch available we will reach out and can reassign to work with the customer in their time zone.

The two alerts I received specifically were the following:

  • Azure Stack update stopped with errors – Critical – Capacity
  • Activation Required – Warning – Azure Bridge

Identifying that the alerts shown are false

  • Alert – Azure Stack update stopped with errors
    • Verified the update tile showed a completed or successful update
    • Also verified from the privileged endpoint (PEP) that the update completed by running Get-AzureStackUpdateStatus (see the sketch after this list)

  • Alert – Activation Required
    • During the registration step, we received registration successful and Azure Stack activated notifications
    • This is also verified by browsing the marketplace from Azure Stack
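
As referenced above, here's a rough sketch of running that check through the PEP; the ERCS IP address and CloudAdmin credential below are placeholders for your own environment:

$cred = Get-Credential -Message "CloudAdmin credentials"
# 10.0.0.5 is a placeholder - use the IP address of one of your ERCS VMs
$pep = New-PSSession -ComputerName "10.0.0.5" -ConfigurationName PrivilegedEndpoint -Credential $cred
# Confirm the last update run completed
Invoke-Command -Session $pep -ScriptBlock { Get-AzureStackUpdateStatus }
Remove-PSSession $pep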

Now that we have verified the alerts shown are false, it’s time to manually clear them!

Manually deleting the active (false) alerts

# Navigate to the downloaded folder and import the Connect PowerShell module
cd C:\AzureStack-Tools-master
Set-ExecutionPolicy RemoteSigned
Import-Module .\Connect\AzureStack.Connect.psm1
# For Azure Stack development kit, this value is set to https://management.local.azurestack.external. To get this value for Azure Stack integrated systems, contact your service provider.
$ArmEndpoint = "https://management.local.azurestack.external"
# For Azure Stack development kit, this value is set to https://graph.windows.net/. To get this value for Azure Stack integrated systems, contact your service provider.
$GraphAudience = "https://graph.windows.net/"
# Register an AzureRM environment that targets your Azure Stack instance
Add-AzureRMEnvironment `
  -Name "AzureStackAdmin" `
  -ArmEndpoint $ArmEndpoint
# Set the GraphEndpointResourceId value
Set-AzureRmEnvironment `
  -Name "AzureStackAdmin" `
  -GraphAudience $GraphAudience
# Get the Active Directory tenantId that is used to deploy Azure Stack
$TenantID = Get-AzsDirectoryTenantId `
  -AADTenantName "contoso.onmicrosoft.com" `
  -EnvironmentName "AzureStackAdmin"
# Sign in to your environment
Login-AzureRmAccount `
  -EnvironmentName "AzureStackAdmin" `
  -TenantId $TenantID
  • Delete any and all active alerts by using the following PowerShell script:
Import-Module AzureStack
Write-Host -ForegroundColor Cyan "Getting active alerts"
$alerts = Get-AzsAlert -Filter "(Properties/State eq 'Active')"
$alerts | Format-Table AlertId, Title, State, Severity, ImpactedResourceDisplayName, CreatedTimestamp, ClosedTimestamp -AutoSize
Write-Host -ForegroundColor Cyan "Closing all active alerts"
foreach ($alert in $alerts) {
    Write-Host ("Closing alert {0} - {1}" -f $alert.AlertId, $alert.Title)
    Close-AzsAlert -Name $alert.AlertId -Verbose -Force
}
  • Verify the alerts are now gone from the Admin Portal (you can also confirm from PowerShell with the check below)
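
If you also want to double-check from PowerShell, the following should return nothing once every alert has been closed:

# Should return no results once all active alerts are closed
Get-AzsAlert | Where-Object { $_.State -eq 'Active' }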

vRealize Orchestrator – Resolving the ${message} blue screen issue

Here’s an issue that frustrated me for a while until I was able to finally resolve it. If you’re here reading this too, I feel your pain… Hopefully this helps you out as well!

I’m deploying vRealize Orchestrator (vRO) 7.3 in our lab for testing as I continue to build our cloud environment. To help detail the issue we have been having, I’ll provide a quick overview of our environment.

For our cloud, we have three separate environments:

  • Core
    • Management nodes (NSX mgr, AD, DNS, SQL, PSCs for vCenter, and vCenter)
  • Automation
    • vRealize suite (vRO, vRA, IaaS, SQL, PSC for Auto environment)
  • Networking
    • NSX load balancer, ESGs, DLRs

During the initial vRO configuration, you configure it as standalone and then choose your authentication method. We are using vSphere authentication, which authenticates via the PSC (Platform Services Controller) in the Auto environment. We have a single SSO domain with relationships set up between the Core PSC and the Auto PSC.

Now that I've set the premise, let's talk about the issue at hand. During the vRO standalone configuration, if you are using a load balancer you have to change the hostname to your LB VIP for vRO. Then on the next screen you configure your authentication source. We're using vSphere authentication and set it to our Automation PSC. Once complete, you're taken straight into the Control Center using the root account. If you log out at any point, you may encounter the following issue when trying to browse back to the Control Center (https://vro1.domain.local:8283/vco-controlcenter):

vro-issue-sso.jpg

Here's what I realized after seeing this issue and attempting various failed fixes: we had missed a step during our NSX load balancer configuration. Since the hostname was set to the vRO VIP and the authentication source was now set to our PSC, SSO was trying to authenticate via the VIP rather than the local node. This led us back to NSX, where we had to configure another virtual server for port 8283 as well as a pool for our two vRO nodes.

Here’s what we ended up configuring on the NSX end:

NSX Virtual Server on the Load Balancer

vro-nsxlb-virtualserver.jpg

NSX Pool on the Load Balancer

vro-nsxlb-pool.jpg
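
If you'd rather script these load balancer pieces than click through the UI, here's a rough sketch using the community PowerNSX module; the edge name, node IPs, and VIP address below are placeholders for our lab values, so adjust them (and add health monitoring) to fit your environment:

# Assumes PowerNSX is installed and you have already run Connect-NsxServer
$lb = Get-NsxEdge "auto-lb-edge" | Get-NsxLoadBalancer
# TCP application profile for the Control Center port
$ccProfile = $lb | New-NsxLoadBalancerApplicationProfile -Name "vro-cc-tcp" -Type TCP
# Pool containing both vRO nodes on port 8283
$vro1 = New-NsxLoadBalancerMemberSpec -Name "vro1" -IpAddress "192.168.10.11" -Port 8283
$vro2 = New-NsxLoadBalancerMemberSpec -Name "vro2" -IpAddress "192.168.10.12" -Port 8283
$pool = $lb | New-NsxLoadBalancerPool -Name "vro-cc-pool" -Algorithm round-robin -Memberspec $vro1,$vro2
# Virtual server on the vRO VIP for port 8283, pointing at the pool
$lb | Add-NsxLoadBalancerVip -Name "vro-cc-vip" -IpAddress "192.168.10.10" -Protocol tcp -Port 8283 -ApplicationProfile $ccProfile -DefaultPool $pool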

Once that was in place, I was able to get to the vRO Control Center using the VIP address. I was also able to join the second node to the cluster and, after applying our SSL certificate, verify that all was good on that end!

vRO-cluster-configured.jpg

Learning Puppet

In this Learning Puppet series, you will see my thoughts and notes as I work to teach myself Puppet in my own home lab. Any issues I come across will be documented here, along with any notes I want to keep as a record of the journey.

I hope you find this series helpful, and I challenge you to take on this task with me as well! I have absolutely zero experience using Puppet as a configuration management tool. In the past, I have used various shell or PowerShell scripts to automate configuration tasks. Puppet provides an easier and better way to standardize configuration across your infrastructure.

I will be using the Puppet Learning VM along with the book "Puppet 4.10 Beginner's Guide" by John Arundel. The book is fairly dated, but after reading several reviews I feel it is a good place to start, and we can catch up to the latest release (5.x) from there.

To begin, let's discuss the lab configuration. My initial thought is to start small with the Learning VM and a small local instance. Once I'm comfortable enough, I would like to test out a larger instance using AWS. Puppet has a Pay As You Go Enterprise offering on AWS that is free for 1–10 nodes. Even though the licensing may be free at that scale, the compute/storage to run it won't be, which is why I will hold off on AWS until much later.

Home lab configuration:

  • Puppet Master
    • Ubuntu 16.04.2 LTS (Xenial)
    • 6GB RAM
    • 2 x CPU
    • 100GB storage
  • Puppet nodes
    • Ubuntu 16.04.2 LTS (Xenial)
      • DNS1 – DNS server for the home lab
      • WEB1 – Apache/PHP server for testing
      • DB1 – MySQL DB server for testing