Azure Monitor, with a Log Analytics workspace as its log data store and Kusto Query Language as its query language, is the standard observability platform for Azure workloads.

The KQL query language

Kusto Query Language (KQL) is the query language Azure Monitor uses for Log Analytics and Application Insights. Its pipe-based syntax (table | where | summarize | render) is concise and expressive for log and telemetry queries: each stage takes the tabular output of the previous stage as its input. KQL is optimised for log analytics: full-text search, time-series summarisation, join operations across telemetry sources, and geospatial queries. Learning KQL is a prerequisite for effective Azure observability, because log-based alerts, workbooks, and the log tiles on Azure portal dashboards are all built on KQL queries.
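As a minimal sketch of the pipe-based shape described above, the Python helper below assembles a KQL query stage by stage. The helper name and its parameters are hypothetical, and the query assumes a workspace-based Application Insights resource, where request telemetry lands in the AppRequests table with TimeGenerated, Success, and Name columns. The helper only builds the query text; running it would need the Logs query editor or a client library such as azure-monitor-query.

```python
def failed_requests_query(lookback_hours: int = 1, bin_minutes: int = 5) -> str:
    """Build a KQL query counting failed requests per time bin and operation.

    Hypothetical helper for illustration; table and column names assume
    workspace-based Application Insights (AppRequests).
    """
    if lookback_hours <= 0 or bin_minutes <= 0:
        raise ValueError("lookback_hours and bin_minutes must be positive")

    stages = [
        "AppRequests",                                    # table
        f"where TimeGenerated > ago({lookback_hours}h)",  # where: time filter
        "where Success == false",                         # where: failures only
        "summarize FailedCount = count() "
        f"by bin(TimeGenerated, {bin_minutes}m), Name",   # summarize
        "render timechart",                               # render
    ]
    # Each stage consumes the tabular output of the stage before it.
    return "\n| ".join(stages)


print(failed_requests_query())
```

Listing the stages separately keeps the table | where | summarize | render order visible, so a filter can be added or removed without touching the rest of the query.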

Application Insights for application telemetry

Application Insights auto-instruments .NET, Node.js, Python, and Java applications to collect HTTP request metrics (rate, latency, failure), dependency calls (database, HTTP, Service Bus), and exceptions; custom events are added from application code. The Application Map visualises dependencies between services based on this telemetry. Availability tests (HTTP requests issued from multiple Azure regions) monitor endpoint availability from outside the application.
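To illustrate how a dependency map can be derived from telemetry, the sketch below aggregates dependency-call records into edges with a call count and failure rate. It is a simplified stand-in, not the way Application Insights builds the Application Map; the record fields (caller, target, success) and the service names are assumptions made for the example.

```python
from collections import defaultdict


def application_map_edges(dependency_calls):
    """Aggregate (caller, target) pairs into edges with call count and failure rate.

    Illustrative only: field names are assumed, not an Application Insights schema.
    """
    totals = defaultdict(lambda: {"calls": 0, "failures": 0})
    for call in dependency_calls:
        edge = totals[(call["caller"], call["target"])]
        edge["calls"] += 1
        if not call["success"]:
            edge["failures"] += 1

    return {
        pair: {"calls": s["calls"], "failure_rate": s["failures"] / s["calls"]}
        for pair, s in totals.items()
    }


# Hypothetical telemetry: a web front end calling an API, which calls a database.
sample = [
    {"caller": "web", "target": "orders-api", "success": True},
    {"caller": "web", "target": "orders-api", "success": False},
    {"caller": "orders-api", "target": "sql-db", "success": True},
]
edges = application_map_edges(sample)
for (caller, target), stats in edges.items():
    print(f"{caller} -> {target}: {stats}")
```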

Diagnostic settings and resource logs

Most Azure resources can emit resource logs and metrics to a Log Analytics workspace via diagnostic settings. Enabling diagnostic settings for AKS (kube-audit, cluster-autoscaler logs), Azure SQL (query performance, blocking), and Azure Front Door (access logs, health probe logs) feeds the data into Log Analytics for unified querying. A sensible Azure Monitor baseline is to enable diagnostic settings on all critical resources and to configure retention to meet compliance requirements.
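A diagnostic setting is, in essence, a small document that names a destination workspace and the log and metric categories to send. The sketch below builds such a document in the shape used by the Microsoft.Insights/diagnosticSettings resource (properties.workspaceId, logs, metrics). The helper name is hypothetical, the workspace ID contains placeholders to be filled in, and the available categories vary by resource type, so the values should be checked against the resource's documentation before use.

```python
import json


def diagnostic_setting_body(workspace_resource_id, log_categories, include_metrics=True):
    """Build a diagnostic-setting payload routing the given categories to a workspace.

    Hypothetical helper: it only constructs the payload and does not call Azure.
    """
    body = {
        "properties": {
            "workspaceId": workspace_resource_id,
            "logs": [{"category": c, "enabled": True} for c in log_categories],
        }
    }
    if include_metrics:
        # "AllMetrics" is the usual metric category; confirm it for each resource type.
        body["properties"]["metrics"] = [{"category": "AllMetrics", "enabled": True}]
    return body


# Placeholder workspace ID; replace the <...> segments with real values.
workspace = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>"
)
aks_body = diagnostic_setting_body(workspace, ["kube-audit", "cluster-autoscaler"])
print(json.dumps(aks_body, indent=2))
```

Generating the payload from a list of categories makes it easier to apply the same baseline to every critical resource, whether through templates, a CLI, or policy.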

Azure Alerts and action groups

Azure Monitor alerts fire on metric thresholds, log query results, and resource health events. Action groups define the notification and remediation actions taken when an alert fires: send an email, trigger a webhook, call a Logic App, or create an ITSM incident. Static-threshold metric alerts are simple but blind to workload changes; dynamic thresholds (ML-based) adapt the alert threshold to the workload's historical patterns, reducing false positives from routine traffic peaks.
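The difference between a fixed and an adaptive threshold can be sketched in a few lines. The example below compares a static threshold with a rolling mean plus a multiple of the standard deviation. This rolling band is a deliberately simple stand-in chosen for illustration; it is not the model Azure Monitor uses for dynamic thresholds, and the function names, window, and sensitivity values are assumptions.

```python
from statistics import mean, stdev


def static_breaches(values, threshold):
    """Indices where the metric exceeds a fixed threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


def adaptive_breaches(values, window=6, sensitivity=3.0):
    """Indices where the metric exceeds mean + sensitivity * stdev of the
    preceding `window` samples (a simple stand-in for a learned threshold)."""
    breaches = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        upper = mean(history) + sensitivity * stdev(history)
        if values[i] > upper:
            breaches.append(i)
    return breaches


# Hypothetical requests-per-minute series: steady growth, then one real anomaly.
traffic = [100, 110, 120, 130, 140, 150, 160, 170, 180, 400]

print(static_breaches(traffic, threshold=150))  # [6, 7, 8, 9]
print(adaptive_breaches(traffic))               # [9]
```

With the fixed threshold, ordinary growth past 150 keeps the alert firing, while the adaptive band follows the trend and flags only the jump to 400.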