Module 1: Introduction to Google Cloud Monitoring Tools
- Understand the purpose and capabilities of Google Cloud operations-focused components: Logging, Monitoring, Error Reporting, and Service Monitoring.
- Understand the purpose and capabilities of Google Cloud components focused on application performance management: Debugger, Trace, and Profiler.
Module 2: Avoiding Customer Pain
- Build a monitoring foundation on the four golden signals: latency, traffic, errors, and saturation.
- Measure customer pain with SLIs.
- Define critical performance measures.
- Create and use SLOs and SLAs.
- Achieve harmony between developers and operations with error budgets.
Module 3: Alerting Policies
- Develop alerting strategies.
- Define alerting policies.
- Add notification channels.
- Identify types of alerts and common uses for each.
- Construct and alert on resource groups.
- Manage alerting policies programmatically.
Module 4: Monitoring Critical Systems
- Choose best-practice monitoring project architectures.
- Differentiate Cloud IAM roles for monitoring.
- Use the default dashboards appropriately.
- Build custom dashboards to show resource consumption and application load.
- Define uptime checks to track liveness and latency.
Module 5: Configuring Google Cloud Services for Observability
- Integrate logging and monitoring agents into Compute Engine VMs and images.
- Enable and utilize Kubernetes Monitoring.
- Extend and clarify Kubernetes monitoring with Prometheus.
- Expose custom metrics through code and with the help of OpenCensus.
Module 6: Advanced Logging and Analysis
- Identify and choose among resource tagging approaches.
- Define log sinks (inclusion filters) and exclusion filters.
- Create metrics based on logs.
- Define custom metrics.
- Link application errors to Logging using Error Reporting.
- Export logs to BigQuery.
Module 7: Monitoring Network Security and Audit Logs
- Collect and analyze VPC Flow logs and Firewall Rules logs.
- Enable and monitor Packet Mirroring.
- Explain the capabilities of Network Intelligence Center.
- Use Admin Activity audit logs to track changes to the configuration or metadata of resources.
- Use Data Access audit logs to track accesses or changes to user-provided resource data.
- Use System Event audit logs to track Google Cloud administrative actions.
Module 8: Managing Incidents
- Define incident management roles and communication channels.
- Mitigate incident impact.
- Troubleshoot root causes.
- Resolve incidents.
- Document incidents in a post-mortem process.
Module 9: Investigating Application Performance Issues
- Debug production code to correct code defects.
- Trace latency through layers of service interaction to eliminate performance bottlenecks.
- Profile and identify resource-intensive functions in an application.
Module 10: Optimizing the Costs of Monitoring
- Analyze resource utilization costs for monitoring-related components within Google Cloud.
- Implement best practices for controlling the cost of monitoring within Google Cloud.