Reliability Assessment Guidance
Start Here
This repository focuses on business applications that organizations build on the Power Platform. It collects best practices learned and refined by experts in the field.
This overview highlights the principles of Reliability as they pertain to Dynamics 365 and the Power Platform. Each application has unique requirements and a nuanced architecture that can span other technologies and clouds. As such, we make a best effort to detail workloads within the Dynamics 365 and Power Platform ecosystem.
Reliability is an assurance that covers every application layer and tier. It involves multiple teams, including but not limited to Operational and IT groups. The assurance is intended to keep significant problems from crippling the organization.
Some of the quality attributes associated with reliability include:
- Resiliency
- Recovery
- Operational observability
- Simplicity and efficiency
Power Platform provides opportunities to build observability and recovery into each pillar. Power Apps can trigger alerts based on tests or observed patterns. Power Automate has retry mechanisms. Copilot Studio sends telemetry and session information based on conversations.
Each workload that uses any of these pillars should include these capabilities. A workload that doesn’t, especially a business-critical one, could jeopardize your ability to react quickly to problems.
Dynamics 365 provides similar opportunities, such as native redundancy for failover to a secondary location.
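The retry idea mentioned above for Power Automate can be illustrated with a generic sketch. This is not Power Automate's implementation, and in a flow the retry policy is configured in the action's settings rather than written in code. The function name `retry_with_backoff` and its parameters (`max_attempts`, `base_delay`, `sleep`) are hypothetical and exist only for illustration. The sketch shows exponential backoff: each failed attempt waits twice as long as the previous one before trying again.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds or `max_attempts` is reached.

    Hypothetical helper: waits base_delay, 2*base_delay, 4*base_delay, ...
    between attempts. `sleep` is injectable so the delay can be faked in tests.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts:
                raise
            # Exponential backoff: double the wait after every failure.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")  # keeps type checkers satisfied
```

A business-critical workload would typically pair a policy like this with alerting, so that an operation that exhausts its retries is surfaced rather than silently dropped.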
Each guidance page is intended to help frame and answer its question. Each question includes comments, as does each possible response. The structure looks similar to this:
- Comments
- Artifacts, Learnings and General References
- Possible Question Response
  - Comments
  - Note detailing how to answer
  - References
Guidance on Question 1: How do you keep the workload simple and efficient?
Guidance on Question 2: How do you identify and rate the workload’s flows?
Guidance on Question 3: How do you perform failure mode analysis?
Guidance on Question 4: How do you define reliability targets?
Guidance on Question 5: How do you strengthen the resiliency of your workload?
Guidance on Question 6: How do you implement background jobs?
Guidance on Question 7: How do you test your resiliency and availability strategies?
Guidance on Question 8: How do you plan for disaster scenarios?
Guidance on Question 9: How do you plan to monitor health?
Store the CSV file and perform another assessment at an agreed-upon date. Compare the new assessment with the previous one to identify trends.
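Comparing two stored assessments can be scripted. The sketch below is only an illustration: the layout of the exported CSV is not described here, so the column names `question` and `score` are assumptions, as are the helper names `load_scores` and `compare_assessments`. Adjust them to match the actual export.

```python
import csv
import io
from typing import Dict, TextIO


def load_scores(source: TextIO) -> Dict[str, float]:
    """Read an assessment CSV into {question: score}.

    Assumes hypothetical 'question' and 'score' columns.
    """
    reader = csv.DictReader(source)
    return {row["question"]: float(row["score"]) for row in reader}


def compare_assessments(
    previous: Dict[str, float], current: Dict[str, float]
) -> Dict[str, float]:
    """Return the score change for every question present in both assessments."""
    return {q: current[q] - previous[q] for q in previous if q in current}


if __name__ == "__main__":
    # Inline sample data stands in for two stored CSV files.
    earlier = io.StringIO("question,score\nQ1,2\nQ2,4\n")
    later = io.StringIO("question,score\nQ1,3\nQ2,4\n")
    trend = compare_assessments(load_scores(earlier), load_scores(later))
    for question, delta in trend.items():
        print(f"{question}: {delta:+.1f}")
```

A positive change indicates improvement on that question and a negative change indicates regression. Questions that appear in only one of the two files are skipped rather than guessed at.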
Review the artifacts contained in this repository. Review the self-paced training listed below.
Failure Mode Analysis for Mission Critical Applications Template
Microsoft Business Applications Resiliency Shared Responsibility Matrix