Using Machine Learning to Improve AWS Operations
You’ve used Lambda, S3, and RDS to build a successful application on AWS. However, as your application grows, you may start to run into problems related to load and scaling. Is the problem an underperforming service? Are you in the wrong AWS region? Is your application design not up to the task? These issues can be challenging to navigate and may leave you wondering what your next move should be.
In this video, I discuss some solutions that developers and developer groups can consider when facing these types of problems. One solution I suggest is hiring a Site Reliability Engineer (SRE). An SRE is a specialized engineer responsible for ensuring that an application is highly available, reliable, and performing well. SREs work closely with developers and operations teams to identify and resolve issues related to load and scaling. Google popularized the idea of the SRE, and you can find substantial research on the practice.
What if you could apply AWS’s years of experience developing and troubleshooting applications to your own application infrastructure and operations? What if an AWS SRE could look at your CloudWatch data and alerts and direct your developers to trouble spots in your code or your AWS VPC? How much would you pay for that expertise? While that level of professional services may be cost-prohibitive, AWS offers an alternative in Amazon DevOps Guru.
Conclusion
IT infrastructure subject matter expert (Cloud, Virtualization, Network & Storage) praised for transforming IT operations in verticals that include Pharma, Software, Manufacturing, Government and Financial Services. I’ve led projects that include consolidation of multiple data centers and combining disparate global IT operations. “Three letter” Federal agencies have called upon me to lead the modernization of critical IT communication platforms.