BYTE the Cloud | Transcript: Ep. 105 | AWS Auto Scaling Overview & Exam Prep | Mgmt & Governance | SAA-C03

Ep. 105 | AWS Auto Scaling Overview & Exam Prep | Mgmt & Governance | SAA-C03 | AWS Solutions Architect Associate

March 1, 2025 / 22:06/E105

Chris 0:00
Hey there, fellow cloud wizards, welcome to another deep dive designed just for you mid-level cloud engineers who live and breathe AWS. Today we're going deep on a service you're probably using every day, and maybe even thinking about for an upcoming certification exam: AWS Auto Scaling.

Kelly 0:20
We're talking about making sure your applications can handle whatever's thrown at them, from crazy traffic spikes to late-night deployments.

Chris 0:28
Okay, let's break this down. So for anyone new to this, what exactly is Auto Scaling, and why should we care so much about it?

Kelly 0:36
Auto Scaling is like your application's bodyguard in the cloud. It automatically adjusts the number of EC2 instances you're running to match what your application needs.

Chris 0:47
So no more manual scaling in the middle of the night when traffic suddenly spikes, or scrambling to add instances when a new feature launch takes off.

Kelly 0:55
Exactly. It's like having a super smart, always-on assistant who can predict when things are about to go crazy and spin up more servers right when you need them.

Chris 1:02
And it works the other way too, right?

Kelly 1:07
Absolutely. When traffic dips, it scales down your instances to save you money and prevent wasted resources. It's a self-adjusting engine that keeps your application running smoothly and efficiently no matter what.

Chris 1:16
So agility, responsiveness, keeping those applications healthy. And for us cloud engineers, it means we can work on more interesting stuff instead of constantly babysitting server capacity, right?

Kelly 1:29
Absolutely. It frees you up to design and optimize your cloud architecture, which is the good stuff, instead of being bogged down with manual scaling. Now, imagine this. You're working on a popular e-commerce app, and you've got a big sale about to start, with thousands of shoppers hitting your site. Without Auto Scaling, you'd have to either over-provision your instances, which costs more, or risk slow performance and frustrated customers. Auto Scaling spins up those extra instances before the sale even goes live, for a smooth shopping experience for everyone.

Chris 2:07
So it's like a safety net that protects you from those sudden traffic surges. Okay, so we've got this system that automatically adjusts server capacity based on demand. But how does it actually work?

Kelly 2:18
So Auto Scaling relies on something called scaling policies, which are basically rules you define based on CloudWatch metrics. You could have a rule that says, if CPU utilization hits 80%, spin up two more instances, or if network traffic goes above a certain level, add five more instances to the pool.
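
The threshold rules Kelly describes can be sketched in plain Python. This is a minimal illustration, not how AWS implements scaling policies: the `ThresholdRule` class, the metric names, and the 500 Mbps network threshold are all assumptions made for the example. When several rules trigger at once, the sketch picks the largest adjustment, which is one reasonable way to resolve overlapping scale-out rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    """Hypothetical rule: when `metric` reaches `threshold`, add `add_instances`."""
    metric: str
    threshold: float
    add_instances: int


def instances_to_add(rules, metrics):
    """Return the largest adjustment among rules whose threshold is reached.

    Metrics missing from `metrics` are treated as 0 (an assumption of this sketch).
    """
    triggered = [
        rule.add_instances
        for rule in rules
        if metrics.get(rule.metric, 0.0) >= rule.threshold
    ]
    return max(triggered, default=0)


# The two rules from the episode; the 500 Mbps network threshold is made up.
RULES = [
    ThresholdRule("cpu_percent", 80.0, 2),
    ThresholdRule("network_mbps", 500.0, 5),
]

print(instances_to_add(RULES, {"cpu_percent": 85.0, "network_mbps": 120.0}))  # 2
```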

Chris 2:39
So we set thresholds and let Auto Scaling take over based on those predefined rules.

Kelly 2:43
Exactly. It's like giving Auto Scaling a set of instructions and then letting it do its thing. Now, when Auto Scaling needs to launch new instances, it has to know what those instances should look like, right? This is where launch configurations and launch templates come in. So that's like a blueprint or recipe for the perfect instance. That's a great way to think about it. It lets you predefine everything about those new instances: the instance type, the AMI, the security groups, everything. So when it needs to scale up, it knows what to do and how to do it the same way every time. So it's not just about scaling up and down, but making sure each instance is set up exactly how you need it.

Chris 3:23
What about those instances that are already up and running? How does Auto Scaling make sure they're healthy?

Kelly 3:29
It uses health checks. Just like when a doctor checks your vitals, Auto Scaling is always watching your instances. And if an instance is unhealthy, maybe it has high CPU load or network issues, Auto Scaling automatically swaps it out with a new, healthy one.

Chris 3:45
So an auto-healing system built right in. Not just scaling, but resilience, making sure the application keeps running even when things go wrong.

Kelly 3:53
And there are different ways to scale. You have dynamic scaling, scheduled scaling, and even predictive scaling.

Chris 4:01
Predictive scaling? That sounds interesting.

Kelly 4:03
Predictive scaling uses machine learning to actually forecast traffic patterns, so you can proactively scale your application before a surge even hits. It's like having a crystal ball for your application. Next-level proactive management.

Chris 4:18
So you're always ahead of the game. Okay, I'm starting to see how powerful Auto Scaling can be. It's like having a whole team of cloud engineers working behind the scenes, making sure everything runs smoothly. But let's be real, no system is perfect. What are some limitations, so we know what to watch out for, especially if we're thinking about that certification exam?

Kelly 4:41
Let's talk about those limitations. One thing to remember is scaling cooldown periods. Those are intentional pauses after a scaling activity, whether it's up or down, to keep things stable and avoid chaos.

Chris 4:57
So it's giving the system a moment to catch its breath before making more changes.

Kelly 5:01
Exactly. It needs time to settle down before we adjust anything else. Another thing to keep in mind is instance spin-up time. Even with predefined configurations, creating a new EC2 instance takes time, and that can cause delays if there are sudden, unexpected traffic spikes.
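
The cooldown idea can be illustrated with a small simulation. This is only a sketch of the concept as described in the episode: the function name is hypothetical, and the 300-second cooldown is an assumed value chosen for the example.

```python
def honored_scaling_times(request_times, cooldown_seconds=300):
    """Return the request timestamps (in seconds) that a cooldown does not suppress.

    A request is honored only if at least `cooldown_seconds` have passed since
    the last honored request. 300 seconds is an assumed value for illustration.
    """
    honored = []
    for t in sorted(request_times):
        if not honored or t - honored[-1] >= cooldown_seconds:
            honored.append(t)
    return honored


# Requests at 60 s and 200 s fall inside the cooldown started at 0 s and are skipped.
print(honored_scaling_times([0, 60, 200, 310, 700]))  # [0, 310, 700]
```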

Chris 5:20
So it's not instant. We have to account for that time. What about cost? Auto Scaling sounds great for performance and for making sure things are resilient. But can it actually save us money, too?

Kelly 5:34
It can. Cost optimization is a big benefit of Auto Scaling. It keeps you from over-provisioning when demand is low. You do have to be mindful of how you configure those scaling policies to avoid extra costs. It's all about finding the right balance between being responsive and using your resources wisely.

Chris 5:55
So finding that sweet spot between performance, resilience, and cost efficiency. All right, I think we have a good grasp of the basics and some of the limitations. Now let's get into some of the more advanced stuff, the things that can make us Auto Scaling pros, especially with those certification exams in mind.

Kelly 6:12
Perfect. Let's test our Auto Scaling knowledge and see how we answer those tricky exam questions.

Chris 6:18
Bring it on. I'm ready to level up.

Kelly 6:19
Okay, let's say you get an exam question that asks: you're working on an application with really predictable traffic patterns. What type of scaling policy do you use?

Chris 6:31
Predictable traffic patterns. So that sounds like scheduled scaling, because we know when things are gonna get busy. So we set up our Auto Scaling group to scale up in advance.

Kelly 6:44
Exactly. It's like setting an alarm clock for your application. You know that every day at 8 a.m. traffic is gonna spike as people start their workday, so you tell Auto Scaling, "Hey, add instances right before that happens."
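
As a rough illustration of the "alarm clock" idea, the sketch below builds the settings for a recurring scheduled action. The key names are modeled on the request parameters of boto3's `put_scheduled_update_group_action` as I understand them, but nothing is sent to AWS here. The helper name, the group name `web-asg`, and the sizes are hypothetical.

```python
def daily_scale_out_action(group_name, cron, min_size, max_size, desired):
    """Build keyword arguments for a recurring scheduled scaling action.

    Only constructs a dictionary; no AWS call is made in this sketch.
    """
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max size")
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": f"{group_name}-morning-scale-out",
        "Recurrence": cron,  # cron expression; assumed to be UTC unless a time zone is set
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
    }


# Hypothetical values: scale out at 07:45 every day, ahead of an 8 a.m. spike.
action = daily_scale_out_action("web-asg", "45 7 * * *", 4, 12, 8)
print(action["Recurrence"])  # 45 7 * * *
```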

Chris 6:57
That makes a lot of sense. But what if we have an application that gets these really short but intense traffic spikes every hour?

Kelly 7:06
That's a good question, and it shows why we need to understand the differences between these scaling policies. In that case, dynamic scaling might be better, because it's designed to react in real time. It's constantly looking at those CloudWatch metrics, and as soon as it sees a spike, it kicks into action, adding more instances to handle the load.

Chris 7:29
So it's like having a super fast reflex system that can just adapt.

Kelly 7:32
Exactly. Imagine having a team of responders just waiting, ready to jump in. Now here's a common exam question: you need to make sure your Auto Scaling group always has at least four instances running, even when things are quiet. How do you do that?

Chris 7:49
That's those capacity settings, right? We just set the minimum capacity to four. So even if traffic drops to zero, we'll always have those four instances up and running.

Kelly 7:59
Spot on. And this shows how important capacity management is with Auto Scaling.
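
The effect of minimum and maximum capacity can be shown as a simple clamp. This is a sketch of the idea only; the function name and the maximum of ten are assumptions for the example.

```python
def effective_capacity(requested, min_size, max_size):
    """Clamp a requested instance count to the group's [min_size, max_size] range."""
    if min_size > max_size:
        raise ValueError("min_size cannot exceed max_size")
    return max(min_size, min(requested, max_size))


# With a minimum of four, a quiet period that "wants" zero instances still keeps four.
print(effective_capacity(0, min_size=4, max_size=10))   # 4
print(effective_capacity(25, min_size=4, max_size=10))  # 10
```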

Chris 8:02
Right. It's not just about scaling up and down. It's about defining those boundaries.

Kelly 8:07
The upper and lower limits, to keep your application running smoothly. How about this one: what is the purpose of a health check grace period in Auto Scaling?

Chris 8:21
Ooh, I remember we talked about this before. So how would you explain this in an exam scenario?

Kelly 8:27
Think of it like this. You launch a brand new instance. It's like a newborn baby. It needs some time to boot up, load all its software, and pass its health checks before it can handle traffic.

Chris 8:41
So the health check grace period is like giving that instance a chance to get settled.
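
One way to picture the grace period is as a window during which failed health checks are ignored, so a booting instance is not replaced before it has had a chance to start. The sketch below models that reading; the function name and the 300-second value are assumptions for illustration.

```python
def should_replace(seconds_since_launch, health_check_ok, grace_period_seconds=300):
    """Decide whether an instance should be replaced.

    During the grace period, failed health checks are ignored so a booting
    instance is not terminated prematurely. 300 seconds is an assumed value.
    """
    if seconds_since_launch < grace_period_seconds:
        return False
    return not health_check_ok


print(should_replace(90, health_check_ok=False))   # False: still inside the grace period
print(should_replace(600, health_check_ok=False))  # True: grace period is over
```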

Kelly 8:46
Exactly. You don't want to send traffic to an instance that's not ready. Okay, let's talk about how Auto Scaling fits in with other AWS services. We talked about CloudWatch. What else works well with Auto Scaling?

Chris 8:59
Well, Elastic Load Balancing, or ELB, comes to mind. It's like the traffic cop that sits in front of your Auto Scaling group and makes sure incoming traffic is balanced across the instances.

Kelly 9:11
As Auto Scaling adds or removes instances, ELB makes sure traffic is routed correctly.

Chris 9:16
So a dynamic routing system that adapts as things change.

Kelly 9:22
And then we have launch templates, which give us a consistent way to define our instances. But what if we want to manage those instances after they're launched? Things like patching, configuration, software updates.

Chris 9:36
That's where AWS Systems Manager comes in, to automate things and keep our instances consistent, even as they scale up and down.

Kelly 9:43
You got it. End-to-end management, making sure our instances are launched correctly and kept healthy throughout their life cycle.

Chris 9:52
Yeah, but what about cost optimization with Auto Scaling?

Kelly 9:55
Yeah, cost optimization is a huge part of Auto Scaling, and one of the best tools we have is right-sizing our instances.

Chris 10:02
Okay, so it's not just about having enough instances. It's about having the right kind of instances.

Kelly 10:07
Exactly. If you're running a small application that doesn't get a lot of traffic, you don't need those big, expensive instances. You can choose smaller, more cost-effective ones.

Chris 10:19
Okay, but what about those times when we do need larger instances, like during a traffic surge or a big product launch? Are we stuck paying those high prices?

Kelly 10:29
Not necessarily. We can use things like Reserved Instances or Spot Instances to bring the costs down. Reserved Instances are great for stable workloads. You commit to using a specific instance type for a certain amount of time, like a year or three years, and you get a discount. It's like a bulk discount for compute. Then we have Spot Instances. Those are unused EC2 instances that you can bid on and get a much lower price. But there is a catch: Spot Instances can be interrupted with little notice if someone else outbids you. So they work best for workloads that can handle interruptions.

Chris 11:14
So it's all about understanding what you're working with and choosing the right option. All right, I'm feeling good about our Auto Scaling knowledge so far. Ready for some more exam-style questions?

Kelly 11:24
Absolutely, practice makes perfect.

Chris 11:26
Okay, here's a tricky one. You notice that your Auto Scaling group is really slow reacting to changes in demand. What could be causing this?

Kelly 11:38
Ooh, a performance mystery. Let's see. So slow reaction time: the first thing I'd check is the scaling policy itself. If the thresholds are too conservative, it might take a while for Auto Scaling to kick in.

Chris 11:53
Right, we need those CloudWatch metrics to be sensitive enough. But what about that instance spin-up time? That's always a factor.

Kelly 12:01
Even if Auto Scaling makes a decision quickly, it still takes time to actually get those new EC2 instances up and running, and that can take a while, especially if you're using large instance types or custom AMIs.

Chris 12:15
So it's not just about how fast the decision is made. It's also about how long it takes to execute that decision.

Kelly 12:22
Here's another thing to think about: resource contention. If your Auto Scaling group is competing for resources with other applications or services, that can slow things down too.

Chris 12:36
So we need to be aware of the whole AWS environment, not just our little Auto Scaling setup.

Kelly 12:41
It's like traffic in a big city. If there are too many cars on the road, everyone slows down.

Chris 12:44
Resource availability, network bandwidth, all those things can affect our Auto Scaling group.

Kelly 12:50
It's like conducting an orchestra. Everything needs to play its part at the right time. That's what makes Auto Scaling so interesting: there's always something new to learn and optimize. Ready for another challenge? All right, let's say we have an application that gets these brief but really intense traffic spikes every hour.

Chris 13:12
Okay, so brief, intense traffic spikes every hour.

Kelly 13:16
What type of scaling policy would you use to handle that?

Chris 13:21
It sounds like we need something that can react fast, so dynamic scaling seems like the best choice, because it can just spin up those instances right when the traffic hits.

Kelly 13:31
You're right. Dynamic scaling is perfect for that. It's responsive and can handle those real-time changes.

Chris 13:38
But is there a downside?

Kelly 13:40
Yeah, there's something we need to think about. Even though dynamic scaling is fast, it's not instant. Remember those instance spin-up times we talked about? If those traffic spikes are really short, you might end up scaling up and then right back down before the new instances are even ready.
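
The timing problem with short spikes comes down to simple arithmetic, sketched below. The function name and all of the numbers (a 60-second detection delay and a 180-second boot time) are assumptions picked for illustration, not measured values.

```python
def useful_seconds(spike_length, detection_delay, spin_up_time):
    """Seconds of the spike during which the newly launched instances are serving.

    All arguments are in seconds, measured from the start of the spike.
    """
    ready_at = detection_delay + spin_up_time
    return max(0, spike_length - ready_at)


# Assumed numbers: 60 s for the alarm to fire, 180 s for an instance to boot.
print(useful_seconds(spike_length=120, detection_delay=60, spin_up_time=180))  # 0
print(useful_seconds(spike_length=600, detection_delay=60, spin_up_time=180))  # 360
```

In the first case the spike is over before the new capacity is ready, which is the "scale up and right back down" situation described above.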

Chris 13:58
So it's like calling for backup and then realizing you didn't actually need it. That could make things unstable.

Kelly 14:06
Exactly. It's like adjusting the thermostat too quickly. You overshoot the target and things fluctuate too much.

Chris 14:10
So dynamic scaling is a good place to start, but we might need to add something else. What other options are there?

Kelly 14:17
Well, maybe we could combine dynamic scaling with something else, like scheduled scaling. We could use scheduled scaling to get ready for those hourly spikes, and then dynamic scaling could handle anything extra. A two-pronged approach: proactive and reactive. Now, how about this question: how do we make Auto Scaling launch instances across multiple Availability Zones?

Chris 14:45
Oh, that's all about high availability. We don't want all our eggs in one basket. So how do we tell it to spread things out?

Kelly 14:52
So when you create your Auto Scaling group, you choose the Availability Zones you want, and you make sure your subnets, load balancer, and everything else are also set up for multiple Availability Zones.
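
Spreading instances across zones can be pictured as an even split. The sketch below is an illustration of that idea only, not AWS's balancing logic; the function name is hypothetical and the zone names are just examples.

```python
def spread_across_zones(instance_count, zones):
    """Distribute instances as evenly as possible across the given zones."""
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(instance_count, len(zones))
    # The first `extra` zones each receive one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


# Example zone names; any zones tied to the group's subnets would work the same way.
print(spread_across_zones(7, ["us-east-1a", "us-east-1b", "us-east-1c"]))
# {'us-east-1a': 3, 'us-east-1b': 2, 'us-east-1c': 2}
```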

Chris 15:05
Okay, so it's about more than just the Auto Scaling group.

Kelly 15:08
It's a whole-system approach to make sure your application can keep running, even if one Availability Zone goes down.

Chris 15:15
It's like having a backup generator. One Availability Zone goes down, the other one takes over. No downtime.

Kelly 15:22
And here's an extra tip. When you're setting up Auto Scaling for multiple Availability Zones, remember those placement groups. Placement groups? Aren't those for... They can be used for high-performance computing and low-latency applications, yeah, but they can also help with availability. By choosing the right placement group strategy, you can reduce latency, isolate important workloads, and even boost your application's performance.

Chris 15:51
Wow. So it's not just about spreading those instances out. It's about putting them in the right places within those Availability Zones.

Kelly 15:59
Exactly, getting the most out of your setup.

Chris 16:02
Okay, here's another challenge. We've got an Auto Scaling group that's terminating instances based on their age to keep things fresh, but we're seeing some instances being terminated too early, even though they're still healthy. What could be going on?

Kelly 16:19
If instances are being terminated before they should be, it usually means there's a conflict somewhere in your configuration. The first thing I'd check is the termination policy. Is the time-to-live setting too short?

Chris 16:31
Right, like maybe we accidentally set it to terminate after 24 hours instead of, like, seven days. That would definitely cause problems.

Kelly 16:39
So always double-check those settings.

Chris 16:42
Okay, but what if the time to live is right, and we're still seeing those early terminations? What else could it be?

Kelly 16:50
It could be a conflict with another scaling policy. Maybe you have one that's removing instances based on low CPU utilization, and that's overriding your age-based policy.

Chris 17:02
So we need to make sure those policies aren't fighting each other.

Kelly 17:07
Exactly. Think of them like a set of rules. If one rule contradicts another, the higher-priority rule wins. And don't forget about those capacity settings. If the minimum capacity is set too low, Auto Scaling might terminate instances early, just to keep that minimum.

Chris 17:24
Even if they're healthy? Those capacity settings can be tricky.

Kelly 17:28
They can. It's like setting a minimum number of guests for a party. If too many people leave early, you have to invite random people just to hit that number. So always make sure those capacity settings are what you want. All right, ready for a real brain teaser?

Chris 17:44
Hit me with it.

Kelly 17:45
Okay, let's say you have an application that takes a really long time to start up. How would you design an Auto Scaling strategy to minimize the impact of that slow startup time when there's a traffic spike?

Chris 18:00
Hmm, so we need enough instances ready, even though it takes forever for them to boot up. If we just rely on dynamic scaling, those new instances might not be ready in time. So what can we do? Well, if those startup times are super long, maybe we could pre-warm the Auto Scaling group, like launch instances way ahead of time, so they're ready when the traffic hits.

Kelly 18:27
Pre-warming is a great idea. It's like preheating the oven. You want it ready when the batter goes in.

Chris 18:34
But what if the traffic spikes are unpredictable? Scheduled scaling wouldn't be as good then.

Kelly 18:40
True.

Chris 18:41
So what else could we do?

Kelly 18:42
We need something more dynamic. Remember those capacity settings? Maybe we could increase the minimum capacity so you always have more pre-warmed instances ready. But that could get expensive if we're always running a lot of instances. It's a balance between cost and performance. There might be other options too. What if, instead of just adding more instances, we tried to make application startup faster? Maybe we could streamline the initialization, optimize the code, or even switch to a different runtime environment.

Chris 19:19
So sometimes it's not just about Auto Scaling. It's about the application itself.

Kelly 19:24
It's both. We need to look at everything. Okay, last challenge. You ready? Imagine you have an Auto Scaling group, and it's working great, but you need to update the operating system on all the instances without any downtime.

Chris 19:40
Oh, updating operating systems without downtime. That's the dream. That sounds almost impossible.

Kelly 19:46
It sounds impossible in a traditional environment, but in the cloud, we have the tools. So what can we do?

Chris 19:54
I'm thinking immutable infrastructure. Instead of updating the existing instances, we create a new launch template with the updated OS, and then gradually replace the old instances with the new ones. We treat our servers like Lego bricks: swap them out easily.

Kelly 20:13
So we create a new Auto Scaling group with the new launch template, move traffic over, and then retire the old group.
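
The "gradually replace the old instances" idea can be sketched as a batch-by-batch swap that always leaves a minimum number of instances untouched. This is a toy model under stated assumptions, not an AWS API: the function name, the version labels `v1` and `v2`, and the minimum of three are all hypothetical.

```python
def rolling_replace(fleet, new_version, min_in_service):
    """Replace every instance in batches, leaving `min_in_service` untouched per batch.

    `fleet` is a list of version labels, one per instance. Returns a snapshot of
    the fleet after each batch has been swapped for `new_version`.
    """
    batch_size = len(fleet) - min_in_service
    if batch_size < 1:
        raise ValueError("min_in_service must be smaller than the fleet size")
    current = list(fleet)
    snapshots = []
    for start in range(0, len(current), batch_size):
        for i in range(start, min(start + batch_size, len(current))):
            current[i] = new_version
        snapshots.append(list(current))
    return snapshots


# Hypothetical labels: four instances on an old template, three kept in service per batch.
for snapshot in rolling_replace(["v1"] * 4, "v2", min_in_service=3):
    print(snapshot)
```

A smaller `min_in_service` makes the rollout faster but leaves less capacity serving traffic while each batch is being replaced.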

Chris 20:20
No more messy updates.

Kelly 20:22
No downtime.

Chris 20:22
No downtime.

Kelly 20:23
But what about another option? Blue-green deployments. It's like a dress rehearsal. You have two environments: one is live, blue, and one is staging, green. You update the OS on the green environment, do all your testing, make sure everything works, and then you switch them.

Chris 20:44
So green becomes live, and blue is the backup.

Kelly 20:47
Exactly. It's like a safety net. If something goes wrong, you can quickly switch back, so nobody even knows. And both of these methods use Auto Scaling to minimize downtime.

Chris 21:00
This deep dive has been amazing. I feel like I really understand Auto Scaling now. It's not just settings and configurations anymore. It's a powerful tool that can make our lives as cloud engineers so much easier, and our applications so much better.

Kelly 21:16
And that's what makes this field so exciting. We can build amazing things in the cloud, and Auto Scaling helps us do that. Any final words of wisdom? Yes: embrace Auto Scaling. Don't be afraid to experiment, try those advanced features, and keep learning. The cloud is always changing, and so should we.

Chris 21:36
For those of you prepping for AWS certifications, remember, it's not just about memorizing facts. It's about understanding the why, connecting the dots, seeing the big picture.

Kelly 21:47
With everything you've learned today, you're well on your way to becoming an Auto Scaling expert.

Chris 21:51
You got this. Thanks for joining us on this Auto Scaling journey. We hope you learned a lot and feel ready to build great things in the cloud. We'll see you next time on The Deep Dive. Happy cloud computing.
