/r/aws
News, articles and tools covering Amazon Web Services (AWS), including S3, EC2, SQS, RDS, DynamoDB, IAM, CloudFormation, AWS-CDK, Route 53, CloudFront, Lambda, VPC, Cloudwatch, Glacier and more.
So, we have a web server that is purpose-built for our tooling; we're a SaaS.
We are running an ECS cluster on Fargate that contains a Docker container running our image.
Said image handles SSL termination and everything else.
On GCP we were using an NLB and deploying fine.
However... we're moving to AWS, and I have been tasked with migrating this part of our infrastructure. I am fairly familiar with AWS, but nowhere near professional standing.
So, the issue is this: we need to serve HTTP and HTTPS traffic from our NLB, created in AWS, to our ECS cluster container.
So far, the primary issue I am facing is assigning both 443 and 80 to the load balancer; my work-around was going to be
Global Acceleration
-> http-nlb
-> https-nlb
-> ecs cluster.
I know you can do this (https://stackoverflow.com/questions/57108653/ecs-service-with-two-load-balancers-for-same-port-internal-and-internet-facing), but I am not sure how: I cannot find an option in the AWS UI, when creating a service inside our ECS cluster, to allow multiple load balancers.
It's either 80:80 or 443:443, not both, which is problematic.
Does anyone know how to implement NLB -> ECS 443:80 routing?
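For what it's worth, the ECS console only lets you attach a single target group when creating a service, but the underlying CreateService API (and the CLI, CloudFormation, and CDK) accepts multiple `loadBalancers` entries, so one service can sit behind both a TCP:80 and a TCP:443 listener on the same NLB. A sketch of the relevant parameters; all names and ARNs below are hypothetical placeholders:

```typescript
// Sketch of ECS CreateService parameters attaching one service to two
// target groups (one behind a TCP:80 NLB listener, one behind TCP:443).
interface EcsLoadBalancerEntry {
  targetGroupArn: string;
  containerName: string;
  containerPort: number;
}

interface CreateServiceSketch {
  cluster: string;
  serviceName: string;
  taskDefinition: string;
  loadBalancers: EcsLoadBalancerEntry[];
}

const params: CreateServiceSketch = {
  cluster: "my-cluster",          // placeholder
  serviceName: "web",             // placeholder
  taskDefinition: "web:1",        // placeholder
  // Two entries, same container, different ports and target groups.
  // The NLB has a TCP:80 listener forwarding to the first target group
  // and a TCP:443 listener forwarding to the second; the container
  // terminates TLS itself on 443.
  loadBalancers: [
    {
      targetGroupArn: "arn:aws:elasticloadbalancing:region:acct:targetgroup/http-tg/abc",
      containerName: "app",
      containerPort: 80,
    },
    {
      targetGroupArn: "arn:aws:elasticloadbalancing:region:acct:targetgroup/https-tg/def",
      containerName: "app",
      containerPort: 443,
    },
  ],
};

console.log(params.loadBalancers.length); // 2
```

An object shaped like this would be passed to `CreateServiceCommand` from `@aws-sdk/client-ecs`, or the same structure via `aws ecs create-service --load-balancers` on the CLI, since the console doesn't currently expose multiple entries.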
Hi - I'm using the Pinpoint JavaScript SDK to send text messages, and I can't get any of the message logs into CloudWatch.
I was successfully able to use the AWS SMS simulator to send a message AND it gets logged using the configuration set, but there doesn't seem to be any documentation about how to do this via the JavaScript SDK or API.
This is what I'm doing right now. I'm trying to insert the ConfigurationSet ARN everywhere, and I can see it being populated in the request, but it never logs to CloudWatch. Any idea what I'm doing wrong? The actual text message goes through to my phone; there are just no logs showing up.
import {
  SendMessagesCommand,
  SendMessagesCommandInput,
} from "@aws-sdk/client-pinpoint";

const params: SendMessagesCommandInput = {
  ApplicationId: this.applicationId,
  MessageRequest: {
    Addresses: {
      [formattedPhoneNumber]: {
        ChannelType: "SMS",
      },
    },
    MessageConfiguration: {
      SMSMessage: {
        Body: message,
        MessageType: "TRANSACTIONAL",
        SenderId: "MY_SENDER_ID",
        // Enable detailed CloudWatch metrics
        EntityId: "SMS_EVENTS",
        TemplateId: "SMS_DELIVERY_STATUS",
      },
    },
  } as any, // Type assertion to handle ConfigurationSet
};

// Add configuration set after type assertion
(params.MessageRequest as any).ConfigurationSet = this.configurationSet;
(params.MessageRequest as any).ConfigurationSetName = this.configurationSet;

const command = new SendMessagesCommand(params);
console.log("sendSms command", JSON.stringify(command, null, 2));
const response = await this.pinpointClient.send(command);
Hi everyone, I've noticed small charges related to SageMaker showing up on AWS, but I'm not using this service and would like to disable it completely to avoid wasting money.
I've already written to AWS support, but in the meantime I wanted to ask here whether anyone has had the same problem and knows how to resolve it definitively. If anyone is available, we could also do a short Discord call to solve the problem faster.
Thanks in advance for the help!
Hi all! To help folks learn about EKS Auto Mode and Terraform, I put together a GitHub repo that uses Terraform to deploy an EKS Auto Mode cluster.
Repo is here: https://github.com/setheliot/eks_auto_mode
Blog post going into more detail is here: https://community.aws/content/2sV2SNSoVeq23OvlyHN2eS6lJfa/amazon-eks-auto-mode-enabled-build-your-super-powered-cluster
Please let me know what you think
Hello,
I've been using an AWS NAT Gateway to provide a static IP for outbound traffic in my production environment. However, we've encountered a significant billing spike of around $3,000, which seems disproportionate since the only use of the NAT Gateway is to provide a static IP.
My client requires my IP address to be whitelisted for network access, but since my application is deployed on AWS ECS Fargate (with multiple tasks), I don’t have a static IP. As a result, I opted for the NAT Gateway to provide one. However, I didn’t expect 60% of the total bill to be consumed by NAT charges, primarily for providing just a static IP.
I’ve come across the NAT instance alternative but have concerns regarding its stability for large-scale environments. I’m hesitant to switch to EC2 due to potential scalability and reliability risks for production.
Any valuable suggestions or guidance would be greatly appreciated!
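For context on where a bill like that comes from: NAT Gateway pricing has an hourly component and a per-GB data-processing component, and it is the per-GB side that spikes. A back-of-the-envelope sketch, assuming roughly $0.045/hour and $0.045/GB (approximate list prices; they vary by region, so check the current pricing page):

```typescript
// Rough NAT Gateway monthly cost estimate.
// Assumed prices (approximate, region-dependent): $0.045/hr, $0.045/GB processed.
const HOURLY_RATE = 0.045; // USD per NAT Gateway hour (assumption)
const PER_GB_RATE = 0.045; // USD per GB processed (assumption)
const HOURS_PER_MONTH = 730;

function estimateNatMonthlyCost(gbProcessed: number): number {
  return HOURLY_RATE * HOURS_PER_MONTH + PER_GB_RATE * gbProcessed;
}

// Working backwards: a ~$3,000 bill implies roughly 65 TB of traffic
// went through the gateway, i.e. the charge is traffic, not the static IP.
const impliedGb = (3000 - HOURLY_RATE * HOURS_PER_MONTH) / PER_GB_RATE;
console.log(Math.round(impliedGb)); // 65937
```

If a large share of that traffic is to S3 or DynamoDB, gateway VPC endpoints take it off the NAT path at no data-processing charge; if the traffic genuinely must egress to the client with a fixed IP, the per-GB charge is hard to avoid without moving to something like a NAT instance.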
Hi!
I'm the IT manager of a medium business. We've been using AWS for a few months for a project. As the company is growing and we are starting to use AWS for other projects, we need our use of AWS to reflect our company structure more closely, which is as follows:
Holding X
├─ Subsidiary Y
│ ├─ Project A
│ ├─ Project B
├─ Subsidiary Z
│ ├─ Project C
To be more specific: I need to be able to administer everything; the Project A team needs to be able to administer anything in Project A, but nothing in Projects B and C; we need to be able to bill projects to different bank accounts depending on their subsidiary, and to easily know how much each project cost; and our accountants need to be able to access those bills.
I've already created a new management account named after Holding X, and it joined a new organization with the same name. I've moved the account named after Subsidiary Y, which currently holds Project A, into the new organization.
Following AWS best practices, how should I handle this?
I've watched a few tutorials about AWS Organizations and IAM Identity Center, and I think I now know the jargon, but I'm not sure which construct should map to what (accounts, organizational units, users, groups, permission sets…).
Thank you for your help! :)
I’m setting up a production-grade EKS cluster and I’m trying to decide the best NAT Gateway strategy for my private subnets.
My current setup:
Options:
Any real-world experiences or best practices for balancing cost vs performance?
I requested a certificate for an EC2 instance and it's been pending validation for several hours now. There are no messages about what, if anything, needs to be done. Lightsail certificates take less than a minute.
My bucket is structured in this manner: project-prod-files/year/month/day/raw/filename.ext
Here, year, month and day are dynamic values.
How can I enter a dynamic prefix in the AWS console when creating an S3 Lambda trigger?
Any help would be greatly appreciated 🙏.
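As far as I know, S3 event notification filters only accept literal prefix/suffix strings, with no wildcards or date tokens, so the usual workaround is to trigger on the whole bucket (or a broad literal prefix) and filter inside the handler. A minimal sketch, with the key pattern taken from the layout above:

```typescript
// Filter S3 event keys down to those matching year/month/day/raw/filename.ext.
// S3 prefix filters are literal, so the pattern matching happens in the handler.
const RAW_KEY_PATTERN = /^\d{4}\/\d{2}\/\d{2}\/raw\/.+$/;

function isRawObjectKey(key: string): boolean {
  // S3 event keys arrive URL-encoded, with "+" for spaces.
  return RAW_KEY_PATTERN.test(decodeURIComponent(key.replace(/\+/g, " ")));
}

// Inside the Lambda handler you would loop over event.Records and skip
// anything that does not match:
// for (const record of event.Records) {
//   if (!isRawObjectKey(record.s3.object.key)) continue;
//   // ... process the object ...
// }

console.log(isRawObjectKey("2024/05/12/raw/data.csv"));       // true
console.log(isRawObjectKey("2024/05/12/processed/data.csv")); // false
```

An alternative is restructuring keys as raw/year/month/day/filename.ext, so a plain literal prefix filter of "raw/" works without any in-handler filtering.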
Hello everyone!
As the title says, I've been following the Amplify Gen 2 documentation/tutorial on how to configure AWS for local development (https://docs.amplify.aws/react/start/account-setup/). Everything works great up until the point where I have to configure the local environment through the aws configure sso command. The resulting URL returns an error message saying the page can't be loaded. I have used the same start URL given to me earlier in the process, I have installed the AWS CLI, and I have tried turning the firewall off to see if that was the problem, but the issue persists. Has this happened to anyone else, and how can I sort it out?
TIA!
Hi community,
I'm working on a blue-green deployment strategy for a Strapi/Medusa project and would love your feedback on the attached architecture diagram and approach.
I. Problem Overview
A specific issue arises with rollback strategies:
Currently, the frontend deploys via Amplify, which can cause inconsistencies if the backend deployment fails. Moving to an EC2-based deployment could solve this, but is there a better alternative for syncing frontend and backend deployments?
I'd appreciate your insights on:
Hi everyone,
I'm using AWS Backup to create copies of my S3 buckets and RDS instances. Recently (since January 15), I've noticed an issue with approximately 70% of my buckets. The backup status shows as "Completed with issues", but there's no additional information provided.
When I restore the problematic bucket, I can confirm that some files are missing. I’ve compared the properties of the files that were successfully backed up with those that weren’t, and they appear identical.
I haven’t made any changes to the AWS Backup IAM role or the bucket configurations. Has anyone else encountered this issue, or have any insights into what might be causing it?
Thanks in advance!
In order to reduce our NAT Gateway consumption costs, we decided to implement Gateway Endpoints so that our VPC could communicate with S3 directly, without routing through the NAT Gateway.
We configured the endpoint according to the AWS documentation, and it has been correctly added. However, the traffic is still not being routed through the endpoint and continues to go through the NAT Gateway.
Hello Everyone!
I've recently created a lifecycle rule in an S3 bucket to move ALL objects from Standard to Glacier Instant Retrieval. At first, it seemed to work as intended and most of the objects were moved correctly (except for those smaller than 128 KB). But then, the next day, a big chunk of them were moved back to Standard. How did this even happen? I have no other lifecycle rule, and I deleted the Standard-to-GIR rule after it ran. So why are 80 TB back in Standard? What am I missing, or what could be happening?
I am attaching a screenshot of the bucket size metrics, for information.
Thank you everyone for your time and support!
Hello!
When using the AWS Load Balancer Controller in an EKS cluster, is there a way to use an existing ALB?
Currently, I can group multiple Ingress resources using the annotation alb.ingress.kubernetes.io/group.name, which creates one ALB per group.name and deletes the ALB when no resources with that name exist anymore. That's OK.
But what I really want is to reference an existing ALB by its name or ARN, and have the associated Ingress resources append their rules to it (or remove those rules when they are deleted). Is that even possible?
Even if it is possible, would you mix manually created rules and k8s-created rules in the same ALB, or just rely on aws cli commands to automate it and forget about the AWS Load Balancer Controller?
Hey everyone!
I'm trying to create a single node pool in EKS that supports both amd64 and arm64 architectures, while also allowing nodes to be either spot or on-demand. Additionally, I want to be able to define in the pod specification which type of node it should be scheduled on.
From what I can see, the alternative would be to create four separate node pools to cover all combinations (amd64-spot, amd64-on-demand, arm64-spot, arm64-on-demand), but I'd prefer to manage this with a single node pool if possible.
Does anyone have any suggestions or best practices for achieving this?
- apiVersion: karpenter.sh/v1
  kind: NodePool
  metadata:
    name: &crons crons
    annotations:
      kubernetes.io/description: Crons purpose NodePool
  spec:
    template:
      metadata:
        labels:
          purpose: *crons
        annotations:
          purpose: *crons
      spec:
        taints:
          - key: purpose
            value: *crons
            effect: NoSchedule
        requirements:
          - key: purpose
            operator: In
            values:
              - *crons
          - key: kubernetes.io/arch
            operator: In
            values:
              - amd64
              - arm64
          - key: kubernetes.io/os
            operator: In
            values:
              - linux
          - key: karpenter.sh/capacity-type
            operator: In
            values:
              - spot
              - on-demand
        nodeClassRef:
          group: karpenter.k8s.aws
          kind: EC2NodeClass
          name: default
        expireAfter: 720h
    limits:
      cpu: 1000
    disruption:
      consolidationPolicy: WhenEmpty
      consolidateAfter: 1m
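For the second part of the question (choosing the node type from the pod spec), the single NodePool above should be enough if each pod pins the architecture and capacity type itself via nodeSelector, plus a toleration for the purpose taint. A minimal sketch; the pod and image names are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cron-job-pod                     # hypothetical name
spec:
  tolerations:
    - key: purpose
      value: crons
      effect: NoSchedule
  nodeSelector:
    purpose: crons
    kubernetes.io/arch: arm64            # or amd64
    karpenter.sh/capacity-type: spot     # or on-demand
  containers:
    - name: app
      image: my-image:latest             # hypothetical image
```

Karpenter should then provision a node satisfying exactly that combination, so the four-pool split isn't needed; pods that omit a selector can land on any combination the NodePool allows.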
On the AWS website, it's mentioned that 750 hours per month of dc2.large nodes is free for 2 months. For example, if I use Redshift as a DWH for loading incremental data, do I need to pay separately for the storage, or is it included in the free tier?
If I need to give access to an RDS Aurora MySQL cluster through PrivateLink,
my understanding is that the PrivateLink endpoint needs to point to an NLB, which in turn needs to point to the Aurora endpoints.
Is it not possible to have the NLB point 'automatically' to the Aurora endpoint? Do we still need some automation with Lambda/SNS or Lambda/proxy to dynamically refresh the NLB configuration to point to the 'private' IP of Aurora, or of the proxy in front of it?
https://aws.amazon.com/blogs/database/access-amazon-rds-across-aws-accounts-using-aws-privatelink-network-load-balancer-and-amazon-rds-proxy/
Thanks.
Hey guys! I have a huge question about my AWS account. I was not aware that a c7a.8xlarge EC2 instance had been running for a full year for no reason. I was not charged anything because the card on the account was not up to date, and I'm scared that I will be billed around $12k for nothing. I have already terminated the EC2 instance and the other services that were running in the background, and closed the account. Should I care about it and contact customer service to inquire, or what?
The screenshot was taken from the billing tab; under the pending charges tab there were no pending charges to pay. I think this c7a.8xlarge instance was created at some point as a testing instance while learning about AWS, but now I'm too afraid that I will be billed an amount that I cannot afford at all, even selling my soul.
What can you suggest? I appreciate all your help!
Hey everyone, I'm creating an EKS cluster via Terraform, nothing out of the norm. It creates just fine; I'm tagging subnets as stated here, and creating the IngressClassParams and IngressClass objects as directed here.
On the created EKS cluster, pods run just fine. I deployed ACK along with Pod Identity associations to create AWS objects (buckets, RDS, etc.), all working fine. I can even create a Service of type LoadBalancer and have an ELB built as a result. But for whatever reason, creating an Ingress object does not prompt the creation of an ALB. Since in Auto Mode I can't see the controller pods, I'm not sure where to even look for logs to diagnose where the disconnect is.
When I apply an ingress object using the class made based on the aws docs, the object is created and in k8s there are no errors - but nothing happens on the backend to create an actual ALB. Not sure where to look.
All the docs state this is supposed to be an automated/seamless aspect of using auto-mode so they are written without much detail.
Any guidance? I have to be missing something obvious.
Why the hell is AWS cost optimization still such a manual mess? I worked at VMware on vRealize, full stack, and saw infra guys constantly dealing with cost stuff manually. Now I'm at a startup doing infra myself, and it's the same thing: endless scripts, spreadsheets, and checking bills like accountants. AWS has Cost Explorer, Trusted Advisor, all this crap, but none of it actually fixes anything. Half the time it's just vague charts or useless recommendations that don't even apply.
Feels like every company, big or small, just accepts this as normal: yeah, let's just waste engineering time cleaning up zombie resources and overprovisioned RDS clusters manually, forever. How is this still a thing in 2025? Am I crazy, or is this actually just AWS milking the confusion?
I only have like 3 YOE, so is there something I'm not understanding, or is there really no way for this to improve? We are actually behind on our roadmap since another project came in to reduce cost on EKS, now directly from the CTO. It's never ending.
I was checking our recent bill using Cost Explorer and found that the biggest charge was for VPC. Grouping charges by resource, I found that all the charges are for ENIs (Elastic Network Interfaces). Cost Explorer reports them as follows:
arn:aws:ec2:eu-north-1:XXXXXXXX:network-interface/eni-0XXXXXXXX
These are EC2 instances managed by Elastic Beanstalk. EB environments have a load balancer assigned to them. Networking and database - Public IP Address option is deactivated. EC2 instances are split between two availability zones.
I expected to be charged for internet egress, but it seems that I'm being charged for local traffic as well.
Is there something I can do to avoid these charges?
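One thing worth checking, as an assumption on my part: since early 2024, AWS bills in-use public IPv4 addresses (about $0.005 per IP-hour), and in Cost Explorer that charge appears under VPC, attributed to the ENI the address is attached to. Even with public IPs disabled on the instances, the load balancer nodes still hold public IPv4 addresses. A quick sketch of the math:

```typescript
// Approximate monthly cost of in-use public IPv4 addresses.
// Assumed rate: $0.005 per IP per hour (check current pricing for your region).
const IPV4_RATE = 0.005;
const HOURS_PER_MONTH = 730;

function publicIpv4MonthlyCost(ipCount: number): number {
  return ipCount * IPV4_RATE * HOURS_PER_MONTH;
}

// e.g. a hypothetical environment holding 6 public IPs across
// load balancer nodes and NAT:
console.log(publicIpv4MonthlyCost(6).toFixed(2)); // "21.90"
```

If the numbers don't line up with that, the other common ENI-attributed cost is cross-AZ data transfer between the load balancer and instances in the other availability zone.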
First time poster here. IT agent in international customer support. Quick question: Anyone here with experience deploying DeepSeek R1 on AWS and integrating it? Specifically:
From what I’ve seen, the cost efficiency looks solid – just unsure about its overall English capabilities for US+EU market customer support AI agent.
Appreciate any insights!
Hi guys, I've started using AWS SageMaker Canvas for a forecasting use case, thinking it's a free service for the first two months. But it ends up having a lot of hidden costs for the SageMaker domain instance that is required for Canvas to run. I am able to pause the user group and log out of the Canvas instance, but I'm not able to pause the SageMaker domain instance. Can you please help me sort this out? How do I pause the SageMaker domain?
We have a 2 TB FSx volume. It's billed at $30 a month, plus just over $75/mo for 32 MB/s of throughput capacity. Can I lower that? 32 seems to be the minimum.
We have a directory service that serves one server instance that's only used a few hours a month. It's billed 24/7 though, at almost $100/mo. It's only used to connect an FSx volume to one server. Can I lower that?
Thanks in advance :-) I'm in the UK zone.
So I'm definitely biting off more than I can chew here, I know.
I have this simple web app that connects to data stored in my OneDrive and displays dashboards for the C-suite and other employees to use. At least that's the target; right now the web app is just hosted on my local machine.
I ran a quick cost calculator on the AWS site and it's showing me around 4.5 dollars per month after the free tier is over. I'm highly sceptical right now because I've heard of people racking up huge bills.
I also would like a small database that stores when someone views the webpage at what time.. Expecting around 30 entries every day for 5 days a week... So 600 entries per month.
Could someone help me estimate the cost? 5 dollars per month seems way too cheap for AWS. I've also read some posts about people hosting a DB on an instance. How many instances will I need if I'm expecting around 30 visitors daily?
For reference as to why I'm so confused: I'm the only tech person (barely one year of experience, with a non-tech degree) and this is the first time I'm hosting anything. I did host another web app using PythonAnywhere, but that doesn't count because my company also wants to use www.dashboards@{company-name}.com.
I'm open to any and all suggestions.
Hi, I found an auto-refreshed proxy list on GitHub. Unfortunately, most of the proxies there only work with the HTTP protocol, regardless of the protocol of the proxies themselves. And the weirdest thing is that these proxies (the ones that only work over HTTP) are configured and run with Squid on Amazon EC2 servers. There are about 100 of them, all absolutely identical, raised via Squid and not working over HTTPS. Can you guess what their purpose might be? I'm really curious. It's just that 99% of web resources have already switched to HTTPS, so it's very strange.
Up until recently, I've avoided anomaly detection alarms, because I doubted their usefulness. Unfortunately, my first experience with them is reinforcing that assumption, but I'm wondering if I'm just doing something wrong.
We have some ALBs with very consistent traffic, and very consistent traffic patterns. A third-party had a misconfigured client that started sending a ton of traffic to the ALBs for several weeks. Not enough to cause any operational issue (i.e., caught by other alarms), but it cost some money. This seemed like a perfect case where AD could have spotted a sudden, sustained change in this otherwise normal metric.
I created the anomaly detector and alarm, and it has never behaved in a way that would be useful. This is true across both stag and prod, and in all regions. Each of those has equally consistent traffic/patterns, and for each, the AD fails to track the metrics in a meaningful way.
You can see on this chart that on the 16th, the model turns into spaghetti and basically stays in ALARM from that point on. The weird thing is that the 16th is when I created the AD. So it seems like when I created the AD, the historical model it came up with for the past was actually pretty OK, but all the new model data it has generated since then has been wrong/bad.
I've talked to support twice about this, and they always just say there aren't really any controls over the model, it can take some time to train, etc. I'm about ready to give up on this experiment, but wanted to see if anyone has actually seen these work as intended in real-world scenarios.
I wrote a step-by-step tutorial last week titled "How to handle bounces & complaints with AWS SES & SNS". It is a must to handle bounces and complaints if you ever want to get production access.
I thought it would be useful for some people here.
Anything you'd add?
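For anyone reading along, the core of such a setup is an SNS topic subscribed to the SES bounce and complaint notifications, with a consumer that suppresses the affected addresses. A minimal sketch of parsing the SNS-delivered notification; the field names follow the SES notification JSON, and the suppression step is left as a placeholder:

```typescript
// Parse an SES bounce/complaint notification (delivered via SNS) and collect
// the recipient addresses that should be suppressed from future sends.
interface SesNotification {
  notificationType: "Bounce" | "Complaint" | "Delivery";
  bounce?: { bounceType: string; bouncedRecipients: { emailAddress: string }[] };
  complaint?: { complainedRecipients: { emailAddress: string }[] };
}

function addressesToSuppress(messageJson: string): string[] {
  const n: SesNotification = JSON.parse(messageJson);
  // Only permanent (hard) bounces warrant suppression; transient ones don't.
  if (n.notificationType === "Bounce" && n.bounce && n.bounce.bounceType === "Permanent") {
    return n.bounce.bouncedRecipients.map((r) => r.emailAddress);
  }
  if (n.notificationType === "Complaint" && n.complaint) {
    return n.complaint.complainedRecipients.map((r) => r.emailAddress);
  }
  return []; // transient bounces and deliveries: nothing to suppress
}

const sample = JSON.stringify({
  notificationType: "Bounce",
  bounce: {
    bounceType: "Permanent",
    bouncedRecipients: [{ emailAddress: "bad@example.com" }],
  },
});
console.log(addressesToSuppress(sample)); // [ 'bad@example.com' ]
```

From there, the handler would write those addresses to a suppression store (or to the SES account-level suppression list) and check it before every send.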