/r/devops
/r/DevOps is a subreddit dedicated to the DevOps movement where we discuss upcoming technologies, meetups, conferences and everything that brings us together to build the future of IT systems
What is DevOps? Learn about it on our wiki!
All articles will require a short submission statement of 3-5 sentences.
Use the article title as the submission title. Do not editorialize the title or add your own commentary to the article title.
No vendor spam. Buy an ad from reddit instead.
Find a DevOps meetup near you!
Hi, I am looking to work under a DevOps engineer to improve my DevOps skills and gain hands-on experience.
While developing BuildBudget (a GitHub Actions cost analyzer), I discovered React Native's test workflow costs $20k/month - 80x more than typical workflows. The analysis reveals interesting patterns about matrix builds, macOS runners, and minute-based billing. Thought you might find the deep dive interesting.
Running containers with ECS? This article explores:
• Fargate vs. EC2 for capacity providers
• How to configure warm pools for faster scaling
• Common pitfalls (and how to avoid them)
Read here!
There’s an internal change management tool I’m required to use as a developer to track any code or infrastructure changes, whether in production or a testing environment. As an SRE, I frequently make small fixes across codebases, which means I need to create detailed records documenting every action taken to resolve issues.
The tool itself wouldn’t be so frustrating if it were intuitive and efficient, but mostly it's the opposite.
One major problem is the overly specific yet incomplete categorization system. While there are very granular options for some scenarios, they rarely apply to the situations I deal with, leaving me to default to vague categories like “Other” more often than I’d like.
Assigning changes to other teams is another headache. Instead of clear team names, the system relies on team codes, which don’t mean much unless you’re already familiar with them. To assign a record, I usually have to identify an individual team member and then backtrack to figure out their team name, a tedious and unnecessary step.
All of these issues make my experience worse than it was a few years ago when I wasn’t forced to rely on this tool. To make matters worse, the interface design is severely outdated. It looks like it belongs in the Windows 98 era, which doesn’t exactly help with usability or morale.
As an SRE in a GitOps world, do you still have to track your changes in parallel in another tracking system? Or do you use some magic git connector to Jira or other tools? Are there other people working with Kubernetes stuck in some tooling from the early 00's?
I have been applying to DevOps intern and entry-level positions for a while now (location: Nepal and remote), and I have not been able to secure interviews except at 2 places: one was for a senior role and the other was in India for bare minimum pay.
What should I do to improve my chances? I have no professional experience, but I have done projects using Jenkins to build CI/CD, AWS infrastructure setups using Terraform, and deployments on Kubernetes.
https://www.youtube.com/watch?v=ZJ09dPNJ8G8
This video aligns with the opinions I shared in the post titled 'Another DevOps Rant.'
https://www.reddit.com/r/devops/comments/1h5156p/just_another_devops_rant/
You're probably already familiar with this channel, but I recently discovered it.
If anyone is interested in diving deeper into the reasons behind this thinking, I believe you'll enjoy watching it.
Cheers!
We have 4 primary postgres databases - each has a hot standby.
Our applications allow customers to sign up to our service, and when they do, they will have a database created in one of these databases - which stores all their information and schemas they can interact with.
We opted for multiple Postgres databases because we believed this would spread the load and avoid a single point of failure.
However, as our databases grow in size, we just scale horizontally by adding another database (and hot standby). This is all done on VPSs, with each instance on its own server, and it can take some engineering time.
We have considered instead to just have one very large Primary and one very large hot standby. Is there any benefit to keeping the multiple postgres instances instead?
I've tried to get a breakdown of the cost, and I don't believe it would cost much less to go to just one instance.
Additionally, we can't do much with pooling because each customer goes to their respective database within their respective postgres cluster.
Whenever we add a new primary/standby pair, we point new customers to it, so that is where the newest customers are created.
Existing customers' data can be stored on any one of the four other Postgres databases, and we use application logic to find out their database instance before connecting them to it.
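For illustration, the lookup described above might look roughly like the sketch below. Everything in it is an assumption and not the poster's actual setup: the cluster names, the `example.internal` hostnames, the in-memory dictionaries (a real system would more likely keep this mapping in a small catalog database), and the convention that each customer's database is named after their tenant ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """One primary Postgres instance (its hot standby is not modelled here)."""
    name: str
    host: str
    port: int = 5432


# Hypothetical catalog of primaries; hostnames are placeholders.
CLUSTERS = {
    "pg1": Cluster("pg1", "pg1.example.internal"),
    "pg2": Cluster("pg2", "pg2.example.internal"),
    "pg3": Cluster("pg3", "pg3.example.internal"),
    "pg4": Cluster("pg4", "pg4.example.internal"),
}

# Hypothetical tenant-to-cluster mapping for existing customers.
TENANT_TO_CLUSTER = {"acme": "pg1", "globex": "pg3"}

# New sign-ups go to the most recently added primary.
NEWEST_CLUSTER = "pg4"


def dsn_for_tenant(tenant_id: str) -> str:
    """Return a connection URL for the cluster that holds this tenant's database."""
    cluster_name = TENANT_TO_CLUSTER.get(tenant_id)
    if cluster_name is None:
        raise KeyError(f"unknown tenant: {tenant_id}")
    cluster = CLUSTERS[cluster_name]
    # Assumption: the tenant's database is named after the tenant ID.
    return f"postgresql://{cluster.host}:{cluster.port}/{tenant_id}"


def register_tenant(tenant_id: str) -> str:
    """Assign a new tenant to the newest cluster and return its connection URL."""
    if tenant_id in TENANT_TO_CLUSTER:
        raise ValueError(f"tenant already exists: {tenant_id}")
    TENANT_TO_CLUSTER[tenant_id] = NEWEST_CLUSTER
    return dsn_for_tenant(tenant_id)
```

With a mapping like this, consolidating to one large primary would reduce the lookup to a constant, which is part of what the question is weighing.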
I recently started as a junior developer with less experience than my colleagues. Is it better to have your own VPS to play and experiment with, or is it better to experiment with different services provided by the same cloud service provider? (I'm willing to do the latter, even though getting your own server is better.)
Reveal your favorites: Alembic, Django Migrations, or something entirely different? Share the pros, cons, and extraordinary gems in the comments!
My favourite is Alembic, because it is a lightweight database migration tool for Python.
https://medium.com/@rasvihostings/alembic-is-a-lightweight-database-migration-tool-for-python-d84220c0e0dc
assuming that you did make this transition at some point.
Asking because I want to explain why I want to be an infrastructure engineer with my background in software dev and data science, other than the actual answer of "this is the position that responded to my application'
I recently saw https://modal.com/, which is a platform that provides an A10G (24 GB) GPU for $1.16/hour. I don't know how much AWS or Azure charges for the same GPU. I want to run an AI model on similar infrastructure, as it'll be idle most of the time. But I have a few questions on how to achieve this.
Hi guys, I’m an Indian who is currently living in Ireland. I have been hunting for a job for some time. Background: 1 year of work experience in India as an ML Engineer, and a recent master's in CS from UCD, graduated 2024. I only have experience in ML/AI in full-time and internship roles.
I’m currently interviewing for an MLOps engineer role in India at a reputable company; I’m sure a lot of you have heard about it or used its product. I have also been approached by Microsoft and IBM for SWE roles in Ireland.
Here’s my dilemma: I’m sure that I will get a result for MLOps engineer much sooner than my interviews will progress at Microsoft and IBM, and I will definitely be asked to move back to India before I get a response from Microsoft/IBM. I think MLOps is a great role for someone like me and due to personal reasons I do not mind shifting back to India. At the same time, I’m not that confident about SWE interviews because my DSA is not that strong so I might just lose a great opportunity as well. However, I’m also worried that my career path could get stuck at MLOps and I back myself into a corner with limited career options down the line, which compared with SWE opens up several options for me. Had both these interview processes been running at the same time, I would have decided to go with Microsoft/IBM, but given my current circumstance I don’t want to risk turning down MLOps with just a chance of maybe getting selected in SWE. Any advice?
I have been looking for a job change for some time but have been unable to get any calls. Please let me know if there's anything I should change in my resume. Link
Hey, we're a startup that is going to provide data solutions as a vendor. Which do you recommend: cloud or physical servers?
**Note that we are at the very beginning.**
The metrics I am focusing on:
If a physical server, which one do you recommend?
Hello r/DevOps! 👋
I’m working on a project to understand the challenges people face when trying to upskill for their careers and how they improve their qualifications. Whether you’re a working professional, a student, or someone exploring new skills, your input will be invaluable in shaping resources that can make learning and growth easier for everyone.
If you’ve ever struggled with finding time, motivation, or the right learning platform, this quick survey is for you! It’ll only take 5 minutes of your time, and your feedback will directly contribute to making upskilling more accessible and effective.
https://forms.gle/B1ej715JQYzcEDE88
Your voice matters! Feel free to share this post with anyone in your circle who might be interested. Thank you in advance for helping out! 😊
P.S. If you are a teacher or instructor, I would love to have a quick chat with you. Please reach out in DMs.
Currently, my leaders are asking about DORA metrics for better visualization of process improvement, but the system ultimately being deployed to is literally a remote, disconnected system. I can measure lead time to release, but not deploy because all development is required to be bundled to a pre-staging area, where once every quarter that bundle is final tested before being physically transported to a remote location for on-site testing and deployment.
By definition this makes our deployment frequency every 3 months and completely throws off attempting to track change failure rate and TTR. The entire system of DORA is pretty much based around attaining a certain volume of deployability, and then working to improve that.
My question is: Is there a way that DORA metrics are still usable in a situation like this, or is there a different set of industry-recommended metrics I haven't heard about that I could offer instead?
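For context, the one number described above as measurable, lead time to release, could be tracked with something as small as the sketch below. It is only an illustration under assumptions: "release" is taken to mean the date a change lands in the pre-staging bundle, and the sample dates are made up.

```python
from datetime import date
from statistics import median

# Hypothetical sample data: (commit date, date the change was bundled to pre-staging).
changes = [
    (date(2024, 1, 10), date(2024, 1, 12)),
    (date(2024, 2, 1), date(2024, 2, 8)),
    (date(2024, 3, 5), date(2024, 3, 6)),
]


def lead_times_days(changes):
    """Days from commit to bundling for each change."""
    return [(bundled - committed).days for committed, bundled in changes]


def median_lead_time_days(changes):
    """Median lead time to release, in days."""
    return median(lead_times_days(changes))
```

This measures only the connected part of the pipeline; the quarterly transport and on-site deployment stay outside it.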
What other cloud services have 100% SLA?
As the title says, I've been in the industry about 14 years now. Left my last position unexpectedly due to some restructuring. Landed a new role as a Manager for a DevOps/Platform team initiative the company is expanding out.
I'm here to answer questions or offer advice to anyone who might need it or who feels discouraged in their search.
To put this into perspective: I'm 37 and have 0 experience as a manager; my last two roles were senior and lead roles. I've worked in multiple industries (healthcare, media, fintech, retail sales, MSP). I'm not particularly "amazing" at anything, but I have a pretty wide skill set and learn things very quickly when self-tasked.
Got some questions? Need some pointers or advice? Ask away, I'll see if I can help.
I keep reading posts emphasizing the need for newbies to have good knowledge of networking, so can you guys share any resources or guidelines for learning it? Thanks.
It's just as the title says: I'm looking for some resources to learn and gain some hands-on experience in Puppet. Please suggest some good resources and roadmaps.
Thank you!
Hey r/devops,
I wanted to share my recent experience as a DevOps professional navigating the job market, in hopes it resonates with some of you and maybe even sparks a conversation.
Currently, I’m employed as a DevOps Engineer and have been working in the field for about 1.5 years. Due to recent circumstances, I decided to explore new opportunities and aim higher. I even cleared an interview for a role that required 3 years of experience—a milestone that felt validating for my skills and growth.
I made it through multiple rounds of interviews for a promising role, only to get rejected at the final stage. What stings more is that I was honest about my unconventional background—pursuing a BSCS while already working in DevOps. My technical skills and experience were enough to pass every challenge they threw at me, but in the end, my degree (or lack thereof) became the deal-breaker.
Here’s a bit of my background: Before transitioning into tech, I was pursuing Chartered Accountancy (CA). Life took unexpected turns, and I had to pivot. The skills I gained from that journey—discipline, analytical thinking, and resilience—have shaped who I am today.
But for some companies, that time doesn’t count as "relevant experience." It’s frustrating because by the time I graduate, I’ll have around 3 years of solid hands-on DevOps experience, yet many organizations may still see me as a “fresher.”
Despite the setback, I’m not giving up, but to be honest, it stings. Rejections like these remind me why I chose this field in the first place: continuous learning, problem-solving, and the thrill of building scalable systems. I know my worth, and I sincerely hope there’s a company out there that values skills and grit over paperwork.
To my fellow DevOps enthusiasts: Have any of you faced similar challenges? How do you navigate hiring processes when your path isn’t “traditional”? I’d love to hear your stories or any advice you have to share.
Thanks for reading. Here’s to bouncing back stronger and finding the right fit!
With AI-driven tools optimizing CI/CD pipelines, automating infrastructure management, and improving monitoring systems, it’s clear that AI is enhancing efficiency in DevOps. However, I’m curious to hear the community's thoughts on the potential downsides. Do you see AI as a threat to traditional DevOps roles, or is it more of an opportunity to evolve and focus on higher-level responsibilities? What specific skills should DevOps professionals focus on to stay relevant in this AI-driven future?
Helm charts have simplified high-level application deployment on Kubernetes. This article covers how a Helm chart can be created, packaged, and hosted on GitHub.
Helm helps with the simultaneous deployment of several applications, which is a distinct advantage in today’s environments.
https://medium.com/@rasvihostings/host-a-helm-chart-on-github-0012db444670
When we think about a large-volume streaming data pipeline, three things come to mind.
I designed a solution that scales easily, uses GCP managed services as much as possible, and reduces the cloud cost 😉
https://medium.com/@rasvihostings/fraud-detection-data-pipeline-etl-on-gcp-2b15b8f3d65b
Hi everyone,
I have some funding left from my company and I am not sure what I should spend it on. I realize the tech world has shifted to running almost everything in Kubernetes for microservices, and as a result less and less is deployed directly on systems.
I am planning to get the CKAD next year (I already have the CKA) using those funds, but I think I would still have around 500 CAD left to use.
Is there any good course I can take to learn Cilium/Argo CD? I have played with them in my homelab before. Are there any good enterprise-level courses I could take to learn the 'right' way to do so?
Thank you for your suggestion!
Can someone suggest any popular or recommended resources that are FREE for learning Jenkins?
I have 4 months off work due to childbirth. I found a bootcamp that I wanted to do. I’ve done sql, python, JavaScript before.
Should I sign up for a bootcamp? I have full stack or back end as options… OR, as someone with this experience, should I just:
Get on Codecademy, do a bunch of classes, and build my own portfolio?
Thanks y’all.
And I’ve been in tech for 4 years, but for the past two years it’s been more like networking and cloud infrastructure.
Just trying to figure out what to do with my brain while I’m off.
Hi there - looking for some recommendations. We're looking to move a large amount of our storage to the cloud and would like to split it up across 2, 3 or even 4 physical locations (globally). Looking at nearly 0.5 PB in total.
Due to the very active nature of the data being stored, products like AWS Glacier are not an option.
Azure, AWS and similar vendors seem to be very, very expensive at this level, so I'm looking for alternatives. It could be a single global operation or a couple of smaller regional ones. We really just need Linux VMs with limited compute but access to large amounts of cheap yet relatively fast storage.
Any vendor recommendations?
Probably a stupid question
I'm looking at
But am I purchasing the CKAD exam, or something called LFD259?
I want to do the CKAD, and want to make sure this is actually the CKAD and not something else.
Could someone let me know if this is actually the CKAD, or is it just training for it, or something else?