/r/devops

Welcome to /r/DevOps

/r/DevOps is a subreddit dedicated to the DevOps movement where we discuss upcoming technologies, meetups, conferences and everything that brings us together to build the future of IT systems

What is DevOps? Learn about it on our wiki!

Traffic stats & metrics

Rules and guidelines

Be excellent to each other!

All articles will require a short submission statement of 3-5 sentences.

Use the article title as the submission title. Do not editorialize the title or add your own commentary to the article title.

Follow the rules of reddit

Follow the reddiquette

No editorialized titles.

No vendor spam. Buy an ad from reddit instead.

Job postings here

More details here

Social & Fun

@reddit_DevOps

##DevOps @ irc.freenode.net

Find a DevOps meetup near you!

Icons info!

General Information

https://github.com/Leo-G/DevopsWiki


321,867 Subscribers

1

How do I deploy a Flask server on a WSL machine?

I have a Flask REST API that handles POST requests and responds with image HTML templates. What would be a more introductory approach to deploying it just for light stuff, like showing it off to some friends or a potential employer coming across my portfolio? Nothing fancy, no frills, just serve the HTML on request. I should mention that I'm running Windows 10 Enterprise and I don't think I can port the whole thing to Linux; setting up all the dependencies took days and it was quite a headache. I have been trying to work with WSL lately, albeit unsuccessfully ...

0 Comments
2024/05/02
17:46 UTC

1

Need advice on how to build my portfolio as a junior cloud/devops engineer

I'm about to graduate from an engineering school within the French system. I'm also in an end-of-studies internship (pretty much doing automation tasks and CI/CD pipelines: Ansible, GitLab...). I think this is the time I should be looking for a job, as my internship is about to end (mid-July), and I'm confused about how to build a portfolio to apply for jobs on LinkedIn. I'm mainly going to apply for jobs in Germany/Austria/France. These are the tools that I know: Docker/K8s/Ansible/Jenkins/GitLab, and a little bit of Terraform. As for cloud providers, I only know how to work with AWS, as most of the tests and practice were on it.

Any advice would be appreciated!

6 Comments
2024/05/02
17:04 UTC

4

Specializing within DevOps

There really is too much to know these days, what areas are there to specialize in?

My thoughts:

Kubernetes - I can see why some engineers love it. An awesome paradigm at the base layer and so much interesting built on top of it.

Observability - almost a science in itself and plenty to get into (or related to) be it monitoring, alerting, analytics, service management.

Platform management - building out a consumable platform, kinda like being a developer for developers.

Architect - the problem I have with this is that developers are going to have their own software architects doing system design that may already overlap with the infra side. Also, many expect engineers to have software architect skills anyway. So where does that leave the cloud/DevOps architect? I feel there is not much mileage in this path.

Any others? As each year passes, the more I think it is not a good idea to stay in the middle as a generalist and that it is time to pick a path.

0 Comments
2024/05/02
16:52 UTC

0

Should You Use Comment Prefixes for Code Reviews?

At Doppler, we use comment prefixes when we review code. We've found that this helps streamline our review process and improve our team's communication. Does your team do something similar? If so, how do you do it differently (or why not)?

I wrote a short blog on what exactly we do: https://www.doppler.com/blog/code-review-comment-prefixes-for-clearer-feedback

6 Comments
2024/05/02
16:40 UTC

1

Help with alertmanager & webex

Hello colleagues,

Does anyone have experience migrating Alertmanager alerts to Webex Teams? We are currently in transition from Slack to Webex (don't ask me why) and we are migrating all of the Slack alerts/notifications to Webex. This is the relevant part of our current Alertmanager configuration:

....
receivers:
  - name: default
  - name: alerts_webex
    webex_configs:
      - api_url: 'https://webexapis.com/v1/messages'
        room_id: '..............'
        send_resolved: false
        http_config:
          proxy_url: ..............
          authorization:
            type: 'Bearer'
            credentials: '..............'
        message: |-
          {{ if .Alerts }}
            {{ range .Alerts }}
              "**[{{ .Status | upper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] Event Notification**\n\n**Severity:** {{ .Labels.severity }}\n**Alert:** {{ .Annotations.summary }}\n**Message:** {{ .Annotations.message }}\n**Graph:** [Graph URL]({{ .GeneratorURL }})\n**Dashboard:** [Dashboard URL]({{ .Annotations.dashboardurl }})\n**Details:**\n{{ range .Labels.SortedPairs }} • **{{ .Name }}:** {{ .Value }}\n{{ end }}"
            {{ end }}
          {{ end }}
....

The problem is that we get a 400 error from Alertmanager:

msg="Notify for alerts failed" num_alerts=2 err="alerts_webex/webex[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 400: {\"message\":\"One of the following must be non-empty: text, file, or meetingId\",\"errors\":[{\"description\":\"One of the following must be non-empty: text, file, or meetingId\"}],\"trackingId\":\"ROUTERGW_......\"}"

The connection works, since simple messages are sent; however, these "real" messages are dropped. We also thought about using webhook_configs, but the payload can't be modified (without a proxy in the middle).

Anyone with experience with this issue? Thanks

0 Comments
2024/05/02
15:57 UTC

0

Really need a DevOps or Cloud job in India or remote foreign work: Any referrals?

I have just 10 months of DevOps Engineer experience. I know that's not much, but I'm enthusiastic and passionate about the Linux and DevOps world. I've been using Linux for more than 5 years and I'm quite good with it.

Not only that, in my short professional career (10 months), I've delivered a production-grade OpenStack setup (private cloud) for my current company, created an Ansible project for their complex clustered environment setup, automated AWS infra using Terraform, improved their existing Jenkins pipeline and Dockerfile, and done much other work.

Many times, I've quickly figured out and solved production issues that were affecting their business and that their developers and team leads with 10+ years of experience were not able to find. I'm not exaggerating, just telling the truth.

Their entire DevOps dependency is on me, and I've delivered more than they expected. But the appraisal I got was not even peanuts. Even peanuts are costly.

I really need a DevOps, Cloud, or Linux related job. I have just 10 months of experience, but please judge me on my skills. Preferred location is Delhi NCR, but I'm open to all of India.

Thanks for reading.

5 Comments
2024/05/02
15:45 UTC

1

k8sAI - my open-source GPT CLI tool for Kubernetes

I wanted to share an open-source project I’ve been working on called k8sAI. It’s a personal AI Kubernetes expert that can answer questions about your cluster, suggest commands, and even execute relevant kubectl commands to help diagnose and suggest fixes to your cluster, all in the CLI!

As a relative newcomer to k8s, this tool has really streamlined my workflow. I can ask questions about my cluster, k8sAI will run kubectl commands to gather info, and then answer those questions. It’s also found several issues in my cluster for me - all I’ve had to do is point it in the right direction. I’ve really enjoyed making and using this so I thought it could be useful for others. Added bonus is that you don’t need to copy and paste into ChatGPT anymore!

k8sAI operates with read-only kubectl commands to make sure your cluster stays safe.
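
To illustrate what restricting a tool to read-only kubectl commands could look like, here is a minimal sketch. It is not k8sAI's actual implementation: the allowlist `READ_ONLY_VERBS` and the helper `is_read_only` are hypothetical names chosen for this example, and the check is deliberately conservative (the verb must come right after `kubectl`, and anything containing shell metacharacters is rejected).

```python
import shlex

# Hypothetical allowlist of kubectl verbs that only read cluster state.
READ_ONLY_VERBS = {
    "get", "describe", "logs", "top",
    "explain", "version", "api-resources", "cluster-info",
}

# Characters that could chain or redirect commands if a shell ran the string.
SHELL_METACHARACTERS = set(";|&<>`$\n")


def is_read_only(command: str) -> bool:
    """Return True only for a single kubectl call whose verb is allowlisted."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes and similar parse errors
    if len(tokens) < 2 or tokens[0] != "kubectl":
        return False
    # Conservative: the verb must be the token directly after `kubectl`.
    return tokens[1] in READ_ONLY_VERBS


if __name__ == "__main__":
    for cmd in ("kubectl get pods -n test", "kubectl delete pod web-0"):
        print(cmd, "->", "allowed" if is_read_only(cmd) else "blocked")
```

An allowlist is used rather than a blocklist so that any verb the guard does not know about is blocked by default.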

All you need is an OpenAI API key and a valid kubectl config. Start chatting with k8sAI using:

$ pip install k8sAI
$ k8sAI chat

or to fix an issue:

$ k8sAI fix -p="take a look at the failing pod in the test namespace"

Would love to get any feedback you guys have!

Here's the repo for anyone who wants to take a look

0 Comments
2024/05/02
15:45 UTC

3

Automate All The Things DevOps project now with GitHub Actions

As suggested by someone from this community, I've moved the pipelines on Automate All The Things from Azure Devops to GitHub Actions.

I didn’t know GitHub Actions was free when the repo is public. This makes it SOO much easier to get started with the project, so thank you stranger who suggested this!

https://github.com/tferrari92/automate-all-the-things

The Azure DevOps version is still there in its own branch.

Any other feedback or suggestions are always welcome! There's always room for improvement.

Also.. Nirvana Edition with Backstage.io is coming soon!

6 Comments
2024/05/02
15:36 UTC

0

How to fix Kubernetes CrashLoopBackOff

Have you ever deployed a pod in Kubernetes, only to watch in horror as it gets stuck in an endless restart loop?

You're not alone. The "CrashLoopBackOff" error strikes fear into the hearts of Kubernetes users everywhere. When a pod enters this state, it's trapped in a cycle of crashing, restarting, and crashing again.

But what exactly causes this dreaded CrashLoopBackOff? And more importantly, how can you troubleshoot and resolve it to get your pods running smoothly again?

In this post, we'll walk through the key insights you need to break free from Kubernetes restart hell.
https://www.perfectscale.io/blog/kubernetes-crashloopbackoff-an-ultimate-guide
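
For context on the restart cycle described above: by default the kubelet restarts a crashing container with an exponential back-off delay (10s, 20s, 40s, ...) capped at five minutes. The sketch below only reproduces that documented default schedule; the function name `backoff_delays` and its parameters are made up for illustration, and the values may differ on clusters configured otherwise.

```python
def backoff_delays(restarts: int, base: int = 10, cap: int = 300) -> list[int]:
    """Delay in seconds before each of the first `restarts` restarts.

    Doubles from `base` on every restart and never exceeds `cap`.
    """
    return [min(base * 2 ** attempt, cap) for attempt in range(restarts)]


if __name__ == "__main__":
    print(backoff_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

After a handful of crashes the pod spends most of its time waiting rather than restarting, which is why a CrashLoopBackOff pod can look idle for minutes at a time.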

8 Comments
2024/05/02
15:11 UTC

0

Looking for a gaming dev in Norway

As the title says, anyone here?

1 Comment
2024/05/02
15:07 UTC

4

Generating IaC with drag-and-drop interface

Hello everyone, a couple of months ago, I wrote here to ask for your opinion on the tool I've developed, which allows you to generate IaC from a drag-and-drop interface. I've implemented several suggestions I received, including extending the number of components (the tool now covers all AWS RDS offerings except Oracle), adding VPC endpoint support, and improving architecture validation.

It would be great if you could check it out and maybe suggest some more features it's missing: https://app.archformation.com/

7 Comments
2024/05/02
14:45 UTC

1

grafana agent and k8s attributes

I've been trying to find a way around this, but not really coming up with much. We run Grafana Agent as a DaemonSet in flow mode in our EKS cluster. All our apps send their OpenTelemetry traces to it, which are then forwarded into Tempo.

We have a step in the pipeline, which I think should be adding k8s attributes to the spans, so they can be more easily searched in Tempo.

otelcol.processor.k8sattributes "default" {
  extract {
    label {
      from      = "pod"
      key_regex = "(.*)/(.*)"
      tag_name  = "$1.$2"
    }

    metadata = [
      "k8s.namespace.name",
      "k8s.deployment.name",
      "k8s.statefulset.name",
      "k8s.daemonset.name",
      "k8s.cronjob.name",
      "k8s.job.name",
      "k8s.node.name",
      "k8s.pod.name",
      "k8s.pod.uid",
      "k8s.pod.start_time",
    ]
  }

  output {
    metrics = [otelcol.processor.memory_limiter.default.input]
    logs    = [otelcol.processor.memory_limiter.default.input]
    traces  = [otelcol.processor.memory_limiter.default.input]
  }
}

The problem is this just ends up with all of the attributes reflecting the grafana agent, not the actual source of the span.

E.g. an app in the qa1 namespace sends a trace, and it ends up in Tempo with "k8s.namespace.name=grafana-agent".

Here is an example: this is an ebs-csi pod, in the namespace ebs-csi, sending a trace to OpenTelemetry.

app.kubernetes.io.instance: "grafana-agent"
app.kubernetes.io.name: "grafana-agent"
container.id: "230dcd1933a94746a2b75f73fe22e9f92772c15ec014eb04287ff0fab5ee4caf"
host.name: "ebs-csi-node-s48hp"
k8s.daemonset.name: "grafana-agent"
k8s.namespace.name: "grafana-agent"
k8s.node.name: "i-018bb590272435fe2.us-gov-west-1.compute.internal"
k8s.pod.ip: "10.2.30.17"
k8s.pod.name: "grafana-agent-2dwwl"
k8s.pod.start_time: "2024-05-01 21:44:10 +0000 UTC"
k8s.pod.uid: "9854add6-4f12-4f42-80bd-0567d8934a01"
linkerd.io.control-plane-ns: "linkerd"
linkerd.io.proxy-daemonset: "grafana-agent"
linkerd.io.workload-ns: "grafana-agent"
os.description: "Amazon Linux 2023 (Linux ebs-csi-node-s48hp 6.1.82 #1 SMP PREEMPT_DYNAMIC Fri Apr  5 22:26:15 UTC 2024 x86_64)"
os.type: "linux"
process.command_args: ["/bin/aws-ebs-csi-driver", "node", "--endpoint=unix:/csi/csi.sock", "--logging-format=text", "--v=2", "--enable-otel-tracing=true"]
process.executable.name: "aws-ebs-csi-driver"
process.executable.path: "/usr/bin/aws-ebs-csi-driver"
process.owner: "root"
process.pid: 1
process.runtime.description: "go version go1.22.2 linux/amd64"
process.runtime.name: "go"
process.runtime.version: "go1.22.2"
service.name: "ebs-csi-controller"

Most of the relevant details you would use to search show up as the grafana-agent, which makes looking this up difficult, especially if you have multiple deployments of the same app(s). Anyone have any idea where I'm going wrong?

4 Comments
2024/05/02
14:04 UTC

3

LGTM Stack VS Google Cloud Operations Suite

Hello! I was wondering if anyone had any experience using either of these. Right now I have a project with a company to essentially improve the log management they use. It's a large enterprise-level company, but the team itself and the application they use are for internal staff, and it creates around 80-100GB of logs per week. It's hosted on a Kubernetes cluster.

They're currently using Google Cloud Operations Suite with FluentBit as the log shipper, where logs are sent to Cloud Logging. Metrics are monitored with Prometheus and there's no tracing. Alerts are also dealt with through Google Alerts.

I essentially wanted to implement the LGTM stack considering this has very good integration with Kubernetes running in microservices mode - I can configure tracing through Tempo and OpenTelemetry and also set up metrics through Prometheus for an observability stack showing logs, metrics and traces in Grafana.

However, after a lot of research I still can't quite figure out whether this implementation would actually improve anything on their end. There's no real information on Loki/LGTM stack vs GC Operations Suite and I don't know if there would be any big differences in the cost/speed/resources/performance/etc. Is Loki better than Google Cloud Logging at what it does? Are Grafana Alerts better than Google Alerts? Are there alternatives I can use instead? It's a big company, so the actual costs of the additional resources really don't matter as long as the solution works.

Thank you for any advice you can give me on this!

2 Comments
2024/05/02
13:42 UTC

4

Scaling Observability & OpenTelemetry @ Skyscanner

Hey everyone👋

If you're in London, UK next week and interested in observability & OpenTelemetry, I think you'll enjoy this edition of the Observability Engineering Meetup.

Who: Dan is the Observability lead at Skyscanner, a member of the OpenTelemetry Governance Committee, and the author of "Practical OpenTelemetry: Adopting Open Observability Standards Across Your Organization."

What: Dan will share some of his experiences leading an observability transformation at Skyscanner, from custom solutions to telemetry standards and from a root cause analysis based on intuition and past experience to one based on context and evidence.

If you can't make it, we'll record the talk and post it on this YouTube channel.

1 Comment
2024/05/02
09:34 UTC

0

Review my resume: I am not getting any interview calls

Please roast my resume. Be detailed and tell me which skills I am missing, with suggested learning sources if you can, so I can jump my package to 14 to 20 LPA or more in India.

26 Comments
2024/05/02
08:53 UTC

0

How to monitor the CO2 emissions of your AWS application

The US and European Union have set ambitious goals to reduce emissions by 40% and 55% by 2030. Yet, many companies lack solid strategies for making their tech stack more sustainable.

Discover innovative methods to create a greener future with Kubernetes clusters without sacrificing application performance. https://www.perfectscale.io/blog/what-is-the-carbon-impact-of-kubernetes

0 Comments
2024/05/02
08:46 UTC

0

Is system design needed for a DevOps engineer?

Hello, how much system design is needed for DevOps engineers?

Please share your experience.

I am currently reading Alex Xu's system design interview prep book. Is it good?

I want to switch jobs soon, so I need your input on this topic.

Dear highly paid DevOps engineers, please share your experience of how to become a highly paid DevOps engineer, and with which skill set.

6 Comments
2024/05/02
08:38 UTC

5

Question on Infrastructure-as-Code - How do you promote from dev to prod?

How do you manage the changes in Infrastructure as code, with respect to testing before putting into production? Production infra might differ a lot from the lower environments. Sometimes the infra component we are making a change to, may not even exist on a non-prod environment.

19 Comments
2024/05/02
07:21 UTC

0

AI in Infrastructure

Has anyone implemented AI in infrastructure provisioning? If so, how beneficial has it been for your operations? #shareit

6 Comments
2024/05/02
06:02 UTC

1

RF, XRAY and Bitbucket Integration

Good day.

Does anyone know a step-by-step guide/link for integrating Robot Framework tests housed in Bitbucket with Xray + Jira? The Xray documentation doesn't really help much. My experience is more in test scripting, so this kind of setup is new to me. Thank you in advance.

0 Comments
2024/05/02
05:23 UTC

0

Trying to move into DevOps - Need help reviewing resume

Hi everyone, I currently work remotely at a small business that does physical labor contracting, and I maintain their website and everything considered IT within the business. I have basically automated everything to the point where I have nothing to do all day other than study. I want to try to get into the DevOps field and need help peer reviewing my resume.

https://imgur.com/a/ARYgTEn

Also, please let me know if any changes are necessary and why, so I can learn and improve on making/editing my resume. Thanks in advance for all your feedback!

2 Comments
2024/05/01
23:50 UTC

27

Should I monitor my monitoring stack?

(🤯)
How do you ensure that your monitoring stack is working as expected and you didn't mess up the config?

If you're using a SaaS (Grafana Cloud, Datadog, whatever), do you have another solution that will alert you in case of an outage?

Maybe it's just that there's no simple solution that's worth the effort. ¯\_(ツ)_/¯

41 Comments
2024/05/01
22:38 UTC

0

How to practice CI/CD?

Hello, I am new to DevOps. I have been watching some videos on YouTube on how to get started (mostly videos on GitLab). So far I have watched the TechWorld with Nana 1-hour video and the Automation Step by Step playlist on GitLab. My question is: how should I practice CI/CD? Any other resources, preferably free, are welcome too.

13 Comments
2024/05/01
22:10 UTC

0

Which cloud provider offers free credits on NVIDIA GPU instances?

I want to test and train my AI model on a GPU VM, but when trying Azure or Google Cloud, they don't allow using free credits on GPU instances (at least the ones I'm interested in). Is there any provider I could use, or will I need to pay for this kind of machine out of my own pocket?

3 Comments
2024/05/01
18:12 UTC

1

APM for React

What APM do you use for a React application?

We already have Grafana Tempo, and we don't want to install Elasticsearch just for React metrics.

0 Comments
2024/05/01
18:05 UTC

4

GitOps - how to generate manifests in CI step and apply in CD step?

I'm learning CI/CD with ArgoCD and Argo Workflows, and I want to set up a system for devs where they can deploy new apps just by providing a link to a repository.

The workflow I have in mind is:

  1. Build an image for the code and push it to a registry
  2. Generate manifest files for a new deployment
  3. Push the files to the ArgoCD repository.
  4. Let ArgoCD reconcile the manifests and deploy the app.

What I'm not sure about is how to proceed with steps 2 and 3. I've searched a lot and found many examples for updating an image tag in an existing manifest with things like ArgoCD image updater, but nothing about adding new manifests

What would be the best way to do this? Am I supposed to have the workflow commit the manifests to git? Something like this?

- name: Prepare Kubernetes manifests
  run: |
    # Example: dynamically create a Helm values file if it doesn't exist
    mkdir -p "chart/${{ matrix.app }}"  # the chart directory may not exist yet for a new app
    if [ ! -f "chart/${{ matrix.app }}/values.yaml" ]; then
      echo "image: myregistry/${{ matrix.app }}:${{ github.sha }}" > "chart/${{ matrix.app }}/values.yaml"
    fi

- name: Commit and push changes
  run: |
    git config user.name 'GitHub Actions'
    git config user.email 'actions@github.com'
    git add .
    # Commit only when something is staged; a bare `git commit` fails on a clean tree
    git diff --cached --quiet || git commit -m "Deploying new version of ${{ matrix.app }} - ${{ github.sha }}"
    git push
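
Steps 2 and 3 of the workflow above (generate manifests for a new deployment, then place them in the repository that ArgoCD watches) could be sketched roughly as follows. This is only an illustration under assumptions: the `apps/<app>/deployment.json` layout, the function names, and the image name are hypothetical, and the manifest is written as JSON purely to avoid a YAML dependency, so check that your ArgoCD application source picks up `.json` files or convert to YAML.

```python
import json
import tempfile
from pathlib import Path


def deployment_manifest(app: str, image: str, replicas: int = 1) -> dict:
    """Build a minimal apps/v1 Deployment for a brand-new app."""
    labels = {"app": app}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": app, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": app, "image": image}]},
            },
        },
    }


def write_manifest(repo_root: Path, app: str, image: str) -> Path:
    """Write the manifest into a checkout of the GitOps repo (hypothetical layout)."""
    target = repo_root / "apps" / app / "deployment.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(deployment_manifest(app, image), indent=2) + "\n")
    return target


if __name__ == "__main__":
    # A temporary directory stands in for the cloned GitOps repository.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_manifest(Path(tmp), "demo-app", "myregistry/demo-app:abc123")
        print(path.read_text())
```

A CI step would run something like this inside a clone of the GitOps repository and then commit and push the new file, leaving ArgoCD to reconcile it as in step 4.
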
2 Comments
2024/05/01
18:01 UTC

153

DevOps School Encourages To Fake Experience

Hello, there is a DevOps school in Chicagoland that encourages its students to fake their experience. I mean they literally tell you to pick a company and add 5-8 years of fake experience. They have a fake staffing company with backup phone numbers in case companies try to verify the experience. Are there any ways to stop this nonsense?
Don't get me wrong, but it creates very unfair conditions for people who are just starting in DevOps or don't want to fake their resumes.

66 Comments
2024/05/01
17:50 UTC

0

With all of the recent news around HashiCorp

I thought some users looking for an alternative would appreciate seeing this from a former HashiCorp Solutions Engineer.

In full transparency, I work for a competitor r/akeyless

7 Comments
2024/05/01
17:31 UTC

2

Resources to learn Kubernetes

Hello everyone, I wanted to learn K8s, so I visited the documentation and was overwhelmed. Can you please help me with some resources that would help me? Maybe a course on Udemy or something?

Thank you in advance

12 Comments
2024/05/01
16:28 UTC

1

MacinCloud replacement

Hi everyone, we are currently using MacinCloud's services to run a dedicated Mac server, as we have crucial software that is available only on iOS.

We have multiple users who log in via VNC in order to work simultaneously, and we are currently experiencing some issues with MacinCloud.

I am looking for a new Mac server cloud provider and would like recommendations.

I saw MacStadium and Macweb. Does anyone have any previous experience with them?

5 Comments
2024/05/01
12:54 UTC
