/r/devops

Welcome to /r/DevOps

/r/DevOps is a subreddit dedicated to the DevOps movement where we discuss upcoming technologies, meetups, conferences and everything that brings us together to build the future of IT systems

What is DevOps? Learn about it on our wiki!

Rules and guidelines

Be excellent to each other!

All articles will require a short submission statement of 3-5 sentences.

Use the article title as the submission title. Do not editorialize the title or add your own commentary to the article title.

Follow the rules of reddit

Follow the reddiquette

No editorialized titles.

No vendor spam. Buy an ad from reddit instead.

Social & Fun

@reddit_DevOps

##DevOps @ irc.freenode.net

Find a DevOps meetup near you!

General Information

https://github.com/Leo-G/DevopsWiki

321,729 Subscribers

0

Where do I draw the line between Mid- and Senior-level effort, so that I don't do more work than I'm being paid for?

I recently applied for a senior-level role at a company. They ended up making me an offer, but at a lower salary than I was expecting (it's a pay cut, but it's not exactly crap pay either). I also noticed that my title in the HR system does not include "Senior".

I'm the only full-time infra engineer.

I'm totally capable of going into the company and leading all efforts for their current and future needs, but apparently they're neither giving me the title nor paying me to do so.

I've been trying to read through generalized engineering levels on the internet, but most are written about software developers.

So, I'm asking here, too:

How much effort and responsibility would you limit yourself to, and what boundaries would you hold, to make sure you weren't doing senior-level work for mid-level pay and title? Thanks!

1 Comment
2024/05/01
22:41 UTC

1

Should I monitor my monitoring stack?

(🤯)
How do you ensure that your monitoring stack is working as expected and that you didn't mess up the config?

If you're using a SaaS (Grafana Cloud, Datadog, whatever), do you have another solution that will alert you in case of an outage?

Maybe it's just that there's no simple solution that's worth the effort. ¯\_(ツ)_/¯
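
One common answer to this question is a "dead man's switch": the monitoring stack sends a heartbeat on a schedule to something outside itself, and an independent checker alerts when the heartbeat goes stale. Below is a minimal sketch of the checker side only. The `HeartbeatWatcher` class and the 120-second tolerance are assumptions for illustration, not something from the post or from any particular product.

```python
import time
from typing import Optional


class HeartbeatWatcher:
    """Tracks the last heartbeat received from a monitoring stack (hypothetical helper)."""

    def __init__(self, max_silence_seconds: float = 120.0) -> None:
        # Assumed tolerance: how long the stack may stay silent before we alert.
        self.max_silence_seconds = max_silence_seconds
        self.last_seen: Optional[float] = None

    def record_heartbeat(self, now: Optional[float] = None) -> None:
        # Called whenever the monitoring stack checks in.
        self.last_seen = time.time() if now is None else now

    def is_stale(self, now: Optional[float] = None) -> bool:
        # No heartbeat at all counts as stale, so a broken setup alerts too.
        current = time.time() if now is None else now
        if self.last_seen is None:
            return True
        return (current - self.last_seen) > self.max_silence_seconds
```

The checker has to run somewhere that does not share a failure domain with the stack it watches, otherwise both can go down together without anyone noticing.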

1 Comment
2024/05/01
22:38 UTC

1

How to practice CI/CD?

Hello, I am new to DevOps. I have been watching some videos on YouTube on how to get started (mostly videos on GitLab). So far I have watched the 1-hour TechWorld with Nana video and the Automation Step by Step playlist on GitLab. My question is: how should I practice CI/CD? Any other resources, preferably free, would also be welcome.

6 Comments
2024/05/01
22:10 UTC

0

Which cloud provider offers free credits for NVIDIA GPU instances?

I want to test and train my AI model on a GPU VM, but Azure and Google Cloud don't allow free credits to be used on GPU instances (at least not on the ones I'm interested in). Is there any provider I could use, or will I need to pay for this kind of machine out of my own pocket?

3 Comments
2024/05/01
18:12 UTC

0

APM for React

What APM do you use for React applications?

We already have Grafana Tempo, and we don't want to install Elasticsearch just for React metrics.

0 Comments
2024/05/01
18:05 UTC

2

GitOps - how to generate manifests in CI step and apply in CD step?

I'm learning CI/CD with ArgoCD and Argo Workflows, and I want to set up a system for devs where they can deploy new apps just by providing a link to a repository.

The workflow I have in mind is:

  1. Build an image for the code and push it to a registry
  2. Generate manifest files for a new deployment
  3. Push the files to the ArgoCD repository.
  4. Let ArgoCD reconcile the manifests and deploy the app.

What I'm not sure about is how to proceed with steps 2 and 3. I've searched a lot and found many examples for updating an image tag in an existing manifest with things like ArgoCD image updater, but nothing about adding new manifests.

What would be the best way to do this? Am I supposed to have the workflow commit the manifests to git? Something like this?

- name: Prepare Kubernetes manifests
  run: |
    # Example: dynamically create a Helm values file if it doesn't exist
    mkdir -p "chart/${{ matrix.app }}"  # make sure the chart directory exists first
    if [ ! -f "chart/${{ matrix.app }}/values.yaml" ]; then
      echo "image: myregistry/${{ matrix.app }}:${{ github.sha }}" > "chart/${{ matrix.app }}/values.yaml"
    fi

- name: Commit and push changes
  run: |
    git config user.name 'GitHub Actions'
    git config user.email 'actions@github.com'
    git add .
    # Commit only when something is staged, so an unchanged run does not fail the job
    git diff --cached --quiet || git commit -m "Deploying new version of ${{ matrix.app }} - ${{ github.sha }}"
    git push
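
For step 2 of the workflow above, one possible approach is to render the manifest from a small script and write it into the checked-out GitOps repository before the commit step. The sketch below is only an illustration: the `apps/<name>/deployment.json` layout and both function names are assumptions, and it writes JSON (which Kubernetes accepts alongside YAML) so that it needs nothing beyond the standard library.

```python
import json
from pathlib import Path


def build_deployment(app: str, image: str, replicas: int = 1) -> dict:
    """Build a minimal apps/v1 Deployment for one container."""
    labels = {"app": app}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": app, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": app, "image": image}]},
            },
        },
    }


def write_manifest(repo_root: Path, app: str, image: str) -> Path:
    """Write the manifest under an assumed apps/<app>/ layout and return its path."""
    target = repo_root / "apps" / app / "deployment.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(build_deployment(app, image), indent=2) + "\n")
    return target
```

A CI step could call `write_manifest(Path("."), "demo", "myregistry/demo:abc123")` and then commit the result, leaving the deployment itself to the reconciliation in step 4.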
1 Comment
2024/05/01
18:01 UTC

45

DevOps School Encourages To Fake Experience

Hello, there is a DevOps school in Chicagoland that encourages its students to fake their experience. They literally tell you to pick a company and add 5-8 years of fake experience. They have a fake staffing company with backup numbers in case companies try to verify the experience. Is there any way to stop this nonsense?
Don't get me wrong, but it creates very unfair conditions for people who are just starting out in DevOps or don't want to lie on their resumes.

34 Comments
2024/05/01
17:50 UTC

0

With all of the recent news around HashiCorp

I thought some users looking for an alternative would appreciate seeing this from a former HashiCorp Solutions Engineer.

In full transparency, I work for a competitor, r/akeyless.

6 Comments
2024/05/01
17:31 UTC

0

Resources to learn Kubernetes

Hello everyone, I want to learn Kubernetes (k8s). I visited the documentation and was overwhelmed. Can you please point me to some resources that would help? Maybe a course on Udemy or something?

Thank you in advance.

6 Comments
2024/05/01
16:28 UTC

1

Mac in cloud replacement

Hi everyone, we are currently using MacinCloud to run a dedicated Mac server, as we have crucial software that is available only on iOS.

We have multiple users logging in via VNC to work simultaneously, and we are currently experiencing some issues with MacinCloud.

I am looking for a new Mac server cloud provider and would welcome recommendations.

I saw MacStadium and Macweb. Does anyone have any previous experience with them?

5 Comments
2024/05/01
12:54 UTC

0

How much should I ask for to be on-call?

Hope this isn't too much of a 'career' question for this sub; let me know and I'll delete it:

I'm working on salary at a small shop, just me and one SRE on incident response, and to close a large new client they're asking that we have an on-call policy. Previously it was pretty casual, now I'm being asked to cover half the calendar. We generally have less than one incident per week, but there were queueing issues late last year where there were incidents every day.

My manager, who is great, asked me to pick a number for compensation for on-call and I really don't know what to ask for. I like this job so I'm not going to quit or threaten to quit over this, just want some advice.

US salaried employee, making industry median if that's helpful.

15 Comments
2024/05/01
11:25 UTC

0

Unable to access docker container from host

I'm not able to access my Python container from localhost:3000. What might be the issue, and how do I resolve it?

The Dockerfile is:

FROM python:3.7
RUN apt-get update -y && apt-get upgrade -y
RUN apt-get install \
    gcc nano \
    ffmpeg libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0 \
    postgresql postgresql-contrib -y
RUN pip install numpy scipy matplotlib pydub pyaudio psycopg2 flask
WORKDIR /code
EXPOSE 3000
CMD ["python3", "./dejavu/api/api.py"]

The Docker Compose file is:

version: "3"
services:
  db:
    build:
      context: ./docker/postgres
    environment:
      - POSTGRES_DB=dejavu
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
    networks:
      - db_network

  python:
    build:
      context: ./docker/python
    ports:
      - "3000:3000"
    volumes:
      - .:/code
    depends_on:
      - db
    networks:
      - db_network

networks:
  db_network:
    driver: bridge

The Flask API is running successfully in the container on port 3000.
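
A frequent cause of this symptom is a server that listens only on the loopback address inside the container, which is what Flask's development server does by default. The post does not show `api.py`, so that diagnosis is an assumption. The sketch below only illustrates the rule; `reachable_via_published_port` is a hypothetical helper, not a Docker or Flask API.

```python
import ipaddress


def reachable_via_published_port(bind_host: str) -> bool:
    """Return True if a server bound to bind_host inside a container can
    receive traffic that Docker forwards from a published port.

    Accepts an IP address literal or the name "localhost"; other hostnames
    raise ValueError.
    """
    if bind_host in ("0.0.0.0", "::"):
        return True  # wildcard address: listens on every interface
    if bind_host == "localhost":
        return False  # resolves to loopback
    # Loopback addresses are only reachable from inside the container itself.
    return not ipaddress.ip_address(bind_host).is_loopback


# If api.py starts the app with app.run(), the matching change would be
# (assumed, since the file is not shown):
#   app.run(host="0.0.0.0", port=3000)
```

With the server bound to `0.0.0.0`, the existing `"3000:3000"` port mapping in the Compose file should then be enough to reach it from the host.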

18 Comments
2024/05/01
11:24 UTC

2

ACloudGuru Discount Codes anyone?

Anyone got a working (May 2024) discount code for Cloud Guru membership? If you don't ask... etc.

I have found 50OFF but the Pluralsight redeem page is having none of it.

0 Comments
2024/05/01
10:13 UTC

0

Terraform and Ansible were already Terrible in 2016

"Provision servers, configure servers. IBM might be on to something..."

https://www.tibobeijen.nl/2024/04/30/terrible-and-ansible-were-already-terrible-in-2016/

I blogged a bit about the lifecycle of an in-house tool. The post is as much about DevOps culture within a large(-ish) organization as it is about the tool itself. I hope it makes for an entertaining read!

6 Comments
2024/05/01
04:35 UTC

0

Deploying uWSGI without NGINX (or webservers)

I am currently designing a deployment for a financial app managing 100k pay points.

Options are below
haproxy (load balancer) => nginx/uwsgi/django
or
nginx (load balancer) => uwsgi/django
or
haproxy (load balancer) => uwsgi/django

To reduce latency, I am leaning towards the last option, which removes NGINX and makes uWSGI act as the reverse proxy as well as the WSGI server.

What are the cons of, or issues with, this option?

2 Comments
2024/05/01
00:18 UTC

42

Where to go after DevOps?

I have been in technology long enough to have experienced the early days of traditional sysadmins with in-house servers and networking, then DevOps, SRE and, most recently, platform engineering, understanding that they are not exactly the same thing. This path has brought me a lot of knowledge and, to a certain degree, professional satisfaction. However, I think I am at a pivotal point where I don't know what the next step should be.

I would like to hear from people who have traveled a similar path: what was next for you?

Update: many valuable comments, thanks 🙏

92 Comments
2024/04/30
21:46 UTC

0

What open-source tools are you using in your org?

Hey all, I recently discussed my company's move from Enterprise clientele to a more SaaS-focused one.

Here's one of the problems I'm out to solve:

  • Open-source tools have become very good these days
  • But they can be difficult to get up and running, especially at larger scales.

I figured we could integrate with popular open-source tools to solve this. The idea is to make the installation and setup as easy as possible for those who want to try the open-source tool for their org.

I have a short list of such tools but I'd LOVE your help in finding out more!

12 Comments
2024/04/30
21:00 UTC

1

Diagram Tools with in-built PDF Viewer?

Hey all,

I'm trying to build a website with an embedded flow chart editor on it.
The flow charts should be interactive, meaning I should be able to click on an element like a button and make a corresponding PDF pop up next to it. Are there any tools with this functionality? I couldn't seem to find any that can do this.

4 Comments
2024/04/30
20:57 UTC

1

CI/CD tools for RTL/DV work

What CI tools are used for RTL and DV code at companies that develop chips (Google, Microsoft, Nvidia, AMD, etc.)?

1 Comment
2024/04/30
20:31 UTC

1

.NET Shop, Tools for Performance Analysis

Hello everyone,

I'm fairly new to performance work in general, but I see tons of resources on performance analysis with Linux perf tools, such as those from Brendan Gregg.

In a .NET shop, how would I go about using equivalent tools to gather performance data that can help me debug bottlenecks?

I was thinking of either getting access to the actual server and running the CLI tools that .NET already provides, or running a Linux container that hosts the APIs, but I assume the latter wouldn't give an accurate analysis because of the environment differences.

I've been learning how to use PerfView/WPA, and otherwise just using the stack traces gathered in Application Insights.

Some tools we use for monitoring/logging are:

- Azure/Application Insights

- Splunk

I'm quite lost on which direction to go, so any help is appreciated. Thank you!

0 Comments
2024/04/30
17:59 UTC

4

Developer experience session with Confluent's platform team

We are hosting a Zoom session with Confluent to talk about their internal pluggable service runtimes - language-specific, plugin-based component frameworks with the most common components that teams need included out-of-the-box.

About the guest:
Cody is an OG at Confluent currently managing the Platform Engineering Team (Service Foundations team). https://www.linkedin.com/in/codyaray/

When: Friday May 3rd, 2024 at 11am PDT | 2pm EDT
RSVP: https://forms.gle/Td1xzX8iFXTbdmAq9

1 Comment
2024/04/30
17:37 UTC

0

Best way to supply updates to a fork of something?

For example, I use a framework that has a basic template for building websites, and I deploy each website standalone. Is there a way to update the original framework/template repository and then roll those updates out to the sites that have modified and built on top of the framework?

1 Comment
2024/04/30
16:37 UTC

254

OpenTofu 1.7.0 is out with State Encryption, Dynamic Provider-defined Functions

Hey there, technical lead of the OpenTofu project here!
We’re proud to announce that OpenTofu 1.7.0 is now officially out!
It includes State Encryption, Provider-defined Functions, Declarative removed blocks, loopable import blocks, and much more!
You can find the launch post here, as well as the release itself here.
Looking forward to hearing what you think!

50 Comments
2024/04/30
16:01 UTC

0

Datadog only alert after x number of threshold alerts

Sorry if this is obvious, but I'm new to Datadog. I'm looking to set a Datadog monitor to only send an alert if it exceeds a threshold x amount of times. Is it possible to do this in a monitor?

Or is it possible to have a monitor for a monitor to count the number of alerts in a given time frame? I only bring this up because I saw a post somewhere that mentioned doing this, but I haven't been able to figure out how they did it.

Edit: adding context, I'm looking to see if CPU/memory usage is exceeding 80% of the max available 10 times in 30 minutes. I've already been able to use queries to find the percentage used; now I'm just trying to trigger an alert only if it happens 10 times in 30 minutes. If it only happens one or two times, that's fine. I'm looking for consistently high usage.
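
The rule described in the edit (alert only after 10 breaches of 80% within 30 minutes) is a sliding-window count. The sketch below shows that logic in plain Python to make the intended behaviour precise. It is not Datadog monitor syntax, `BreachCounter` is a hypothetical name, and it assumes samples arrive in timestamp order.

```python
from collections import deque
from typing import Deque


class BreachCounter:
    """Counts threshold breaches inside a sliding time window (hypothetical helper)."""

    def __init__(
        self,
        threshold: float = 80.0,         # usage percentage from the post
        min_breaches: int = 10,          # breaches required before alerting
        window_seconds: float = 1800.0,  # 30 minutes
    ) -> None:
        self.threshold = threshold
        self.min_breaches = min_breaches
        self.window_seconds = window_seconds
        self._breaches: Deque[float] = deque()

    def observe(self, timestamp: float, usage_percent: float) -> bool:
        """Record one sample and return True if an alert should fire."""
        if usage_percent > self.threshold:
            self._breaches.append(timestamp)
        # Forget breaches that have fallen out of the window.
        while self._breaches and timestamp - self._breaches[0] > self.window_seconds:
            self._breaches.popleft()
        return len(self._breaches) >= self.min_breaches
```

One or two isolated spikes never reach `min_breaches`, so only sustained high usage triggers the alert.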

6 Comments
2024/04/30
15:58 UTC

2

Managing Multiple Docker Containers on a Virtual Machine

How do you effectively manage the deployment of multiple Docker containers on a virtual machine, and are there any services you recommend for streamlining this process?

20 Comments
2024/04/30
15:50 UTC

0

Security Considerations in the Time of AI Engineering

Before working with AI developer tools, understand what it could mean for you, your product, and your customers.

1 Comment
2024/04/30
15:44 UTC

31

How do you get your mojo back after burnout?

A few years ago I burnt myself out, like really burnt out. I spiraled for a bit and had a leave of absence, but eventually got my feet back under me.

I'm now trying to get back in my groove but I can't find it for the life of me. I've tried a few different positions at different companies, but no matter what the work is, I just can't do more than the bare minimum. I admit part of my spiral was due to the lack of recognition for the work that put me into it. I think this plays a big role, but I just can't get my mojo back.

29 Comments
2024/04/30
15:42 UTC

1

Open Source Datadog Guide

We published an open source guide to help customers make sense of Datadog. It's meant as a reference for engineers and organizations that want to make the most of their Datadog usage and avoid serious gotchas that could result in significantly higher costs or effort. It's still early days and the guide is far from comprehensive - please help by contributing 🙏

https://github.com/nimbushq/og-datadog

0 Comments
2024/04/30
15:06 UTC

2

Looking for something like Prometheus but for development debugging rather than monitoring

Finally got around to playing with Prometheus+Grafana. Both excellent tools. I'm building Prometheus metrics into a Golang service I'm developing. I find the Go runtime data it exposes to be very useful. However, I find myself trying to shoehorn the charts into a debugging role, for example if I think I have a memory leak I find myself checking my Grafana charts for clues.

This isn't very effective because Prometheus only updates every 15+ seconds so my iterations are slow. What I really need is immediate data. I'm going to try cranking up the Prometheus polling rate, but I figured I should also check and see if there are any tools that are more designed for this sort of thing that I should be looking into.
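
One way to get immediate data without touching the Prometheus scrape interval is to read the service's own `/metrics` endpoint directly in a tight loop. The sketch below parses the Prometheus text format for a single unlabelled sample. The URL in the usage note is an assumption, and `read_metric` and `watch` are hypothetical helper names.

```python
import time
import urllib.request
from typing import Optional


def read_metric(exposition_text: str, name: str) -> Optional[float]:
    """Return the value of the unlabelled sample called `name`, or None if absent."""
    for line in exposition_text.splitlines():
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


def watch(url: str, name: str, interval_seconds: float = 1.0, samples: int = 10) -> None:
    """Fetch the endpoint repeatedly and print one metric per fetch."""
    for _ in range(samples):
        with urllib.request.urlopen(url, timeout=5) as response:
            text = response.read().decode("utf-8")
        print(time.strftime("%H:%M:%S"), name, read_metric(text, name))
        time.sleep(interval_seconds)
```

For example, `watch("http://localhost:2112/metrics", "go_memstats_heap_alloc_bytes")` would print the heap gauge once per second, assuming the service exposes its metrics at that address.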

9 Comments
2024/04/30
14:41 UTC

1

Packer Azure Windows image for VMSS

Hi there,

So I'm trying to create a Windows image for an existing VMSS. The Packer build goes quite smoothly and eventually puts the image in the Shared Image Gallery, where the VMSS can pick it up. The whole process works, except that the VMSS does not create the instance properly: it stays in the Creating (Running) status and nothing ever happens. The weird thing is that I can deploy a stand-alone VM with this image and it works just fine.

So I'm a little lost as to what I can do to figure this out, because the VMSS is not showing any logs that explain why the instance deployment cannot succeed.

My Packer config is as follows:

source "azure-arm" "packer_image" {
  azure_tags = {
    Image     = "VMSS"
    CreatedBy = "Packer"
    Owner     = "**"
  }
  client_id       = var.ARM_CLIENT_ID
  client_secret   = var.ARM_CLIENT_SECRET
  tenant_id       = var.ARM_TENANT_ID
  subscription_id = var.ARM_SUBSCRIPTION_ID
  image_offer     = var.image_offer
  image_publisher = var.image_publisher
  image_sku       = var.image_sku
  location        = var.location
  shared_image_gallery_destination {
    resource_group = var.managed_image_resource_group_name
    gallery_name   = var.gallery_name
    image_name     = var.image_name
    image_version  = var.image_version
    target_region {
      name = var.location
    }
  }
  managed_image_name                = var.managed_image_name
  managed_image_resource_group_name = var.managed_image_resource_group_name
  os_type                           = var.os_type
  vm_size                           = var.vm_size
  communicator                      = "winrm"
  winrm_insecure                    = true
  winrm_timeout                     = "5m"
  winrm_use_ssl                     = true
  winrm_username                    = "packer"
}

build {
  sources = ["source.azure-arm.packer_image"]
  provisioner "powershell" {
    inline = [
      "while ((Get-Service RdAgent).Status -ne 'Running') { Start-Sleep -s 5 }",
      "while ((Get-Service WindowsAzureGuestAgent).Status -ne 'Running') { Start-Sleep -s 5 }",
      "& $env:SystemRoot\\System32\\Sysprep\\Sysprep.exe /oobe /generalize /quiet /quit",
      "while($true) { $imageState = Get-ItemProperty HKLM:\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Setup\\State | Select ImageState; if($imageState.ImageState -ne 'IMAGE_STATE_GENERALIZE_RESEAL_TO_OOBE') { Write-Output $imageState.ImageState; Start-Sleep -s 10 } else { break } }",
    ]
  }
}

I'm deploying this OS type:
image_publisher: MicrosoftWindowsServer
image_sku: 2022-Datacenter-g2
os_type: Windows
vm_size: Standard_DS2_v2

And again, the image is usable for a stand-alone VM. So what else can I check? The image gallery is using Hyper-V2.

Hopefully someone can point me in the right direction.

Thanks!

0 Comments
2024/04/30
13:51 UTC
