/r/cloudcomputing
News, articles and tools covering cloud computing, grid computing, and distributed computing.
Hi r/cloudcomputing,
I’m seeking advice on how best to approach an AI project given my constraints and goals. Here’s my situation:
Background:
I work as an "AI engineer" at a small manufacturing company (~105 employees), tasked with improving our sales processes. Specifically, I’ve been asked to develop an AI sales bot to assist our sales team in real-time by navigating complex product configurations.
The bot should allow salespeople to interact with it during calls, answer queries about product options, and provide additional guidance, like warnings or exceptions for certain configurations. My ultimate question is whether Azure services (like Cognitive Search) and current AI tools are sufficient to meet my needs internally, or if I should outsource this work.
Experience:
Resources:
Data Overview:
What I Need the Bot to Do:
Proposed Approach:
Why This Approach?
Challenges:
My Questions:
Appreciate any guidance or suggestions on this! Thank you!
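Since the question is whether Azure Cognitive Search plus current AI tools can answer product-configuration queries, the core retrieval loop is small enough to prototype generically before committing to a platform: index the configuration docs, retrieve the best matches for a query, and hand them to a model as context. A toy keyword-overlap retriever to illustrate the shape of that step — every document and name here is made up, and a real system would use Azure Cognitive Search or a vector index instead of this naive scoring:

```python
# Toy retrieval step of a RAG-style sales assistant.
# Scoring is naive keyword overlap; a real deployment would use Azure
# Cognitive Search (or any vector index). All documents here are invented.

DOCS = [
    "Model X100 supports voltage options 110V and 220V; 220V requires kit K2.",
    "Model X200 cannot be combined with the outdoor enclosure E5.",
    "All models ship with a standard 2-year warranty.",
]

def retrieve(query, docs, top_k=2):
    """Rank docs by how many query words they contain; return the top_k."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

hits = retrieve("Can the X200 use enclosure E5?", DOCS)
```

The retrieved snippets (e.g., the X200/E5 restriction) would then be injected into the model prompt so the bot can surface warnings and exceptions during a call.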
I'm working on an AI project that requires significant GPU and CPU resources: a Flux model for generating images with AI. I'm looking for a tool or service that can compare prices between AWS, Google Cloud, Azure, and other cloud providers.
Ideally, I'd like to specify my requirements (e.g., GPU model, CPU min/max, memory) and get a comparison of prices across different providers. I recall using a Python library in the past that did something similar, but I've forgotten the name. The library would choose the cheapest option based on your requirements and help you easily deploy it.
Does anyone know of a similar library or tool that can help me optimize my costs? I'd appreciate any suggestions or recommendations.
Some specific requirements I'm looking for include:
* Support for multiple cloud providers
* Ability to specify custom GPU, CPU, Storage, VRAM and RAM requirements
* Price comparison and optimization
* A plus, but not required: integration with Python (ideally through a library or API)
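For what it's worth, the matching logic such a tool needs is straightforward to sketch: filter a catalog of offerings by the hardware requirements and take the cheapest match. The offerings and prices below are hypothetical placeholders, not real SKUs or current prices:

```python
# Minimal sketch of requirement-based price comparison across providers.
# The catalog entries below are HYPOTHETICAL examples, not real prices.

from dataclasses import dataclass

@dataclass
class Offering:
    provider: str
    instance: str
    gpu: str
    vcpus: int
    ram_gb: int
    price_per_hour: float

CATALOG = [
    Offering("aws", "g5.xlarge", "A10G", 4, 16, 1.01),       # illustrative only
    Offering("gcp", "g2-standard-4", "L4", 4, 16, 0.85),     # illustrative only
    Offering("azure", "NV6ads_A10_v5", "A10", 6, 55, 0.90),  # illustrative only
]

def cheapest(catalog, min_vcpus=0, min_ram_gb=0, gpus=None):
    """Filter offerings by requirements and return the cheapest match."""
    matches = [
        o for o in catalog
        if o.vcpus >= min_vcpus
        and o.ram_gb >= min_ram_gb
        and (gpus is None or o.gpu in gpus)
    ]
    return min(matches, key=lambda o: o.price_per_hour, default=None)

best = cheapest(CATALOG, min_vcpus=4, min_ram_gb=16)
```

A real tool would replace the static catalog with live pricing APIs from each provider, which is the hard part the library you're remembering presumably handled.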
I'm using Gandi as my domain registrar for pease.com, but the actual server is hosted elsewhere. Currently, I have a subdomain sub.pease.com pointing to sub2.pease.com via a CNAME, which ultimately points to my server using an A record. I want to implement Cloudflare's WAF (Web Application Firewall) so that traffic to sub.pease.com is routed through Cloudflare for protection. However, I don't want to change the NS (nameservers) in Gandi, as I only need the WAF for this specific subdomain. Does anyone know how to achieve this setup? Any advice would be greatly appreciated!
Domain registrar: Gandi for the domain pease.com.
Current setup: sub.pease.com points to sub2.pease.com via CNAME. sub2.pease.com has an A record pointing to the actual server.
Goal: Implement Cloudflare's WAF for sub.pease.com to route traffic through Cloudflare for security. Avoid changing the NS (nameservers) in Gandi.
Challenge: How to configure Cloudflare's WAF for sub.pease.com without migrating all DNS management to Cloudflare?
Question: Does anyone have experience or ideas to achieve this setup?
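One route worth investigating is Cloudflare's partial (CNAME) setup, which — assuming a paid plan tier that supports it — proxies a single hostname through Cloudflare while the zone's nameservers stay at Gandi. Sketched as zone records, with the server IP and the Cloudflare-provided CNAME target shown as placeholders:

```
; Current (at Gandi)
sub.pease.com.   CNAME  sub2.pease.com.
sub2.pease.com.  A      203.0.113.10   ; placeholder server IP

; With a Cloudflare partial (CNAME) setup -- target is a placeholder;
; Cloudflare provides the actual hostname during zone activation
sub.pease.com.   CNAME  sub.pease.com.cdn.cloudflare.net.
; sub2.pease.com keeps its A record; Cloudflare proxies to it as the origin.
```

Traffic to sub.pease.com then flows through Cloudflare (where the WAF applies) and on to the origin, while all other records remain managed in Gandi.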
Lawrence Livermore National Laboratory (LLNL) is set to launch El Capitan in 2024, the NNSA's first exascale supercomputer, with a staggering computing capacity exceeding 2 exaflops. Designed to support national security and drive groundbreaking scientific research, it features AMD’s cutting-edge MI300A APUs, a Slingshot interconnect, and strong energy efficiency. Beyond its defense applications, El Capitan will also tackle challenges like climate change, materials discovery, and cancer research. Its debut marks a new era in computing power and innovation. What are your thoughts on its potential to shape science and technology?
Reference and more information in Cloud News.
Per title, would greatly appreciate any insights/comments. I’m new to this space, so apologies if my question is too simple/obvious.
I don't know about yall, but managing GPU resources for ML workloads in Databricks is turning into my personal hell.
😤 I'm part of the DevOps team of an ecommerce company, and the constant balancing between not wasting money on idle GPUs and not crashing performance during spikes is driving me nuts.
Here’s the situation:
ML workloads are unpredictable. One day, you’re coasting with low demand, GPUs sitting there doing nothing, racking up costs.
Then BAM 💥 – the next day the workload spikes, you’re under-provisioned, and suddenly everyone’s models are crawling because we don’t have enough resources to keep up. This, by the way, happened to us on Black Friday.
So what do we do? We manually adjust cluster sizes, obviously.
But I can’t spend every hour babysitting cluster metrics and guessing when a workload spike is coming – and honestly, it’s boring.
Either we’re wasting money on idle resources, or we’re scrambling to scale up and throwing performance out the window. It’s a lose-lose situation.
What blows my mind is that there’s no real automated scaling solution for GPU resources that actually works for AI workloads.
CPU scaling is fine, but GPUs? Nope.
You’re on your own. Predicting demand in advance with no real tools to help is like trying to guess the weather a week from now.
I’ve seen some solutions out there, but most are either too complex or don’t fully solve the problem.
I just want something simple: automated, real-time scaling that won’t blow up our budget OR our workload timelines.
Is that too much to ask?!
Anyone else going through the same pain?
How are you managing this without spending 24/7 tweaking clusters?
Would love to hear if anyone's figured out a better way (or at least if you share the struggle).
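The "simple, automated" behavior described above can at least be prototyped as a threshold rule: scale out when GPU utilization runs hot, scale in when it idles, with a floor and a ceiling on cluster size. A toy sketch — the function name and thresholds are my own invention, not a Databricks API:

```python
# Toy threshold-based scaling decision for a GPU cluster.
# All names and thresholds are hypothetical, not a real autoscaler API.

def decide_target(current_nodes, gpu_utilization, min_nodes=1, max_nodes=8,
                  scale_out_at=0.80, scale_in_at=0.30):
    """Return the desired node count given average GPU utilization (0.0-1.0)."""
    if gpu_utilization >= scale_out_at:
        target = current_nodes * 2    # aggressive scale-out on spikes
    elif gpu_utilization <= scale_in_at:
        target = current_nodes // 2   # halve when mostly idle
    else:
        target = current_nodes        # hold steady in the comfort band
    return max(min_nodes, min(max_nodes, target))
```

Running this in a loop against utilization metrics is roughly what the missing product would do for you; the hard part it doesn't address is predicting spikes before they happen rather than reacting to them.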
Hello! Not sure if this is the right subreddit, if not please tell me where I should ask this question.
I am part of a high school computational research group, and we have a molecular dynamics simulation in OpenMM. One of the major issues right now is being able to run enough replications (simulations) for it to be a strong research paper with proper results. Our current simulation time is ~8 hours with an RTX 4060 Ti and a Ryzen 5 5700H. We only have this week to get the results, analyze them, and finish the paper for submission to a contest. One of the solutions our advisor gave us was to use Amazon Web Services (AWS), but we're worried that it would cost a lot or be too slow for us to make the deadline. Not to mention that none of us are experienced with cloud services, so we're not sure where to begin.
So my question to you all is: how do I do this? How much would it cost? How long would it take to run one simulation? How long to set up (the code is already completed, so just the time to set up the service and change the code to be compatible)? Does AWS allow other Python packages to be imported? Any tips for a first-time beginner? (I did a little research on this, but not much, so any info would be appreciated.)
Simulation info:
Coding Language: Python
Packages and Modules: OpenMM, PyRosetta, and some built-in Python ones
Simulation details: https://www.reddit.com/r/comp_chem/comments/1gyxjvj/minimum_trials_for_molecular_dynamic_simulation/ (Mainly bc I don't want this post to be too long nor is this a Computational Chem subreddit, I'll change this link if you'd rather see the info and not the post)
Memory Usage when running: 512 MB to 1 GB of Memory
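As a rough sanity check on the cost question, a back-of-envelope estimate is just (hours per replicate) × (instance price per hour) × (number of replicates), and parallelizing across instances shrinks wall-clock time without changing total cost. The $1.00/hour figure below is a placeholder, not a quoted AWS price — check current GPU instance pricing (e.g., the g4dn/g5 families) for real numbers:

```python
# Back-of-envelope AWS cost estimate for GPU simulation replicates.
# price_per_hour is a PLACEHOLDER, not a quoted AWS price.

def estimate_cost(hours_per_run, n_runs, price_per_hour, parallel_instances=1):
    """Return (total dollars, wall-clock hours) when runs are spread
    across parallel_instances identical instances."""
    total_gpu_hours = hours_per_run * n_runs
    wall_clock = hours_per_run * -(-n_runs // parallel_instances)  # ceil division
    return total_gpu_hours * price_per_hour, wall_clock

# e.g., 10 replicates of an 8-hour run on 5 instances at $1/hr:
cost, wall = estimate_cost(hours_per_run=8, n_runs=10,
                           price_per_hour=1.00, parallel_instances=5)
```

The point of the sketch: at single-digit dollars per GPU-hour, the budget risk is usually forgetting to stop instances, not the runs themselves.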
How is AI transforming cloud computing services in terms of efficiency, innovation, and scalability?
Hey,
I'm a working student, and during my web dev experience I noticed a major gap in headless CMS solutions. And that's funny, because CMSs are like JS frameworks: a new one is created every day. But let's be serious: imagine you are working on an information system, and after the work has started, maybe near the end of the project, the customer changes his requirements and wants some blog-like functionality, or something else that requires a CMS (or a lot of work to build fancy CRUD). So you decide to use a CMS, as the customer wants to save on dev work. The problem is that a CMS requires some tech stack, and to be honest, that tech stack never matched the existing stack of the system (like GCF, Firestore, ...).
Since I really needed a CMS for my tech stack, I decided to write my own, but then I realized I was reinventing the wheel and polluting this world with yet another CMS. So I decided to make a platform-agnostic CMS my bachelor's thesis. I've been working on it for more than a year (not every day), and I have a working prototype (which I'll keep closed source until I nail down a few things) that allows adapting the CMS to almost any platform. And not just that: the DB can be DynamoDB on AWS, storage can be on Azure, and the CMS UI can be hosted on Cloud Run. This flexibility has its own pros.
But now I'm facing a dilemma. Since it's still easy to do, should I redesign the system to be able to use one type of service on multiple clouds? Like having three buckets: one on GCP, the second on AWS, and the last on Azure. Also the ability to work with multiple databases on multiple clouds.
This feature would be 100% cool, but to be honest, I've never needed it. Although the fact that I didn't need it doesn't mean someone else won't. So I would like to hear your opinion.
Blog on securing access, providing governance and visibility here https://blog.strato-cloud.io/2024/11/04/strato-cloud-to-secure-access-provide-governance-and-visibility-for-multicloud/
More details at https://strato-cloud.io and https://x.com/stratocloudio
Would also like input and feedback from this forum on the pain points with multicloud.
Appreciate it!
For a concise overview of Cloud Composer check out this article: https://differ.blog/p/cloud-composer-a-quick-overview-of-gcp-workflow-orchestration-c835a0
If this has already been discussed elsewhere, I would appreciate someone pointing me in the right direction.
I'm curious what the role of SIs (system integrators) is in today's cloud ecosystems. If I click on any of the hyperscalers' or cloud-native ISVs' partner pages, I see lists of hundreds of SI partners (from big ones like Accenture and Deloitte, to vertical/system-specific SIs like TTEC, to tiny SIs with just a few customers). My understanding is that there is a pretty low barrier to entry to becoming an SI, but it's very much a relationship and scale business, so the biggest players have the largest client networks/connections, work with the most partners, and make the most money. Is this right? How much scale do you have to reach in order to make good money, and what kind of operating margins are the norm here? I'm guessing there are nuances by niche or sector, but I'm curious if there are any generalities/rules of thumb.
Separately but relatedly, from the SI's perspective, how do the economics of implementing a cloud migration and selling cloud offerings compare to selling on-prem? My guess/understanding is that the SI gets a % commission upfront and usually gets paid for maintenance. If that's the case, then they would get paid much more upfront for selling on-prem offerings, since it's a much bigger sale, and they would get paid for annual maintenance. Whereas for cloud, they get much less upfront for the same sales effort, so both the dollars and the margins are lower, but they get a cut of the annual subscription revenue, which likely exceeds the annual maintenance for on-prem, so they end up making more recurring revenue? The net result from the SI's POV is probably still that cloud is worse than on-prem, but there is a much bigger runway for cloud migrations and implementations?
Finally, does gen AI change how cloud offerings are sold (e.g., less complexity, more DIY, so less need for SIs)?
I don’t have a real computing device (I have an iPad, which I can use to remote into other devices).
I have licensed copies of retro OSes like Windows 3.1/95/98/ME, etc. I would like to run them somewhere for fun.
If I rent a Windows VM somewhere, can I install a hypervisor in it and run these OSes? Or does a VM inside a VM not work well? If it can work, what service and hypervisor would you recommend?
I really don’t want to buy another device and would prefer to do everything on the cloud. Bandwidth is no concern.
I work as a data scientist and have been using posit.cloud for my R projects. I love the fact that I can hop between projects and, whenever I log into a project, every object is still there as if I had never left. This happens without my having to consciously save any image, session, etc. Is there such a thing for Python? Thanks!
Hello, does anyone know the proper way to mount an EFS file system from a different region? I have done the VPC peering and enabled DNS resolution. The private hosted zone is already created. I have also entered the FS ID of the other region's EFS in the storage class. When creating the PVC, it seems it cannot find the FS ID, so the pod is not starting up and the PVC is stuck in Pending status. How do I fix this issue?
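For reference, this is the general shape of a StorageClass for the AWS EFS CSI driver — the file system ID below is a placeholder, and this assumes the driver is installed in the cluster. Note that for a cross-region mount the driver must also be able to resolve and reach a mount target over the peering connection, which is a frequent cause of the provisioner failing to find the FS ID:

```yaml
# Sketch of an EFS CSI StorageClass; fs-0123456789abcdef0 is a placeholder ID.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-cross-region
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap          # dynamic provisioning via access points
  fileSystemId: fs-0123456789abcdef0
  directoryPerms: "700"
```

Worth double-checking: the FS ID must be the raw `fs-...` identifier (not the DNS name), and the node security groups must allow NFS (TCP 2049) across the peered VPCs.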
I have set up a Dataproc cluster on GCP to run spark jobs and the spark job resides on a GCS bucket that I have already provisioned. Separately, I have setup kafka on AWS by setting up a MSK cluster and an EC2 instance which has kafka downloaded on it.
This is part of a larger architecture in which we want to run multiple microservices and use kafka to send files from those microservices to the spark analytical service on GCP for data processing and send results back via kafka.
However, I am unable to understand how to connect Kafka with Spark. I don't understand how they will be able to communicate since they are on different cloud providers. The internet is giving me very vague answers since this is a very specific situation.
Please guide me on how to resolve this issue.
PS: I'm a cloud newbie :)
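At the Spark level the cross-cloud part mostly disappears: Structured Streaming's Kafka source takes the MSK bootstrap brokers like any other Kafka cluster, provided the Dataproc nodes can reach them over the network (public MSK endpoints, a VPN, or an interconnect between the VPCs). A sketch with placeholder broker addresses — the helper function is my own, only the option keys are Spark's:

```python
# Sketch: options for reading from MSK (AWS) with Spark Structured Streaming
# on Dataproc (GCP). Broker address and topic name are placeholders; this
# assumes network reachability between the two clouds.

def kafka_stream_options(bootstrap_servers, topic):
    """Build the option map for Spark's Kafka source (spark-sql-kafka package)."""
    return {
        "kafka.bootstrap.servers": ",".join(bootstrap_servers),
        "subscribe": topic,
        "startingOffsets": "latest",
        # MSK brokers typically require TLS (port 9094).
        "kafka.security.protocol": "SSL",
    }

opts = kafka_stream_options(["b-1.example.amazonaws.com:9094"], "files-in")

# On the Dataproc cluster, with the spark-sql-kafka connector on the classpath:
# df = spark.readStream.format("kafka").options(**opts).load()
```

So the real work is networking (making the MSK brokers routable and their advertised listeners resolvable from GCP), not the Spark code itself.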
https://www.infoq.com/news/2024/11/allegro-dataflow-cost-savings/
Allegro achieved significant savings for one of the Dataflow Pipelines running on GCP Big Data. The company continues working on improving the cost-effectiveness of its data workflows by evaluating resource utilization, enhancing pipeline configurations, optimizing input and output datasets, and improving storage strategies.
^ Basically, what the title says. I am only asking to understand whether the cloud is essentially about renting a virtual computer (aka a VM), and therefore whether all the extra services that are better specialized/optimized for your specific use case (e.g., storing objects/files) ultimately run on VMs.
Edit:
By cloud services, I mean specifically services related to cloud computing.
I just signed up for MultCloud and am considering setting up a sync task to keep iCloud Photos and my DS220+ in sync. The only issue I have is that MultCloud uses FTP for NAS sync, which seems like a bad security practice for obvious reasons. Is there a better way?
Hi, is anyone using Databricks or thinking of using it? Please let me know about your experience.
Hello, does anyone have resources on how to troubleshoot and improve slow response times for applications hosted on Amazon ECS?
I'm planning on making a mobile phone app and need to pick a cloud provider to handle the backend but I'm having problems deciding between the three.
I'd like to use cloud functions via REST API, hosted PostgreSQL, storage, and authentication.
How do you decide between the three? They seem to offer similar services so picking one over the others is tricky for someone like me who doesn't have much experience.
What are your must-have online resources, utilities, tools, and repos for cloud engineers?
Hello, I just wanted to know whether getting cloud certifications and Linux certifications is enough for going into DevOps, or do I need to get a CCNA certification as well? My background is in CS and CE, and I know the basics of networking. Thanks for reading.
Curious to know how your startup or company manages AWS costs. Do you stick with AWS’s native tools like Cost Explorer, or use third-party solutions? Any tips, tools, or strategies that have worked well for keeping costs down?
How often do you check your costs? (Monthly, weekly, daily?)
So, I'm working on a VDP, and while doing recon I found a request being made to some Microsoft service. Later I found that the site is hosted on Azure, so it makes sense that the request was related to the cloud instance. Is it really that easy to find the cloud IP? Because I had previously found an AWS instance IP with the same method. What are your thoughts?
Hey everyone! I'm a backend developer working on a NestJS and Angular app, using Socket.IO for WebSocket connections. The app is already deployed and running over HTTPS, but WebSocket connections are failing with mixed-content blocking errors in the browser console. I'm using wss:// on the frontend, but it still fails.
I've configured CORS, and it is set to allow requests from the frontend. The WebSocket URL is set to wss://, but the connection gets blocked.
Could anyone suggest what I might be missing on the backend? Also, any deployment-level fixes for WebSocket support?
Thanks in advance for your help!
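One common deployment-level culprit is a reverse proxy that terminates TLS but doesn't forward the WebSocket upgrade handshake, so the connection silently falls back or fails. If nginx sits in front of the app (an assumption — the post doesn't say), the location block needs something like this, with the upstream port as a placeholder:

```nginx
# Sketch: nginx proxying Socket.IO WebSocket upgrades; port 3000 is a placeholder.
location /socket.io/ {
    proxy_pass http://localhost:3000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;   # forward the upgrade handshake
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
}
```

Without `proxy_http_version 1.1` and the `Upgrade`/`Connection` headers, nginx speaks HTTP/1.0 to the upstream and the WebSocket handshake never completes.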
A question to spark discussion on serverless computing, especially for heavy or large-scale applications. How is the scalability and cost-effectiveness?
Does anyone know of any decentralized data storage solutions? I just found AITECH Solidus through its Social Hub launched by DAOlabs, and I feel we might need to look at it. Is there any discussion around decentralized data storage here?
I recently explored various cloud-native technologies and architectures. I've put the information I uncovered into a series of PDFs that I'm sharing below 👇:
Feel free to explore all of them and don't forget to let me know your comments:
AI/LLM
Harness Proprietary Data with Foundational Models and RAG https://mveteanu.me/pdf/rag.pdf
A visual presentation of Leading AI Studios https://mveteanu.me/pdf/ai_studios.pdf
A Tour of Azure AI Services https://mveteanu.me/pdf/azure_ai.pdf
OWASP Top 10 for LLMs https://mveteanu.me/pdf/llm_security.pdf
Cloud
Core Services Across Azure, AWS, and GCP https://mveteanu.me/pdf/cloud_core.pdf
Select the right cloud-based DB for your project https://mveteanu.me/pdf/cloud_db.pdf
21 Tips for Designing Web APIs https://mveteanu.me/pdf/webapis.pdf
Leadership
25 Challenges Every R&D Leader Faces https://mveteanu.me/pdf/rd_challenges.pdf
Physical Product Design
Power Presenter: An OBS and PowerPoint clicker https://mveteanu.me/pdf/power_presenter.pdf
Stay Active: An AI solution for controlling TV time https://mveteanu.me/pdf/stay_active.pdf
Coral Micro: A dedicated coding computer https://mveteanu.me/pdf/coral_micro.pdf
Cloud architecture
SaaS vs IaaS vs PaaS https://mveteanu.me/pdf/saas_iaas_paas.pdf
Exploring Multi-Tenant Architectures https://mveteanu.me/pdf/multitenant_architectures.pdf
Pitfalls of Microservices https://mveteanu.me/pdf/pitfalls_microservices.pdf
Docker Tips https://mveteanu.me/pdf/docker_tips.pdf
Industry quotes
Key Quotes Driving the Software Revolution https://mveteanu.me/pdf/quotes.pdf