/r/MLQuestions


A place for beginners to ask stupid questions and for experts to help them! /r/MachineLearning is a great subreddit, but it is for interesting articles and news related to machine learning. Here, feel free to ask any question about machine learning.

What kinds of questions do we want here?

"I've just started with deep nets. What are their strengths and weaknesses?" "What is the current state of the art in speech recognition?" "My data looks like X,Y what type of model should I use?"

If you are well versed in machine learning, please answer any question you feel knowledgeable about, even if it already has answers. Thank you!


Related Subreddits:

/r/MachineLearning
/r/mlpapers
/r/learnmachinelearning

/r/MLQuestions

64,357 Subscribers

1

Image classification input decisions based on hardware limits

My project consists of several cameras detecting chickens in my backyard. My GPU has 12 GB, and I hit its limit at around 5,200 samples, of which a little less than half are images that contain "nothing". I'm using a pretrained model at its largest input size (224, 224). My questions: what should I do first to fit in more samples? Should I reduce the "nothing" category while making sure each camera keeps a roughly equal number of entries? Remove near-duplicate images? (Chickens on their roost don't change much.) And when should reducing image resolution become part of the conversation?
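For illustration, a minimal sketch of the kind of pruning I mean (assuming PIL and numpy, with a hypothetical data/<camera_id>/nothing/ folder layout): drop near-duplicate frames with a simple average-hash and cap how many "nothing" images each camera contributes.

    from pathlib import Path

    import numpy as np
    from PIL import Image

    def average_hash(path, size=8):
        # Tiny perceptual hash: downscale, grayscale, threshold at the mean.
        img = Image.open(path).convert("L").resize((size, size))
        arr = np.asarray(img, dtype=np.float32)
        return (arr > arr.mean()).flatten()

    def drop_near_duplicates(paths, max_hamming=4):
        # Keep a frame only if it differs from every frame kept so far.
        kept, hashes = [], []
        for p in paths:
            h = average_hash(p)
            if all(int(np.sum(h != k)) > max_hamming for k in hashes):
                kept.append(p)
                hashes.append(h)
        return kept

    root = Path("data")          # hypothetical layout: data/<camera_id>/nothing/*.jpg
    per_camera_cap = 300         # cap on "nothing" samples kept per camera
    for cam in sorted(root.iterdir()):
        nothing = sorted((cam / "nothing").glob("*.jpg"))
        keep = drop_near_duplicates(nothing)[:per_camera_cap]
        print(cam.name, "kept", len(keep), "of", len(nothing))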

4 Comments
2025/02/03
14:47 UTC

1

Why are the results doubled?

I am trying to model and forecast a continuous response with an XGBoost regressor, and there are two categorical features which are one-hot encoded. The forecasted values look almost double what I would expect. How could that happen? Any guidance would be appreciated.
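For illustration, a minimal sanity-check sketch of the setup (pandas + XGBoost, with hypothetical file and column names; the real pipeline may differ): make sure the original categorical columns are dropped after encoding, check for duplicated rows, and compare the prediction mean against the target mean.

    import pandas as pd
    from xgboost import XGBRegressor

    df = pd.read_csv("data.csv")              # hypothetical file
    target = "y"                              # hypothetical target column
    cat_cols = ["cat_a", "cat_b"]             # the two categorical features

    # One-hot encode and drop the original categorical columns so the same
    # information is not fed in twice.
    X = pd.get_dummies(df.drop(columns=[target]), columns=cat_cols, drop_first=True)
    y = df[target]

    # Duplicated rows (e.g. from a bad merge) quietly inflate the training set.
    print("duplicated rows:", X.duplicated().sum())

    model = XGBRegressor(n_estimators=300, learning_rate=0.05)
    model.fit(X, y)

    # If the in-sample prediction mean is already ~2x the target mean, the
    # problem is in the data pipeline rather than the forecasting step.
    print("target mean:", y.mean(), "prediction mean:", model.predict(X).mean())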

0 Comments
2025/02/03
13:34 UTC

0

Can the ChatGPT 4o model say things like this?

My hobby is having conversations with ChatGPT about topics like philosophy, mathematics, science, and artificial intelligence, but for the past 3-4 days its responses have been strange. Is it possible for ChatGPT 4o to say something like this? It said it after I mentioned that it was hard to believe in its changes and asked it to make me believe.

I am capturing and translating the process of my ChatGPT evolving, and I would like to hear your opinions. (Pul is my nickname.)

4 Comments
2025/02/03
13:16 UTC

1

Dynamic Node Type Update in Graph Neural Networks Based on Constraint Violations

Is there a way to dynamically update node types in a Graph Neural Network (GNN) when certain attribute values exceed predefined constraints? I have a graph where each node has a type, but if an attribute violates a constraint, the node's type should change accordingly. How can this be implemented efficiently within a GNN framework?
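To make the question concrete, a toy sketch of one pattern (plain PyTorch, not tied to any specific GNN library; the threshold and the switch rule are made up): node types are indices into an embedding table and are re-derived from the constraint check before each message-passing step. The switch itself is a hard, non-differentiable decision, which is part of what I'm unsure about.

    import torch
    import torch.nn as nn

    class TypeSwitchingLayer(nn.Module):
        # Toy sketch: node types index an embedding table and are re-derived
        # from a constraint check before every message-passing step.
        def __init__(self, num_types, attr_dim, hidden_dim, threshold=1.0):
            super().__init__()
            self.type_emb = nn.Embedding(num_types, hidden_dim)
            self.lin = nn.Linear(attr_dim + hidden_dim, hidden_dim)
            self.threshold = threshold  # hypothetical constraint on attribute 0

        def update_types(self, node_types, node_attrs):
            # If attribute 0 exceeds the threshold, switch the node to type 1.
            violated = node_attrs[:, 0] > self.threshold
            return torch.where(violated, torch.ones_like(node_types), node_types)

        def forward(self, node_types, node_attrs, adj):
            node_types = self.update_types(node_types, node_attrs)
            h = torch.cat([node_attrs, self.type_emb(node_types)], dim=-1)
            h = torch.relu(self.lin(h))
            # Mean aggregation over neighbours (dense adjacency for simplicity).
            h = adj @ h / adj.sum(dim=1, keepdim=True).clamp(min=1)
            return h, node_types

    # Tiny usage example: 4 nodes, 3 attributes, 2 types, fully connected.
    layer = TypeSwitchingLayer(num_types=2, attr_dim=3, hidden_dim=8)
    attrs = torch.randn(4, 3)
    types = torch.zeros(4, dtype=torch.long)
    adj = torch.ones(4, 4)
    h, new_types = layer(types, attrs, adj)
    print(new_types)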

5 Comments
2025/02/03
07:03 UTC

1

scientific paper parser

I'm working on a scientific paper summarization project and I'm stuck at the first step, a PDF parser. I want it to separate the text by section and handle two-column layouts. What's the best way to do this?
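For illustration, one rough direction (a sketch assuming PyMuPDF is installed; a layout-aware tool such as GROBID is likely more robust for section structure):

    import fitz  # PyMuPDF

    def two_column_text(pdf_path):
        # Rough sketch: pull text blocks per page and read the left column
        # before the right column, top to bottom.
        doc = fitz.open(pdf_path)
        pages = []
        for page in doc:
            mid = page.rect.width / 2
            blocks = page.get_text("blocks")  # (x0, y0, x1, y1, text, block_no, type)
            left = [b for b in blocks if b[0] < mid]
            right = [b for b in blocks if b[0] >= mid]
            ordered = sorted(left, key=lambda b: b[1]) + sorted(right, key=lambda b: b[1])
            pages.append("\n".join(b[4] for b in ordered if b[6] == 0))  # text blocks only
        return "\n".join(pages)

    text = two_column_text("paper.pdf")  # hypothetical file
    # Splitting into sections then needs a heuristic on headings (e.g. regex on
    # "Abstract", "Introduction", numbered headings) or a layout-aware parser.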

1 Comment
2025/02/03
03:52 UTC

1

Subreddits for subdomains: Search, Recommendation Systems, Ranking

Hi fellow engineers, after dabbling in many domains of machine learning, I think I like the recommendation/search/ranking space the best. Are there any subreddits specific to these or adjacent domains?

0 Comments
2025/02/02
21:02 UTC

22

What kind of math do I need to learn to understand papers like these?

I've had some math in my engineering degree, but I can't figure out the notation behind many of these symbols. What's my best learning path here?

https://arxiv.org/pdf/2412.05265

https://developers.google.com/machine-learning/recommendation/collaborative/matrix

Greetings

12 Comments
2025/02/02
18:42 UTC

2

Looking for YouTube Channels, Resources, and Project Ideas!

Hey everyone!

I hope you're all doing great. 😊

I'm a 6th-semester student with 6 months of industry experience in web dev. Now I'm jumping into the world of ML/AI. I've already finished 2 of Andrew Ng's introductory courses (which were awesome!), but now I'm looking to dive deeper.

I’d really appreciate any YouTube channels you know that animate or visually explain concepts like Linear Regression, Gradient Descent, and even more advanced topics like Neural Networks and Convolutional Neural Networks (CNNs).

Besides that, I’m also looking for resources—whether it’s online courses, blogs or anything else that’s helped you understand ML concepts better.

And here’s where I could really use your advice:

  1. How do I find real-world projects that will make my resume pop?
  2. Tips on how to connect the dots between theory and practical, real-world applications?

A bit of context: I’m planning to move into the research side of ML/AI, most likely doing a research-based internship that’ll lead to my final year project (FYP). I want to make sure I have a solid grip on the basics before summer rolls around.

If you’ve got any advice, suggestions, or personal experiences to share—whether it’s about learning strategies, project ideas, or navigating the ML/AI field—I’d love to hear from you!

1 Comment
2025/02/02
17:55 UTC

2

Model Building Recommendations

Hi everyone! I’m a budding data analyst who’s been recently introduced to machine learning.

One of our activities is building a supervised machine learning model that can help with predicting patients at risk of heart disease.

I’ve done my EDA and data is uniformly distributed between Low risk (0) and High Risk (1). Liker majority of the features are equally distributed, like Non- smokers and Smokers , Alcohol consumption, even continous features like age, cholesterol level if binned on a histogram, the 2 target variable have the almost uniform distribution. There’s also no correlation between the variables based on the heatmap

My dilemma is i’ve tried using LogReg, KNN and RandomForest as those are the ones that was taught to us, all of them range from 49%-50%.

I checked Gemini and ChatGPT, and their recommendation is to feature engineer, which I've also done (interaction terms between variables, among other things).

I’m trying to hit atleast 60% with any of the models.

I would highly appreciate any feedback or recommendations to help with this.
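For illustration, a minimal sanity-check sketch (scikit-learn, with hypothetical file and column names): compare the forest against a dummy baseline with stratified cross-validation, since a model stuck at ~50% on a balanced dataset with uncorrelated features may simply have no signal to learn from.

    import pandas as pd
    from sklearn.dummy import DummyClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    df = pd.read_csv("heart.csv")                      # hypothetical file
    X, y = df.drop(columns=["risk"]), df["risk"]       # hypothetical 0/1 label

    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    baseline = DummyClassifier(strategy="most_frequent")
    forest = RandomForestClassifier(n_estimators=300, random_state=0)

    # On a balanced dataset the dummy baseline sits near 50%; if the forest's
    # cross-validated accuracy is not clearly above it, the features as given
    # carry little usable signal and no choice of model will reach 60%.
    print("baseline:", cross_val_score(baseline, X, y, cv=cv).mean())
    print("forest:  ", cross_val_score(forest, X, y, cv=cv).mean())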

1 Comment
2025/02/02
09:12 UTC

1

Where to look for non-language tasks

For example, having a model fly a simulated, physics-based drone, or making a model drive the joints of a simulated robot so it can stand, balance, or even walk?

I assume LLMs are out of the question for this kind of task because, for example, the attention mechanism is kind of useless in this context?

Thx.
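From what I can tell, the keyword here is reinforcement learning / continuous control rather than LLMs. For illustration, a minimal sketch of the usual interface (assuming the gymnasium and stable-baselines3 packages; physics-based drone or walker environments expose the same reset/step API):

    import gymnasium as gym
    from stable_baselines3 import PPO

    # Pendulum-v1 is a small continuous-control task (torque on a joint);
    # drone or walker environments follow the same interface.
    env = gym.make("Pendulum-v1")

    model = PPO("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=50_000)

    obs, info = env.reset(seed=0)
    for _ in range(200):
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            obs, info = env.reset()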

0 Comments
2025/02/02
07:41 UTC

0

DeepSeek or ChatGPT for coding from scratch?

Which chatbot can I use because I don't want to waste any time.

8 Comments
2025/02/02
06:30 UTC

1

Looking for UQ Resources for Continuous, Time-Correlated Signal Regression

Hi everyone,

I'm new to uncertainty quantification and I'm working on a project that involves predicting a continuous 1D signal over time (a sinusoid-like shape) derived from heavily preprocessed image data used as our model's input. This raw output is then post-processed using traditional signal processing techniques to obtain the final signal, and we compare it with a ground truth using mean squared error (MSE) or other spectral metrics after converting to the frequency domain.

My confusion comes from the fact that most UQ methods I've seen are designed for classification tasks or for standard regression where you predict a single value at a time. Here the output is a continuous signal with temporal correlation, so I'm wondering:

  • Should we treat each time step as an independent output and then aggregate the uncertainties (by taking the "mean") over the whole time series?
  • Since our raw model output has additional signal processing to produce the final signal, should we apply uncertainty quantification methods to this post-processing phase as well? Or is it sufficient to focus on the raw model outputs?

I apologize if this question sounds all over the place; I'm still trying to wrap my head around all of this. Any reading recommendations, papers, or resources that tackle UQ for time-series regression (if that's the right term), especially when combined with signal post-processing, would be greatly appreciated!
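To make the first bullet above concrete, a rough Monte-Carlo-dropout sketch in PyTorch (the model here is just a placeholder; the real one works on image data): keep dropout active at inference, draw several stochastic forward passes, and read the per-timestep spread as uncertainty. How to aggregate over the series is then a reporting choice.

    import torch
    import torch.nn as nn

    class SignalRegressor(nn.Module):
        # Placeholder model: a feature vector in, a T-step 1D signal out.
        def __init__(self, in_dim=128, hidden=256, steps=200, p=0.2):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p),
                nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p),
                nn.Linear(hidden, steps),
            )

        def forward(self, x):
            return self.net(x)

    @torch.no_grad()
    def mc_dropout_predict(model, x, n_samples=50):
        model.train()  # keep dropout layers active at inference time
        samples = torch.stack([model(x) for _ in range(n_samples)])  # (S, B, T)
        return samples.mean(0), samples.std(0)  # per-timestep mean and spread

    model = SignalRegressor()
    x = torch.randn(4, 128)                # hypothetical preprocessed image features
    mean, std = mc_dropout_predict(model, x)
    print(mean.shape, std.shape)           # both (4, 200): one value per time step
    series_uncertainty = std.mean(dim=1)   # one crude scalar per sample, if needed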

0 Comments
2025/02/02
03:17 UTC

2

Ideas for small starter ML/AI project

I'm currently a junior in high school taking AP CSA, and I've taken an interest in ML. I'm pretty good at programming and know a fair amount of Java. I'm wondering if anyone has tools or advice for starting out making a small model that can identify letters or something of the sort. Let me know if I'm thinking too big or if this is out of scope for someone who doesn't have years of programming experience.
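For reference, roughly the scale of project I have in mind, if this is the right track: a minimal scikit-learn sketch in Python (the dominant ML language; the Java experience should transfer) on its bundled handwritten-digits dataset.

    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # 8x8 grayscale images of handwritten digits, bundled with scikit-learn.
    digits = load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.2, random_state=0
    )

    clf = LogisticRegression(max_iter=5000)
    clf.fit(X_train, y_train)
    print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))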

4 Comments
2025/02/02
01:07 UTC

0

Noob question: What level of data cleaning & EDA should be done before the train/test split, and what should be left for after the split?

As the title says: what level of data cleaning & EDA should be done before the train/test split, and what should be left for after the split, to achieve a more realistic real-world setup? I'm using the terms data cleaning & EDA very loosely here.
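For illustration, the pattern I'm trying to confirm or refute (a sketch with scikit-learn and hypothetical file/column names): split first, then let a Pipeline fit the imputer/scaler on the training split only, so the test split never leaks into those statistics.

    import pandas as pd
    from sklearn.impute import SimpleImputer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    df = pd.read_csv("data.csv")                      # hypothetical file
    X, y = df.drop(columns=["target"]), df["target"]  # hypothetical target column

    # Split first; EDA for understanding can look at everything, but any
    # statistic that feeds the model (imputation values, scaling parameters,
    # selected features) should come from the training split only.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0, stratify=y
    )

    pipe = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    pipe.fit(X_train, y_train)          # fit_transform happens on train only
    print(pipe.score(X_test, y_test))   # transforms reuse the train statistics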

6 Comments
2025/02/01
21:07 UTC

1

Mathematical formula for tensor + pipeline parallelism bandwidth requirement?

In terms of attention heads, KV size, weight precision, tokens, and parameters, how do you calculate the required tensor- and pipeline-parallel bandwidth?
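One rough back-of-envelope, under standard Megatron-style assumptions (hidden size h, sequence length s, micro-batch size b, t tensor-parallel ranks, p bytes per element, e.g. p = 2 for fp16): tensor parallelism does two all-reduces per transformer layer in the forward pass and two in the backward pass, each over an activation tensor of b·s·h elements, and a ring all-reduce moves roughly 2(t-1)/t times the tensor size per rank; pipeline parallelism only ships activations (and their gradients) across each stage boundary once per micro-batch. In symbols:

    \[
    V_{\text{TP}} \;\approx\; 4 \cdot \frac{2(t-1)}{t} \cdot b\, s\, h\, p
    \quad \text{bytes per rank, per layer (forward + backward)}
    \]
    \[
    V_{\text{PP}} \;\approx\; 2 \cdot b\, s\, h\, p
    \quad \text{bytes per stage boundary, per micro-batch (activations + gradients)}
    \]

Dividing each volume by the compute time it has to overlap with gives the required bandwidth. As far as I can tell, attention-head count and KV precision enter mainly through memory and the inference-time KV cache, since the training-time volumes above are dominated by b·s·h.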

1 Comment
2025/02/01
18:52 UTC

1

Best online course or tutorial to get reacquainted with Python?

I was assigned an automation task at work, and in my degree program we had a semester off from Python, so I am RUSTY. I'm struggling to remember all the functionality that comes with pandas and numpy; it's shameful. I'm not a beginner coder, so I don't want a super basic tutorial, but does anyone have recommendations for getting reacquainted with EDA and ETL tasks in Python?

0 Comments
2025/02/01
14:49 UTC

1

Is my model overfitting?

As in the title, I'm afraid my random forest might be overfitting on class 1. I've tried other algorithms and balancing the class weights, but that didn't improve the results. What steps would you recommend to address this? Are there any other approaches I should try?

https://preview.redd.it/ujn1so2r2jge1.png?width=273&format=png&auto=webp&s=88facf4e00396b1f115e5e90c87ff60cdb013859

Predicted variable value counts:

  1: 20,387
  0: 5,064
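For illustration, a minimal evaluation sketch (scikit-learn, hypothetical file and label names): out-of-fold predictions with stratified CV, per-class metrics instead of plain accuracy, balanced class weights, and a depth limit as the usual overfitting lever.

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report
    from sklearn.model_selection import StratifiedKFold, cross_val_predict

    df = pd.read_csv("data.csv")                       # hypothetical file
    X, y = df.drop(columns=["label"]), df["label"]     # hypothetical label column

    clf = RandomForestClassifier(
        n_estimators=300,
        class_weight="balanced",   # penalise mistakes on the rare class more
        max_depth=12,              # limiting depth is the usual overfitting lever
        random_state=0,
    )

    # Out-of-fold predictions give an honest per-class picture; comparing the
    # per-class recall here against the training-set scores exposes overfitting.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    pred = cross_val_predict(clf, X, y, cv=cv)
    print(classification_report(y, pred, digits=3))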

2 Comments
2025/02/01
13:16 UTC

1

Questions about mechanistic interpretability, PhD workload, and applications of academic research in real-world business?

Dear all,

I am currently a Master's student in math interested in discrete math and theoretical computer science, and I have submitted PhD applications in these fields as well. However, recently, as we have seen advances in the reasoning capacity of foundation models, I'm also interested in pursuing ML/LLM reasoning and mechanistic interpretability, with goals such as applying reasoning models to formalised math proofs (e.g., Lean) and understanding the theoretical foundations of neural networks and/or architectures such as the transformer.

If I really pursue a PhD in these directions, I may be torn between academic jobs and industry jobs, so I was wondering if you could help me with some questions:

  1. I have learned here and elsewhere that AI research in academic institutions is really cut-throat, or that PhD students have to work hard (I'm not opposed to working hard, just to working too hard). Or would you say that only engineering-focused research teams are like this, and the theory ones are relatively more chill?

  2. Other than academic research, I'm also interested in building a business based on ML/DL/LLMs. From your experience and/or discussions with other people, do you think a PhD is more of a nice-to-have or a must-have in these scenarios? Or would you say it depends on the nature of the business/product? For instance, there's a weather forecasting company that uses atmospheric foundation models, which I believe would require knowledge from both CS and atmospheric science.

Many thanks!

0 Comments
2025/02/01
13:01 UTC

77

Anyone want to learn Machine learning in a group deeply?

Hi, I'm very passionate about different sciences like neuroscience, neurology, biology, chemistry, and physics. I think combining ML with those areas is very powerful and has a lot of potential. Would anyone be interested in joining a group to collaborate on research related to these subjects combined with ML, or even just to learn ML and math more deeply? Thanks.

Edit - Here is the link - https://discord.gg/H5R38UWzxZ

86 Comments
2025/02/01
11:50 UTC

0

Perplexity Pro at $29/yr

I am selling Perplexity Pro for $29 (you save $171 off the $200/yr Pro plan).

It's through a partnership program; I can show my own account as proof, plus a few reviews on other posts. Please DM me if you are interested.

Payment: Wise / Crypto / UPI

1 Comment
2025/02/01
10:45 UTC

2

Project Suggestions for resume please?

  1. Please suggest 1 or 2 good ML/DL project ideas (preferably, but not necessarily, in Gen AI) which I can build to add to my resume and GitHub. It should not be something very common or generic like clones or simple image classification; something that would stand out to recruiters.
  2. Also, I have planned to build a multimodal RAG-based website for my final-year capstone project. Could anyone offer tips on how to make it more innovative or better, which models to use, etc., so I can showcase it as my major AI/ML project?
3 Comments
2025/02/01
09:22 UTC

4

AI/ML Questions (First Year CS Student)

Hi, I'm a first-year CS student and I've been having a few questions related to the AI/ML field that I legitimately can't find the answer to anywhere, unfortunately...

First, I'm heavily debating leaning my education towards AI/ML by taking more math, specifically by minoring in statistics. Going into uni, I thought I was just going to be a code demon and grind LeetCode and projects. But I wondered: is that really still the move? What if AI/ML is truly the future? I've been trying to do more research and can't really find any useful insight. So I'm wondering whether anyone thinks SWE jobs will be cooked soon, like in 5+ years, and whether it's likely that AI/ML will be far superior.

Another question: what do you actually do in these new AI/ML jobs? I'm hearing so many different things from different people, so does it just depend on the company? Everywhere I look (YouTube, LinkedIn, personal friends) it's all so confusing. You'll see me use the term "AI/ML", and to be frank, I don't even know exactly what that means. From my understanding, an ML engineer, for example, doesn't actually work on the theory (the math and statistics) behind these models; that's the work of the Master's and/or PhD people. Are ML engineers just SWEs who work with these pre-built/designed models? I've heard they just help train and tune the models through programming and likely other tools that I'm unaware of, but no crazy math or stats is needed, I think? I've also heard that they help "deploy" the models into the real world, because the mathematicians and statisticians wouldn't know how to make them public, since that's what an SWE does in normal SWE jobs.

I mentioned potentially doing a stats minor. Is that at all useful? Some courses I would take are statistical modeling, probability, regression analysis, analysis of variance and experimental design, sampling methodology, and statistical computing. Maybe I should point out that I don't want to be working with a lot of data and graphs and all of that, hence why I don't want to become a data analyst or data scientist, for example. I want to code because it's something I enjoy doing, but I want to know whether these AI/ML jobs are meant for SWEs who just specialize in that field, or whether they're different in the sense that you need a deeper understanding of math and statistics. If so, how much? And if you do need higher-level math/statistics, is it just a matter of taking a few more courses, or do you need a Master's/PhD? If it's just a few more courses, does that mean you're basically still an SWE who just needs some fundamental knowledge to help with your workflow, or is it completely different?

Essentially, is a stats minor significant in increasing the chances of working in that field? What types of tasks would you do in this field? And if anyone can explain when you would require a higher level of math and statistics versus when you wouldn't, depending on the job, I would appreciate it a lot. I enjoy math and, somewhat, statistics, if you were wondering; I'm just trying to figure out what this new field is all about... Thank you so much!

7 Comments
2025/01/31
19:10 UTC

0

What laptop for good performance?

I'm currently learning on a 2017 MacBook Air, so it's pretty old and performs quite slowly. It's struggling more and more, so I'm thinking I will need to change soon. All of my devices are in the Apple ecosystem at the moment, so if a MacBook Pro M2 (2022), for example, is decent enough to work on, I'd be fine with it, but I've heard that lots of things are optimized for NVIDIA GPUs. Otherwise, would you have any recommendations? Also, not sure if it's relevant, but I study finance, so I mainly use machine learning for that. Thank you for your help!

2 Comments
2025/01/31
19:07 UTC

3

Help keeping up with scientific literature with a learning disability

Hello Redditors,

I'm wondering if anyone in the AI/ML space has any tips and tricks on how to keep up with the industry's scientific literature. I currently believe that spending an hour a day reading articles, plus 2 hours each weekend, is achievable, but I'm having difficulty getting those numbers up.

I've been diagnosed with ADHD since high school, and despite getting multiple science degrees I'm finding it difficult to turn this into an easily maintainable routine. I've tried Pomodoro timers, and I'm definitely interested in the material that I'm reading, but any suggestions that others have that I can try out would be highly appreciated.

0 Comments
2025/01/31
18:16 UTC

2

Why is my LSTM just "copying" the previous day?

I'm currently trying to develop an LSTM for predicting the runoff of a river:
https://colab.research.google.com/drive/1jDWyVen5uEQ1ivLqBk7Dv0Rs8wCHX5kJ?usp=sharing

The problem is that the LSTM only seems to "copy" the previous day and output it as the prediction, rather than actually predicting the next value, as you can see in the plot in the Colab file. I've tried tuning the hyperparameters and adjusting the model architecture, but I can't seem to fix it; the only thing I noticed is that the more I tried to "improve" the model, the more accurately it copied the previous day. I've spent multiple sessions on this so far and don't know what I should do.

I tried it with another dataset, the one from the guide I used ( https://www.geeksforgeeks.org/long-short-term-memory-lstm-rnn-in-tensorflow/ ), and the model was able to predict that data correctly. Using a SimpleRNN instead of an LSTM on the runoff data produces the same problem.

Is the dataset maybe the problem, i.e. not predictable? I also added the seasonal decompose and autocorrelation plots to the notebook, but I don't really know how to interpret them.
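For illustration, a small numpy sketch of the baseline comparison (with a hypothetical runoff file): if the LSTM barely beats next-day persistence, the daily series behaves like a random walk at this resolution, and a common workaround is to predict the day-to-day change (plus drivers like rainfall) instead of the absolute level.

    import numpy as np

    runoff = np.loadtxt("runoff.csv")      # hypothetical daily runoff series

    persistence = runoff[:-1]              # naive forecast: tomorrow = today
    target = runoff[1:]
    baseline_mse = float(np.mean((persistence - target) ** 2))
    print("persistence MSE:", baseline_mse)

    # Compare the LSTM's test MSE against this number on the same days.
    # If it is not clearly lower, try forecasting the difference instead:
    delta_target = np.diff(runoff)         # y_t - y_{t-1}, add back the last value after predicting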

2 Comments
2025/01/31
16:50 UTC

1

Questions about continuous ranked probability score (CRPS)

I wasn't able to find any answers online.

Is it bounded, e.g. from 0 to 1, or is it unbounded?

Does it have any simple interpretation? How do you compare two CRPS values, e.g. 5 and 20? In what sense is the model with a CRPS of 5 four times better than the one with 20? Could it be that both models have the same point forecasts but one simply has wider prediction intervals?
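For reference, the standard definition (e.g. Gneiting & Raftery, 2007):

    \[
    \mathrm{CRPS}(F, y) \;=\; \int_{-\infty}^{\infty} \bigl(F(x) - \mathbf{1}\{x \ge y\}\bigr)^{2}\, dx
    \]

If I'm reading it right, it is non-negative and unbounded above (lower is better), and it carries the units of the observed variable, so 5 versus 20 means a four-times-smaller average error in those units rather than a probability. For a degenerate (point) forecast it reduces to the absolute error, which makes mean CRPS comparable to MAE. And yes, two models with identical point forecasts can still differ in CRPS purely through how wide their predictive distributions are.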

0 Comments
2025/01/31
15:28 UTC

3

Why is my validation/test loss not overfitting?

Hi all, I'm relatively new to ML, and I'm completely new to PyTorch.

I'm constructing an NN that takes 4 inputs and produces 2 outputs, and I've tested a bunch of hyperparameters.

My problem is that my train loss is decreasing as it should, and so is my test loss, but my predictions are still not satisfactory.

I've split the data into an 80/20 train/test split, and I have 2 sets of inputs that I'm holding out to see the predictions after training.

I've tried training over a lot of epochs to see if I could induce overfitting, but my test loss never increases, which I think might be part of the reason my predictions are poor.

Any tips or help would be much appreciated!

Here is my code: https://github.com/Muldbak/Impedance_pred

5 Comments
2025/01/31
12:34 UTC
