
A subreddit dedicated to learning machine learning

A subreddit dedicated to learning machine learning. Feel free to share any educational resources for machine learning.

Also, we are a beginner-friendly subreddit, so don't be afraid to ask questions! These can include non-technical questions that are still highly relevant to learning machine learning, such as how to approach a machine learning problem systematically.

  • Foster a positive learning environment by being respectful to others. We want everyone to feel welcome and unafraid to participate.
  • Do share your work and achievements, but do not spam. Keep our subreddit fresh by posting your YouTube series or blog at most once a week.
  • Do not share referral links or other purely marketing content. They prioritize commercial interests over intellectual ones.


Official Discord Server


Getting Started with Machine Learning


Related Subreddits





Machine Learning Multireddit



394,120 Subscribers


ML Specialization by Andrew Ng worth the Coursera subscription?

I’m quite interested in ML at the moment and want to dive deeper into it. I’ve seen Andrew’s course recommended basically everywhere, but it’s paywalled behind a subscription.

Is it worth it, or are there other decent, beginner-friendly courses? Preferably free, but I wouldn’t mind paying for a course too much. I just don’t like subscription models.

09:56 UTC


Dimensionality reduction and precomputed distance matrix

I have a question about dimensionality reduction. I want to understand how methods like MDS and t-SNE work. In particular, I'd like to understand the difference when I precompute the distance matrix or not.

To try to understand it, I took 10 images from the MNIST dataset, and created 10 copies of each one of them. Then I created a distance matrix with the Modified Hausdorff Distance between all the images. Finally, I used the TSNE and MDS functions available in sklearn.manifold to project them into a 2D space. I ran both functions twice, once with the precomputed distance matrix specified (metric='precomputed') and once with the default value.

So, I have 10 images repeated 10 times each. In both the t-SNE and MDS spaces I would expect to see 10 clusters (one per distinct MNIST image), each made of 10 overlapping points (the 10 repetitions of that image), because the distances repeat. I only see this when I don't specify metric='precomputed'; in the precomputed case I see dispersion instead.

Why is this happening? I don't know if there's any article talking about this. I would very much appreciate it if you can help me with this question.
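For anyone reproducing this, the scikit-learn calls differ slightly between the two modes; below is a minimal sketch with toy data (not the poster's MNIST/Hausdorff setup) showing both call patterns for MDS:

```python
# Minimal sketch of MDS on raw features vs. on a precomputed distance matrix.
import numpy as np
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
base = rng.normal(size=(10, 50))          # 10 distinct "images"
X = np.repeat(base, 10, axis=0)           # 10 copies of each -> 100 points

# Variant 1: let MDS compute euclidean distances from the feature array.
emb_feat = MDS(n_components=2, random_state=0).fit_transform(X)

# Variant 2: pass an explicit distance matrix (a stand-in here for Modified
# Hausdorff). With dissimilarity='precomputed', fit_transform expects an
# (n, n) symmetric matrix, not the feature array.
D = pairwise_distances(X, metric='euclidean')
emb_pre = MDS(n_components=2, dissimilarity='precomputed',
              random_state=0).fit_transform(D)

print(emb_feat.shape, emb_pre.shape)      # (100, 2) (100, 2)
```

Note that for t-SNE the analogous option is `TSNE(metric='precomputed')`, which in recent scikit-learn versions also requires `init='random'`, since PCA initialization cannot be computed from a distance matrix alone.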

09:07 UTC


[R] 📘 'Dive Deep into Conformal Prediction with This Ultimate Resource Compilation!'

Hello r/learnmachinelearning enthusiasts!

Are you interested in learning about Conformal Prediction (CP) and exploring its depths? Whether you’re a budding learner, a seasoned researcher, or a practitioner in the field, we’re excited to share something that's bound to pique your interest and expand your knowledge!

🚀 We Present: 'Awesome Conformal Prediction', the Ultimate Collection of Conformal Prediction Resources 🚀

Our passion for CP has driven us to curate an exhaustive list packed with the finest materials on the subject. This treasure trove is designed to guide you through the intricate landscape of Conformal Prediction, covering every aspect you could imagine:

  • 📺 Engaging Videos & Tutorials: Visual learners, rejoice! Find tutorials and talks that break down complex concepts into digestible pieces.
  • 📚 Insightful Books & Papers: From introductory texts to advanced research papers, deepen your understanding of CP’s theoretical foundations and innovative applications.
  • 🎓 Academic Theses: Explore cutting-edge discoveries through PhD and MSc theses dedicated to pushing the boundaries of CP.
  • 📰 Enlightening Articles: Stay updated with articles that shed light on the latest trends, challenges, and breakthroughs in Conformal Prediction.
  • 💻 Open-Source Libraries: Get hands-on with CP by accessing a variety of open-source libraries, perfect for learners and developers keen on practical implementation.

Whether you’re looking to solidify your foundational knowledge, stay abreast of the latest research, or apply CP in your projects, this list is your all-in-one destination.

Dive into this curated collection and join the forefront of those mastering Conformal Prediction. Let’s embark on a journey of learning, discovery, and innovation together in the realm of machine learning!

Happy exploring!


08:14 UTC


How many parameters should my model have?

I'm building an unguided diffusion model from scratch, training on a dataset of 3 x 64 x 64 airplane images. My results so far haven't been great, and I suspect it's because the model isn't big enough, but I have no reference point for how many parameters I should be optimizing to see decent results. Is there a general rule to follow? Running on a 2070 Super, my current UNet-with-time-embeddings implementation has about 40k parameters.

If anyone could give some input it'd be greatly appreciated!
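For reference, the UNets in published 64x64 diffusion work are typically in the tens of millions of parameters, so 40k is very small. A quick back-of-the-envelope sketch (the layer shapes below are invented for illustration) of how conv parameters add up:

```python
# Back-of-the-envelope parameter counting for conv layers, to sanity-check
# model size before building anything.
def conv2d_params(in_ch, out_ch, k, bias=True):
    """Weights (out_ch * in_ch * k * k) plus optional biases (out_ch)."""
    return out_ch * in_ch * k * k + (out_ch if bias else 0)

# A tiny UNet-ish encoder: 3 -> 64 -> 128 -> 256 channels, 3x3 kernels.
channels = [3, 64, 128, 256]
total = sum(conv2d_params(i, o, 3) for i, o in zip(channels, channels[1:]))
print(total)  # 370816 params from just three conv layers
```

For an already-built PyTorch model, `sum(p.numel() for p in model.parameters() if p.requires_grad)` gives the actual trainable-parameter count.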

1 Comment
07:55 UTC


AI Learns GeoGuessr

Hey everyone,

For the past few weeks I have been working on training a competitive multimodal AI model to learn the game GeoGuessr.

If you are not familiar with the game, the goal is to guess exactly where in the world a photo was taken (sourced from Google Street View).

The model is a 152 layer ResNet CNN with a transformer head, organised in an ensemble architecture to process both text and image input.

I created two variations for this test:

  • The first predicts the latitude and longitude of the given location

Prediction of coordinate based on input image

  • The second divides the world into a grid, turning the problem into a classification task, and predicts the region in which the photo was taken

Region-based location prediction

I captured some pretty interesting visualisations during the training process that I have animated together into one consolidated video.

You can check out the full explanation of the models and the training visualisation in the full video here:
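For readers curious about the second variant, the grid labelling can be as simple as bucketing coordinates into cells. A minimal sketch (the 10-degree cell size is my assumption, not the author's):

```python
# Map a latitude/longitude pair to a grid-cell class index for classification.
def latlon_to_cell(lat, lon, cell_deg=10.0):
    """Bucket the globe into cell_deg x cell_deg cells, row-major from (-90, -180)."""
    n_cols = int(360 / cell_deg)              # 36 columns of longitude
    row = int((lat + 90) // cell_deg)         # 0..17 for 10-degree cells
    col = int((lon + 180) // cell_deg)        # 0..35
    row = min(row, int(180 / cell_deg) - 1)   # clamp the lat == 90 edge case
    col = min(col, n_cols - 1)                # clamp the lon == 180 edge case
    return row * n_cols + col

print(latlon_to_cell(48.8566, 2.3522))        # Paris -> cell 486
```

The classifier then predicts one of `18 * 36 = 648` classes, and a predicted cell can be mapped back to its center coordinates for scoring.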

06:55 UTC


Which CS229 To Watch

I have so far found three recent versions of CS229 from Stanford on YouTube: Autumn 2018 taught by Andrew Ng, Summer 2019 taught by Anand Avati, and Spring 2022 taught by Tengyu Ma. Which one should I follow along with? I hear people talk about Andrew Ng's course a lot, but his 2018 course is already six years old now, so I wonder if it will be too outdated for the current industry. Thanks!

1 Comment
06:11 UTC


[D] How should I proceed with this NLP project?

I have a task in which I want to use one of two different machine learning models depending on the received text (I'm planning to use voice-to-text conversion). If the received text matches a dataset about first aid, we use the first aid model; if it matches the disease dataset more closely, we use the disease model. Additionally, after classification, the "symptoms" or input labels must be extracted from the text. How should I proceed?

I have a few questions: should I create a vector database of the whole dataset, or just of the input labels? Should I first identify the entities and then use them to calculate vector distances, or should I match the whole sentence?

The first aid dataset consists of input labels that are emergencies, such as cuts and burns, whereas the disease dataset is a normal classification dataset. The first aid dataset contains instructions on what to do for each injury; should I use a decision model and print out the decision, or construct a language model? For the disease dataset I have an extra dataset containing information on how to "treat" those diseases, and I plan to match the predicted disease to that dataset. Here again, is it better to use a language model or just print out the decision? Please guide me.

I have not tried anything yet. Initially I was planning to use smaller LLMs and simplify the problem, but I was told to keep this simple to run. However, I can't sacrifice performance; I still want the output to be viable language.
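The routing step described above can be prototyped cheaply before reaching for vector databases or LLMs. A hedged sketch (the corpora and labels below are made up): vectorize each dataset with TF-IDF and send the input to whichever corpus it is closer to by cosine similarity.

```python
# Route incoming text to one of two downstream models by comparing it to the
# TF-IDF centroid of each dataset. Toy corpora stand in for the real datasets.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

first_aid_texts = ["how to treat a cut", "apply pressure to the burn", "bandage the wound"]
disease_texts = ["fever and cough for three days", "persistent headache and nausea"]

vec = TfidfVectorizer().fit(first_aid_texts + disease_texts)
fa_centroid = np.asarray(vec.transform(first_aid_texts).mean(axis=0))
dz_centroid = np.asarray(vec.transform(disease_texts).mean(axis=0))

def route(text):
    q = vec.transform([text]).toarray()
    sims = [cosine_similarity(q, fa_centroid)[0, 0],
            cosine_similarity(q, dz_centroid)[0, 0]]
    return ["first_aid", "disease"][int(np.argmax(sims))]

print(route("apply pressure to a burn on my hand"))  # first_aid
```

In practice you would replace TF-IDF with sentence embeddings for better recall, but the routing logic (compare to per-dataset centroids, dispatch to the winner) stays the same.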

04:44 UTC


Need help - starting to learn ML

Hello 👋🏼

I needed some help from anyone who’s learning/knows their way around ML. I want to start learning it and I have zero knowledge about it (apart from some theoretical stuff because of classes).

  1. Are there any prerequisites? If yes then what?
  2. What are some GOOD resources? (both free & paid, priority to the free ones)
  3. How much time would it generally take for me to even be slightly good at it?

(Add whatever else you feel is necessary to know even if I haven’t asked it)

I do get stressed and a little hopeless if I'm not seeing progress, so it'd be even better if any of you could mentor me through it and check in regularly so that I can be accountable to someone :)

04:35 UTC


Laptop for Master’s Student

Hi everyone, I was recently admitted to Duke for its MEng program in AI. I’m looking to buy a laptop now (in the US) for the course, and my budget is around $1500. Please tell me the best laptop I could buy for this price, along with how much coding is actually done on our own laptops while at university. Also, do our laptops need GPUs? Thanks!

03:56 UTC


Is it worth going to grad school (MS CS)?

I've been working as a new grad SWE for around 9 months now and I honestly do not enjoy traditional SWE work, so I'm trying to transition into an ML role. I applied to MS CS programs and got into a couple and I'm considering whether or not to attend in the Fall. I really enjoyed research in college (and still do research part-time), so I'd like to get more experience by going back to school, and potentially do a PhD as well.

My biggest concern is that from what I've heard, prestige matters a LOT when it comes to MS CS programs, especially in relation to your undergrad school. I was fortunate enough to go to a T1 undergrad for CS, so even the good schools I got into (e.g. UCSD) are a decent step down from my undergrad. Also, although I don't like my job, it does pay a lot and the company is doing really well in spite of market conditions, which makes it even riskier leaving.

Overall I just don't know what to do next. Should I go to a master's program in the Fall? Wait another year and apply for a PhD? Or hope that my current research experience and knowledge is enough to luck into an ML role at another company?

03:26 UTC


Where can I find my best MSE score?

I'm new to ML, so can someone help me?
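In case the question is how a "best MSE" is obtained at all: it isn't stored anywhere for you by default. You compute MSE on held-out data per fold (or per hyperparameter setting) and keep the lowest value yourself. A minimal sketch on toy data:

```python
# Compute MSE on each cross-validation fold and track the best (lowest) one.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

fold_mse = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    fold_mse.append(mean_squared_error(y[test_idx], model.predict(X[test_idx])))

best_mse = min(fold_mse)   # lower is better for MSE
print(best_mse)
```

If you are using `GridSearchCV` with `scoring='neg_mean_squared_error'`, the best MSE is `-search.best_score_` (scikit-learn negates error metrics so that higher is always better).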

21:41 UTC


Materials for revising machine learning pre-interview? [D]

Hello! I completed the IBM Professional Machine Learning Certificate a few months ago and it was my first formal exposure to machine learning (although I am a scientist by background so I was already very comfortable with statistics and programming in Python). Now I have a job interview coming up and machine learning experience was one of the listed requirements so I want to revise. The IBM course was mostly videos and I much prefer reading notes.

Does anyone have any recommendations for notes or books I could use to revise? I am quite a fast reader so it doesn't matter if it's a couple of hundred pages but I don't need any in-depth material on the underlying maths or statistics or basic coding. Basically, I need a means of revising the different common machine learning tools, their relative pros and cons in different scenarios, and their important variables. Ideally for both unsupervised and supervised learning and deep learning and with a Python-based approach.

Extra points if you point me to something I can read on Kindle but that's not critical :) Thanks so much!

20:59 UTC




🚨 Need Help with Undersampling Techniques in Machine Learning Project 🚨

Hey everyone,

I'm currently facing an issue with undersampling techniques in my machine learning project. Despite using both NearMiss and RandomUnderSampler from the imbalanced-learn library, I'm not seeing any reduction in the size of my training set. I've double-checked my code, but I can't seem to figure out what's causing this issue.

from imblearn.under_sampling import NearMiss, RandomUnderSampler
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score
from imblearn.pipeline import make_pipeline as imbalanced_make_pipeline
from sklearn.linear_model import LogisticRegression
import numpy as np

# Define my data
undersample_X = df.drop('Class', axis=1)
undersample_y = df['Class']

# Define evaluation metrics
undersample_accuracy = []
undersample_precision = []
undersample_recall = []
undersample_f1 = []
undersample_auc = []

# Define cross-validation strategy
skf = StratifiedKFold(n_splits=5, random_state=None, shuffle=False)

for train_index, test_index in skf.split(undersample_X, undersample_y):
    # Split data into train and test sets
    undersample_Xtrain, undersample_Xtest = undersample_X.iloc[train_index], undersample_X.iloc[test_index]
    undersample_ytrain, undersample_ytest = undersample_y.iloc[train_index], undersample_y.iloc[test_index]

    # Define pipeline with undersampling and logistic regression
    undersample_pipeline = imbalanced_make_pipeline(NearMiss(sampling_strategy='majority'), LogisticRegression())
    # Fit the model
    undersample_model = undersample_pipeline.fit(undersample_Xtrain, undersample_ytrain)
    # Make predictions
    undersample_prediction = undersample_model.predict(undersample_Xtest)
    # Calculate evaluation metrics
    undersample_accuracy.append(accuracy_score(undersample_ytest, undersample_prediction))  # score on the held-out fold, like the other metrics
    undersample_precision.append(precision_score(undersample_ytest, undersample_prediction))
    undersample_recall.append(recall_score(undersample_ytest, undersample_prediction))
    undersample_f1.append(f1_score(undersample_ytest, undersample_prediction))
    undersample_auc.append(roc_auc_score(undersample_ytest, undersample_prediction))

# Print evaluation metrics
print("undersample_accuracy: {}".format(np.mean(undersample_accuracy)))
print("undersample_precision: {}".format(np.mean(undersample_precision)))
print("undersample_recall: {}".format(np.mean(undersample_recall)))
print("undersample_f1: {}".format(np.mean(undersample_f1)))
print("undersample_auc: {}".format(np.mean(undersample_auc)))

My dataframe's class distribution:

  • Number of fraud instances: 492
  • Number of non-fraud instances: 284315
  • Total number of instances: 284807

Any insights or suggestions on why the undersampling techniques are not reducing the size of the training set would be greatly appreciated. Thanks in advance for your help!
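For what it's worth, a frequent source of this symptom is that an imbalanced-learn pipeline resamples internally during fit and never modifies the arrays you pass in, so `undersample_Xtrain` keeps its original size. A pure-NumPy sketch of the same behaviour (a hand-rolled stand-in for RandomUnderSampler, not the poster's exact setup):

```python
# Resampling returns new, smaller arrays; the originals are untouched --
# which is also how the imblearn pipeline behaves inside fit.
import numpy as np

def random_undersample(X, y, random_state=0):
    """Downsample every class to the minority class size."""
    rng = np.random.default_rng(random_state)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]

X = np.arange(40).reshape(20, 2)
y = np.array([0] * 18 + [1] * 2)          # 18 majority, 2 minority

X_res, y_res = random_undersample(X, y)
print(X.shape, X_res.shape)               # (20, 2) (4, 2): X itself is unchanged
```

So to actually see the reduced set, inspect the output of the sampler's `fit_resample` (e.g. `NearMiss(...).fit_resample(X_train, y_train)`) rather than the arrays passed to the pipeline.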

1 Comment
19:02 UTC


How to Run Machine Learning on Super Computers

18:55 UTC


Statistically sound modified clustering algorithm with domain knowledge for noisy data?

I am currently attempting to use hierarchical clustering on noisy data to assign data points to 3 different classes that are pre-defined through domain knowledge. Classes A and B represent 'extreme' groups and class C represents an 'in-between' group. There are many data points that intuition and expert knowledge would place in the in-between class C, but the algorithm classifies them as group B. The maximum average silhouette score I can get is 0.35 (perhaps that's good for noisy, real-world data?), which is not great, but the clustering does a pretty good job on most data points that I have intuition about.

I am tempted to manually reclassify those data points to what I deem is the correct group, but then it wouldn't be a data-driven decision anymore and would add subjectivity. Therefore, I am looking for statistically sound methods, or algorithms following explicit rules, that would classify the data in ways that mostly confirm expert judgment.

I was thinking of using some metric of uncertainty and assigning the uncertain points to the 'in-between' class, as a kind of conservative approach.

Any ideas, tips or thoughts about such issues? Thanks

1 Comment
18:37 UTC


[Beginner in ML+Image Captioning] Seeking initial guidelines to set up an image captioning app for the specified scenario

Disclaimer: I'm a beginner in ML and image processing, but I have taken a theoretical course on machine learning fundamentals. (I posted this same post a while ago in r/ComputerEngineering to ask about the hardware side of things too.) Sorry for the long post. (I'm assuming that the 'HELP' flair means asking for help, not offering it.)

Also, if this post is inappropriate for this sub, please redirect me to the most appropriate (and fast-responding) subreddit before removing it.

Project Overview: I'm working on a virtual lab assistant for a physics lab that CAPTIONS images of lab instruments and experiment steps (such as Vernier calipers) and acts as a general visual assistant for visually impaired people. The idea is a smart, preferably wearable component (like a camera mounted on spectacle-like equipment) that makes physics experiments easier to understand for the visually impaired.

I am also planning to deliver it as a mobile app that can be connected to the camera and the other hardware used in this project (if possible).

My Questions:

Regarding the hardware and interfacing side of things, this might not be the right sub, but I included the overall plan for cohesiveness. On the ML side, I tried following tutorials on CNN+LSTM-style architectures, but right now I'm drawing a blank. Could you point me to some starting directions on the ML (training + deployment) side so I have a good base to customise? The training data would be images of physics lab experiments (like Vernier calipers), just reiterating for context.

1 Comment
18:18 UTC


Some advice on moving from the classroom to industry

I recently gave an alumni talk to students at my former grad program on some differences between data science as you learn it from a textbook and data science as practiced in industry. The talk was well-received, so I adapted it into a blog post. I split the post into one section on the need for soft skills—with a focus on communication—and hard skills, specifically around tooling that's typically not encountered until you're on the job. I know this sub has a lot of folks interested in entering or transitioning into the industry, so I hope this might be of some value.

I'm open to feedback and follow-up questions as well.


18:10 UTC


Extracting text and words about a specific topic from a string


I'm graduating in biomedical engineering and I'm working on a little project involving LLMs. It's not important and it's more of a showcase project so it doesn't have to be dead accurate.

Given a string, I need to extract any terms that could have medical meaning, especially diseases and symptoms. I have tried OpenAI's function calls and they work wonders; the problem is that the API is getting expensive, so I'd rather stop using it and move to something that doesn't work quite as well but still does the job.

I tried NLP libraries (scispacy and medspacy), and while they do the job, they miss words in a way that's not very consistent (e.g., in "I have been feeling back pain" only "pain" is detected, but for some reason "They have been feeling back pain" yields "back pain"). I guess it has to do with tokenization, but the inconsistency makes them a bit useless right now.

I was wondering what other options I have; I'm using Python, if that helps. Would running a local LLM work for this? Assuming it would even run on my PC, which I doubt.

Thank you very much for the help!
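One cheap, deterministic option before reaching for a local LLM is plain lexicon matching, which sidesteps the tokenization inconsistency entirely. A minimal sketch (the lexicon below is a made-up stand-in for a real vocabulary such as MeSH or UMLS terms):

```python
# Longest-match phrase lookup against a medical term list: multi-word terms
# are tried first so "back pain" wins over plain "pain".
import re

LEXICON = {"back pain", "pain", "fever", "shortness of breath", "nausea"}

def extract_terms(text, lexicon=LEXICON):
    text = text.lower()
    found = set()
    for term in sorted(lexicon, key=len, reverse=True):
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            found.add(term)
            text = text.replace(term, " ")   # prevent overlapping re-matches
    return found

print(extract_terms("I have been feeling back pain and nausea"))
```

Because the match is purely lexical, "I have been feeling back pain" and "They have been feeling back pain" extract the same term, unlike the parser-dependent behaviour described above; the trade-off is that it only finds terms already in the lexicon.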

1 Comment
17:50 UTC


Generate Synthetic DB

Hey guys, I have been looking at synthetic data generation models for tabular data and I have found a few examples:

And a few more from articles I have read. One of my issues is having to expose the underlying sets.

One methodology would be to apply homomorphic encryption to the dataset (https://github.com/zama-ai/tfhe-rs), pass the result through a synthetic generator, and then decrypt the output. However, I have concerns that this won't work.

How do people here create synthetic databases from origin data without exposing or interacting heavily with the underlying set?

Thanks in advance for those taking the time!

1 Comment
17:04 UTC


Looking for ML expert for lottery algorithm

I'm currently looking for someone who knows ML to look over the plans I have for a lottery predictor. I know it can be done. I'm looking for someone who knows how to build it; I have a dataset and a plan mapped out, and just need someone to build it. Thanks.

16:58 UTC


Need advice from professionals and people who are in this field for many years

I just graduated last month and specialize mainly in backend development. I am currently doing a part-time job as a backend developer too, but there are so many things going on in the world right now, so many things to learn, that I don't know what to do. I want to excel in what I do, and I'm unsure how to reach that level. If I start a full-time job, I know I won't be able to dedicate time to learning new things.

Recently, I've been gaining interest in machine learning and data science, but I don't want to breeze through it; I want to learn it from the core, from the math up to the model algorithms. I don't know where to start, and it's kind of overwhelming. Machine learning has already reached a new level, and if I start now, I feel like I'm never going to catch up.

I've also considered going to grad school to study AI and ML, but the computing course I took in college didn't focus much on math; it just breezed through it, so I'm not confident I should go to grad school at my current level. Do you have any ideas or suggestions? Also, with so many things to learn and do, how do you filter all this noise and focus on one thing?

16:57 UTC



What videos or blog posts helped you understand the complex architectures of LSTMs and RNNs?

15:28 UTC


Using pretrained GloVe embeddings drops loss and accuracy to 0

I'm trying to build my own transformer for a project. I'm able to implement the model just fine; originally I was not planning to use pretrained embeddings, but I thought they might help performance.

I'm running into an issue where my model works perfectly fine without the GloVe weights initialised (I get a slow increase in accuracy and a decrease in loss), but plugging the weights in drops everything to 0.

My architecture is almost identical to the TensorFlow transformer tutorial, with just some changes in the positional encoder class to allow for the weights.

Any help would be appreciated!
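The step that most often goes wrong with pretrained embeddings is building the matrix: rows must be aligned to your tokenizer's indices, and out-of-vocabulary words should get small random vectors rather than zeros. A sketch (the two-word GloVe text below is fake; real files come from the Stanford NLP downloads):

```python
# Build an embedding matrix aligned to YOUR vocabulary from GloVe-format text.
import io
import numpy as np

fake_glove = io.StringIO("the 0.1 0.2 0.3\nplane 0.4 0.5 0.6\n")

def load_glove(handle):
    vectors = {}
    for line in handle:
        word, *vals = line.split()
        vectors[word] = np.array(vals, dtype=np.float32)
    return vectors

glove = load_glove(fake_glove)
vocab = {"<pad>": 0, "the": 1, "plane": 2, "zzzunknown": 3}
dim = 3

rng = np.random.default_rng(0)
emb = rng.normal(scale=0.1, size=(len(vocab), dim)).astype(np.float32)
emb[0] = 0.0                                   # padding row stays zero
hits = 0
for word, idx in vocab.items():
    if word in glove:
        emb[idx] = glove[word]                 # overwrite rows we have vectors for
        hits += 1
print(hits, emb.shape)                         # 2 (4, 3)
```

If accuracy collapses to exactly 0 once the weights are plugged in, it is also worth checking that the matrix row order really matches the tokenizer's indices and that the loss is not going NaN (which some training displays render as 0).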

15:24 UTC


Emotion detection

Hi, does anybody have a simple emotion detection built with cv2 and DeepFace? It would be helpful if anybody could show me how to build one. There are some videos on YouTube, but they don't work. If anybody has a solution, just text me. Thanks!

14:22 UTC


Learning ML

Hello, I am a first-year undergrad and I want to learn machine learning without cutting any corners, doing the math and everything. How would you recommend going about that? I have seen high recommendations for the Dive into Deep Learning book: https://d2l.ai/index.html

I would appreciate it if someone could confirm whether this is the best book to learn from, or recommend any others.

Thank you for your time.

14:16 UTC


What are some research topics for me to write a paper on in my undergraduate?

I've been planning to do research under a supervisor during my undergraduate degree for a long while, but I never got around to it. I need some ideas. I thought papers needed to be on something really groundbreaking or complex, but I checked some ML papers written by other undergrad students and they all seem to be fairly basic: predicting movie blockbusters from box office data, agricultural output prediction, and so on. These seem like simple projects where you just download the data, clean it, and try different algorithms.

I want to try a topic that involves at least some mathematics. I'd prefer reinforcement learning, since that's where I'm focusing, but other areas work too. I'd also like to work with robotics. I can't think of a topic good enough, so I'm asking you all here for topics simple enough for an undergrad student to research.

14:08 UTC


Should I do Machine Learning first or Data Structures as a Python Developer?

Hi, I am an intermediate-level Python programmer. I want to excel in machine learning as my career down the road, and I want to do DSA as well. Should I do machine learning after DSA, or DSA after machine learning? Can someone please help me out with this? By the way, I am in 12th grade, looking to join a college in a few months.

12:43 UTC


Issues using RNN for drum sound classification

I am trying to use RNNs for classification of sounds (specifically drum categorization) as I described in the following post on Stack Exchange in more detail:


It is inspired by the article referenced in the post above. I know an RNN in the frequency domain is not the best option and convolutional networks would be a better choice, but I want to try both approaches for learning purposes.

I do not expect fantastic results, but I do expect it to distinguish vastly different categories, for example overheads and kick drums.

(Sorry for referencing instead of making a full post here, but I try to post on as many platforms as I can.)

12:18 UTC
