/r/learnmachinelearning

A subreddit dedicated to learning machine learning

Feel free to share any educational machine learning resources.

Also, we are a beginner-friendly subreddit, so don't be afraid to ask questions! This includes non-technical questions that are still highly relevant to learning machine learning, such as how to approach a machine learning problem systematically.

  • Foster a positive learning environment by being respectful to others. We want everyone to feel welcome and unafraid to participate.
  • Do share your work and achievements, but do not spam. Keep our subreddit fresh by posting your YouTube series or blog at most once a week.
  • Do not share referral links and other purely marketing content. They prioritize commercial interests over intellectual ones.

Chatrooms

Official Discord Server


Wiki

Getting Started with Machine Learning

Resources


Related Subreddits

/r/MachineLearning

/r/MLQuestions

/r/datascience

/r/computervision

Machine Learning Multireddit

/m/machine_learning

/r/learnmachinelearning

449,172 Subscribers

3

I don't know what to do next

I'm a college student looking to learn ML and then DL. I know Python, NumPy, pandas, Matplotlib, and the maths, and I've completed Andrew Ng's ML Specialization course.

Now I'm stuck: I don't know what to do next. Should I learn EDA and preprocessing, or start learning ML algorithms? And where and how can I learn these? I need your guidance, guys. Please help me out. Thanks in advance!

(Edit: please upvote so this post stays near the top; it could help people like me.)

0 Comments
2024/11/10
08:59 UTC

0

Get access to almost all machine learning courses on Coursera at $239 for one year.

Offer Details:

  • Offer Dates: Nov 7, 2024 — December 12, 2024 (26 days)
  • Offer Discount: 40% Off Coursera Plus Annual Subscription ($160 off)
  • Limitations: excludes IN, DE, and Spanish-speaking LATAM

Starting today, Coursera is offering a 40% discount on its annual Coursera Plus subscription. You can gain unlimited access to over 7,000 courses, including Professional Certificates from top industry leaders like Google, Meta, Microsoft, IBM, and more — all for just $239 (regularly $399) for 12 months. Read the main article.

0 Comments
2024/11/10
08:50 UTC

2

Questions for practice

I have just completed the Mathematics for Machine Learning book, but I am unable to find resources with practice questions. Can anyone suggest some?

0 Comments
2024/11/10
07:41 UTC

2

[Help Needed] Looking for a Machine Learning Dataset

Hi everyone! I'm a student working on a machine learning project and I'm in need of a dataset. Ideally, I’m looking for a dataset that has a few thousand samples with around 15 features that I can preprocess and then use for training ML algorithms. I’ve received suggestions on general sources for datasets, but I’m looking for particular datasets that are well-suited for hands-on learning and experimentation. Any specific recommendations would be very appreciated!

Thank you in advance!
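
If a concrete suggestion helps (mine, not from the thread): the Adult census dataset roughly matches the requested shape and is a common choice for exactly this kind of practice. A minimal sketch, assuming scikit-learn is installed:

# Hedged sketch: the "adult" census dataset (~48k rows, 14 features)
# roughly matches the requested shape. Requires scikit-learn.
from sklearn.datasets import fetch_openml

# as_frame=True returns a pandas DataFrame for easy preprocessing
adult = fetch_openml("adult", version=2, as_frame=True)
X, y = adult.data, adult.target
print(X.shape)           # roughly (48842, 14)
print(y.value_counts())  # binary income labels: <=50K vs >50K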

3 Comments
2024/11/10
06:57 UTC

5

[Help] LSTM seq2seq generating same sequence

Kaggle Notebook

I am trying to implement a seq2seq model in PyTorch for translation. The problem is that the model generates the same sequence every time. My goal is to implement attention for seq2seq and then eventually move on to transformers. Can anyone look at my code (Kaggle notebook also attached)?

import torch
import torch.nn as nn
from tqdm import tqdm

# ENG_VOCAB_SIZE, FR_VOCAB_SIZE, device and train_dataloader are defined
# earlier in the notebook.

class Encoder(nn.Module):
  def __init__(self, vocab_size, embedding_dim, hidden_dim, num_layers):
    super(Encoder, self).__init__()
    self.vocab_size = vocab_size
    self.embedding_dim = embedding_dim
    self.hidden_dim = hidden_dim
    self.num_layers = num_layers
    self.embedding = nn.Embedding(self.vocab_size, self.embedding_dim)
    self.lstm = nn.LSTM(self.embedding_dim, self.hidden_dim, self.num_layers, batch_first=True)

  def forward(self, x):
    x = self.embedding(x)
    output, (hidden_state, cell_state) = self.lstm(x)
    return output, hidden_state, cell_state


class Decoder(nn.Module):
  def __init__(self, vocab_size, embedding_dim, hidden_dim, num_layers):
    super(Decoder, self).__init__()
    self.vocab_size = vocab_size
    self.embedding_dim = embedding_dim
    self.hidden_dim = hidden_dim
    self.num_layers = num_layers
    self.embedding = nn.Embedding(self.vocab_size, self.embedding_dim)
    self.lstm = nn.LSTM(self.embedding_dim, self.hidden_dim, self.num_layers, batch_first=True)
    self.fc = nn.Linear(self.hidden_dim, self.vocab_size)

  def forward(self, x, h, c):
    x = self.embedding(x)
    # NOTE: h and c are accepted but never passed to the LSTM here, so the
    # decoder always starts from a zero state and never sees the encoder state.
    output, (hidden_state, cell_state) = self.lstm(x)
    output = self.fc(output)
    # NOTE: this returns the h and c that were passed in, not the new
    # hidden_state and cell_state produced by the LSTM.
    return output, h, c


class Seq2Seq(nn.Module):
  def __init__(self, encoder, decoder):
    super(Seq2Seq, self).__init__()
    self.encoder = encoder
    self.decoder = decoder

  def forward(self, X, Y):
    output, h, c = self.encoder(X)
    decoder_input = Y[:, 0].to(torch.int32)
    output_tensor = torch.zeros(Y.shape[0], Y.shape[1], FR_VOCAB_SIZE).to(device)
    # output_tensor[:, 0] = Y[:, 0]  # Set same start token, which is "<START>"

    # Greedy decoding: feed each step's argmax back in as the next input.
    for i in range(1, Y.shape[1]):
      output_d, h, c = self.decoder(decoder_input, h, c)
      # output_d shape: (batch_size, fr_vocab_size)
      decoder_input = torch.argmax(output_d, dim=1)
      # decoder_input shape: (batch_size,)
      output_tensor[:, i] = output_d

    return output_tensor  # output shape: (batch_size, seq_length, fr_vocab_size)


class Seq2Seq2(nn.Module):
  def __init__(self, encoder, decoder):
    super(Seq2Seq2, self).__init__()
    self.encoder = encoder
    self.decoder = decoder

  def forward(self, X, Y):
    # Teacher forcing: the target sequence (minus its last token) is the input.
    output, h, c = self.encoder(X)
    decoder_input = Y[:, :-1].to(torch.int32)
    output_tensor, h, c = self.decoder(decoder_input, h, c)
    return output_tensor

encoder = Encoder(ENG_VOCAB_SIZE, 32, 64, 1).to(device)
decoder = Decoder(FR_VOCAB_SIZE, 32, 64, 1).to(device)
model = Seq2Seq2(encoder, decoder).to(device)

lr = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
loss_fn = nn.CrossEntropyLoss(ignore_index=0)  # index 0 is the padding token
epochs = 20

for epoch in range(epochs):
    running_loss = 0.0
    progress_bar = tqdm(train_dataloader, desc=f"Epoch {epoch+1}", leave=False)

    for X, Y in progress_bar:
        Y_pred = model(X, Y)

        # Shift targets by one step so position i is scored against token i+1.
        Y_pred = Y_pred.reshape(-1, Y_pred.size(-1))  # (batch_size * seq_length, vocab_size)
        Y_true = Y[:, 1:]
        Y_true = Y_true.reshape(-1)  # (batch_size * seq_length)

        loss = loss_fn(Y_pred, Y_true)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Update running loss and display it in tqdm
        running_loss += loss.item()
        progress_bar.set_postfix(loss=loss.item())

    print(f"Epoch {epoch+1}, Loss = {running_loss/len(train_dataloader)}")
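
For reference, a minimal sketch (mine, not from the notebook) of how a single decoding step usually threads the LSTM state through; it assumes the Decoder layers defined above, with the state both consumed and returned:

# Hedged sketch: one greedy-decoding step that consumes and returns its state.
# Uses the embedding/lstm/fc layers of the Decoder class above.
def decoder_step(decoder, token_ids, h, c):
    x = decoder.embedding(token_ids).unsqueeze(1)  # (batch, 1, embedding_dim)
    output, (h, c) = decoder.lstm(x, (h, c))       # pass the previous state in
    logits = decoder.fc(output.squeeze(1))         # (batch, vocab_size)
    return logits, h, c                            # return the updated state

Each step then starts from the state produced by the previous step (initially the encoder's final state), which is what lets the output change along the sequence.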
  
5 Comments
2024/11/10
02:14 UTC

0

Best Resources & Advice for Getting Started in Machine Learning?

I’m planning to learn machine learning, but I’m at the start of my computer science degree and feeling a bit overwhelmed by all the options out there. I’d love some guidance on where to begin, especially since I want to do a master's in machine learning and want to be a standout applicant.

Some questions I have:

  1. Should I focus on learning the math first, or dive into practical ML and learn the math as I go?
  2. What online courses or resources would you recommend?
  3. How can I improve my chances of getting into a good machine learning master's?

Thank you for your answers in advance :)

1 Comment
2024/11/09
23:55 UTC

1

Algorithm suggestions for tracking noisy measurements?

I'm tracking a value over time with noisy measurements, and I'm interested in knowing both the estimate of the underlying value at a given instant and the estimated error in that value at each instant. (Essentially, value plus-or-minus error, both as functions of time.)

For example, if the real value did a step function in time, the measured value would have some transition as it jumps to the new value, and during that transition, the error would spike.

I've been trying a Bayesian linear dynamical system with a Kalman filter (it's possible that I've implemented this wrong), but it seems to get increasingly certain, even when horribly wrong. Any suggestions for good algorithms for this type of problem?

Also, the measurement noise is Gaussian, and I know roughly what its distribution is, if that helps at all.
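
For reference, a minimal 1-D random-walk Kalman filter sketch (mine, not from the post; the process-noise variance q is an assumed tuning knob). A strictly positive q is what lets the filter track jumps and keeps the reported variance from collapsing toward zero:

import numpy as np

def kalman_1d(z, q=1e-3, r=0.1**2):
    """Scalar random-walk model x_t = x_{t-1} + w_t, measured as z_t = x_t + v_t.
    q: process-noise variance (assumed); r: measurement-noise variance."""
    x, p = z[0], r                      # initialize from the first measurement
    estimates, sigmas = [], []
    for zt in z:
        p = p + q                       # predict: uncertainty grows by q
        k = p / (p + r)                 # Kalman gain
        x = x + k * (zt - x)            # update estimate toward the measurement
        p = (1.0 - k) * p               # update uncertainty
        estimates.append(x)
        sigmas.append(np.sqrt(p))
    return np.array(estimates), np.array(sigmas)  # value and 1-sigma error over time

If q is zero, the posterior variance shrinks monotonically regardless of fit, which matches the "increasingly certain even when horribly wrong" symptom.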

0 Comments
2024/11/09
20:52 UTC

1

Correlation of columns

0 Comments
2024/11/09
20:34 UTC

2

Comparing two roles

Hi guys, I got an offer from a small insurance company for a data scientist role, predicting customer behaviour to feed into a risk equation (including deployment and monitoring), but I think the role lacks work-life balance. My current role is machine learning engineer at a big insurance company with great work-life balance, mainly doing research and proofs of concept with generative AI such as GPT. The offer is about 10% higher than my current pay. Please share some advice, as I'm struggling to make a wise decision.

2 Comments
2024/11/09
20:30 UTC

38

What does a volatile test accuracy during training mean?

While training a classification neural network, I keep getting a very volatile, "jumpy" test accuracy. I'm still in the early stages of fine-tuning the network, but I'm curious whether this has any well-known implications about the model. How can I get it to stabilize at a higher accuracy? I appreciate any feedback or thoughts on this.
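
If the jumpiness persists, one common stabilizer (a suggestion, not from the post; model, train_one_epoch, evaluate and num_epochs are hypothetical stand-ins) is to shrink the learning rate once the metric plateaus, e.g. with PyTorch's ReduceLROnPlateau:

import torch

# Hedged sketch: model and the helpers below stand in for the post's setup.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.5, patience=3)  # "max" because accuracy

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer)   # hypothetical training helper
    acc = evaluate(model)               # hypothetical test-accuracy helper
    scheduler.step(acc)                 # halve the LR after 3 flat epochs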

39 Comments
2024/11/09
20:11 UTC

1

Back propagation correct step

Hello, I am trying to work out the formulas for the partial derivatives of the MSE (mean squared error) loss with respect to the parameters of a single-hidden-layer multilayer perceptron. There are m input neurons and n hidden neurons. The weights of the hidden layer form an m x n matrix, and the bias of the hidden layer is an n-dimensional vector. The weights of the output layer form an n x k matrix, and the bias of the output layer is a k-dimensional vector.

Given the above dimensions, I define a true output matrix Y that holds the true values for all samples. I have N samples and k output neurons, giving a k x N matrix. The same can be done for Ypred, which holds our predicted values.

Define the loss as MSE = (1/N) * sum over samples of (Yi - Ypredi)**2.

One can easily see that d(MSE)/d(Ypred) = (2/N)(Ypred - Y).

Now I am trying to find d(MSE)/d(Wo), where Wo represents the weights of the output layer. I tried to use the chain rule: d(MSE)/d(Wo) = d(MSE)/d(Ypred) x d(Ypred)/d(Wo), with Ypred = Wo^T Xh + Bo, where Xh is the output of the hidden layer and Bo is the bias vector bo repeated N times as columns. However, I am stuck here, because the derivative of a matrix (Ypred) with respect to a matrix (Wo) is a tensor, right? How do I simplify these relationships and continue with the other parameters?

Any help, even just the answers, will be appreciated. Thanks!
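
For the output-layer parameters, one standard trick (a sketch in the post's notation; it avoids forming d(Ypred)/d(Wo) as a fourth-order tensor by differentiating the scalar loss directly via the Frobenius inner product) gives:

% Sketch: gradient of MSE w.r.t. the output weights, avoiding 4-index tensors.
% Notation from the post: Xh is n x N, Wo is n x k, Ypred = Wo^T Xh + Bo is k x N.
\[
\mathrm{MSE} = \frac{1}{N}\,\lVert Y_{\mathrm{pred}} - Y \rVert_F^2,
\qquad
Y_{\mathrm{pred}} = W_o^{\top} X_h + B_o .
\]
\[
\frac{\partial\,\mathrm{MSE}}{\partial W_o}
  = \frac{2}{N}\, X_h \,\bigl(Y_{\mathrm{pred}} - Y\bigr)^{\top}
  \in \mathbb{R}^{n \times k},
\qquad
\frac{\partial\,\mathrm{MSE}}{\partial b_o}
  = \frac{2}{N}\,\bigl(Y_{\mathrm{pred}} - Y\bigr)\,\mathbf{1}_N
  \in \mathbb{R}^{k}.
\]

The same argument, writing dL as the inner product of (2/N)(Ypred - Y) with dYpred and peeling off one factor at a time, extends to the hidden-layer parameters via the chain rule through the activation.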

0 Comments
2024/11/09
20:01 UTC

12

Math behind Diffusion Models

Does anyone have good resources that explain the math behind diffusion models crystal clear?

10 Comments
2024/11/09
17:34 UTC

2

SOTA architecture to build image classifiers which depend on text shown in pictures

I am currently trying to build a simple multi-class image classifier. I want to use a pretrained model for image embeddings. However, to reliably differentiate the classes of my task, the model also needs to take into account the text/numbers displayed in the image. The number of text snippets per image is not fixed.

Most vision encoders have a fairly small input size, which makes the text unintelligible to the model, so the relevant text has to be extracted with a different approach, for example OCR tools.

My idea would be to run a detection + recognition OCR tool, embed the recognized text using a text encoder, and then add positional embeddings based on the bounding-box locations in the image.

However, given the "n" embedded texts plus the embedded image, what would be the best way to combine them and feed them into a classification head, for example?

In general, is the approach I am trying to take feasible, or are there other approaches that would ensure the text in the image is taken into account, in addition to the general image structure?

Thank you guys in advance!
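
One pattern that may fit here (a sketch, not a definitive design; the embedding dimensions and head count are assumptions) is to let the image embedding attention-pool over the variable-length set of text embeddings, then concatenate the result for the classification head:

import torch
import torch.nn as nn

class ImageTextFusion(nn.Module):
    # dims are assumptions: img_dim/text_dim depend on the chosen encoders
    def __init__(self, img_dim=768, text_dim=384, n_classes=10):
        super().__init__()
        self.q = nn.Linear(img_dim, text_dim)  # image embedding acts as query
        self.attn = nn.MultiheadAttention(text_dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(img_dim + text_dim, n_classes)

    def forward(self, img_emb, text_embs, text_mask):
        # img_emb: (B, img_dim); text_embs: (B, T, text_dim), padded to length T
        # text_mask: (B, T) bool, True where a slot is padding
        q = self.q(img_emb).unsqueeze(1)            # (B, 1, text_dim)
        pooled, _ = self.attn(q, text_embs, text_embs,
                              key_padding_mask=text_mask)
        pooled = pooled.squeeze(1)                  # (B, text_dim)
        return self.head(torch.cat([img_emb, pooled], dim=-1))

The OCR bounding-box positional embeddings described above would be added to text_embs before this module; the padding mask is what handles the variable number of snippets per image.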

0 Comments
2024/11/09
17:02 UTC

1

The dynamics of SGD

Hello,

I have a background in pure mathematics, and I would like to better understand the dynamics of stochastic gradient descent (SGD): speed of convergence, guarantees of convergence, continuous approximations of SGD, and so on, in the genuinely stochastic case, that is, not just classical convex optimization where the objective function is fully known.

Would you have any references to get up to date? I would prefer recent papers. Thank you very much!

0 Comments
2024/11/09
16:24 UTC

1

Having trouble with a classifier, need advice

Hello everyone.

I have a question. I am just starting my journey in machine learning, and I have encountered a problem.

I need to make a neural network that determines from an image whether the camera was blocked during shooting (by a hand, a piece of paper, or an ass - it doesn't matter). In other words, I need to make a classifier. I took MobileNet, downloaded different videos from cameras, made a couple of videos with blockages, added augmentations, and retrained MobileNet on my data. It seems to work, but the network periodically misclassifies images.

Question: how can such a classifier be improved? Or is my approach completely wrong?

3 Comments
2024/11/09
14:43 UTC

1

Is my logistic regression model good? (Concerned about the true positives; data is cleaned and balanced)

Logit reg - Accuracy: 0.90

Confusion matrix:

[[67472   499]
 [ 6679   511]]

True positives: 511
True negatives: 67472
False negatives: 6679
False positives: 499

Sensitivity: 0.07
Specificity: 0.99
Positive predictive value: 0.51
Negative predictive value: 0.91

Classification report:

              precision    recall  f1-score   support

           0       0.91      0.99      0.95     67971
           1       0.51      0.07      0.12      7190

    accuracy                           0.90     75161
   macro avg       0.71      0.53      0.54     75161
weighted avg       0.87      0.90      0.87     75161

2 Comments
2024/11/09
14:23 UTC

2

Frequent Pattern Mining question

I'm performing a Frequent Pattern Mining analysis on a dataframe in pandas.

Suppose I want to find the most frequent patterns across columns A, B and C. I find several patterns; let's pick one: (a, b, c). The problem is that, with high probability, this pattern is frequent just because a is very frequent in column A by itself, and the same goes for b and c. How can I discriminate patterns that are frequent for this trivial reason from those that are frequent for interesting reasons? I know there are many metrics for this, like lift, but they are all binary metrics, in the sense that I can only calculate them on two-column patterns, not three or more. Is there a way to do this for a pattern of arbitrary length?

One way would be calculating the lift on all possible subsets of length two:

lift(a, b)

lift((a, b), c)

and so on

but how do I aggregate all the results to make a decision?

Any advice would be really appreciated.
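
One common generalization (a suggestion, not from the post) extends lift to arbitrary length by comparing the pattern's joint support with the product of its items' marginal supports, lift(a, b, c) = P(a, b, c) / (P(a) P(b) P(c)); a value near 1 flags the "trivially frequent" case in a single number, with no pairwise aggregation needed:

import pandas as pd

def multiway_lift(df: pd.DataFrame, pattern: dict) -> float:
    """Lift of an arbitrary-length pattern, e.g. {"A": "a", "B": "b", "C": "c"}.
    Returns P(all items together) / product of the items' marginal probabilities."""
    joint = (df[list(pattern)] == pd.Series(pattern)).all(axis=1).mean()
    indep = 1.0
    for col, val in pattern.items():
        indep *= (df[col] == val).mean()
    return joint / indep  # ~1: explained by marginals; >>1: genuine association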

2 Comments
2024/11/09
14:17 UTC

25

Newbie asking how to build an LLM or generative AI for a site with 1.5 million records

I'm a developer but a newbie in AI, and this is the first question I've ever posted about it.

Our non-profit site hosts data about people, such as biographies. I'm looking to build something like ChatGPT that could help users search through and make sense of this data.

For example, if someone asks "how many people died of covid and were married in South Carolina", it will be able to tell you.

Basically, an AI-driven search engine based on our data.

I don't know where to start looking or coding. I gather that I need an LLM and datasets to train the AI. But how do I find the model, how do I install it, and what UI do we use to train the AI on our data? Our site is powered by WordPress.

Basically I need a guide on where to start.

Thanks in advance!
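
For orientation, the usual starting point for this kind of site search is retrieval plus an LLM rather than training a model from scratch; a minimal sketch of the retrieval half (the model name and the bios list are placeholder assumptions) might look like:

# Hedged sketch of the retrieval half of a retrieval-augmented (RAG) pipeline.
# Requires: pip install sentence-transformers ; "bios" is placeholder data.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedder
bios = ["Jane Doe (1931-2021), married in South Carolina...", "..."]  # your records
emb = model.encode(bios, normalize_embeddings=True)

query = "people who died of covid and were married in South Carolina"
q = model.encode([query], normalize_embeddings=True)
scores = emb @ q.T                   # cosine similarity via normalized dot product
top = np.argsort(-scores[:, 0])[:5]  # indices of the 5 most relevant biographies
# These passages would then be passed to an LLM to compose an answer.

For aggregate questions like counts, retrieval alone isn't enough; structured fields queried with an ordinary database usually work better there, with the LLM layered on top for phrasing.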

23 Comments
2024/11/09
14:02 UTC

1

How to Start Preparing for a Job in Machine Learning as an MCA Fresher?

Hi everyone,

I recently completed my MCA (Master of Computer Applications) and unfortunately, I wasn't able to secure a placement during campus recruitment. I’m now feeling a bit lost, as many of my peers have already landed jobs, and I’m concerned about the impact of this study gap on my job prospects. I’ve decided to focus on building a career in machine learning, but I’m not sure where to start, given that I’m a fresher without prior experience in this field.

Could anyone guide me on how to begin my journey in machine learning from scratch? What are the essential skills I need to acquire, and what resources (books, courses, projects) would be helpful for a beginner like me?

Additionally, in the current job market conditions, do you think it’s realistic to land a job in machine learning? Are there specific strategies I should adopt to stand out in this competitive job market?

Any advice or personal experiences would be greatly appreciated!

Thanks in advance!

0 Comments
2024/11/09
13:00 UTC

29

Daniel Bourke is the GOAT for learning

That's it. He explains practical concepts so well. Andrew Ng is good too, but mostly for theory.

11 Comments
2024/11/09
12:45 UTC

116

Beating the dinosaur game with ML - details in comments

11 Comments
2024/11/09
12:42 UTC

0

Perplexity AI PRO - 1 YEAR PLAN OFFER - 75% OFF [ LIMITED TIME ]

As the title says: we offer Perplexity AI PRO voucher codes for the one-year plan.

To Order: https://cheapgpts.store/Perplexity

Payments accepted:

  • PayPal. (100% Buyer protected)
  • Revolut.
0 Comments
2024/11/09
12:15 UTC

1

A great roadmap for machine learning 🎖️

New to Machine Learning? Start Here with a Beginner-Friendly Roadmap!😌

Machine learning can seem daunting, but with the right roadmap, anyone can get started. This post lays out a clear, beginner-friendly plan to help newcomers navigate the world of ML. From understanding basic algorithms to working with Python and PyTorch, you’ll find resources to start building and deploying your own models. Say goodbye to confusion and hello to actionable steps toward ML mastery.

Ready to begin your ML journey? Head over to r/learnmachinelearning and start with this guide! 👇🏽

0 Comments
2024/11/09
11:41 UTC

16

Pytorch diverges although numpy converges with same data + parameters

I implemented basic gradient descent for linear regression, first in NumPy and then using PyTorch. However, with the same data, parameter initialization, and learning rate, one converges (NumPy, left) while the other diverges (PyTorch, right).

https://preview.redd.it/xz80dqlxzuzd1.png?width=1274&format=png&auto=webp&s=808142943d9ecf323a4a7933fef95b2bb7532de7

Here is the code for each:

Numpy:

import math

import matplotlib.pyplot as plt
import numpy as np


n = 50
np.random.seed(1)
x = np.linspace(0, 2*math.pi, n)
y = np.sin(x)
y += np.random.normal(scale=0.1, size=len(y))

alpha = 0.15
m = 0
b = 0
losses = []
fig, axs = plt.subplots(2)
while True:
    axs[0].plot(x, m*x+b)
    axs[0].scatter(x, y)
    axs[1].plot(losses)
    plt.draw()
    plt.waitforbuttonpress()
    for ax in axs:
        ax.clear()

    # Note: these updates use d(MSE)/db and d(MSE)/dm without the factor of 2,
    # so the effective step size is alpha/2.
    b -= alpha * 1/n * sum(b + m*x[i] - y[i] for i in range(n))
    m -= alpha * 1/n * sum((b + m*x[i] - y[i]) * x[i] for i in range(n))

    mse = sum((y - (m*x+b))**2)/n
    losses.append(mse)

Pytorch:

import math

import matplotlib.pyplot as plt
import numpy as np
import torch.nn

n = 50
np.random.seed(1)
x = np.linspace(0, 2*math.pi, n)
y = np.sin(x)
y += np.random.normal(scale=0.1, size=len(y))
x = torch.from_numpy(x)
y = torch.from_numpy(y)
x = x.reshape(-1, 1)
y = y.reshape(-1, 1)

alpha = 0.15
m = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD([m, b], lr=alpha)
losses = []
fig, axs = plt.subplots(2)
while True:
    y_est = m * x + b
    loss = loss_fn(y_est, y)
    losses.append(loss.item())

    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    axs[0].plot(x, y_est.detach().numpy())
    axs[0].scatter(x, y)
    axs[1].plot(losses)
    plt.draw()
    plt.waitforbuttonpress()
    for ax in axs:
        ax.clear()

Even when I drop the LR to 0.1 they still behave the same, so I don't think it's a small rounding error or similar.
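
One thing worth checking (an observation about the two loops above, offered tentatively): torch.nn.MSELoss backpropagates the full gradient 2(y_est - y)/n, while the NumPy updates drop the factor of 2, so the PyTorch run is effectively using twice the step size. Restoring the factor makes the NumPy loop compute the same gradient:

# Hedged check: the update inside the NumPy loop, with the MSE factor of 2
# restored so it matches what torch.nn.MSELoss + SGD computes:
grad_b = 2/n * sum(b + m*x[i] - y[i] for i in range(n))
grad_m = 2/n * sum((b + m*x[i] - y[i]) * x[i] for i in range(n))
b -= alpha * grad_b
m -= alpha * grad_m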

2 Comments
2024/11/09
11:22 UTC

1

Cosine similarity activation function

I've read that cosine similarity isn't usually used as an activation function in practice, and I wonder why that is, specifically when the use case is training for similarity.

I am currently training a sentence-transformer neural net that uses linear activation functions but is assessed against labelled cosine-similarity scores based on doc2vec vectors, so the match doesn't seem too great, as output values can fall well outside the bounds of the cosine similarity function.
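
One way to keep predictions inside the label range (a sketch, assuming the network produces a pair of sentence embeddings) is to score pairs with cosine similarity directly instead of an unbounded linear output:

import torch
import torch.nn.functional as F

# emb_a, emb_b: (batch, dim) sentence embeddings from the encoder (assumed)
def similarity_score(emb_a: torch.Tensor, emb_b: torch.Tensor) -> torch.Tensor:
    # cosine similarity is bounded in [-1, 1], matching the label range
    return F.cosine_similarity(emb_a, emb_b, dim=-1)

# loss = F.mse_loss(similarity_score(emb_a, emb_b), labels)  # labels in [-1, 1]

If the labels are binary plus/minus one rather than graded scores, torch.nn.CosineEmbeddingLoss is a built-in alternative.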

7 Comments
2024/11/09
08:45 UTC

0

If Gradient Descent is really how the brain "learns", how would we define the learning rate?

I came across a recent video featuring Geoffrey Hinton where he said (I'm paraphrasing), in the context of humans learning languages, "(...) recent models show us that stochastic gradient descent is really how the brain learns (...)", and I remember him comparing "weights" to "synapses" in the brain. If we were to take this analogy forward - if weights are synapses in the brain - what would the learning rate be?

21 Comments
2024/11/09
08:12 UTC

1

What next ?

I just finished Neural Networks: Zero to Hero by Andrej Karpathy. I am revising it again, as it is so information-dense.

What other course should I take? I was looking at fast.ai - is it good, or should I go for CS231n? Or what should I do?

12 Comments
2024/11/09
07:14 UTC

18

How is Fast.ai helpful?

I have tried learning from it multiple times, and from multiple versions of it. I just don't get how some people who go on to work at big tech AI labs attribute their success to Fast.ai. I understand my learning style could be different from the intended audience's, but I'd like to hear from the people it benefited.

Firstly, the notebooks/book have little to do with the videos. Secondly, there is so much abstraction that it roughly doubles your work, as you need to look up how things are actually implemented in PyTorch. Thirdly, everything is a notebook, and I am not a fan of notebooks.

15 Comments
2024/11/09
05:11 UTC
