/r/learnmachinelearning

A subreddit dedicated to learning machine learning. Feel free to share any educational resources on machine learning.

Also, we are a beginner-friendly sub-reddit, so don't be afraid to ask questions! This can include questions that are non-technical, but still highly relevant to learning machine learning such as a systematic approach to a machine learning problem.

  • Foster a positive learning environment by being respectful to others. We want everyone to feel welcome and not be afraid to participate.
  • Do share your works and achievements, but do not spam. Keep our subreddit fresh by posting your YouTube series or blog at most once a week.
  • Do not share referral links and other purely marketing content. They prioritize commercial interests over intellectual ones.

Related Subreddits

/r/MachineLearning

/r/MLQuestions

/r/datascience

/r/computervision

Machine Learning Multireddit

/m/machine_learning

455,338 Subscribers

1

What model is best at understanding patterns to do transfer learning to train on my image sets

I’m building an app where users will upload an image and my model will need to identify the product style and the print used (pattern ID). Which model is best to use for transfer learning? I appreciate the help!

0 Comments
2024/12/01
02:31 UTC

1

does the type of gpu matter

I'm new to building a PC and I'm learning various new topics along the way. Any suggestions on which GPU would be a good starting point?

1 Comment
2024/12/01
02:14 UTC

1

What are the ML certifications one can do from non-tech background?

1 Comment
2024/12/01
02:07 UTC

2

search and optimization, how relevant is it for computer vision?

Are the following topics very relevant for foundation models and NLP/CV? I am thinking of taking a search and optimization class next quarter. Do you recommend taking this class if I am thinking of pursuing a PhD in Computer Vision?

Schedule

• Week 1: Numerical Optimization (I)

(first-order and second-order directions, line search, various accelerations)

• Week 2: Stochastic Search

(simulated annealing, cross-entropy methods, search gradient)

• Week 3: Classical Search

(heuristic search, adversarial search, sampling-based planning)

• Week 4: Reinforcement Learning (I)

(MDP, value and policy iteration, temporal-difference, Q-learning)

• Week 5: Reinforcement Learning (II)

(deep Q-learning, policy gradient, policy improvement theorems)

• Week 6: Bandits and Monte Carlo Tree Search

(concentration bounds, upper confidence bound, MCTS, AlphaGo)

• Week 7: Combinatorial Search (I)

(constraint programming, SAT, conflict-driven backtracking)

• Week 8: Combinatorial Search (II)

(integer programming, cutting planes, general nonlinear problems)

• Week 9: Numerical Optimization (II)

(gradient projection, Lagrange duality, interior point methods)

1 Comment
2024/12/01
01:30 UTC

4

Is DS good for me or should i switch to CS?

It seems that the “building ML models” part is going to ML engineers, while data scientists, especially at big tech companies, are just analysts who do A/B testing (at least from reading job descriptions).

Is DS still a good path if I like to analyze data and build ML models, or should I switch to ML engineering? I am currently studying for an MS in data science. I can switch to CS, but it would cost me one year. If it is worth it, I will do it, no problem.

4 Comments
2024/11/30
21:43 UTC

2

Dot product for vector embedding: single scalar output interpretation is confusing me

I understand that we can see similarity (forgetting cosine similarity for now) by understanding how much two vectors "align." However, a larger dot product means more "alignment," and this is where I get confused.

If we have vector embeddings a=[10,20]; b=[11,21]; and c=[15,25], visually in the dimensional space, a and b would be "more similar" because they are so close, but the dot product would be lower than that of a and c.

Since the dot product is higher for a and c, my understanding suggests they are more aligned and therefore more similar.

However, we know that a and b are closer. How should I interpret the dot product in this case? Or generally dot product for vector embeddings and other data structures.
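
One way to see what is going on is to compute the three quantities side by side for the vectors in the question. The sketch below uses only the standard library; the helper names are made up for illustration. It relies on the identity a·b = |a||b|cos θ: the raw dot product mixes the angle between the vectors with their lengths, so a longer vector such as c can score higher even when it points in a slightly different direction.

```python
import math

def dot(u, v):
    """Plain dot product."""
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))

def cosine(u, v):
    """Dot product with both lengths divided out, so only the angle remains."""
    return dot(u, v) / (norm(u) * norm(v))

def euclidean(u, v):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

a, b, c = [10, 20], [11, 21], [15, 25]

for name, v in (("b", b), ("c", c)):
    print(name, "dot:", dot(a, v),
          "cosine:", round(cosine(a, v), 4),
          "distance:", round(euclidean(a, v), 3))
```

Here a·b is 530 and a·c is 650, yet b has both the higher cosine similarity and the smaller distance to a. If the embeddings are normalized to unit length first, the dot product and cosine similarity give the same ranking.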

3 Comments
2024/11/30
21:02 UTC

1

TicTacToe ANN not learning.

I hot-wired the ANN to itself, and had it play 1 million times. Awful results, and it fails to learn very well. I'm considering putting it up against a hard-coded bot instead of itself, to provide a greater challenge, or maybe start with human training, then after 1000 games, it will play against itself?

ANN details:

This project will utilize an ANN, and ASCII depictions of the Tic-Tac-Toe board. In the ASCII depiction, the Xs and Os will be displayed, but empty spaces will be replaced with a "." It will run locally on the user's CPU. The ANN will be manually programmed, with no libraries.

Idea:

The input layer consists of 9 neurons, each with an input normalized to 1, 0, or -1: 1 indicates a move by the player, 0 an empty spot, and -1 a move made by the ANN. For example:

Board visualization:

User is X, ANN is O:

```
X | O | X
---------
O | X | .
---------
X | O | X
```

The input layer will be:

```
{1, -1, 1, -1, 1, 0, 1, -1, 1}
```

This will enable the ANN to start playing either player, and be able to continue previous games.

Output will consist of 9 neurons, with a weight between 1 and -1. The highest output will become the spot where the ANN is to place their piece. If the highest output already has a piece in it, the second highest, then the 3rd, and so on will be considered.
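
The encoding and the "highest output on an empty square" rule described above can be sketched in a few lines. This is only an illustration in Python (the post's ANN is hand-written in C++), and `encode_board` and `pick_move` are hypothetical helper names, not part of the original project.

```python
def encode_board(board, ann_mark="O"):
    """Map a 9-character board string to inputs: player = 1, empty = 0, ANN = -1."""
    values = []
    for cell in board:
        if cell == ".":
            values.append(0)
        elif cell == ann_mark:
            values.append(-1)
        else:
            values.append(1)
    return values

def pick_move(outputs, inputs):
    """Return the index of the highest output whose square is still empty."""
    legal = [i for i, v in enumerate(inputs) if v == 0]
    if not legal:
        return None  # board is full
    return max(legal, key=lambda i: outputs[i])

# The board from the example above, read row by row.
inputs = encode_board("XOXOX.XOX")
print(inputs)

# Hypothetical network outputs: square 0 scores highest but is occupied.
outputs = [0.9, 0.1, -0.3, 0.4, 0.8, -0.2, 0.0, 0.5, 0.7]
print(pick_move(outputs, inputs))
```

Filtering to the empty squares first gives the same result as walking down the ranking from highest to lowest and skipping occupied squares.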

# Neurons

Each neuron will have the following properties:

- A weight between 1 and -1

- A bias between 1 and -1

- A value between 1 and -1

- A function to calculate the value of the neuron

- A function to update the weight and bias of the neuron

For an input neuron, the input is multiplied by the weight, and the bias is added to the result. The value is then normalized to between 1 and -1 by dividing by the sum of the absolute values of the weight and the bias. It is then fed into a binary step function: if the total is more than 0.5, the neuron activates.

Hidden-layer neurons work the same way, except that the input from every connected neuron is multiplied by the weight, and the results are summed and averaged before the normalization and the 0.5 threshold are applied.

Output neurons have no activation function.

# Architecture

Will run locally on the user's CPU.

Input layer: 9 neurons

Hidden layer #1: 9 neurons

Hidden layer #2: 9 neurons

Hidden layer #3: 9 neurons

Output layer: 9 neurons

# Training

ALL VALUES WILL BE NORMALIZED TO C++ LONG DOUBLES.

The ANN will be trained using the following method:

- The ANN will start with random weights and biases between -0.5 and 0.5.

- The ANN will play a game of Tic-Tac-Toe against a human player

During the game, a win/loss/draw detection function will be run after every move.

- The ANN will then add 2/(1+(game#/200)) (with a minimum of 0.1 to prevent stagnation) to the weights, and a quarter of that to the biases, of the neurons that were activated when the ANN won. It will subtract 2/(1+(game#/101)) (again with a minimum of 0.1 to prevent stagnation) from the weights, and a quarter of that from the biases, of the neurons that were activated when the human player won.

**NOTE THAT THE BIASES ARE EXEMPT FROM THE 0.0001 RULE, THEY GO TO A MINIMUM OF 0.000025, WHICH IS 1/4th OF THE MINIMUM WEIGHT**.

- The ANN will then play another game of Tic-Tac-Toe against a human player

There are plans to implement ANN vs ANN training.
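
To get a feel for how large the updates in the Training section are over many games, the step-size rule can be written out directly. This is a sketch in Python rather than the project's C++, the function names are hypothetical, and it uses the 0.1 floor stated in the rule (the note mentions 0.0001 instead, so `minimum` is left as a parameter).

```python
def update_step(game_number, divisor, minimum=0.1):
    """Step size 2 / (1 + game_number / divisor), floored at `minimum`."""
    return max(minimum, 2 / (1 + game_number / divisor))

def steps_for_game(game_number, ann_won):
    """Return (weight_delta, bias_delta) for neurons that were active this game."""
    divisor = 200 if ann_won else 101  # divisors taken from the post
    sign = 1 if ann_won else -1
    weight_delta = sign * update_step(game_number, divisor)
    return weight_delta, weight_delta / 4  # biases move a quarter as far

for game in (0, 200, 3800, 1_000_000):
    print(game, steps_for_game(game, ann_won=True), steps_for_game(game, ann_won=False))
```

With these numbers the win step starts at 2, halves by game 200, and reaches the 0.1 floor at game 3,800, where it stays for the rest of a million-game run. Since the weights are meant to lie between 1 and -1, it may be worth checking whether steps of this size leave most weights pinned at the limits.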

# Win/Loss Detection

- A program will be written to detect if the game has been won, lost, or is a draw

4 Comments
2024/11/30
20:23 UTC

1

ML and DS bootcamp by Andrei Neagoie VS DS bootcamp by 365 careers ?

Background: I've taken Andrew Ng's Machine Learning Specialization. Now I want to learn Python libraries like Matplotlib, pandas, and scikit-learn, plus TensorFlow for DL, in depth.

PS: If you know better sources, please guide me.

5 Comments
2024/11/30
18:38 UTC

1

Where to buy a GPU

I am looking for help buying a 3090 at a decent price. It's too expensive, and I have to train a model that needs more VRAM. Where can I look for a decent price on a 3090?

1 Comment
2024/11/30
17:48 UTC

1

Preprocess two different kinds of datasets for a machine learning problem

I am working on two health-related datasets. And I use Python.

- One tabular dataset (called A) contains patient-level information (by id) and a bunch of other features which I have already transformed and cleaned. This dataset has around 3000 rows. The dataset contains labels (y) for a classification problem.

- The other data is a collection of dataframes. Each dataframe represents time-series data on a particular patient (by id also). There are around 1000 dataframes (only 1000 patients have available information on this time-series data).

My methods so far:

- For the collection of dataframes, for each dataframe/patient id, I selected only the mean, median, max, and min of each column. Then I transformed each dataframe into a single row of data, for example "patient_id", "min_X", "max_X", "median_X", "mean_X", instead of a lengthy timestep-level dataframe. Do you think this is a good way to preserve key information about the time-series data? Otherwise, I am thinking of using a machine learning model to select the time-series features, but I'm not sure how to do so.

- Now I would have this single dataframe (called B) of patient-level time-series data, and I want to join it with the first cleaned dataframe (A), but the rows are mismatched. That is, A has 3000 rows but B only has 1000 rows. The patient ids of B are a subset of the patient ids of A. I don't know how to deal with this. I'm thinking of just using the 1000 rows of B and left joining A, but would that be a lot of data loss?
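
The two steps above (collapsing each time series to summary statistics, then joining) can be sketched without any libraries. The toy data and helper names below are hypothetical. Unlike the plan in the post, this version joins from A so that all of A's rows are kept, with missing summary features left empty and a flag recording whether time-series data existed. In pandas the equivalent join would be `A.merge(B, on="patient_id", how="left")`.

```python
from statistics import mean, median

def summarize(series_by_patient):
    """Collapse each patient's time series into min/max/median/mean per column."""
    rows = {}
    for pid, columns in series_by_patient.items():
        row = {}
        for col, values in columns.items():
            row[f"min_{col}"] = min(values)
            row[f"max_{col}"] = max(values)
            row[f"median_{col}"] = median(values)
            row[f"mean_{col}"] = mean(values)
        rows[pid] = row
    return rows

def left_join(table_a, summary_b, feature_names):
    """Keep every row of A; fill B's features with None and flag availability."""
    joined = []
    for row in table_a:
        extra = summary_b.get(row["patient_id"])
        merged = dict(row)
        merged["has_timeseries"] = extra is not None
        for name in feature_names:
            merged[name] = extra[name] if extra else None
        joined.append(merged)
    return joined

# Hypothetical toy data: three patients in A, time series for only one of them.
table_a = [
    {"patient_id": 1, "age": 54, "y": 0},
    {"patient_id": 2, "age": 61, "y": 1},
    {"patient_id": 3, "age": 47, "y": 0},
]
series = {1: {"X": [1.0, 2.0, 6.0]}}

features = ["min_X", "max_X", "median_X", "mean_X"]
joined = left_join(table_a, summarize(series), features)
for row in joined:
    print(row)
```

Whether to keep the 2000 rows without time-series features (with imputation or a model that tolerates missing values) or to train only on the 1000 complete rows depends on the problem. Comparing both on a held-out set is one way to decide.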

Any advice/thoughts are appreciated.

1 Comment
2024/11/30
17:43 UTC

1

Mastering Derivatives: From Math to Code - Python Numerical Differentiation

0 Comments
2024/11/30
17:17 UTC

308

Scikit Learn ML algorithms u need

15 Comments
2024/11/30
16:47 UTC

0

Am I on the Right Track for AI?

Hey guys, I want to become an AI engineer, and my journey will be self-taught. I am learning Python, then full-stack web dev with Python and Django, then DSA in Python, and then I'll move on to AI engineering. That is my path. What do you think of it? Am I following the right path?

I am starting this journey at the age of 21. Am I too late? What do you think 🤔

23 Comments
2024/11/30
16:03 UTC

1

What does the text in these blocks mean?

https://preview.redd.it/brosizocm14e1.png?width=1201&format=png&auto=webp&s=dfab421e882ddd5511658681b5b3120acd8bcaa8

For resBlocks the paper says,

the input feature maps go through a convolution layer with a kernel size of 1×1 and a convolution layer with a kernel size of 3×3 to obtain the feature maps F.

But then what do the numbers on the blocks say? I thought they meant input_dims, kernel_size, stride, out_channels, but then why does the paper mention only a 1×1 and then a 3×3?

Figure 2 is below:

https://preview.redd.it/u4t2sfqxm14e1.png?width=1576&format=png&auto=webp&s=f606428121cf72fa85c372384c6e918d7263e09b

This is my first time implementing a paper, so any help is appreciated.

Link to paper: https://ieeexplore.ieee.org/document/9303478

1 Comment
2024/11/30
13:49 UTC

3

how to identify a subset of correlated data in a larger set of uncorrelated data

Hi,

I have a set of data and I'm trying to find a correlation between input and label, but by nature, the majority of the data is not supposed to have any correlation. Let's imagine that under certain circumstances, say 1% of the data has a strong correlation between the input and the label.

The problem is that if I train a neural net model, I will get at best 50 to 51% good predictions, since 99% of the data has no correlation. I need to identify this subset of data.

I have tried K-cluster grouping as suggested by ChatGPT, but it didn't improve the prediction % for any of the clustered sets of data. Any suggestions, if this is even possible?

6 Comments
2024/11/30
12:16 UTC

1

What’s the best next step for learning ML?

Hey everyone,

Hope you’re all doing well! I’m a Senior Software Engineer, and I’ve been really curious about getting into machine learning. Right now, my knowledge of ML is pretty basic, and I don’t know much about its different areas.

I started by learning some math, thinking it would help later. I’ve just finished a Calculus 1 course, but now I’m stuck on what to do next. Should I keep going with math (like Linear Algebra, Calculus 2, or Stats and Probability), or should I start exploring ML concepts, get a general idea, and then dive deeper into the math when needed?

What do you all think? What’s the best way to move forward? Would love some advice!

1 Comment
2024/11/30
12:07 UTC

11

Linux Distro for ML and DL

I just want to know which Linux distro is best for ML and DL development and also supports NVIDIA graphics cards for CUDA and cuDNN. I am open to all the suggestions that I can get rn.

24 Comments
2024/11/30
08:58 UTC

2

Where should I learn/find statistics for DS/ML Education

I'm an Electronics Engineer. I started learning mathematics for DS/ML two months ago and found myself tangled in it.

I decided to unlearn and start fresh. Please recommend a YouTube playlist or notes for me.

Thank you for reading. Glad if you respond🫶

5 Comments
2024/11/30
08:51 UTC

1

Thoughts about hyperskill AI engineer bootcamp?

Hi guys, any thoughts about the AI engineer bootcamp from Hyperskill that's starting in January? Is it also beneficial for someone who may go into network engineering? (I ask because I want to have this as a backup if I can't find a job in IT. I already have a CCNA, and I'm in the third year of my CS degree.)

2 Comments
2024/11/30
07:30 UTC

2

AWS released new Multi-AI Agent framework

0 Comments
2024/11/30
07:19 UTC

5

Can someone suggest a few good books on understanding and creating AI-agents from an LLM perspective?

11 Comments
2024/11/30
05:16 UTC

1

Sparse Neural Network

I'm currently trying to train a sparse neural network and could use some advice. I've experimented with L1 regularization and the pruning techniques available in PyTorch, but neither has given me good results so far.

When I used L1 regularization alone, I found that the resulting neural network didn't show any real sparsity. I suspect this might be due to the optimizer's numerical nature, which introduces small errors that prevent sparsity from emerging. A friend suggested that dropout might help in training sparse neural networks, but I'm a bit skeptical about how effective that would be.

If anyone has practical tips or insights on how to train a sparse neural network effectively, I would greatly appreciate your help!
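
One likely reason L1 regularization alone shows no exact zeros is that a plain (sub)gradient step keeps overshooting zero instead of landing on it. A proximal step (soft-thresholding) snaps small weights to exactly 0. The toy sketch below shows the difference on a one-weight-per-target least-squares problem. It is pure Python rather than PyTorch, and the problem, names, and hyperparameters are made up for illustration.

```python
def sign(x):
    return (x > 0) - (x < 0)

def soft_threshold(x, t):
    """Proximal operator of t*|x|: shrink toward zero, snapping small values to 0."""
    return sign(x) * max(abs(x) - t, 0.0)

def train(targets, lam=0.1, lr=0.1, steps=200, proximal=False):
    """Minimize 0.5*(w - target)^2 + lam*|w| independently for each weight."""
    w = [1.0] * len(targets)
    for _ in range(steps):
        for i, t in enumerate(targets):
            grad = w[i] - t  # gradient of the smooth part 0.5 * (w - t)^2
            if proximal:
                # gradient step on the smooth part, then the L1 proximal step
                w[i] = soft_threshold(w[i] - lr * grad, lr * lam)
            else:
                # plain subgradient step on the full objective
                w[i] -= lr * (grad + lam * sign(w[i]))
    return w

targets = [0.05, -0.02, 0.8]  # the first two are below lam, so their optimum is 0
plain = train(targets)
prox = train(targets, proximal=True)
print("subgradient:", plain)
print("proximal:   ", prox)
```

In this toy problem the proximal version drives the first two weights to exactly zero, while the subgradient version leaves them hovering near zero. In a real network, a comparable effect can be approximated by applying the thresholding to the weights after each optimizer step, or by pruning small-magnitude weights after training. Neither is guaranteed to preserve accuracy.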

4 Comments
2024/11/30
03:07 UTC

1

Do child nodes in a decision tree always reduce entropy LESS than their parent nodes?

I'm curious whether child nodes in a decision tree always have less information gain (less entropy reduction) and are in general always less informative than their parent nodes.
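
A small counterexample suggests the answer is no. With an XOR-style label, the first split gains nothing on its own, while the split directly below it removes all remaining entropy. The sketch below computes both gains from scratch. The dataset and function names are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, feature):
    """Entropy before the split minus the weighted entropy after it."""
    labels = [y for _, y in rows]
    gain = entropy(labels)
    for value in {x[feature] for x, _ in rows}:
        subset = [y for x, y in rows if x[feature] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

# XOR data: the label is x0 ^ x1.
rows = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

root_gain = information_gain(rows, 0)          # split the root on x0
left_child = [r for r in rows if r[0][0] == 0]
child_gain = information_gain(left_child, 1)   # split the x0 == 0 child on x1

print("gain at root: ", root_gain)
print("gain at child:", child_gain)
```

The root split has a gain of 0 bits and the child split a gain of 1 bit. Greedy tree builders pick the best split available at each node, but that does not force gains to shrink with depth.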

1 Comment
2024/11/30
01:22 UTC

1

Python Implementation of Softmax that takes integer input

Hey, so I am working on a project where I have to quantize my model's weights and biases to integers and perform subsequent operations using integers. The output of my model can be either int8 or int16 values (in this case, logits), and I need to call softmax on this logits array. I was able to find an integer implementation of softmax written in C (https://github.com/ARM-software/CMSIS-NN/tree/main/Source/SoftmaxFunctions). The problem I'm having is verifying that this C implementation is accurate (or more specifically, that I am using it accurately). The way I'm thinking of doing that is detailed below:

**In Python**
Take my integer logits, call an integer Python implementation of softmax on the logits, and get a result (**python_integer_prediction_probabilities**).

**In C (using CMSIS-NN)**
Take the same integer logits, call the C softmax implementation on my logits, and get a result (**CMSIS_NN_prediction_probabilities**).

Finally, I compare these two results to see if they are close enough. The main problem is that I assumed there would be information about how to implement a softmax function that takes integer inputs in Python, but I can't find anything online. Does anyone have an idea of how to implement this in Python, or know of resources that I could use to figure this out? Thank you.
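
One possible starting point for the Python side is a reference check rather than a bit-exact port: dequantize the integer logits, compute an ordinary softmax, and quantize the result the way the C output is expected to be quantized. The sketch below is not the CMSIS-NN algorithm. `input_scale` is a hypothetical per-tensor scale, and the output convention (scale 1/256, zero point -128 for int8) is an assumption to check against the actual C call.

```python
import math

def reference_softmax_int8(logits, input_scale, out_scale=1 / 256, out_zero_point=-128):
    """Float softmax over dequantized integer logits, requantized to int8.

    The input zero point cancels out because only differences from the
    maximum logit are used.
    """
    m = max(logits)
    exps = [math.exp((q - m) * input_scale) for q in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    quantized = [
        max(-128, min(127, round(p / out_scale) + out_zero_point)) for p in probs
    ]
    return probs, quantized

def max_abs_diff(xs, ys):
    """Largest element-wise difference, in quantization steps."""
    return max(abs(x - y) for x, y in zip(xs, ys))

logits = [10, 20, 30, 20]  # hypothetical int8 logits
probs, expected_q = reference_softmax_int8(logits, input_scale=0.1)
print(probs)
print(expected_q)
```

The C output could then be compared against `expected_q` with `max_abs_diff`, allowing a tolerance of a few quantization steps, since a fixed-point implementation will generally not round identically to the float reference.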

0 Comments
2024/11/30
01:19 UTC

0

What does it take to become a senior machine learning engineer?

Hello,

I was wondering how an entry-level machine learning engineer becomes a senior machine learning engineer. Are the skills required to become a Sr ML engineer learned on the job, or do I have to self-study? If self-studying is the appropriate way to advance, how many hours per week should I dedicate to go from entry level to Sr level in 3 years, and how exactly should I self-study? Advice is greatly appreciated!

31 Comments
2024/11/30
00:52 UTC

36

Do you need SWE experience to become MLE?

I’ve noticed job postings for machine learning engineers often fall into two categories:

  • Roles that look like what we used to call “full-stack data scientists”—people who build models and deploy them.
  • Roles, especially at big tech companies, that require 2+ years of experience as a software engineer.

This makes me wonder: Is prior experience as a software engineer necessary, or is a background in data science (a degree in DS and experience in the field) sufficient for most MLE roles? (By MLE roles I mean roles that build models, not a data scientist role that is actually a glorified analyst.)

P.S. I know job titles can be misleading, but I hope my question is clear!

11 Comments
2024/11/29
22:20 UTC

1

Trying to understand DeWave

DeWave is an EEG-to-text model that uses discrete codex. What I'm struggling to understand is how they could have made a discrete or indexing codex for the model. Figure 3 in the paper mentions a "Codex Transformer" being used to create a codex encoder and decoder, but I don't know what that is and can't find anything online about it. If anyone knows the answer to these questions it would be greatly appreciated.

1 Comment
2024/11/29
21:35 UTC

5

All about embeddings in RAG

Embeddings are a fundamental step in a RAG pipeline. Irrespective of how we choose to implement RAG, we won't be able to escape the embedding step. When searching for an in-depth video, I found this one:

https://youtu.be/rZnfv6KHdIQ?si=0n9qfUsWWQnEyYTU

Hope it's useful.

5 Comments
2024/11/29
21:23 UTC

4

Denoising autoencoders, link to scores

Hi guys, I made a video about the connection between denoising autoencoders and the underlying data distribution. If you don't know these topics, they underpin most of the principles of modern generative models such as diffusion models. Anyway, here's the video, hope you enjoy. https://youtu.be/0V96wE7lY4w?si=P45Pz_CmqQgDFSFq
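
For readers who want the connection in one line: under Gaussian corruption, the standard result (Tweedie's formula) relates the best possible denoiser to the score of the noisy data distribution. The notation below is an assumption for illustration, and the video may present it differently.

```latex
\tilde{x} = x + \sigma \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, I)
\qquad \Longrightarrow \qquad
D^{*}(\tilde{x}) = \mathbb{E}[x \mid \tilde{x}]
= \tilde{x} + \sigma^{2} \, \nabla_{\tilde{x}} \log p_{\sigma}(\tilde{x})
```

Here D* is the denoiser that minimizes expected squared error and p_σ is the density of the noisy data, so (D*(x̃) − x̃)/σ² equals the score ∇ log p_σ(x̃).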

0 Comments
2024/11/29
21:14 UTC
