/r/learnmachinelearning

A subreddit dedicated to learning machine learning. Feel free to share any educational resources on machine learning.

Also, we are a beginner-friendly subreddit, so don't be afraid to ask questions! This can include questions that are non-technical but still highly relevant to learning machine learning, such as a systematic approach to a machine learning problem.

  • Foster a positive learning environment by being respectful to others. We want to encourage everyone to feel welcomed and not be afraid to participate.
  • Do share your works and achievements, but do not spam. Keep our subreddit fresh by posting your YouTube series or blog at most once a week.
  • Do not share referral links and other purely marketing content. They prioritize commercial interests over intellectual ones.

Chatrooms

Official Discord Server


Wiki

Getting Started with Machine Learning

Resources


Related Subreddits

/r/MachineLearning

/r/MLQuestions

/r/datascience

/r/computervision

Machine Learning Multireddit

/m/machine_learning

396,330 Subscribers

0

Beginner projects

0 Comments
2024/04/15
10:59 UTC

3

Twitch Stream to Learn ML/AI/Robotics

Hi folks,

I'm a Master's student studying AI and Robotics. I'm considering starting a Twitch channel where I'd stream myself working on projects and learning about ML/AI/Robotics.

I'm wondering if anyone in the community would be interested in watching this?

Sort of inspired by watching streams like https://www.twitch.tv/georgehotz and Andrej Karpathy's videos.

It also feels like having an active chat would be worthwhile for folks who have questions about the field.

0 Comments
2024/04/15
10:31 UTC

1

Embeddings quantization and SVD

Hi everyone, since quantization is a hot topic right now, here is a series of experiments on embedding quantization with SVD, meant to better understand the effects of quantization. It is part of my broader cloud embedding series. Hope this helps!

0 Comments
2024/04/15
10:03 UTC

1

Coursera ML courses

Hello, I have a scholarship that lets me choose any course from Coursera. When I finish the one I'm currently taking, I'm thinking of taking one related to ML, as I already completed a Udemy course (Ai/ML from A-Z) and so already have a base. I can take any course from a reputable company.

0 Comments
2024/04/15
09:21 UTC

1

How to make the shape of audio data same after applying envelope to it

I want to use a CNN on audio data. During pre-processing I applied an envelope to it, which removed noise from the audio, but this results in shorter audio clips. Now I want the input shape to be the same for every clip. How do I do that? I know about trimming and padding, but I am still confused about how to apply them.
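
A minimal sketch of the trimming and padding mentioned above, assuming each clip is a 1-D NumPy array of samples and that a fixed target length has already been chosen. The helper name `pad_or_trim` and the parameter `target_len` are illustrative, not taken from the post or from any library.

```python
import numpy as np

def pad_or_trim(signal: np.ndarray, target_len: int) -> np.ndarray:
    """Return a copy of a 1-D signal that is exactly target_len samples long."""
    if len(signal) >= target_len:
        # Too long (or already the right size): keep the first target_len samples.
        return signal[:target_len].copy()
    # Too short: append zeros (silence) at the end.
    missing = target_len - len(signal)
    return np.pad(signal, (0, missing), mode="constant")

# Hypothetical clips of different lengths after envelope filtering.
short_clip = np.ones(3, dtype=np.float32)
long_clip = np.ones(10, dtype=np.float32)

padded = pad_or_trim(short_clip, 5)
trimmed = pad_or_trim(long_clip, 5)
```

Both results have shape `(5,)`, so they can be stacked into one batch. Padding at the end is only one choice; centering the clip or padding both sides would work the same way.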

0 Comments
2024/04/15
09:21 UTC

3

Accuracy and loss stagnation, what could be the reason?

Accuracy reaches peak at 35 and then stagnates

https://preview.redd.it/vqwowdaw1muc1.png?width=842&format=png&auto=webp&s=7aa21fbe6cce28fe437b53d26e06f7293627ef53

Is my model too simple?

I was just trying this model out on the CIFAR100 dataset, and accuracy and loss stagnate around 50 epochs in. I tried SGD with a learning rate of 0.001 and momentum 0.9, and Adam with 0.001, with the same results.

5 Comments
2024/04/15
09:20 UTC

1

Seeking Advice: Building a Model to Differentiate Handwritten vs. Printed Digits

I'm working on developing a machine learning model that can accurately identify whether digits in an image are handwritten or printed. Despite having access to datasets like Char74K and MNIST, I'm struggling to get my model to perform well.

I have some experience with basic machine learning models, but I'm finding this particular challenge a bit daunting.

Model Attempts: I've attempted using a Convolutional Neural Network (CNN) given its success in image-based tasks, but my results have been underwhelming.

I'm seeking advice on the following:

  • Optimal Model Architecture: Are there specific architectures or layers that have worked well for a task like this?
  • Effective Preprocessing: Any recommendations on preprocessing techniques that might enhance model performance?
  • Dataset Balancing and Augmentation: Tips on how to balance and augment these datasets for better training outcomes.

If anyone has worked on similar problems or knows of relevant literature or tutorials, I would really appreciate your insights.

Thank you so much for taking the time to read and respond. Any guidance or references would be greatly appreciated!

0 Comments
2024/04/15
09:02 UTC

1

Use NLP for evaluating company strategy papers

Question: Are the company's strategies implemented in all departments?

Data: Official strategy papers (beginning of strategy, main documents); (strategy) communication in departmental meetings

Goal: Evaluate the degree to which strategies are present in the communication docs and build a hierarchical top-down tree which mirrors the company's structure.

Ideas: use a pre-trained transformer; keyword-based topic modelling and quantification for each document -> compare texts related to each strategy; word and sentence embeddings for similarity evaluation

What ideas do you have for comparing texts with respect to specific strategies?

0 Comments
2024/04/15
08:30 UTC

6

Quantized Embeddings: Drastically reduce memory usage with this technique!

3 Comments
2024/04/15
07:57 UTC

1

Is dynamically changing layer output size according to input possible?

Is it possible to have a layer in your model architecture which dynamically transforms the output according to the input provided? My problem is that I wish to replace a visual frontend module in an existing model architecture. It calculates the features of a video and outputs a shape (T, 512) per video, where T is the number of time steps (frames) in the video; this is then passed into a conformer to calculate the temporal relation between the video frames. I want to replace it with a video vision transformer (this). The problem is that the video vision transformer provides an output of the form (3137, 768), which are fixed hidden_size constants, and thus the time step variable is not retained in the output. I wish to further transform this shape into (T, 512), where T keeps changing for each video. Is it possible to do this using some pre-existing layer exposed by PyTorch? Something where I can pick the T most significant outputs from the (C1, C2) shape?

Some things to keep in mind:

  1. T is always in the range of around [50, 200], not beyond that.
  2. I cannot clip the videos in the dataset to make T constant for all videos since
0 Comments
2024/04/15
07:36 UTC

3

Machine learning with a small data set?

Is it even possible with a sample size of less than 100? Could you suggest how to achieve it?

4 Comments
2024/04/15
07:34 UTC

2

9 Challenges and Solutions to Build Production-ready RAG Pipelines

RAG suffers from:

Bad Retrieval

Low Precision: Not all chunks in the retrieved set are relevant
— Hallucination + Lost in the Middle Problems
Low Recall: Not all relevant chunks are retrieved.
— Lacks enough context for LLM to synthesize an answer
Outdated information: The data is redundant or out of date.
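
The precision and recall definitions above can be made concrete with a small sketch. The chunk IDs and the function name `retrieval_precision_recall` are hypothetical, and it assumes a labeled set of relevant chunks is available for the query.

```python
def retrieval_precision_recall(retrieved, relevant):
    """Precision: share of retrieved chunks that are relevant.
    Recall: share of relevant chunks that were retrieved."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall

# Hypothetical example: 4 chunks retrieved, 3 chunks are actually relevant.
precision, recall = retrieval_precision_recall(
    retrieved=["c1", "c2", "c3", "c4"],
    relevant=["c2", "c4", "c7"],
)
```

Here two of the four retrieved chunks are relevant (precision 0.5) and two of the three relevant chunks were found (recall 2/3).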

Bad Response Generation
Hallucination: The model makes up an answer that isn’t in the context.
Irrelevance: The model makes up an answer that doesn’t answer the question.
Toxicity/Bias: The model makes up an answer that’s harmful/offensive.

FULL ARTICLE: https://medium.com/aiguys/solving-production-issues-in-modern-rag-systems-b7c31802167c

9 Challenges and Solutions to Modern RAG Pipelines

Response Quality Related

1. Context Missing in the Knowledge Base: Clean data and better prompting

2. Context Missing in the Initial Retrieval Pass: Hyperparameter tuning for chunk size and top-k

3. Context Missing After Reranking: Better retrieval strategies like Knowledge Graph Retrievers or Composed/Hierarchical Retrievers

4. Context Not Extracted: Prompt compression or LongContextReorder

5. Output is in the Wrong Format: Better text prompting/output parsing; Use OpenAI function calling + JSON mode; Use token-level prompting (LMQL, Guidance)

6. Output has an Incorrect Level of Specificity: Small-to-big retrieval; Sentence window retrieval; Recursive retrieval

7. Output is Incomplete: Query transformations

Scalability

8. Can’t Scale to Larger Data Volumes: Parallelizing ingestion pipeline

9. Rate-Limit Errors: Use multiple API keys and rotate them in the application
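
As a rough illustration of the key rotation in item 9, here is a minimal round-robin sketch. The class name `KeyRotator` and the key strings are made up for this example; a real application would also need to respect the provider's terms and handle retries.

```python
import itertools

class KeyRotator:
    """Hand out API keys in round-robin order."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one key is required")
        # cycle() repeats the sequence of keys indefinitely.
        self._cycle = itertools.cycle(list(keys))

    def next_key(self):
        return next(self._cycle)

# Hypothetical placeholder keys.
rotator = KeyRotator(["key-a", "key-b", "key-c"])
used = [rotator.next_key() for _ in range(4)]
```

After three requests the rotator wraps around to the first key again.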

0 Comments
2024/04/15
06:55 UTC

1

Hierarchical Clustering -Average Linkage - Example Problem with Step by ...

0 Comments
2024/04/15
06:00 UTC

3

Open Source TTS model with high-quality presets

I need an open source TTS model with pre-trained presets. Scrolling through Hugging Face and GitHub, I keep encountering models that allow me to train my own, but I'd like to just select from proper, high-quality pre-trained presets. I'd like to interact with the model entirely via a Python script.

I'm doing this to replace some functionality in an app I'm developing that uses OpenAI's paid TTS API (https://platform.openai.com/docs/guides/text-to-speech) - but as I scale I'm realizing how expensive that service really is. Any help is much appreciated, thank you!

0 Comments
2024/04/15
05:02 UTC

1

Image and masked image

I want to train a segmentation model, but in the dataset I chose, the number of images and the number of mask images are different. What should I do in this situation? Please give me some suggestions, and thank you for your time.
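
One possible approach, sketched under the assumption that each mask file shares its base name with its image (for example `cat_01.jpg` and `cat_01.png`): keep only the images that have a matching mask and list the rest for inspection. The file names and the helper `pair_by_stem` are hypothetical.

```python
from pathlib import Path

def pair_by_stem(image_names, mask_names):
    """Pair images with masks that share the same file name without extension."""
    masks = {Path(name).stem: name for name in mask_names}
    pairs = []
    unmatched = []
    for image in image_names:
        stem = Path(image).stem
        if stem in masks:
            pairs.append((image, masks[stem]))
        else:
            unmatched.append(image)
    return pairs, unmatched

# Hypothetical file listings with one image lacking a mask.
images = ["cat_01.jpg", "cat_02.jpg", "dog_01.jpg"]
masks = ["cat_01.png", "dog_01.png"]
pairs, unmatched = pair_by_stem(images, masks)
```

Training would then use only `pairs`, while `unmatched` shows which images are missing annotations.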

0 Comments
2024/04/15
03:21 UTC

5

Considering Transitioning from Web Development to Machine Learning - Seeking Advice

Hey everyone,

I hope you're all doing well. I wanted to reach out to this community for some advice and guidance.

A little background about myself: I initially started my journey in the tech world with a focus on machine learning. However, despite my passion and efforts, I struggled to find internships or job opportunities in the field. As a result, I ended up transitioning into web development.

While I appreciate the skills I've gained in web development, I've come to realize that it's not where my true passion lies. My heart still beats for machine learning, and I find myself yearning to dive back into that world.

I'm reaching out to this community for any advice, tips, or resources on how to make a successful transition from web development back to machine learning. Whether it's courses to take, projects to work on, or networking opportunities, I'm open to any suggestions that can help me pursue my passion for ML.

0 Comments
2024/04/15
02:31 UTC

5

Multi-Agent Movie scripting using LangGraph

Check out this tutorial on how to generate movie scripts using Multi-Agent Orchestration, where the user inputs the movie scene, the LLM decides which agents to create, and these agents then follow the scene description to say dialogues. https://youtu.be/Vry2-h81_I0?si=0KknmT8CfAhTucht

0 Comments
2024/04/15
02:28 UTC

1

Tensorflow on Runpod GPU. Floating Point Exception Error caused by gradient calculation.

TLDR at the bottom

My Background

I am an electrical engineer by trade. I started learning about ML/AI almost a year ago. I have taken the Machine Learning Specialization by Andrew Ng on Coursera, and most of the GAN specialization, also on Coursera. I'm comfortable in Python, but definitely no expert. Most of my coding and AI/ML knowledge is self-taught, so there are definitely huge holes in my knowledge.

My Goal

I am trying to create a video generator model. To that end I found an architecture that supposedly works very well called TGANv2, which is written in Chainer. I want to edit this model and eventually put it on a microcontroller. I am also treating this as a learning exercise, so I copied the model in PyTorch and got similar results to the Chainer model. Here is the PyTorch model, if you are interested.

Now I was playing around with making edits in PyTorch and found the training slow and expensive sometimes (I don't have a GPU, I am using a runpod VM with a GPU). I tried changing the generator convolutions to depth-wise separable and it considerably increased training time, which is odd considering it cut down my generator size to about 1/10 or so. So I decided to re-create the model in Tensorflow to see if it would train faster, and because I think I will eventually have to write the model in Tensorflow Lite to get it on a microcontroller. But I keep running into a Floating Point Exception Error. More on that below.

Here is my Tensorflow Code: Model (It's long, I have a snippet of the training loop a bit further down, which I think is more relevant).

Setup

Runpod VM with an NVIDIA A40 GPU running Ubuntu 22.04. Originally running CUDA 11.8, Tensorflow 2.14, and cuDNN 8.7 (though tried a bunch of different combinations, explained below). I am also running this in a conda virtual environment.

Issue I am running into

I run my tensorflow code, it instantiates the model just fine. It starts the training loop and goes through forward propagation just fine. But when it gets to the line that is meant to calculate discriminator gradients, it keeps throwing a Floating Point Exception error. I have tried multiple different combinations of CUDA, Tensorflow, and cuDNN. Also, when I run this code on my local machine CPU it calculates the discriminator gradients and doesn't throw a floating point exception error (it has another error later on, but I will cross that bridge when I get there).

Here is the pertinent part of my training loop (snipped for brevity):

    #Create dataloader
    dataset = VideoDataset(directory)
    dataloader = tf.data.Dataset.from_generator(
        lambda: iter(dataset),  # Corrected to use iter() to clearly return an iterator from the dataset
        output_signature=(
            tf.TensorSpec(shape=(16, 32, 32, 3), dtype=tf.float32),
            tf.TensorSpec(shape=(8, 64, 64, 3), dtype=tf.float32),
            tf.TensorSpec(shape=(4, 128, 128, 3), dtype=tf.float32),
            tf.TensorSpec(shape=(2, 256, 256, 3), dtype=tf.float32)
        )
    ).batch(batch_size)

    print("Starting Training...")
    for epoch in range(n_epochs):
        start_time = time.time()
        total_gen_loss, total_disc_loss, total_loss_real = 0, 0, 0
        num_batches = 0

        for batch in dataloader:
            #Adjust learning rate


            iteration += 1
            x_real = batch
            num_batches += 1
            current_batch_size = x_real[0].shape[0]

            noise = tf.random.uniform((batch_size, z_dim), -1.0, 1.0)
            # Discriminator loss
            print("Generated noise")
            with tf.GradientTape(persistent=True) as gen_tape, tf.GradientTape() as disc_tape:
                x_fake = gen(noise, training=True)
                print("Generated x_fake")
                
                y_fake = disc(x_fake, training=True)  # Compute once for both G and D updates
                y_real = disc(x_real, training=True)
                print("Generated y_fake and y_real")
                
                disc_loss = tf.reduce_mean(tf.math.softplus(y_fake)) + tf.reduce_mean(tf.math.softplus(-y_real))
                gen_loss = tf.reduce_mean(tf.math.softplus(-y_fake))

                # Use the checking function instead of tf.debugging.check_numerics
                check_and_report(y_fake, "y_fake")
                check_and_report(y_real, "y_real")
                check_and_report(disc_loss, "disc_loss")
                check_and_report(gen_loss, "gen_loss")

                total_disc_loss += disc_loss

                loss_real = tf.reduce_mean(tf.math.softplus(-y_real))
                total_loss_real += loss_real


            #Apply discriminator loss gradients
            print("Calculating disc gradients")
            disc_gradients = disc_tape.gradient(disc_loss, disc.trainable_variables)
            print("Applying disc gradients")
            try:
                disc_opt.apply_gradients(zip(disc_gradients, disc.trainable_variables))
            except Exception as e:
                print(f"Error applying gradients: {e}")
            print("Applied disc grad")

And here is my terminal output:

(tf_gpu) root@07016ffd12d2:/workspace# python 2TF-TGANv2.py
Num GPUs Available:  1
Tensorflow version: 2.8.0
2024-04-14 19:46:37.328967: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-14 19:46:37.920705: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 43453 MB memory:  -> device: 0, name: NVIDIA A40, pci bus id: 0000:52:00.0, compute capability: 8.6
Starting Training...
Generated noise
2024-04-14 19:47:03.174697: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2024-04-14 19:47:03.705363: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8100
Generated x_fake
Generated y_fake and y_real
No issues in y_fake
No issues in y_real
No issues in disc_loss
No issues in gen_loss
Calculating disc gradients
Floating point exception

The "no issues" print statement comes from a function checking the named tensors for NaN and Inf values, but these are just the output labels, not necessarily the gradients. Notice the error "Floating point exception" comes after "Calculating disc gradients", which means that this line is where the issue is:

disc_gradients = disc_tape.gradient(disc_loss, disc.trainable_variables)
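
A possible way to get more detail on a crash like this, offered as a sketch and not a diagnosis: Python's standard `faulthandler` module can print the Python-level traceback when the process is killed by a fatal signal such as SIGFPE. The lines below would go at the very top of the training script, before TensorFlow is imported.

```python
import faulthandler

# On a fatal signal (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL), dump the
# Python traceback of every thread to stderr before the process exits.
faulthandler.enable(all_threads=True)

# Confirms that the handler is installed.
handler_active = faulthandler.is_enabled()
```

The same effect is available without editing the script by running `python -X faulthandler 2TF-TGANv2.py`. The traceback only shows which Python call was active, not the failing native operation, so it narrows the search without explaining the cause.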

I have been at this for a week and am at a loss.

What I've tried

I have essentially just tried a bunch of different combinations of Tensorflow (TF), CUDA, and cuDNN, based on the "Tested Build Configurations" for GPU on tensorflow's "Build From Source" page. Combinations I've tried (on NVIDIA A40):

TF = 2.15, CUDA = 12.2, cuDNN = 8.9

TF = 2.14, CUDA = 11.8, cuDNN = 8.7

TF = 2.13, CUDA = 11.8, cuDNN = 8.7

TF = 2.12, CUDA = 11.8, cuDNN = 8.7

TF = 2.4, CUDA = 11.0, cuDNN = 8.0

TF = 2.5, CUDA = 11.0, cuDNN = 8.0

TF = 2.11, CUDA = 11.2, cuDNN = 8.1

TF = 2.10, CUDA = 11.2, cuDNN = 8.1

TF = 2.9, CUDA = 11.2, cuDNN = 8.1

TF = 2.8, CUDA = 11.2, cuDNN = 8.1

TF = 2.3, CUDA = 10.1, cuDNN = 7.6

TF = 2.2, CUDA = 10.1, cuDNN = 7.6

All of these give me the same "Floating point exception" error at the discriminator gradient calculation step. Does anyone have an idea of what is causing this? Again, I do NOT have this issue when running this on my local machine CPU. It only happens when running on a runpod VM. Is this something in my code that is handled differently on a GPU? Is this caused by a crappy VM setup? If so what could be causing it? Please help me. I really want to get this to work and I have spent a ridiculous amount of time on just trying to get Tensorflow to work on GPU lol.

TLDR

Trying to copy a video generation model TGANv2, written in Chainer. Copied and ran it in PyTorch, works fine. When trying to run it in Tensorflow, I keep getting a "Floating point exception" error when it gets to the gradient calculation for the discriminator:

disc_gradients = disc_tape.gradient(disc_loss, disc.trainable_variables)

I am running it on a runpod VM with Ubuntu 22.04 and an NVIDIA A40 GPU. I am also running it on a conda environment. The error does not happen when running it on my local computer CPU. I have tried 13 different combinations of Tensorflow (see section immediately before this) based on the "Tested Build Configurations" for GPU according to tensorflow's "Build From Source" page, all have the same issue. Please help me.

0 Comments
2024/04/15
00:59 UTC

0

Why is my pytorch model making up its own data?

I was wondering why the accuracy of my inference model was so bad when I discovered that, after running my numpy array dataset through the PyTorch dataset loader, it was transformed into totally different, random data that doesn't exist in my dataset. Why is this happening?

1 Comment
2024/04/14
22:22 UTC

0

How does my model with 99% accuracy perform so badly on inference

The model cannot even get things correct on the training dataset. I am not sure where I could have gone wrong except maybe incorrectly matching the labels to the data. Please ask for any more specifications if you need them.

11 Comments
2024/04/14
22:13 UTC

6

Data Structures and Algorithms and Object Oriented Programming

How much of data structures and algorithms and object-oriented programming do we need to know for data science and ML in industry? And how much LeetCode do we need to solve for data science? I'd appreciate your opinions.

5 Comments
2024/04/14
21:14 UTC

5

Am I doing it right? Could anyone please check my way of learning

I keep bouncing back to the math. I can't convince myself that I can learn ML without learning all the theory (matrix algebra, linear algebra, partial differential equations and many more). The textbooks advised by redditors are great but have very lengthy theory explanations. I kept studying for 2 months (books + YouTube videos), but then someone working at an ADAS (self-driving, parking assistance) company asked me, "Can you build a model if you know how the math behind it works?" I wrote a linear regression model on housing prices, and he went, "WHHHAT, you are supposed to make classes and functions, this is not how we do this. This is just a model that works, but you should write code that can be used for deployment, or at least it should be comparable with others working on the problem statement. This is useless." I went "😬😨😰". I don't know what to do.

1 Comment
2024/04/14
19:44 UTC

1

How can I derive associations between player positions?

How can I derive associations between player positions?

So I have a CSV containing football data about goals, where each goal has a scorer, GCA1 (the player that gave the assist), and GCA2 (the player that gave the pass to the assister).

I want to discover patterns of player positions that lead to a goal AKA buildups to a goal

Example: RB passed to a CAM which assisted a goal scored by a ST, or CB passed to a RW which assisted a goal scored by a LW

I want to find the most frequent buildups, think of it as finding frequent itemsets for a supermarket to derive discount decisions. Except my goal is to know which buildups are most common and make up coaching plans to better strengthen the relationship between the players in those buildups

I was thinking of using the Apriori algorithm or FP-Growth. I tried ChatGPT but it didn't help much (I'm getting only one association, between FW players and no one, which amounts to saying forward players score solo, and that is definitely not logical based on my dataset), and Gemini is the most awful AI out there. Seriously, my grandma can do better: I gave it a prompt and rephrased it 3 times and it still gave me 'Rephrase your prompt and try again'.

So does anyone know a way I can do this, or if there is a way to do it better. I'm still a junior data scientist so I'm still learning and I would gladly appreciate any feedback or advice.
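
Since every goal has exactly three ordered roles (GCA2, GCA1, scorer), one simpler alternative to Apriori or FP-Growth, which treat items as unordered sets, may be to count the ordered position triples directly. This is a sketch on made-up rows; the positions and counts are illustrative only, not taken from the dataset in the post.

```python
from collections import Counter

# Hypothetical rows: (GCA2 position, GCA1 position, scorer position).
goals = [
    ("RB", "CAM", "ST"),
    ("CB", "RW", "LW"),
    ("RB", "CAM", "ST"),
    ("CM", "CAM", "ST"),
]

# Count full buildups (pass -> assist -> goal) as ordered triples.
buildup_counts = Counter(goals)

# Count only the last link (assister position -> scorer position).
pair_counts = Counter((gca1, scorer) for _, gca1, scorer in goals)

top_buildup, top_count = buildup_counts.most_common(1)[0]
```

On these rows the most common buildup is RB to CAM to ST, and the CAM to ST link appears three times. With a real CSV, the rows would come from the three position columns after mapping each player to a position.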

0 Comments
2024/04/14
19:25 UTC

1

Extract calculations naive bayes model

I have a naive Bayes ML model that takes call attributes and predicts if the caller is going to abandon the call while they are on hold waiting to speak to an agent. The model lives in Databricks ML flow, I have it registered.

What I need to do is extract the exact calculations so I can make them myself during the call. Once the user hangs up, the prediction is useless. I want to predict whether the caller breaks the threshold of "likely to abandon" based on a running tally of the features and weights during the call.

Is there a way to extract the calculations being made in the model and make them myself? Because of the way our call center is set up, the business owners do not want to import the whole model and make predictions every time a feature is updated. It would eat significantly less resources to just have a running tally and once that tally breaks a threshold, flag the call.

Asking AI, it seems like the calculations are obscured and it's not easy to extract them and make them myself, especially if using a naive bayes model.
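
For what it's worth, a naive Bayes prediction over categorical features can be written as a running sum: the log prior odds plus one log-likelihood ratio per observed feature. The sketch below uses made-up probabilities and feature names; in practice the numbers would have to be read out of the trained model, and how to do that depends on the library used.

```python
import math

# Hypothetical parameters, standing in for values taken from the trained model.
prior = {"abandon": 0.2, "stay": 0.8}
likelihood = {
    # P(feature observed | class)
    "long_hold": {"abandon": 0.7, "stay": 0.3},
    "repeat_caller": {"abandon": 0.4, "stay": 0.5},
}

def log_weight(feature):
    """Log-likelihood ratio contributed by one observed feature."""
    p = likelihood[feature]
    return math.log(p["abandon"] / p["stay"])

# Start the tally at the log prior odds, then add each feature as it arrives.
tally = math.log(prior["abandon"] / prior["stay"])
for feature in ["long_hold", "repeat_caller"]:
    tally += log_weight(feature)

# Convert the log-odds back to a probability and compare to a threshold.
prob_abandon = 1.0 / (1.0 + math.exp(-tally))
flagged = prob_abandon > 0.5
```

Each `log_weight` can be precomputed once, so the live system only needs a lookup table and an addition per feature update. The 0.5 threshold is a placeholder.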

0 Comments
2024/04/14
19:01 UTC

1

Python For Beginners - Building an AI Recommender System

1 Comment
2024/04/14
18:56 UTC

2

I get stuck while learning Math for ML. What should I do?

I have been learning ML for a long time now. It was going smoothly. I learned some basic and intermediate material, but the thought that I had to learn it in depth kept bothering me. So I started learning the math behind ML, and now it's been 2 months, but I am stuck, have made zero progress, and feel like I have forgotten everything. I just can't find a good resource for the math, and if I do, I have a really hard time grasping the concepts.

4 Comments
2024/04/14
17:42 UTC

0

using vscode with gpu

It's important that I use VS Code because I want to use my device's microphone. If I use Colab I will have to use JavaScript, which I don't write.

This link says it's possible to use a GPU in VS Code, and it's free. If so, what's the benefit of using a Colab GPU if I can use the GPU on my own device?

Also, I am not sure if I am doing it the right way, but how can I use a GPU in VS Code?

thanks

0 Comments
2024/04/14
17:12 UTC

5

Final Year Project Ideas

I am doing my bachelor's in data science and my final year is around the corner. We have to make a research and/or industry scope project with a front-end in a group of 2-3 members. I am still confused about the scope of the project (how far a bachelor's student is realistically expected to take it), but I know a 'good' AI/ML project usually lies in either the medical domain along with computer vision, or creating speech-to-text chatbots with LLMs.

Here are a few projects (sans front-end) that I have already worked on, just to show that I aim to do something bigger than these for my final project:

  • Mitosis detection in microscopic cell images of varying stains
  • Art style detector using web scraping (selenium + bs4)
  • Age/gender/etc recognition using custom CNN
  • Endoscopy classification using VGG16/19
  • Sentiment Analysis on multilingual text
  • Time series analysis
  • Stock market predictions
  • RNN based lab-tasks

My goal is to secure a good master's admission with a remarkable project. I am curious about LLMs and Reinforcement Learning, but more specific help is appreciated!

1 Comment
2024/04/14
17:00 UTC

4

Stable Diffusion SD 1.5 and SDXL Full Fine Tuning Tutorial

3 Comments
2024/04/14
16:17 UTC
