/r/learnmachinelearning
A subreddit dedicated to learning machine learning
A subreddit dedicated to learning machine learning. Feel free to share any educational resources on machine learning.
Also, we are a beginner-friendly subreddit, so don't be afraid to ask questions! This can include questions that are non-technical but still highly relevant to learning machine learning, such as a systematic approach to a machine learning problem.
Hi folks,
I'm a Master's student studying AI and Robotics. I'm considering starting a Twitch channel where I'd stream myself working on projects and learning about ML/AI/Robotics.
I'm wondering if anyone in the community would be interested in watching this?
Sort of inspired by watching streams like https://www.twitch.tv/georgehotz and Andrej Karpathy's videos.
It also feels like having an active chat would be worthwhile for folks who have questions about the field.
Hi everyone, since quantization is a hot topic right now, here is a series of experiments on embedding quantization with SVD, meant to give a better understanding of the effects of quantization. It is part of my broader cloud embedding series. Hope this helps!
Hello, I have a scholarship that lets me choose any course on Coursera. When I finish the one I'm currently taking, I'm thinking of taking one related to ML, as I have already completed a Udemy course (Ai/ML from A-Z), so I already have a base. I can take any course from a reputable company.
I want to use a CNN on audio data. During pre-processing I applied an envelope to the audio, which removed noise but also resulted in shorter clips. Now I want all inputs to have the same shape. How do I do that? I know about trimming and padding, but I am still confused about how to apply them.
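A minimal sketch of the trim-or-pad idea mentioned above, assuming the clips are 1-D NumPy arrays of samples. The function name `fix_length`, the 16000-sample target, and the dummy clips are illustrative assumptions, not part of the original question.

```python
import numpy as np

def fix_length(signal: np.ndarray, target_len: int) -> np.ndarray:
    """Trim or zero-pad a 1-D audio signal to exactly target_len samples."""
    if len(signal) >= target_len:
        return signal[:target_len]  # keep the first target_len samples
    pad = target_len - len(signal)
    return np.pad(signal, (0, pad), mode="constant")  # append zeros at the end

# Hypothetical clips whose lengths differ after envelope-based denoising
clips = [np.ones(12000, dtype=np.float32), np.ones(20000, dtype=np.float32)]
target_len = 16000  # assumed: 1 second at a 16 kHz sample rate

# Every clip now has the same length, so they stack into one batch array
batch = np.stack([fix_length(c, target_len) for c in clips])
print(batch.shape)
```

The same idea applies after converting to spectrograms: pick one target length, then trim or pad along the time axis before feeding the CNN.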
Accuracy reaches peak at 35 and then stagnates
I was just trying this model out on the CIFAR100 dataset, and accuracy and loss stagnate around 50 epochs in. I tried SGD with 0.001 and momentum 0.9, and Adam with 0.001, with the same results.
I'm working on developing a machine learning model that can accurately identify whether digits in an image are handwritten or printed. Despite having access to datasets like Char74K and MNIST, I'm struggling to get my model to perform well.
I have some experience with basic machine learning models, but I'm finding this particular challenge a bit daunting.
Model Attempts: I've attempted using a Convolutional Neural Network (CNN) given its success in image-based tasks, but my results have been underwhelming.
I'm seeking advice on the following:
Optimal Model Architecture: Are there specific architectures or layers that have worked well for a task like this?
Effective Preprocessing: Any recommendations on preprocessing techniques that might enhance model performance?
Dataset Balancing and Augmentation: Tips on how to balance and augment these datasets for better training outcomes.
If anyone has worked on similar problems or knows of relevant literature or tutorials, I would really appreciate your insights.
Thank you so much for taking the time to read and respond. Any guidance or references would be greatly appreciated!
Question: Are the company's strategies implemented in all departments?
Data: Official strategy papers (beginning of strategy, main documents); (strategy) communication in departmental meetings
Goal: Evaluate the degree to which the strategies are present in the communication docs and build a hierarchical top-down tree that reflects the company's structure.
Ideas: use pre-trained transformers; keyword-based topic modelling and quantification for each document -> compare texts related to each strategy; word and sentence embeddings for similarity evaluation
What ideas do you have for comparing texts with respect to specific strategies?
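As a rough baseline for the similarity idea above, here is a sketch that scores each department's text against a strategy statement using bag-of-words cosine similarity. This is a simpler stand-in for the sentence embeddings mentioned in the post; the strategy text, department names, and meeting notes are made up for illustration.

```python
import math
import re
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical strategy statement and departmental meeting notes
strategy = "reduce carbon emissions across all production sites"
meeting_notes = {
    "production": "we plan to reduce emissions at both production sites this year",
    "sales": "quarterly revenue targets and new customer onboarding",
}

strategy_vec = bag_of_words(strategy)
scores = {
    dept: cosine_similarity(strategy_vec, bag_of_words(text))
    for dept, text in meeting_notes.items()
}
print(scores)
```

Swapping `bag_of_words` for an embedding model would keep the rest of the comparison unchanged, and the per-department scores could then be aggregated along the company hierarchy.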
Is it possible to have a layer in a model architecture that dynamically transforms its output according to the input provided? My problem is that I wish to replace a visual frontend module in an existing model architecture. The module calculates the features of a video and outputs a shape (T, 512) per video, where T is the number of time steps (frames) in the video. This is then passed into a conformer to calculate the temporal relation between the video frames. I want to replace it with a video vision transformer. The problem is that the video vision transformer provides an output of the form (3137, 768), which are fixed hidden_size constants, so the time step variable is not retained in the output. I wish to further transform this shape into (T, 512), where T keeps changing for each video. Is it possible to do this using some pre-existing layer exposed by PyTorch? Something where I can pick the T most significant outputs from the (C1, C2) shape?
Some things to keep in mind:
Is it even possible with a sample size of less than 100? Could you suggest how to achieve it?
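One possible direction, sketched in NumPy rather than PyTorch: adaptively average-pool the 3137 tokens down to T, then project 768 to 512. The random `projection` matrix stands in for a learned linear layer, and the helper names are hypothetical. In PyTorch, `torch.nn.functional.adaptive_avg_pool1d` takes the output size per call, so T can change for each video. Note this averages neighbouring tokens rather than picking the "most significant" ones, and the token order in the transformer output may not be purely temporal, so treat it as a crude starting point.

```python
import math
import numpy as np

def adaptive_avg_pool_tokens(tokens: np.ndarray, t_out: int) -> np.ndarray:
    """Average-pool (N, D) token features down to (t_out, D)."""
    n = tokens.shape[0]
    pooled = np.empty((t_out, tokens.shape[1]), dtype=tokens.dtype)
    for i in range(t_out):
        start = (i * n) // t_out               # floor(i * n / t_out)
        end = math.ceil((i + 1) * n / t_out)   # bins may overlap by one token
        pooled[i] = tokens[start:end].mean(axis=0)
    return pooled

rng = np.random.default_rng(0)
# Stand-in for the (3137, 768) video vision transformer output
vivit_out = rng.standard_normal((3137, 768)).astype(np.float32)
# Stand-in for a learned 768 -> 512 linear projection (assumption)
projection = (rng.standard_normal((768, 512)) * 0.01).astype(np.float32)

def to_frame_features(tokens: np.ndarray, num_frames: int) -> np.ndarray:
    """Map (N, 768) tokens to (num_frames, 512) features."""
    return adaptive_avg_pool_tokens(tokens, num_frames) @ projection

print(to_frame_features(vivit_out, 75).shape)
print(to_frame_features(vivit_out, 30).shape)
```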
Bad Retrieval
Low Precision: Not all chunks in the retrieved set are relevant
— Hallucination + Lost in the Middle Problems
Low Recall: Not all relevant chunks are retrieved.
— Lacks enough context for LLM to synthesize an answer
Outdated information: The data is redundant or out of date.
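To make the precision and recall definitions above concrete, here is a small sketch that scores one retrieval result against a hand-labelled set of relevant chunks. The chunk IDs and the function name are illustrative assumptions.

```python
def retrieval_precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query's retrieved chunk IDs."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    # Precision: share of retrieved chunks that are relevant
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    # Recall: share of relevant chunks that were retrieved
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall

# Hypothetical example: 4 chunks retrieved, 3 chunks actually relevant
retrieved = ["c1", "c2", "c3", "c4"]
relevant = ["c2", "c4", "c7"]
precision, recall = retrieval_precision_recall(retrieved, relevant)
print(precision, recall)
```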
Bad Response Generation
Hallucination: The model makes up an answer that isn’t in the context.
Irrelevance: The model makes up an answer that doesn’t answer the question.
Toxicity/Bias: The model makes up an answer that’s harmful/offensive.
1. Context Missing in the Knowledge Base: Clean data and better prompting
2. Context Missing in the Initial Retrieval Pass: Hyperparameter tuning for chunk size and top-k
3. Context Missing After Reranking: Better retrieval strategies like Knowledge Graph Retrievers or Composed/Hierarchical Retrievers
4. Context Not Extracted: Prompt compression or LongContextReorder
5. Output is in the Wrong Format: Better text prompting/output parsing; Use OpenAI function calling + JSON mode; Use token-level prompting (LMQL, Guidance)
6. Output has an Incorrect Level of Specificity: Small-to-big retrieval; Sentence window retrieval; Recursive retrieval
7. Output is Incomplete: Query transformations
8. Can’t Scale to Larger Data Volumes: Parallelizing ingestion pipeline
9. Rate-Limit Errors: Multiple API keys and rotate them in our application
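Item 9 above can be sketched as a simple round-robin rotation over a pool of keys. The class name and the placeholder keys are hypothetical, and a real application would likely also back off and retry when a request is rate-limited.

```python
import itertools

class KeyRotator:
    """Hand out API keys in round-robin order."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one API key is required")
        self._cycle = itertools.cycle(keys)

    def next_key(self):
        """Return the next key, wrapping around at the end of the pool."""
        return next(self._cycle)

rotator = KeyRotator(["key-A", "key-B", "key-C"])  # placeholder keys
used = [rotator.next_key() for _ in range(5)]
print(used)
```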
I need an open-source TTS model with pre-trained presets. Scrolling through Hugging Face and GitHub, I keep encountering models that allow me to train my own, but I'd like to just select from proper, high-quality pre-trained presets. I'd like to interact with the model entirely via a Python script.
I'm doing this to replace some functionality in an app I'm developing that uses OpenAI's paid TTS API (https://platform.openai.com/docs/guides/text-to-speech) - but as I scale I'm realizing how expensive that service really is. Any help is much appreciated, thank you!
I want to train a segmentation model, but in the dataset I chose, the number of images and the number of mask images are different. What should I do in this situation? Please give me some suggestions, and thank you for your time.
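One common way to handle this, sketched under the assumption that each mask shares its file name (without extension) with its image: pair the files by name and set aside images that have no mask. The paths below are made up for illustration.

```python
from pathlib import Path

def pair_images_and_masks(image_paths, mask_paths):
    """Pair images with masks by file stem; return (pairs, unmatched_images)."""
    masks_by_stem = {Path(m).stem: m for m in mask_paths}
    pairs, unmatched = [], []
    for img in image_paths:
        mask = masks_by_stem.get(Path(img).stem)
        if mask is None:
            unmatched.append(img)       # no mask with the same name
        else:
            pairs.append((img, mask))
    return pairs, unmatched

# Hypothetical file lists: image 002 has no mask
images = ["images/001.jpg", "images/002.jpg", "images/003.jpg"]
masks = ["masks/001.png", "masks/003.png"]
pairs, unmatched = pair_images_and_masks(images, masks)
print(pairs)
print(unmatched)
```

Training on `pairs` only keeps images and masks aligned; the `unmatched` list shows what would need annotating or dropping.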
Hey everyone,
I hope you're all doing well. I wanted to reach out to this community for some advice and guidance.
A little background about myself: I initially started my journey in the tech world with a focus on machine learning. However, despite my passion and efforts, I struggled to find internships or job opportunities in the field. As a result, I ended up transitioning into web development.
While I appreciate the skills I've gained in web development, I've come to realize that it's not where my true passion lies. My heart still beats for machine learning, and I find myself yearning to dive back into that world.
I'm reaching out to this community for any advice, tips, or resources on how to make a successful transition from web development back to machine learning. Whether it's courses to take, projects to work on, or networking opportunities, I'm open to any suggestions that can help me pursue my passion for ML.
Check out this tutorial on how to generate movie scripts using Multi-Agent Orchestration, where the user inputs the movie scene, an LLM decides which agents to create, and these agents then follow the scene description to speak the dialogue. https://youtu.be/Vry2-h81_I0?si=0KknmT8CfAhTucht
TLDR at the bottom
My Background
I am an electrical engineer by trade. I started learning about ML/AI almost a year ago. I have taken the Machine Learning Specialization by Andrew Ng on Coursera, and most of the GAN specialization, also on Coursera. I'm comfortable in Python, but definitely no expert. Most of my coding and AI/ML knowledge is self-taught, so there are definitely huge holes in my knowledge.
My Goal
I am trying to create a video generator model. To that end I found an architecture that supposedly works very well called TGANv2, which is written in Chainer. I want to edit this model and eventually put it on a microcontroller. I am also treating this as a learning exercise, so I copied the model in PyTorch and got similar results as the Chainer model. Here is the Pytorch model, if you are interested.
Now I was playing around with making edits in PyTorch and found the training slow and sometimes expensive (I don't have a GPU, I am using a runpod VM with a GPU). I tried changing the generator convolutions to depth-wise separable ones and it considerably increased training time, which is odd considering it cut my generator size down to about 1/10 or so. So I decided to re-create the model in Tensorflow to see if it would train faster, and because I think I will eventually have to write the model in Tensorflow Lite to get it on a microcontroller. But I keep running into a Floating Point Exception error. More on that below.
Here is my Tensorflow Code: Model (It's long, I have a snippet of the training loop a bit further down, which I think is more relevant).
Setup
Runpod VM with an NVIDIA A40 GPU running Ubuntu 22.04. Originally running CUDA 11.8, Tensorflow 2.14, and cuDNN 8.7 (though tried a bunch of different combinations, explained below). I am also running this in a conda virtual environment.
Issue I am running into
I run my tensorflow code, it instantiates the model just fine. It starts the training loop and goes through forward propagation just fine. But when it gets to the line that is meant to calculate discriminator gradients, it keeps throwing a Floating Point Exception error. I have tried multiple different combinations of CUDA, Tensorflow, and cuDNN. Also, when I run this code on my local machine CPU it calculates the discriminator gradients and doesn't throw a floating point exception error (it has another error later on, but I will cross that bridge when I get there).
Here is the pertinent part of my training loop (snipped for brevity):
# Create dataloader
dataset = VideoDataset(directory)
dataloader = tf.data.Dataset.from_generator(
    lambda: iter(dataset),  # Corrected to use iter() to clearly return an iterator from the dataset
    output_signature=(
        tf.TensorSpec(shape=(16, 32, 32, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(8, 64, 64, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(4, 128, 128, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(2, 256, 256, 3), dtype=tf.float32)
    )
).batch(batch_size)

print("Starting Training...")
for epoch in range(n_epochs):
    start_time = time.time()
    total_gen_loss, total_disc_loss, total_loss_real = 0, 0, 0
    num_batches = 0
    for batch in dataloader:
        # Adjust learning rate
        iteration += 1
        x_real = batch
        num_batches += 1
        current_batch_size = x_real[0].shape[0]
        noise = tf.random.uniform((batch_size, z_dim), -1.0, 1.0)

        # Discriminator loss
        print("Generated noise")
        with tf.GradientTape(persistent=True) as gen_tape, tf.GradientTape() as disc_tape:
            x_fake = gen(noise, training=True)
            print("Generated x_fake")
            y_fake = disc(x_fake, training=True)  # Compute once for both G and D updates
            y_real = disc(x_real, training=True)
            print("Generated y_fake and y_real")
            disc_loss = tf.reduce_mean(tf.math.softplus(y_fake)) + tf.reduce_mean(tf.math.softplus(-y_real))
            gen_loss = tf.reduce_mean(tf.math.softplus(-y_fake))

        # Use the checking function instead of tf.debugging.check_numerics
        check_and_report(y_fake, "y_fake")
        check_and_report(y_real, "y_real")
        check_and_report(disc_loss, "disc_loss")
        check_and_report(gen_loss, "gen_loss")
        total_disc_loss += disc_loss
        loss_real = tf.reduce_mean(tf.math.softplus(-y_real))
        total_loss_real += loss_real

        # Apply discriminator loss gradients
        print("Calculating disc gradients")
        disc_gradients = disc_tape.gradient(disc_loss, disc.trainable_variables)
        print("Applying disc gradients")
        try:
            disc_opt.apply_gradients(zip(disc_gradients, disc.trainable_variables))
        except Exception as e:
            print(f"Error applying gradients: {e}")
        print("Applied disc grad")
And here is my terminal output:
(tf_gpu) root@07016ffd12d2:/workspace# python 2TF-TGANv2.py
Num GPUs Available: 1
Tensorflow version: 2.8.0
2024-04-14 19:46:37.328967: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-14 19:46:37.920705: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 43453 MB memory: -> device: 0, name: NVIDIA A40, pci bus id: 0000:52:00.0, compute capability: 8.6
Starting Training...
Generated noise
2024-04-14 19:47:03.174697: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2024-04-14 19:47:03.705363: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8100
Generated x_fake
Generated y_fake and y_real
No issues in y_fake
No issues in y_real
No issues in disc_loss
No issues in gen_loss
Calculating disc gradients
Floating point exception
The "no issues" print statement comes from a function checking the named tensors for NaN and Inf values, but these are just the output labels, not necessarily the gradients. Notice the error "Floating point exception" comes after "Calculating disc gradients", which means that this line is where the issue is:
disc_gradients = disc_tape.gradient(disc_loss, disc.trainable_variables)
I have been at this for a week and am at a loss.
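A generic way to get more detail out of a crash like this, offered as a sketch rather than a diagnosis: Python's standard `faulthandler` module can dump the Python traceback when the process receives a fatal signal such as SIGFPE. It only shows which Python line was executing when the signal arrived, not the root cause inside the native GPU code.

```python
import faulthandler

# Place this at the very top of the training script, before importing
# TensorFlow. If the process later receives a fatal signal (SIGFPE,
# SIGSEGV, SIGABRT, SIGBUS, SIGILL), the Python traceback of every
# thread is written to stderr before the process dies.
faulthandler.enable()
```

The same effect is available without editing the script by running it as `python -X faulthandler 2TF-TGANv2.py`.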
What I've tried
I have essentially just tried a bunch of different combinations of Tensorflow (TF), CUDA, and cuDNN based on the "Tested Build Configurations" for GPU on Tensorflow's "Build From Source" page. Combinations I've tried (on NVIDIA A40):
TF = 2.15, CUDA = 12.2, cuDNN = 8.9
TF = 2.14, CUDA = 11.8, cuDNN = 8.7
TF = 2.13, CUDA = 11.8, cuDNN = 8.7
TF = 2.12, CUDA = 11.8, cuDNN = 8.7
TF = 2.4, CUDA = 11.0, cuDNN = 8.0
TF = 2.5, CUDA = 11.0, cuDNN = 8.0
TF = 2.11, CUDA = 11.2, cuDNN = 8.1
TF = 2.10, CUDA = 11.2, cuDNN = 8.1
TF = 2.9, CUDA = 11.2, cuDNN = 8.1
TF = 2.8, CUDA = 11.2, cuDNN = 8.1
TF = 2.3, CUDA = 10.1, cuDNN = 7.6
TF = 2.2, CUDA = 10.1, cuDNN = 7.6
All of these give me the same "Floating point exception" error at the discriminator gradient calculation step. Does anyone have an idea of what is causing this? Again, I do NOT have this issue when running this on my local machine CPU. It only happens when running on a runpod VM. Is this something in my code that is handled differently on a GPU? Is this caused by a crappy VM setup? If so what could be causing it? Please help me. I really want to get this to work and I have spent a ridiculous amount of time on just trying to get Tensorflow to work on GPU lol.
TLDR
Trying to copy a video generation model TGANv2, written in Chainer. Copied and ran it in PyTorch, works fine. When trying to run it in Tensorflow, I keep getting a "Floating point exception" error when it gets to the gradient calculation for the discriminator:
disc_gradients = disc_tape.gradient(disc_loss, disc.trainable_variables)
I am running it on a runpod VM with Ubuntu 22.04 and an NVIDIA A40 GPU. I am also running it in a conda environment. The error does not happen when running it on my local computer's CPU. I have tried 13 different combinations of Tensorflow (see the section immediately before this) based on the "Tested Build Configurations" for GPU on Tensorflow's "Build From Source" page, and all have the same issue. Please help me.
I was wondering why the accuracy of my model at inference was so bad when I discovered that, after running my NumPy array dataset through the PyTorch dataset loader, it was transformed into totally different, random data that doesn't exist in my dataset. Why is this happening?
The model cannot even get things correct on the training dataset. I am not sure where I could have gone wrong except maybe incorrectly matching the labels to the data. Please ask for any more specifications if you need them.
How much do we need to know about data structures and algorithms and object-oriented programming for Data Science and ML in industry? And how much LeetCode do we need to solve for data science? I'd appreciate your opinions.
I keep bouncing back to the math. I can't convince myself that I can learn ML without learning all the theory (matrix algebra, linear algebra, partial differential equations, and much more). The textbooks recommended by redditors are great but have very lengthy theory explanations. I studied for 2 months (books + YouTube videos), but when someone working at an ADAS (self-driving, parking assistance) company asked me, "Can you build a model if you know how the math behind it works?", I wrote a linear regression model on housing prices. He went, "WHHHAT, you are supposed to write classes and functions. This is not how we do this. This is just a model that works, but you should write code that can be used for deployment, or at least code that can be compared with others working on the same problem statement. This is useless." I went "😬😨😰". I don't know what to do.
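For illustration, here is one way the "classes and functions" feedback could look for a single-feature linear regression. This is a pure-Python sketch with made-up housing numbers; the class and function names are hypothetical, and it is not meant as the style the ADAS engineer had in mind.

```python
class SimpleLinearRegression:
    """Least-squares fit of y = intercept + slope * x for one feature."""

    def __init__(self):
        self.slope = None
        self.intercept = None

    def fit(self, x, y):
        n = len(x)
        if n != len(y) or n < 2:
            raise ValueError("x and y must have the same length (at least 2)")
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0:
            raise ValueError("x must contain at least two distinct values")
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x):
        if self.slope is None:
            raise RuntimeError("call fit() before predict()")
        return [self.intercept + self.slope * xi for xi in x]

def mean_squared_error(y_true, y_pred):
    """Average squared difference, usable to compare different models."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up data: area in square metres vs. price in thousands
area = [50, 80, 100, 120]
price = [150, 240, 300, 360]
model = SimpleLinearRegression().fit(area, price)
mse = mean_squared_error(price, model.predict(area))
print(model.slope, model.intercept, mse)
```

Separating `fit`, `predict`, and the metric is what lets the same evaluation code be reused to compare this model against another one.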
How can I derive associations between player positions?
So I have a CSV containing football data about goals, where each goal has a scorer, GCA1 (the player who gave the assist), and GCA2 (the player who passed to the assister).
I want to discover patterns of player positions that lead to a goal AKA buildups to a goal
Example: RB passed to a CAM which assisted a goal scored by a ST, or CB passed to a RW which assisted a goal scored by a LW
I want to find the most frequent buildups, think of it as finding frequent itemsets for a supermarket to derive discount decisions. Except my goal is to know which buildups are most common and make up coaching plans to better strengthen the relationship between the players in those buildups
I was thinking of using the Apriori algorithm or FP-Growth. I tried ChatGPT but it didn't help me that much (I'm getting only one association, between FW players and no one, which amounts to saying forward players score solo, and that is definitely not logical based on my dataset), and Gemini is the most awful AI out there. Seriously, my grandma can do better. I gave it a prompt and rephrased it 3 times and it still gave me 'Rephrase your prompt and try again'.
So does anyone know a way I can do this, or if there is a way to do it better. I'm still a junior data scientist so I'm still learning and I would gladly appreciate any feedback or advice.
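Because the order GCA2 -> GCA1 -> scorer matters here, one simple alternative to Apriori or FP-Growth (which treat items as unordered sets) is to count ordered position triples directly. The sketch below assumes each goal row has already been mapped to positions; the field names and rows are hypothetical.

```python
from collections import Counter

# Hypothetical rows: positions of the pre-assist passer, assister, and scorer
goals = [
    {"gca2_pos": "RB", "gca1_pos": "CAM", "scorer_pos": "ST"},
    {"gca2_pos": "CB", "gca1_pos": "RW", "scorer_pos": "LW"},
    {"gca2_pos": "RB", "gca1_pos": "CAM", "scorer_pos": "ST"},
]

def buildup(goal):
    """Return the ordered buildup as a (passer, assister, scorer) tuple."""
    return (goal["gca2_pos"], goal["gca1_pos"], goal["scorer_pos"])

counts = Counter(buildup(g) for g in goals)
top = counts.most_common(1)
print(top)
```

With real data the rows could come from `csv.DictReader`, and goals with a missing GCA1 or GCA2 would need an explicit placeholder so that solo goals do not dominate the counts by accident.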
I have a naive Bayes ML model that takes call attributes and predicts whether the caller is going to abandon the call while they are on hold waiting to speak to an agent. The model lives in Databricks MLflow, where I have it registered.
What I need to do is extract the exact calculations so I can make them myself during the call. Once the user hangs up, the prediction is useless. I want to predict whether the caller breaks the threshold of "likely to abandon" based on a running tally of the features and weights during the call.
Is there a way to extract the calculations being made in the model and make them myself? Because of the way our call center is set up, the business owners do not want to import the whole model and make predictions every time a feature is updated. It would eat significantly less resources to just have a running tally and once that tally breaks a threshold, flag the call.
Asking AI, it seems like the calculations are obscured and it's not easy to extract them and make them myself, especially if using a naive bayes model.
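For what it's worth, a naive Bayes prediction is a sum of per-feature log-probability terms, so a running tally is feasible in principle. The sketch below uses made-up probabilities and feature names, not values extracted from any real model. If the registered model is a scikit-learn estimator (an assumption; the post does not say), the fitted values are exposed as attributes such as `class_log_prior_` and `feature_log_prob_` on the categorical and multinomial variants.

```python
import math

# Hypothetical values; in practice these come from the trained model
PRIOR_LOG_ODDS = math.log(0.3) - math.log(0.7)  # log P(abandon) / P(stay)
LOG_LIKELIHOOD = {
    ("queue", "billing"): {"abandon": math.log(0.6), "stay": math.log(0.4)},
    ("hold_over_2min", True): {"abandon": math.log(0.8), "stay": math.log(0.3)},
}

class RunningAbandonScore:
    """Keep a per-call tally of naive Bayes log-odds contributions."""

    def __init__(self):
        self.contributions = {}

    def update(self, feature, value):
        # Overwrites any earlier contribution of the same feature
        ll = LOG_LIKELIHOOD[(feature, value)]
        self.contributions[feature] = ll["abandon"] - ll["stay"]

    def log_odds(self):
        return PRIOR_LOG_ODDS + sum(self.contributions.values())

    def probability(self):
        """Posterior probability of abandonment given features seen so far."""
        return 1.0 / (1.0 + math.exp(-self.log_odds()))

score = RunningAbandonScore()
score.update("queue", "billing")
score.update("hold_over_2min", True)
flagged = score.probability() > 0.5  # assumed threshold
print(score.probability(), flagged)
```

Storing one contribution per feature means a feature that changes during the call replaces its old term instead of being counted twice.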
I have been learning ML for a long time now. It was going smoothly. I learned some basic and intermediate material, but the thought that I had to learn it in depth kept nagging me. So I started learning the math behind ML, and now it's been 2 months but I am stuck, have made zero progress, and feel like I have forgotten everything. I just can't find a good resource for the math, and when I do, I have a really hard time grasping the concepts.
It's important that I use VS Code because I want to use my device's microphone. If I use Colab I will have to use JavaScript, which I don't write.
This link says it's possible to use a GPU in VS Code, and for free. If so, what's the benefit of using a Colab GPU if I can have a GPU on my own device?
Also, I am not sure if I am doing it the right way, but how can I use a GPU in VS Code?
Thanks
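VS Code runs code with the local Python interpreter, so whether a GPU is usable depends on the machine's hardware and drivers rather than on the editor. A small check, assuming an NVIDIA GPU and (optionally) PyTorch, could look like this; the function name is hypothetical.

```python
import importlib.util
import shutil

def gpu_report():
    """Collect basic hints about local GPU availability."""
    report = {
        # nvidia-smi ships with the NVIDIA driver
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_cuda_available": None,  # stays None if PyTorch is missing
    }
    if report["torch_installed"]:
        import torch
        report["torch_cuda_available"] = torch.cuda.is_available()
    return report

report = gpu_report()
print(report)
```

If no local GPU is reported, that is where a hosted GPU such as Colab's remains useful.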
I am doing my bachelor's in data science and my final year is around the corner. We have to make a research and/or industry scope project with a front-end in a group of 2-3 members. I am still confused about the scope of the project (how far a bachelor's student is realistically expected to take it), but I know a 'good' AI/ML project usually lies in either the medical domain along with computer vision, or creating speech-to-text chatbots with LLMs.
Here are a few projects (sans front-end) that I have already worked on, just to show that I aim to do something bigger than these for my final project:
My goal is to secure a good master's admission with a remarkable project. I am curious about LLMs and Reinforcement Learning, but more specific help is appreciated!