/r/learnmachinelearning
A subreddit dedicated to learning machine learning
A subreddit dedicated to learning machine learning. Feel free to share any educational resources for machine learning.
Also, we are a beginner-friendly subreddit, so don't be afraid to ask questions! These can include questions that are non-technical but still highly relevant to learning machine learning, such as a systematic approach to a machine learning problem.
I'm watching the awesome machine learning lectures from Cornell here: https://www.youtube.com/watch?v=zj-5nkNKAow and I'm stuck around the 28-minute mark.
He's starting to explain loss functions and has started with a 0/1 loss function - when the prediction is not the same as the label, it returns 1, otherwise 0. I want to add this equation to my notes using LaTeX, but I can't read what the middle part is.
So, it's 1/h, then ???, then a delta function. What is the ??? middle part? Is it a '2' or a 'z'? Does it have an overline? The lower part, (x_i, y_i) from dataset D, I understand, but not what's above it.
Any help, please?
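For reference, the standard zero-one loss is usually written as below (this is an assumption based on your description, not a transcript of the slide); the "middle part" would then be a summation sign with (x_i, y_i) ∈ D underneath it:
```latex
% Zero-one loss: h is the hypothesis, D = {(x_1, y_1), ..., (x_n, y_n)} is the dataset
\mathcal{L}_{0/1}(h) = \frac{1}{n} \sum_{(x_i, y_i) \in D} \delta\left(h(x_i) \neq y_i\right)
```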
Hey machine learning enthusiasts, I am building a local surveillance monitoring system to control CCTV cameras, and the system should be able to detect motion and sound and livestream real-time footage to my device. Could you help me, please? I'm facing challenges making it work.
I am designing a neural network with the goal of controlling enemy decisions in my strategy video game. I would like to show this theoretical neural network sequential animations of game state progressions, and have it infer the game's rules and optimal strategies based on this limited information.
The goal of this neural network is to have very little feature programming and no access to the rules of the game. Its scope should be limited to viewing turn-by-turn game states and discovering winning and losing strategies, as well as the hidden rules of the game. An example follows:
In a 2D grid-based game, fire tiles will spread to tree tiles. The neural network should see step-by-step views of trees burning and learn that trees next to a fire tile will ignite. It should also see evidence of water tiles dousing tree fires and learn that water will douse any fire. As the game becomes more complex, it should learn that burning a forest next to enemy tiles will yield successful results, while burning a forest next to friendly tiles will lead to failure and damage to the team. Through analysis of game states, it should theoretically learn these complex behaviors through scoring and reinforcement. The end result is a neural network that decides to burn forests near enemy bases, while using water to douse fires near its own bases and keeping fires away from its friendly forests. It should never have explicit knowledge of what a 'tree' or 'fire' or 'water' is; rather, it should learn their connection to winning or losing the game, both short-term and long-term. 'Tree' might be 1, 'fire' might be 2, 'water' might be 3. All the neural network knows is that these entities are different and that they have hidden rules that affect how they propagate throughout the game.
All the resources I've found on causal neural networks are very rigid and imply some sort of pre-determined or preprogrammed structure to the network. I realize that having a machine learn such complex patterns in a strategy game requires more flexibility and modularity in its network, so these articles are not as helpful as I'd like them to be.
I am looking for resources or ideas that will aid in the achievement of this goal.
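Not a resource recommendation, but here is a minimal sketch of one way to frame the "learn the rules from observed transitions" part: a small network trained to predict the next grid state from the current one, i.e. a learned world model. Everything here (grid size, entity codes, architecture) is a hypothetical illustration, not a prescribed design.
```python
import torch
import torch.nn as nn

# Hypothetical setup: a 10x10 grid, entities encoded as integers (0=empty, 1=tree, 2=fire, 3=water)
GRID = 10
NUM_ENTITIES = 4

class TransitionModel(nn.Module):
    """Predicts the next game state from the current one (a simple learned world model)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_ENTITIES, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, NUM_ENTITIES, kernel_size=3, padding=1),  # per-cell logits over entity types
        )

    def forward(self, state_onehot):
        return self.net(state_onehot)

def to_onehot(state):
    # state: (batch, GRID, GRID) integer grid -> (batch, NUM_ENTITIES, GRID, GRID)
    return nn.functional.one_hot(state, NUM_ENTITIES).permute(0, 3, 1, 2).float()

model = TransitionModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy training step on random (state, next_state) pairs;
# in practice these would be recorded turn-by-turn game states.
state = torch.randint(0, NUM_ENTITIES, (8, GRID, GRID))
next_state = torch.randint(0, NUM_ENTITIES, (8, GRID, GRID))

logits = model(to_onehot(state))    # (8, NUM_ENTITIES, GRID, GRID)
loss = loss_fn(logits, next_state)  # per-cell classification of the next entity
loss.backward()
optimizer.step()
```
For the decision-making side (which tiles to burn or douse), model-free reinforcement learning over the same grid observations is the more common route; the transition model above is just one way to make the "infer the hidden rules" goal concrete.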
MiniMax is the newest addition to the text-to-video models, giving stiff competition to Kling AI, Luma Dream Machine and Runway Gen-3 Alpha. Check out the demo and how to use it: https://youtu.be/WR5zsSelxzw
I'm in the final year of my engineering diploma and want to create a project on Plant Disease Detection and Classification Using Machine Learning Algorithms. I only know basic Python and have no idea about ML; our teachers have never taught it but expect us to make complicated projects. Can someone please guide me on how to do this - like front end, back end, everything?
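In case it helps to see the ML core in isolation, here is a minimal transfer-learning sketch with PyTorch/torchvision, under the assumption that your leaf images are organized into class-named folders (the folder path and class names below are placeholders):
```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Assumption: images live in data/train/<class_name>/*.jpg (e.g. healthy/, rust/, blight/)
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_ds = datasets.ImageFolder("data/train", transform=transform)
train_dl = DataLoader(train_ds, batch_size=32, shuffle=True)

# Start from a pretrained backbone and replace only the final classification layer
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(train_ds.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in train_dl:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "plant_disease_model.pt")
```
Once a model like this is saved, the "front end / back end" part can be a thin wrapper around it, for example a small web app that accepts an uploaded photo and returns the predicted class.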
Hello everyone, I am a software engineering student starting my final year, and I'm thinking of using machine learning in my final project. My initial idea is kind of unique, and I haven't found a library with pre-trained models for it. Do you think that if I dedicated three months to learning machine learning from scratch, I could manage to do it? I mean in the next semester. For context: I mostly have experience with web development, specifically ReactJS, NodeJS, and a little bit of Java.
Edit: for clarity.
Has anyone here done the Databricks Machine Learning Associate certification? Or any Databricks certification in general? I would like to know the general process, how difficult it is, and the chances of clearing it on the first go.
Hi, I'm a junior data scientist (at least in my head, but the company I'm working at only has DS and senior DS titles) and I came across something I really don't understand; googling didn't help.
So I've joined a project where the most senior guy trained an Isolation Forest for finding anomalies in some dataset, and I was told to add some more stuff.
I found out that, since the original dataset is huge, he found the anomalies in weekly batches. I asked why he wouldn't just train it on a sample from different weeks and then use the trained model on the rest of the data, as the data apparently doesn't change much. He basically laughed at me and told me I should know that you can't use Isolation Forest this way and that it only ever works for the specific dataset it was trained on.
I don't think that's true, but maybe I'm missing something crucial. I didn't really study anomaly detection at university, and he's impossible to talk to. Any idea why that would be true?
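For what it's worth, scikit-learn's IsolationForest does support the pattern you describe mechanically: you can fit it on a sample and then score unseen rows with the fitted model. A minimal sketch on synthetic data, just to show the calls:
```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
full_data = rng.normal(size=(1_000_000, 8))  # stand-in for the large dataset
sample_idx = rng.choice(len(full_data), size=50_000, replace=False)

model = IsolationForest(n_estimators=200, random_state=0)
model.fit(full_data[sample_idx])             # train on a sample only

# Apply the trained model to the rest of the data (in batches if needed)
batch = full_data[:100_000]
scores = model.decision_function(batch)      # higher = more normal
labels = model.predict(batch)                # -1 = anomaly, 1 = normal
```
Whether that is appropriate for your project is a separate question: if each week's distribution genuinely shifts, per-batch retraining may be intentional, but "it only ever works on the data it was trained on" is not a property of the algorithm itself.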
I have the following torch Dataset (I have replaced the actual file-reading code with random number generation to make it minimally reproducible):
from torch.utils.data import Dataset
import torch

class TempDataset(Dataset):
    def __init__(self, window_size=200):
        self.window = window_size
        self.x = torch.randn(4340, 10, dtype=torch.float32)
        self.y = torch.randn(4340, 3, dtype=torch.float32)
        self.len = len(self.x) - self.window + 1  # = 4340 - 200 + 1 = 4141
        # Hence, last window start index = 4140
        # And last window will range from 4140 to 4339, i.e. total 200 elements

    def __len__(self):
        return self.len

    def __getitem__(self, index):
        # AFAIU, below if-condition should NEVER evaluate to True as the last index with which
        # __getitem__ is called should be self.len - 1
        if index == self.len:
            print('self.__len__(): ', self.__len__())
            print('Tried to access element @ index: ', index)
        return self.x[index: index + self.window], self.y[index + self.window - 1]

ds = TempDataset(window_size=200)
print('len: ', len(ds))

counter = 0  # no record is read yet
for x, y in ds:
    counter += 1  # above line read one more record from the dataset

print('counter: ', counter)
It prints:
len: 4141
self.__len__(): 4141
Tried to access element @ index: 4141
counter: 4141
As far as I understand, __getitem__() is called with index ranging from 0 to __len__() - 1. If that's correct, then why did it call __getitem__() with index 4141, when the length of the dataset itself is 4141?
One more thing I noticed is that despite being called with index = 4141, it does not seem to return any element, which is why counter stays at 4141.
What are my eyes (or brain) missing here?
PS: Though it won't have any effect, just to confirm, I also tried wrapping the Dataset in a torch DataLoader and it still behaves the same.
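One thing that may be relevant (offered as a pointer, not a verdict on your exact setup): when you iterate an object directly with a for loop and it defines __getitem__ but not __iter__, Python falls back to the legacy sequence protocol and keeps calling __getitem__ with 0, 1, 2, ... until an IndexError is raised; __len__ is not consulted at all. A minimal sketch of that behavior:
```python
class Windows:
    """Defines __len__ and __getitem__ but no __iter__."""
    def __len__(self):
        return 3

    def __getitem__(self, index):
        print("called with index", index)
        if index >= 5:               # pretend the underlying storage runs out here,
            raise IndexError(index)  # not at __len__(); the for loop only stops on IndexError
        return index

for item in Windows():
    pass
# Prints indices 0 through 5: the loop runs past len() == 3 and only
# stops when __getitem__ raises IndexError at index 5.
```
If that is what's happening in your case, index 4141 would raise an IndexError inside self.y[index + self.window - 1] (row 4340 of a 4340-row tensor), which both stops the loop and explains why no element is returned for that index.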
AniNameCraft: https://github.com/IAmPara0x/AniNameCraft
Hello everyone!
I have been learning about text generation and came across char-rnn, so I decided to create something similar.
Just like char-rnn, the RNN in AniNameCraft can generate new names given an input seed (i.e. a name prefix), but in addition you can ask the RNN to generate names specifically for male or female characters.
Under the hood, when generating a name for, say, a male character, the encoding prefixes the name with the special token `<M>`, and from that token the RNN understands that it should generate names that resemble male character names.
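Purely as an illustration of that idea (this is not the repo's actual code), the prefixing amounts to something like:
```python
# Hypothetical illustration of the gender-token prefix idea (not AniNameCraft's actual code)
SPECIAL = {"Male": "<M>", "Female": "<F>"}

def encode(name_prefix: str, gender: str) -> list[str]:
    # The gender token is the first "character" the RNN sees, so its hidden
    # state is conditioned on gender before the name prefix starts.
    return [SPECIAL[gender]] + list(name_prefix)

print(encode("hi", "Male"))    # ['<M>', 'h', 'i']
print(encode("hi", "Female"))  # ['<F>', 'h', 'i']
```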
Example:
Generating names for female character that starts with "hi":
```
cmd: ./main.py hi --gender Female beam-search --beam-width 5 --beam-depth 10
output: ['hinon', 'hisaki', 'hirono', 'hirona', 'hiroka']
```
Generating names for male character that starts with "hi":
```
cmd: ./main.py hi --gender Male beam-search --beam-width 5 --beam-depth 10
output: ['hisaki', 'hiroyama', 'hiroyasu', 'hideyoshi', 'hironosuke']
```
I hope you like it!!
I'm working on a PPO based off of the CleanRL PPO implementation.
I built a custom environment using Farama's Gymnasium API. The environment loads training data from PostgreSQL. During local development on my Mac, to speed things up, I used memcached to store the training data in memory. This helps because each env can pull from a shared memory cache, and it is faster than querying the DB.
On my MacBook, the model itself uses about 2GB of memory and the memcached data takes about 3GB of memory. This is using 128 envs.
My problem is when I try to run on my desktop using CUDA. CUDA runs OOM trying to allocate 56GB of memory, of which I have 12 lol.
If I lower the env count, I can slide under the memory limit. This doesn't work, though, as there is a minimum number of envs that I need for my training to work.
So my question is: is there a reason the CUDA environment is not using memcached? Does it want to just put everything in memory regardless of the source? Is there a built-in way in Gymnasium to have the environments share a data memory pool?
I know there must be a solution out there, and I thought my memcached one was decent (I know it's slower than using the memory on the card, but better than nothing).
The envs are created using SyncVectorEnv.
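One pattern worth trying (a sketch under the assumption that your training data is plain CPU arrays and only minibatches go to the GPU): because SyncVectorEnv runs all envs in a single process, you can load the data once and hand the same in-memory object to every env factory, so nothing has to be duplicated per env or placed on the GPU.
```python
import numpy as np
import gymnasium as gym
from gymnasium.vector import SyncVectorEnv

class MyEnv(gym.Env):
    """Toy env that reads from a data object shared by reference across all envs."""
    def __init__(self, shared_data):
        self.data = shared_data  # no copy: every env holds a reference to the same array
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(10,), dtype=np.float32)
        self.action_space = gym.spaces.Discrete(2)
        self.t = 0

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        return self.data[self.t], {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= len(self.data) - 1
        return self.data[self.t], 0.0, terminated, False, {}

# Load once (from PostgreSQL or memcached), keep it on the CPU as NumPy
shared_data = np.random.randn(10_000, 10).astype(np.float32)

envs = SyncVectorEnv([lambda: MyEnv(shared_data) for _ in range(128)])
obs, info = envs.reset()
```
If CUDA is trying to allocate ~56GB, the usual suspects are the rollout storage (num_steps x num_envs x obs_size tensors created on the device) or the dataset itself being moved to the GPU somewhere; keeping env data as CPU NumPy and only sending minibatches to the device inside the update loop is usually enough to stay within 12GB, though I can't confirm that from the code shown.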
I'm looking to learn the mathematics behind AI. I've started with linear algebra. Currently, I'm doing Gilbert Strang's course on Linear Algebra and Essence of Linear Algebra by 3Blue1Brown.
While I have been moving through these resources and found them useful (I especially like 3Blue1Brown's visualisations), I want to know if there are resources as easily understandable that relate this math to AI.
I previously tried the Mathematics for Machine Learning book but found it boring and cumbersome. I tried some Coursera courses (Imperial College London's) but don't have the money to pay even after financial aid.
Any alternatives, not only for linear algebra but for the other areas as well, would be highly appreciated. I believe it will help me stay motivated when I know, at each step, why I am learning the math in the first place.
I am trying to implement a custom diffusion pipeline following the Hugging Face tutorial.
The same model & weights were used, but the images generated are different for the same seed. I am very new to this field. Any help is appreciated. Thank you.
Same seed & prompt. Different image generated
Custom Diffusion Pipeline from the Hugging Face tutorial
from PIL import Image
import torch
from transformers import CLIPTextModel, CLIPTokenizer
from diffusers import AutoencoderKL, UNet2DConditionModel, PNDMScheduler

vae = AutoencoderKL.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="vae", use_safetensors=True)
tokenizer = CLIPTokenizer.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="text_encoder", use_safetensors=True
)
unet = UNet2DConditionModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="unet", use_safetensors=True
)
scheduler = PNDMScheduler.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="scheduler")

torch_device = "cuda"
vae.to(torch_device)
text_encoder.to(torch_device)
unet.to(torch_device)

prompt = ["a photograph of an astronaut riding a horse"]
height = 512  # default height of Stable Diffusion
width = 512  # default width of Stable Diffusion
num_inference_steps = 25  # Number of denoising steps
guidance_scale = 7.5  # Scale for classifier-free guidance
generator = torch.manual_seed(0)  # Seed generator to create the initial latent noise
batch_size = len(prompt)

text_input = tokenizer(
    prompt, padding="max_length", max_length=tokenizer.model_max_length, truncation=True, return_tensors="pt"
)
with torch.no_grad():
    text_embeddings = text_encoder(text_input.input_ids.to(torch_device))[0]

max_length = text_input.input_ids.shape[-1]
uncond_input = tokenizer([""] * batch_size, padding="max_length", max_length=max_length, return_tensors="pt")
uncond_embeddings = text_encoder(uncond_input.input_ids.to(torch_device))[0]

text_embeddings = torch.cat([uncond_embeddings, text_embeddings])

latents = torch.randn(
    (batch_size, unet.config.in_channels, height // 8, width // 8),
    generator=generator,
)
latents = latents.to(torch_device)
latents = latents * scheduler.init_noise_sigma

from tqdm.auto import tqdm

scheduler.set_timesteps(num_inference_steps)

for t in tqdm(scheduler.timesteps):
    # expand the latents if we are doing classifier-free guidance to avoid doing two forward passes.
    latent_model_input = torch.cat([latents] * 2)
    latent_model_input = scheduler.scale_model_input(latent_model_input, timestep=t)

    # predict the noise residual
    with torch.no_grad():
        noise_pred = unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample

    # perform guidance
    noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
    noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond)

    # compute the previous noisy sample x_t -> x_t-1
    latents = scheduler.step(noise_pred, t, latents).prev_sample

# scale and decode the image latents with vae
latents = 1 / 0.18215 * latents
with torch.no_grad():
    image = vae.decode(latents).sample

image = (image / 2 + 0.5).clamp(0, 1).squeeze()
image = (image.permute(1, 2, 0) * 255).to(torch.uint8).cpu().numpy()
image = Image.fromarray(image)
image
Using diffusers library directly
import torch
from diffusers import StableDiffusionPipeline
model_id = "CompVis/stable-diffusion-v1-4"
device = "cuda"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)
generator = torch.manual_seed(0)
prompt = "a photograph of an astronaut riding a horse"
image = pipe(prompt, generator=generator).images[0]
image
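For what it's worth, two settings differ between the two runs above (an observation about the code, not a confirmed diagnosis of the mismatch): the custom loop runs the models in float32 with 25 inference steps, while the pipeline is loaded in float16 and uses its default of 50 steps. A sketch that aligns them:
```python
import torch
from diffusers import StableDiffusionPipeline

# Load in float32 (the default dtype) so it matches the custom loop above
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe = pipe.to("cuda")

generator = torch.manual_seed(0)
image = pipe(
    "a photograph of an astronaut riding a horse",
    num_inference_steps=25,   # the custom loop uses 25; the pipeline default is 50
    guidance_scale=7.5,
    generator=generator,
).images[0]
image
```
Even with these aligned, bit-identical outputs aren't guaranteed (scheduler configuration and small numerical differences can still shift the result), but step count and dtype are the first things worth ruling out.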
I am trying to use catboost inside VS Code. I am using Python 3.11.0, catboost 1.2.5 and numpy 2.1.0, and I am getting the above error while importing catboost. I have tried upgrading numpy and catboost but it's not working... HELP!
I am a QA engineer and have been working at a company for the past 7 months. I want to switch to a machine learning engineer role after 2-3 years. Suppose I learn Python, maths (stats, linear algebra, probability), machine learning, and deep learning, specialize in one of CV or NLP, do meaningful projects, participate in Kaggle competitions to gain some real-life experience, and build a portfolio to showcase on GitHub and LinkedIn.
If I then apply online on LinkedIn, Naukri, Glassdoor, and company sites, CAN I GET THE JOB?
I am trying to solve a classification problem using Python scikit-learn. I am capturing the coordinates of certain points on a person's face relative to the center of the face using a webcam. By finding some relationships between these points (such as the angle of the eyebrows in degrees, the distance between the eyebrows and eyes, mouth opening, etc.), I have expanded the diversity of my data. My goal is to determine whether the person is yawning. I have numerical data like coordinates of the form 0.55461845 and angles of the form 61.7546368. I need to normalize this data but I'm not sure which method to use. Additionally, I understand that this is a classification problem (since the person is either yawning (1) or not yawning (0)), but I can't decide which model to use. Please provide suggestions.
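A minimal sketch of one reasonable baseline (the data below is a placeholder for your feature table): scale the features with StandardScaler, which handles mixed ranges like coordinates near 0.5 and angles near 60 degrees, and start with a simple classifier inside a Pipeline:
```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Placeholder data: rows = frames, columns = your geometric features
# (point coordinates, eyebrow angle, eyebrow-eye distance, mouth opening, ...)
X = np.random.rand(500, 12)
y = np.random.randint(0, 2, size=500)  # 1 = yawning, 0 = not yawning

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = Pipeline([
    ("scale", StandardScaler()),                  # zero mean, unit variance per feature
    ("model", LogisticRegression(max_iter=1000)),
])
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```
Tree-based models such as RandomForestClassifier don't require scaling at all, so they make an easy second baseline to compare against before reaching for anything heavier.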
Hello, I wanted to share that I am posting free courses and projects on my YouTube channel. I have more than 200 videos and I have created playlists for learning machine learning. I am leaving the playlist links below. Have a great day!
Machine Learning Tutorials -> https://youtube.com/playlist?list=PLTsu3dft3CWhSJh3x5T6jqPWTTg2i6jp1&si=1rZ8PI1J4ShM_9vW
Data Science Full Courses & Projects -> https://youtube.com/playlist?list=PLTsu3dft3CWiow7L7WrCd27ohlra_5PGH&si=6WUpVwXeAKEs4tB6
Hey everyone,
I'm currently working on a project where I need to interact with dynamically changing XML files using Langchain4j or Langchain with OpenAI's GPT-4o model. The XML files I'm dealing with are quite large (over 100MB each), and I'm not able to include them directly in the prompts. The XML schema (XSD) is also available, but it's split across multiple files and spans thousands of lines.
The XML files are already parsed into Java Objects, and here's what I'm trying to achieve:
I've attempted to use Retrieval-Augmented Generation (RAG) and Function Calling, but neither approach has provided a satisfactory solution.
Given the constraints, how can I efficiently communicate with and manipulate these large, dynamic XML files using Langchain (or Langchain4j) and GPT-4o? Any advice, examples, or guidance would be greatly appreciated!
Thanks in advance!
I just recently created a Discord server for those who, like myself, are beginners in machine learning. Getting a good roadmap would help us a lot, so if anyone has a roadmap that you think is the best, please share it with us if possible.
My budget is rupees 1 lakh 30 thousand
Hi all,
I am interested in joining an innovation challenge at work regarding GenAI. I have thought about ideas for days, and because of my limited understanding of the processes in this area, I can only think of designing an automated GenAI tool to analyze data, which would replace the current one.
The thing is, the current tool is not that old - maybe 4-5 years or so - and I don't want to step on anyone's toes (the people who developed this tool) by saying yes, we should replace it. Also, our sales team uses this tool to work with customers, and by replacing it with GenAI they will have less work. I also don't want to imply that GenAI can just take over people's work.
Could you advise me whether I should suggest this solution at all, or try to think of another, given the ethical doubts I still have?
Any idea where and how I can come up with better ideas? Hoping for some advice. Many thanks!
I tried to find an answer to this question, but I didn't find anything useful. In most cases (in all, in fact), the answers relate either to fine-tuning of models or to inference, but there's not a word about the experience of commercial use…
For example: how many concurrent users, under asynchronous load, could an RTX 2080 Ti handle with a Llama 8B model loaded on it? What does this number of users depend on?
I am trying to track down the body of work combining reinforcement learning with model predictive control. Can someone guide me through the key research papers in that direction? Thanks.
Hello everyone! I'm a college student who recently landed an internship involving NLP. My work requires me to learn the RAG framework and implement it in conjunction with a large language model integrated with the OpenAI API. I'm seeking guidance and a clear roadmap to help me navigate this process
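In case a concrete starting point helps, here is a minimal RAG sketch under a few assumptions: it uses the official openai Python client, a plain in-memory list as the "vector store", and cosine similarity for retrieval. A real project would add chunking, a proper vector database, and error handling.
```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# 1) Embed a small document collection (placeholder texts)
docs = [
    "The return policy allows refunds within 30 days.",
    "Shipping to Europe takes 5-7 business days.",
    "Support is available by email at support@example.com.",
]
doc_embs = [
    np.array(d.embedding)
    for d in client.embeddings.create(model="text-embedding-3-small", input=docs).data
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = np.array(
        client.embeddings.create(model="text-embedding-3-small", input=[query]).data[0].embedding
    )
    sims = [float(q @ e / (np.linalg.norm(q) * np.linalg.norm(e))) for e in doc_embs]
    top = np.argsort(sims)[::-1][:k]
    return [docs[i] for i in top]

# 2) Ask the model, grounding the answer in the retrieved context
question = "How long do I have to request a refund?"
context = "\n".join(retrieve(question))
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message.content)
```
The frameworks you will likely meet at work (LangChain, LlamaIndex, etc.) wrap exactly these steps - embed, retrieve, stuff into the prompt - so understanding this bare version first tends to make their abstractions easier to follow.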
I want to use autoencoders to extract some new features for prediction on a dataset. Let's say I have K features (i.e. K is the number of columns in my data). If I want to reduce the K features to a lower-dimensional latent space, what is a reasonable size for that new dimension, so that I don't lose too much information but still get the data nicely compressed?
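One common starting point (a heuristic, not a rule) is to let PCA's explained-variance curve suggest a dimension, for example the smallest number of components that retains about 95% of the variance, and then use that as the autoencoder's bottleneck size. A sketch with placeholder data:
```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder: n samples x K features
X = np.random.rand(2000, 40)
X_scaled = StandardScaler().fit_transform(X)

# Passing a float to n_components asks PCA for the number of components
# needed to retain that fraction of the variance
pca = PCA(n_components=0.95).fit(X_scaled)
print("suggested bottleneck size:", pca.n_components_)
```
The autoencoder's bottleneck can then be set near that number and compared against one or two nearby sizes by reconstruction error; if the error drops sharply up to some dimension and then flattens, that elbow is usually a sensible choice. Note that PCA only captures linear structure, so treat the number as a starting point rather than an answer.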
Complete beginner here. I started learning machine learning fundamentals a couple of months back, and right now I'm practicing coding models for real use cases and datasets.
I have heard that adjusting hyperparameters is more art than science, though I'm kind of confused about how I should do it to get optimal results without wasting much time. Currently, I create several models and analyse their losses to figure out the best model to use, along with checking the bias and variance to check for over- and underfitting, but it's kind of exhausting and feels very random to me.
Are there any advanced tips and tricks for figuring out these parameters more efficiently? Like how many neurons to use in a layer, how many layers to even use, or what kind of activation function suits which layer best.
I apologize if my question is too basic, but I would love to learn more and overcome my shortcomings and gaps in knowledge.
Thank you
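For the "without wasting much time" part, one widely used option is to let a search library explore the space instead of hand-tuning each run. A minimal sketch with Optuna and scikit-learn (the model and search space here are just an example; the same pattern applies to neuron counts, layer counts, and learning rates in a neural network):
```python
import optuna
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

def objective(trial):
    # Each trial samples one hyperparameter combination and returns its CV score
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 400),
        "max_depth": trial.suggest_int("max_depth", 2, 16),
        "min_samples_leaf": trial.suggest_int("min_samples_leaf", 1, 10),
    }
    model = RandomForestClassifier(**params, random_state=0)
    return cross_val_score(model, X, y, cv=3).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
print(study.best_params, study.best_value)
```
For neural nets specifically, a common approach is to fix sensible defaults (ReLU activations in hidden layers, one to three hidden layers to start) and spend most of the search budget on the learning rate and layer width, since those usually move the needle the most.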
Hi! I come from an engineering background and I have been learning ML as a way to bridge solutions for a future business I might pursue (maybe in energy). I've been trying to get a solid foundation in the subject, enough to build prototype models and to work with actual CS majors in creating the final product by understanding the general process and bridging the gap between the needs and the solution. So not exactly a hardcore ML engineer goal, but being able to work with them while holding my own.
I've taken Andrew Ng's Machine Learning course and I will take his Deep Learning course soon. I've begun reading the Hands-On Machine Learning book and I've implemented two projects (Titanic & Advanced Housing Regression), scoring in the upper 10th percentile.
I'm not sure what I should be learning next or what I should take a deeper dive into, whether that's feature engineering, the different algorithms, new tools, or more math. Would appreciate your help!!