/r/learnmachinelearning

A subreddit dedicated to learning machine learning

A subreddit dedicated to learning machine learning. Feel free to share any educational resources on machine learning.

Also, we are a beginner-friendly subreddit, so don't be afraid to ask questions! This can include questions that are non-technical but still highly relevant to learning machine learning, such as a systematic approach to a machine learning problem.

  • Foster a positive learning environment by being respectful to others. We want everyone to feel welcome and not be afraid to participate.
  • Do share your works and achievements, but do not spam. Keep our subreddit fresh by posting your YouTube series or blog at most once a week.
  • Do not share referral links and other purely marketing content. They prioritize commercial interests over intellectual ones.

Chatrooms

Official Discord Server


Wiki

Getting Started with Machine Learning

Resources


Related Subreddits

/r/MachineLearning

/r/MLQuestions

/r/datascience

/r/computervision

Machine Learning Multireddit

/m/machine_learning

399,345 Subscribers

1

What are the career options for an unsuccessful ML PhD?

After a few (<5) publications at non-top-tier conferences and an internship at a non-FAANG company, I've come to realize that I am probably not cut out for this career.

While I still find my research topics interesting, my research direction is not particularly employable. My supervisor is a nice person and mentor but is too "hands-off", which doesn't help. I'm not failing or dropping out, but I'm not thriving either. I'm very tired and burnt out from this competition and want to pursue a non-academic career that's stable, less demanding, and has a good work-life balance. Of course, I can accept lower pay.

What are some possible career paths for an unsuccessful ML PhD like me?

2 Comments
2024/04/29
21:56 UTC

1

What to learn next after supervised learning?

I've gotten decently familiar with supervised learning projects in classification and regression, but I'd like to keep learning and expand my knowledge (to become a data scientist, data analyst, or machine learning engineer). Some basic skills I have include Python, SQL, Tableau, pandas, NumPy, Matplotlib, Seaborn, and scikit-learn. What is the best path to keep learning skills relevant to data science and machine learning? I was considering starting on unsupervised learning/deep learning or learning cloud services like AWS, but I'd like to get some advice on what to learn and where to find resources as well.

0 Comments
2024/04/29
21:40 UTC

4

Applying Genetic Algorithms (GAs) to trading strategy optimization: A Guide

Hello!

My name is Austin. I'm a CMU alumnus with a Master's in Software Engineering. During my Master's program, I took a course called "Data Science within Software Engineering", and we learned a lot of data science techniques within NLP and Machine Learning, including genetic algorithms.

For those of you who aren't familiar, genetic algorithms (GAs), or genetic optimization, is a process in which we iteratively evolve a population of solutions using principles from natural selection. It's an old-school optimization algorithm that's good at finding a "good enough" solution in a large search space.
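To make the loop described above concrete, here is a minimal sketch of a genetic algorithm on a toy problem (maximizing the number of 1-bits in a bit string). The function name, population size, mutation rate, and selection scheme are all assumptions chosen for illustration; nothing here is taken from NextTrade or NexusTrade.

```python
import random


def evolve(fitness, genome_len=20, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Toy genetic algorithm over bit-string genomes (illustrative only)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


best = evolve(fitness=sum)
print(sum(best), "of 20 bits set")
```

Because the parents survive unchanged into the next generation, the best fitness never decreases. A trading application would swap the bit string for strategy parameters and `sum` for a backtest-based fitness function.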

Since taking that course, I've been very interested in how this algorithm can be applied to stock trading. I decided to test this out myself.

To do this, I've been working on a variety of tools that apply genetic optimization to trading strategy optimization. The first of these is NextTrade, an open-source algorithmic trading platform. NextTrade implements single-objective optimization. The main problem with it is that it is ungodly slow and takes days (even with a good computer) to finish running.

The second iteration is NexusTrade.io. I re-implemented the platform in Rust and was also more careful with other aspects of the code (including implementing sliding windows for indicators) to drastically improve the performance of the optimization process. This changed the optimization process from running for hours or days to running for minutes.
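The post does not show how the sliding windows are implemented, so the following is only a sketch of the general idea: an indicator such as a simple moving average can be updated in constant time per new price by keeping a running total, instead of re-summing the whole window on every bar. The class name `RollingMean` is hypothetical.

```python
from collections import deque


class RollingMean:
    """Simple moving average updated in O(1) per new price."""

    def __init__(self, window):
        self.window = window
        self.values = deque()
        self.total = 0.0

    def update(self, price):
        self.values.append(price)
        self.total += price
        if len(self.values) > self.window:
            # Drop the oldest price instead of recomputing the sum.
            self.total -= self.values.popleft()
        if len(self.values) < self.window:
            return None  # not enough data yet
        return self.total / self.window


sma = RollingMean(window=3)
outputs = [sma.update(p) for p in [10.0, 11.0, 12.0, 13.0]]
print(outputs)  # [None, None, 11.0, 12.0]
```

When a backtest is evaluated many thousands of times inside an optimizer, avoiding the repeated window sum is the kind of change that can plausibly account for a large speed-up.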

Within NexusTrade, I've also implemented multi-objective optimization to improve more than one fitness function simultaneously. In the context of trading, you can optimize

  • Percent change (self-explanatory)
  • Sharpe ratio (a measure of risk-adjusted returns)
  • Sortino ratio (a measure of risk-adjusted returns that doesn't penalize for positive volatility)
  • Maximum drawdown (the percent change from the highest portfolio value to the lowest value)
  • Amongst other fitness functions
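The fitness functions listed above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the input is a list of portfolio values over time, ratios are computed per period without annualization, and maximum drawdown uses the common peak-to-subsequent-trough definition. The platform's own definitions may differ.

```python
import math


def percent_change(values):
    return (values[-1] - values[0]) / values[0] * 100.0


def period_returns(values):
    return [b / a - 1.0 for a, b in zip(values, values[1:])]


def sharpe_ratio(values, risk_free=0.0):
    # Mean excess return divided by its sample standard deviation.
    r = [x - risk_free for x in period_returns(values)]
    mean = sum(r) / len(r)
    var = sum((x - mean) ** 2 for x in r) / (len(r) - 1)
    return mean / math.sqrt(var)  # undefined if returns never vary


def sortino_ratio(values, target=0.0):
    # Like Sharpe, but only returns below the target count as risk.
    r = period_returns(values)
    mean_excess = sum(x - target for x in r) / len(r)
    downside = math.sqrt(sum(min(0.0, x - target) ** 2 for x in r) / len(r))
    return mean_excess / downside  # undefined if there is no downside


def max_drawdown(values):
    # Largest percentage fall from a running peak to a later value.
    peak, worst = values[0], 0.0
    for v in values:
        peak = max(peak, v)
        worst = min(worst, (v - peak) / peak)
    return worst * 100.0


values = [100.0, 110.0, 99.0, 120.0]
print(percent_change(values), sharpe_ratio(values),
      sortino_ratio(values), max_drawdown(values))
```

In a multi-objective setting, each candidate strategy would be scored on several of these at once rather than on a single number.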

I'm very curious to see what this group thinks of genetic optimization as it relates to finance. I've put a lot of effort into implementing such a feature, including

  • Being able to configure the optimization parameters from the UI
  • Being able to test each individual candidate solution
  • Being able to modify the mutation rate, mutation intensity, the rate of spontaneous generation, amongst other hyperparameters
  • The ability to split the optimization process into a training set and validation set within the UI
  • Amongst other configurability params

If you'd like to learn more about how GAs intersect with finance, check out the following material:

0 Comments
2024/04/29
21:08 UTC

1

Project ideas using diffusion/language models

Hey guys, I’m trying to build some projects to improve my skills and showcase my proficiency with GenAI. I’m open to project ideas, but my question is a bit more fundamental: how do you get ideas for building projects? I know it sounds weird, but I would love to know what approach you guys use when building something with ML.

P.S. RAG chatbots are just too clichéd

0 Comments
2024/04/29
21:06 UTC

1

Lightning AI Studio

Does anyone here have experience with Lightning AI Studio, especially in comparison to other tools like Google Colab, Amazon SageMaker, etc.? I am trying it atm; it looks quite cool yet seems fairly expensive. We are a small startup trying to assess which tools work best. Any help appreciated!

1 Comment
2024/04/29
21:06 UTC

0

I'm lost, guide me.

I apologize for the dumb question, I'm new to AI.

I'm trying to extract questions, answers, and explanations of the answers from Arabic textbooks.

The previous developer used ChatGPT. It worked well, but not all the time, and we're changing it because it's expensive.
So far, what I've understood is that I should use a tool like OpenNLP instead of ChatGPT or Gemini.

GPT, Gemini, and Google all confused the hell out of me. I just need to know where to start, or which tools I should use, nothing more.

Thanks.

0 Comments
2024/04/29
20:55 UTC

1

MLMW: Free guide to machine learning (5 parts, 112 topics and 173 links)

The original idea was to write out useful topics in theory (statistics, econometrics, ML and deep learning) and productisation (pipelines, job roles, tools), and the topic count was around 30-40. The current version has over a hundred topics: some are just keywords, some have notes, and some are fully developed (good links and even some exercises).

Ideally, I hope the guide is useful to people switching job areas (programmer to ML, econometrician to engineer) as well as for reviewing and planning studies. My concern is that part 1 (theory, models and methods) is well developed, while the other parts may lack focus (the early draft with fewer topics was more concentrated).

The guide is freely available at https://docs.google.com/document/d/e/2PACX-1vT9ZkQJDDimZuPgBb7_hUJ40lm8LhqzL45HwIcYRYHw0AQkwA7pcqg0AIE7Gwf3QpAnZ34-BrFrWovO/pub (it is a published Google doc). Comments and refactoring suggestions welcome.

0 Comments
2024/04/29
20:52 UTC

1

Cloud solution to run my own AI API on?

I've fine-tuned a model on Hugging Face and created a small API that wraps it. I have a decent machine on which I can run the API locally and use it to prompt the model.

I want to take this to the next level by deploying my API app on a cloud hosting server, so I can communicate with it through various apps I already have. Is there a specific platform that does that with a free tier? (The model is light and small.) Does AWS provide such a service? What's it called, and is it costly? I want to pay as little as possible, or preferably use something free.

0 Comments
2024/04/29
19:58 UTC

1

Llama3 forgets how to write a poem

I'm trying to fine-tune Llama3 70B to write about the stock market. My training set consists of a list of possible interesting storylines about a given day in the stock market, along with an example of a report that was written using those storylines.

One of the questions I ask after training is just to write up a new three-paragraph market report using fresh storylines, and not surprisingly the model does a good job creating that. To test whether it has really learned to infer, however, I also ask it to write a poem about the market using those same storylines. Instead of writing a poem, it just writes up a three-paragraph report, similar to what was in the training set. This persists even if I put tons of references to writing a poem in the prompt, and even though I only run fine-tuning over 400 samples at a moderate (5e-5) learning rate (using LoRA, rank=96; alpha=32).

The thing is, if you give the untrained model the same market info and ask it to write a poem, it does a decent job. Some of the market info isn't correct/ideal, but it definitely tries to write it in poem form. My big question is: how can such a large model be so overwhelmed by such a small amount of fine-tuning that it 'forgets' how to write a poem when faced with a prompt that is otherwise similar to the training data? Any insight or suggestions would be very much appreciated!

Note: while the zero-shot results are decent, I really do need to train the model to be better for the output to be useful.
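For readers unfamiliar with the LoRA settings mentioned above, here is a small sketch of how rank and alpha enter the adapted weight, assuming the standard LoRA formulation W' = W + (alpha / rank) · B·A. Specific libraries may offer other scaling rules, and the tiny matrices below are made up purely for illustration.

```python
def lora_effective_weight(W, A, B, alpha):
    """Return W + (alpha / rank) * (B @ A) using plain nested lists.

    W: d_out x d_in frozen weight, B: d_out x rank, A: rank x d_in.
    """
    rank = len(A)
    scale = alpha / rank
    out = [row[:] for row in W]
    for i in range(len(W)):
        for j in range(len(W[0])):
            out[i][j] += scale * sum(B[i][k] * A[k][j] for k in range(rank))
    return out


# Toy rank-1 example (values are arbitrary).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[3.0, 4.0]]
print(lora_effective_weight(W, A, B, alpha=2.0))

# With the post's settings, the adapter update is scaled by alpha / rank.
print(32 / 96)
```

Under this formulation, rank=96 with alpha=32 scales the learned update by one third; the rank controls how many directions the update can change, while alpha / rank controls how strongly it is applied.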

6 Comments
2024/04/29
19:08 UTC

1

How to become a Machine Learning Infrastructure Manager

Hello everyone, I hope you are all having a great time. English is not my first language, so please bear with me, thank you. I was wondering how one becomes a machine learning infrastructure manager from scratch. I wonder how long it will take to learn everything from scratch, whether self-learning is possible, and how to go about it. I do not have any knowledge of software development or engineering, only a little programming knowledge in Python, which I taught myself. All I think I have is the zeal to learn. Also, is Machine Learning Infrastructure Management a good career to choose? Any advice will be appreciated, thank you very much.

1 Comment
2024/04/29
18:53 UTC

1

Usual industry approach on video anonymization

Hi, I'm about to acquire about 5 hours of video that must be annotated and anonymized. People in the video will be wearing masks (which seems pretty anonymous to me, but I'm not the one deciding on this :I ). I'm quite new to ML; how is face blurring usually handled in the industry? Are the masks going to be a problem?

I've found some GitHub projects that I'll test once I get some samples, and I also saw expensive online tools, but there doesn't seem to be that much info on the subject.

0 Comments
2024/04/29
18:34 UTC

1

What to do after completing Python basics for machine learning? Can I learn scikit-learn for machine learning?

Can I learn scikit-learn after completing the basics of Python for machine learning?

2 Comments
2024/04/29
16:54 UTC

2

Introduction to Phonetic Word Embeddings

1 Comment
2024/04/29
16:20 UTC

0

Looking for a technical partner

I have an idea for an AI influencer platform that allows content creators to create digital clones of themselves for their fans to interact with. Think Cameo or Onlyfans but entirely AI.

Short term, it's relatively straightforward: a personified chatbot with voice cloning/calling and consistent character image gen. Really just piecing different providers together, like an LLM + 11labs + deepgram + stable diff. I want to keep the MVP pretty simple so we can ship fast and test product-market fit.

I have already connected with nearly a dozen content creators who seem keen for something exactly like this. But frankly, I'm a terrible dev. My strengths lie in distribution, UX design, operations, and overall strategy. So I'm looking for a technical co-founder to build the MVP with.

Don't really care about your background. As long as you're obsessive, just comment or shoot me a DM and let's see how far we can take this!

3 Comments
2024/04/29
16:18 UTC

1

Demand forecasting model using LSTM. Preventing data leakage?

Does this model forecast taxi demand correctly?

Is the model tested against unseen data correctly? Are there any major errors I have missed?

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Load the dataset
df = pd.read_csv('combined.csv')
df['tpep_pickup_datetime'] = pd.to_datetime(df['tpep_pickup_datetime'])
df.set_index('tpep_pickup_datetime', inplace=True)

# Selecting data for a single location, here location ID '132'
data = df['132'].values.reshape(-1, 1)
scaler = MinMaxScaler(feature_range=(0, 1))
data_scaled = scaler.fit_transform(data)

# Function to create sequences
def create_dataset(dataset, look_back=10):
    X, Y = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        X.append(a)
        Y.append(dataset[i + look_back, 0])
    return np.array(X), np.array(Y)

# Prepare data for LSTM
look_back = 10
X, Y = create_dataset(data_scaled, look_back)
X = np.reshape(X, (X.shape[0], X.shape[1], 1))

# Split data into train, validation, and test sets
train_size = int(len(X) * 0.7)
val_size = int(len(X) * 0.2)
test_size = len(X) - train_size - val_size
trainX, trainY = X[:train_size], Y[:train_size]
valX, valY = X[train_size:train_size+val_size], Y[train_size:train_size+val_size]
testX, testY = X[train_size+val_size:], Y[train_size+val_size:]

# Build the Bidirectional LSTM model
model = Sequential()
model.add(Bidirectional(LSTM(50, return_sequences=True), input_shape=(look_back, 1)))
model.add(Dropout(0.2))
model.add(Bidirectional(LSTM(50)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')

# Use early stopping to halt the training when no improvement
early_stop = EarlyStopping(monitor='val_loss', patience=10)

# Fit the model on training data, validate on validation data
model.fit(trainX, trainY, epochs=10, batch_size=64, verbose=1, validation_data=(valX, valY), callbacks=[early_stop])

# Combine training and validation data
combinedX = np.concatenate((trainX, valX), axis=0)
combinedY = np.concatenate((trainY, valY), axis=0)

# Retrain the model on combined data
model.fit(combinedX, combinedY, epochs=10, batch_size=64, verbose=1, callbacks=[early_stop])

# Predict
train_predict = model.predict(combinedX)
test_predict = model.predict(testX)

# Inverse transformation for plotting
train_predict = scaler.inverse_transform(train_predict)
combinedY_inv = scaler.inverse_transform([combinedY])
test_predict = scaler.inverse_transform(test_predict)
testY_inv = scaler.inverse_transform([testY])

# Plot
plt.figure(figsize=(12, 6))
plt.plot(scaler.inverse_transform(data_scaled), label='Actual')
plt.plot(np.arange(look_back, len(train_predict)+look_back), train_predict, label='Train+Val Predict')
plt.plot(np.arange(len(train_predict)+(2*look_back)+1, len(train_predict)+(2*look_back)+1+len(test_predict)), test_predict, label='Test Predict')
plt.title('Taxi Demand Prediction')
plt.xlabel('Time Interval')
plt.ylabel('Taxi Trips')
plt.legend()
plt.show()
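One likely source of leakage in the code above is that the scaler is fitted on the full series before the split, so the minimum and maximum of the test period influence the training inputs. A minimal sketch of one way to avoid that, in plain Python so it runs without the dataset: split chronologically first, fit the scaling on the training portion only, then build windows inside each split. The helper names and the synthetic series are assumptions for illustration.

```python
def chronological_split(series, train_frac=0.7, val_frac=0.2):
    n = len(series)
    i = int(n * train_frac)
    j = i + int(n * val_frac)
    return series[:i], series[i:j], series[j:]


def fit_minmax(train):
    # Statistics come from the training portion only.
    lo, hi = min(train), max(train)
    span = (hi - lo) or 1.0
    return lambda xs: [(x - lo) / span for x in xs]


def make_windows(values, look_back):
    # Each window and its target stay inside one split.
    n = len(values) - look_back
    X = [values[i:i + look_back] for i in range(n)]
    y = [values[i + look_back] for i in range(n)]
    return X, y


series = [float(v) for v in range(100)]  # stand-in for the demand column
train, val, test = chronological_split(series)
scale = fit_minmax(train)

train_X, train_y = make_windows(scale(train), look_back=10)
val_X, val_y = make_windows(scale(val), look_back=10)
test_X, test_y = make_windows(scale(test), look_back=10)

# Scaled test values may fall outside [0, 1]; that is expected, because
# the scaler never saw the test period.
print(len(train_X), len(val_X), len(test_X))
```

With scikit-learn the same idea is to call `fit` on the training rows only and `transform` on the validation and test rows. Separately, the second `model.fit` call in the post passes an `EarlyStopping` callback that monitors `val_loss` without supplying validation data, so that callback has nothing to monitor during retraining.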

0 Comments
2024/04/29
15:43 UTC

3

Approximate time to train Llama 2 model with 10 GB of data?

"Hey everyone, I have a question that I need some help with. I'm looking to train an Llama 2 model using 10 GB of data. Could anyone give me an idea of how long it might take to complete this task? I'm new to deep learning. If anyone has an estimate or experience with this, please share. Thanks a lot!"

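A rough back-of-the-envelope estimate is possible with the common approximation that training costs about 6 FLOPs per parameter per token. Everything in the sketch below is an assumption for illustration: roughly 4 bytes of text per token, the 7B model size, full training of all weights for one epoch, a 312 TFLOP/s peak (the A100's dense BF16 figure), and 40% hardware utilization. The real number depends heavily on the hardware, the model size, and whether a parameter-efficient method is used.

```python
def estimate_training_days(n_params, dataset_bytes, bytes_per_token=4.0,
                           peak_flops=312e12, utilization=0.4,
                           n_gpus=1, epochs=1):
    """Estimate wall-clock days using the ~6 * params * tokens rule."""
    tokens = dataset_bytes / bytes_per_token * epochs
    total_flops = 6.0 * n_params * tokens
    seconds = total_flops / (peak_flops * utilization * n_gpus)
    return seconds / 86400.0


days_one_gpu = estimate_training_days(n_params=7e9, dataset_bytes=10e9)
days_eight_gpus = estimate_training_days(n_params=7e9, dataset_bytes=10e9,
                                         n_gpus=8)
print(f"{days_one_gpu:.1f} days on 1 GPU, {days_eight_gpus:.1f} on 8")
```

Under these assumptions the estimate comes out on the order of days on a single GPU for the 7B model; changing any of the inputs moves the result proportionally.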
5 Comments
2024/04/29
15:38 UTC

3

Should I continue doing ISLP?

Sorry for the noob question, but I started ISLP last month and have finished a chapter. It's a really good book and I am into it.

But when I told my teacher that I am reading this book, he told me that it doesn't cover much about neural networks and that I should read some other "practical" books.

What should I do?

8 Comments
2024/04/29
15:35 UTC

2

Regarding publishing research papers in computer vision

I want to publish two research papers. I have knowledge of machine learning, have done Andrew Ng's CNN course or fastai, and have done three decent deep learning projects. I am interested in computer vision. What should my approach be now to publish research papers at a decent conference? I have 6 to 7 months left, and I have to apply for a master's. Please guide me.

0 Comments
2024/04/29
14:47 UTC

2

Fast.ai 19 vs fast.ai 22 part2.

Hey guys, I need your help. Which should I do, fastai 19 or fastai 22? People say that in fastai 19, Jeremy codes from scratch. I am confused.

1 Comment
2024/04/29
14:41 UTC

1

Model being trained too slow for some reason? (Explanation in post)

2 Comments
2024/04/29
14:37 UTC

2

How is AI applied for robotics, dogfight, driving, etc?

I saw recently that the US military used AI for dogfighting, which is amazing and shows what many fear and hope AI can do. From a human point of view, if this means that someday no more humans die in wars, then that's amazing. But from a curious guy's perspective, I have no idea how that can be done. I know it uses a computer, but I don't know how it can be trained to dogfight. I'm similarly curious about how AI is used for robotics, driving, and so on; I've never come across a neural network architecture designed for those tasks.

I understand these proprietary models won't be released into the wild, but even at a high level I don't know how this can work.

4 Comments
2024/04/29
13:53 UTC

1

Persisting Overfitting Issue

Hello, I'm somewhat of a beginner in machine learning, but I have a good foundation. I started a school project which involves developing a model to predict the employability of data scientists in the United States. The goal is to help job seekers determine whether they are employable in the current U.S. market. I scraped my data from Indeed and developed the model, but I am still facing issues with overfitting. Initially, there was a problem of data leakage, but even after resolving that, I still have an overfitting problem. I consistently get a perfect score of 100% on the training data and a score of 9% on the test data. With data leakage, it was 100% on both. Now I'm unsure how to fix this. I've tried everything I can think of, from data balancing to feature selection, and used several algorithms with grid search. I don't know what to do now, even though I think I've done good preprocessing of the raw data. Could someone help me identify the problem?

0 Comments
2024/04/29
13:47 UTC

2

How can I learn this?

What are some good resources to learn ML? I'm not talking about basic ML. I learned basic concepts like regression and classification in my intro class. I'm talking about the integration of ML with mathematics like linear algebra, calculus, statistics, and differential equations. I already know the math from my college courses; I just need a resource or textbook to apply the math.

1 Comment
2024/04/29
13:19 UTC

2

Need urgent help with my project

I have to show my implementation tomorrow and I'm in deep shit.
For context, we're a team of 3 and had no choice but to do this project for 12 college credits. My teammates left for jobs/internships and didn't give any input once they left. None of the three of us knows much about DL, although I have worked on ML projects.

As for the project specifications, I am working with BraTS2020 dataset, doing image segmentation to find out which particular brain samples are affected by tumor. Once that's done, I have to use NeRF to get a general 3D visualization of a brain affected by tumor.

The main problem was that somehow, someway, ResNet gave us a better Dice score than V-Net, U-Net and other models (all the other models gave a Dice score of almost 0). So it felt like the model implementation itself is wrong. Now I'm working on it and I don't understand anything. I could really use some help. Heck, I'm even ready to pay for this. Please do reach out.
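For reference, the Dice score mentioned above measures overlap between a predicted mask and the ground-truth mask: 2·|A∩B| / (|A| + |B|). The sketch below is a minimal version on flat binary masks; the function name and the smoothing constant `eps` are assumptions, and real pipelines usually compute this per class over tensors.

```python
def dice_score(pred, target, eps=1e-7):
    """Dice coefficient for two flat sequences of 0/1 labels."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


target = [0, 1, 1, 0, 0, 1]
print(dice_score([0, 1, 1, 0, 0, 1], target))  # perfect overlap
print(dice_score([0, 1, 0, 0, 0, 0], target))  # partial overlap
print(dice_score([0, 0, 0, 0, 0, 0], target))  # all background
```

A Dice score near 0 is what an all-background prediction produces against a non-empty mask, so one thing worth checking (among others, such as label encoding or thresholding) is whether the segmentation models are collapsing to predicting background everywhere.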

Edit1: I'll leave my discord here, in case someone wants to help, please add me. I'll make a server if a number of people try to help. Discord ID - arush420_

Edit2: I made a discord for this: https://discord.gg/NwzhX7e8

6 Comments
2024/04/29
11:52 UTC

1

What's the best way to move audio recorded locally to the cloud?

Hello

I want to record audio on my local machine.

Then send it to the Azure cloud to use a GPU.

In a previous post, I suggested saving the audio into a database (I want to use Postgres, but I don't mind if something else is better), then connecting the DB to Azure and doing my thing. The answers suggested this is not an optimal method. One comment mentioned S3 storage, but that's in AWS.

What is the equivalent of S3 storage in Azure?

Also, am I doing it in the most optimal way? How would you suggest I do it instead?

Thanks

0 Comments
2024/04/29
11:45 UTC

12

Got stuck knowing ML/data science roles are only for experienced software engineers

I learned this two weeks ago but ignored it and convinced myself to keep learning. Now I'm lost in that thought: learning all of this feels like a waste of time because I don't even have a choice. Companies just don't hire fresh grads for ML/data science roles. I started learning Python so that I could become a data analyst, then convinced myself to learn ML and deep neural networks. I went through a spectrum of self-realization... People said that knowing Java or C++ is mandatory for a fresher, SQL is a must, MongoDB is a must, and AWS/Azure knowledge is preferred... OK... Now they are saying freshers just can't get hired for these roles. What should I do now? Learn Spring Boot? Or maybe Selenium? I don't know, man. Does anyone else know whether this is going on in the industry?

7 Comments
2024/04/29
11:40 UTC

6

Did Google just solve the Context length problem for LLMs?

The big breakthrough in self-attention was the network’s ability to relate different parts of the input to each other regardless of distance. This importance is directly dependent on the output: for different types of output, the attention given to each token in the input sequence varies significantly. The problem lies in scaling this type of system. To calculate attention, we need to store an N×N matrix of attention scores for a sequence of length N, so memory grows quadratically and scaling becomes more and more resource intensive as context windows reach the order of 128k or a million tokens.
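To put numbers on the quadratic growth, here is a small arithmetic sketch. It assumes 2 bytes per score (fp16) and counts a single attention head in a single layer; implementations that never materialize the full matrix are not modeled.

```python
def attention_matrix_bytes(seq_len, bytes_per_score=2):
    """Memory for one N x N matrix of attention scores."""
    return seq_len * seq_len * bytes_per_score


for n in (4_096, 128_000, 1_000_000):
    gib = attention_matrix_bytes(n) / 2**30
    print(f"N={n:>9,}: {gib:,.1f} GiB per head per layer")
```

Going from 128k to a million tokens multiplies the sequence length by about 8 but the score matrix by about 61, which is why naive attention does not scale to these context lengths.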

Full Article: https://vishal-ai.medium.com/infini-attention-infinite-context-for-llms-d4485619a01e

So, what does the Infini-Attention paper promise?

An efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed approach is a new attention technique dubbed Infini-attention.

So, the idea behind Infini-attention is to use local attention and global attention in combination. First, the entire text is divided into chunks. We apply standard attention within one chunk, and to get the context of the previous chunks we use a form of linear attention.

https://preview.redd.it/kttw9ypnnexc1.png?width=489&format=png&auto=webp&s=722105277b48d93bd20fcd302175d3fb17c351bd

- Hybrid Attention mixing: Local attention focuses on the immediate context around a word, while long-range attention maintains a broader view by referring to a compressed summary of the entire sequence seen so far.

- Compressive Memory: Previous chunks are memorized with the help of linear attention.

- Efficient Updates: To avoid redundancy and save computational effort, Infini-attention doesn’t just add new information to its memory. Instead, it first checks what is already known and only updates memory with what is new or different, similar to the skip connections in ResNet.

- Trade-off Control: A hyperparameter to mix local information and compressed memory.
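The four points above can be sketched in a few dozen lines. This is a simplified, single-head, token-at-a-time illustration in plain Python, not the paper's implementation: it assumes an ELU+1 feature map for the linear-attention memory, a delta-style update that writes only what the memory cannot already retrieve, and a sigmoid gate for mixing. The class and function names are hypothetical, and the local attention output is taken as given.

```python
import math


def sigma(x):
    # ELU(x) + 1: a positive feature map used for linear attention.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


class CompressiveMemory:
    """Fixed-size memory: a d_key x d_value matrix plus a normalizer."""

    def __init__(self, d_key, d_value, eps=1e-8):
        self.M = [[0.0] * d_value for _ in range(d_key)]
        self.z = [0.0] * d_key
        self.eps = eps

    def retrieve(self, q):
        s = sigma(q)
        denom = sum(si * zi for si, zi in zip(s, self.z)) + self.eps
        return [sum(s[i] * self.M[i][j] for i in range(len(s))) / denom
                for j in range(len(self.M[0]))]

    def update(self, k, v):
        s = sigma(k)
        known = self.retrieve(k)  # what the memory already returns for k
        for i, si in enumerate(s):
            for j in range(len(v)):
                # Write only the part of v that is new or different.
                self.M[i][j] += si * (v[j] - known[j])
            self.z[i] += si


def mix(local_out, memory_out, beta):
    # beta is a learned scalar in the model; the gate trades off
    # compressed long-range memory against local attention.
    g = 1.0 / (1.0 + math.exp(-beta))
    return [g * m + (1.0 - g) * a for a, m in zip(local_out, memory_out)]


memory = CompressiveMemory(d_key=2, d_value=3)
key, value = [0.5, -1.0], [1.0, 2.0, 3.0]
memory.update(key, value)          # store a key/value from an old chunk
recalled = memory.retrieve(key)    # query it from a later chunk
print(mix(local_out=[0.0, 0.0, 0.0], memory_out=recalled, beta=0.0))
```

The memory stays the same size no matter how many chunks have been processed, which is where the bounded-memory claim comes from; the cost is that older context is stored in compressed form rather than exactly.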

0 Comments
2024/04/29
11:37 UTC

2

New to AI/ML, kinda overwhelmed

Hey everyone, I am new to the group so regards and love to everyone.

Getting to the point, I just started learning about AI/ML/DS, NLP, and LLMs, and I'm kinda overwhelmed by the different categories and applications we have in the field.

I need some resources (videos, blogs, books, articles) to start from the basics, where I can understand how the hierarchy works and how the areas overlap.

In addition, I'd like something that explains how NLP and LLMs fall under AI, and how to approach the field from the ground up.

I am new, so I am sorry for being so noob :')

Thanks.

1 Comment
2024/04/29
10:32 UTC

1

Learning more about compression in AI models

0 Comments
2024/04/29
08:37 UTC
