/r/learnmachinelearning
A subreddit dedicated to learning machine learning
Feel free to share any educational resources about machine learning.
Also, we are a beginner-friendly subreddit, so don't be afraid to ask questions! This can include questions that are non-technical but still highly relevant to learning machine learning, such as a systematic approach to a machine learning problem.
I'm currently in med school and I've been interested in radiology from the beginning. I've seen machine learning brought up often as one of the ways to boost a CV. Is there anything I can start doing? Any actionable projects that would make me stand out?
I'm a college first-year who stumbled upon this problem statement: an ML model that takes a 3D floor plan as input and gives back its blueprint/2D floor plan. I have very basic knowledge of ML and related topics (still learning); I've worked on MNIST and dog/cat detection models, but this is a very big piece for me to chew. How should I go about building it, what is the starting point, and what are some potholes I could fall into? Has someone made this before? Any kind of insight will be very helpful and appreciated.
Should I just expect typical LeetCode-style questions? My background is in data science.
I'm interested in machine learning and have been working on it for the past six months, and I've been doing DSA in Java for more than a year. But now I want to pursue a career in machine learning. Should I continue doing DSA in Java or shift to Python? I have only 4 months left before my placements.
I want to build an XGBoost binary classification model, but I would like the outputs (probabilities) to be on a continuous sliding "scale", the way logistic regression's are. Is this how XGBoost outputs work, or are the probabilities "binned" based on the leaf that the observation falls into?
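For reference, here's a minimal sketch of how I'd check this myself; the dataset and hyperparameters below are just placeholders, not from my actual project:
# Minimal sketch: check whether XGBoost class probabilities look continuous.
# Dataset and hyperparameters are placeholders.
import numpy as np
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
model = XGBClassifier(n_estimators=200, max_depth=3)
model.fit(X, y)

proba = model.predict_proba(X)[:, 1]       # P(class = 1) for each row
print(proba[:10])                          # values spread across (0, 1), not hard 0/1 labels
print(np.unique(np.round(proba, 3)).size)  # how many distinct probability values appear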
Or did ChatGPT just make this up? I'm doing a project for school, mostly basic stuff, and was wondering whether there is a difference between representing biases as extra neurons or as just added values, if that makes sense. Can someone help me out?
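To make the question concrete, here is a tiny NumPy sketch of the two formulations I mean (the numbers are made up):
# Tiny sketch comparing the two ways of writing a bias (made-up numbers).
import numpy as np

x = np.array([0.5, -1.0])   # inputs
w = np.array([2.0, 3.0])    # weights
b = 0.7                     # bias as a separately added value

out_added = w @ x + b       # bias added as a standalone term

# Same bias written as an extra "neuron" that always outputs 1,
# with the bias acting as the weight on that constant input.
x_ext = np.append(x, 1.0)   # inputs plus a constant-1 unit
w_ext = np.append(w, b)     # weights plus the bias as a weight

out_extra_neuron = w_ext @ x_ext

print(out_added, out_extra_neuron)  # identical: the two views compute the same thing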
Some people claim that Apple's unified memory is superior and faster for AI inference because of the large amount of memory the GPU can access, up to 192GB, while even an RTX 4090 only has 24GB of VRAM. I know that Apple Silicon's GPU performance isn't great and the bandwidth actually isn't that fast, but the amount of memory does seem superior.
But can anyone benchmark an M3 Max against an RTX 4090m for AI training and inference? I'm curious about the performance difference.
I’m studying ML theory and plan to apply it with Python and relevant libraries. I’m also considering a master’s in AI/ML in about half a year.
Are there economical cloud-based options for small datasets and ML applications? I think it’s unnecessary to rely on a local environment for ML learning, right?
As the title says: we offer Perplexity AI PRO voucher codes for a one-year plan.
To Order: https://cheapgpt.store/product/perplexity-ai-pro-subscription-one-year-plan
Payments accepted:
I have an RTX 3060 and an RTX 4070 in the computer I built. I can use both GPUs simultaneously with just about any LLM, and it works reasonably well.
However, I have not found any Stable Diffusion image-generation software that supports multi-GPU, except maybe a sequential batching mode where generation requests get routed to different GPUs, each generating a complete image independently of the others.
Is there something fundamental to the architecture of Stable Diffusion models that makes multi-GPU generation impossible?
Traditional methods no longer keep up in a world that’s changing fast due to artificial intelligence. In the book Mastering AI Tools, you'll discover practical ways to use the latest AI tools to grow your skills, broaden your career prospects, and achieve real success. This isn’t just theory; it’s a guide that blends knowledge with real-world applications, putting today’s essential AI tools into your hands—the ones trusted by leaders and creators in the industry.
Join a community of professionals who’ve used these techniques to change their careers.
If you’re ready to shift your path and stay in sync with today’s digital landscape, this book could be your opportunity. Start learning now and equip yourself with skills that matter, becoming part of the next wave of digital professionals. Get your copy now!
Trying to find out if it's worth getting a dedicated workstation for training or whether doing it on the MacBook Pro is good enough. The goal is to speed up machine learning training time. The software used is Anaconda, run on Windows.
The system will have:
System power: 1600 W
Video card: NVIDIA GeForce RTX 4090 (AD102-300), 24 GB GDDR6X, 3-slot (1x HDMI, 3x DP)
Memory: 6 x 64 GB DDR4, 288-pin registered ECC DIMMs, 3200 MHz
Processor: Ryzen Threadripper PRO, socket sWRX8, 4.0 GHz
I know it's a general question, but a ballpark number is sufficient. Will it outperform by 10x or 2x? What should I expect?
I must have tried at least 100 different resume designs by now. Some of them got me interviews and some did not. I am going to share with you briefly what my learnings were.
Background - I am a digital marketer and UI/UX designer; at least now I can say that, but when I started looking for jobs during college, creating resumes and cover letters was a dreadful task, since I was never satisfied and did not have much experience with Adobe or Canva at the time.
I started out with Canva. Those resumes were not T-shaped, were extremely colourful, and had no consistency.
I started adding more content with proper headings, subheadings, etc., and got more precise about which content should go where, to keep it consistent throughout.
Fast forward, my resume started to look more professional and gained some weight.
I could never afford to pay someone else to make my resume but if you can, I will strongly advise you to do it. It will save so much of your time and effort. I did it on my own but this experience came in handy. Also, those online resume templates are aesthetically pleasing but I can almost never fit my stuff into those and if I change anything, their formatting explodes. I stopped using those.
I tweaked my resume again a few months ago, and now it gets me interviews very easily; I am quite satisfied with it. It took a lot of effort and research to get to the most optimal resume I have right now, but who knows, I might change it again.
My Key Learnings:
A lot of small companies might still accept the old Word-document resumes, but I have almost never seen a senior at a reputed company go with that.
A good resume should read top to bottom, be easy to read, and not strain the eyes. Yes, it is possible for the recruiter to read the whole resume, or at least parts of it, and it's your job to guide them to do so.
For an entry-level professional, one page is ideal; for seniors, two pages are considered good. But don't cut your experience just to fit it in. It's more important to showcase your capabilities; the number of pages is the least important thing.
Sequence of sections:
Contact details -> Career summary -> Skills -> Work experience -> education -> Projects & certifications -> references (if any).
These are all important sections; don't skip any. If you've not done any projects or courses, make it up.
Always add timelines to the job experience and certifications.
Keep going back to your resume sporadically and read it again. I bet you will find something to improve.
Lastly, these videos might help - https://www.youtube.com/watch?v=a5qUeeT2m8k
How do I know how many layers my RNN should have and how many neurons per layer I should use? Is it purely trial and error, or is there a more "correct" way of going about this? I need to design a model that will predict anomalies in a gearbox system. I have a bunch of sensors on many components within the gearbox (e.g. vibration of a shaft, temperature of a bearing, etc.) and need to use this data to create an RNN model that will tell me, as early as possible, when a failure will happen. Any help is appreciated.
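To make the question concrete, here is roughly the kind of model I have in mind; the layer sizes, window length, and sensor count are placeholders I would still need to tune:
# Rough sketch of the kind of RNN I'm considering; all sizes are placeholders.
import tensorflow as tf
from tensorflow.keras import layers

window_len = 128    # timesteps per input window (placeholder)
n_sensors = 12      # number of sensor channels (placeholder)

model = tf.keras.Sequential([
    layers.Input(shape=(window_len, n_sensors)),
    layers.LSTM(64, return_sequences=True),   # first recurrent layer
    layers.LSTM(32),                           # second recurrent layer
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),     # probability of failure within some horizon
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
model.summary()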
Hello everyone,
I’m currently working on developing a content-based recommendation system that leverages audio features. I have implemented some code, and while I’ve achieved some results, I’m eager to get feedback on potential improvements or any mistakes I may have overlooked.
Any insights or suggestions would be greatly appreciated!
import ast
import os
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras import layers, losses
from tensorflow.keras.models import Model
os.environ["CUDA_VISIBLE_DEVICES"] = "-1" # Disable GPU
class Autoencoder(Model):
    def __init__(self, latent_dim, shape):
        super(Autoencoder, self).__init__()
        self.latent_dim = latent_dim
        self.shape = shape
        # Encoder: flatten the input and compress it to the latent dimension
        self.encoder = tf.keras.Sequential(
            [
                layers.Flatten(),
                layers.Dense(latent_dim, activation="relu"),
            ]
        )
        # Decoder: expand back to the original feature count and reshape
        self.decoder = tf.keras.Sequential(
            [
                layers.Dense(
                    np.prod(shape), activation="sigmoid"
                ),  # Use np.prod instead of tf.math.reduce_prod
                layers.Reshape(shape),
            ]
        )

    def call(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
def load_track_features(path, max_rows=None):
    df = pd.read_csv(path, delimiter="\t", nrows=max_rows)
    # Apply ast.literal_eval on each cell that contains list-like strings
    for col in df.columns:
        df[col] = df[col].apply(
            lambda x: (
                ast.literal_eval(x) if isinstance(x, str) and x.startswith("[") else x
            )
        )
    df.drop(columns=["tags"], errors="ignore", inplace=True)
    return df.to_numpy()


# Data preparation function
def prepare_data(data):
    scaler = MinMaxScaler()
    data_normalized = scaler.fit_transform(data)
    return data_normalized, scaler
def recommend_similar_tracks(track_id, encoded_items):
    # Calculate cosine similarity between the target track and all other tracks
    sim_scores = cosine_similarity([encoded_items[track_id]], encoded_items)[0]
    # Sort by similarity and get most similar tracks
    sim_track_indices = np.argsort(sim_scores)[::-1]
    sim_scores = sim_scores[sim_track_indices]
    # Create a mask to exclude the input track from the recommendations
    mask = sim_track_indices != track_id
    # Filter out the input track from the indices and similarity scores
    filtered_indices = sim_track_indices[mask]
    filtered_scores = sim_scores[mask]
    return filtered_indices, filtered_scores
if __name__ == "__main__":
    # Example data - replace with your actual data
    R = load_track_features("../remappings/data/Modified_Music_info.txt", 30000)

    # Prepare data
    data_normalized, scaler = prepare_data(R)

    # Train autoencoder and get item feature matrix
    latent_dim = 16  # Adjust as needed
    input_shape = data_normalized.shape[1:]  # data_normalized is (num_samples, num_features)
    autoencoder = Autoencoder(latent_dim, input_shape)
    autoencoder.compile(optimizer="adam", loss="mean_squared_error")
    autoencoder.fit(data_normalized, data_normalized, epochs=10)

    # Get the encoded items
    encoded_items = autoencoder.encoder.predict(data_normalized)

    track_id_to_recommend = 0
    similar_tracks, similar_tracks_scores = recommend_similar_tracks(
        track_id_to_recommend, encoded_items
    )
    recommend_id_example = similar_tracks[2]

    print("Recommended track indices:", similar_tracks)
    print("Similarity scores:", similar_tracks_scores)

    print(f"Similarity (encoded space) between track {track_id_to_recommend} and track {recommend_id_example}:")
    print(
        cosine_similarity(
            [encoded_items[track_id_to_recommend]],
            [encoded_items[recommend_id_example]],
        )[0][0]
    )
    print(f"Similarity (raw normalized features) between track {track_id_to_recommend} and track {recommend_id_example}:")
    print(
        cosine_similarity(
            [data_normalized[track_id_to_recommend]],
            [data_normalized[recommend_id_example]],
        )[0][0]
    )
Title: Complete A.I. Machine Learning and Data Science: Zero to Mastery
We made a video to introduce superposition: a foundational concept enabling quantum parallel computation.
While classical bits are either 0 or 1, quantum bits (qubits) can exist in a probabilistic combination of both states at once, allowing for powerful, simultaneous computations beyond the reach of classical methods.
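For anyone who prefers the math, the standard way to write a single-qubit superposition (textbook notation, nothing specific to our video) is
|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \qquad |\alpha|^2 + |\beta|^2 = 1,
where a measurement returns 0 with probability |\alpha|^2 and 1 with probability |\beta|^2.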
Our application is supported in various languages and it's an enterprise product. We usually give new sentences to various people with expertise in different languages to translate, which costs the company a lot. We are exploring ML models to automate this. Our team has suggested OpenAI, which is probably a decent idea. I am wondering if there is any other way to try this. It won't be live translation; the translation will happen offline, we will update the relevant files, and it will get released later. I tried a local model like Marian, and it didn't do well even for basic Japanese sentences.
I am not new to ML; I've been learning for some time but lack experience.
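For context, a minimal sketch of the kind of local Marian setup I mean, via Hugging Face transformers; the checkpoint name here is an assumption from memory and may need checking:
# Minimal sketch of a local MarianMT translation attempt.
# The checkpoint name is assumed, not verified; the sentence is just example text.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-jap"  # assumed English -> Japanese checkpoint
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

text = ["The settings page has been updated."]
batch = tokenizer(text, return_tensors="pt", padding=True)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))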
Hey everyone!
As AI continues to evolve, one area that’s seeing significant improvement is data labeling. This blog I found dives into how AI-driven data labeling solutions are changing the landscape and making the process faster, more accurate, and scalable.
Here are a few highlights:
Efficiency Gains: AI-powered tools can automate repetitive tasks, speeding up the labeling process while reducing human error.
Improved Accuracy: Machine learning models continuously learn and refine labeling, which can enhance the precision of labeled data over time.
Scalability: AI-driven solutions make it easier to manage and label large volumes of data, which is especially crucial in industries like healthcare, autonomous vehicles, and e-commerce.
The blog also discusses the challenges and limitations, so it’s not all rosy—but it’s exciting to see how AI is paving the way for better data preparation and ultimately better insights.
You can check out the full article here if you're interested: Exploring the Role of AI in Data Labeling Solutions
Would love to hear your thoughts! Have any of you tried AI-powered data labeling tools? What’s been your experience?