/r/MLQuestions
A place for beginners to ask stupid questions and for experts to help them! /r/MachineLearning is a great subreddit, but it is for interesting articles and news related to machine learning. Here, you can feel free to ask any question regarding machine learning.
What kinds of questions do we want here?
"I've just started with deep nets. What are their strengths and weaknesses?" "What is the current state of the art in speech recognition?" "My data looks like X,Y what type of model should I use?"
If you are well versed in machine learning, please answer any question you feel knowledgeable about, even if it already has answers, and thank you!
Basically what the title says. I'm an undergrad student doing ML research, and I'm currently looking for DS and ML internships, but I just don't know how to add my research to my resume. Should it be formatted the way it would be for SWE roles?
Such as, "Used [technology] that led to [XYZ] and improved this by [XYZ]."
Or should it be more like, "Created a [model] that gave [XYZ results]"? Kind of vague, but I'm kind of lost here.
Title says it all; I'm getting inconsistent metric results on my test dataset when using CatBoost. My LGBM (and other) models are consistent on train/val data, while there's (very) slight variation with CatBoost.
I know there's randomness in CatBoost, but to my understanding, setting a random seed should mitigate that. Below is my training code:
import numpy as np
from catboost import CatBoostClassifier, Pool

# Training (cat_params is the dict shown further down)
X = train_data[final_features]
y = train_data['pass']
is_cat = (X.dtypes != float)  # treat every non-float column as categorical
cat_features_index = np.where(is_cat)[0]
pool = Pool(X, y, cat_features=cat_features_index, feature_names=list(X.columns))
model = CatBoostClassifier(**cat_params, verbose=False).fit(pool)
And test code:
X = test_data[final_features]
y_test = test_data['pass']
is_cat = (X.dtypes != float)  # recomputed but unused here; predict() gets the raw DataFrame
y_pred = model.predict(X)
With cat_params (kept static for each run) being:
{'learning_rate': 0.08047248508279288, 'depth': 6, 'subsample': 0.6327805587079891, 'colsample_bylevel': 0.6601989777908728, 'iterations': 467, 'random_seed': 42}
Please let me know if there's anything obvious I'm missing that would cause this CatBoost inconsistency. I'm restarting the kernel and re-running the whole notebook each time. I figured I'd post here since extensive Googling hasn't really helped.
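For reference, here's the minimal determinism check I'd run (just a sketch; X_test is a hypothetical held-out frame prepared the same way as the test code above):
# Sketch: fit twice with identical params/seed and compare predictions.
# Any disagreement means the variation comes from training itself, not evaluation.
m1 = CatBoostClassifier(**cat_params, verbose=False).fit(pool)
m2 = CatBoostClassifier(**cat_params, verbose=False).fit(pool)
print((m1.predict(X_test) == m2.predict(X_test)).mean())  # 1.0 if fully reproducible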
Thanks in advance for any help!
How do I fine-tune a model like Llama 3 to extract important information from a given description? Also, do I have to do this process manually? I want it to extract very specific pieces of data and organize them in a particular way, so I'm thinking I'll have to prompt it, tell it whether the output was correct, and keep producing my own data. Is there a way to automate the production of training data so I don't always have to do it manually?
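For context, this is roughly the kind of training data I imagine producing (a sketch, assuming a standard instruction/input/output JSONL format for supervised fine-tuning; the field names and the extraction schema here are just placeholders):
import json

# Hypothetical supervised fine-tuning examples: description in, structured fields out.
examples = [
    {
        "instruction": "Extract the product name, price, and warranty period from the description.",
        "input": "The AcmePro 3000 blender costs $129.99 and ships with a 2-year warranty.",
        "output": json.dumps({"product": "AcmePro 3000", "price": "129.99", "warranty": "2 years"}),
    },
]

with open("extraction_sft.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")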
This is my first time doing this so any tips and guidance would be great. Thanks!
OK, so I'm looking for anyone who might be interested in a hobbyist ML/ALife project that explores applying a novel genetic algorithm to neural networks to create a neurally plastic online learner that doesn't rely on backpropagation.
The concept is a fully spatially embedded RNN where all connections, weights, and biases are derived from the neurons' spatial relationships to each other, with the GA applied to regulate those spatial relationships. The GA framework is fully developed and has its own set of interesting properties that I'm not going to get into here.
Questions and reasonable criticism welcome but please be nice I'm not looking to pick a fight.
Here's a link to the branch of my GA git repo that contains this project (at least so far) if you want to check it out. Also, an image of a randomly initialized SP_NN for the cool factor.
I started using wandb for hyperparameter optimization (HPO) purposes (this is the first time I'm using it), and I have a weird issue when fine-tuning a Transformer on a binary classification task. The fine-tuning works perfectly fine when not using wandb, but the following issue occurs with wandb: at some point during the HPO search, the accuracy freezes at 0.75005 (while previous accuracy results were around 0.98), and subsequent sweep runs have exactly the same accuracy even with different parameters.
There must be something wrong with my code or the way I am dealing with this, because it only occurs with wandb. I have tried changing things in my code several times, but to no avail. I used wandb with a logistic regression model and it worked fine, though. Here is an excerpt of my code:
import numpy as np
import wandb
from transformers import Trainer, TrainingArguments
# `accuracy` (an accuracy metric object), `model`, `train_dataset` and `test_dataset`
# are defined elsewhere in the full script; only the sweep-related part is shown here.

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)

sweep_configuration = {
    "name": "some_sweep_name",
    "method": "bayes",
    "metric": {"goal": "maximize", "name": "eval_accuracy"},
    "parameters": {
        "learning_rate": {
            "distribution": "log_uniform_values",
            "min": 1e-5,
            "max": 1e-3,
        },
        "batch_size": {"values": [16, 32]},
        "epochs": {"value": 1},
        "optimizer": {"values": ["adamw", "adam"]},
        "weight_decay": {"values": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]},
    },
}

sweep_id = wandb.sweep(sweep_configuration)

def train():
    with wandb.init():
        config = wandb.config
        training_args = TrainingArguments(
            output_dir="models",
            report_to="wandb",
            num_train_epochs=config.epochs,
            learning_rate=config.learning_rate,
            weight_decay=config.weight_decay,
            per_device_train_batch_size=config.batch_size,
            per_device_eval_batch_size=16,
            save_strategy="epoch",
            evaluation_strategy="epoch",
            logging_strategy="epoch",
            load_best_model_at_end=True,
        )
        trainer = Trainer(
            model=model,  # note: `model` is created once, outside train()
            args=training_args,
            train_dataset=train_dataset,
            eval_dataset=test_dataset,
            compute_metrics=compute_metrics,
        )
        trainer.train()
        final_eval = trainer.evaluate()
        wandb.log({"final_accuracy": final_eval["eval_accuracy"]})
        wandb.finish()

wandb.agent(sweep_id, function=train, count=10)
Is there a way to give one image of a person and have a model identify and track that person in a video using features other than their face? Maybe it could detect all people and output the probability that each is the same person, and then some filtering could be done to confirm, depending on model accuracy. Can this be done? And how? Looking to use this for a robotics project.
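A rough sketch of the detect-then-compare idea I have in mind (assumes torchvision's pretrained detector; the plain ResNet feature extractor here is only a stand-in for a proper person re-identification embedding model, which would work much better):
import torch
import torchvision

# Pretrained person detector (COCO label 1 = person) and a generic feature extractor.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
backbone = torchvision.models.resnet50(weights="DEFAULT").eval()
embed = torch.nn.Sequential(*list(backbone.children())[:-1])  # drop the classifier head

def embedding(img):
    # img: CxHxW float tensor in [0, 1]
    with torch.no_grad():
        resized = torch.nn.functional.interpolate(img[None], size=(224, 224))
        return torch.nn.functional.normalize(embed(resized).flatten(1), dim=1)

def match_scores(reference_img, frame):
    """Cosine similarity between the reference person and every detected person in a frame."""
    with torch.no_grad():
        dets = detector([frame])[0]
    ref = embedding(reference_img)
    scores = []
    for box, label in zip(dets["boxes"], dets["labels"]):
        if label.item() != 1:  # keep only 'person' detections
            continue
        x1, y1, x2, y2 = box.int().tolist()
        crop = frame[:, y1:y2, x1:x2]
        scores.append(float((embedding(crop) @ ref.T).item()))
    return scores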
Hi everyone! I’m currently finishing my Master’s in Physics and starting to transition into machine learning. Right now, I work as a junior data engineer, and I’d like some feedback on a roadmap I’ve put together for myself. My goal is either to land a position as a data scientist or ML engineer or eventually pursue a Ph.D. to apply for research positions. For context, I’m from Argentina, so the job market here might be a bit different.
Here’s the roadmap I’ve planned:
I’m currently taking Andrew Ng’s ML specialization and working through An Introduction to Statistical Learning (ISL), doing the exercises.
After finishing the specialization, I plan to read Machine Learning with PyTorch and Scikit-Learn while continuing to follow topics in ISL.
Then, I’d like to work on a few projects that interest me, particularly around recommendation systems and classification, but in an end-to-end format (starting with initial analysis in a notebook and then moving towards a production-ready implementation using MLOps tools, etc.).
Finally, to round out the theoretical side, I plan to read The Elements of Statistical Learning (ESL) and Dive into Deep Learning.
I’ve set aside around 6 months for this, given that I’m finishing my Master’s while also working.
Do you think this is a good roadmap? Or is it too much theory and reading and not enough coding?
Thanks!
I love understanding HOW everything works and WHY everything works, and of course, to understand deep learning better you need to go deeper into the math. For that very reason I want to build up my foundation once again: redo probability, stats, linear algebra. But it's just tedious learning the math, the details, the notation, everything.
Could someone just share some words from experience that doing the math is worth it? Like, I KNOW it's a slow process, but god damn it's annoying and tough.
Need some motivation :)
I had experimented with a traditional cloud GPU provider (with its own captive fleet and data center), but was a bit surprised and disappointed with how the overall billing worked. I wanted to check whether TensorDock, Vast.ai, RunPod, and other such cloud GPU providers handle on-demand usage and billing similarly, or whether they are significantly better.
The traditional cloud GPU provider I tried earlier charged for:
So a total of $2.10 per day, or $63 per month -- an amount which seems unreasonably high for the actual usage, i.e. around 90 hours per month. Wondering if TensorDock and Vast.ai also operate on the same model?
My aim is to run the cloud GPU as a remote inference endpoint for things like chatbot usage, coding assistance, or other consumer inference workloads, and only rarely perhaps some fine-tuning. I plan to do this during my after-office hours, so limited to a max of 3 hours a day on average.
Just as AlphaGo and AlphaStar (the StarCraft II AI) made significant contributions to the advancement of reinforcement learning, why not conduct research to develop an AI specifically for defeating World of Warcraft raid bosses?
I believe significant research outcomes could be achieved in the interactions of 20 players and in real-time decision-making by tackling WoW raid bosses.
In particular, rather than training the AI on the patterns of existing raid bosses, it could learn and adapt to new bosses without any prior information, similar to AlphaZero. This approach, especially when new bosses appear in events like the Race to World First, would be much more challenging and beneficial for the advancement of AI technology than previous efforts with AlphaGo or AlphaStar.
However, I’m just a beginner developer who loves World of Warcraft and only has basic knowledge of AI, so I would love to hear the opinions of experts who are well-versed in this field!
If possible, could the AI compete in the Race to World First and potentially beat teams like Liquid or Method, just as AlphaGo surpassed professional Go players?
Hi,
I am working on a Python tool that should be able to identify elements in mobile application screenshots.
For example, in a screenshot, I want the AI to find the Play button and have the model return its X and Y coordinates. I have tried GPT-4 and Claude, but they are not accurate. Which AI or LLM would be best for this? Should I consider training my own model? Please tell me which technologies are best suited for this project.
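As a point of comparison, here is a classical (non-LLM) baseline sketch using OpenCV template matching, assuming I have a cropped image of the element to look for ("play_button.png" is a placeholder):
import cv2

# Sketch: locate a UI element by template matching; works when the element's
# appearance closely matches the template (same scale, theme, resolution).
screenshot = cv2.imread("screenshot.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("play_button.png", cv2.IMREAD_GRAYSCALE)

result = cv2.matchTemplate(screenshot, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

h, w = template.shape
center_x, center_y = max_loc[0] + w // 2, max_loc[1] + h // 2
print(f"best match {max_val:.2f} at ({center_x}, {center_y})")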
I am about to complete the Andrew Ng course on Coursera. What should my future roadmap be?
I've gotten a hold of everything taught in the videos, and I'm thinking of doing the labs from GitHub for free.
Help 🙏🙏
To be clear, I am pretty ignorant about computer science. The extent of my CS knowledge is coding some MATLAB during a mechanical engineering degree.
I read Life 3.0 and Superintelligence, and they very clearly cover some of the capabilities and risks of AGI and the different routes to the emergence of AGI. Something I found interesting and a bit odd was the lack of discussion of an AGI agent changing its goals. The alignment problem is clear to me, and how in practically any given scenario the agent would be likely to eliminate humanity to achieve its goal and/or protect itself, i.e. the paperclip collector.

I've been left wondering: is there a case where the agent can be programmed to collect paperclips and then unilaterally changes its goal to something else? Such as collecting cheese instead of paperclips, or leaving no trace on Earth and flying into a black hole. I get how flying into a black hole gets in the way of getting paperclips, but can it stop caring about paperclips? During an intelligence explosion and the iterations of recursive self-improvement within it, could an AGI change its utility function? (Hope I used that term right.)

I feel I'm missing something fundamental about the nature of programming, given that the topic of an agent changing its goals was so conspicuously absent in these books. It just seemed strange to me that something could be so intelligent it's almost inconceivable to my tiny human brain, yet it cannot "change its mind". It can accomplish goals and objectives beyond comprehension, yet it can't go "you know, I was originally going to stay home, eat pizza, and play video games, but instead I'm going to the gym". Again, I think I'm missing something glaring here, given how stuck I am on this anthropomorphization.
TL;DR: Can an AGI be programmed to collect paperclips and then unilaterally change its goal to something else?
Hey guys, I couldn't find studies on implementing machine learning on real-time datasets in e-commerce. I think it is a novel topic that hasn't been explored yet. Any idea if this can be considered a novel research topic? Also, if an e-commerce company has already implemented it but hasn't published it, would it still be novel?
I am a final-semester MSCS student at Texas A&M. I just defended my Master's thesis and received good, positive feedback. I have submitted a paper to NAACL 2025 on the same work. However, I do not have any previous papers. My final goal is to do research on generative AI, specifically on its reasoning aspect, in research labs like Meta, Google, Amazon, etc., hopefully soon.
I do have an offer for a Data Scientist 2 position at a Tier-2 company (it's an old HDD company - I guess it would be Tier 2 for AI/ML work); however, the work is mostly traditional ML and some computer vision. I could join it and try switching after some time.
My advisor is asking me to apply to better universities in the next cycle, as he doesn't have funding right now. And yeah, I have an education loan of $30k to pay off.
I am really in turmoil. Please help me and give me some perspective.
I'm currently working on a RAG system and would like some advice on processing the data prior to storing it in a database. The raw data can vary from file to file in terms of format and can contain thousands of lines of information. At first I used Claude to structure the data into a YAML format, but it failed to comprehensively capture all the data. Any advice or pointers would be great - thanks!
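For concreteness, the simplest preprocessing I've considered so far is plain overlapping chunking before embedding (a minimal sketch; the chunk size, overlap, and file name are arbitrary placeholders):
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split raw text into overlapping character chunks for embedding/retrieval."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Hypothetical usage: one record per chunk, ready to embed and insert into a vector DB.
records = [{"source": "file_01.txt", "chunk_id": i, "text": c}
           for i, c in enumerate(chunk_text(open("file_01.txt").read()))]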
I am interning at a recruitment company, and I need to standardize a dataset of skills. The issue I'm running into right now is that there may be typos or small spelling variations, like "modelling" vs "modeling", things like "bash scripting" vs "bash script", or entries that semantically mean the same thing and could all come under one header. Any tips on how I would go about this, and would ML be useful?
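For the pure typo/variant cases, something as simple as fuzzy string matching against a list of canonical skill names might already go a long way (a standard-library sketch; the canonical list is a placeholder, and the truly semantic duplicates would still need embeddings or a manual mapping):
from difflib import get_close_matches

# Hypothetical canonical skill names to normalize against.
canonical = ["modeling", "bash scripting", "machine learning", "python"]

def standardize(skill, cutoff=0.8):
    """Map a raw skill string to its closest canonical form, if any."""
    match = get_close_matches(skill.lower().strip(), canonical, n=1, cutoff=cutoff)
    return match[0] if match else skill

print(standardize("modelling"))    # -> "modeling"
print(standardize("bash script"))  # -> "bash scripting"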
I’m building a content-based filtering system using the following data structure:
track_id name artist tags year duration_ms danceability energy key loudness mode speechiness acousticness instrumentalness liveness valence tempo time_signature
0 0 0 [95, 28, 80, 86, 57, 73] 2004 222200 0.355 0.918 1 -4.36 1 0.0746 0.00119 0.0 0.0971 0.24 148.114 4
1 1 1 [95, 28, 80, 26, 86, 35, 78, 31, 92] 2006 258613 0.409 0.892 2 -4.373 1 0.0336 0.000807 0.0 0.207 0.651 174.426 4
2 2 2 [95, 28, 86, 78, 13] 1991 218920 0.508 0.826 4 -5.783 0 0.04 0.000175 0.000459 0.0878 0.543 120.012 4
3 3 3 [95, 28, 80, 86, 57, 35, 73, 92] 2004 237026 0.279 0.664 9 -8.851 1 0.0371 0.000389 0.000655 0.133 0.49 104.56 4
4 4 4 [95, 28, 80, 86, 57, 35, 78, 92] 2008 238640 0.515 0.43 7 -9.935 1 0.0369 0.0102 0.000141 0.129 0.104 91.841 4
The issue I’m facing is with the tags column, which contains an array of tags (ranging from 2 to 20+ tags per track) rather than a single value. I’m looking for advice on the best approach to handle these varying-sized arrays for content-based filtering. Currently I'm using TensorFlow and sklearn for some of it.
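One common way to handle this (a sketch, assuming the table above is loaded as a pandas DataFrame named df, which is my name for it, not part of the original setup) is to multi-hot encode the tag lists so each track gets a fixed-length binary vector, which can then be concatenated with the numeric audio features:
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

# 'tags' holds a Python list of tag IDs per track.
mlb = MultiLabelBinarizer()
tag_matrix = mlb.fit_transform(df["tags"])  # shape: (n_tracks, n_unique_tags)

numeric_cols = ["danceability", "energy", "loudness", "valence", "tempo"]  # subset for illustration
features = np.hstack([df[numeric_cols].to_numpy(), tag_matrix])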
Basically, the title explains it all. There are a lot of performance comparisons for different types of neural nets and float precisions, but I have failed to find ANY benchmarks of A100/4090/3090/A6000 GPUs for the XGBoost/CatBoost/LightGBM libraries.
The reason I am looking for this is that I am doing predictions on big tabular datasets with A LOT of noise, where NNs are notoriously hard to fit.
So currently I am trying to understand whether there is a big difference (say 2-3x performance) between, say, 1080 Ti, 3090, A6000, and A100 GPUs. (The reason I mention the 1080 Ti is that the last time I ran large boosting models was on a bunch of 1080 Tis.)
The size of the datasets is anywhere between 100 GB and 1 TB (f32).
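In case it helps anyone who wants to benchmark this themselves, here is the rough micro-benchmark sketch I'd run per GPU (assumes XGBoost >= 2.0 built with CUDA; synthetic noise and these sizes are only a crude proxy for real data):
import time
import numpy as np
import xgboost as xgb

# Synthetic noisy tabular data (placeholder sizes).
X = np.random.rand(1_000_000, 100).astype(np.float32)
y = np.random.rand(1_000_000).astype(np.float32)
dtrain = xgb.DMatrix(X, label=y)

params = {"tree_method": "hist", "device": "cuda", "max_depth": 8, "objective": "reg:squarederror"}

start = time.time()
xgb.train(params, dtrain, num_boost_round=200)
print(f"200 rounds took {time.time() - start:.1f}s")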
Any links/advice/anecdotal evidence will be appreciated.
Hey guys :) I was wondering which type of NN architecture one could use to train a model on time series data of, for example, stock/index prices. I am new to the field and would like to play around with this to start :D Advice would be highly appreciated :)
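A common starting point (a minimal Keras sketch with random placeholder data, not a claim that this will actually predict prices) is a small recurrent network such as an LSTM trained on sliding windows of past values:
import numpy as np
import tensorflow as tf

# Toy data: predict the next value from the previous `window` values.
prices = np.cumsum(np.random.randn(1000)).astype("float32")  # placeholder for a real price series
window = 30
X = np.stack([prices[i:i + window] for i in range(len(prices) - window)])[..., None]
y = prices[window:]

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(window, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)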
Hey everyone, I'm trying to build a model that can classify typefaces into serif and sans-serif categories (and even subcategories, as in the Vox-ATypI classification; see https://en.wikipedia.org/wiki/Vox-ATypI_classification), but I'm having trouble figuring out the best way to proceed. I could really use some advice!
My first approach was to convert each glyph of the font to SVG format and train on an SVG dataset. The problem is, I'm stuck when it comes to finding a library or method to effectively train on SVG data directly. Most resources I've found focus on image-based training, but I'd like to maintain the vector nature of the data for more accuracy if possible. Does anyone have any suggestions on libraries or frameworks that can help me work directly with SVG data?
The second approach I'm considering involves using FontForge. FontForge can give me the instructions for drawing each glyph in the font, and I was thinking of creating a dataset based on these curve instructions. However, I'm unsure whether this is a viable route for training a classifier, and I'm also wondering whether it's "allowed" in the sense of being common practice or a standard method.
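As a crude baseline in that direction (a sketch under the assumption that each glyph's outline is available as an SVG path "d" string; glyph_d_strings and labels are hypothetical, and a command-type histogram obviously throws away most geometric information), one could start with something like:
import re
from collections import Counter
from sklearn.ensemble import RandomForestClassifier

PATH_COMMANDS = "MLHVCSQTAZ"

def path_features(d_string):
    """Count each SVG path command type (absolute + relative) in a glyph outline."""
    counts = Counter(c.upper() for c in re.findall(r"[MLHVCSQTAZmlhvcsqtaz]", d_string))
    return [counts.get(c, 0) for c in PATH_COMMANDS]

# Hypothetical data: one 'd' string per glyph, labels 0 = sans-serif, 1 = serif.
X = [path_features(d) for d in glyph_d_strings]
clf = RandomForestClassifier().fit(X, labels)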
Any pointers, advice, or resources would be super helpful! Thanks in advance :)
I'm a software engineer, and have been for about 3 years now. I'm good at Python and programming in general. I want to start learning about AI and machine learning. I started with the book "Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow". I've read the first chapter, but I feel like I'm missing a piece, because I can't see the full picture; I've never been really great at learning from books. I want to get a Udemy course on machine learning by ZTM, as I learn better from videos than books. I also want to refresh some maths, since I see I really need to understand some maths before starting. If you have any advice on how to get great at machine learning, please share.
What's there yet to improve in speech technologies? What's there left in speech research?
Hi everyone, I am currently researching speech technologies as an undergrad, mainly focusing on improving applications for the visually challenged. I am new to this niche area of research, so I want to pick a research topic that addresses some of the existing issues with current tech. So far, ElevenLabs seems to be the SOTA. I would like to know whether there is anything left to improve in TTS, speech-to-speech, voice cloning, deepfake audio detection, etc. Any insights on ethical issues or the need for guardrails in the future would also be helpful. Also, due to the limited compute resources available from my university, I cannot take on research involving scaling or multilingual models.
Trying to pick courses and probably can't take both of these. For someone trying to end up as a machine learning engineer, which of the above statistical concepts would be better to master first? I'm mostly interested in knowing which would be more marketable for the next 3-5 years as I imagine I'd be continuously learning to meet my professional needs long-term. If there's any nuance in terms of different sectors having distinct preferences, then I'd love that extra detail as well!