2,787,885 Subscribers


[P] Camera based monitoring of infant's breathing

Hi! I recently have seen systems that monitor breathing rate of an infant through camera. I have read several articles on that topic, where people used things like 3D camera, RGB or Interferometric Radar Sensor. Do you guys have any idea on how to accurately measure this?

02:58 UTC


[P] [p] I need a search algorithm model using AI/ML. For example, when I select B.E., it should automatically display Bachelor of Engineering.

I need a search algorithm model using AI/ML. For example, when I select B.E., it should automatically display Bachelor of Engineering. For this query which model I prefer any idea

00:48 UTC


[D] How Do You Track Projects in a Scaling ML Team?"

I am part of a Machine Learning team that has experienced significant growth recently. When we were a small team, tracking projects was straightforward. However, as the team has expanded, it's become increasingly challenging to keep track of everything. We are part of a larger corporation, so we have access to tools for creating epics and boards. However, these corporate tools are too generic and don't provide the level of detail I need for internal management. Specifically, I'm looking for a way to track model versions, dataset versions, and the overall status of our projects. I'd also like to be able to assign team members to projects.

Currently, we use a MIRO board, but it's disorganized and difficult to read and update. I'd love to hear what tools or strategies you've used for similar situations, especially since our team is expected to grow even more, making tracking increasingly complex.

00:14 UTC


[D] What are some effective dimensionality reduction (unsupervised feature selection) techniques for a high dimensional, sparse dataset?

I am considering comparing mutual information scores, but I also don't think I understand MI well enough.

For example, I(X;Y) = H(X) + H(Y) - H(X,Y). To me, visualizing H(X) and H(Y) as venn diagrams and H(X,Y) as the information from both X, Y (like an overlapping venn diagram) makes me think that when X, Y are disjoint, then MI is 0 and when X, Y overlap completely, then the MI score will be high. So, I'm thinking that a high MI value is "bad" since this means X, Y would be redundant. I am not sure if my understanding here is correct.

Another method I have tried is to binarize the data for each feature (represented as rows in my dataset) using "present" (1) and "absent" (0). The main issue I have run into doing this is that I am trying to then create a distribution to compare the features (such as seeing what percent of 1s and 0s I find in each feature), but here is the issue:

Let's say that feature A has 50% 1s and 50% 0s, and feature B also has 50% 1s and 50% 0s. So, it will look as if the distribution of their values is identical, though it could be that feature A and B are "opposites":

Feat. A: [0, 0, 1, 1]

Feat. B: [1, 1, 0, 0]

So, I wonder if there is a better way to compare the distributions of the features once I have made the data "present" (1) and "absent" (0).

I am also looking at making a Probability Density Function for each feature to compare them, but it's not clear to me how I would go about creating such a PDF for each feature given that I don't know what the probabilities associated actually are. Should I be binning the data then finding what percentage falls in these intervals?

Overall, I am looking for advice on where to find useful information on how to compare features for unsupervised feature selection, particularly in regards to how to use and compare mutual information scores, how to create PDFs for features, and how to compare distributions between features after they have been binned to avoid the problem I mentioned (with how [0, 0, 1, 1] and [1, 1, 0, 0] would appear to have the same distribution).

Relevant textbook resources and other reliable source recomendations would be much appreciated.

Thank you.

1 Comment
23:19 UTC


[D] Best interface to use LLMs for code: Chat or completion?

Hi everyone,

I am quite interested in understanding what are the feedback from the community in terms of interface to leverage LLMs for code productivity.

Because LLMs tend to do mistake I have mostly used Chat-like interfaces, like ChatGPT, as they allow to interact with the model and converge to a conclusion.

I haven't used Copilot for a while but my feeling was that it could do some boilerplate correctly but then it quickly started suggesting code that would be misleading and could actually hurt productivity.

It might have changed since then but that was my feeling back then.

What is your favorite option and why?

View Poll

23:09 UTC


[D] ML input data has to be derived from a larger dataset

Hello everyone. I am curious to know if anyone has encountered a ML problem like this and if so, I seek your advice. Usually in ML classification such as the IRIS dataset, each row represents a sample and each column a parameter, right ! My problem is that my ML classification parameters have to be derived from a range of values (parent data). I have taken mean of the parent values to generate the parameters for the ML input data. This results in lower classification accuracies using Random forest and XGBoost.

Has anyone encountered a similar situation like this where the data has to be generated from a range of other datasets? Is there any other way to do this? I did not find any papers or articles from the web so just asking.

I can generate additional parameters from other statistics such as median, standard deviation etc. which can improve the classification accuracy but can make interpretation of the results a little weird, domain wise. I wish to avoid this if possible.


Added a pdf file to explain the problem a bit more clearly


23:06 UTC


[D] Book review for Meta's ML Design interview? Machine Learning System Design Interview (by Ali Aminian and Alex Xu)

I'm preparing for the ML system design interview for Meta, and I searched for various resources. This book (ML System Design Interview (by Ali Aminian & Alex Xu)) seems like a solid structured resource that covers solutions to case studies in detail. Has anyone used it to prepare for Meta's ML System Design interview? Thoughts?

Khang's book doesn't seem to have great reviews.

Chip Huyen's book (Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications) doesn't seem very focused on interview prep??

Also, happy to hear about other cool resources to prepare. Thanks very much!

19:27 UTC


[R] Open X-Embodiment: Robotic Learning Datasets and RT-X Models - DeepMind 2023 - RT-X exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms!

Blog: https://www.deepmind.com/blog/scaling-up-learning-across-many-different-robot-types

https://robotics-transformer-x.github.io/ here you can also find the Datasets and Code!

Paper: https://robotics-transformer-x.github.io/paper.pdf


Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train “generalist” X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms.





19:03 UTC


[D] Biggest problems with ML in industry?

For all my corporate ML engineers I have a question, what are the most annoying / biggest problems you face when developing/deploying ML in industry?

This can be anywhere from data, to tuning, to even MLOPS.

18:57 UTC


[D] Difficulty with paper implementations on google colab

I am not from CS background, my knowledge is from online courses and books. All of which used some variation of Jupyter notebook. My knowledge of code can be lacking sometimes, since I am not from CS background.

I am trying to implement some computer vision paper codes on newer samples. I understand the papers, and the underlying mechanisms. However, I fail to decipher the codes provided with the associated github repository. Usually, these repository contains information on how to recreate the experiment on some specific data using shell. But I am using google Colab for this purpose, as I don't have access to GPU, and I found it impossible to recreate the experiments in the google Colab, using shell commands, let alone extend it to newer samples.

I would appreciate some help in this regard, I haven't done this before, and there aren't really any tutorial/resource on how to do this. Ideally, what I am trying to do is separate the model, input some images, get the output, and interpret it. I am stuck, and I would really appreciate some help or advice in this regard. Right now I am trying to work with this paper, meta ood

I would appreciate any help/advice/resource anything. I feel very lost. Thanks in Advance.

18:49 UTC


[D] Should I learn pytorch or tensorflow and which would be better for jobs?

I've been learning tensorflow recently and even made a beginner project with it. However when I was introduced to this reddit, I saw many hating on tensorflow and recommending pytorch though these posts are a few years old.

I would like to know which library should I learn for jobs, large scale projects and MLOps( I heard there is something called tensorboard for tensorflow which can help with this)

Excuse my english as this is my second language. Thanks in advance

17:41 UTC


Repurposing a personal desktop computer [P]


I'm debating turning my old desktop (old CPU but relatively new GPU 3980 or 90) into a ML box that I can remote into. I'm sure people here have done something similar and I was wondering if anyone could point me towards some resources for getting it off the ground/any pitfalls to avoid/suggestions.

I'm an active data scientist researcher for my job and this would just be for fun side projects but I have some pretty glaring holes in my knowledge of computers (like the best way to set this up - should I uninstall windows install unbuntu or is windows fine?)

Honestly I'm sure my ignorance will be pretty apparent from the questions I'm asking/not asking so any advice anyone has would be welcome!

Thanks! Sorry if this is the wrong subreddit for this sort of thing.

16:58 UTC


[R] Generative memory: generative diffusion models are equivalent to modern Hopfield nets

16:31 UTC


[D] Using LLMs for non-language based applcaitions

Hi everyone,
I'm working on a project which explores the use of LLMs for non-language based applications. Could you tell me how could I find an LLM and provide it with a dataset of my own and train it from scratch? Any help on how to get started would be greatly appreciated cause I only have a beginner level knowledge of LLMs and machine learning.

16:07 UTC


[D] Stuck in Automation of AI models

Hello everyone!

I'm currently working on a project and have hit a roadblock in automating the deployment of my machine-learning models. Can anyone provide guidance on the best practices or tools for streamlining the deployment process? Specifically, I'm looking to create a seamless workflow where models can be easily uploaded, deployed on the cloud, and accessible through APIs. Any insights or advice would be greatly appreciated!


16:06 UTC


[P] The Case of the Missing Masterpiece

Hi, I just wanted to share an applied image classification problem that I worked on a few years ago: https://vdalv.github.io/2018/09/01/missingMasterpiece.html

16:06 UTC


Need to build a XAI model to explain the behaviour of an IDS [P]

Hello, I need help from someone that knows about XAI. I have to create a XAI model to intérprete the resulta of an AI model, an MLP, that works as an IDS classifier. I have no idea on how to do It and I have been completely blocked for 2.5 years. This is the final project of my career and I just don't know how to do It, and my tutor isn't very helpful. If anyone is able to help I would explain him what I have to do and would be very grateful.

Thanks for your help

1 Comment
15:59 UTC


[D] Optimal scheduling tool with AI/ML recommendations

Hello all,

I'm trying to plan out for a new web platform development for workforce management but have little experience. We all know that hard coding can be done for general scheduling, including manager polling shifts based on labor category, staff assignments, conflt resolving, emergency scheduling, etc. But what I want to research to is....how can I ensure that one optimal schedule is automatically computed using AI/machine learning tools so that I don't have to go through the list of hard-coded generated schedules (I’m sure these will work fine, but still want to compute one ultimate schedule).

13:55 UTC


[R] Break-A-Scene: Extracting Multiple Concepts from a Single Image

Break-A-Scene: Given a single image with multiple concepts, annotated by loose segmentation masks, our method can learn a distinct token for each concept, and use natural language guidance to re-synthesize the individual concepts or combinations of them in various contexts.

Project Page: https://omriavrahami.com/break-a-scene/

Code is publicly released!


Text-to-image model personalization aims to introduce a user-provided concept to the model, allowing its synthesis in diverse contexts. However, current methods primarily focus on the case of learning a single concept from multiple images with variations in backgrounds and poses, and struggle when adapted to a different scenario. In this work, we introduce the task of textual scene decomposition: given a single image of a scene that may contain several concepts, we aim to extract a distinct text token for each concept, enabling fine-grained control over the generated scenes. To this end, we propose augmenting the input image with masks that indicate the presence of target concepts. These masks can be provided by the user or generated automatically by a pre-trained segmentation model. We then present a novel two-phase customization process that optimizes a set of dedicated textual embeddings (handles), as well as the model weights, striking a delicate balance between accurately capturing the concepts and avoiding overfitting. We employ a masked diffusion loss to enable handles to generate their assigned concepts, complemented by a novel loss on cross-attention maps to prevent entanglement. We also introduce union-sampling, a training strategy aimed to improve the ability of combining multiple concepts in generated images. We use several automatic metrics to quantitatively compare our method against several baselines, and further affirm the results using a user study. Finally, we showcase several applications of our method.

13:46 UTC


[D] It's not really intelligent because it doesn't flap its wings.

Time and time again I see people claiming that AI is not 'really' intelligent. I have some thoughts on the matter and welcome any critique of my position:

The fact that LLMs don't do things like humans is irrelevant. Planes fly without flapping their wings, yet you would not say it's not "real" flight. Why is that? Well, its because you understand that flight is the principle that underlies both what birds and planes are doing and so it the way in which it is done is irrelevant. This might seem obvious to you now, but prior to the first planes, it was not so obvious and indeed 'flight' was what birds did and nothing else.

The same will eventually be obvious about intelligence. So far you only have one example of it (humans) and so to you, that seems like this is intelligence and that can't be intelligence because it's not like this. However, you're making the same mistake as anyone who looked at the first planes crashing into the ground and claiming - that's not flying because it's not flapping its wings. As LLMs pass us in every measurable way, there will come a point where it doesn't make sense to say that they are not intelligence because "they don't flap their wings".

13:05 UTC


[R] MIT, Meta, CMU Researchers: LLMs trained with a finite attention window can be extended to infinite sequence lengths without any fine-tuning

LLMs like GPT-3 struggle in streaming uses like chatbots because their performance tanks on long texts exceeding their training length. I checked out a new paper investigating why windowed attention fails for this.

By visualizing the attention maps, the researchers noticed LLMs heavily attend initial tokens as "attention sinks" even if meaningless. This anchors the distribution.

They realized evicting these sink tokens causes the attention scores to get warped, destabilizing predictions.

Their proposed "StreamingLLM" method simply caches a few initial sink tokens plus recent ones. This tweaks LLMs to handle crazy long texts. Models tuned with StreamingLLM smoothly processed sequences with millions of tokens, and were up to 22x faster than other approaches.

Even cooler - adding a special "[Sink Token]" during pre-training further improved streaming ability. The model just used that single token as the anchor. I think the abstract says it best:

We introduce StreamingLLM, an efficient framework that enables LLMs trained with a finite length attention window to generalize to infinite sequence length without any fine-tuning. We show that StreamingLLM can enable Llama-2, MPT, Falcon, and Pythia to perform stable and efficient language modeling with up to 4 million tokens and more.

TLDR: LLMs break on long convos. Researchers found they cling to initial tokens as attention sinks. Caching those tokens lets LLMs chat infinitely.

Full summary here

Paper link: https://arxiv.org/pdf/2309.17453.pdf

12:56 UTC


Openai api function call as "gatekeeper keeper" in app? [D]

So I am building an app and have multiple functions the LLM can use. But some need to ensure they need specific data - for example if I say "please add x to my todo for next monday" it works but only if I ensure in the recent prompt is the day and date. Now if the user says this deep in the conversation this system info can be forgotten. And you can imagine other cases too.

So it seemed to me to give it all functions, but without properties and when one is "used" it actually triggers another call to the api with the background data needed for that particular function.

So basically any function call becomes two api calls. So a bit wasteful, but this seems right to me.

Any thoughts? Or anyone taken this approach?

12:33 UTC


[D] Really good dataset for a Course Capstone

Hey everyone!

My friends and I are taking a Data Science course in our university. We are modestly versed in ML/DL techniques, and want to use everything we know on a really good capstone project for this course. We are looking for a dataset where we can demonstrate a nice variety of techniques to really blow the socks off our Professor.

Ideally we'd like this to be stemming from something basic that most would consider "Data Science", as in something with a tabular dataset and elements of classification. Though we still want chances to bring in what we know from outside the course: for example, if there's images to supplement the dataset we could use Image Classification models or something multimodal to bring in more features, if there's natural language data then we could use LLMs to extract salient features etc. More importantly though, we want something whose exploration can be really motivated so it doesn't seem we're only in it for the ML aspect.

Thank you!

1 Comment
12:09 UTC


[D] Competitiveness in ML research

I've been diving deep into the world of machine learning research, and I'm genuinely baffled: how on Earth do some researchers seem to pump out paper after paper? I mean, there's only 24 hours in a day, right?

Are academic minions (i.e. PhD students) doing all the heavy lifting? Or maybe some highly efficient workflows I'm not privy to?

On a more serious note, I would like a career in ML, and the sheer volume and pace of these publications is making me feel a bit disheartened.

How is this prolificity possible? Any words of encouragement or advice?

11:43 UTC


[D] Why should I use a hosted/cloud VectorDB solutions over a serverless or vector store?

Why the hell should i use cloud based or server hosted solution over a easy peasy servless variant like lancedb or even faiss vector store is enough for most of the use cases on small-medium

I often see posts like

"oh my stack is... pinecone Chroma weaviate_io"

And they just ingest minisets of data, what the hell man

11:41 UTC


[P] Assistance Urgently Needed: Final Year Project

Hi everyone, I am currently in my final year of my computer science degree and about to begin working on my final year project, but I LITERALLY have NO PROJECT IDEA.

I have a keen interest in Artificial Intelligence and I've just completed the AWS Students Deepracer scholarship prequalifying course, which has sparked my interest in Reinforcement Learning for Self-Driving Vehicles. I would love to do something on Self-Driving technologies, but I do not know what, how or why.

Can anyone please help/guide me about it?

1 Comment
11:40 UTC


[P] FontoGen: generating true-type fonts

I'd like to share a project that I've spent a few weekends working on. FontoGen is an autoregressive encoder-only transformer model that's capable of generating true-type fonts.

GitHub: https://github.com/SerCeMan/fontogen

Weights: https://huggingface.co/SerCe/fontogen

Blog post with more details: https://serce.me/posts/02-10-2023-hey-computer-make-me-a-font

The project is largely an exploration of whether generating fonts natively, line by line, is possible. I'm not aware of any previous research that would achieve the same results for complete fonts previously. This is my first ML-specific project, and I would appreciate any feedback on the model architecture, and I'm also happy to answer any questions you may have.

10:28 UTC


[D]Does it matter if I have an amd cpu as long as the GPU is nvidia?

I'm planning to buy a laptop for ml with 16 gigs of vram. Majority of them are equipped with an rtx 3080 but have an amd cpu. Is an Intel CPU necessary to run CV related libraries?

09:44 UTC


[D] Classification based on graph sub-communities

Hey all, I am working on a project that will require classification based on sub-communities formed within the main graph. Please help me out by directing me to appropriate resources or sharing your views on how you would approach this. Thank you.

09:22 UTC

Back To Top