/r/computervision

Computer Vision is the scientific subfield of AI concerned with developing algorithms to extract meaningful information from raw images, videos, and sensor data. This community is home to the academics and engineers both advancing and applying this interdisciplinary field, with backgrounds in computer science, machine learning, robotics, mathematics, and more.

We welcome everyone from published researchers to beginners!

Content which benefits the community (news, technical articles, and discussions) is valued over content which benefits only the individual (technical questions, help buying/selling, rants, etc.).

If you want an answer to a query, please post a legible, complete question that includes details so we can help you in a proper manner!

90,082 Subscribers

1

Project suggestions

Any unique projects to build to stand out on a resume?

0 Comments
2024/04/26
18:55 UTC

2

OCR model

I have an image of a monitor, and I need to read the vitals values from it using a camera. I am able to extract all the values and names using an OCR model, but how would I correlate the vitals values with their vital names, like Hr 78, Spo2 97, Pr 77, Rr 22?
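
One possible way to approach the pairing, sketched under assumptions: if the OCR model also returns a box position for each piece of text, every label can be matched to the nearest numeric reading. The `OcrItem` type, the centre coordinates, and the greedy nearest-neighbour rule below are all hypothetical and not tied to any particular OCR library.

```python
import math
from dataclasses import dataclass


@dataclass
class OcrItem:
    """One OCR result: recognised text plus the centre of its box (pixels)."""
    text: str
    x: float
    y: float


def is_number(text):
    """Return True if the OCR text parses as a numeric reading."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def pair_labels_with_values(items):
    """Greedily match each non-numeric label to the closest unused number."""
    labels = [item for item in items if not is_number(item.text)]
    values = [item for item in items if is_number(item.text)]

    # All label/value combinations, closest first.
    candidates = sorted(
        (math.hypot(label.x - value.x, label.y - value.y), li, vi)
        for li, label in enumerate(labels)
        for vi, value in enumerate(values)
    )

    pairs = {}
    used_values = set()
    for _, li, vi in candidates:
        name = labels[li].text
        if name in pairs or vi in used_values:
            continue  # this label or this value is already taken
        pairs[name] = float(values[vi].text)
        used_values.add(vi)
    return pairs
```

If the monitor layout is fixed, cropping a fixed region per vital before running OCR may be simpler and more robust than distance-based matching.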

7 Comments
2024/04/26
18:16 UTC

1

How to create a dataset to train a model to detect circles in an image?

Let's say I'm supposed to detect the image below within a rural city (for example, this logo is on the floor, and there are some people on it, maybe it's blurry, or only half of it is shown). How should I create a dataset for this? I don't think simply copying and pasting the logo onto an image would create a realistic dataset for training a model. What are your suggestions? I heard about histogram equalization as a method.

https://preview.redd.it/eqaxh622puwc1.png?width=1000&format=png&auto=webp&s=f1fc14837d4c5c78df29afff46b9f2d575368098

0 Comments
2024/04/26
16:34 UTC

0

One Model, or Two Models?

I am going to train a YOLO-v9e model for object detection. Accuracy is very, very important to me, and I have lots of data. The main task is to detect pedestrians and vehicles. Would you suggest training two separate models (one for pedestrian detection and the other for vehicle detection), or one model for both?

5 Comments
2024/04/26
16:26 UTC

1

YOLOv8 optimization

When I run my YOLOv8 model (YOLOv8s, trained for 200 epochs, 22 MB), it gives me a frame roughly every 30 seconds, which is really bad. I'm running it on a Raspberry Pi 4 with a webcam. Is there any way to optimize it? What should I do? (Sorry if I didn't give enough information; this is my first time.)

11 Comments
2024/04/26
13:54 UTC

1

yolov7 additional graphs

Hello, I am currently training the yolov7 model for a polyp detection task.

!git clone https://github.com/WongKinYiu/yolov7.git  # clone
%cd yolov7
!pip install -r requirements.txt      # install modules
!wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt # download pretrained weight

!python train.py --weights yolov7.pt --data "data/data.yaml" --workers 4 --batch-size 32 --img 416  --cfg cfg/training/yolov7.yaml --name yolov7 --epochs 5 --hyp data/hyp.scratch.p5.yaml

The default model provides certain specific graphs, like F1 vs confidence, precision vs confidence, etc.

I would like to have different graphs, something like precision vs epochs, recall vs epochs etc.

Is there any way to retrieve these kinds of graphs?
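
A minimal sketch of one way to get per-epoch curves, assuming the training run writes a whitespace-separated `results.txt` (for the command above, presumably under `runs/train/yolov7/`) with one line per epoch, the first field formatted as `epoch/last_epoch`, and precision and recall in columns 8 and 9 (0-based). The path and the column positions are assumptions and should be checked against the actual file before relying on them.

```python
def parse_results(lines, precision_col=8, recall_col=9):
    """Extract (epochs, precision, recall) lists from results.txt lines.

    Column indices are assumptions about the log layout; adjust them
    if the file written by your training run is laid out differently.
    """
    epochs, precision, recall = [], [], []
    for line in lines:
        fields = line.split()
        if len(fields) <= max(precision_col, recall_col):
            continue  # skip blank or truncated lines
        epochs.append(int(fields[0].split("/")[0]))
        precision.append(float(fields[precision_col]))
        recall.append(float(fields[recall_col]))
    return epochs, precision, recall


# Example usage (requires matplotlib; the path is an assumption):
#
#   import matplotlib.pyplot as plt
#   with open("runs/train/yolov7/results.txt") as f:
#       epochs, precision, recall = parse_results(f)
#   plt.plot(epochs, precision, label="precision")
#   plt.plot(epochs, recall, label="recall")
#   plt.xlabel("epoch")
#   plt.legend()
#   plt.savefig("precision_recall_vs_epoch.png")
```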

2 Comments
2024/04/26
13:27 UTC

2

Vision AI use case: create graph database from images

Which modern vision model would be able to solve the following problem accurately?

https://preview.redd.it/nwexvlhlitwc1.png?width=849&format=png&auto=webp&s=1ad6c015718943741c39866dd4a8c0c59e6c5fc3

I used the free "Gemini chat" function here. I heard the context window of the Gemini model is the largest, but perhaps not in the free version. The maps I want to feed in are quite large, and there are multiple types of connections (e.g. national rail, high-speed rail, ...)

The goal is to create a set of

- nodes (with features: name, latitude, longitude, ...)

- edges (with features: type, duration, length, ...)
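
Whatever model ends up extracting the data, the target structure described above could be sketched roughly as follows. The field names beyond those listed in the post, the units (minutes, kilometres), and the `RailGraph` container are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str
    latitude: float
    longitude: float


@dataclass(frozen=True)
class Edge:
    source: str          # name of one endpoint node
    target: str          # name of the other endpoint node
    type: str            # e.g. "national rail" or "high-speed rail"
    duration_min: float  # assumed unit: minutes
    length_km: float     # assumed unit: kilometres


@dataclass
class RailGraph:
    nodes: dict = field(default_factory=dict)  # node name -> Node
    edges: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.name] = node

    def add_edge(self, edge):
        # Reject edges that reference stations that were never added.
        for endpoint in (edge.source, edge.target):
            if endpoint not in self.nodes:
                raise KeyError(f"unknown node: {endpoint}")
        self.edges.append(edge)

    def neighbours(self, name, edge_type=None):
        """Names of nodes connected to `name`, treating edges as undirected."""
        result = []
        for edge in self.edges:
            if edge_type is not None and edge.type != edge_type:
                continue
            if edge.source == name:
                result.append(edge.target)
            elif edge.target == name:
                result.append(edge.source)
        return result
```

Asking the vision model to emit rows in exactly this shape (for example as JSON) would make its output easy to validate before loading it into a graph database.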

1 Comment
2024/04/26
12:35 UTC

1

How to obtain shape and pose parameters from an RGB image

I am working on a project that requires estimating the 3D human body shape and pose parameters. My goal is to use these parameters with the SMPL (Skinned Multi-Person Linear Model) to generate the 3D mesh representation, including the vertices and faces.

I have been using the ExPose library to generate human meshes from images, and I am able to obtain the SMPL-X parameters, which are saved in .npz files. However, I am unsure about how to proceed from here to obtain the actual vertices and faces of the 3D mesh using these SMPL-X parameters.

I am relatively new to this field, and I would appreciate if someone could guide me through the process of using the SMPL-X parameters (obtained from ExPose) to generate the 3D mesh vertices and faces.

0 Comments
2024/04/26
11:40 UTC

1

How to convert YOLOv4 darknet model to ONNX?

I scoured the internet for some working code. I asked ChatGPT, which told me to run pip install yolov4-onnx to install the library. The only problem is that no such library exists.

I even asked Google Gemini, and I was presented with this code: https://bin.0xfc.de/?e639e205f06b3ba2#HAgitVabH4xUwNFYSfetWnL8ngK2LsnbyGn8LSygVWJ3

(Use the password magneticelectron to access.)

But no luck: The traceback I'm getting is: raise HTTPError(req.full_url, code, msg, hdrs, fp) urllib.error.HTTPError: HTTP Error 404: Not Found

I also came across this repository https://github.com/Tianxiaomo/pytorch-YOLOv4 but it hasn't been of any use either.

3 Comments
2024/04/26
10:22 UTC

3

SOTA visual odometry methods

It seems visual odometry is quite under-developed compared to odometry methods using LiDAR or IMU sensors. I've been searching for well-developed open-source projects and ended up using DF-VO, which incorporates deep learning models into geometry-based solutions, for my research project. But the project is quite old (opened in 2020), and I wonder if there are any other SOTA open-source projects working on visual odometry.

1 Comment
2024/04/26
08:48 UTC

3

How to classify monkey images using a convolutional neural network, Keras Tuner hyperparameters, and transfer learning? (part 1)

https://preview.redd.it/kpqdun6ysrwc1.png?width=1280&format=png&auto=webp&s=cfcc2d8b3fb9bcba5385705c234d409b40ce034a

🎥 Image Classification Tutorial Series: Five Parts 🐵

In these five videos, we will guide you through the entire process of classifying monkey species in images. We begin by covering data preparation, where you'll learn how to download, explore, and preprocess the image data.

Next, we delve into the fundamentals of Convolutional Neural Networks (CNN) and demonstrate how to build, train, and evaluate a CNN model for accurate classification.

In the third video, we use Keras Tuner to optimize hyperparameters and fine-tune your CNN model's performance. Moving on, in the fourth video we explore the power of pretrained models, specifically focusing on fine-tuning a VGG16 model for superior classification accuracy.

Lastly, in the fifth video, we dive into the fascinating world of deep neural networks and visualize the outcome of their layers, providing valuable insights into the classification process.

Video 1: Data Preparation Tutorial

In this tutorial we will download the dataset, explore the data, and prepare the images for the next phase of building the CNN model.

Link for the tutorial is here: https://youtu.be/ycEzhwiAXjY

Here is the code: https://github.com/feitgemel/TensorFlowProjects/tree/master/Monkey-Species

Enjoy

Eran

#Python #Cnn #TensorFlow #Deeplearning #basicsofcnnindeeplearning #cnnmachinelearningmodel #tensorflowconvolutionalneuralnetworktutorial

0 Comments
2024/04/26
06:47 UTC

1

Best camera for computer vision on a Raspberry Pi?

Hello, I am currently developing a project which will detect radishes, either planted or by themselves; this will move a robot which will activate a method to harvest them, etc.

This will use TensorFlow Lite so it can run at a minimum of 5 fps on a Raspberry Pi, and I'm looking for a camera with accurate colors and exposure so it can detect the radish itself. I've been testing with a webcam, but its colors aren't accurate (they're washed out), so I am wondering which webcam I should use, or whether it would be better to get the Pi Cam modules (which are a lot more expensive).

Any advice?

2 Comments
2024/04/26
06:09 UTC

1

List the real-time detected and identified objects in SQL

How do I save the logs (list of results) of real-time object detection and identification using NoSQL?

The system does real-time object detection and identification on a live video feed and displays which object is detected, along with the DateTime, Type, Confidence Score, and Recording file.

Once my system has finished recording within the given time, all the results will be displayed in the Video Logs module.

Example
-----
DateTime: 2024-03-20 08:00:00

Recording: Recording 1

Confidence Score: 0.85

Fish Name: Clownfish

Type: Clear
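
The title mentions SQL while the question mentions NoSQL. The sketch below takes the SQL route with Python's built-in `sqlite3` module, purely as an assumption about what would fit. The table name, column names, and helper functions are hypothetical and simply mirror the fields in the example record above.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the detection log. ':memory:' keeps it in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS detections (
            id             INTEGER PRIMARY KEY AUTOINCREMENT,
            detected_at    TEXT NOT NULL,   -- e.g. '2024-03-20 08:00:00'
            recording      TEXT NOT NULL,
            confidence     REAL NOT NULL,
            fish_name      TEXT NOT NULL,
            detection_type TEXT NOT NULL
        )
        """
    )
    return conn


def log_detection(conn, detected_at, recording, confidence, fish_name, detection_type):
    """Append one detection; call this from the real-time detection loop."""
    conn.execute(
        "INSERT INTO detections "
        "(detected_at, recording, confidence, fish_name, detection_type) "
        "VALUES (?, ?, ?, ?, ?)",
        (detected_at, recording, confidence, fish_name, detection_type),
    )
    conn.commit()


def logs_for_recording(conn, recording):
    """Return all logged detections for one recording, oldest first."""
    cursor = conn.execute(
        "SELECT detected_at, recording, confidence, fish_name, detection_type "
        "FROM detections WHERE recording = ? ORDER BY detected_at",
        (recording,),
    )
    return cursor.fetchall()
```

A Video Logs view could then call `logs_for_recording(conn, "Recording 1")` once recording has finished. Passing a file path instead of `":memory:"` would persist the log between runs.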

0 Comments
2024/04/26
05:56 UTC

0

Too Many False Positives - Not sure where to begin troubleshooting

Project Link: https://universe.roboflow.com/testset-edjwf/f16-eesk5/dataset/2

I wanted to test if fully synthetic datasets could be used for real-time detection, but I am getting way too many false positives to pull any meaningful data from this project at the moment.

Project data:

~4700 training, 360 validation, 730 test.

1 class: F-16 jet. All training and validation images are synthetic; test data is annotated from airshow footage.

Models tried: YoloV9, Roboflow 3.0.

This is my first CV project, so I'm probably just missing some fundamentals here. When deployed, it will detect just about anything as an F-16: people, text, keyboards, different aircraft, etc. Not sure if it is the model, my data, lack of null images, or something else. Any advice is appreciated.

5 Comments
2024/04/26
05:13 UTC

4

Help, getting into CV

Hello fellow computer vision enthusiasts!

Soon I'm going to start my master's degree in CS for autonomous systems, and I will be dealing with a lot of image/video and signal processing. For now I have some free time, and I really want to become proficient in computer vision. What's the best roadmap I can follow, which books or YouTube channels should I learn from, and what projects or challenges should I focus on?

Thank you!

2 Comments
2024/04/26
02:07 UTC

1

Best tool?

I have a large image dataset of hands. I want to auto-caption/tag the dataset based on the hand pose in each image, as well as how many fingers are shown. How should I go about this? I'm not sure what to use to detect/train the hand poses/fingers.

0 Comments
2024/04/26
01:14 UTC

2

PlantVillage Dataset Disease Recognition using PyTorch

https://debuggercafe.com/plantvillage-dataset-disease-recognition-using-pytorch/

1 Comment
2024/04/26
00:39 UTC

1

Tips and tricks to enhance model performance

I'm building a model using YOLOv8 for object detection on video data. As a beginner, I've had some initial success by fine-tuning a pretrained model with a simple dataset. Now, I want to improve this model.

Beyond the usual advice of "get more data" (which I am already doing), I'm looking for other ways to enhance performance. Tweaking hyperparameters hasn't been very effective given my three-hour training constraint.

Does anyone have advice or resources that were beneficial to them in the past?

2 Comments
2024/04/26
00:24 UTC

0

Open Source Computer Vision App with UI

Hello legends, I’ve recently stumbled into your community and I’m blown away at what some of you are able to accomplish. I’ve been tasked to implement an open source computer vision project to detect missing labels on our assembly line. I know there are a lot of really powerful builds out there, but I’m more curious to ask if there are some with user interfaces to retrain or to have a button to click to take a picture and run the detection.

I’m sorry for the lack of detail, I’m new to a lot of this field. This is quite a stretch for me and any information you could provide would be greatly appreciated!

J

3 Comments
2024/04/25
18:35 UTC

0

Rabbit R1 AI Real World Uses First Impressions

0 Comments
2024/04/25
17:50 UTC

2

Help with Object Detection and Tracking

I have a project at my college about building software that does the following:

  1. Count red blood cells (RBCs) from a video.
  2. Count white blood cells (WBCs) from a video.

I'm new to this field, but I managed to gather some information.

So I used YOLOv9 for object detection and trained it on my custom data (I will leave a link to the dataset on Roboflow).

I'm using supervision to do the tracking part, but for some reason I can't track the cells correctly. The problems I've faced are:

  1. Object ID keeps repeating for different cells of the same type.
  2. Detecting new cells as old ones with an old ID.

Notes for your background:

The cells' shape changes throughout the video because they can deform.

The best method for tracking I can think of is to track the center of the cell regardless of its shape, but I can't do that either.

I've tried to use CenterTrack, but I can't get my head around it; it's very complicated and old, which obliges me to change a lot of the source code myself.

I'm open to using totally different models than the ones I've selected, so if you have an idea that will work better in my case, I'm all ears :)

Thanks in advance <3

The dataset used for training

My work on Google Colab

Sample of the input video
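
The idea of tracking the center of each cell regardless of its shape can be sketched as a small greedy centroid tracker. This is a minimal illustration, not the supervision or CenterTrack API: the class name, the `(x1, y1, x2, y2)` box format, and the `max_distance` / `max_missed` thresholds are all assumptions that would need tuning for real footage.

```python
import math


class CentroidTracker:
    """Greedy nearest-centroid tracker; IDs are never reused."""

    def __init__(self, max_distance=30.0, max_missed=5):
        self.max_distance = max_distance  # max centre jump between frames (px)
        self.max_missed = max_missed      # frames a track may go unmatched
        self.tracks = {}                  # track id -> last known (cx, cy)
        self.missed = {}                  # track id -> consecutive misses
        self.next_id = 1

    @property
    def total_seen(self):
        """Number of distinct IDs issued so far (a rough cell count)."""
        return self.next_id - 1

    def update(self, boxes):
        """Take this frame's boxes (x1, y1, x2, y2); return one ID per box."""
        centres = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes]

        # Every (track, detection) combination, closest first.
        candidates = sorted(
            (math.hypot(cx - tx, cy - ty), tid, i)
            for tid, (tx, ty) in self.tracks.items()
            for i, (cx, cy) in enumerate(centres)
        )
        assigned = {}          # detection index -> track id
        matched_tracks = set()
        for dist, tid, i in candidates:
            if dist > self.max_distance:
                break          # all remaining candidates are even farther
            if tid in matched_tracks or i in assigned:
                continue
            assigned[i] = tid
            matched_tracks.add(tid)

        # Age unmatched tracks and drop those missing for too long.
        for tid in list(self.tracks):
            if tid in matched_tracks:
                continue
            self.missed[tid] += 1
            if self.missed[tid] > self.max_missed:
                del self.tracks[tid]
                del self.missed[tid]

        # Refresh matched tracks; start new ones for unmatched detections.
        for i, centre in enumerate(centres):
            if i not in assigned:
                assigned[i] = self.next_id
                self.next_id += 1
            self.tracks[assigned[i]] = centre
            self.missed[assigned[i]] = 0

        return [assigned[i] for i in range(len(centres))]
```

Because IDs only ever increase, an old ID cannot be handed to a new cell once its track has been dropped. Cells that move farther than `max_distance` between frames, or that overlap heavily, would still confuse a tracker this simple.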

1 Comment
2024/04/25
16:27 UTC

3

Recent work on two stage object detection models?

I’ve been working with YOLO models for the past few years, but I'm thinking about trying some new stuff out and trying to get back up to date with what’s been going on in the field. There have been new YOLO advances every year or so, but Faster R-CNN is from 2015 and it seems like it’s still the standard in that family. Is that correct? I was reading about G-RCNN, but even that is from 2021, which isn’t exactly new. Has that line of research turned to DETR and other transformer models? Overall I’m trying to get a better sense of where the field is, and I'm having some trouble sorting through the info.

2 Comments
2024/04/25
14:24 UTC

1

Methods to filter segmentation masks.

An object is positioned on a rotating platform, allowing a stationary camera to capture multiple images. These images are then fed into Segment Anything Model to generate segmentation masks.

SAM produces high-quality masks, but there are some masks that don't meet the standard and I have to filter them out. Currently, I have to manually go through the masks and filter out the bad ones. Are there methods to automate this?

Current solution: counting contours and filtering by contour area.
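
One hedged option for automating the filter, assuming the masks come from SAM's automatic mask generator, whose output records carry `area`, `predicted_iou`, and `stability_score` fields alongside the mask itself. The function name and every threshold below are assumptions to be tuned on a handful of manually labelled good and bad masks.

```python
def filter_masks(masks, image_area,
                 min_area_frac=0.01, max_area_frac=0.9,
                 min_iou=0.88, min_stability=0.95):
    """Keep masks whose size and SAM quality scores fall in the given ranges.

    `masks` is a list of dicts with at least the keys 'area',
    'predicted_iou', and 'stability_score'. All thresholds are
    illustrative defaults, not recommended values.
    """
    kept = []
    for mask in masks:
        area_frac = mask["area"] / image_area
        if not (min_area_frac <= area_frac <= max_area_frac):
            continue  # speck of noise, or a mask covering the whole scene
        if mask["predicted_iou"] < min_iou:
            continue  # SAM itself rates this mask as low quality
        if mask["stability_score"] < min_stability:
            continue  # mask changes a lot under small threshold shifts
        kept.append(mask)
    return kept
```

Since the camera is stationary and the object sits on a turntable, the expected mask area should vary smoothly between consecutive views, so a sudden jump in area from one frame to the next could serve as an additional rejection signal.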

4 Comments
2024/04/25
11:24 UTC

1

Help with no-identification object detection

Hi everyone, I am doing a college project with little knowledge on AI and especially computer vision and I need some help. Our goal is to take a picture of several ingredients, identify them, and suggest recipes. I already have a good model trained to identify ingredients in an image with 1 single ingredient. Right now, I want to take an image with many ingredients, isolate each one with a bounding box and feed them into the model so that it predicts 1 by 1.

I tried to find models that just create a bounding box for any object in the image. I don't want the model to identify the objects; it doesn't even need to know what they are, only that there is something there.

I tried YOLO, but apparently it only detects objects it was trained on, so in a picture with apples, oranges, and strawberries it will not detect strawberries as an object, for example. I tried other packages and even OpenCV, but I can't seem to get it working the way I want. Any suggestions? Just a package or a link to some examples would be great, as I can't seem to find a tool that does exactly what I need.

Thanks in advance!

3 Comments
2024/04/25
10:57 UTC

123

Computer vision on an MCU, and I got this fan that follows my every single move! No more manual adjustments or stagnant air!!!

18 Comments
2024/04/25
09:55 UTC

0

Detection of Smoke and Fire

Hi everyone,

I was researching ways to detect fire and smoke using computer vision, and I wanted to know how one can achieve that. I found that object detection and object segmentation are two such ways.

I also think edge detection based methods can work but they would need to be scene specific, right? Are there other methods of solving this issue?

An important assumption: This method should work for indoor environments with controlled light conditions.

0 Comments
2024/04/25
09:40 UTC

1

Text Line Dewarping Dataset

I'm looking for any publicly available dataset that contains curved text lines (preferably one per image), like those from "Alignment of Curved Text Strings for Enhanced OCR Readability". I created an algorithm of my own (good enough for a paper, according to my professor) and I need a dataset to test its performance.

0 Comments
2024/04/25
07:42 UTC
