/r/artificial
Reddit’s home for Artificial Intelligence (AI)
Welcome to /r/artificial. The rules here are outdated; please check New Reddit for the updated rules: https://www.reddit.com/r/artificial/about/rules. /r/artificial is the largest subreddit dedicated to all issues related to Artificial Intelligence (AI). What does AI mean? Find out here!
Guidelines: Check New Reddit for the updated rules (https://www.reddit.com/r/artificial/about/rules), and do not complain to us in Modmail if you get banned. Submissions should generally be about Artificial Intelligence and its applications. If you think your submission could be of interest to the community, feel free to post it.
Please note that just because something else is a technology buzzword (e.g. blockchain, quantum computing, virtual reality, augmented reality, etc.), that doesn't automatically make it AI. We've had such a problem with blockchain posts that they will now need to be manually approved by a mod before they become visible. If your post is primarily about another technology (like blockchain), please make the relation to AI abundantly and immediately clear (e.g. through writing a comment).
All submissions are moderated through a "collaborative filtering" approach. To help better align content with the expectations of the audience and improve the quality of the subreddit, submissions that receive overall negative feedback may be removed.
Submission titles should clearly indicate what the submission is about. In the case of link posts, they should almost always contain the title of the thing you're linking to. Don't make up your own clickbait title, and if the original title is clickbait, please add some nuance of your own. For example, if the link you want to post is to an article called "You won't believe what AI did this time!", then 1) consider if it's really a quality article, and 2) create a title like this: "A neural network gets superhuman performance on <insert task>".
When posting about a story, please check the front page to see if it is already being discussed. If so, consider replying there instead of making a new submission to the subreddit. If not, please make some effort to post the best link to the story you can find (often this is the story from the original source, rather than some outlet repeating what someone else already reported).
Consider doing a little research before posting a link, opinion or question. For link posts, consider writing a submission statement: a comment that describes what the link is about, why you posted it, what you'd like to discuss, and/or what you think about it.
Read Rule 2 on New Reddit for our self-promotion rule.
Do not personally attack other people (here or elsewhere; including e.g. researchers you disagree with). If you see someone do this (e.g. to you), use the report button and do not retaliate. If you disagree with anything, stick to the arguments.
Getting started with Artificial Intelligence
Looking to get started with AI? Check out our wiki!
Interested in doing an AMA?
We offer an opportunity for experienced people and companies working on interesting problems in AI to talk to the community about their work and experience in the field through an AMA (Ask Me Anything): Reddit's version of an interview where users can ask you questions. Please contact the moderators for more information.
We would love to hear from you!
Past AMAs:
2019/06/04
IBM researchers, scientists and developers
2018/05/17
Peter Voss (Aigo.ai) on AI assistants, AGI and his company
2018/04/23
Yunkai Zhou (Leap.ai) on AI in recruiting
I’ve written the below as a handy guide for new features that have just dropped, with a heavy AI focus:
• **Writing Tools**: This suite includes advanced proofreading that goes beyond simple autocorrect, rephrasing options, and an adaptable tone feature with Friendly, Professional, and Concise options. It also offers summarization, key point extraction, and the ability to format text into lists or tables, making it ideal for summarizing articles or reorganizing information with ease. While powerful, it’s best suited to longer passages, as shorter selections may prompt a warning for reduced accuracy.
• **Siri Revamp**: Siri has undergone a significant transformation, both visually and functionally, to respond more fluidly to voice commands—even if the user pauses or rephrases mid-command. It now allows users to type queries, which can be a discreet way to use the assistant in quiet settings, and provides device-specific guidance on using Apple products. However, instructions are text-only, which may be less user-friendly compared to illustrated guides.
• **Priority Messages in Mail**: Apple Intelligence scans incoming emails to identify those that may be high-priority and highlights them in a dedicated inbox section at the top of the app. This helps users focus on essential messages without sifting through everything in their inbox, particularly useful for users who don’t meticulously clean out their mail and may overlook important emails amid clutter.
• **Smart Replies in Mail**: This feature suggests quick, AI-generated responses based on the content of an email, similar to the smart reply options available on platforms like Gmail. Although it’s not for everyone, the functionality is ideal for users who want to respond on the go with minimal typing, especially in high-email environments where brief, efficient replies can save time.
• **Message and Notification Summaries**: Apple Intelligence now generates concise summaries of incoming emails and messages, providing an easy-to-read preview that helps users understand the content before opening. Summaries also appear in lock screen notifications, giving a quick overview of message content at a glance. While it generally works well, it can struggle with casual or fragmented language often found in texts, as well as shorter emails.
• **Memory Movie Creation in Photos**: The Photos app can now auto-generate a movie from selected images based on a user-provided text prompt, organizing visuals into a cohesive slideshow. The feature allows for personal customization—users can edit the soundtrack, title, filters, and even individual images—making it an appealing, user-friendly option for creating sentimental or thematic videos from photo collections.
• **Clean Up Tool in Photos**: This new tool enhances images with AI-powered adjustments, which can be applied to both new and older photos in the gallery. While it works well for straightforward edits, such as brightening and contrast, it’s not yet as robust as competing brands for complex image retouching. It’s a convenient option for users who want quick fixes without leaving the Photos app.
• **Natural Language Search in Photos**: Users can now find images simply by describing what’s in them, which ideally would make searches faster and more intuitive. However, the search relies on precise terms, meaning it might miss images that don’t strictly match the search word (e.g., searching “coffee” may exclude items with related words like “espresso”), making it less comprehensive than some might expect.
• **Phone Call Transcription and Recording**: Apple Intelligence can transcribe and record calls, a feature that’s stored in the Notes app for easy access. This is helpful for capturing important conversations or meeting details, though its accuracy depends on the proximity of the phone to the speaker and background noise. Summarization is also available within these transcriptions, providing quick highlights of key discussion points.
Coming Soon in iOS 18.2:
• **Image Playground, Image Wand, and Genmoji**: These anticipated tools will add creative flexibility, letting users generate custom images or avatars. Genmoji, for instance, aims to create unique, AI-driven emojis tailored to users, while Image Playground and Image Wand will likely support artistic and imaginative visual creations.
• **Visual Intelligence**: This tool is expected to give more contextually aware image analysis, identifying detailed aspects of photos. For example, it could distinguish specific objects, landmarks, or environments, though it may be limited to the latest iPhone models to handle the processing requirements.
• **Enhanced Siri Actions**: The forthcoming Siri updates will include the ability to take more context-sensitive actions within apps and generate responses tailored to a user’s personal profile. This could transform Siri from a basic assistant to a more integrated, personalized helper with expanded functionality across multiple apps and situations.
Which tool are you most looking forward to using?
If you found this useful, subscribe to my newsletter ‘The Cognitive Courier’ where I cover the latest in AI and tech weekly.
I randomly came across another LLM looking to license content from publishers and then share revenues back with the source of the training content... which makes sense. That said, I fail to see how this little organization can offer any meaningful revenue to the publishers if the LLM they have built has very few (possibly zero) end users searching against it.
Any ideas? Thx!
I'm trying to make a voice for a robot, and I want the voice to be created using the robot's sound effects. I was thinking there was some program to input audio files and then use text to speech or a speech sample to make the sound effects "speak English" like those Minecraft villager talking videos made with the villager's sound effects.
Let’s call it out like it is: AI is here to replace white-collar workers.
Microsoft just announced autonomous agents, Anthropic’s Claude launched Computer Use, and countless startups are racing to develop AI assistants that can take on entire jobs (remember Devin, the "first AI software engineer"?).
While AI isn’t on par with humans yet, I find myself asking the question: what if they succeed?
It's obvious how sufficiently capable AI could lead to unprecedented income concentration and labor market disruption. It would cause mass unemployment. Universal Basic Income (UBI) would be the only way to redistribute some of that wealth, but governments would probably be slow to act.
The weird thing, though, is that while there is a world where AI automation outpaces the number of new jobs created, that day hasn’t arrived yet. Global productivity this year is actually DOWN and employment is UP (see graph).
There is another world where AI might solve a problem overlooked by some: aging populations and birth rate decline.
I lay out the arguments in more detail here: https://jurgengravestein.substack.com/p/the-economics-of-ai
I've actually had a wide-ranging discussion with several different LLMs like Claude, ChatGPT, and Gemini about this subject. I can't make up my mind, because it seems to depend on what level you are discussing it at. An LLM seems, by its nature, to be an informal system, and yet that may just be the appearance of an informal system, since it is probably using formal rules in its reasoning at some level. Even if it's only the matrix manipulation, that is a formal system that should be incomplete in a Gödelian sense. Yet it's also true that, at least from our perspective, the output has a level of unpredictability that doesn't exist in most recognized formal systems.
If you aren't familiar with incompleteness, then I really recommend the Numberphile video explaining it.
https://youtu.be/O4ndIDcDSGc?si=jRuakJORpY9ZZwI1
There is also the related topic of the halting problem.
https://youtu.be/macM_MtS_w4?si=YH8J-gQm7Rfu2AYe
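For anyone who wants a quick taste of the halting problem itself, the standard diagonalization argument fits in a few lines of Python (a sketch of the argument, not something you'd actually run to completion):

```python
# Sketch of the standard halting-problem diagonalization, for intuition only.
# Suppose someone handed us a perfect oracle:
def halts(program, argument) -> bool:
    """Hypothetically returns True iff program(argument) eventually halts."""
    raise NotImplementedError("No such total, correct function can exist.")

# Then we could build this troublemaker:
def paradox(program):
    if halts(program, program):   # would paradox(paradox) halt?
        while True:               # ...then loop forever,
            pass
    return                        # ...otherwise halt immediately.

# Feeding paradox to itself yields a contradiction either way:
# if halts(paradox, paradox) is True, then paradox(paradox) loops forever;
# if it is False, then paradox(paradox) halts. So halts() cannot exist.
```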
I'm actually going to take a side on this and claim that it's mathematically undecidable. If you want to replicate some of my research for yourself, you can just use the following prompt.
"How might godels incompleteness theorem apply to large language models, and other forms of generative AI?"
I just want to say that I don't have anything against AI art or generative art. I've been messing around with that since I was 10 and discovered fractals. I do AI art myself using a lesser-known app called Wombo Dream. So I'm mostly talking about using this to deal with misinformation, which I think most will agree is a problem.
The way this would work is you would have real images taken from numerous sources, including various types of art, and then you would have a bunch of generated images, and possibly even images being generated as the training is being done. The task of the AI would be to decide whether each image is generated or made traditionally. I would also include metadata like descriptions of the images, and use that to generate images via AI if it's feasible, so every real image would have a description that matches the prompt used to generate the test images.
The next step would be to deny the AI access to the descriptions so that it focuses on the image instead of keying in on the description. Ultimately it might detect certain common artifacts that generative AI creates that may not even be noticeable to people.
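To make it concrete, the first baseline I have in mind is just a binary classifier fine-tuned on folders of real vs. generated images, something like this rough PyTorch sketch (the folder layout, backbone, and hyperparameters are all placeholders):

```python
# Minimal sketch of a real-vs-generated image classifier.
# Assumes a dataset/real/ and dataset/generated/ folder layout.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tf = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
# ImageFolder assigns labels alphabetically: generated=0, real=1
data = datasets.ImageFolder("dataset", transform=tf)
loader = DataLoader(data, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)   # two classes: generated / real
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: last batch loss {loss.item():.3f}")
```

The description-blind second stage would then just be the same setup with the text metadata stripped out of the pipeline entirely.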
Could this maybe work?
Enter a prompt, get a wiki homepage with image(s)! Articles generate on-demand when you click on the article links.
Image generation can take a minute or two (or even 15 minutes if the model is still waking up), so don't fret if you see a broken image link on a page. Just check back later :)
Thanks for your attention and feedback. Have fun!
Is mind-uploading like what was portrayed in the 2014 movie "Transcendence" theoretically possible, or is it pure science fiction?
What would this process actually involve?
What are everyone's thoughts on AI regulation? Imo, there needs to be an AI safety regulatory body with an accompanying regulation, like there is for aerospace under the FAA and DO-178 or medical tech under the FDA and ISO 13485. Yeah, you can whine about "slowing innovation", but AI needs to be treated like a potentially dangerous tech on the order of nuclear energy.
It would not be the first technology to be regulated, and it won't be the last. Those other techs were regulated for safety, and the US still maintains a competitive advantage anyhow. This would not only cause the AI companies to slow their roll, and protect jobs by regulating what AI can and can't do (looking at you, health insurance companies using AI to deny claims), but also create jobs by virtue of the regulation, just like it has for these other fields. Compliance and audit professionals, safety-critical engineers, QA analysts, etc. make up a huge part of biotechnology and aerospace. You could create an entire industry around AI safety and alignment.
What do you think about what James Cameron said?
At a recent AI+Robotics Summit, legendary director James Cameron shared concerns about the potential risks of artificial general intelligence (AGI). Known for The Terminator, a classic story of AI gone wrong, Cameron now feels the reality of AGI may actually be "scarier" than fiction, especially in the hands of private corporations rather than governments.
Cameron suggests that tech giants developing AGI could bring about a world shaped by corporate motives, where people’s data and decisions are influenced by an "alien" intelligence. This shift, he warns, could push us into an era of "digital totalitarianism" as companies control communications and monitor our movements.
Highlighting the concept of "surveillance capitalism," Cameron noted that today's corporations are becoming the “arbiters of human good”—a dangerous precedent that he believes is more unsettling than the fictional Skynet he once imagined.
While he supports advancements in AI, Cameron cautions that AGI will mirror humanity’s flaws. “Good to the extent that we are good, and evil to the extent that we are evil,” he said.
Watch his full speech on YouTube: https://youtu.be/e6Uq_5JemrI?si=r9bfMySikkvrRTkb
Who will announce it? OpenAI? Meta? Ilya alone? A new entity? The AGI itself?
What capabilities will be demonstrated in the presentation? How will they convince us it is an AGI?
What will happen right after?
We'll see in a few years how accurate your predictions are.
Hey everyone,
A recent chat with my advanced voice mode got me thinking about the latest advancements in fine-tuning AI models based on real-world user interaction metrics. I'm sure it's been explored, but the idea is to refine AI output (text, images, or otherwise) based on user feedback, gathered through whatever means the user interacts with the device. For example (I can't remember where I heard this), some sort of generative operating system where, every time you turn it on, it's slightly different and more tailored to you: ultimately an OS trained primarily on your past keyboard and mouse interactions with it.
I’m curious about the cutting-edge projects or research in this space. What are the most advanced or innovative approaches to leveraging user interaction data to fine-tune AI models? How are these projects shaping the future of AI-human interaction?
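To make the question concrete, the crudest version I can picture is logging interactions with a thumbs-up/down signal and doing reward-weighted fine-tuning on them. A toy sketch (the model choice, log format, and weighting are placeholder assumptions; real systems would use RLHF/DPO-style preference optimization instead):

```python
# Toy sketch: reward-weighted fine-tuning on a hypothetical interaction log.
# "gpt2" and the log format are placeholders, not any real product's pipeline.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=5e-6)

# Hypothetical interaction log: what was shown, and how the user reacted.
interactions = [
    {"prompt": "Summarize my unread mail.", "response": "You have 3 urgent emails ...", "thumbs_up": True},
    {"prompt": "Open my calendar.", "response": "Here's a poem about calendars.", "thumbs_up": False},
]

model.train()
for item in interactions:
    text = item["prompt"] + "\n" + item["response"]
    batch = tok(text, return_tensors="pt")
    out = model(**batch, labels=batch["input_ids"])
    # Crude stand-in for proper preference optimization: push toward liked
    # responses, gently away from disliked ones.
    weight = 1.0 if item["thumbs_up"] else -0.2
    (weight * out.loss).backward()
    opt.step()
    opt.zero_grad()
```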
Thanks in advance!
The Library of Babel greatly improved my intuitive understanding of how neural networks can learn.
The Library of Babel is a library containing every book imaginable. There lie the books with all possible combinations of words, and thus all possible combinations of sentences. Therefore, it contains books with the answers to all of life's questions and books with all the theories mankind hasn't found yet, but also books full of gibberish. If you want to find an answer to your question in the Library of Babel, you will probably never find it by just looking randomly. You need a smart search algorithm that can find the right page with the answer to your question.
There is a direct parallel between neural networks and the Library of Babel. Neural networks are universal function approximators, meaning that they can approximate any function imaginable, be it a function with the answers to life or a gibberish function. Just like with the Library of Babel, you need a smart search algorithm, this time not to find the right page, but to find the right neural network configuration.
The learning problem is thus actually just a search problem: gradient descent and backpropagation are search algorithms, while the loss (or reward) function defines what we are searching for.
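A tiny illustrative sketch of that idea, using the simplest possible "network" (a line): gradient descent literally walks through the space of candidate (w, b) configurations, guided by the loss, until it lands on the one that fits.

```python
# Tiny illustration of "learning as search": gradient descent walking through
# the space of (w, b) candidates until it finds the configuration that fits.
import numpy as np

xs = np.linspace(-1, 1, 50)
ys = 2.0 * xs + 1.0                 # the "book" we're looking for: y = 2x + 1

w, b = np.random.randn(), np.random.randn()   # start at a random shelf
lr = 0.1
for step in range(500):
    pred = w * xs + b
    loss = np.mean((pred - ys) ** 2)          # how far this candidate is from the answer
    grad_w = np.mean(2 * (pred - ys) * xs)    # the gradient points toward better candidates
    grad_b = np.mean(2 * (pred - ys))
    w -= lr * grad_w
    b -= lr * grad_b

print(f"found w={w:.2f}, b={b:.2f}")           # ~2.00 and ~1.00
```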
I found this way of thinking about NNs very enlightening; it definitely helped me understand learning more intuitively. I made a more elaborate post on this just now!
I want to hook up ChatGPT to control my outdated but ahead-of-its-time WowWee Rovio. But until I remember how to use a soldering iron, I thought I would start small.
Using ChatGPT to write 100% of the code, I coaxed it along to use an ESP32 embedded controller to manipulate a 256 LED Matrix "however it wants".
The idea was to give it access to something physical and "see what it would do".
So far it's slightly underwhelming, but it's coming along ;)
The code connects to WiFi and the ChatGPT API, sending a system prompt that explains the situation: "You're connected to an LED matrix to be used to express your own creativity." The prompt gives the structure of the commands for toggling the LEDs (including color, etc.) and lets it loose to do whatever it sees fit.
Each LED command has room for a comment, which is then echoed to serial so that you can see what it was thinking when it issued that command. Since ChatGPT will only respond to prompts, the controller re-prompts in a loop to keep it going.
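For anyone curious about the structure (this isn't the actual firmware, just the shape of the loop in Python), it boils down to something like this, with a made-up command format standing in for the real one:

```python
# Not the actual ESP32 firmware, just the shape of the prompt/parse loop.
# The "SET index r g b # comment" format here is made up for illustration;
# the real format lives in the system prompt on the controller.
import re
from openai import OpenAI

client = OpenAI()
SYSTEM = ("You are connected to a 256-LED matrix. Put on a dazzling light show. "
          "Respond only with lines like: SET <index 0-255> <r> <g> <b> # <what you were thinking>")

history = [{"role": "system", "content": SYSTEM}]
for _ in range(10):                      # re-prompt in a loop to keep it going
    history.append({"role": "user", "content": "Continue the show."})
    reply = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    text = reply.choices[0].message.content
    history.append({"role": "assistant", "content": text})

    for line in text.splitlines():
        m = re.match(r"SET (\d+) (\d+) (\d+) (\d+)\s*#?\s*(.*)", line.strip())
        if not m:
            continue
        idx, r, g, b, comment = m.groups()
        print(f"Comment: {comment}")     # echoed so you can see what it was thinking
        # the real code pushes (idx, r, g, b) out to the matrix at this point
```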
Here is an example of some (pretty creative) text that it adds to the comments...
Comment: Starting light show.
Comment: Giving a calm blue look.
Comment: Bright green for energy!
Comment: Spreading some cheer!
Comment: Now I feel like a fiery heart!
Comment: Let's dim it down.
Comment: A mystical vibe coming through.
Comment: Ending my light show.
And here is the completely underwhelming output that goes along with that creativity:
For some reason, it likes to just turn a few lights on and then off in the first 30 or so LEDs of the matrix, followed by turning on 100% of the board in the same color.
I'm going to work on the prompt that kicks it off. I've added sentences to it to fine-tune it a bit, but I think I want to start over and see how small I can get it. I didn't want to give it too many ideas and have the output colored by my expectations.
Here are two short videos in action. The sequence of blue lights following each other was very exciting after hours of watching it just blink random values.
https://reddit.com/link/1gcrklc/video/yx8fy2yl85xd1/player
https://reddit.com/link/1gcrklc/video/fqkb1cpn85xd1/player
Looking forward to getting it (with a small prompt) to do something more "creative". Also looking forward to hooking it up to something that can move around the room!
All in all it took about 6 hours to get working and about $1 in API credit. I used o1-preview to create the project, but the controller is using 4o or 4o-mini depending on the run.
EDIT:
Based on feedback from u/SkarredGhost and u/pwnies, I changed the initial system prompt to be about creating a dazzling show first and then explaining the command structure used to implement it, rather than making the commands themselves the intent (and then adding color as to why the commands exist).
This completely changed the character of the output!
I'm now getting longer, more colorful full displays on the whole board, followed by a few quick flashes.
Curiously, the flashes always happen within the first 30 LEDs or so, like the initial run.
Here are a few runs:
Comment: Starting the light show.
Comment: Setting a blue background.
Comment: Highlighting LED 4.
Comment: Highlighting LED 8.
Comment: Highlighting LED 12.
Comment: Changing to green background.
Comment: Highlighting LED 16.
Comment: Highlighting LED 24.
Comment: Changing to orange background.
Comment: Highlighting LED 31.
Comment: Ending the light show.
Comment: Starting the light show.
Comment: All LEDs glow red.
Comment: All LEDs change to green.
Comment: All LEDs change to blue.
Comment: Clearing LEDs for the next pattern.
Comment: Twinkle LED 0.
Comment: Twinkle LED 15.
Comment: All LEDs to white for a wash effect.
Comment: Fade out to black.