/r/augmentedreality
AR News and Community: All about the Evolution ► AI Glasses ► Smart Glasses ► Augmented Reality ► Mixed Reality ■ INMO Rokid XREAL RayNeo Viture Even Realities Vuzix QONOQ Magic Leap HoloLens Snap Spectacles Apple Vision Pro Ray-Ban Meta Quest Orion Niantic 8th Wall ARKit WebXR Snapdragon Spaces Vuforia ARCore Android XR Lens Studio Optics Computer Vision Ubiquitous Ambient Spatial Computing
Snippets from Boz's end of year blog:
2024 was the year AI glasses [Ray-Ban Meta] hit their stride. When we first started making smart glasses in 2021 we thought they could be a nice first step toward the AR glasses we eventually wanted to build.
The biggest thing we’ve learned is that glasses are by far the best form factor for a truly AI-native device ... especially when it’s a multimodal system that can truly understand the world around you.
We’re right at the beginning of the S-curve for this entire product category, and there are endless opportunities ahead. One of the things I’m most excited about for 2025 is the evolution of AI assistants into tools that don’t just respond to a prompt when you ask for help but can become a proactive helper as you go about your day.
The next big step toward the metaverse will be combining AI glasses with the kind of true augmented reality experience we revealed this year with Orion.
We probably learned as much about this product space from a few months of real-life demos as we did from the years of work it took to make them. There is just no substitute for actually building something, putting it in people’s hands and learning from how they react to it.
the real impact of Orion will be in the products we ship next and the ways it helps us better understand what people love about AR glasses and what needs to get better. We spent years working on user research, product planning exercises, and experimental studies trying to understand how AR glasses should work, and that work is what enabled us to build Orion. But the pace of progress will be much more rapid from here on out now that we have a real product to build our intuition around.
Photorealistic rendering of a long volumetric video with 18,000 frames. Our proposed method utilizes an efficient 4D representation with Temporal Gaussian Hierarchy, requiring only 17.2 GB of VRAM and 2.2 GB of storage for 18,000 frames. This achieves a 30x and 26x reduction compared to the previous state-of-the-art 4K4D method [Xu et al. 2024b]. Notably, 4K4D [Xu et al. 2024b] could only handle 300 frames with a 24GB RTX 4090 GPU, whereas our method can process the entire 18,000 frames, thanks to the constant computational cost enabled by our Temporal Gaussian Hierarchy. Our method supports real-time rendering at 1080p resolution with a speed of 450 FPS using an RTX 4090 GPU while maintaining state-of-the-art quality.
Paper: Long Volumetric Video with Temporal Gaussian Hierarchy
Abstract: This paper aims to address the challenge of reconstructing long volumetric videos from multi-view RGB videos. Recent dynamic view synthesis methods leverage powerful 4D representations, like feature grids or point cloud sequences, to achieve high-quality rendering results. However, they are typically limited to short (1~2s) video clips and often suffer from large memory footprints when dealing with longer videos. To solve this issue, we propose a novel 4D representation, named Temporal Gaussian Hierarchy, to compactly model long volumetric videos. Our key observation is that there are generally various degrees of temporal redundancy in dynamic scenes, which consist of areas changing at different speeds. Extensive experimental results demonstrate the superiority of our method over alternative methods in terms of training cost, rendering speed, and storage usage. To our knowledge, this work is the first approach capable of efficiently handling minutes of volumetric video data while maintaining state-of-the-art rendering quality.
Project Page: https://zju3dv.github.io/longvolcap/
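A heavily simplified sketch of how such a temporal hierarchy can keep per-frame cost constant (my own illustration under assumed segment lengths and level counts, not the authors' implementation): slow-changing content lives in long segments shared across many frames, fast-changing content in short ones, and rendering any frame touches only one segment per level regardless of total video length.

```python
# Hypothetical sketch of a temporal hierarchy for dynamic Gaussians.
# Names and structure are illustrative only, not the paper's code.
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: int                                      # first frame covered by this segment
    length: int                                     # number of frames it covers
    gaussians: list = field(default_factory=list)   # parameters valid for the whole span


class TemporalHierarchy:
    def __init__(self, num_frames: int, base_length: int = 30, levels: int = 4):
        self.base_length = base_length
        self.levels = []
        for level in range(levels):
            seg_len = base_length * (2 ** level)    # longer segments at higher levels
            segs = [Segment(start, min(seg_len, num_frames - start))
                    for start in range(0, num_frames, seg_len)]
            self.levels.append(segs)

    def active_segments(self, frame: int):
        # Exactly one segment per level covers a given frame, so per-frame work
        # and memory stay constant no matter how long the video is.
        out = []
        for level, segs in enumerate(self.levels):
            seg_len = self.base_length * (2 ** level)
            out.append(segs[frame // seg_len])
        return out


hierarchy = TemporalHierarchy(num_frames=18_000)
active = hierarchy.active_segments(frame=9_000)     # one segment per level, O(levels) work
```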
Imagia’s innovation is rooted in five years of metamaterial optics research from the scientists and engineers at Imagia. After years of developing and perfecting metalens technology for optical components used in devices like AR/VR headsets, Imagia has widened its portfolio to explore performing mathematical convolutions directly in optical elements. The technology works by applying a set of mathematical convolutions in an array of optical filters. The light passing through a metalens is steered and transformed by billions of nanoscale components on each Imagia metalens, which impart a hard-coded pattern recognition algorithm to the signal.
Imagia has demonstrated a hand and gesture detector that works with only eight pixels of information and with a response time of only 80 microseconds. By contrast, traditional optics and processing typically take 30-40 milliseconds to process the millions of pixels for digital algorithmic approaches.
By processing the image directly in the optics, Imagia is able to realize a 500x reduction in detection latency for a fraction of the power compared to the traditional method of capturing an image and then processing that data in downstream software. Running at a comparable framerate to a standard image processing system, the Imagia solution consumes less than 1% of the power.
Imagia’s technical demo module built for gesture detection can sense a hand and its orientation. This system replicates the incoming image eight times using a metasurface lens array, performs different convolutions on each of the images with more metasurface optical filters, then pools information optically. If the required pattern is detected, light passes through the filters. If it is not, no light passes through. The result arrives in a matter of microseconds (two to three orders of magnitude faster than an electronic/algorithmic approach, according to Kress).
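A rough digital analogue of that pipeline (my own sketch; the real system performs these steps in metasurface optics rather than software, and its filters are hard-coded optical designs, not the random stand-ins used here): replicate the input once per filter, apply a different fixed convolution to each copy, pool each response to a single value, and threshold, yielding an 8-value readout.

```python
# Hypothetical software analogue of the optical pipeline described above.
import numpy as np
from scipy.signal import correlate2d


def optical_style_detector(image, kernels, threshold=0.5):
    """Replicate the image once per filter, convolve, pool each response to a
    single value, and threshold: an 8-filter bank yields an 8-'pixel' readout."""
    readout = []
    for k in kernels:                                   # one kernel per optical filter
        response = correlate2d(image, k, mode="valid")  # fixed "optical" convolution
        readout.append(response.mean())                 # pooling down to one value
    readout = np.array(readout)
    return (readout > threshold).astype(float)          # "light passes" only on a match


rng = np.random.default_rng(0)
filters = [rng.standard_normal((5, 5)) for _ in range(8)]        # stand-ins for real filters
reading = optical_style_detector(rng.random((64, 64)), filters)  # 8-element output
```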
Imagia's approach: processing in optics
Applications like artificial intelligence and active feature detection in laptops and AR/VR headsets are set to receive outsized benefit from the innovation, which could extend battery life of these devices by 20% or more.
“The optical technology being developed at Imagia allows for compressive sensing of complex features in just a few pixels,” says Bernard Kress, Director of Google AR. “Asking photons to do the electron work allows for fast local processing at lower power in a smaller form factor, key assets of any all-day-use smart glass[es].”
Imagia today is launching an early access program that enables partners to explore solutions and applications running on the Processing Optics platform. The program comes on the heels of the successful traction of Imagia’s closed demos of Processing Optics during 2024 to multinational device makers in markets spanning semiconductors to consumer electronics. Imagia’s first product, a detector module for mobile devices powered by Processing Optics, is expected to launch in 2025.
SHARGE | loomos sent out the first mail with specs. The glasses were first announced a few months ago but have not shipped yet. But January seems realistic. These come with on-device AI powered by a UNISOC chip. So, here's the mail content:
Why Do You Need loomos?
Forget bulky wearables or AI devices that just mimic your smartphones. loomos AI Glasses let you capture stunning moments & memories with 4K photos and 1080P videos, enjoy immersive open-ear audio, and get instant help from the voice assistant. All packed in a lightweight & everyday design.
The perfect companion for your work tips and life hacks. The ideal memory hub for your every valuable moment.
Sure, your phone's AI does great, but this isn't about replacing it. It's about letting you do more: hands-free, immersive, and effortlessly simple.
What Makes loomos Stand Out?
Explore the key features:
[Capture] World's 1st 16MP camera on glasses. Capture more.
[Listen] Industry-1st AAC0920 speaker for Hi-Fi open-ear audio. Crystal clear.
[AI-ssistant] Powered by GPT-4o. More responsive & less friction.
[Battery] Market's largest 450mAh. All-day standby & Always-ready.
[Fit] 49g with adjustable hinges & nose pads. All-day fit for everyone.
Dropping early 2025.
Have you heard of any distinctions between them?
Both will probably run Android XR, right? Sony won't wait for a second-gen headset to adopt Android XR when the first one isn't even released yet. Samsung's HMD will be released as the first Android XR device, but Sony probably won't be far behind, since they will likely want to release their HMD in 2025 as well.
Afaik, both will use the Snapdragon XR2+ Gen2. Both will have 4k resolution per eye. Both will have eye tracking.
The third Android XR passthrough HMD option: Lynx R2. Probably with HyperVision optics. Probably with eye tracking - at least as an option. They cancelled it for the R1 HMD because of cost but this time they will most likely use it. But will it have the same XR2+ Gen2 processor? Is Lynx an alternative for you?
The fourth option: Maybe you are not interested in passthrough AR and would rather wait for Samsung or Google smart glasses with Android XR. Or maybe Magic Leap? Magic Leap was announced as a company supporting Android XR. Does that mean there will be a Magic Leap 3?
Hi,
Is there any company that has AI + AR display smart glasses ready to purchase as a sample? I am interested in buying a few samples for evaluation purposes.
🕶️We're looking for passionate early adopters to test our latest products and share valuable feedback! If you're excited about exploring cutting-edge AR technology and shaping the future of innovation, we want you on board. Join us and be part of an incredible journey! Join our DISCORD for more info👇
https://discord.gg/7Xja4PXE
Are there any glasses out there that can translate text? I've seen that most can translate voices, but none that specify if they can also translate text.
Hi, I’m in the 3D/video production industry and I am always wondering what types of jobs AR glasses will create in a future without phones, only smart body tech. I've always wanted to think a bit ahead and adapt before video production is completely erased by quick AI videos.
Hello, I'm looking for a good library for web AR to implement on my website. I want image-tracking AR to present a 3D model.
I've tried AR.js and found that the model appears at a different size and location when using different images or devices...
Is there a better library to use for the web? Or is there a simple solution for presenting the model the same way on all devices, no matter what image is tracked and what device is tracking?
Would be glad for any help,
Ty in advance, ODINN
I've watched this interview where Demis Hassabis, Google DeepMind CEO, talks about Project Astra smart glasses for cooking as a use case, and looking into new wearable form factors beyond glasses.
He also talks about going back to gaming and working on AI characters for games as well as auto-balancing. He says generating entire games with AI is still too far away, but certain things could improve gaming in the near future.
What do you think would be fun and interesting in this domain?
I think it would make sense to build a digital twin of a city and train the AI there. With increasing complexity. And with each new activated system the AI learns new things. From navigation to social interactions.
And at the end there could be characters that navigate the city in AR in something like a Pokemon Go, right? Does that make sense?
Along the way you use the new egocentric data from wearable sensors to train the system and at some point you end up with incredibly helpful AI that understands the human and the world of the human. It seems like AR is a necessary step towards AGI.
It's relatively hard to find clear, condensed facts on this whole topic that are actually factual.
Is it, on all current glasses, as easy as taking the resolution and dividing it by the FoV, or can some glasses improve the fidelity by showing different pixels of the image to each eye? I expect that glasses able to produce a 3D effect use two different pictures, one for each eye.
Is the visual quality of the image greater than that simple math suggests, through optical perception effects of our eyes and brain? Such as quantum-dot color saturation making the brain think the image is even brighter than a display with the same brightness but less saturated colors.
I also know that OLEDs are another world when it comes to image clarity, and that the signal/content is the biggest reason why nothing comes close to reality, with black level raise being the worst problem for reproducing reality in content.
Sadly I do not own a large OLED screen (TV) yet and have not seen one in pitch black.
I know that every kind of interference after the picture leaves the display, being mirrored onto your eyes in various ways, will cause loss of clarity and other issues, because that's just how reality works.
In the end, does that mean I see resolution/FoV minus some amount of visual abnormalities from the optics? Or is there more to it, speaking only in general terms and leaving aside "size is king" and other normal screen considerations?
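As a rough first-order check, dividing horizontal resolution by horizontal field of view does give pixels per degree (PPD); a minimal sketch with made-up example numbers, not the specs of any particular glasses:

```python
# Quick check of angular pixel density (pixels per degree, PPD).
# Numbers are illustrative only; real glasses specs vary.
def pixels_per_degree(horizontal_pixels: int, horizontal_fov_deg: float) -> float:
    return horizontal_pixels / horizontal_fov_deg

print(pixels_per_degree(1920, 46))   # ~41.7 PPD for a 1080p-wide panel over a 46° FoV
```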
I also don't know anybody who owns a pair of "movie AR" glasses.
Is there a place with hard facts on these glasses yet? (Not that I'm doing much better by asking in a subreddit, but pinning a hard fact/spec source in the sub or making a wiki tab would be great.)
There is just an intense amount of advertising and sponsored content that says little about the actual product and nothing about the end result.
Don't look at my profile if you don't want to see NSFW, just saying. Thank you for your time.
Which glasses best fit the following criteria today:
- Prescription lenses
- Real time translation of spoken Spanish to written English text in the lens (vs looking at a phone)
- Navigation
This would be a gamechanger for my visits to Mexico.
Machine translation: With this software update, the company has partnered with NTT Sonority Inc. (hereinafter, NTT Sonority) to incorporate NTT's patented "Intelligent Microphone" technology. This makes it possible to distinguish and capture your own voice separately from the other person's voice. MiRZA will continue to strive to improve both its hardware and software and work to improve usability.
NTT's patented "Intelligent Microphone" technology is a hybrid of two technologies: "beamforming," which recognizes the acoustic space from the time difference with which sound reaches two microphones and identifies the speaker, and a "spectral filter," which removes noise and extracts only the voice. This allows only the speaker's voice to be naturally extracted and delivered to the other party.
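A minimal delay-and-sum sketch of the two-microphone beamforming principle described above (my own illustration with assumed parameters; NTT's actual Intelligent Microphone combines beamforming with its proprietary spectral filter):

```python
# Toy delay-and-sum beamformer for two microphones; illustrative only.
import numpy as np


def delay_and_sum(mic1: np.ndarray, mic2: np.ndarray, fs: int,
                  mic_distance_m: float, angle_deg: float,
                  speed_of_sound: float = 343.0) -> np.ndarray:
    """Steer a two-microphone array toward angle_deg by delaying one channel
    so sound from that direction adds coherently and off-axis sound does not."""
    # Extra path length to the second microphone for a source at angle_deg
    delay_s = mic_distance_m * np.sin(np.deg2rad(angle_deg)) / speed_of_sound
    delay_samples = int(round(delay_s * fs))
    aligned = np.roll(mic2, -delay_samples)   # align wavefronts from the steered direction
    return 0.5 * (mic1 + aligned)             # coherent sum emphasizes that direction


# Example: 48 kHz capture, 15 cm mic spacing, steer toward a talker 30° off-axis.
fs = 48_000
t = np.arange(fs) / fs
mic1 = np.sin(2 * np.pi * 440 * t)
mic2 = np.roll(mic1, 10)                      # toy inter-microphone delay
out = delay_and_sum(mic1, mic2, fs, mic_distance_m=0.15, angle_deg=30)
```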
Now, by partnering with NTT Sonority to equip MiRZA with the Intelligent Microphone, a function has been added that separates the voice of the glasses wearer from the voice of the person speaking in front of them by limiting the sound pickup area, like a directional microphone.
Previously, MiRZA supported omnidirectional sound pickup and pickup of only the wearer's voice during calls; new surrounding-sound and forward-sound pickup functions have now been added. This makes it possible to separate the wearer's voice from the voice of the person speaking, and to pick up only the voice of the person speaking in front of the glasses.
In the future, applications will be able to record with different sound pickup range settings.
We already know about Google's Project Astra, which uses Gemini in real time to answer questions from video. It also has multimodal memory, meaning it can remember roughly the last 10 minutes of events (not sure about the exact number of minutes, but I read it somewhere), like noticing that your keys are on the table, for example.
Recently, Google released videos showcasing Gemini integrated with Samsung headsets and Android XR, featuring an always-on assistant. This offers a very helpful and improved interface for using AI. However, at the same time, it raises privacy concerns, which can be quite scary.
What if Gemini examines my view while I’m looking at important documents?
Yes, we can always pause the assistant at such moments, but it’s very easy to forget to pause something that runs in the background.
So, what do you think? How will Google address these privacy concerns? One solution could be running Gemini (or similar MLLMs) on-device, like Apple does with Apple Intelligence. But I don’t think this is feasible for Head-Mounted Displays or Smart Glasses.
Share your thoughts on this.
Does anybody know if Google glasses have been scrapped or is Google continuing this?
Any idea how to develop for their platform? Does it all go on the Google App Store or do we have a different platform for this?
Any help would be much appreciated.
Hey Everyone!
When I visited INMO's HQ in September I told them about r/augmentedreality, of course. And we were thinking of ways to involve the community and do something together. Back then I mentioned that we were talking about a Q&A / AMA, but we also talked about the possibility of finding early testers here 😎 And the INMO team was really sweet. They said that I can decide who it will be and that they trust my judgement.
So, on the other hand, that means I'm writing this without thinking too much about it 😅 Just let me know why you want to be an early tester for the device. I don't think there's a date yet for the international version. But if you are a dedicated user of our subreddit, or have your own channels where you talk about smart glasses or AR/VR, and have a good use case for the glasses' main feature, which is translation, then that will definitely be a good reason to choose you! 🤞
About INMO Go 2 — smart glasses with binocular display, microLED and waveguides with 2000 nits brightness to the eye. Monochrome Green. Waveguide Front Light Leakage Reduction. 15 Degree Downward Tilt.
UNISOC quad-core processor. Android 9.0. Dual Batteries: 440mAh. Charged to 80% in 20 Minutes. Battery Life 150 Minutes.
Main Use Case: Offline Translation and Transcription in 8 Languages: Chinese, English, Japanese, Korean, French, Spanish, Russian and German. Online Translation in 40 Languages. Recognition of 90 Accents. Customization for Industry-Specific Technical Jargon.
Another Use Case: Teleprompter. Discreetly Controlled via the INMO RING 2.
Price: 3999 Yuan ($550). Launch Discount Price: 3299 Yuan ($455)
Launch in China: December. International launch: No date yet.
WASHINGTON, December 11, 2024—The Federal Communications Commission today adopted new rules to expand very low power device operations across all 1,200 megahertz of the 6 GHz band alongside other unlicensed and Wi-Fi-enabled devices. This added flexibility in the 6 GHz band will bolster a growing eco-system of cutting-edge applications like wearable technologies and augmented and virtual reality, which will enhance learning opportunities, improve healthcare outcomes, and bring new entertainment experiences. The FCC has, in recent years, expanded unlicensed use between 5.925 and 7.125 GHz, helping to usher in Wi-Fi 6E, set the stage for Wi-Fi 7, and support the growth of the Internet of Things.
The Report and Order permits the very low power (VLP) class of unlicensed devices to operate across 350 megahertz of spectrum in the U-NII-6 (6.425-6.525 GHz) and U-NII-8 (6.875-7.125 GHz) portions of the 6 GHz band at the same power levels and technical/operational protections as recently approved for the U-NII-5 (5.925-6.425 GHz) and U-NII-7 (6.525-6.875 GHz) bands while protecting incumbent licensed services that also operate in the band. These VLP devices will have no restriction on locations where they may operate and will not be required to operate under the control of an automatic frequency coordination system. To ensure the risk of interference remains insignificant, the devices will be required to employ a contention-based protocol and implement transmit power control while prohibited from operating as part of a fixed outdoor infrastructure.
VLP devices operate at very low power across short distances and provide very high connection speeds, which are ideal for the types of high-data rate cutting-edge applications that will both enrich consumer experiences and bolster the nation’s economy. The FCC’s actions in the 6 GHz band will spur innovation by providing more capacity for emerging technologies and applications, such as augmented reality and virtual reality, in-car connectivity, wearable on-body devices, healthcare monitoring, short-range mobile hotspots, high accuracy location and navigation, automation, and more.
Action by the Commission December 11, 2024 by Third Report and Order (FCC 24-125). Chairwoman Rosenworcel, Commissioners Carr, Starks, Simington, and Gomez approving. Chairwoman Rosenworcel and Commissioners Starks issuing separate statements.
ET Docket No. 18-295, GN Docket No. 17-183