/r/DeepGenerative

This is a place to share and discuss deep generative models, e.g. Variational Auto-Encoders (VAEs) and Generative Adversarial Networks (GANs).

Deep generative modeling is a new sub-field of machine learning that uses deep neural networks to learn from existing data and generate convincing new "samples" that resemble it.

This sub aims to be an intersection of industry, researcher, and hobbyist interests in this burgeoning field. Posts lacking effort will be deleted, and simple questions should be asked in the weekly simple questions thread. This is not a place for anger or off-topic discussions. That being said, hopefully we can end up with a tight-knit community here on reddit.

714 Subscribers

2

[Tutorial] Master Deep Voice Cloning in Minutes: Unleash Your Vocal Superpowers! Free and Locally on Your PC

0 Comments
2023/05/14
14:27 UTC

8

I have thousands of architecture photographs (it's my job) and I would love to do something like this with them. Could anybody help point me in the right direction? I'm not the most AI/Deep advanced person.

0 Comments
2021/09/20
10:38 UTC

1

Brad Pitt in Titanic - deepfake by facemagic app. Python was used to write the data preparation, network design, and training/test code. Rate the performance!

1 Comment
2021/08/19
03:26 UTC

2

[Practical] adversarial attacks on neural networks: fast gradient sign method

We’ll implement a very popular attack, the Fast Gradient Sign Method (FGSM), to demonstrate the security vulnerabilities of neural networks.

We cover all three steps:

  • Calculate the loss after forward propagation,
  • Calculate the gradient with respect to the pixels of the image,
  • Nudge the pixels of the image ever so slightly in the direction of the sign of the calculated gradient, which increases the loss calculated above.
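
The three steps above can be sketched in plain Python. This is a minimal illustration, not the linked article's code: it assumes a tiny logistic-regression "classifier" with hand-picked weights `w` and bias `b` (hypothetical values), so that the gradient with respect to the input can be written by hand, and `eps` is an assumed perturbation budget. Clipping to [0, 1] assumes pixels normalized to that range.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_input_grad(x, y, w, b):
    """Steps 1 and 2: forward pass, loss, and gradient w.r.t. the pixels."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = sigmoid(z)
    # Binary cross-entropy for a single example with label y in {0, 1}.
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    # For logistic regression, d(loss)/d(x_i) = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return loss, grad

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, eps):
    """Step 3: move each pixel by eps in the direction of the gradient's sign."""
    _, grad = loss_and_input_grad(x, y, w, b)
    return [min(1.0, max(0.0, xi + eps * sign(gi))) for xi, gi in zip(x, grad)]

# Hypothetical 3-"pixel" image and model parameters, for illustration only.
w = [2.0, -1.0, 0.5]
b = -0.2
x = [0.8, 0.2, 0.5]
y = 1

clean_loss, _ = loss_and_input_grad(x, y, w, b)
x_adv = fgsm(x, y, w, b, eps=0.1)
adv_loss, _ = loss_and_input_grad(x_adv, y, w, b)
print("clean loss:", clean_loss)
print("adversarial loss:", adv_loss)
```

With a real network the hand-written gradient would be replaced by automatic differentiation, but the update rule stays the same: `x_adv = x + eps * sign(grad)`.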

Exploring the Fast Gradient Sign Method

0 Comments
2021/03/01
10:17 UTC

1

[article] AI Limits: Can Deep Learning Models Like BERT Ever Understand Language?

It’s safe to assume a topic can be considered mainstream when it is the basis for an opinion piece in the Guardian. What is unusual is when that topic is a fairly niche area that involves applying Deep Learning techniques to develop natural language models. What is even more unusual is when one of those models (GPT-3) wrote the article itself!

Understandably, this caused a flurry of apocalyptic terminator-esque social media buzz (and some criticisms of the Guardian for being misleading about GPT-3’s ability).

Nevertheless, the rapid progress made in recent years in this field has resulted in Language Models (LMs) like GPT-3. Many claim that these LMs understand language due to their ability to write Guardian opinion pieces, generate React code, or perform a series of other impressive tasks.

To assess that claim, we need to look at three aspects of these Language Models:

  • Conceptual limits: What can we learn from text? The octopus test.
  • Technical limits: Are LMs “cheating”?
  • Evaluation limits: How good are models like BERT?

So how good are these models?

Can Deep Learning Models Like BERT Ever Understand Language?

0 Comments
2020/12/01
15:48 UTC

7

GAN Loss Functions and Their Inconsistencies in Performance

If you don't mind I would like to show you what we recently prepared.

The generic GAN setup is widely known: the generator G and the discriminator D play a min-max game in which each tries to outsmart the other.

That would be fine if implementing them were really that simple. One common problem is an overly simplistic loss function.

Here, we analyse this problem by examining different variations of the GAN loss functions to get a better insight into how they actually work. We look at many loss function formulations and analyse issues like mode collapse, vanishing gradients and convergence.
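
As a rough illustration of the vanishing-gradient issue mentioned above (a sketch, not taken from the linked article), the snippet below compares two standard generator losses: the original minimax form log(1 - D(G(z))) and the non-saturating form -log D(G(z)). It assumes the discriminator output is `sigmoid(a)` for a logit `a`; the function names are made up for this example.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def saturating_g_loss(a):
    """Minimax generator loss log(1 - D), with D = sigmoid(a)."""
    return math.log(1.0 - sigmoid(a))

def non_saturating_g_loss(a):
    """Non-saturating generator loss -log D, with D = sigmoid(a)."""
    return -math.log(sigmoid(a))

def saturating_grad(a):
    # d/da log(1 - sigmoid(a)) = -sigmoid(a)
    return -sigmoid(a)

def non_saturating_grad(a):
    # d/da [-log sigmoid(a)] = -(1 - sigmoid(a))
    return -(1.0 - sigmoid(a))

# A very negative logit means D confidently rejects the generated sample,
# which is typical early in training.
for a in (-6.0, 0.0, 6.0):
    print(a, saturating_grad(a), non_saturating_grad(a))
```

When the discriminator confidently rejects a fake (large negative `a`), the minimax gradient is close to zero, so the generator learns very little exactly when it most needs to, while the non-saturating form still gives a gradient of magnitude close to 1.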

We've attempted to give that insight in the article so, hopefully, you find this helpful/useful.

GAN Loss Functions

0 Comments
2020/11/04
09:08 UTC

2

Generate text for report/ tabular data trends and statistics

Hello geeks,

I am new to deep generative models. I have a problem where I want to generate text describing the trends and statistics in tabular data. Any ideas on how this can be achieved?

0 Comments
2020/06/11
02:00 UTC

0

"A different energy" - RunwayML + GPT-2 short sci-fi story

0 Comments
2020/04/06
12:45 UTC

3

Deepfake Mobile App Launch - Create your own high-quality celebrity deepfakes in minutes

Hi guys,

We got our start making deepfakes on reddit channels, and now we've launched our new mobile app that lets everyone make deepfakes. We're live on product hunt today. Check it out. We'd love your feedback: https://www.impressions.app/

0 Comments
2020/03/31
22:21 UTC

8

Delving deep into Generative Adversarial Networks (GANs)

A curated, quasi-exhaustive list of state-of-the-art publications and resources about GANs & their applications, sorted by citations/stars.

GitHub repository

https://i.redd.it/n3ql79bhhkv21.gif

2 Comments
2019/05/01
09:17 UTC

1

[R] MGBP : Multi-Grid Back-Projection super-resolution

0 Comments
2019/01/30
23:09 UTC

3

Some questions about Text-to-Image Synthesis

I have recently started focusing on text-to-image synthesis on complex datasets (like MSCOCO) using GANs.

After searching, I found some relevant works: StackGAN, Hong et al., and AttnGAN.

There seem to be two main approaches to synthesis: either generating coarse-to-fine, from a low-resolution image to a realistic high-resolution one, or generating from bounding boxes to shapes (masks) and finally to an image.

Here are some of my questions about the current state of text-to-image synthesis research:

  1. Is there any other method to deal with this kind of task?
  2. What are the strengths and shortcomings of these two methods?
  3. Given the high Inception Score that AttnGAN has achieved (an improvement of nearly 170%), further improvement seems rather difficult. Is it possible to get my paper accepted if I don't exceed AttnGAN?

0 Comments
2018/08/05
11:29 UTC

1

[D] Combining AE with PG-GAN

I want to synthesise high-res images by concatenating two latent vectors (meaning, not from a random sample). Does it make sense to train the AE with a GAN loss, or is it better to first train the AE and then, as a second step, improve the decoder with further training using a GAN loss? Does any of this make sense?

2 Comments
2018/05/16
08:06 UTC

8

[P] Implementation of Progressive Growing of GANs in PyTorch

Hi everyone, here is my implementation of the Progressive Growing of GANs from Nvidia Research: https://github.com/Latope2-150/Progressive_Growing_of_GANs-PyTorch

The original paper is this one: Progressive Growing of GANs for Improved Quality, Stability, and Variation

For now, there is only an example of MNIST but it is not very complicated to adapt it to other datasets. I haven't had the time to train it on large datasets but I have tested it on 320x320 images so I know it works for higher resolutions.

This implementation is as close as possible to the original one in its default configuration but can easily be modified. I trained it on a single Nvidia Tesla P100 and I still need to add [efficient] multi-GPU training.

Future work includes testing GroupNorm as normalization, making it conditional, changing the loss function (WGAN-GP for now), etc.

If you have any questions, feel free to ask!

7 Comments
2018/05/08
18:42 UTC

3

[D] Why is Z-dimension for GANs usually 100?

1 Comment
2018/04/29
20:21 UTC

5

Specific resources to learn GANs?

I was just wondering what is the standard resource that people refer to when learning about GANs?

Thanks!

0 Comments
2018/04/17
00:30 UTC

3

StackGAN + CycleGAN = Text guided image-to-image translation?

I am looking to build a model that implements a version of text guided image translation.

For example, an image of a man + "walking" --> Image of man walking. Or something even simpler, but you get the basic idea. I am unable to find any existing research for this. Any suggestions/ new ideas will be very helpful :)

3 Comments
2018/04/15
17:10 UTC

4

[P] Deep Pensieve™ - The 2017 Great American Eclipse Roadtrip

0 Comments
2018/04/01
15:18 UTC

2

[D] State of art models for Image captioning using GANs?

2 Comments
2018/04/01
07:54 UTC

8

[D] Stabilizing Training of GANs Intuitive Introduction with Kevin Roth (ETH Zurich)

0 Comments
2018/04/01
06:47 UTC

1

[Hobbyist] Generating baseball players from outlines with pix2pix

This is a project I played around with using affinelayer's pix2pix implementation. The goal was to generate baseball player headshots with an eye towards using them in the Out of the Park computer games for fictional players. I didn't quite get that far into it, but I did get some interesting results. You can see a sample of the system running on held-out test data here.

In most cases, pix2pix is able to correctly impute a variety of features of the original image from only a rough black-and-white sketch. It colors old-timey pictures black and white, it usually (not always) correctly colorizes hats based on team logos, and can often make a reasonable guess of a player's skin color. There are a handful of failure cases in the bunch, although some of them are failure cases of the process I used to generate the outlines.

The data set I used is a compilation of thousands of photos of almost everyone who's ever played Major League Baseball, available here. Photos of modern players are very consistently framed, but as you go back in time, you get more and more variety. Some players from the 1800s are merely sketches or extremely grainy, low-resolution blurs. I generated the training outlines using imagemagick's edge detector, although I think I need to tune the settings a bit to get a more consistent output - a few players came out almost completely blank.
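
To give a rough idea of the outline-generation step, here is a small sketch in plain Python. It is not the pipeline described above: the post used imagemagick's edge detector with settings that aren't given, and this stand-in uses a Sobel gradient with a threshold instead. The function name, the toy image, and the threshold values are all made up for illustration.

```python
def sobel_outline(img, threshold):
    """Return a binary outline (1 = edge) for a 2D list of grayscale values."""
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # vertical gradient kernel
    out = [[0] * w for _ in range(h)]
    # Border pixels are left as 0 because the 3x3 window does not fit there.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                out[y][x] = 1
    return out

# Toy 6x6 "photo": dark left half, bright right half.
photo = [[0, 0, 0, 255, 255, 255] for _ in range(6)]

outline = sobel_outline(photo, threshold=128)
blank = sobel_outline(photo, threshold=2000)  # threshold set too high

for row in outline:
    print(row)
```

The second call shows how a poorly chosen threshold can produce an empty outline, which is one plausible way for some inputs to come out almost completely blank.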

For reference, the original pix2pix paper is here

2 Comments
2018/04/01
05:40 UTC
