The Checkpoint Issue #8 – StyleGAN sliders and lions with hands
News and articles
Google Imagen – Google recently announced Imagen, a text-to-image AI model that creates more realistic images than DALL-E 2, the previous state of the art.
Open source version of Imagen started – Sadly, Google opted not to release the code or model due to bias fears. Happily, Lucidrains (who was also behind the DALL-E 2 implementation) has begun an open source version. Follow the progress at the link.
LAION Aesthetic – A subset of the LAION 5B dataset that only contains aesthetic images (as determined by AI, of course). Instructions for using it are on the GitHub page, and it's also available in Majesty Diffusion.
CogVideo – An impressive-looking new model for text-to-video generation. No code as yet, but you can try out CogView, the still-image version, here. Plus, the lion drinking water is hilarious.
Flexible diffusion modelling of long videos – Another video creation model, but this one can generate photorealistic, coherent videos that last over an hour after being given just a few starting frames. Paper.
Featured notebook – Pixel Alchemist
PixelAlchemist is a fun interface to StyleCLIP that lets you edit images in real time with custom prompts and handy controls. The notebook comes with a bunch of different models, including FFHQ, churches, cars, animals and posters, but it's also possible to load your own.
Dragging a slider and seeing the effect immediately really brings home how powerful these models are, and it's a lot of fun to experiment with.
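For the curious, here is a rough idea of what a slider like this usually does under the hood. This is a toy sketch and not PixelAlchemist's actual code: it assumes the common latent-direction approach, where the slider value scales an edit direction that is added to the image's latent vector before the generator redraws it. The function name `apply_slider`, the tiny 4-number latent and the direction values are all made up for illustration.

```python
from typing import List


def apply_slider(latent: List[float], direction: List[float], strength: float) -> List[float]:
    """Move a latent vector along an edit direction, scaled by the slider value.

    Hypothetical helper: in a real tool the result would be fed to the
    generator to render the edited image.
    """
    if len(latent) != len(direction):
        raise ValueError("latent and direction must have the same length")
    return [w + strength * d for w, d in zip(latent, direction)]


# Toy 4-dimensional latent and edit direction (real latents are far larger,
# and the direction would come from a text prompt rather than being hand-written).
latent = [0.5, -1.0, 0.25, 2.0]
direction = [1.0, 0.0, -1.0, 0.5]

# Sweep the "slider" from negative to positive strength.
for strength in (-2.0, 0.0, 2.0):
    print(strength, apply_slider(latent, direction, strength))
```

A strength of zero leaves the latent untouched, which is why the slider's midpoint gives you back the original image, while negative values push the edit in the opposite direction.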
Get it:
Thanks for reading!
If you have anything you’d like to be featured or want to get in touch, give me a shout on Twitter or via email. Please also consider supporting me on Patreon so that I can spend more time creating content like this.