Meta unveils Emu Edit: Precise image editing via text instructions
Existing systems struggle to interpret edit instructions correctly. Emu Edit tackles this through multi-task training.

You can predict disease progression by modeling health data in latent space
Forecasting personalized disease progression by modeling clinical data in a latent space

Researchers taught GPT-4V to use an iPhone and buy things on the Amazon app
It's still early, but a GPT-4V agent can navigate smartphone GUIs using a combination of image processing and text-based reasoning.

AI can discover hidden relationships in tabular data
Identifying new classes within unlabeled data sets

LCMs are a new way to generate high-quality images much faster
LCMs achieve similar quality results to LDMs, but in just 1-4 steps instead of hundreds.

They found a new NeRF technique to turn videos into controllable 3D models
Creating realistic, animated 3D models from video footage has been a longstanding challenge in the field of computer graphics due to the complexity of human movement and the subtleties of appearance under varying conditions. Traditionally, this process has relied on costly and labor-intensive techniques such as multi-camera setups and detailed

Diffusion might be a better way to model probability in PPLs
DMVI uses diffusion models to approximate the probability distributions for faster, more accurate automated inference.