Adobe’s suite of photo and video editing software has long leveraged machine intelligence to help its human users do their jobs, having employed the Sensei AI system for more than a decade to power features like Neural Filters in Photoshop and Acrobat's Liquid Mode. On Tuesday, Adobe revealed its next generation of AI features, a family of generative models the company has collectively dubbed Firefly — the first of which will generate both images and font effects.
“Generative AI is the next evolution of AI-driven creativity and productivity, transforming the conversation between creator and computer into something more natural, intuitive and powerful,” said David Wadhwani, president of Adobe’s Digital Media Business, in Tuesday’s release. “With Firefly, Adobe will bring generative AI-powered ‘creative ingredients’ directly into customers’ workflows, increasing productivity and creative confidence for all creators from high-end creative professionals to the long tail of the creator economy.”
With it, would-be digital artists are no longer limited by their sub-par dexterity or sheer lack of artistic talent — they will be able to speak into existence professional-quality illustrations using only the power of their words. And it's not just text-to-image — Firefly’s multimodal nature means that audio, video, illustrations and 3D models can all be generated via the system and enough verbal gymnastics.
The first model of the Firefly family is, according to the company, trained on "hundreds of millions" of images from Adobe's Stock photo catalog, openly licensed content and public domain works — a provenance that should help Adobe avoid the sort of copyright litigation Getty Images brought over Stable Diffusion. It also helps ensure that Stock photographers and artists will be compensated for the use of their works in training these AIs.
Engadget was afforded a brief preview of the system ahead of Tuesday’s announcement. The input screen, where users will enter their text-based prompt to the system, features a curated selection of generated pieces as well as the prompts that instigated them. These serve to highlight the model's generative capabilities and inspire other users to explore the bounds of their machine-assisted creativity.
Once the user inputs their text prompt (in this case, Adobe’s PR used an adult standing on a beach with a double exposure effect using images derived from Adobe’s Stock photo database), the system will return around a half dozen initial image suggestions. From there, the user can select between popular image styles and effects, dictate their own edits to the prompt, collaborate with the AI and generally fiddle with the highly steerable process until the system spits out what they’re looking for. The resulting image quality was nearly photorealistic, though none of the images from the demo featured hands, so we weren't able to count fingers for accuracy.
Initially, the trained image database will be Adobe's own licensed Stock library, though the company is looking into allowing individual users to incorporate their own portfolios as well. This should allow photographers with established styles to recreate those aesthetics within the model, so that what it generates fits the user's existing motif. The company did not provide a timeline for when that might happen.
The first model also has a sibling feature that can create customized font effects and generate wireframe logos based on scanned doodles and sketches. It’s all very cool, but it could put an unconscionable number of digital artists out of work if it were to be misappropriated. Adobe’s Content Authenticity Initiative (CAI) seeks to prevent that from happening.
The CAI is Adobe’s attempt to establish some form of ground rules in this new Wild West industry of Silicon Valley. It is a set of proposed industry operating standards that would establish and govern ethical behaviors and transparency in the AI training process. For example, the CAI would create a “do not train” tag that works on the same basic principle as robots.txt does. That tag would be persistent, remaining with the art as it moves through the internet and signaling to anyone who comes across it that the creator does not want the work used for AI training. So far around 900 entities worldwide, "including media and tech companies, NGOs, academics and others," per the release, have signed on to the plan.
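For readers unfamiliar with the analogy: robots.txt is a plain-text file at a site's root that tells well-behaved crawlers which paths to skip. A sketch of what that looks like today — the key difference being that the CAI's proposed tag would travel with the individual asset's metadata rather than sit on a web server:

```
# robots.txt — served at https://example.com/robots.txt
# Well-behaved crawlers read this before scraping the site.

User-agent: *          # applies to all crawlers
Disallow: /portfolio/  # ask crawlers not to fetch this path
```

Compliance with robots.txt is voluntary, which is why the CAI frames its tag as an industry standard that signatories agree to honor; how the tag would be embedded and enforced in practice has not been detailed in the release.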