As generative AI continues to dominate the headlines, it’s sometimes hard to find actual working business use cases among the hype. Writer is a San Francisco startup that is working to create generative AI writing products with the enterprise in mind. Today, the company announced a new capability for its Palmyra model, called Palmyra-Vision, that generates text from images, including graphs and charts.
May Habib, company co-founder and CEO, says that they made a strategic decision to concentrate on multimodal content, and being able to generate text from images is part of that strategy. “We are going to be focused on multimodal input, but text output, so text generation and insight that is delivered via text,” Habib told TechCrunch.
By following that guiding star, the company decided to analyze images rather than produce them (at least for now). Habib reserves the right to generate charts and graphs from data at some point, but that’s not something the company is doing at the moment. This particular release is focused on generating text from those kinds of images.
The company uses a multi-model approach to produce the Palmyra-Vision results, where each model has a specific job to do in determining what is in the image and then generating the text, with 99.99% accuracy (“four nines”), according to Habib.
The capability has a number of use cases. An e-commerce website could generate text from thousands of changing images to populate the site with the latest merchandise without having a human keep up with every change, and a business could automatically extract key takeaways from charts and graphs. Another example is compliance checking. For instance, a pharmaceutical company could use Palmyra-Vision to perform an automated FDA compliance check against ad copy, making sure that the ad is compliant with FDA regulations as outlined in an associated document, as in the example below.
Finally, the product can interpret handwritten notes and summarize them as text, but Habib says that this requires training the model for individual use cases, such as medical or insurance, to reach the necessary accuracy.
Habib says that she does not recommend using these tools without human review as part of the workflow. She believes this is absolutely essential because any model can hallucinate (make things up) or simply get facts wrong, and it’s important to have people checking the results. While the company recommends this to every customer, and most understand it at this point, she believes that it will eventually take a more automated workflow to make that review happen consistently across customers, something she says the company is working toward.
The company has raised $126 million to date, per Crunchbase data, and is currently talking to the big cloud infrastructure platforms about partnering as it attempts to scale. Its most recent round was a $100 million Series B last September led by Iconiq.
The latest Palmyra release, with the image-to-text capabilities, is available starting today.