Since the launch of ChatGPT, I can’t remember a meeting with a prospect or customer where they didn’t ask me how they can leverage generative AI for their business. From internal efficiency and productivity to external products and services, companies are racing to implement generative AI technologies across every sector of the economy.
While GenAI is still in its early days, its capabilities are expanding quickly — from vertical search, to photo editing, to writing assistants, the common thread is leveraging conversational interfaces to make software more approachable and powerful. Chatbots, now rebranded as “copilots” and “assistants,” are the craze once again, and while a set of best practices is starting to emerge, step 1 in developing a chatbot is to scope down the problem and start small.
A copilot is an orchestrator, helping a user complete many different tasks through a free text interface. There are an infinite number of possible input prompts, and all should be handled gracefully and safely. Rather than setting out to solve every task, and run the risk of falling short of user expectations, developers should start by solving a single task really well and learning along the way.
At AlphaSense, for example, we focused on earnings call summarization as our first single task, a well-scoped but high-value task for our customer base that also maps well to existing workflows in the product. Along the way, we gleaned insights into LLM development, model choice, training data generation, retrieval augmented generation and user experience design that enabled the expansion to open chat.
LLM development: Choosing open or closed
In early 2023, the leaderboard for LLM performance was clear: OpenAI was ahead with GPT-4, but well-capitalized competitors like Anthropic and Google were determined to catch up. Open source held sparks of promise, but performance on text generation tasks was not competitive with closed models.
To develop a high-performance LLM, commit to building the best dataset in the world for the task at hand.
My experience with AI over the last decade led me to believe that open source would make a furious comeback and that’s exactly what has happened. The open source community has driven performance up while lowering cost and latency. LLaMA, Mistral and other models offer powerful foundations for innovation, and the major cloud providers like Amazon, Google and Microsoft are largely adopting a multi-vendor approach, including support for and amplification of open source.
While open source hasn’t caught up in published performance benchmarks, it’s clearly leap-frogged closed models on the set of trade-offs that any developer has to make when bringing a product into the real world. The 5 S’s of Model Selection can help developers decide which type of model is right for them: