Friday, November 22, 2024

Nvidia’s new tool lets you run GenAI models on a PC

Nvidia, ever keen to incentivize purchases of its latest GPUs, is releasing a tool that lets owners of GeForce RTX 30 Series and 40 Series cards run an AI-powered chatbot offline on a Windows PC.

Called Chat with RTX, the tool allows users to customize a GenAI model along the lines of OpenAI’s ChatGPT by connecting it to documents, files and notes that it can then query.

“Rather than searching through notes or saved content, users can simply type queries,” Nvidia writes in a blog post. “For example, one could ask, ‘What was the restaurant my partner recommended while in Las Vegas?’ and Chat with RTX will scan local files the user points it to and provide the answer with context.”

Chat with RTX defaults to AI startup Mistral’s open source model but supports other text-based models, including Meta’s Llama 2. Nvidia warns that downloading all the necessary files will eat up a fair amount of storage — 50GB to 100GB, depending on the model(s) selected.

Currently, Chat with RTX works with text, PDF, .doc, .docx and .xml formats. Pointing the app at a folder containing any supported files will load those files into the dataset the model draws on when answering (Nvidia describes the approach as retrieval-augmented generation rather than fine-tuning). In addition, Chat with RTX can take the URL of a YouTube playlist and load transcriptions of the videos in the playlist, enabling whichever model is selected to query their contents.
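To make the folder step concrete, here is a minimal sketch of how an app might gather the supported files from a folder. It is not Nvidia's code: the function name `find_supported_files` and the extension set (including the guess that "text" means `.txt`) are assumptions made for illustration.

```python
from pathlib import Path

# Assumed mapping of the formats named in the article to file extensions.
SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".doc", ".docx", ".xml"}


def find_supported_files(folder):
    """Return sorted paths under `folder` whose extension is in the supported set."""
    root = Path(folder)
    return sorted(
        path
        for path in root.rglob("*")  # walk the folder and its subfolders
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `find_supported_files("notes")` on a hypothetical `notes` folder would return the documents the app could then index, skipping images and other unsupported files.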

Now, there are certain limitations to keep in mind, which Nvidia, to its credit, outlines in a how-to guide.

Chat with RTX

Image Credits: Nvidia

Chat with RTX can’t remember context, meaning that the app won’t take into account any previous questions when answering follow-up questions. For example, if you ask “What’s a common bird in North America?” and follow that up with “What are its colors?,” Chat with RTX won’t know that you’re talking about birds.
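The limitation is easier to see in a small sketch. The helper below, `build_prompt`, is hypothetical and not part of Chat with RTX; it only illustrates the difference between sending a question on its own and sending it along with the earlier turns of the conversation.

```python
def build_prompt(question, history=None):
    """Join earlier turns (if any) and the new question into one prompt string."""
    lines = []
    for past_question, past_answer in history or []:
        lines.append(f"User: {past_question}")
        lines.append(f"Assistant: {past_answer}")
    lines.append(f"User: {question}")
    return "\n".join(lines)


# Without history, the follow-up arrives alone and "its" has nothing to refer to.
stateless = build_prompt("What are its colors?")

# With history, the earlier exchange travels with the follow-up.
# The assistant's answer here is a placeholder, not real model output.
with_context = build_prompt(
    "What are its colors?",
    history=[("What's a common bird in North America?", "(earlier answer)")],
)
```

An app that behaves as the article describes would be sending something like `stateless`: the model only ever sees the latest question.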

Nvidia also acknowledges that the relevance of the app’s responses can be affected by a range of factors, some easier to control for than others, including the question phrasing, the performance of the selected model and the size of the dataset the app draws on. Asking for facts covered in a couple of documents is likely to yield better results than asking for a summary of a document or set of documents. And response quality will generally improve with larger datasets, as it will when Chat with RTX is pointed at more content about a specific subject, Nvidia says.

So Chat with RTX is more a toy than anything to be used in production. Still, there’s something to be said for apps that make it easier to run AI models locally — which is something of a growing trend.

In a recent report, the World Economic Forum predicted a “dramatic” growth in affordable devices that can run GenAI models offline, including PCs, smartphones, Internet of Things devices and networking equipment. The reasons, the WEF said, are the clear benefits: Not only are offline models inherently more private — the data they process never leaves the device they run on — but they’re lower latency and more cost-effective than cloud-hosted models.

Of course, democratizing tools to run and train models opens the door to malicious actors — a cursory Google Search yields many listings for models fine-tuned on toxic content from unscrupulous corners of the web. But proponents of apps like Chat with RTX argue that the benefits outweigh the harms. We’ll have to wait and see.

