
Startups to Watch: Unstructured.io makes artificial intelligence tools easier for the average user


Brian Raymond, CEO, Unstructured.io
Courtesy of Unstructured.io

Each year, Sacramento Business Journal Inno reporter Mark Anderson identifies the top local startups set to make waves in the year ahead. Unstructured.io is one of 11 that made the cut in 2024.


Unstructured.io

Launched in July 2022, Sacramento-based Unstructured.io is developing technology that makes it easier for users to tap the promise of large language models like ChatGPT-4 by creating clean, curated data for them to use.

Unstructured's technology makes it easier for customers to access their data regardless of file type, document location or layout, so they can put artificial intelligence to better use. The company raised a $25 million round in July 2023 led by Seattle venture capital firm Madrona Venture Group, with participation from New York-based Bain Capital Ventures and San Francisco-based M12 Ventures (formerly Microsoft Ventures), among others.

At the time of the funding, Unstructured had 100 customers. Three months later, the company had more than 1,000 customers using its open-source product, said founder and CEO Brian Raymond.

Unstructured helps users put their own digital information into a format that is readable by programs that use generative artificial intelligence, like OpenAI’s ChatGPT-4. The popular large language model, which can create new content with generative artificial intelligence, was released in March 2023.

Unstructured says some 80% of enterprise data lives in formats like HTML, PDF, PNG and CSV, which are difficult for ChatGPT-4 to access unless the data is transferred manually. The alternative to using Unstructured is literally loading documents by hand.
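For readers curious what that conversion looks like, here is a minimal sketch using only the Python standard library. It is not Unstructured's code or API; the function names and the `source`/`text` record layout are hypothetical, chosen only to show how two different file types might be reduced to one uniform shape that a language model pipeline could consume.

```python
import csv
import io
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_records(source, markup):
    """Hypothetical helper: one record per visible text run in an HTML page."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return [{"source": source, "text": chunk} for chunk in parser.chunks]


def csv_to_records(source, raw):
    """Hypothetical helper: one record per CSV row, rendered as 'column: value' pairs."""
    reader = csv.DictReader(io.StringIO(raw))
    return [
        {"source": source, "text": "; ".join(f"{k}: {v}" for k, v in row.items())}
        for row in reader
    ]
```

Both helpers return the same kind of record, so downstream code does not need to know whether the text started as a web page or a spreadsheet. PDFs and images would need additional tooling, such as layout analysis or optical character recognition, that this sketch leaves out.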

Also, ChatGPT-4 knows only publicly available information about a user, the user's documents or the user's organization. To get better performance from the platform, a user must load the right information into a private vector database, which stores a collection of data outside public view. The user then directs the large language model to draw on that database. Going a step further, a user can make the software account for its reasoning by citing its sources and data in a bibliography.
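The retrieve-then-cite pattern described above can be sketched in a few lines of Python. This is an illustration under loud assumptions, not how Unstructured or any particular vector database works: `embed` is a toy bag-of-words stand-in for a real embedding model, and `TinyVectorStore` and `build_prompt` are hypothetical names invented for this example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts, standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store of (source, text, vector) rows."""

    def __init__(self):
        self._rows = []

    def add(self, source, text):
        self._rows.append((source, text, embed(text)))

    def search(self, query, k=2):
        """Return up to k (score, source, text) rows with nonzero similarity, best first."""
        q = embed(query)
        scored = [(cosine(q, vec), source, text) for source, text, vec in self._rows]
        scored = [row for row in scored if row[0] > 0]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:k]


def build_prompt(store, question, k=2):
    """Assemble a prompt that restricts the model to retrieved, numbered sources."""
    hits = store.search(question, k)
    context = "\n".join(
        f"[{i}] ({source}) {text}" for i, (_, source, text) in enumerate(hits, 1)
    )
    return (
        "Answer using only the numbered sources below and cite them by number.\n"
        f"{context}\n"
        f"Question: {question}"
    )
```

The resulting prompt would be sent to the language model in place of the bare question. Numbering each retrieved passage and naming its source file is what lets the model's answer point back to specific documents.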

The company’s team of 35 employees works fully remotely, Raymond said. Almost all of the employees are engineers or data scientists.

