AI2 Logo
incubator logo

Menu

edges
Insights

Insight 11: Peak of Inflated Expectation?

June 21   Vu Ha Vu Ha
Cover Image
It has been 6 months since Insight 10 (and ChatGPT’s release), so we have a lot to catch up on. In this edition we first cover updates around the AI2 Incubator and the Pacific Northwest’s pre-seed startup community. We then give a whirlwind tour of major developments in AI technologies in the first half of 2023, focusing agents, community models, AI infra/tooling, and AI regulations, sprinkling in our advice to early-stage founders how to surf this latest wave of AI advances. A quick summary:
  • Pacific Northwest’s startup incubators, studios, and pre-seed funds raised $86M in May & June 2023.
  • GPT-4 powered agents don’t work reliably yet. For high-stake scenarios, developers need to reduce ambition/scope and prepare to go all-in across the fast-moving AI stack.
  • AI Infrastructure and tooling opportunities for new startups are limited.
  • Specialized foundation models can be preferable to general-purpose ones. Specialization can be based on a domain (e.g. healthcare), or modality (e.g. audio).
  • Startup pick: Poolside.
  • Research paper pick: A cookbook of self-supervised learning (by Meta AI).

AI2 Incubator and PNW startup update

On May 9, the AI2 Incubator announced a new $30M fund that tripled our previous fund, backed by returning investors Madrona and Sequoia Capital as well as new investors such as Vinod Khosla and Evergreen Ventures. The day before, Ascend, headed by AI2 Incubator’s investor in residence, Kirby Winfield, shared the news of a $25M fund for pre-seed startups. Then on May 18, Pioneer Square Labs unveiled their third, $20M fund. On June 21, Madrona Venture Labs announced a $11M fifth fund. Founders in the Pacific Northwest have great options to consider as they aim for the first major milestone: seed funding.
During the pandemic lockdown, the AI2 Incubator started to look beyond the Pacific Northwest region. This allowed us to discover and partner with founders from various parts of the country. Two of the most recent Incubator graduates are based in San Diego and New York, respectively. We will continue this path as we believe our community benefits from a more diverse group of entrepreneurs.
In the first half of 2023, AI2 Incubator startups continue to make steady progress.
  • On April 6, Benji Barash and Yves Albers-Schoenberg announced a $4.8M seed round for Roboto. Coming from decades of experience in robotics and AI, including recent stints at Amazon Prime Air and AWS, Benji and Yves are building a copilot for robotics engineers working with multimodal sensor and log data. The round was led by Unusual Ventures with participation from Seattle-based FUSE.
  • On April 19, the Lexion.ai team announced a $20M series B led by Point72 ventures, with Citi ventures participating. Lexion is thus the first AI2 Incubator alum that reached the series B milestone, having tripled revenue in 2022.
  • On May 16, Bill Lennon, Matt Busigin, and Max Fergus launched HeyOllie.ai. Ollie is ChatGPT for gifts, a shopping copilot that helps you find the perfect gift. Bill, Matt, and Max have an ambitious vision to fundamentally change the way we shop. Together with another still stealthy alum, we now have two companies going after the CoPilot model for different use cases.
  • On June 13, WhyLabs launched LangKit, an LLM observability toolkit that helps detect issues that may arise in LLM applications such as jailbreaks, prompt injection, refusals, and toxic behaviors.
Last, but not least, on May 11, the Allen Institute for AI announced OLMo, an “open language model made by scientists, for scientists”. Everything created for “OLMo will be openly available, documented, and reproducible, with very limited exceptions and under suitable licensing.” The artifacts released as part of the OLMo project will include training data, code, model weights, intermediate checkpoints, and ablations. AI2 announced that Ali Farhadi is returning to the institute as the CEO, effective July 31. Ali co-founded XNOR, an early Incubator company that was acquired by Apple in 2020 for $200M.

AI overview at ChatGPT’s 6-month mark

It is difficult to capture the progress in AI in the last month in a newsletter, yet Matt Turck did in a single tweet.
ChatGPT turns 6 months old today! It's time to take a look at its early impact:
  • 245,064 new AI newsletters
  • 1 trillion Twitter threads about prompt engineering
  • 27 VC firms who decided "if only there was a big AI meetup in NYC! Let's create one"
  • 1 company per minute pivoting from Web3
  • 6,523 VC market maps of the AI ecosystem
  • 17,547 new VC podcasts about AI
  • YC expands batch size to 9,000 Generative AI startups
  • 100% of VC firms claiming to have "always been big believers in AI
In other words, it has been a breathless six months of non-stop AI news, startup activities, chatter about AGI and doomsday scenarios, and a sense of overwhelming noise. Here is our attempt to capture and summarize the key moments, without the aid of AI.
  • December

    2022:

    ChatGPT

    was launched.

    OpenAI

    showed the way, again.
  • January/February

    2023: It is big tech’s turn.

    Microsoft

    launched

    Bing chat

    , powered by the soon-to-be-announced GPT-4.

    Google

    issued code red and announced the intent (!) to launch

    Bard

    .

    Meta

    emerged as a major proponent of open-source AI, with LLaMA as a prime example.

    LangChain

    , one of the most popular open-source AI application frameworks, quickly raised $10M seed funding from Benchmark (and a later up round shortly led by Sequoia Capital).
  • March

    2023: More goodies from

    OpenAI

    :

    GPT-4

    and

    ChatGPT

    plugins

    .

    Anthropic

    announced

    Claude

    , the only remotely competitive alternative to ChatGPT (Bard was still an announcement with a waitlist at this point).
  • April

    2023:

    Agents

    ! BabyAGI and Auto-GPT started and quickly exploded.

    YC

    Winter 2023 demo days packed with AI foundation model wrappers and AI tools with early pilots being their batch mates.
  • May

    2023: Senate’s hearing on

    AI regulation

    . Google (IO) and Microsoft (Build) surprised no one with their all-out bets on AI. Bard is finally available.
  • June

    2023:

    Apple Vision Pro

    . Apple think different, showing that tech is more than just AI.
There is a lot to unpack here. In the rest of this newsletter, we will dive deep into four topics: 1) AI tools/plugins/agents, 2) foundation model ecosystem, 3) AI infrastructure and tooling, and 4) AI safety and regulation. We will try to focus on aspects that are relevant to early-stage startup founders.

Tools/plugins/agents: use at your own risk

Auto-GPT was first released on March 30, by Toran Bruce Richards, the founder of gaming company Significant Gravitas. In about three weeks, the project garnered 100K GitHub stars. Today, it has 140K GitHub stars and counting. For context, LangChain currently has about 48K stars. Many wondered what happened. Is AGI nigh?
To explore the answer to this question, let us go back to September 2022, the early days of the Generative AI era, when Nat Friedman and Daniel Gross announced AIGrant.org, a grant program that backs entrepreneurs to start companies taking advantage of AI advances. Nat and Daniel issued the following challenge:
Researchers have raced ahead. It's time for entrepreneurs to catch up!

Tools

A sense of an impending revolution was palpable in the Fall of 2022. When ChatGPT was unveiled in December, the AI zeitgeist was elevated to another notch. ChatGPT captured the imagination of the public, not just entrepreneurs. Yet as mind-blowing as the ChatGPT experience is for many, we all saw one glaring limitation. ChatGPT would respond to questions about recent events (e.g., who won the 2022 world cup) with this response:
As an AI language model, I don't have real-time information or access to the internet to provide you with the latest news. My knowledge was last updated in September 2021 …
Wouldn’t it be nice if ChatGPT were given a

tool

to access the latest information on the Web? The obvious answer was Bing chat, launched on February 7. Great, how about we give ChatGPT access to other tools such as calculators, machine translators, calendars, etc.? On February 9, Meta AI submitted a paper to arXiv titled “Toolformer: Language Models Can Teach Themselves to Use Tools”, that suggested just exactly that. This paper was tweeted by Yann Lecun, and subsequently caught the attention of the now rapidly growing AI tinkerer community. Are we set? Can ChatGPT now use tools? If the answer is yes, imagine the possibilities of building AI products that go from giving me the answer/insight to doing stuff for me!
On February 15, Meta AI submitted to arXiv a survey of research on language models with tools (title: Augmented Language Models: A Survey). This is a fantastic survey that paints a cautious outlook on techniques such as retrieval augmented generation, chain-of-thought reasoning, and ReACT (Synergizing Reasoning and Acting in Language Models). One TL; DR is that they don’t work reliably yet, as the field is still early. Practitioners should carefully evaluate to make sure they meet performance expectations in intended scenarios. OpenAI released ChatGPT with tools, aka plugins, on March 23, after extensive development and testing work.

Agents

The first sighting of language models with tools however was as early as November 23, 2022, when Harrison Chase introduced the concept of

agents

in LangChain. Never mind the odd example, "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?", the LangChain community has now accepted Nat’s and Daniel’s challenge, even racing ahead of researchers to the very bleeding edge of AI.
When GPT-4 was unveiled in mid-March, the AI tinkerer community kicked up their ambition to another level with Auto-GPT and BabyAGI. Give GPT-4 a high-level goal, and watch it reason, plan, and execute a complex sequence of tasks using a variety of tools autonomously. AI researchers have now been left so far behind by AI tinkerers that it could take them decades to catch up!
To illustrate this point, I conducted the experiment of asking prominent AI models a simple question: When was the Toolformer paper submitted to arXiv? GPT-4-powered Bing Chat gave me the 4/10/2023 date for a different paper (Graph-Toolformer: To Empower LLMs with Graph Tools), Bard produced an answer for a fictitious paper (The paper "Toolformer: Self-Supervised Learning for External Tool Control'' by Meta AI was published on June 8, 2023). Only perplexity.ai gave the correct answer (2/9/2023). Disclaimers about anecdotes aside, should we trust AI to autonomously research proteins or perform sales prospecting? Do I even trust AI to write a newsletter like this autonomously?

Advice to founders with respect to agents

My view on agents is thus long term bullish but short term cautious; AGI is

not

nigh. My advice to founders is the following:
  • Be

    wary

    of building CoPilot products for scenarios where there is zero/low tolerance for inaccuracies/hallucination. Apply the principle of Minimal Algorithmic Performance (MAP) judiciously as the technical feasibility gap could simply be too large to overcome.
  • To maximize the chance of getting to the required performance,

    specialize

    and

    focus

    . Pick a niche domain where you can take advantage of its idiosyncrasies to get to the performance the customers demand. With enough constraints and simplifications, your agents can improve their chances of success. This of course must be balanced with the need to address a sufficiently large market. This is not easy, but building a successful startup is never easy, to say the obvious.
  • In balancing between specialization and market size in the early stage,

    bias on the side of specialization

    . You need to solve a real customer problem well—give them a 10x better tool—to gain a wedge before generalizing to a broader audience/market. This new wave of AI advances gave me the confidence that generalization in this context is very much doable (see the discussion on task-centric AI in Insight #4).
  • Even with specialization and focus, prepare to be

    all-in across the emerging AI stack

    , all the way to fine-tuning or even pre-training your own domain-specific foundation models if necessary. Don’t rely on an orchestration library (e.g., LangChain), a shallow tech provider (e.g., vector databases), or a prompt whisperer (e.g., Riley Goodside) to magically solve the technical challenges. As investors, we look for teams that have what it takes to do this. We don’t ask questions about moats which are meaningless in early-stage startups. You need to go the last mile in UX, product, go to market, and of course, AI.
If you want to follow AI researchers who share our views around current AI agents and their deep-rooted limitations, I recommend Yann Lecun (e.g., this tweet) and Subbarao Kambhampati (e.g., this tweet). Lecun in particular has a vision to address these limitations. Under his leadership, Meta AI has made some early progress toward this vision.

Is it the iPhone moment for AI?

Many in our industry, founders and investors alike, believe that the answer is an empathetic yes. Personally, I am not sure. The iPhone created or reinvented multiple industries with big winners because it transformed the cell phone into a mobile computing platform for transportation, entertainment, gaming, etc. Imagine an Uber ($88B) or a TikTok ($300B) that required laptops! Mobile gaming is now a $100B industry. The iPhone made those success stories possible due to a confluence of advances on multiple fronts: GPS, touch interface, affordable mobile broadband, SoC (system-on-chip) design innovations, battery technologies, etc. We are still in the early innings of exploring the impact of foundation models in the business world. I suspect that we will see the first wave of significant achievements once researchers have made some progress in their efforts to catch up with Auto-GPT.

BYOM (Bring Your Own Model): startup and paper picks

GPT-4 was a watershed moment in AI. It achieved state of the art performance in pretty much any popular AI benchmark one can think of, often beating prior SOTA by large margins. So why did I suggest fine-tuning/pre-training your own foundation models? The reason is the old “jack of all trades” wisdom about specialization. By specialization, we can go the last mile in terms of performance, latency, cost efficiency, security, continuous fine-tuning, etc., the types of considerations that matter a lot to customers. It is possible to work with OpenAI’s (or Microsoft’s/Google’s) models to accomplish these goals. There will be (a lot of) cases where that is the better choice. But the ones where the BYOM approach is superior are the opportunities where startups can build deep defensibility, against both incumbents and other startups. Wrappers-be-gone.
A crucial factor that enables the BYOM approach is the rapidly moving community of open-source foundation model developers, catalyzed by the leaked LLaMA model from Meta AI (although we started covering open-source models more than a year ago in Insight #7). This led to the now famous “No Moat” essay, supposedly authored by a Google engineer who wrote that OSS models will soon catch up with closed source ones such as OpenAI’s and Google’s. The author made several good points, but also an oversight around how big a gap there is between GPT-4 and the rest of the world (Google’s models included). For entrepreneurs, we should not care that a future OSS model will match OpenAI’s crown jewels in AGI. We should care that we have the tools, infrastructure, and know how to beat OpenAI in one narrow domain that matters to our customers. Snorkel’s Alex Ratner coined the phrase GPT-You to describe this.
One vector of specialization for foundation models is the industry that they target. For finance, there are BloombergGPT (proprietary) and FinGPT (OSS). For healthcare, there’s Hippocratic AI (announcement only). For e-commerce, there’s Meta AI’s CommerceMM. For software, there’s

Poolside, our startup pick

(announcement only). As discussed in Insight #10, we believe generative AI for software is a big opportunity. We have an exciting project in this space at the Incubator and look forward to experimenting with Poolside’s code-specific foundation model. Poolside’s positioning is interesting: building a foundation model that aims to be the foundational layer for (the future of) software. “We are going after narrow AGI through software and code,” Jason Warner (GitHub’s former CTO and Poolside’s co-founder) told TechCrunch.
Another vector of specialization is modality, e.g., Stable Diffusion for images and Meta AI’s Voicebox for speech. (Note how I carefully chose the term foundation models, which are a superset of LLMs).
It is interesting to see Meta carrying the flag for the open-source AI community. They recently shared a cookbook of self-supervised learning containing chockfull of hard learned lessons building foundation models. For this reason, we selected it as the

research paper pick

for this Insight.

AI infrastructure and tools: the bear case

In my chats with other VCs, there's a worry that LLMOps will become the new MLOps.
Aaron Yang (@IAmAaronYang) commented on YC Winter 2023’s LLMOps crop
In a gold rush, the conventional wisdom is to start a pick and shovel (P&S) business. When it comes to technology startups, this is a dangerous analogy for the following reasons.
  • The gold rush was a short-lived phenomenon. P&S businesses made some quick bucks, but none went on to become an enduring company.
  • AI is moving at a break-neck pace, making it challenging to identify/predict where the miners are going (or will they change jobs at some point). A year ago, Andrew Ng was out evangelizing data-centric AI, a concept that now seems quaint in a post-GPT-4 world fascinated with agents, vector databases, and prompt engineering. Eleven months ago, TechCrunch covered Tecton’s $100M Series C from heavy hitters such as Kleiner Perkins, Sequoia, and A16Z in a story that did not mention the impending GenAI wave, focusing on a MLOps future where feature stores are a cornerstone. Would they still be betting on a feature store today?
  • The enduring opportunities often lie in the low-level neighborhood of the AI stack, where incumbents such as NVIDIA and Microsoft are making a large profit.
If you are in the current YC batch (summer 2023), please learn from the earlier batch and resist the path of selling to your batch mates who may be trying to sell to a hypothetical enterprise customer who is still deciding what to do with their data labeling or feature store investments. You will be competing not just with other seed stage startups, but also with existing well-capitalized MLOps companies eager to jump on the LLM bandwagon. You may also feel that the ground seems to be shifting every time OpenAI makes an announcement.
While quick-hit accelerators have helped to shape a generation of startups, we live in a different world today: an AI-first world. And, when working to build an AI-first company, you might consider choosing entrepreneurship support that is purpose-built to serve AI-first founders. Unlike the VCs and accelerators mentioned in Matt Turck's tweet, many of whom are _just now _jumping on the AI bandwagon, the AI2 Incubator has been building successful deep tech startups since 2017. More importantly, the specialized support that we provide AI-first founders is unparalleled. AI is all we do, and all we’ve ever done. Please reach out and let’s chat!

AI Safety/Regulation

In May, Geoffrey Hinton quit Google and warns over dangers of misinformation. Yoshua Bengio said Big Tech’s arms race threatens ‘the very nature of truth.’ Earlier this year, Sam Altman warned that the worst-case scenario of misuse of AI is “light out for all of us”. On the other side of the debate, Yann LeCun and Subbarao Kambhampati believe the fear is overblown and counterproductive. In a 7,000-word essay, Marc Andreessen dismissed AI doomerism as a cult, opining that AI will save the world. A consensus that we have seen is that, while no one has articulated a large-scale dangerous scenario involving AI, we should continue to pay close attention to such possibilities and build appropriate solutions for them.
For founders and innovators, the opportunity lies in building technologies and products that improve our understanding of how AI works. Foundation models are currently black boxes that can do amazing things but also show biased, toxic, misleading, and other harmful behaviors. At the AI2 Incubator, we are always on the lookout for opportunities to partner with founders who are building solutions to these challenges. We are excited about WhyLabs’ LangKit, an AI observability solution. With its ability to help AI products monitor negative behaviors, this is a step in the right direction.

Miscellaneous advice to founders

There is an emerging divide between AI companies that focus on generative AI and the rest, with the former group receiving much more attention from investors. The size of seed rounds, in terms of both amount raised and valuation, is increasing despite recession concerns. A round of $5M at a $20M pre-money valuation no longer raises eyebrows. There is a category of mega seed rounds for startups with supersized ambitions, with Hippocratic AI's $50M being a recent example. In addition, rounds are happening at a blinding speed. The overall feeling is one of exuberance.
I believe that this froth will subside. Startups that raised seed rounds in 2023 will face a stern reality check when series A comes around in 18 months, when traction, instead of potential, will be under the spotlight. My advice for seed stage founders is to stay grounded, humble, and lean. Embrace AI and GPT-4 to enhance productivity, to do more with fewer resources, from R&D to sales and marketing to recruiting. Go the last mile where established incumbents or other competitors will not or cannot. When it comes to products and user experience, demand the highest standards and ambitions from yourself. A Q&A app built with OpenAI, LangChain, and Pinecone may have blown minds a year ago. This same app is now considered a “Hello World” starter app, as user expectation has risen substantially.

Stay up to date with the latest
A.I. and deep tech reports.

edges
Insights

Insight 11: Peak of Inflated Expectation?

June 21   Vu Ha Vu Ha
Cover Image
It has been 6 months since Insight 10 (and ChatGPT’s release), so we have a lot to catch up on. In this edition we first cover updates around the AI2 Incubator and the Pacific Northwest’s pre-seed startup community. We then give a whirlwind tour of major developments in AI technologies in the first half of 2023, focusing agents, community models, AI infra/tooling, and AI regulations, sprinkling in our advice to early-stage founders how to surf this latest wave of AI advances. A quick summary:
  • Pacific Northwest’s startup incubators, studios, and pre-seed funds raised $86M in May & June 2023.
  • GPT-4 powered agents don’t work reliably yet. For high-stake scenarios, developers need to reduce ambition/scope and prepare to go all-in across the fast-moving AI stack.
  • AI Infrastructure and tooling opportunities for new startups are limited.
  • Specialized foundation models can be preferable to general-purpose ones. Specialization can be based on a domain (e.g. healthcare), or modality (e.g. audio).
  • Startup pick: Poolside.
  • Research paper pick: A cookbook of self-supervised learning (by Meta AI).

AI2 Incubator and PNW startup update

On May 9, the AI2 Incubator announced a new $30M fund that tripled our previous fund, backed by returning investors Madrona and Sequoia Capital as well as new investors such as Vinod Khosla and Evergreen Ventures. The day before, Ascend, headed by AI2 Incubator’s investor in residence, Kirby Winfield, shared the news of a $25M fund for pre-seed startups. Then on May 18, Pioneer Square Labs unveiled their third, $20M fund. On June 21, Madrona Venture Labs announced a $11M fifth fund. Founders in the Pacific Northwest have great options to consider as they aim for the first major milestone: seed funding.
During the pandemic lockdown, the AI2 Incubator started to look beyond the Pacific Northwest region. This allowed us to discover and partner with founders from various parts of the country. Two of the most recent Incubator graduates are based in San Diego and New York, respectively. We will continue this path as we believe our community benefits from a more diverse group of entrepreneurs.
In the first half of 2023, AI2 Incubator startups continue to make steady progress.
  • On April 6, Benji Barash and Yves Albers-Schoenberg announced a $4.8M seed round for Roboto. Coming from decades of experience in robotics and AI, including recent stints at Amazon Prime Air and AWS, Benji and Yves are building a copilot for robotics engineers working with multimodal sensor and log data. The round was led by Unusual Ventures with participation from Seattle-based FUSE.
  • On April 19, the Lexion.ai team announced a $20M series B led by Point72 ventures, with Citi ventures participating. Lexion is thus the first AI2 Incubator alum that reached the series B milestone, having tripled revenue in 2022.
  • On May 16, Bill Lennon, Matt Busigin, and Max Fergus launched HeyOllie.ai. Ollie is ChatGPT for gifts, a shopping copilot that helps you find the perfect gift. Bill, Matt, and Max have an ambitious vision to fundamentally change the way we shop. Together with another still stealthy alum, we now have two companies going after the CoPilot model for different use cases.
  • On June 13, WhyLabs launched LangKit, an LLM observability toolkit that helps detect issues that may arise in LLM applications such as jailbreaks, prompt injection, refusals, and toxic behaviors.
Last, but not least, on May 11, the Allen Institute for AI announced OLMo, an “open language model made by scientists, for scientists”. Everything created for “OLMo will be openly available, documented, and reproducible, with very limited exceptions and under suitable licensing.” The artifacts released as part of the OLMo project will include training data, code, model weights, intermediate checkpoints, and ablations. AI2 announced that Ali Farhadi is returning to the institute as the CEO, effective July 31. Ali co-founded XNOR, an early Incubator company that was acquired by Apple in 2020 for $200M.

AI overview at ChatGPT’s 6-month mark

It is difficult to capture the progress in AI in the last month in a newsletter, yet Matt Turck did in a single tweet.
ChatGPT turns 6 months old today! It's time to take a look at its early impact:
  • 245,064 new AI newsletters
  • 1 trillion Twitter threads about prompt engineering
  • 27 VC firms who decided "if only there was a big AI meetup in NYC! Let's create one"
  • 1 company per minute pivoting from Web3
  • 6,523 VC market maps of the AI ecosystem
  • 17,547 new VC podcasts about AI
  • YC expands batch size to 9,000 Generative AI startups
  • 100% of VC firms claiming to have "always been big believers in AI
In other words, it has been a breathless six months of non-stop AI news, startup activities, chatter about AGI and doomsday scenarios, and a sense of overwhelming noise. Here is our attempt to capture and summarize the key moments, without the aid of AI.
  • December

    2022:

    ChatGPT

    was launched.

    OpenAI

    showed the way, again.
  • January/February

    2023: It is big tech’s turn.

    Microsoft

    launched

    Bing chat

    , powered by the soon-to-be-announced GPT-4.

    Google

    issued code red and announced the intent (!) to launch

    Bard

    .

    Meta

    emerged as a major proponent of open-source AI, with LLaMA as a prime example.

    LangChain

    , one of the most popular open-source AI application frameworks, quickly raised $10M seed funding from Benchmark (and a later up round shortly led by Sequoia Capital).
  • March

    2023: More goodies from

    OpenAI

    :

    GPT-4

    and

    ChatGPT

    plugins

    .

    Anthropic

    announced

    Claude

    , the only remotely competitive alternative to ChatGPT (Bard was still an announcement with a waitlist at this point).
  • April

    2023:

    Agents

    ! BabyAGI and Auto-GPT started and quickly exploded.

    YC

    Winter 2023 demo days packed with AI foundation model wrappers and AI tools with early pilots being their batch mates.
  • May

    2023: Senate’s hearing on

    AI regulation

    . Google (IO) and Microsoft (Build) surprised no one with their all-out bets on AI. Bard is finally available.
  • June

    2023:

    Apple Vision Pro

    . Apple think different, showing that tech is more than just AI.
There is a lot to unpack here. In the rest of this newsletter, we will dive deep into four topics: 1) AI tools/plugins/agents, 2) foundation model ecosystem, 3) AI infrastructure and tooling, and 4) AI safety and regulation. We will try to focus on aspects that are relevant to early-stage startup founders.

Tools/plugins/agents: use at your own risk

Auto-GPT was first released on March 30, by Toran Bruce Richards, the founder of gaming company Significant Gravitas. In about three weeks, the project garnered 100K GitHub stars. Today, it has 140K GitHub stars and counting. For context, LangChain currently has about 48K stars. Many wondered what happened. Is AGI nigh?
To explore the answer to this question, let us go back to September 2022, the early days of the Generative AI era, when Nat Friedman and Daniel Gross announced AIGrant.org, a grant program that backs entrepreneurs to start companies taking advantage of AI advances. Nat and Daniel issued the following challenge:
Researchers have raced ahead. It's time for entrepreneurs to catch up!

Tools

A sense of an impending revolution was palpable in the Fall of 2022. When ChatGPT was unveiled in December, the AI zeitgeist was elevated to another notch. ChatGPT captured the imagination of the public, not just entrepreneurs. Yet as mind-blowing as the ChatGPT experience is for many, we all saw one glaring limitation. ChatGPT would respond to questions about recent events (e.g., who won the 2022 world cup) with this response:
As an AI language model, I don't have real-time information or access to the internet to provide you with the latest news. My knowledge was last updated in September 2021 …
Wouldn’t it be nice if ChatGPT were given a

tool

to access the latest information on the Web? The obvious answer was Bing chat, launched on February 7. Great, how about we give ChatGPT access to other tools such as calculators, machine translators, calendars, etc.? On February 9, Meta AI submitted a paper to arXiv titled “Toolformer: Language Models Can Teach Themselves to Use Tools”, that suggested just exactly that. This paper was tweeted by Yann Lecun, and subsequently caught the attention of the now rapidly growing AI tinkerer community. Are we set? Can ChatGPT now use tools? If the answer is yes, imagine the possibilities of building AI products that go from giving me the answer/insight to doing stuff for me!
On February 15, Meta AI submitted to arXiv a survey of research on language models with tools (title: Augmented Language Models: A Survey). This is a fantastic survey that paints a cautious outlook on techniques such as retrieval augmented generation, chain-of-thought reasoning, and ReACT (Synergizing Reasoning and Acting in Language Models). One TL; DR is that they don’t work reliably yet, as the field is still early. Practitioners should carefully evaluate to make sure they meet performance expectations in intended scenarios. OpenAI released ChatGPT with tools, aka plugins, on March 23, after extensive development and testing work.

Agents

The first sighting of language models with tools however was as early as November 23, 2022, when Harrison Chase introduced the concept of

agents

in LangChain. Never mind the odd example, "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?", the LangChain community has now accepted Nat’s and Daniel’s challenge, even racing ahead of researchers to the very bleeding edge of AI.
When GPT-4 was unveiled in mid-March, the AI tinkerer community kicked up their ambition to another level with Auto-GPT and BabyAGI. Give GPT-4 a high-level goal, and watch it reason, plan, and execute a complex sequence of tasks using a variety of tools autonomously. AI researchers have now been left so far behind by AI tinkerers that it could take them decades to catch up!
To illustrate this point, I conducted the experiment of asking prominent AI models a simple question: When was the Toolformer paper submitted to arXiv? GPT-4-powered Bing Chat gave me the 4/10/2023 date for a different paper (Graph-Toolformer: To Empower LLMs with Graph Tools), Bard produced an answer for a fictitious paper (The paper "Toolformer: Self-Supervised Learning for External Tool Control'' by Meta AI was published on June 8, 2023). Only perplexity.ai gave the correct answer (2/9/2023). Disclaimers about anecdotes aside, should we trust AI to autonomously research proteins or perform sales prospecting? Do I even trust AI to write a newsletter like this autonomously?

Advice to founders with respect to agents

My view on agents is thus long term bullish but short term cautious; AGI is

not

nigh. My advice to founders is the following:
  • Be

    wary

    of building CoPilot products for scenarios where there is zero/low tolerance for inaccuracies/hallucination. Apply the principle of Minimal Algorithmic Performance (MAP) judiciously as the technical feasibility gap could simply be too large to overcome.
  • To maximize the chance of getting to the required performance,

    specialize

    and

    focus

    . Pick a niche domain where you can take advantage of its idiosyncrasies to get to the performance the customers demand. With enough constraints and simplifications, your agents can improve their chances of success. This of course must be balanced with the need to address a sufficiently large market. This is not easy, but building a successful startup is never easy, to say the obvious.
  • In balancing between specialization and market size in the early stage,

    bias on the side of specialization

    . You need to solve a real customer problem well—give them a 10x better tool—to gain a wedge before generalizing to a broader audience/market. This new wave of AI advances gave me the confidence that generalization in this context is very much doable (see the discussion on task-centric AI in Insight #4).
  • Even with specialization and focus, prepare to be

    all-in across the emerging AI stack

    , all the way to fine-tuning or even pre-training your own domain-specific foundation models if necessary. Don’t rely on an orchestration library (e.g., LangChain), a shallow tech provider (e.g., vector databases), or a prompt whisperer (e.g., Riley Goodside) to magically solve the technical challenges. As investors, we look for teams that have what it takes to do this. We don’t ask questions about moats which are meaningless in early-stage startups. You need to go the last mile in UX, product, go to market, and of course, AI.
If you want to follow AI researchers who share our views around current AI agents and their deep-rooted limitations, I recommend Yann Lecun (e.g., this tweet) and Subbarao Kambhampati (e.g., this tweet). Lecun in particular has a vision to address these limitations. Under his leadership, Meta AI has made some early progress toward this vision.

Is it the iPhone moment for AI?

Many in our industry, founders and investors alike, believe that the answer is an empathetic yes. Personally, I am not sure. The iPhone created or reinvented multiple industries with big winners because it transformed the cell phone into a mobile computing platform for transportation, entertainment, gaming, etc. Imagine an Uber ($88B) or a TikTok ($300B) that required laptops! Mobile gaming is now a $100B industry. The iPhone made those success stories possible due to a confluence of advances on multiple fronts: GPS, touch interface, affordable mobile broadband, SoC (system-on-chip) design innovations, battery technologies, etc. We are still in the early innings of exploring the impact of foundation models in the business world. I suspect that we will see the first wave of significant achievements once researchers have made some progress in their efforts to catch up with Auto-GPT.

BYOM (Bring Your Own Model): startup and paper picks

GPT-4 was a watershed moment in AI. It achieved state of the art performance in pretty much any popular AI benchmark one can think of, often beating prior SOTA by large margins. So why did I suggest fine-tuning/pre-training your own foundation models? The reason is the old “jack of all trades” wisdom about specialization. By specialization, we can go the last mile in terms of performance, latency, cost efficiency, security, continuous fine-tuning, etc., the types of considerations that matter a lot to customers. It is possible to work with OpenAI’s (or Microsoft’s/Google’s) models to accomplish these goals. There will be (a lot of) cases where that is the better choice. But the ones where the BYOM approach is superior are the opportunities where startups can build deep defensibility, against both incumbents and other startups. Wrappers-be-gone.
A crucial factor that enables the BYOM approach is the rapidly moving community of open-source foundation model developers, catalyzed by the leaked LLaMA model from Meta AI (although we started covering open-source models more than a year ago in Insight #7). This led to the now famous “No Moat” essay, supposedly authored by a Google engineer who wrote that OSS models will soon catch up with closed source ones such as OpenAI’s and Google’s. The author made several good points, but also an oversight around how big a gap there is between GPT-4 and the rest of the world (Google’s models included). For entrepreneurs, we should not care that a future OSS model will match OpenAI’s crown jewels in AGI. We should care that we have the tools, infrastructure, and know how to beat OpenAI in one narrow domain that matters to our customers. Snorkel’s Alex Ratner coined the phrase GPT-You to describe this.
One vector of specialization for foundation models is the industry that they target. For finance, there are BloombergGPT (proprietary) and FinGPT (OSS). For healthcare, there’s Hippocratic AI (announcement only). For e-commerce, there’s Meta AI’s CommerceMM. For software, there’s

Poolside, our startup pick

(announcement only). As discussed in Insight #10, we believe generative AI for software is a big opportunity. We have an exciting project in this space at the Incubator and look forward to experimenting with Poolside’s code-specific foundation model. Poolside’s positioning is interesting: building a foundation model that aims to be the foundational layer for (the future of) software. “We are going after narrow AGI through software and code,” Jason Warner (GitHub’s former CTO and Poolside’s co-founder) told TechCrunch.
Another vector of specialization is modality, e.g., Stable Diffusion for images and Meta AI’s Voicebox for speech. (Note how I carefully chose the term foundation models, which are a superset of LLMs).
It is interesting to see Meta carrying the flag for the open-source AI community. They recently shared a cookbook of self-supervised learning containing chockfull of hard learned lessons building foundation models. For this reason, we selected it as the

research paper pick

for this Insight.

AI infrastructure and tools: the bear case

In my chats with other VCs, there's a worry that LLMOps will become the new MLOps.
Aaron Yang (@IAmAaronYang) commented on YC Winter 2023’s LLMOps crop
In a gold rush, the conventional wisdom is to start a pick and shovel (P&S) business. When it comes to technology startups, this is a dangerous analogy for the following reasons.
  • The gold rush was a short-lived phenomenon. P&S businesses made some quick bucks, but none went on to become an enduring company.
  • AI is moving at a break-neck pace, making it challenging to identify/predict where the miners are going (or will they change jobs at some point). A year ago, Andrew Ng was out evangelizing data-centric AI, a concept that now seems quaint in a post-GPT-4 world fascinated with agents, vector databases, and prompt engineering. Eleven months ago, TechCrunch covered Tecton’s $100M Series C from heavy hitters such as Kleiner Perkins, Sequoia, and A16Z in a story that did not mention the impending GenAI wave, focusing on a MLOps future where feature stores are a cornerstone. Would they still be betting on a feature store today?
  • The enduring opportunities often lie in the low-level neighborhood of the AI stack, where incumbents such as NVIDIA and Microsoft are making a large profit.
If you are in the current YC batch (summer 2023), please learn from the earlier batch and resist the path of selling to your batch mates who may be trying to sell to a hypothetical enterprise customer who is still deciding what to do with their data labeling or feature store investments. You will be competing not just with other seed stage startups, but also with existing well-capitalized MLOps companies eager to jump on the LLM bandwagon. You may also feel that the ground seems to be shifting every time OpenAI makes an announcement.
While quick-hit accelerators have helped to shape a generation of startups, we live in a different world today: an AI-first world. And, when working to build an AI-first company, you might consider choosing entrepreneurship support that is purpose-built to serve AI-first founders. Unlike the VCs and accelerators mentioned in Matt Turck's tweet, many of whom are _just now _jumping on the AI bandwagon, the AI2 Incubator has been building successful deep tech startups since 2017. More importantly, the specialized support that we provide AI-first founders is unparalleled. AI is all we do, and all we’ve ever done. Please reach out and let’s chat!

AI Safety/Regulation

In May, Geoffrey Hinton quit Google and warns over dangers of misinformation. Yoshua Bengio said Big Tech’s arms race threatens ‘the very nature of truth.’ Earlier this year, Sam Altman warned that the worst-case scenario of misuse of AI is “light out for all of us”. On the other side of the debate, Yann LeCun and Subbarao Kambhampati believe the fear is overblown and counterproductive. In a 7,000-word essay, Marc Andreessen dismissed AI doomerism as a cult, opining that AI will save the world. A consensus that we have seen is that, while no one has articulated a large-scale dangerous scenario involving AI, we should continue to pay close attention to such possibilities and build appropriate solutions for them.
For founders and innovators, the opportunity lies in building technologies and products that improve our understanding of how AI works. Foundation models are currently black boxes that can do amazing things but also show biased, toxic, misleading, and other harmful behaviors. At the AI2 Incubator, we are always on the lookout for opportunities to partner with founders who are building solutions to these challenges. We are excited about WhyLabs’ LangKit, an AI observability solution. With its ability to help AI products monitor negative behaviors, this is a step in the right direction.

Miscellaneous advice to founders

There is an emerging divide between AI companies that focus on generative AI and the rest, with the former group receiving much more attention from investors. The size of seed rounds, in terms of both amount raised and valuation, is increasing despite recession concerns. A round of $5M at a $20M pre-money valuation no longer raises eyebrows. There is a category of mega seed rounds for startups with supersized ambitions, with Hippocratic AI's $50M being a recent example. In addition, rounds are happening at a blinding speed. The overall feeling is one of exuberance.
I believe that this froth will subside. Startups that raised seed rounds in 2023 will face a stern reality check when series A comes around in 18 months, when traction, instead of potential, will be under the spotlight. My advice for seed stage founders is to stay grounded, humble, and lean. Embrace AI and GPT-4 to enhance productivity, to do more with fewer resources, from R&D to sales and marketing to recruiting. Go the last mile where established incumbents or other competitors will not or cannot. When it comes to products and user experience, demand the highest standards and ambitions from yourself. A Q&A app built with OpenAI, LangChain, and Pinecone may have blown minds a year ago. This same app is now considered a “Hello World” starter app, as user expectation has risen substantially.

Stay up to date with the latest
A.I. and deep tech reports.

horizontal separator

Join our newsletter

AI2 Logo
incubator logo

Join our newsletter

Join our newsletter

AI2 Logo
incubator logo

© AI2 Incubator · All Rights Reserved

Seattle, WA

Twitter

LinkedIn

Privacy

Terms of Service