If you have been reading this Substack, you’ll know that this post is different from the others here:
It’s not specifically about chips
It’s from the present (not the 1960s, like one of my posts)
But AI is at the center of tech today, and plays a big role in the semiconductor industry too. I got an opportunity to attend YC’s AI startup school, and here are some learnings from the speakers.
Here’s the list for you to quickly navigate through:
Opening remarks from Garry Tan, President and CEO at Y Combinator
This is a great time for the technology industry: Intelligence can be accessed using an API
It is also a time for great agency: every industry is changing
Sam Altman, CEO at OpenAI
OpenAI wasn’t built to be big - things that become big usually don’t start that way
Today, there is a product overhang in AI: models are progressing faster than the applications built on top of them
Vision for GPT-5 and beyond: Multimodal (image, code, video) input and output + much better memory functions
In knowledge work, there is a pattern: work a few hours, wait for feedback, and repeat. This kind of work is perfect for AI agents
OpenAI’s hardware product vision: a better interface than the smartphone for interacting with the real world
We have only had two big computer interface revolutions so far: the mouse, and touch. The third will come with AI
If the current trend continues, the creation of GPT will be seen by future generations like the invention of the transistor
His hiring philosophy is simple: hire smart, driven people with a track record of getting things done
In the next 10 years, small teams with big agency will be the most successful
His two personal interests in the 2010s were AI and Energy. Today, they are interlinked - we are converting energy to intelligence.
Favorite startup advice: Be contrarian, but right (from Peter Thiel)
It’s very hard to do - when GPT-1 came out, Elon Musk said it had a 0% chance of success.
You just have to keep going; it’s going to be tough
Elon Musk, who needs no introduction
He never sets out to build something great - just wants to build something useful
When the internet was taking off, he just wanted to be a part of it. He applied for a job at Netscape but didn’t get it, so he started something of his own.
2008 was his toughest year - SpaceX’s third launch failed, and Tesla was running out of money. At the time, everyone said Elon was an internet guy who shouldn’t try real engineering
It’s important to build truth seeking environments in companies - you cannot fool math and physics
“Don’t aspire to glory, aspire to work”
To build a new AI model, you just need access to three things:
Compute
Unique data
Talented people
Intelligence is very rare - it is possible that we are the only species to possess it. That makes it even more important that we become multi-planetary.
He predicts that in the next 100 years, there will be more humanoid robots than humans
Satya Nadella, CEO at Microsoft
The ultimate measure of AI shouldn’t be intelligence or AGI - it should be the amount of economic growth that AI can drive
He doesn’t believe in anthropomorphizing AI (i.e. giving it human traits) - AI is just a tool
Be open minded that the last big algorithm breakthrough in AI is not done - LLMs are not the end, fundamental AI research still matters
AI isn’t going to take away software engineering jobs - instead, the traditional SWE role will change into something like the “Forward Deployed Software Engineer” (FDSE), a role pioneered by Palantir
In the past, typist (someone who used a typewriter) was a job - now everyone types. Software engineering will become like that too
The most important factors in AI deployment are going to be: Privacy (for individuals), Security (for organizations), and Sovereignty (for countries)
Microsoft’s breakthrough in quantum computing is massive - they have finally solved for a stable qubit
Today, AI is helping make quantum computing better. In the future, quantum computing will enable better AI
Access to Copilot has been the best intervention ever in the field of education, making it the one domain that Satya is watching most closely
One lesson he learnt in his career: Do every job like it’s the greatest job you ever had - don’t wait for your next promotion
His favorite question while hiring: Describe how you managed a project that was going nowhere - a good answer would highlight three key skills:
Can you solve a problem you are faced with
Can you bring clarity to an uncertain situation
Can you enable a team to work together
Advice to anyone building products: Build something that makes you feel empowered
Aravind Srinivas, CEO and Co-Founder of Perplexity AI
Perplexity’s next big bet is the browser - agents are going to be like the different open tabs you have right now
They have partnerships with a lot of websites to make this work
Triaging and fixing bugs is an important skill, even as a CEO
You can’t strategize your way to success - any smart idea will get copied. You just have to work incredibly hard
Competition is a great thing, because it tells you that something is worth doing
He started Perplexity without a clear idea of what to do - founders are advised against this, but it is important to just start something
Competing with ChatGPT is hard; competing with Google is easy
AI apps have not figured out how to have network effects yet
Perplexity’s profit margins will never be as high as Google’s - no company will have such profit margins anymore
Whenever he feels like he is failing, he goes back to Elon Musk’s video about failure
Fei-Fei Li, Stanford Researcher, the godmother of AI (created ImageNet)
She started with a simple goal: how to make machines see. The lack of data to solve this led to the ImageNet dataset.
AlexNet was a big AI breakthrough, but it was also a big hardware breakthrough - it was the first time two GPUs were put together to run a workload
A lot of her research draws inspiration from the evolution of the human brain - her new venture World Labs aims to solve the problem of Spatial Intelligence
Language is purely generative - it does not come from nature. So language alone cannot approximate the world and get us to AGI
She pursued her early research with newer professors in the field - taking such risks matters
Graduate school is a place where you can be purely driven by curiosity. But if you are running a startup, you won’t have that freedom
She was an immigrant and spent her 20s running a laundromat. Her advice to anyone feeling like a minority: develop the ability not to over-index on it. Gradient-descent your way to success.
Andrej Karpathy, Former Director of AI at Tesla
This is the third big shift in software - today’s engineer should be fluent in all three (a toy example follows this list)
SW 1.0: Code (to program computers)
SW 2.0: Weights (to program neural networks)
SW 3.0: Prompts (to program LLMs)
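To make the three paradigms concrete, here is a toy sketch of the same task written three ways. This is my illustration, not Karpathy’s slides; the weights are hand-set for demonstration, and the LLM call is a hypothetical placeholder:

```python
# SW 1.0 - explicit code: the programmer writes the rules by hand.
def sentiment_v1(text: str) -> str:
    return "positive" if "great" in text.lower() else "negative"

# SW 2.0 - weights: the behavior lives in learned parameters, not rules.
import numpy as np
weights = np.array([1.2, -0.7])   # in a real system, these come from training
features = np.array([1.0, 0.0])   # e.g., counts of "great" and "bad" in the text
sentiment_v2 = "positive" if weights @ features > 0 else "negative"

# SW 3.0 - prompts: the "program" is natural language, executed by an LLM.
prompt = "Classify the sentiment of this review as positive or negative: ..."
# response = call_llm(prompt)     # hypothetical helper; provider-specific in practice

print(sentiment_v1("A great phone"), sentiment_v2)
```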
LLM Analogy 1: It is like a utility (e.g. electricity)
Building the grid is like training, but instead of serving electricity, it serves intelligence
Your access is metered (cost per token)
There are few big providers to switch between
When an LLM goes down, it feels like a power outage
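To make the metering point concrete, here is a back-of-envelope bill. The per-token prices below are made-up placeholders, not any provider’s real rates:

```python
# Metered access, like an electricity bill (prices invented for illustration).
price_per_1k_input = 0.003    # $ per 1K input tokens - assumption
price_per_1k_output = 0.015   # $ per 1K output tokens - assumption
input_tokens, output_tokens = 2_000, 500

bill = (input_tokens / 1000) * price_per_1k_input \
     + (output_tokens / 1000) * price_per_1k_output
print(f"${bill:.4f}")  # $0.0135 for one request - it adds up at scale
```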
LLM Analogy 2: It is like a fab
The capex to train is huge
Each model has its own secret recipe (like TSMC/Intel do)
Some model builders go fabless (training on general-purpose Nvidia GPUs); others manufacture in-house (like Google with its TPUs)
LLM Analogy 3: It is like an OS
There are closed and open ecosystems (like Windows vs Linux)
Different applications can be built on top of them
Easy to pirate (once trained, cloning an LLM is like copying a Windows CD)
LLM Analogy 4: They mimic human psychology
Hallucinations
Jagged intelligence
Anterograde amnesia
Unlike most breakthroughs in computing, which started with defense or government contracts (HDLs too, as I covered in one of my posts), LLMs started with consumers - something very new for the computing industry
AI will support different levels of autonomy
Augmentation (like code/image generators)
Partial autonomy apps - like GitHub Copilot/Cursor (coding), Perplexity (search)
Full autonomy - we are not there yet
All software is going to be at least partially autonomous: so build interfaces for LLMs, not just humans
For example, product documentation should be available as markdown in addition to plain text/images - LLMs can parse it much more easily (a quick sketch below shows one way to do this)
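As an illustration of that point, here is a minimal sketch that converts existing HTML docs to markdown using the html2text library; the docs URL is a made-up placeholder:

```python
# Sketch: publish a markdown version of HTML docs so LLMs can ingest them easily.
import urllib.request
import html2text  # pip install html2text

url = "https://example.com/docs/getting-started"   # placeholder URL
html = urllib.request.urlopen(url).read().decode("utf-8")

converter = html2text.HTML2Text()
converter.ignore_images = True          # keep only the text content
markdown = converter.handle(html)       # HTML -> markdown
print(markdown[:500])                   # serve this alongside the human-facing docs
```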
Operator is a great way to control a computer, but it is too expensive to use for everything - so in the near term, we need better LLM interfaces
Karpathy first rode a fully autonomous car in 2013. Yet we still don’t have full self-driving - there is always a big gap between demo and product
Don’t think of it as the year of Agents, think of it as the decade of Agents
Andrew Ng, one of the greatest AI educators
Execution speed is one of the strongest predictors of a startup’s success
Today, the biggest opportunity in AI is at the application level - not at the model, cloud or chip level
Specifically, a new agentic orchestration layer is forming - every application will need this
Vague ideas are always seen as right, but are often wrong; with concrete ideas, you get clear feedback about whether you are right or wrong.
The ratio of product managers to engineers will change in the near future
Today, on average, we have 4 engineers per product manager
In the future, there will be 2 product managers for each engineer
So as an engineer today, it is important to have better product instinct
Think of building AI products like building a structure with Legos
The more blocks (i.e. underlying libraries, models, and so on) you have, the better your outcome will be
AI will push the speed of building products by 10x, so moats will not exist anymore. Brands will be more defensible in the future.
John Jumper, Distinguished Scientist at Google DeepMind, and Nobel Prize winner
Why AlphaFold won:
They had the same public data as everyone else
They used 128 TPU cores to run experiments (underwhelming compared to what today’s LLMs use)
It was all about ideas from the team - that made the difference
Trust is built through word of mouth - put your work out there and get feedback
To publish papers in academia, you need ideas that work and are also beautiful. In industry, you just need a working idea
To build a low-cost AI product: think about how you can reduce the cost of failed ideas
Narrow AI systems will win out eventually (this is different from what Jared Kaplan said)
It’s easy to come up with dogma - instead, be ruthless and empirical
Chelsea Finn, Assistant Professor at Stanford, Co-Founder of Physical Intelligence
In traditional robotics, a robot is trained to function in very specific environments. But the goal of her new company is different - Build a robot to do anything
For LLMs, scale matters most - more data + GPUs usually means better models. But to build robots, we need the right kind of data, with sufficient diversity
In her talk, she walked us through the steps they followed to train a laundry folding robot
But none of the steps were specific to laundry folding - they just involved a gradual increase in difficulty, which can apply to any task
The foundation model used to train such general robots is called a Vision-Language Model (VLM). It works like this (a toy sketch follows these steps):
The robot processes user input along with vision input from cameras
The VLM uses this data to generate language commands describing how the robot should respond
These language commands are used to control the robot
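Putting those three steps together, here is a toy version of the loop. Every function here (get_camera_frame, vlm, execute) is a hypothetical stub of mine, not Physical Intelligence’s actual API:

```python
# Toy version of the perceive -> describe -> act loop described above.
def get_camera_frame() -> bytes:
    return b"image"                                # stub: camera capture

def vlm(instruction: str, frame: bytes) -> list[str]:
    return ["move arm to shirt", "pinch corner"]   # stub: model inference

def execute(command: str) -> None:
    print("executing:", command)                   # stub: low-level motor control

user_instruction = "fold the shirt"
frame = get_camera_frame()                 # vision input from cameras
commands = vlm(user_instruction, frame)    # VLM emits language commands
for cmd in commands:
    execute(cmd)                           # commands drive the robot
```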
To make their robots more robust to open-ended prompts, the same VLMs were used to generate synthetic prompts, and those prompts were then used to train the VLM
She believes general-purpose robots will be more successful than purpose-built ones, since the foundation VLMs will keep getting better
This was a lot like what Jared Kaplan from Anthropic said on Day 1 about LLMs
Jared Kaplan, Co-Founder and CSO at Anthropic
The length of tasks AI can complete (measured by how long they would take a human) is doubling every 7 months - this is like Moore’s law.
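A quick sketch of what that doubling implies; the 1-hour starting point is my assumption for illustration, not Kaplan’s number:

```python
# If AI can handle a task that takes a human 1 hour today, and the horizon
# doubles every 7 months, extrapolate forward (starting value is an assumption).
def task_horizon_hours(months: float, now_hours: float = 1.0) -> float:
    return now_hours * 2 ** (months / 7.0)

for years in (1, 2, 5):
    print(f"{years} yr: ~{task_horizon_hours(12 * years):.0f} human-hours")
# 1 yr: ~3, 2 yr: ~11, 5 yr: ~380 human-hours
```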
Scaling laws will continue to hold - if teams find that scaling laws are failing, it means their training methodology is flawed.
To prepare for an AI future:
Start building technology that doesn’t work now - by the time you are done, AI will have caught up (similar to chip designers using Moore’s law)
Use AI to integrate AI - this is the only way to keep up
There are two types of tasks
Tasks that can be done with 70-80% accuracy - AI already excels at these
Tasks that need 100% accuracy - this will be solved by future AI
The real value of AI will come in knowledge tasks that need us to put together information from different sources - like biology
When integrating AI into existing businesses, one needs to think carefully about the bigger picture - for example, when the electric motor was invented, the win didn’t come from using it to make the steam engine setup better; factories had to be redesigned around the motor.
Varun Mohan, CEO and Co-Founder of Windsurf
Personally, I loved this talk - Varun walked in with no slides, and simply had a candid conversation with the audience.
His first company was Exafunction, which built GPU virtualization software
It was quite successful and was used by a lot of autonomous vehicle companies
Their USP was abstracting away the underlying hardware architecture - but as Nvidia gained dominance, they felt this was no longer valuable. So they pivoted
They came up with Codeium, a GitHub Copilot alternative. Later, this became Windsurf, an agentic IDE.
His advice to founders: Be irrationally optimistic, but uncompromisingly realistic
To stay ahead of the curve, build products where 50% of the ideas don’t work today - by the time you finish, AI will have caught up and everything will work
The reason startups win over big companies: startups are desperate - if they fail, the company dies
Strategic moats and switching costs are dying - don’t go by traditional VC advice
There are a lot of companies in the world that are still technology starved - this is your opportunity
When asked how he manages the stresses of being a founder, he said: “I don’t manage it. There is no way to escape it. If you fail, just get up and keep going.”
Francois Chollet, CEO and Co-Founder of Ndea (also the creator of Keras)
If we have tasks that humans do that AI cannot - that means we do not have AGI
Scaling laws will not get us to AGI (contrary to what Sam Altman and Jared Kaplan said)
There are two types of abstraction:
1. Value-centric - abstraction in the continuous domain
2. Program-centric - abstraction in the discrete domain
Today’s AI models, like transformers, work well in the continuous domain. But for AGI, we need AI that can handle both domains (a toy contrast below)
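Here is a toy contrast of the two domains; this is my framing for illustration, not Chollet’s code:

```python
import numpy as np

# Value-centric abstraction: continuous - generalize by interpolating
# between points in an embedding space (what transformers are great at).
king, queen = np.array([0.9, 0.1]), np.array([0.8, 0.9])
nearby_concept = (king + queen) / 2      # a point "between" two concepts

# Program-centric abstraction: discrete - generalize by composing exact
# symbolic programs, where interpolation makes no sense.
def reverse_then_upper(s: str) -> str:
    return s[::-1].upper()               # composition of two discrete steps

print(nearby_concept)                    # continuous: a new point in space
print(reverse_then_upper("chollet"))     # discrete: TELLOHC
```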
Suhail Doshi, serial entrepreneur (Mixpanel, Mighty Computing, and Playground AI)
AI will have a second-mover advantage - so be confident building consumer AI applications even today
The world will soon enter mass amateurization: what an expert can do today will be done by AI soon. With agents, these tasks can run continuously for years
Don’t just focus on AI applications that give immediate feedback, like chatbots. Consumers are ready to wait hours if they get value out of AI
To identify opportunities, think about “what-ifs” in the future. For example:
What if nobody drives a car
What if everyone has a personal robot
What if we can never tell what’s real
Recommended reading: https://andrewchen.com/the-next-feature-fallacy-the-fallacy-that-the-next-new-feature-will-suddenly-make-people-use-your-product/
Jordan Fisher, CEO at Standard AI
Everyone says focus is important. But if you are running your own company, you need to focus on a lot of different things. It’s not easy.
If you are building something today, build it assuming AGI is coming in 2 years
There will be a big difference between products built as “AI first”, vs existing products that retrofit AI
There are many open questions about the AI-centric world we are entering
Will software become a commodity? Do you only need product managers?
What’s the point of downloading an app if you can generate an app on demand?
How can users ensure an AI agent is right if everything happens under the hood?
Some companies will have an advantage in the AI era
Companies with data and data creation opportunities (Meta, Reddit)
Companies with secret recipes (like TSMC, ASML)
I hope you found my notes to be useful, especially if you were not able to make it to the event. Many of these talks were also recorded, and I urge readers to check them out here: https://events.ycombinator.com/ai-sus
As usual, please share this post with someone it might benefit. And subscribe to stay tuned for upcoming posts.