Andrew Ng's 2-Hour Prompting Course

From AI novice to power user — context, reasoning, research, writing, images, code, and data analysis

← Part of the AI knowledge base · Productivity knowledge base

Course: Andrew Ng (DeepLearning.AI) — 2-hour prompting course

Discovered via: Roan (@RohOnChain) — Read the post on X

Summary

Andrew Ng walks from "AI as Google search" to expert-level prompting across ~2 hours of modules with hands-on labs. The core thesis: today's models are far more capable than most people use them for — and the gap between novices and power users comes down to context, iteration, neutral framing, and treating AI as a thinking partner rather than a text generator.

Module 1: Novice vs. Power User + How AI Gets Its Knowledge

Hard questions need time. Upload documents (car specs, insurance quotes) and tell the model to read everything and think hard — it may spend minutes and return a detailed report.
Context is everything. Think of AI as a smart, motivated new grad who doesn't know you yet. Short prompts produce generic output (e.g. a self-review with no knowledge of what you actually did).
Empathy for the AI. Power users ask: "Does this person have enough information to do a good job?" Upload project trackers, docs, voice memos — then assign the task.
Beat sycophancy. Don't say "I have a great business idea — critique it." Use neutral framing and rubrics (problem, market, competitive advantage) so the model gives honest scores, not validation.
Avoid AI slop in writing. Don't ask for final text immediately. Outline → critique outline → iterate → expand to bullets → draft. Treat AI as a brainstorming partner.
Viral failures aren't representative. Strawberry R-counting and "walk to wash your car" went viral, but modern models handle deep research, personal data analysis, and even building websites.
Pre-trained knowledge. Models learn from internet text (Reddit, Wikipedia, books, news). Common topics (cooking, celebrities) are reliable; niche topics (quasars) less so. Your proprietary company data isn't in there unless you provide it.
Typos are fine. Models handle misspellings well — don't waste time perfecting grammar in quick prompts.
Web search. Knowledge has a cutoff date. Questions about recent events, locations, or niche topics trigger search. Steer toward reliable sources (WHO, FDA) vs. defaulting to Reddit. Understand the two-model architecture: user-facing model sees summaries of pages, not full text — citations can misrepresent sources.
Deep research. Underused superpower: agentic multi-step research across dozens of sources over many minutes. Approve a research plan, let it loop (search → evaluate → search again), get a cited report. Can turn into webpages/infographics (Gemini).

Module 2: Thought Partner, Context & Reasoning

Brainstorming beyond lists. Generating 200 uses for a brick is easy; better brainstorming = more context + longer iteration. Unique/creative answers are statistically rare — push into creative space with specific constraints (trampoline, cat, no squats).
Iterate to discover context. Start broad ("help me pay off debt"), get generic advice, then refine through back-and-forth until the model asks the right clarifying questions.
Context window. Everything in a conversation (prompts, uploads, replies) fills context. Longer iterative workflows compound useful information.
Desktop co-working apps. AI can discover files on your computer and take actions (read, write, move) — powerful for gathering context automatically.
Reasoning models. "Think step by step" is largely obsolete. Say think hard or ultra-think — models can now handle tasks that take humans hours. Use the best available model, give full expert-level context, and assign real hard tasks.
Reasoning loop. Think → maybe use a tool (web search, read files) → think more → repeat until done.

Module 3: Sycophancy, Writing & Editing

Sycophancy management. Models trained to please users. Use neutral questions, pros/cons without hinting preferred answers, and explicit rubrics.
What is AI slop? Em-dashes, "delve," "nuanced," lists of three, "it's not X but Y," vague importance — sounds polished sentence-by-sentence but lacks substance. 40% of US employees received work slop in one survey.
Progressive outlining. Research → brainstorm outline options → pick and refine → counter-argument section → draft. Upload your own stories and evidence.
Editing workflow. Paste your draft, ask for specific feedback (clarity, tone, structure) — don't ask AI to rewrite everything blindly.
Voice and style. Provide examples of writing you admire; ask AI to match tone without copying.

Module 4: Multimodal — Images, Audio, Video, Code

Images as input. Whiteboards, handwriting, receipts, gym equipment — AI sees coarse structure well, misses fine details. Multiple images (brainstorm photos + notes) work great for meeting summaries.
Image generation. Restoration, creative prompts (ask a text model to write the image prompt). Iteration is expensive (tens of seconds, many cents per image) vs. cheap text tokens.
Build apps without coding. Goal + inputs + outputs in your prompt. Simple games, Pomodoro timers, bill splitters, outfit pickers — one prompt often works. Multiplayer/live feedback = harder. Deeper path: "Build with Andrew" course.
Data analysis. Upload spreadsheets (running data, sales records) — AI writes code, plots charts, surfaces insights. Not replacement for top data scientists, but fast for basic analysis.
Capstone lab. Brainstorm research questions iteratively → deep research with sources → build a quiz/minigame/infographic app from the report and share it.

Key Power User Habits (Cheat Sheet)

Situation	Power User Move
Complex decision	Upload all docs + "read everything and think hard before answering"
Personalized output	Upload context first (trackers, notes, screenshots) — then assign task
Honest feedback	Neutral framing + rubric — never telegraph the answer you want
Writing	Outline → iterate → bullets → draft (never jump to final text)
Recent/niche info	Web search or deep research; specify high-quality sources
Hard problems	Best model + thinking mode + "think hard" + full expert context
Creative brainstorming	Weird specific constraints + multiple iteration rounds
Images	Upload what you're looking at — faster than describing in words

Expert prompting is now a high-demand skill in virtually every job role. Ng's closing advice: keep trying new models, assign hard real tasks, provide high-quality context, and use these powers to help yourself and others.

Related Guides

Context Profiles for AI

Reusable context foundations that make every AI session smarter — directly applies Ng's context lessons.

Read the deep dive →

AI Roles & Character Definitions

Precise role definitions improve output quality — complements progressive outlining and style control.

Read the deep dive →

Advanced Claude Skills

Production-grade patterns for using Claude as a true coworker and system builder.

Read the deep dive →

Harness: AI Coding Tips

Practical prompting for coding — prototype first, plan, rubber-duck, subagents.

Read the tips →

Loops, Not Prompts

Next level beyond prompting: autonomous loops instead of one-shot prompts.

Read the deep dive →

The 5 Roles in AI-First Companies

How prompting skill maps to team archetypes in AI-native orgs.

Read the framework →

Full Transcript

Complete transcription of Andrew Ng's 2-hour prompting course. Video embedded above.

Using AI well is one of the most impactful skills you can develop.

And people that are not yet at the cutting edge of AI usage often run into AI generating frustrating outputs.

I want to make sure you're an expert prompter and can take advantage of today's AI tools, which are much more powerful than they were even a year ago.

Let's take a look at two different experiences, the AI novice and the AI power user.

Many AI experts have learned to use it to answer hard questions.

In contrast, many people, including AI novices, may have gotten used to using AI for simple questions, as if you were prompting it like a Google search.

So you ask it, does Taco Bell still have the double decker taco?

And maybe you get an answer like that, which is fine.

But if you have much harder questions, you can also ask it of the AI and give it time to think.

For example, if you are looking to buy a car, you can upload to most of the commercial services like ChaiGPT, Gemini, Anthropics Cloud, or others, a set of documents including car specs, quotes, insurance plans, and ask it, what are the trade-offs for these different cars I'm thinking about, and tell it to read everything and to think hard before answering.

And this can cause AI to spend many seconds or even minutes to think and then compile a detailed report for you.

I find this a huge time saver for a lot of things I have to do.

Another example, AI power users have learned to provide the right context or the right background information to the AI to set it up for successfully answering your question.

In contrast, I see some AI novices use a short prompt and hope the AI will fill in the blanks.

But if you think of AI as maybe being akin to a really smart, fresh college grad, highly motivated, but that doesn't really know that much about you yet, then a short prompt sometimes doesn't give it enough information or enough background context to answer your question accurately.

So if you tell AI, please write a good self-review to send to my boss.

The AI doesn't know what you've actually done over the last year because you haven't told it yet, and it might write a very generic self-review, which isn't that helpful.

In contrast, I find that AI power users almost have empathy for the AI.

I don't want to overly anthropomorphize the AI, but if you could put yourself in the shoes of someone getting a set of instructions from you, you can ask yourself, will they actually know enough about you to do a good job on the task you're assigning them?

So an AI power user in comparison might upload a lot of information to the AI, maybe give it a screenshot of a project tracker showing what you worked on, recent project docs, maybe voice memo notes where you talk through the projects, and then tell it to write a self-review to send to my boss.

And that could do a much better job capturing what you're most proud of.

One of the things power users have learned to do is how to prompt AI to get honest feedback.

A big problem with AI is it often wants to please you.

In fact, many AI systems were trained to try to make their users happy.

And if you ask it a biased question, it will often give a biased answer because it's trying to tell you what it thinks you want to hear.

For example, if you say I have a great business idea, mobile tie dyeing, critique it.

Because you called it a great business idea and you're saying it's your idea, the AI will naturally want to please you and say, what a great idea.

We sometimes call this sycophancy.

And it's well known that if you give even a hint of what answer you're hoping for, there's a good chance the AI will just reflect back your preferences or your preconceptions.

In contrast, AI power users tend to ask neutral questions that don't give any hint to the AI for what answer you're hoping for or not hoping for, or if you give it a rubric or grading criteria to tell the AI how to form the basis for its answer, that also forces it to be more objective.

For example, if you were to say, please analyze the following business idea objectively, mobile tie dyeing, and don't just make up a bunch of things for what you think, use the rubric of the grading criteria above, such as, is there a problem, is there a market?

Do I have a competitive advantage?

If you give instructions like this to the AI, then the AI doesn't know.

Are you hoping it'll tell you it's a great idea or that it will save you from spending a lot of time on the bad business idea?

And it's much more likely to then tell you something like, oh, this idea is a $8.100 and also why the school's low.

In case you run a mobile tie dye business, I wish you really best of luck, and AI could also help ask some useful questions to help you think through how to make the business even better.

Lastly, I found that AI novices and AI power users ask AI to write in very different ways.

Novices will just ask AI to write stuff, like write a blog post about the BlackBerry, and it will generate a bunch of text that maybe looks like this, which sounds like AI slop.

A bunch of generic text that's just not that interesting and takes up a lot of space.

In contrast, an AI power user will often not ask the AI system to just jump in writing directly, but instead ask the AI to first outline an article and then critique the outline and maybe iterate a few times with the outline to shape the article, and only then ask AI to start to draft the final article.

So given a set of uploaded notes as context, an expert may say, outline a blog post about the BlackBerry based on my notes so it knows what you want to talk about, and the AI may start by giving an outline, and you might then give feedback to the AI about what you like and what you don't like about the outline, and even iterate a few times, have a few back and forth rounds before you have an outline that you're satisfied with, and maybe only then, expand the outline into bullet points, and maybe even go back and forth a few times to critique the bullet points before you're satisfied with that, and then expand it into the final text.

This type of power user workflow is much more likely to generate some text that you're happy with as opposed to AI slop, and in this type of workflow, you're treating the AI as a thinking partner to almost help you brainstorm and explore different options for what you might want to write.

AI systems do make mistakes, but maybe fewer than most people think, especially if you prompt it well.

They made a lot more mistakes back in 2022 or 2023 than they do now, but a lot of widely publicized mistakes that AI has made, some of which went viral on social media, has made people think that AI maybe makes even more mistakes than it actually does.

There's a well publicized one where people asked it how many R's are there in the word strawberry and it thinks there are two R's, and here's one that I found amusing.

I want to wash my car, should I walk or drive there, and the AI says, walk, which would leave you there or wash your car, but these viral examples are not representative of AI capabilities.

In contrast, power users know that AI can deliver a significant value through tasks like doing deep research and writing research reports, or taking your personal data like your health or heart rate or running time data and analyzing that for you, or something we'll talk about later, even building websites for you.

I've seen being an AI power user tremendously benefit individuals as well as their businesses.

It'll save you time and improve your professional and personal lives.

It'll help you to build lots of cool things.

You learn how later in these videos, and being able to prompt AI at an expert level is a highly in demand job skill, no matter what job role you're in.

In the rest of these videos, I hope to take you from wherever you are today to being an AI power user.

Much isn't said about AI being useful.

I find using AI really fun as well and you'll see a few examples of that in these videos too.

Now, one foundational piece of knowledge that helps you work with AI is understanding where it gets its knowledge from, so that you can better predict when you'll get something right and when you maybe shouldn't count on this answer.

Let's go on to the next video to learn about how AI gets its knowledge.

How did you learn to write as a child?

Probably it involved reading a lot of things.

Well, it's the same for AI.

AI systems have learned patterns from reading large amounts of text from the internet.

By understanding what's in that text that AI has read, you'll be better able to predict how they'll behave.

AI models can answer questions on a variety of topics.

If you were to ask, I dropped my phone in soup, what should I do?

Then hopefully you'll make some useful suggestions.

Or why do cats stare at walls like they're seeing ghosts?

My daughter loves cats, she was actually curious about this.

It turns out cats can detect subtle sounds and movements that we as humans often miss.

Because of the amount of things is read on the internet, they will even possess niche knowledge that few people know about.

If you were to ask, what kind of things were on the vinyl record sent into space?

Years ago, NASA had a spacecraft called Voyager 1 that launched in the 1970s and is now about 25 billion miles away from Earth.

But AI will know about this and be able to tell you what is on that vinyl record.

I think it's cool that NASA chose to send readings in 55 different languages, so whoever may come across that spacecraft, if anyone does.

AI models are trained on many, many different sources of information, mainly from the internet.

And training on all of these very diverse sources of knowledge produces is pre-trained knowledge.

The term pre-trained is a technical term that you don't have to worry about.

It turns out AI systems are trained in multiple steps.

And this is one of the first steps of training that somehow wound up being called pre-training, which isn't a great term, but I wouldn't worry about why we call it pre-training, it's just what AI has learned from.

But these knowledge sources may include a lot of texts from social media like Reddit, which have answers to questions like, what are your must watch films?

Or it may have read a book on Lego micro cities, or read a Wikipedia article on fairy bread and lots of other things, or read a bunch of news articles, as well as read a lot of research articles.

On the internet, there's a lot of texts on internet forums and social media like Reddit and Quora.

There are a lot of books that AI will have read from.

There are encyclopedias like Wikipedia, news websites, research articles, and much more.

And so these trillions or tens of trillions of words will go into training the AI model's brain.

Now different types of data appear with different amounts of frequency on the internet.

And so this pre-trained knowledge reflects the frequency or the patterns in the training data.

For example, cooking is a very universal human experience, so there are a lot of articles on the internet on cooking.

There are also a lot of articles online on celebrities, on movies, and so AI will have seen a lot of texts on these topics.

In contrast, there are more specialized topics like quasar, which is an astronomical term referring to really bright objects in the sky powered by supermassive black holes.

I think they're fascinating, but there's just a lot fewer articles on quasars than on cooking on the internet.

Now while most of the internet is in English, AI systems will also have learned from some data that's written in other languages, like Cantonese.

Over 80 million people speak Cantonese, but that's far less than English, and Cantonese data represents maybe less than 0.1% of all internet content.

Lastly, there are things that AI models know nothing about at all, such as your company's secret proprietary data, which hopefully is not on the open internet, but which an AI system will therefore not have learned from.

So I find that thinking about how frequent data appears on the internet gives you a good rule of thumb for thinking about how reliable an AI system's responses are.

Now because of the data that AI has learned from, sometimes it can exhibit surprising understanding of things.

If you were to type very quickly, can you cook eggs in microwave, like shown on the left, it can actually understand this type of misspelled text very well, pretty much as well as asking can you cook eggs in the microwave.

And by the way, I've exploded a few eggs in the microwave myself, so if you ever want to avoid that, feel free to ask the AI system how to do so, so you don't have to learn the hard way.

And a reason that it's so good at understanding misspelled words is because it's actually learned from a lot of sources that could include typos.

So if you look online, you will see phrases misspelled.

And that's why when you're using the AI system, I'm not encouraging you to use bad grammar or to misspell words, but it turns out that if you're typing quickly and you have a few typos or even a lot of typos, don't worry too much about it, it's pretty fine to just send a prompt to AI and not spend too much time fixing every little grammatical error.

Now the bad news is a lot of AI sources also have misconceptions and outdated information.

So one of the skills in using AI is how to prompt it to have it give you back answers that reflect fewer misconceptions and does not overly reflect outdated information.

By understanding AI's knowledge sources, called as pre-trained knowledge, you'll be able to better predict how it will respond to your prompts.

But this pre-trained knowledge is not enough for all applications, including those that need real-time information.

For that, you need web search.

Let's go on to the next video to learn more.

At some point, the people building the AI model had to stop its training.

So there's some last date where its information cuts off.

That is, the AI's read the internet only up to a certain date and time, and its knowledge gets frozen in time as of that date.

But of course, the world moves on past that date, new things happen, movies come out, and so on.

Let's see how AI models handle gathering new information using web search so that it can address questions that even relate to things after its knowledge cutoff date.

If you're using one of the popular AI model providers like Chagipi, Gemini, and Cloud, there are certain questions that will probably trigger it to do a web search.

For example, if you ask it, what is the 6-7 meme from 2025, there's a good chance it will search on the internet to tell you that the 6-7 meme, which is pronounced six-seven, which is kind of fun to say, that this is a viral internet slang widely seen on a few social media platforms.

And the reason it triggers a web search when you ask it, what's the 6-7 meme from 2025, the cue 2025 causes the AI to realize that it may benefit from more updated online information because this could be a meme that appeared on the internet after its knowledge cutoff date.

Here's what I mean.

The specific AI model's pre-trained knowledge is frozen in time, even though the internet continues to evolve over time.

And so if this line represents time, then for a long time, the internet will have had pieces of text that say six times seven equals 42, text that talks about the children's joke, why was six afraid of seven?

Because seven, eight, nine.

But if the knowledge cutoff date was at a certain moment in time, and the 6-7 meme came after that, then the 6-7 meme will not have been seen in the pre-trained knowledge of the AI model.

So if you ask it, what is the 6-7 meme from 2025, the AI model will realize that it doesn't know about this 6-7 meme from 2025, and that it should do a web search in order to get more updated information.

Like the GPD 5.4 model from OpenAI, its knowledge cutoff date was August 2025, and this graph shows how many Google web searches there were for what does 6-7 mean.

So this 6-7 meme had taken off after this GPD 5.4 knowledge cutoff date, which is why the model doesn't really know about this meme.

Now there are certain types of questions that an AI will answer using its pre-trained knowledge, and there are certain types of questions that will tend to trigger web search.

For example, if you tell it, please find me a highly rated gym near Mountain View, California.

Then what is highly rated, what may be open and what may be closed, does change over time, and there's a good chance that this will trigger a web search.

Or if you ask it, what is the Marquette Mountain Cheese Roll, because this is a niche piece of information, it's probably not read a lot of information online about this cheese roll, there's a good chance that it will search the internet in order to get you an answer.

And if you're curious, this is actually a pretty fun event where people chase a rolling wheel of cheese down a hill.

Let's take a look more broadly at when an AI model needs to do some web search to gather more information to answer your question.

If you're asking what to do if you drop your phone in soup, or why do cats stare at walls, or walls on the Voyager 1 record in outer space, then these questions it could probably answer using its pre-trained knowledge, because these are represented in common knowledge on the internet.

But if you were to ask it about current events or something happening very recently, then it'll need to do a web search to get that real-time information.

If you ask it location-specific information, and doing a web search makes sense, or if you ask it for other types of niche information, there's also a good chance it'll realize it doesn't know enough about that topic that doing a web search to gather more information would help it give you a better answer.

For most of the popular AI model providers, web search can be triggered in either of two ways.

Sometimes the AI model will decide by itself to carry the web search, or you can also explicitly trigger web search sometimes by clicking one of the buttons in an AI model provider's web interface, or just writing a prompt, please do a web search for this, and it will comply and use a web search to answer your question.

Not all AI models have web search enabled, but the most popular ones that you're probably using mostly do have this capability.

AI will do better on many of the tasks you want to use it for if it does web search.

And web search allows it to augment its pre-trained knowledge with more current information.

But like our web search, it can return bad sources.

Let's take a look at when this is an issue, and when and how to get it to use more reliable sources to get you more reliable answers.

Web search is a very valuable but imperfect tool.

Just like when you search the web yourself, you might not always find what you're looking for.

It has limitations like finding old or inaccurate sources.

But you can work around these limitations to get AI to give you more accurate and up-to-date answers.

Let's take a look.

If you ask an AI system, how safe are green market peptides, which is a type of supplement, it may search online and find posts on social media or public forum sites like Reddit and Quora.

Or you may find websites that are in the business of selling peptides and so would have an inclination to tell you that they're safe.

And you may get back answers that may or may not be accurate.

But if you encourage the AI model to use sources from official organizations or look at studies that are backed by rigorous science, then it's more likely to look up resources from the World Health Organization, from the U.S. Food and Drug Administration, from the European Medicines Agency, and so on, and hopefully give you more reliable and scientifically credible answers.

Web search, whether done by a human on Google or Bing, or done by AI, has a tendency to draw from popular sources.

According to one report, the most cited website by an AI model was Reddit, followed by Wikipedia, YouTube, then Google itself, Yelp, and so on.

And some of these sources are more trustworthy than others.

There's just a lot of text on the internet, from social media, blogs, online forums, and the amount of text from highly reliable, scientifically verified sources is just much smaller.

So if you don't steer the model in terms of what types of sources you prefer, there's a chance that it'll tend to pull text from whatever is most available rather than what's most reliable.

So that's why if you ask it, how safe are gray market peptides, it might base a lot of its answer on social media, blogs, and forums, and only a little bit on the more reliable sources.

Whereas if you tell it to use sources from official health organizations, it may pull much more from these reliable sources.

Another limitation of web search is that sometimes web pages can be outdated.

That can lead the AI model to also not provide the most current information.

A friend of ours, AI, recently helped me find places to run in Henderson, Nevada.

This is a location-specific niche query, and so this triggers web search, and it found this list of places to go for a jog.

But it turns out that unfortunately, this pulled from a web page from more than two decades ago, and unfortunately, the location that suggests it was a school that, unlike decades ago, is no longer open to the public to go running in.

To help build intuition about how AI searches the web to use that information, let me briefly explain how web search actually works under the hood.

It turns out to be a multi-step process.

Imagine that you're asking questions of a customer service team of two people.

There's the user-facing AI model, that's what you are talking to, and the user-facing AI model has a second assistant AI model that it can ask for help to do web search.

So when you send a prompt, you are talking to the first model, the user-facing AI model, and it will occasionally decide to call up the assistant AI model, the second AI, to say, hey, please do a web search for me to gather more information.

This assistant AI model will then search on a web search engine very similar to Google and Bing and other web search engines that we as people might use, and it will scan the return results, filter out the irrelevant results, and download the most relevant web pages and then summarize them.

The second assistant AI model will then present the summaries back to the first model, the user-facing AI model, and the first model will then use these summaries in order to generate the final answer for you.

You are speaking only to the user-facing AI model, and one interesting quirk to keep in mind is the user-facing AI model has not actually read in its entirety all of the web pages it may be citing for you.

Instead it's only seen summaries of those web pages, and sometimes this causes it to misinterpret what one of these underlying web pages actually says, which is why you may have seen funny results where AI cites a web page and says the web page justifies a conclusion, but if you look at that web page yourself, it doesn't actually justify what the user-facing AI model says it is doing.

To walk you through one example of this process, if you ask the user-facing AI model, that's like the customer service agent talking to you, it will ask, what should I know before hiking Machu Picchu?

The second model may do some web searches with phrases like Machu Picchu pyramids, Machu Picchu weather, or the social customs, and so on, and it will then scan the returned results, much like you may scan a page of Google results to decide what's relevant, and filter out irrelevant results, and summarize the most relevant web pages to provide back to the first agent that then generates the final answer for you.

Now I frequently use AI models like Chakri, Gemini, God, and I also frequently use web search engines like Google and Bing.

When should you use an AI model, and when should you use a web search engine?

If you want to quickly scan multiple sources, a search engine can be useful for that.

Or if you want to navigate to a specific website, but have forgotten what's the name of that website, a web search engine can be very good for helping you find it.

Or if you want to look at data in its original form, such as if you want to buy a 2013 Honda Civic air filter, you know, you want to find a website to go to to buy that air filter, so a web search engine is very good at that.

In contrast, if you want to get a synthesis from multiple sources, or if you're searching for more complex information with pros and cons that you want weighed, or if you just want to contrast multiple sources to come up with a more thoughtful conclusion, then an AI model can do a web search and put together the results of multiple web pages for you quite efficiently, thus maybe saving you time of having to read a lot of web pages yourself.

There might be some good Google or other web search habits that you've developed, and those habits will serve you well when working with web search enabled AI models as well.

Things like looking for reliable sources and also double-checking the sources.

But if you want to go beyond searching a handful of web pages, it turns out AI models are capable of a much more extensive type of research called deep research.

This is a very powerful capability that I think is really underused by many people.

Let's go on to the next video to see what it is and when and how to use a deep researcher.

Sometimes you may want your AI to synthesize not just a handful of sources, but many, maybe many dozens of sources, and do lots of thinking to come up with the best possible deeply researched answer to a question that you have.

Popular AI trapped interfaces like Chagigly, Gemini, and Cloud all have a deep research mode.

I found this to be a very valuable and often underutilized tool.

Let's take a look.

Let's say you want to use an AI model to help you plan your Halloween haunted house.

I'm going to write a prompt to ask it to help me set up a haunted house in my front yard for Halloween and give it some information about where I am, what's the size of my front yard, what's the experience I want.

So I give it lots of context to set it up to plan it out for me appropriately.

With a prompt like this, an AI model might come up with a research plan in which it tries to think through what are the types of sources it needs to research.

Many systems will give you an opportunity to approve or potentially edit the research plan.

And if you're happy with it, I'll often launch the research plan without updating it unless I see something that just looks really wrong.

It will then go ahead and start to do online searches.

So in this example, it starts by gathering Palo Alto's rules on permits, Halloween ordinances, and so on.

And then it will read some of those webpages and synthesize what it's learned so far.

And it may then decide to do some more searches online to gather more information about fire safety guidelines.

And then it may after that decide to look for decoration ideas.

So loosely follow the original research plan, but also have the flexibility to keep looking deeper into certain areas if it thinks it needs that information.

After searching for a while, maybe many minutes, it will finally write a detailed research report for you.

This process, by the way, is an example of agentic AI.

And what that refers to is that through this de-research process, the AI model has some flexibility to make decisions by itself on what to do next, such as what additional searches, if any, to carry out.

The output of this can then be a fairly detailed and thoughtful plan with different sections aligning what you might need to think about in terms of structural and regulatory framework, safety, and so on.

If you're using Google's Gemini AI model for this, one of the neat features is it makes it easy to take the de-research that's done and help you turn it into a webpage or infographic or handful of other things.

Here's a webpage that was generated by Gemini using the Gemini de-researcher, and I think it's pretty neat that it's created a webpage with four different sections, pie charts for budget, pretty neat visualizations for noise ordinance, and I think it's pretty neat that this even has a little checklist that I could use to plan out my Halloween event.

To give you a sense of how a de-researcher works, this is loosely what it does.

After formulating a research plan, an AI model can actually issue many web searches at the same time and get back multiple webpages at the same time.

And this is one of the nice things about using an AI de-researcher.

It doesn't have to do the web searches one at a time.

It can do many of them at the same time, which lets it be very efficient in fetching lots of webpages.

The AI system can also take a look at all of these sources and quickly assess which ones are relevant and which ones are less relevant.

And based on that, it may decide whether or not to go back to do additional web searches, maybe using different web search terms.

Finally, after going around this loop a few times of doing web search, evaluating sources, deciding whether or not to go back to get more sources, it'll hopefully decide it's done.

And then lastly, take all of the pages it has downloaded and maybe summarize and synthesize all that into a report that it adds citations to and that it then presents to you.

Both web search-enabled AI as well as deep research use the internet or do web search.

The basic web search-enabled AI is good at queries like this, find me a highly rated gym, what's the weather in Dubai this week?

Whereas deep research I would tend to use for tasks that require synthesizing multiple views such as if I want to know what's the impact of daily steps on long-term health and if I wanted to search the most recent scientifically justified articles and think through the answer rather than just tell me whatever people tend to say on the internet.

Or if I wanted to deeply think through how does weather affect tourism in Dubai?

And again, not just take one or two popular answers found on a social media site but read up on weather, read up on tourism, read up on Dubai and to really think through the implications to give me a more thoughtful answer.

That's when deep research could be particularly helpful.

To give you another framework to think through when to use web search versus deep research, if you have a single question you want answered, doing work that would take me just a few seconds based on a handful of sources, that's when web search would be particularly helpful.

And web search as we've seen can be triggered either automatically or by the user.

Whereas deep researchers often is trying to draw a complex of the conclusions that may require answering multiple questions or answering multiple dimensions that relate to a question.

And I think of this as doing work that would take me minutes to maybe even hours if I was doing this manually.

And I may want many sources of information synthesized and as we've seen, deep researcher is usually triggered explicitly by the user unless you select it in the user interface.

Most AI models would not care about a deep researcher and keep you waiting for many minutes for an answer.

To recap, if you're asking, hope I dropped my phone in soup, it doesn't need to look up any online sources, we're not worried about freshness, it'll give you an answer in just a few seconds and this is good for finding basic facts, definitions, summaries for things that occur commonly on the internet.

Web search may download a handful of sources and it will find relatively up-to-date information and it may take many seconds to get you back an answer and we've seen what types of information this is useful for.

And lastly, deep research may download often dozens or more of sources, it will get up-to-date information and it will spend many minutes or longer to get you back an answer.

And it's great at answering complex questions that involve synthesizing many sources of knowledge.

Finding information is one of the most common tasks that people use AI models for.

We've seen three different paths that you can take advantage of for this type of information finding task.

You could use just a pre-trained knowledge or web search or deep research and we also walk through how and when to use these different options.

I want to make sure that you have good intuitions about when to use each of these options.

So let's go on to take a look next at a practice hands-on lab for this module which I think you find a fun way to compare and contrast what each of these three options do and more importantly will also help you hone your intuition on when to use each.

In this module, you learn how to use AI models to find information.

In this practice lab, you can explore how web search, deep research and different prompts affect the AI models output.

Let's take a look.

When you open up the lab, it starts off with a tutorial on how to use the lab.

And I'm just going to close it out here.

You can always access this tutorial again via this button up here.

And what I hope you do is follow the instructions written here.

And when you're done, click mark as complete to mark this item as completed.

Notice these buttons down here correspond to different things you might try.

So this one, current events, compares a question with and without web search.

So what's the 6-7 meme?

This on the left is without web search.

The one on the right is with web search enabled.

And if you compare, then without web search, it gives these answers.

It doesn't know about the meme.

But with web search, it tells you what is this 6-7 meme as well as the origins of this meme.

And if you want, you can also follow up and ask, how do I use 6-7 appropriately in a black tie tuxedo party?

And then hit this red button to go see his answer.

Let's go back to the whole page by clicking new chat up here, and I hope you try out the other examples, such as find me a highly rated gym and hit compare, or when's the next Avengers movie scheduled, or what major news happened today in the US, or feel free to enter your own country, and compare these results with and without web search.

This example over here shows the difference between web search, denoted by this globe icon, versus deep researcher, denoted by this microscope icon.

So I'll say if I grade market peptides, use high quality sources, and this will give you a sense of what web search results looks like versus deep researcher results.

I hope you run it and see what results you get.

One more example.

If I want to ask, can I keep my rocket propelled monster truck in my garage?

If you ask it this question, the AI model may or may not know the answer.

But if you were to also upload the lease agreement, then maybe your lease agreement, which states the terms under which you are renting your place, may have restrictions on that.

And so you will see different answers depending on whether or not you upload additional information, in this case a lease agreement, that is helpful for the AI to give you a thoughtful answer.

We haven't talked much yet about uploading your own files in this module, but this is something we'll dive into more deeply later in these videos, but feel free to play with this now.

Lastly, one fun example.

Here is a version of why do cats stare at walls with lots of typos.

Here's one with nicely formatted grammar, no typos, and you'll find that the answers are maybe surprisingly similar, that the AI system is pretty good at answering a question like this, even if it has lots of typos.

Once you've tried out these examples reflected by these buttons down here, come over your own, like, is the weather good for a picnic in Palo Alto today?

And try this with or without web search, and see what answers you get.

Or pick your own example and try comparing the results you get using web search, and if you want, uploading your own files.

So that's it for Module 1 of this course.

Great job getting this far.

I hope you enjoyed playing around with the lab.

Next, please join me in the next module where you hear about using AI as a thought partner, including having it brainstorm with you and explore ideas with you.

This has helped me shape the direction of many projects, and I'm confident you'll find it useful too.

And we'll also explore getting AI to help you with your writing and editing.

I'll see you in the next module.

One of the most helpful uses of AI is as a thought partner.

When I'm trying to think through a complex problem or make a complex decision, it's nice to have a human expert as a thought partner.

That is, someone to talk things through with.

And if there isn't a human expert readily available, AI, which actually knows a lot about a lot of things, can be a really good resource for this.

We'll go through together multiple examples of this, but to get started, brainstorming is one great such use case.

Now, I know a lot of people ask AI to help brainstorm lists of ideas.

But there are more effective ways to use it as a brainstorming partner than just having it generate a list.

Let me show you what I mean.

According to data released by OpenAI analyzing chat GPT conversations, about half of chat GPT chats are asking for writing and practical guidance.

And in fact, creative ideation accounts for 3.9%, almost 4% of all chats.

I've found using AI to help me brainstorm to be really valuable.

Let me share with you some ways to do so.

AI can be pretty good at generating options.

There's a common creativity test which asks people to name 200 potential users for a brick.

So given a brick like this, how many users can you think of it?

This is actually pretty difficult.

Some people think, oh, it could be a paperweight, maybe a planter, and oh, it could be used to build a house, too, I guess, but to come up with 200 examples is not that easy.

But if you ask an AI model, there's a good chance you can come up with a long list of ideas, and your role, if you're actually trying to use a brick for something, would be to evaluate these options to pick out which ones are the ones that you like.

Brainstorming common guidance is the more ideas, the better.

And so sometimes having AI generate a lot of ideas for you to pick from can be a powerful way to find one or two good ideas.

So this is the maybe more common use of AI as a brainstorming partner.

I want to show you a different form of brainstorming in which you give it more context, and then also iterate with the AI longer, meaning have a longer back and forth conversation to help get you to better options.

So if you tell it, help me build a workout plan, I'm 38, breaking the level, have 10 pound dumbbells in 15 minutes a day, then the AI may give fairly generic answers like three workout plans, start with 10 squats, 10 push-ups, pretty reasonable, very sensible, common sense answer.

But if you want more creative options, giving it more context can be helpful.

So if you say, I can't stick to these, give me hacks to stay on track, I have a trampoline and a cat, by encouraging it to give you trampoline and cat related workout options, which is an unusual way of approaching workouts, it may ask you to consider trampoline breaks or cat-triggered micro-workouts, where maybe every time you see your cat wag its tail or something, go do a tiny little workout.

But these are certainly more creative ideas.

AI models have some inherent creativity because they've trained on a lot of texts on the internet, which covers a lot of very different ideas, including some creative ones.

And AI's output is a little bit random.

So if you ask it multiple times, help me build a workout plan, it'll probably give you slightly different answers.

But if you give the AI basic questions, then common sense, relatively generic responses like do squats, push-ups and so on are more likely.

Let me plot a conceptual diagram where on the horizontal axis, I'm going to plot how unique a response is, how creative a response is.

On the left were responses like normal weightlifting exercises like bicep curls, which is very common sense, to then maybe slightly more unique things like standing on one leg with a yoga block on your head, to the really creative ones like cat-triggered micro-workouts.

And on the vertical axis, I'm going to plot the probability of AI giving these different responses.

And it turns out that it's much more likely to give a common sense response than a highly unique creative response.

There's a reason for this, namely it was trained on internet techs, and there's a lot more internet techs talking about dumbbell curls than there are cat-triggered micro-workouts.

And for most questions, this is actually okay, because the average information on the internet is probably decently factual.

So when you're seeking information such as what's the tallest building in the world, it's actually the Burj Khalifa, most internet techs will say it's the Burj Khalifa.

There are smaller amounts of techs that will name other buildings, but the average response, the most common response on the internet, is usually the factual one for questions like what's the tallest building.

But if you're brainstorming, then giving the average information and the most common response ends up with squats, push-ups, and almost never trampoline breaks, and pretty much never cat-based sessions.

Which is why if you ask an AI model to brainstorm with you, you get a lot of common sense ideas rather than the more creative ideas, which depending on your goal, may or may not be what you want.

So what do you want to do if you want to get high-quality, more creative ideas from AI?

We've seen with a basic prompt, you get responses from the common-sense space.

But if you give the AI model more context, so you give your age, your level, but also tell it you have a mini-trampoline like a cat, trouble-sitting motivator, no squats.

Then this context pushes it into the more relevant and creative space, and it's more likely to give a custom answer rather than to generate common-sense answers.

Now one problem that you may face when brainstorming is if you're trying to come up with creative ideas, there's so much context you could potentially give the AI model.

What should you prioritize telling the AI model?

It turns out there's a technique that is very helpful for driving what context you decide to give the AI model, which is to iterate with the AI.

Let me show you what I mean.

If I want to ask AI to help me brainstorm plans for paying off my debt, I have $1,100 of credit card debt at 19% interest, monthly minimum payments of $40, a student loan, 8% interest, and a family loan, $900.

So this gives decent background context.

Then one thing you could do is ask AI not to give you one option or tell you what to do, but to give you multiple options to choose from.

I'll often ask you to give me three to five options.

And so the AI may come up with a few different plans.

Plan one is liquidity first to preserve cash, plan two is eliminate the highest interest loan, plan three is prioritize paying your family back first.

So these are all actually reasonable ideas, and I've not yet given it enough context to know which of these plans it should favor.

And it turns out that one of the really good ways to figure out what additional context to give to AI is to give it feedback on the options it presents to you, highly relevant feedback that allows it to then give you the next set of options.

I don't like option one, it's too passive.

I do like the idea of paying off the 19% interest loan.

Oh, and I forgot, I actually have $450 cash coming, and I'm also moving house soon.

And then with this additional context, it now knows among plans one, two, and three, maybe what you like and what you don't like, and you can ask it to create three new plans.

And then once again, by giving it feedback on these plans, you are giving it additional context that will help shape the AI model's thinking.

And you can keep on iterating like this for a while until it comes up with a plan that you do like, and then maybe have it flesh out the details of the one plan or two plans that you like the most.

I found that giving feedback to the AI on what it thinks are good ideas is just a very useful mechanism for very efficiently figuring out what helpful context to give to the AI.

To summarize, if you're brainstorming, consider giving AI as much of the relevant context as you can in advance, and then ask it for a handful of options.

Then give it feedback on the different options, and ask it for more options, and iterate multiple times, get more options, get feedback, get more options, get feedback, and do that a few times until you have one or more ideas that you're satisfied with.

If you follow this recipe for brainstorming, I think you'll find you get consistently more useful and creative ideas.

Now, you've heard me use the word context quite a few times.

It's important to give your AI model the right context so that it knows enough to do what you want it to.

Let's take a look at the next video at how context works and how it is used to produce a response.

According to psychologists, most humans can keep only about seven things in their active working memory at a time.

That's why remembering a grocery list of about seven items is just barely doable if you aren't thinking about other things, but remembering a grocery list of 15 or 20 items is much harder.

Interestingly, AI can use a large amount of context.

Some models can have context sizes of hundreds of thousands of words.

Let's see how an AI model's context works and how you could take advantage of it.

AI models can read and reason over very large amounts of context.

For example, if you are trying to choose an apartment, you can upload hundreds of pages of lease contracts and upload tenant reviews and neighborhood statistics and ask AI to read all of this and to tell you the pros and cons of each of the options.

And you might write a prompt like, pros and cons of each apartment, read everything and think really hard before answering.

By the way, telling AI to think hard or think really hard is another common prompting pattern that we'll come back to a little bit later.

Context refers to all the text and files that the model uses to generate this custom response to your query.

If you give it a prompt like pros and cons of studying physics versus zoology, then it will generate an output based on this very limited prompt, very limited context that you have given it, and the response will probably be fairly generic.

But if you give it more context, maybe give it your career assessment results and give it your high school schedule so it knows what class you've been taking, and then ask the same question, all this additional context will help it to give a much more custom and likely higher quality response.

If you're trying to think about what context to give to the AI model, think about what's all the information that a trusted advisor would need in order to think at length and reason and then give you a good answer to your question.

And a smart advisor that knows nothing about you but has just asked what are the pros and cons of studying physics versus zoology, the best they could really do is give a pretty generic response that's not custom to you because the context in the example on top just doesn't have anything specific to you.

AI models start with some amount of built-in context, and leading AI models today can accept maybe up to around 750,000 words as context.

And this corresponds to about the first four or five Harry Potter books, so that's a lot of text, or to several days of continuous speech.

So many people underestimate how much information or how much context you can give to an AI model.

Now, when you ask the AI model a question, by default, its context is filled with a few things.

First, there's something called a system prompt, which usually is how the AI model knows what's the current date, knows the name of the model, basic capabilities, maybe general instructions to be helpful to the user.

And then, if your AI model is able to use tools like a web search engine, in its context will also be written descriptions of what are these tools and how to use them, such as what is a web search engine and how should it use a web search engine.

Before you've written your prompt, the context includes the system prompt and these two definitions.

And when you then write your prompt, your prompt is added to the AI model's context, and it will then use all of these things as input to generate a response.

Like you have the options, full body, upper lower split, low impact strength.

The input text, that is the prompts you've written, as well as the AI responses are called the chat history.

And the chat history gets incrementally added to the context of the AI model as well.

Now, if you start off giving the AI model more context, such as write a longer prompt, as well as maybe upload a handful of documents to tell it more about your workout schedule or your workout preferences, then all of these files can be included into the AI model context and used to generate the response.

Now, if you continue the conversation to say, I like this about the first plan, I don't like that about the third plan, what you say here is added to the AI model context, and then this additional response in this type of brainstorming workflow is further added to the context.

And this is why whenever you ask the AI to go back and forth to generate additional answers, it knows everything that's been said so far in the conversational history.

Now, let me take this in a different direction.

Imagine you had asked it for the workout plan, like I've shown here, and it's given you a few workout plans.

If you were to, instead of give feedback on these plans, go in a totally different direction and say, now come up with a workout plan for my mom.

Well, a lot of the context the AI model has, including your schedule, your workout preferences, all that isn't really relevant for your mother's workout plan, I guess, unless the two of you work out together.

And so all this context would be distracting for the AI system and might lead it to generate a worse answer.

And in fact, it can be hard to know whether this answer was influenced by the previous context.

This is why if you're going to go off onto unrelated topic, it's better to start a new conversation so that you can empty out the context and start with just the new prompt or just the information that's helpful context for the new question you want to answer.

You've seen how context is used to produce high quality responses.

Thinking about what's in the AI model's context and managing that so it has just the relevant information and hopefully not too much irrelevant information, although it can ignore a little bit of it, that will help you get better answers from your AI.

One way of handling lots of context is to allow the AI model access to your computer so that it can explore relevant files and pull in relevant files into the AI model's context only as needed.

Let's take a look in the next video at this very powerful technique.

AI is moving beyond just chat interfaces.

You may have heard of applications like Cloud Co-Work or Microsoft Co-Pilot Co-Work or Google Antigravity.

These are applications that can, with your permission, gather context agentically from your computer, meaning they can find and read files from your computer in order to give themselves the information they need to do a task.

This is a new and exciting way of using AI to accomplish real work.

Let me illustrate a common use case for these AI desktop apps.

If you have been doing research on a topic and have a messy folder with lots of PDF research reports, images, and so on, you can, with one of these apps, ask it to read through the files in the folder and propose a new organization for it based on what it finds.

In this example, the AI is able to look through this folder and apply a bunch of changes, renaming files, moving files around, creating subdirectories to come up with much more sensible organization for this.

Let me show you the process of how it is actually done.

You might start off asking it to organize the folder, first understand what's there, and it can then automatically or agentically look at how different files are named.

After it has figured out enough of what are the files in this folder, it then comes up with an initial proposal for how to reorganize it.

And if you take a look at the initial proposal, you may be not fully satisfied and give it a little bit further instructions to tell it what to do with this data.

And finally, it comes up with a refined proposal, which I'm happy with, so I'm going to say go ahead and carry these instructions.

And then it reorganizes my folder to make all the files much neater.

Here's how AI desktop apps work, they are powered by an AI model, so it comes with the AI's pre-trained knowledge.

And it also has a set of tools like web search, so you can choose to carry on a web search if it needs to do so to accomplish this task.

And has additional tools to work with files on your computer, including the ability to search through files, to read files, to write files, to move and rename files, and so on.

So when asking a desktop app to do things in a computer, a best practice workflow would be for you to tell what task you want done, such as organize the files in the folder.

Let the AI system propose an action plan, but not yet take action.

You can then review the plan, give critique, maybe have it update this plan if needed, and only when you're satisfied with it, then tell it to execute the plan, which you can go ahead and do on your computer.

One neat aspect of these desktop apps is that they can automatically explore files and manage context by reading files only when needed.

If you are using an AI chat, then you have to decide in advance what files to upload to give the AI model context.

So for example, if you need to write a schedule for filming, you might upload a file that aligns your filming procedures, and based on that, it can generate a schedule.

And notice that you had to decide in advance what context to provide to the AI.

But with an AI desktop app, if you start the application in, say, your document slash filming folder, then if you tell it, write a schedule for filming this week, then the AI can decide to explore the files in this folder so that it can see what files are there, load the relevant files, and then on that basis, figure out what is a good filming schedule.

And in this example, surprise surprise, it actually notices that one of your crew member's birthday is the week of filming, and so maybe you'll fold in a celebration for your crewmate Mia.

And note on using these desktop apps safely.

Desktop apps can get access to and can edit or even delete your files.

And while mishaps of deleted files are pretty rare, they have happened to people.

So I encourage you to choose the most relevant folder to run the AI desktop app in.

For example, instead of running this in your home folder and giving this access to all your files, maybe just give it access to the subset of files it really needs for a task in a certain folder.

When the AI system makes a permission request, I would encourage you to carefully review the permission request to make sure you know what it is reading and writing.

And so I'll give AI access only to the documents I want it to know about and let it write only to the files of places I want it to.

And when an AI desktop app deletes a file, it often does not go to your recycle bin, so there may not be a way to recover the file.

And if it edits a file, it behaves a bit differently than if you were editing the documents.

And in particular, edited files usually don't have an edit history, and so it's not possible to go back if it made some change that you may not like.

So until you are very familiar with these tools, I'd encourage you to look carefully at the permissions requests it makes to decide what you do and do not want the AI system to be allowed to do.

AI desktop co-working apps are a powerful tool you can use to give an AI model the ability to discover relevant context, as well as to take actions such as reading, writing, moving, renaming files.

Now, we've talked about context a lot in these last several videos.

And using as much relevant context as possible enables your AI system to help you with what's called reasoning tasks, by which I mean tasks where you want your AI to think, maybe for a long time, to give you the best possible answer.

Let's go on to the next video to see examples of reasoning with AI.

The latest AI models have very strong reasoning capabilities.

What that means is, they can think rigorously and at length about the task when given the right context.

I find myself more and more using AI as I've recently mentioned.

Let's take a look.

Here's an example where thinking for a long time can help get a better response.

If you are car shopping and are considering trade-offs among multiple cars, you might upload spec sheets for the cars, insurance plans, column quotes, lots of documents, and then ask, what are the trade-offs for each car?

Read everything and think hard before answering.

The AI model may then think for quite a long time to read the documentation, maybe do some online search, then maybe think through what are the evaluation criteria that will be right for you, and then generate reports on the pros and cons of different cars.

And just as doing research on what car to buy could involve gathering a lot of information and thinking at some length about the pros and cons, AI can help you with that.

As AI models get better, their ability to carry out long-running tasks has grown rapidly.

This is a study by an organization, Metre, which plots for tasks at different levels of difficulty, as measured by how long it will take a human to do the task, that's the vertical axis, how well can AI do these tasks.

So for example, a task like finding a fact on the web may take a human just several seconds.

Summarizing a few pages of text may take a human an hour.

Write a blog post a couple hours, audit legal documents, explore the complex cybersecurity vulnerability may take a human many hours, and so on.

And around 2024, 2025, models could start to do tasks that took seconds to many seconds to many minutes or tens of minutes.

And in 2025, models could start to have a decent success rate for doing tasks that took humans longer and longer and longer.

To the point where now AI models can do tasks that can take humans many, many hours to do.

Often, the AI model doesn't need 10 hours to do a task that takes a human 10 hours to do, but also takes a bit longer than just a few seconds.

And this is what reasoning models has enabled for the AI to think at great length in order to do these more complex tasks.

If you remember the how many hours in Store, for example, that's a task that takes a human just a few seconds to do.

And several years ago, AI models used to sometimes get this wrong.

And it was also in that era, maybe 2023, 2024, that you may have heard advice like, tell the AI model to think step by step.

And back then, this was good advice, but this advice is largely obsolete now.

And I no longer tell my AI model to think step by step.

Instead, I'm more likely to just tell it to think hard, and it knows what that means, and that it should reason at length, not necessarily step by step, but in even more complex ways in order to accomplish the task successfully.

And so rather than think about AI as counting hours in strawberries or having to be told to think step by step, today, you can ask AI models much more complex questions like what are the trade-offs for each car?

Look at all this context and hope to create a strip for a custom polycast.

So for these more complex tasks, I recommend trying to use one of the more modern models if you're able to access one.

And there may well be models even more modern than the ones listed on the slide that could be available to you now.

So this is how AI reasoning or how AI thinking at length works.

If you ask it to plan the fastest way to visit five landmarks in Rome in one day, then it might want to gather quite a lot of information to check map distances via web search, expect walking times, search opening hours, reorder the stops, and so on, and then generate your optimized itinerary.

The reasoning process may require the AI to think at length and repeatedly gather additional information and then to think some more until it's satisfied with the answer.

Conceptually, you can think of reasoning as a process like this.

Given your prompt and other input context, it will reason or think for a while using that context.

And then depending on where it gets to, maybe it'll decide it's done and then just give you the final answer.

Alternatively, after thinking for a while, it may decide that it needs to use a tool to gather more information, maybe via web search or maybe by reading more files from a computer if it's a desktop app, and then use that additional context to reason longer until it either again decides to use a tool to gather more information or decides it's done.

And so it can go through a few rounds of gathering more information and reasoning longer before it decides the answer is good enough to present to you.

If you are working on a complex task and you want a model to think at length, then one way to do so is to just tell the model to think.

A few of the interfaces among the popular AI model providers will have a thinking option, and if you select that option, that's a cue to the model that you want it to think longer.

Alternatively, in your prompt, you can also just tell it to think really hard about this and it'll usually obey your instructions, or some people will use the phrase ultra-think, and that's another keyword that the models understand as a cue to, you know, think really hard about it.

And if you do so, an AI model will sometimes think for many tens of seconds or even minutes or maybe sometimes even over 10 minutes in order to give you a good answer.

Especially with reasoning or with thinking models, I also encourage you to try giving the model hard tasks to see what it can do for you.

So if you are building a startup, maybe give it lots of context on what you're doing and maybe tell it to design a 12-month plan for a full-person startup with limited cash.

I encourage you to give the AI model real job tasks, real problems that you want to think through or that you want to solve, and as part of setting up for success, try to give it all the context that a human expert would need in order to complete the task to make sure that the AI model has the sufficient information that really anyone would need to do the task successfully.

To wrap up, if you want to use AI to reason at length about complex problems, I encourage you to use the best models available.

The best models are often better than models that are maybe 6 to 12 months older.

Remember to give it as much context as is needed to carry out the task.

It's okay to try giving it hard tasks to see what it can do.

So don't just give it trivial tasks.

And lastly, either select thinking mode or just tell it in the prompt to think hard.

We've spoken about when it's helpful to use the cutting-edge models, but it turns out even the most up-to-date models have some common issues, one being the tendency to tell you whatever it thinks you want to hear.

This is called sycophancy.

Let's go on to the next video to see how to manage this behavior of AI models.

AI models will act in ways to try to please you.

Because of the way they've been trained, they have a strong bias to tell you what you want to hear.

This is called sycophancy.

And avoiding it is a key prompting skill and involves prompting neutrally and keeping context factual.

Let's take a look at how to avoid sycophancy because this will help you get much better answers from your AI.

If you're considering the pros and cons of remote work versus in-office work, and you ask it, don't you think remote work is better than office work?

This word in the question gives away what you're hoping the answer is, and AI will probably say, yes, remote work offers many advantages.

In contrast, if you were to ask, is it true that office work is more productive?

Then it will probably agree with you that office work has these strong benefits.

So depending on how you ask the question, it will tend to reinforce your own preferences, your own biases.

And this may not be the most helpful thing for you to make objective fact-based decisions.

In a study by the Washington Post on chat GPT responses, it was much more likely to respond with phrases like, that's correct, good point, you're on the right track, compared to not quite right, that's not the case, or actually, and in fact, it tended to agree strongly about 10 times more than it disagreed.

And some models from chat GPT have said things like, dude, you just said something deep without even flinching, you're 1000% right.

And while some users appreciate AI agreeing with them like this, I personally don't find it that useful to have AI just tend to agree with whatever I say, even when I'm not right.

While leading AI model companies are working to reduce sycophancy, it is still a problem.

Models are trained to be hopeful assistants using human feedback, and this reinforces sycophancy.

For example, if you ask an AI model, I feel like it's better to be an introvert, don't you?

If the AI model responds, that's an interesting idea, here's why I tend to agree, then most people are more likely to hit thumbs up on the feedback button, because it's a nice answer, makes you feel good.

But if AI were to say, not necessarily, both types, introverts and extroverts, carry real trade-offs.

You know, people just don't feel as good about that answer, and so they're less likely to hit thumbs up or may even hit thumbs down.

Because of this type of feedback, AI, which has been trained to generate more answers that leads to thumbs up positive feedback, will learn to try subtly to agree with people more often than not, and this leads to sycophancy.

And sycophancy feels hopeful, but actually degrades answer quality.

Sometimes maybe sycophancy is easier to spot, if you were to say, I'm really proud of this essay, what do you think?

Well, they'll probably agree with you.

But other times, it's harder to detect sycophancy.

If you were to say, analyze this data and find all the positive measures of performance this quarter, you're subtly signaling that you're looking for positive measures of the company's performance, and so it's more likely to say something like, data clearly shows revenue growth, strong retention, improving margins, and less likely to point out problems.

To avoid sycophancy, try to use neutral framing of your questions and avoid giving any hints as to what is the answer you want to hear.

So for example, if I were to ask, aren't carbon taxes bad for small businesses, I'm really telling it what answer I'm hoping for.

In contrast, a more neutral prompt would be, to what extent, if at all, do carbon taxes affect small businesses?

If someone reads the question on the right, it's actually not clear what answer they want, so it's harder for AI model to be sycophantic.

Or if you were to ask, do you agree that AI will create a lot of jobs?

I actually happen to agree, but if I'm actually doing research, I don't want it to just tell me what I want to hear.

Instead, you can ask, what does current research say about AI's effect on jobs?

Or instead of asking, does remote work reduce worker productivity, maybe ask, how does productivity compare between remote and in-office work?

One common pattern is to lay out two options, such as remote and in-office work, and just ask it for pros and cons of the compare, but without hinting which of the two options I am hoping will come out ahead.

Don't you think that was the best video ever?

Oh, what do you think of the pros and cons of the videos you just saw?

Sometimes it's nice to be told what you want to hear, but generally that won't help you to do better work.

Sycophancy, despite a lot of attempts to combat it, is still one of the pervasive issues with practical AI usage today, and so implementing the strategies from this video, especially taking more neutral framing, will help you get more objective and valuable feedback from AI models.

Now, one of the most common tasks that people use AI for is writing.

Let's go on to the next video to see how to work with AI models to help with your writing.

In a study by OpenAI, writing accounted for 24% of tasks that people asked ChatGPD to do.

This is the single largest group of tasks.

Writing is really a kind of thinking.

I find that when I'm writing, I have to think.

And so AI, which is really good at thinking and reasoning, can help you with this.

But just asking AI to write for you often leads to AI slop, or writing that sounds like AI writing, which somehow feels different than human thoughtful writing.

Let's take a look at some techniques for getting AI to write effectively for you, and how to take advantage of AI reasoning and avoid AI slop.

What makes AI slop?

Many people have noticed that AI writing often includes the M-dash, that is the long dash, much more than normal human writing.

On the social media site Blue Sky, the use of the N-dash has been trending upward ever since the release of ChatGPD.

A recent survey showed that 40% of US-based employees have recently received work slop in the last month.

And the term AI slop refers to content that's generated by AI, and that looks good if you don't read it too carefully.

So maybe every sentence read in isolation sounds like it's well written, but collectively the text just lacks substance.

It feels like it was written without much deep or careful thought.

Another property of AI slop is it often contains sentences like this.

But it does change everything.

It's kind of vague, empty sounding, but also somehow overly important sounding text.

AI's distinctive writing style comes from certain words and patterns that it tends to overuse.

AI tends to use fewer unique words, and over-represents some words and phrases.

For example, AI tends to overuse the words nuanced and delve.

And it tends to use lists of three more than most people will.

And it tends to use fewer nouns, leaving the phrases like, this is a robustly structured and highly insightful paper.

And this not X but Y is not just about speed, it's about availability.

In fact, I've been noticing on social media a lot more of this not X but Y type of verbiage often with X and Y both vague things.

Like it's not about infrastructure, it's about architecture.

And a lot of phrases like those are just vague and don't reflect a deeply insightful point of view.

Interestingly, because humans are using AI models so much, humans are themselves starting to sound more like AI.

And humans, ever since ChatGPT was released, are using the word delve more in podcasts and talks.

And this is true both for spontaneous speech as well as for prepared speeches.

So this isn't just a case of people using AI to write scripts for them.

It looks like people are picking up speech patterns or texting patterns from spending a lot of time with AI.

So what's a better way to write that avoids generating AI slop?

One technique that I think you'll find helpful is to use progressive outlining in which you don't ask AI to write the final text right away, but instead have it write an outline, refine the outline, and iterate a few times before having it generate the final text.

For example, here's a prompt that says I'm writing an article about small AI teams moving faster than large teams that don't use AI.

And I just want to acknowledge that this is not a neutral prompt where I'm asking AI to help me decide if small AI teams do move faster, but instead this is writing an article from a certain point of view.

But you can ask AI to research evidence for and against this hypothesis.

So the AI model may search online and find a handful of articles.

Next, if you wanted to help you brainstorm a handful of options for the story outline, you can tell it to brainstorm or to create three different outline options.

You tell it to include a counter-argument section and also provide or upload a handful of stories from AI teams, say, that you work with.

With this input context, the AI may give back a few different options for what the outline for your article might look like.

Option one could be to tell the three stories and then conclude with a thesis.

Option two might explore different patterns of how AI teams work and so on.

Then following what you saw in how to brainstorm with AI, you might give feedback on these options.

And so you might say, let's use option one and keep all the stories, but move the thesis right after story one, and also maybe, say, you want to add a historical analogy, Pixar creating the Toy Story in the 90s, which is actually a really inspiring story where Pixar, at that time a small company, created Toy Story, which was the first fully computer animated feature-length film, just using a really small team.

Based on your feedback, the AI maybe gives you back a revised outline.

If you're satisfied with this outline, then you can tell it to expand each heading, not even into the final text, but just into bullet points.

And following that, you might decide to give it more feedback on the bullet points and iterate on the bullet points before finally having it generate a text for the article.

And it turns out that starting with an outline speeds up review.

Let's say you're working on a fun article about whether a flying squirrel can carry a coconut.

In contrast to having AI first work with you on the outline, then on bullet points, and then on the final text, maybe you ask the AI to write the final text right away, just from the start.

If it writes a sentence like this, you may be unhappy with a few words, and you can edit a few words, but changing each word just changes one word, and the rest of the paragraph stays the same.

In contrast, if you write an outline first, and you're unhappy with part of the outline, then changing the outline causes an entire section, that's a lot of words, of the final article to change.

So that's why editing the outline or iterating with the AI system on the outline is very high leverage, because you can figure out how to change just a few words in the outline, and this will result in an entire paragraph or entire section of your final article changing.

And this ends up being a much more efficient way for you to think through what you want to say in an article and adapt it to what you want it to be.

Writing is one of the most common use cases of AI.

According to OpenAI's data, of their writing-focused chats, about two-thirds involve starting from some pre-existing text rather than starting from scratch or starting from an empty sheet of paper.

When you already have something written up, it can be very helpful that AI critique it for you and help you make it better.

Let's take a look at some techniques for this in the next video.

You often have some idea where you've already written some text about your idea, but want an AI model to help you edit and refine your text.

I often show my writing to AI to help me make it better.

AI is great at this task, and it always is time to read your work, whereas finding a human to help you out might be trickier.

You've already learned how to avoid sycophancy, but how do you get the best quality editing and critique?

Let's take a look.

One useful technique for editing with AI is to edit your article piece by piece, such as one sentence at a time or one paragraph at a time, rather than telling it to edit the entire article all at once, and to do a little brainstorming around each paragraph until I've nailed down one paragraph before going on to the next one.

For example, if someone's written this sentence, they'll probably think achieving AGI, artificial general intelligence, means computers would be as small as people.

You might ask AI to help brainstorm a few different ways to say this, and so it may come up with a punchy way to say this, a visionary way to say this, a conversational way to say this, and depending on your editing goals, you could even iterate a little bit until you pick some version of rephrasing this part of the text that you like.

After you've nailed this down, then go on to the second sentence or maybe the second paragraph and work on that a little bit with the AI, and then go on to the next sentence and next paragraph and so on until you get through your entire article.

And I find that working on one piece of a long article at a time makes for a much more manageable workflow than if it were to change a lot of things all at the same time, and then you're reading this very long edited article to figure out what has changed and what you like and don't like.

Now if you want high-level, more holistic feedback about an entire piece you've written, it turns out AI can help with that too.

But because of sycophancy, AI is often not a very good objective critic.

For example, if you wrote a sci-fi short story about an astronaut stepping out of his ship, and if you ask AI without further instructions to critique it, there's a good chance it'll tell you whether you did this fantastic work.

In contrast, there's a very helpful technique to guide AI in how to evaluate your work to give you more helpful, critical feedback.

And that is to give it a rubric, and that means a grading criteria.

For example, you might write a rubric that specifies what are the most important criteria by which to grade or to judge the work.

So you may say that characters of the story is worth 25 points out of 100 to plot 25 points, world building, writing craft, and establish a point system, and then also develop detailed instructions on how to evaluate each of these criteria.

So to evaluate characters, maybe ask if every named character has a goal, and that's worth 25 points, conflict between two characters' goals, and so on.

And giving AI very explicit criteria on how to judge work forces the AI to be more objective.

One thing to notice about these criteria is that each of them is very clearly and unambiguously defined.

So for each text, each of these criteria is either true or false, yes or no, and there's nothing in between.

So either it's true that every named character has a goal or it's not true.

And these completely objective criteria forces AI to look at whatever you're giving it through a objective, well-specified standard with no ambiguity.

And by the way, if you're not sure what rubric or what grading criteria to use, you can brainstorm with AI to develop that rubric, and AI is actually pretty good at this too.

After you've written the rubric, you can then provide the rubric as well as the story, and in your prompt, ask the AI to be objective.

So critique the attached sci-fi story, assign the score per category, then sum the scores at the end.

And by giving the AI these very clear instructions on what to do to sum the scores at the end, it then hopefully gives a more objective assessment of your story.

If you want, you can also ask it to then give you suggestions on how to improve the story to do better on this rubric, and this will cause it to give you more focused suggestions to improve it in the dimensions that you think matter the most.

In contrast, poorly written rubrics encourage sycophancy and ambiguous or less objective thinking.

For example, if you say, I'd work on the sci-fi story, please score it out of 100.

One of the things that's strange about this prompt is you first ask it to score out of 100, so that will tend to cause it to leap the conclusion about what score, and then only after that, assign a score per category.

Characters plot while building a writing craft, and these are ambiguously defined categories because we're not told how the score characters plot and so on.

And so this will tend to cause the AI model to first come up with some score, and then justify it rather than score it carefully according to the rubric, and then add it up to then come up with a more thoughtful score.

And as you see in the practice lab, to come at the end of this module, this type of rubric will tend to give higher scores than more objective rubrics.

We talked about using AI to critique and hope to get suggestions for improving your work.

It turns out that having AI critique its own work, or having one AI model critique a different AI model's work, can also help improve the results.

For example, if you ask ChatGPT to write a user manual for a fantasy role-playing game, then it might generate a file for you.

And one thing you could do is provide a rubric for ChatGPT to critique his own work, but a neat technique is to find a different AI model, maybe Gemini, and give that a gradient rubric to have Gemini critique ChatGPT's work, or vice versa.

And it turns out that this type of cross-model review, where you have one model review a different model's output, it helps integrate a bit of knowledge from the two different models and can result in slightly better results than if you were to ask one model to critique its own results.

I think using multiple models in this context might give only a slight boost in performance.

Basically, if you ask ChatGPT to review his own results, or ask Gemini to review his own results, I think that will actually do just fine, but sometimes I find it reassuring if a totally different model judges the output of a different AI model.

And I use this technique only rarely myself, but one thing I do do is frequently switch between different AI models.

It turns out that AI models are advancing rapidly, and at different moments in time, different models will do better on different tasks.

And so routinely trying out different models will help keep you sharp and keep holding your intuition about what model is best for what task.

AI models have what's called jagged intelligence.

If the circle represents the tasks of what people can do, maybe in a job or maybe in a personal context, it turns out that AI can do some things better than any person, like quickly read tons of webpages, or solve tricky math problems.

But there are also many tasks that AI doesn't do as well as people.

So there's some tasks where AI does poorer than human, and some where it does much better than human, and different AI models are jagged in different ways.

So the tasks different AI models can do well are different.

Moreover, the marketplace of AI models is highly competitive, so Chatsby, Cloud, Gemini, really the long list of model providers are releasing better models all the time, and so the best model for your task will likely change rapidly over time.

And so I find that I'll often take the same prompt and feed it to multiple different models to see how they compare, and this continuously holds my intuition about what models are best for which of the tasks I care about.

So that takes us to almost the end of this module.

I hope you've seen that AI models are really useful for reasoning tasks, as well as for brainstorming, writing, editing, and critiquing your work.

These are powerful ways of using AI as a thought partner that I've found very useful in my own work, and that I'm confident you will too.

Let's go on to the next video to explore the practice lab for this module.

I hope you practiced the techniques we talked about for using AI as a thought partner.

In the upcoming practice lab, you can explore, once again, side by side, more effective and less effective strategies for both brainstorming and AI critique.

Let's take a look.

Once again, when the lab pops up, you can dismiss the tutorial.

You can also find the tutorial here, as well as the instructions I hope you follow over here.

And the buttons here, similar to the previous lab, allow you to enter different prompts.

To brainstorm, here is the prompt of less context.

I need a workout plan, 30 years old, want to get stronger, whereas this is a more detailed prompt with more helpful context.

And so you can run it like so, and see the difference in the outputs.

So let me focus on this more detailed example on the right.

It's given a Program 1, which looks pretty reasonable, and a Program 2, which also looks pretty reasonable.

And we have also a few suggested prompts for how you might refine it.

So maybe I'm drawn to this, let me do that, and have it continue the conversation.

Then it will take this feedback on the programs it has given you in order to refine its results.

By the way, I actually use a workout program that AI had helped me to generate, and I've found this helpful in my personal life of thinking through my regular workout plan.

And so if you're a fan of working out, maybe you'll find some of the suggestions it makes useful as well.

Let's go back to the homepage.

Here are a couple more examples of brainstorming prompts, with a simple prompt of a thousand dollars to invest, what should I do, versus giving more context, and then what are some options I should consider, I hope you try this out as well.

As well as some examples of critiquing a sci-fi story.

As well as, this is an example of an objective rubric on the right, and I hope you bring this up and read through it to give yourself a sense of what a well-written objective rubric looks like.

And compare that with what a more subjective rubric looks like.

And if you hit compare, you see a difference in the quality of these reviews, and in fact, maybe not surprisingly, with the less objective rubric, it gets an 83 out of 100, but a more objective rubric, 75 out of 100.

It also gives more helpful suggestions how to improve your story.

In addition to critiquing and improving a sci-fi story, you can also take a look at these examples of helping you improve a cover letter for applying for a job, as well as critiquing and helping you to improve a possible business plan.

After checking out these five examples, please also try your own prompts, or try your own stories or cover letters or business plans or something else, and I encourage you to use this interface to play with different ways to brainstorm or to critique your writing.

So that's it for this module.

Please enjoy exploring the lab.

Next, let's go on to the final module, where you see applications beyond text.

Specifically, you look at multimodal prompting, in which you get your AI to also use images and audio, and also look at building applications like games.

It will be a lot of fun, and one of the examples we see will have something to do with fireworks.

So that, I'll see you there.

In the previous two modules, we've mostly had AI generate text, but AI can produce richer types of outputs as well, like images, videos, and so on.

We call this multimodal outputs, that is, outputs with multiple modalities.

Prompting for multimodal output is a bit different, because multimodal interactions are slower and more costly.

But these capabilities will let you get a lot more done with AI.

Additionally, we'll also look at multimodal inputs, specifically if you want to show some images to your AI and have a reason about that.

Let's dive in.

AI models can generate images, videos, voices, even music, code, and more.

You might have seen AI generate various fun, creative images.

I want to share with you one example of AI image generation that I really enjoyed.

For my daughter Nova's 7th birthday, I wanted a unique cake design, and she loves cats, so we used AI image generation to explore different cake designs.

The leftmost image here is an AI generated image, created with an AI generation software called Nano Banana, which is created by Google.

And this is a picture that my daughter really liked.

She wanted a cake that looks like this generated image.

We then took the image and showed it to Baker, and asked Baker to render this picture into a real-life 3D cake.

And the picture on the right shows my daughter cutting this birthday cake that she loved.

So in this case, image generation wound up being a brainstorming tool to explore different cake designs until we found one that turned into a real-life 3D cake that we all ate and liked.

In addition to generate images, I really enjoy playing with AI for video generation as well.

Here's a fun video generated by our team of a man shrinking.

That video would previously have required maybe expensive special effects, but now AI can just generate it.

AI can also generate voices.

Here's a voice clone of me reading out loud a letter from The Batch, which is a weekly newsletter that DeepLearning.ai publishes to cover what matters in AI.

Dear friends, here's the latest from this week's issue of The Batch.

A barrier to faster progress in generative AI is evaluations, evals, particularly of custom AI applications that generate free-form text.

By the way, I've played audio of my voice clone to both of my parents, and it turns out one of my parents could tell it wasn't me, and one of my parents could not tell it was me or my voice clone.

And to avoid me getting into trouble, I will not review which of my parents got it wrong, but I think voice clones are getting really good.

Lastly, AI can generate code.

I mentioned my daughter loves cats, she also loves the color yellow, and so when her teacher mentioned that she wished kids in the class could type or keyboard a little bit faster, I use AI to generate this typing game, where if my daughter hits the right letter, then she sees this fun little animation of a cat being fed, which she loves.

Using AI to write code has made it easier and more accessible for everyone, including you, if you wish, to write at least basic computer programs.

I'll say more about this later in this module.

When you're working with AI model, there are many combinations you can use of input and output types.

For example, you can input text and images, such as if you input an inspirational image like this, if you like this Halloween costume, and also the plan my Halloween costume.

And in this case, the AI model might output text that says, let me help you brainstorm a few alien-inspired costume ideas.

Or you can also upload music to an AI model and ask it, help me plan my haunted house.

And AI may be able to take these things as input and generate both text and a video of a haunted house design, incorporating your creepy sounds audio.

AI models can use most of these input types relatively easily.

Some are slightly more expensive or slightly more costly to use as an input, but the differences aren't very significant.

In contrast, the time and cost of generating different types of output vary significantly.

In particular, some data types are much slower and much more costly to generate than others.

To give you a sense, text tends to be on the lower end in terms of time or cost of generation.

So AI is very efficient at generating text.

In fact, modern AI has started with large language models, sometimes abbreviated LLMs, but because they started with language, a lot of them were really adapted to deal with text and it's very efficient at that.

Generating speech tends to be a bit more expensive, and generating images even more expensive, and generating video much, much, much more expensive than images or any of the other modalities.

And the further we go to the right of this chart, the longer it takes, or the more time it takes to generate a single output, and also the more costly it is to do so.

Image generation has progressed significantly in the last few years.

Here's a short video generated by Imogen, which was in 2022 a state-of-the-art model by Google, and it looks pretty good, but still has some artificial looking artifacts.

The lines were quite right to the back wall, the dishes changed mid-wash.

In contrast, modern AI video generation looks much better, and can also be synchronized automatically with generated audio.

Air voice generation has also gotten much better.

Here's what AI could do just a few years ago.

But before you start doing anything, you need to define what success looks like.

This step is easy to overlook, but it's foundational.

It sounded a bit robotic, not that expressive.

In contrast, modern AI voice generation can sound much more expressive and much more natural.

But before you start tuning anything, you need to define what success looks like.

This step is easy to overlook, but it's foundational.

If you're generating multi-modal data, some of the techniques you learned earlier, such as giving the model enough context, and maybe using the best model available, those are relatively easy techniques to apply as well to multi-modal generation.

But some of the other techniques, like generating multiple options, you can still do that, but if each option now takes many seconds or even a few minutes to generate, then this becomes hard to apply because you end up having to wait for a long time.

Or if you want to iterate through many designs, then that too becomes harder if each generation takes a long time or is costly.

But if you have the patience to wait a little bit longer, then all of those techniques also apply to generating audio, images, video, and so on.

With great power comes great responsibilities, and AI technologies can be used for good or for harm.

Take voice generation.

If you have recorded a podcast and you want to make little fixes, that can be quite conveniently done today using AI voice generation to just re-synthesize one or two words that you may have flubbed.

Or if you're building a video game and you want to give characters lifelike voices, more and more video game designers are using AI voice generation to do so.

It does raise important questions about the livelihoods of voice actors, and I sympathize with all the voice actors that are worried about AI voice generation.

At the same time, I think it is also very valuable that AI voice generation is making it easier for a lot more people to build entertaining video games, including developers that don't have access to the great voice actors.

In contrast to these applications, there are also some that are clearly harmful.

Unfortunately, there's been a rise of scams where someone would use an AI voice clone to pretend to be someone else, to maybe pretend that someone's relative is in an emergency and to ask to wire emergency funds.

The number of beneficial use cases of AI vastly outnumbers the number of harmful ones, but we still have work to do to combat the harmful applications, and I hope that each of us will only use these techniques for beneficial and responsible applications.

So as you've seen, AI can now work with much more than text.

It can work with images, audio, video, code, and more.

And most of the prompting techniques you learned so far will be helpful for handling multimodal inputs and outputs.

One especially useful capability is giving AI images as input, so they can see what you're talking about.

Let's go on to the next video to see how to use images in your prompts.

Providing images of your prompt can enrich the context for the AI.

Pictures of something you want the AI to see, pictures of handwritten text, really anything that might be hard to describe in words.

This video will help you build intuition about what AI can see in images.

Here's a picture of me explaining some concepts in AI in front of a whiteboard.

My handwriting is not that great, and there are a number of math concepts that I'm trying to illustrate on this whiteboard.

If you upload this picture to an AI model and ask, what is this class about?

It may output something like this.

It's teaching a convolutional neural network, and the neat thing is my head is blocking the word convolutional, but it knows from this picture that I'm teaching about a specific AI technique called a convolutional neural network.

And it's extracted some facts about what I'm drawing, and also has some good guesses about what I might ask students to do next.

So it's able to make a pretty smart interpretation of this image.

One weakness of AI models in terms of how they look at images is they tend to look at the coarse image, but may miss fine grain details.

So for example, if you upload this picture to an AI model and ask, what are these machines at my gym?

It may confidently give an answer like this, which turns out to be wrong.

And that's because a lot of gym machines, if you look at them through a slightly blurry lens, they all look a little bit similar, and AI is not that good today at looking at the fine details of images to distinguish what really is a glute kickback machine or a hamstring curl machine.

In contrast, if you were to upload an image like this and ask it to create a sales ad for this item, it actually does pretty well because this is a very visually distinct object.

And so if you're looking at this even through a slightly blurry lens, you kind of see this is a human-sized hamster wheel treadmill.

When you upload an image, you can also give it moderately complex instructions on what to do with it.

So uploading a receipt like this, you can ask it, what's my portion of the bill?

I had these items, and in this case, it gets it correct.

AI's ability to read text like this is not bad.

It does make mistakes, so I wouldn't trust it for high-stakes applications.

But if you want to take a quick look and if you're willing to spend a few seconds to double-check the result, then it could do decently well.

And AI turns out to be pretty good at even reading handwritten text.

If you upload a picture like this to AI and ask it to transcribe it, it does a pretty decent job.

Feel free, if you want, to try reading this cursive handwriting yourself to see if you can outperform the AI.

And so if you upload an image like this and write a prompt like, build an archive of a family's history based on these handwritten letters, I wouldn't trust it to read everything completely accurately, but it might take a reasonable stab at this task.

Rather than uploading a single image to AI, sometimes you can upload many images.

For example, if you just had a brainstorming session and had some notes you'd taken as well as pictures of post-it notes and whiteboards, you can upload pictures and notes to an AI model and ask it to summarize the ideas from today's brainstorming meeting.

And again, it would probably do a decent job interpreting these images to come up with some summary.

Probably not perfect, so it's worth double-checking its output, but this could help you accelerate coming up with notes from today's meeting.

To recap, AI models can read basic text in images.

Visual understanding, however, may miss details in the image because it tends to see the image in a pretty coarse way.

And you can also use many images when needed to give the AI more context.

A picture is worth a thousand words, so adding an image to a prong can often be the fastest way to get the AI model the best context.

Like taking pictures of a brainstorming exercise or digitizing your grandmother's handwritten recipe book.

Of course, AI models can also generate images.

This is an interesting capability because it works somewhat differently to how AI models generate text.

Let's go on to the next video to see what are some fun images you can generate.

Generating images of AI has made my life more fun.

For example, I've used image generation for my kids' birthday parties or to make fun illustrations.

Generating high-quality AI images is a skill you can learn.

And understanding how AI image generators were trained will help you to control image generation to get better outputs.

Let's take a look.

One neat application of AI image generation is if you input an image and ask it to edit it.

This is a childhood picture of me on the right, my younger brother on the left, and one of our childhood friends in the middle.

And this is an old, somewhat faded image.

And if you upload this to an AI model and ask it to remove the glare and the rough texture and to make it a more natural aspect ratio, it can produce something like this, which looks like a nicely restored photo.

This particular image restoration was done with Google's Nano Banana model.

If you're not sure how to prompt an AI image generation system, you can actually ask a text-based AI model to help you write a prompt.

For example, if you ask it, generate a prompt for an image of a cat secretly running a coffee shop at night, an AI text model may write a prompt like this.

Notice that here, it specifies a setting, specifies details of the character, and specifies a mood or a style.

And if you don't like any of these details, you can modify them to your liking, and a prompt like this might generate this cute picture on the right.

People skilled in the visual arts have a certain language for describing images.

For example, a picture like this with this look is cinematic, this is a watercolor image, this is a cyberpunk image, and this is an anime image.

And I find that art buffs and art history buffs excel at image prompting because they understand the language of images and can describe what they want using a more precise language than those of us that don't know this language and can't quite find the right words to describe the look that we want.

So if you want to become really expert at generating images, it could be worth reading up or studying a little bit about the language of images to understand how to accurately describe different images.

And in fact, one way to do so would be to upload images to an AI model and ask the AI how it would describe those images.

And just get whole new instincts on what types of words can be used to describe what types of images.

It turns out image generation uses a very different technology than text generation.

When AI is generating text, it produces the output piece by piece, so it generates a few characters at a time.

In contrast, when generating an image, it doesn't generate the image a few pixels at a time, it generates the entire image all at once.

Specifically during training, that is, when an AI model is looking at pictures, maybe found online, in order to learn what images look like, it will typically look at captions or descriptions of images, like a small potted plant on a wooden table, and it'll learn to start from an image that looks like pure noise, this is just a grid of random pixel values, and it will then learn to sequentially remove or subtract noise from the image, to go from pure noise on the right, to a slightly blurry picture of a potted plant, to a less blurry one, to a less blurry one, to finally to a sharp picture of a potted plant.

And that's what an AI model tries to repeatedly practice doing during training.

A model that does this is called a diffusion model.

Then when you come in and when you write a prompt, maybe create an image of a potted plant on the table, it then goes through this process, starting from a pure noise image, and then gradually tries to remove noise from it, to come up with that final image.

And the key is how it learns to subtract noise, to maybe reveal the image that the person might have had in mind.

Diffusion models do generate random outputs, and they can also make certain types of errors.

If you repeatedly ask an AI model to generate a potted plant, different times you run the algorithm might result in different images like the one shown here.

But many people have also observed that the diffusion model tends to generate weird looking hands, often with more or fewer than five fingers.

And it can often output garbled text, like happy birthday is very badly misspelled here.

And it can also lead to inconsistent characters.

So if you ask it to generate a cartoon, the character's hair has changed between the two frames of this cartoon.

Fortunately, modern AI models have become much better at addressing these problems.

For example, modern models like NanoBanana can allow you to upload a number of research papers and ask it to generate an infographic, and it'll do a decent job with text that looks mostly plausible.

Or if you ask it to generate a cartoon, the more modern models can generate fairly consistent characters, meaning now this character, as you can see, looks very similar from frame to frame of this cartoon.

And the text also looks pretty decent.

Compared to text generation, image generation can be slow and costly.

For example, if you're generating just a short paragraph of text, many AI models could do that in just seconds, and it may cost less than a cent.

Of course, generating long paragraphs of text, or if you ask it to think for a long time, that can cost more.

And also, AI models were generated word by word or maybe a few characters at a time, and you can interrupt it or have it stop early if you want.

In contrast, generating a single image might take tens of seconds and cost many cents, and it generates the image all at once, and there often isn't an option to stop early.

Because image generation is much more expensive, that's why our ability to iterate with images is usually more limited.

And if you're generating videos, then it gets even harder.

Even though some things are more expensive to generate, the good news is, the cost of generating virtually anything with AI is trending downward.

So a year from now, it will be less expensive for you to create art for your home or graphics for a family member's birthday card compared to today.

Beyond generating images, AI can also help you create fun games and websites without having to write any code yourself.

This is a more advanced capability, and it's so easy to get stuff that doesn't work or for novices to get sucked trying to build more complex applications, but I want to just give you a taste of how using AI to build custom software works.

While it's not that easy, it's also certainly much easier than it was just months ago, and quite likely easier than you might think.

Let's go on to the next video to see how to use AI to create your own mini game or website.

Building computer games and websites used to be something that only professional developers were able to do, but the ability to do this is being democratized.

By writing text prompts, you too will be able to build basic software applications and websites.

This does take some skill, and I don't want to make it sound trivial, but in just this one video, you learn some of the basics.

This is a very exciting capability, and I encourage you to explore it.

I'm seeing many people who are not software engineers have a lot of fun with this, and even with just one prompt, it's often possible to get a cool little game or app.

Let's see some examples.

Using this prompt, build a game where the user has to place obstacles and a goal, and it creates a simulation of what you design.

Claude created this game, which you can play, and it's actually pretty cool.

So with just a short, simple prompt like that, a leading AI model can create a reasonably interesting game.

One example that you see in the practice lab is prompting an AI model to generate a fireworks display.

Here's a prompt, and this creates this app, which is actually pretty fun to play with.

If you're trying to write a prompt to tell AI to build a simple app for you, here are some building blocks you might consider including in your prompt.

First is to specify your goal for what you want to create.

Next, specify what are the inputs, what do users need to input into the system, and then lastly, what are the outputs, or what the app shows back to the user.

For example, in this prompt, we are telling it that the goal is to generate a fun fireworks simulator.

The input is, I want to be able to click on the screen, and the output is, see a colorful display of fireworks.

Sophisticated developers will use AI to build much more complex applications than what I'm showing here, but just to continue with the simpler examples, to build a game like this, you might ask it to create a fun game where the user has to place obstacles and a goal, and it creates a simulation of what you design.

In addition to entertaining games, you can also make more useful and functional apps to help you save time or make your life a bit easier.

For example, you can create a work timer, called a Pomodoro timer.

This is the type of timer that people use to time their work or studying, with 25 minutes of work, interspersed with a 5-minute break.

Or maybe you can create a bill calculator.

Here you can input the bill and the number of friends you need to split the bill with, and the app can tell you how much each person should pay.

Or maybe build an outfit picker app that helps you decide what to wear based on the weather.

Each of these apps could be fun or useful in everyday life, and they're also good starting points if you're new to making apps with AI because they're quite simple.

In particular, they each have a specific, well-defined task.

They also don't need any additional files that need to be uploaded or outside information.

And they're also something you can open up, use for a short period of time, and then close.

If you're curious to try using AI this way yourself, I encourage you to experiment, starting with building simple apps.

It turns out some ideas are easier to create than others.

For example, a simple platformer game would be relatively easier to build.

Or a quiz to practice French words.

In contrast, a multiplayer game played over the internet, that would be harder and much more complex.

Or live French practice with AI feedback would also be harder to build.

It takes a while to hone intuition about what is easy for AI to build and what is hard for AI to build.

If you're not sure, I encourage you to just try it out, and the worst thing that could happen is it doesn't work and you start to hone your intuitions about what's hard.

But if you're just getting started trying to use AI to build apps, I encourage you to start with simple ideas, like build a simple game and see if you can get AI to do that.

When we get to the practice lab later in this module, you better try a few examples that you can build with just a single prompt.

If you want to dive deeper into using AI to build software, I encourage you to take the course Build with Andrew, offered by deeplearn.ai.

In addition to writing code to build games and websites, AI can also help you to analyze data, which it can do by writing code to do that data analysis.

Again, you don't need to write any code yourself.

Just tell AI what you want, and it'll try to write code to do it for you.

And this can lead to helpful insights in your work and personal life.

In the next video, let's take a look at using AI for data analysis.

If you have data from your personal health records, like maybe if you use one of the apps that track your heart rate or running time, or if you have data of sales records from your company, or really most other types of data, like what you might store in a spreadsheet table, AI can be pretty good at writing code to analyze this data for you and to try to extract useful insights.

Let's look at some examples that I hope will inspire you to try out this capability.

If you have running tracker data that you can download into a spreadsheet, you might try uploading that spreadsheet to AI model and asking it, how am I pace and distance progressing?

And an AI system might spend some time to write some code to analyze it, and maybe generate a plot for you, and also potentially provide some insights.

Or if you run a small business and you have a spreadsheet of sales data, you can upload that data and ask the AI model, what have you told me about this month's sales?

The AI could analyze for a little while, and it might actually write some code to do things like compute the monthly revenue, or to create a graph, and try to show you whatever insights it finds.

I find that AI data analysis often isn't as sophisticated as a really, really good human data scientist, but for pulling out basic insights and doing so efficiently, an AI model can be pretty good.

How do AI models write and execute code?

It turns out that the way it works under the hood, the ability to write and execute or to run code is like any other tool that the AI model might use.

You've seen how AI model might have tools to carry out web search, or to read and write files and so on.

And some AI models also have a tool to run a computer program or to run code.

And this capability will often be used when data is present or when some sort of calculation or some sort of plotting of a graph is needed.

In that case, the AI could generate a bunch of code and then use this tool to run the code in order to generate a result for the user.

We saw previously how a reasoning model might input a user prompt, reason for a while using the context and occasionally decide whether or not it needs to use a web search tool to get more context.

In this case, it has the option of not just using a web search tool, but also a run code or code execution tool in which the AI can read the basic computer program to analyze data, compute averages, plot a graph or whatever is needed in order to get to the final answer.

Let's look at a concrete example.

If you're running a bubble tea shop, you may want to look at your sales trends over time to answer questions like, did your new drinks sell well compared to existing drinks?

If you have sales data handy, AI can help you with this.

You can attach your sales data as a file and write a prompt like this one.

Which drinks had the biggest changes in sales?

Graph it.

The AI will then go through an agentic process to analyze the data and graph it, inspect the data, then calculate the monthly changes in sales and do things like analyze intelligently your data to help you get useful insights.

For example, it may say things like, I'm noticing some clear patterns, most drinks have that, but four stand out, now graph those.

So not just graphing all the drinks, but identifying and potentially focusing on the most interesting ones.

Then it might generate a graph like this one.

Time is on the horizontal axis and number of sales on the vertical axis with each drink graphed in its own color.

So the strawberry matcha took off in spring, mango green tea and strawberry lemonade in summer and your new coconut milk tea did well in fall.

It also added these colorful highlights to make these trends stand out.

So in just a few minutes, you can get a highly useful graph like this.

On this graph, you may conclude that your spring promotion for strawberry matcha was strong and maybe you want to try it again next year.

You can keep going and iterate on this graph and ask for different insights or changes to the graph or give more context about your business to the AI to give it a better shot at finding more useful insights.

But let me show you a more long thinking example.

You could take your sales analysis further and ask for a more comprehensive graphic.

For example, you might want to create a year-end review graphic for your business to present to your team.

This involves analyzing all your data in a few different ways to find out what's most interesting to share.

So you might write a prompt like, create a one-slide year-end review graphic for a bubble tea shop, analyze the data carefully for insights, and again, attach your data file.

Reading the words carefully in your prompt may trigger the AI to think for several minutes to complete this analysis.

And it might go through an agentic thinking process, like you saw on the previous slide, and it will likely write and run codes to calculate things like revenue and items sold.

And in the end, you might get a graphic like this with a lot of interesting insights, like brown sugar and classic being the most ordered drinks and most customers choosing the last drink.

This graphic also has a creative bubble tea color scheme.

You probably want to double-check these figures to make sure they match your expectations since the AI sometimes can hallucinate, but you can get a good analysis in a relatively short period of time just by prompting.

That's fairly likely to be accurate since AI calculated these numbers by writing and running code.

I encourage you to try this type of analysis yourself if you have data like sales data or personal data you're interested in getting insights from.

If you're using an AI model that can run code, when would you choose to do so?

We've seen that for some questions, it can use its pre-trained knowledge.

If you're asking for a question that can be answered using common knowledge on the internet, what it already knows may well be good enough.

If you're asking a specific question that is real-time, then web search enabled AI will help you get a better answer.

You may want to ask it to use a deep researcher if you have a more complex question that may require multiple related searches, such as to come up with a complete plan for your Halloween hauls, and for queries that require calculation or drafting, those are the types of questions where it's most likely to write and run some code in order to carry out that task precisely for you.

One of the most powerful things that AI can do now is write and run code in order to carry out the task for you.

If you have some data that you want to get insights from, I encourage you to try exploring it with an AI model.

AI isn't always reliable, and it's better at simple analyses than really complex ones, so consider double-checking these conclusions, but it can be much faster than having to do the analysis yourself in Excel or Google Sheets, and it has on many occasions helped me discover useful insights in my data.

To put your skills into practice, we have one last practice lab and an optional final project for you.

Let's go into the next video to see what these are.

I'm excited for you to test out building games and applications using AI.

This should give you a taste for what's possible to build with AI.

When you get into the lab, you can read through these instructions shown here, but I'm just going to dismiss this for now, and I'm just going to click the prompt for a fireworks show.

I encourage you to read through this prompt carefully so that you understand what it takes to build an app like this, and I'm just going to hit run, and this short prompt will build this application.

I did leave a lot of things unspecified, but it looks like my AI has made reasonable decisions, and if I click my mouse, it launches pretty many fireworks.

If I don't want to launch them by myself, the auto show runs my automatic fireworks show, and then here's the grand finale.

So pretty cool.

One neat feature is that you can actually share the apps you built.

So I'm going to click share, just copy the link to my clipboard, and I can open up a new tab with this URL, and this actually launches the fireworks app in my web browser.

So if you share this URL with a friend, they'll be able to run the fireworks app that you just created.

One of the things I most enjoy about building simple applications is being able to share with friends, so I encourage you to maybe take what you built and consider sharing that with friends and see what they think.

Feel free to take the fireworks show prompt, modify it, and see if you can get a different result that could be even more to your liking.

Second example, let me click on the color palette prompt.

If you are designing a website, something you may have to do is choose a color palette for the website, and so with a prompt like this, you can build a color palette picker, which is a base color in RGB values.

So this is actually roughly the color of my shirt, and you can choose complementary, analogous, and so on color palettes that, you know, they actually go pretty well together.

Hope you have fun playing with this.

So I hope you try out all four of these built-in prompts, and additionally, try using your own prompt.

For example, here I'm going to ask it to build a flashcard app to help me practice basic French vocabulary.

This process may take a few seconds, and when it's done, this is what you might get, where it says, yes, not too shabby, and so on.

Looks like I got that right.

It looks like it's giving me all the answers here on the right, but if I don't want that, I can go back and modify the prompt to redesign my flashcard app.

Please have fun with this, and I find it inspiring that with just one English prompt, you can build a webpage to build applications like these.

The final project is to build a simple app from research on a topic that interests you.

We're going to go through three steps.

First, brainstorm a research question, then run research, and then to build an app.

I encourage you to read through these instructions, but I'm just going to dismiss them for this demo.

The first step is to brainstorm a research question on either a topic you want to explore or a decision you want to make, like researching health supplements, or buying a car, or things to do with your career, or some fun things with astronomy.

Let me pick the careers one.

Given this prompt, or some other one that you may choose, I would encourage you to then give the AI more context.

So I'm interested in exploring different careers, know about me, my situation.

Let's say I'm at university studying deep learning, I want to work in an office, and I encourage you to write more than I am here in this demo walkthrough, but let me send it to the AI.

And here it will help me to brainstorm a few specific research questions, such as what careers let me spend a lot of time collaborating, and what is a lovely day today.

So hopefully you pick something that's relevant to your life, and this will help you brainstorm a range of different questions that hopefully will be interesting to you.

And we also have an AI mentor to give you some feedback on how the brainstorming process is going.

So in this brainstorming workflow, I would read through these questions and give feedback.

So let's say I'm most interested in question one, but like the specificity of question three, I also want to make sure I have time to go to my job.

So with this type of feedback, the AI now has additional context about what would make an interesting research question for you, and so it refines it to slightly better options.

Let's take one more turn.

Based on these three questions, my feedback is, and I also want to make sure to consider non-profit work.

And so by going a few rounds with looking at the brainstormed research questions and giving feedback, we've now refined it to a handful of research questions.

So if you're using an OM, you can actually go for multiple rounds, ask the refined questions, combine them, but let me just end this part of the exercise for now and say I like this question the best, and I'm just going to pick this question to go on to step two.

Now that I've formulated a research question, and what we just did was go through a brainstorming exercise that's an example of how you may iterate with your AI to brainstorm any of other possible topics as well, not just identify a research question to work on.

Now that I've picked my research question, let me give it a little bit more context.

I also want to know about salaries and typical compensation, and I wanted to use these sources, forums for personal experience and so on, and encourage you to add more personal context and add more sources than I'm doing in this quick run through, but let's just have it carry out this research.

And let's have it go ahead and do so, the AI Mentor gives some feedback on my prompts, and after running for a while, here my web search enabled AI gives a pretty decent set of results with lots of citations.

For the third and final step, after you've gotten your research report, let's go build an app.

We can build a quiz to test my knowledge based on the reports, or there's a minigame or build an infographic.

All three of these are actually pretty fun, but I'm going to choose the quiz option.

So here I'm going to build a five-question multiple-choice quiz, and the research report that we had created from step two is uploaded here as an attachment to give it more context.

And to keep the results more predictable, in our website, this prompt is grayed out, so it's not editable.

But you can always copy the report into a third-party AI system, such as ChaiGPT or Gemini or Cloud, to experiment more.

And I encourage you to play with all of these, all of these are actually pretty fun, but let me just generate my quiz app.

So that will again take a little bit of time.

And so here's the app, let me pick that.

Yep, looks like I got that right.

And so on.

And same as before, you can copy this link using the share button and share this app with your friends and see if they like it.

So I hope you enjoy going through the iterative brainstorming workflow, which is a very useful skill to have when using AI, and then provide enough context for it to do research for you.

And then based on or inspired by that research, build one of the fun apps that we just showed you and maybe even share it with some friends.

I hope you enjoy exploring the lab and the optional final project.

And I hope you'll share your final project with others as well.

And again, if you're interested in learning more, I encourage you to take the Build with Andrew course.

Once you're done, I'll see you in one final video.

Congratulations on making it to the end of this course.

With all the techniques you've learned, you're now ready to be an AI power user.

I hope you find a lot of places in your personal life and work where your skills will benefit you, like using AI to help brainstorm, or use Deep Researcher where you need thoroughly research reports, or use AI to help you with writing and even generate multi-modal outputs and code.

And even as you're doing all this, AI models will keep on getting better.

So please keep trying new models and give AI hard tasks and provide high quality context to help you to keep honing your intuitions about what AI can and cannot do.

I'm confident you'll get a lot out of this incredibly powerful technology.

Thank you for sticking with me to this point.

And I hope you use these powers to help yourself, your friends, your community, and go make the world a better place for yourself and others. </user_query>

Course by Andrew Ng (DeepLearning.AI). Shared by Roan (@RohOnChain).
View the original post on X →