ZDNET’s key takeaways
- Different AI models win at images, coding, and research.
- App integrations often add costly AI subscription layers.
- Obsessing over model versions matters less than your workflow.
The pace of change in the burgeoning generative AI world is blisteringly fast. It’s often hard to keep up with everything, even if it’s your full-time job. Readers tell me that one area they find particularly confusing is the wide array of poorly named AI models. What in the heck is the difference between GPT-5.1, Opus 4.5, Gemini 3, etc.? And why would you use one over the other?
Also: Claude Code made an astonishing $1B in 6 months – and my own AI-coded iPhone app shows why
To be honest, trying to fully understand the detailed differences between each of the AI models will send almost anyone screaming into the woods. But it’s fairly easy, especially with examples, to understand which models to choose for different tasks. That’s what we’re going to do in this article.
Keep in mind that apps and AI models are different beasts. The model is the underlying AI engine that does the intelligence processing. The application is the tool you use. Here’s a diagram, created in Google’s Nano Banana Pro, that gets it mostly right.
But in the same way that different vehicles use different kinds and brands of engines, different applications use different models. For example, I gave the exact same prompt that produced the above diagram to ChatGPT’s image generator (on the left) and Midjourney’s image generator (on the right), and we got these results:
As you can see, ChatGPT’s result is simple and fairly clear, whereas Midjourney went overboard while simultaneously failing to follow the prompt. Midjourney is great for conceptual images (like my bunny outside a restaurant creation shown in this article), but it’s fairly terrible at diagrams.
Also: The best AI chatbots of 2025: I tested ChatGPT, Copilot, and others to find the top tools now
That said, always keep in mind my friend Vinnie Moscaritolo’s description: “AI isn’t the sharpest tool in the box. Sometimes it gets it perfect, and sometimes it dives straight into the rabbit hole of stupid like it packed a lunch for the trip.” Rabbit hole of stupid. Gotta remember that phrase.
(Disclosure: Ziff Davis, ZDNET’s parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
The cost/profit factor
Most AI companies sell their models both as APIs (which programmers can incorporate into their applications) and as chatbot applications. Keep in mind that an AI chatbot isn’t an AI on its own. An AI chatbot like ChatGPT is really a chat application (like a messaging app or Slack) that calls into an AI model for intelligence work.
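To make that distinction concrete, here is a minimal sketch of how a chat application calls into a model over an API. It assumes the OpenAI Python SDK and an API key in your environment; the model name is just an example, not a recommendation.

```python
# Minimal sketch: the "app" is this script, the "model" is what the API call hits.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY
# environment variable; the model name below is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # the underlying engine doing the intelligence work
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "In one sentence, what's the difference between an AI model and an AI app?"},
    ],
)

print(response.choices[0].message.content)
```

Everything you see in a chatbot, from the chat history to the share buttons, lives in the application layer; the intelligence comes from that one call.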
While there are some really good free AI chatbots, once you start using them a lot, you’re likely to buy into one or more of the roughly $20/month plans each offers.
You’ve probably noticed how many software vendors are now including AI capabilities. I’ve been a QuickBooks Online user for many years, but now QuickBooks is all about pitching “upgrades” to sell you AI agent subscriptions.
Some vendors allow you to use the AI capabilities from the paid chatbot plans you’re already paying for inside their apps. But most application vendors want that additional revenue stream. Even if you already pay for AI models like GPT-5.1 or Opus 4.5 through your chatbot plan, most software vendors incorporating AI will charge you their own fee for access to those models.
Also: 4 ways I save money on my favorite AI tool subscriptions – and you can too
How you decide what to use should be based on what you need. To help out with that decision, I’ll go over the models I use for many of the different tasks I do for work. Keep in mind that I rarely choose the model first. I choose the task and application, and the model usually comes as part of that application.
Creating explainers
- Models used: Gemini 3 and Gemini 2.5
A perfect example of this is Google’s NotebookLM, a tool I added to my kit about six months ago and now use at least weekly.
NotebookLM is a notebook tool integrated with a language model (hence, NotebookLM). But I don’t use it as a notebook. While some folks like that capability, NotebookLM has a superpower: it creates audio explainers based on the source material you feed it.
Also: I found an open-source NotebookLM alternative that’s powerful, private – and free
You can choose whether you want straight audio or an audio track accompanied by slides. Mostly, I just go for the straight audio. Whenever I get a particularly dense document, press release, or technical paper, I feed it into NotebookLM. About 15 minutes later, the app produces a discussion exploring the key points and main issues of whatever I’ve fed to it.
To be clear, I never use that output directly in my work. But it’s a great way to get some quick triangulation on the big issues related to whatever I’m working on.
Google doesn’t explicitly describe the model underlying NotebookLM, but it’s certainly a Gemini variant. Since Gemini 3 just came out, most of the work I’ve done with NotebookLM was likely generated with Gemini 2.5.
Identifying keywords
- Models used: GPT-4o, GPT-5, GPT-5.1
Notice that I listed three models above. That’s because OpenAI kept updating its models, and as it did, the model my application used automatically upgraded as well. That’s why I don’t want you to obsess over model versions. If you read this in three months, all the models will be different.
In this case, I’m using a self-hosted web archiving application called Karakeep. Karakeep replaced Pocket as my way of storing articles I’ve found for future use in research. One thing that Karakeep does that Pocket never could is create really great automatic keywords.
Karakeep does this using OpenAI’s API. Because the self-hosted Karakeep is open source and doesn’t charge a subscription fee, it doesn’t bake an AI tax into its pricing. Instead, I use an API key provided by OpenAI and pay OpenAI based on my usage.
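For the curious, the per-article call looks conceptually something like the sketch below. This is my own illustration of the kind of request such a tool makes, not Karakeep’s actual code; the helper name and model choice are assumptions.

```python
# A rough sketch of the kind of per-article request a tool like Karakeep sends
# to OpenAI. This is my illustration, not Karakeep's actual code. Assumes the
# OpenAI Python SDK and your own API key; you are billed per token for each call.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def suggest_keywords(article_text: str) -> list[str]:
    """Hypothetical helper: ask the model for a few keywords for one saved article."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": "Return five short keywords, comma-separated."},
            {"role": "user", "content": article_text[:8000]},  # trim very long articles
        ],
    )
    return [kw.strip() for kw in response.choices[0].message.content.split(",")]
```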
I have 24,594 items indexed in Karakeep. Moving my listings from Pocket to Karakeep took about two months, because Karakeep had to rescan every article and call out to OpenAI so the AI could generate keywords. That process cost me about $40, which works out to a fraction of a cent per article.
Since the original conversion, I have been billed $5 every couple of months. Well worth it.
Coding
- Models used: GPT-5.1, GPT-5.1-Max, Claude Opus 4.5
I use AI models in two very different ways when coding. For projects where I’m dropping a chunk of code into the chatbot and asking what it does, what’s wrong with it, or why I’m getting an error message, I haven’t found anything better than ChatGPT Plus using GPT-5.1.
When I tested the free chatbots for coding prowess, Microsoft Copilot actually beat ChatGPT. And when I later tested the paid version of Claude as a chatbot, running Opus 4.5, it failed half my tests. But the $20/month ChatGPT Plus, using the latest GPT, has reliably done a great job.
Also: The best free AI for coding in 2025 – only 3 make the cut now
However, when I do agentic coding, I’ve found that OpenAI’s Codex using GPT-5.1-Max and Claude Code using Opus 4.5 are astonishingly great. Agentic AI coding is when I hook up the AIs to my development environment, let the AIs read my entire codebase, and then do substantial, multi-step tasks.
For example, I used Codex to write four WordPress plugin products for me in four days. Just recently, I’ve been using Claude Code with Opus 4.5 to build an entire complex and sophisticated iPhone app, which it helped me do in little sprints over the course of about half a month. I spent $200 for the month’s use of Codex and $100 for the month’s use of Claude Code.
It does astonish me that Opus 4.5 did so poorly in the chatbot experience yet was a superstar in agentic coding, but that’s part of why we’re looking at different models. AI vendors are still working out the kinks in this nascent technology.
Notion databases
- Models used: Claude, ChatGPT, Gemini
Despite my annoyance with Notion’s prioritization of AI upsells over basic product functionality, I finally broke down and paid Notion’s AI tax for the year. Notion doesn’t call it that, of course. Instead, it offers very limited AI trial capabilities with its $12/month Plus plan. You have to upgrade to the $24/month Business plan to get the AI features for real. I saved a few bucks by buying the year at once, which brought my effective rate down to $20/month.
Also: Dreading AI job cuts? 5 ways to future-proof your career – before it’s too late
There are two things I use Notion AI for. The first is searching and summarizing my article drafts. ZDNET doesn’t allow outside AIs to access its articles, but I often need to dig back into my previous articles to review what I’ve written.
I can point Notion AI to my draft library, and it will scan the entire thing, not just searching for articles, but summarizing what I’ve previously said. This is great for assignments like “Summarize what I’ve written about home labs by looking in the blogs database.”
Note: I don’t use this AI-generated content in my articles. It’s solely for planning purposes.
As you can see from the screenshot above, I can pop open the Notion AI window and ask questions of it, right while I’m writing. The article you see in the screenshot is this article, still being drafted.
The other thing Notion AI does for me is turn giant lists into categorized databases. I did a project where I gathered together a large list of AI-related programming tools, but I wanted to be able to select and view tools by vendor and category. I gave Notion the list, and it created a database for me. It wasn’t perfect, but it was usable.
Notion doesn’t use just one model. It tends to jump between Claude, ChatGPT, and Gemini (and their various models) depending on what it thinks will do the job best and which model can do it for the least cost.
Speech recognition
- Model: Parakeet
I use the Mac’s dictation feature quite often. My little pup likes to sleep on my left shoulder. That’s adorable, but it makes it somewhat difficult to type with two hands. So I use my mouse with my right hand, and often dictate small blocks of text using the Mac’s native dictation feature.
Recently, I’ve begun exploring dictation programs that offer more advanced speech recognition and analysis capabilities. One such tool is Paraspeech. Paraspeech offers two features I find quite compelling.
Also: I’ve been testing AI content detectors for years – these are your best options in 2025
First, you pay just once. There’s no subscription fee. This is possible due to its second feature: the AI runs on your own machine. When you install Paraspeech, it downloads a variation of the Nvidia Parakeet speech recognition model. The benefit of this is that your speech doesn’t get sent to the cloud. It just stays on your machine.
This example shows how much one AI model can differ from another. This model runs right on my own machine, it’s designed specifically for speech recognition, and it’s included inside the application rather than called by it. That allows the developer to charge a one-time fee without having to pay usage fees back to an AI vendor. It may also give the developer a sales advantage, because paying once is quite attractive to most consumers. We all pay subscription fees when we have to, but nobody likes them.
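If you want to see what on-device speech recognition looks like in practice, here is a rough sketch of loading one of NVIDIA’s published Parakeet checkpoints with the NeMo toolkit and transcribing a local file. This is my own illustration under those assumptions, not how Paraspeech is actually built; the checkpoint and file names are just examples.

```python
# Rough sketch of running a Parakeet model locally with NVIDIA's NeMo toolkit.
# This is an illustration, not Paraspeech's implementation. Assumes
# pip install "nemo_toolkit[asr]" and the published parakeet-tdt-0.6b-v2 checkpoint.
import nemo.collections.asr as nemo_asr

# Downloads the checkpoint once, then everything runs on local hardware.
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="nvidia/parakeet-tdt-0.6b-v2"
)

# Transcribe a local audio file; no audio is sent to a cloud API.
results = asr_model.transcribe(["dictation_sample.wav"])

# Depending on the NeMo version, results may be plain strings or hypothesis
# objects with a .text attribute.
print(results[0])
```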
Deep research
- Models: GPT-5.1 Thinking (in ChatGPT Pro)
Deep research is a mode in which the AI spends a long time searching, reasoning through answers, and compiling information. It often uses a ton of AI resources, so it costs more to access. Back when I had the $200/month Pro subscription for my programming, I also used Deep Research to do some fairly amazing things.
My favorite example was when I asked Deep Research to create a marketing briefing document from source code. I used a rather long prompt, so I’m just going to summarize. What I told it was that there’s a fairly standard process where engineering managers brief marketing and product managers on what a piece of technology does.
Also: Stop accidentally sharing AI videos – 6 ways to tell real from fake before it’s too late
Marketing managers then use that briefing, which includes features, tech tidbits (think Apple presentation) and use cases, as a foundation to help them produce all their marketing materials and activities.
But I didn’t have an engineering manager. Instead, I told Deep Research to read all 12,000+ lines of source code, intuit from it what the products did, and then write the briefing documents. The results were astonishing. The AI got a few minor things wrong, but they were easy to correct.
I then upped the ante by taking those briefing documents and feeding them to NotebookLM, which produced compelling, voice-annotated slide shows describing the products. It was amazing, fascinating, and terrifying, all at the same time.
Net-net-net, though, since I’m my own engineering and product manager, the two AIs together probably saved me 60 to 80 hours of product management work.
Analysis, SEO, and general use
- Model: GPT-5.1 Auto
My go-to tool for general business support AI operations has been the Plus tier of ChatGPT. While it can occasionally be incredibly uncooperative, it’s also mostly useful. Even if I didn’t write about AI for a living, I’d probably keep my $20/month subscription. I usually keep the model set on Auto mode, which lets the AI choose the processing level for each prompt.
My wife recently bought a Plus subscription. She has been going great guns with ChatGPT, discussing the nuances of religious arcana. She’s both fairly devout and fairly academically inclined, so ChatGPT has been hugely fun for her. I particularly enjoy the moments when she calls ChatGPT out on a fib or fabrication, because that behavior amuses and annoys her at the same time.
In terms of practical use, I find the Plus tier particularly helpful for data analysis. I feed it big spreadsheets of different kinds of data, like sentiment analysis results, and have it crunch and digest the numbers, find the key issues, and give me back results. I also use it to help choose SEO keywords for my YouTube videos.
Up until now, no other chatbot has been as broadly useful. However, Gemini 3 looks like it might give ChatGPT a run for its money. Gemini 3 has only been out for a week or so, so I don’t yet have enough experience to compare them fully. But, who knows, in six months this category might list Gemini 3 as the favorite model instead of GPT-5.1.
Tools and models I don’t use
Among the big names, I don’t use Perplexity, Copilot, or Grok. I know Perplexity also uses GPT-5.1, but it’s just never resonated with me. It’s known for search, but the few times I’ve tried some searches, its results have been meh. Also, I can’t stand the fact that you have to log in via email.
Copilot is an interesting case. It did well in the free coding evaluations, but it’s very Microsoft-centric. Despite the fact that I was once all-Redmond, all the time, I barely ever use Microsoft products now. I keep my family subscription to Microsoft 365 Family because I know Excel and PowerPoint like the back of my hand. But I just don’t do Excel or PowerPoint work much anymore. Copilot is tightly integrated with those tools, so it just hasn’t proven to be that compelling yet.
As for Grok, I was surprised how well it did in my coding tests. But it doesn’t hold an agentic candle to either OpenAI’s Codex or Claude Code. It also did a great job on my travel itinerary test in my free chatbot tests. But it still seems inconsistent, and there are better options easily available. I will continue to test it, and nothing is cast in concrete, but it’s off my list for now.
Also: Apple’s iPhone App of the Year is an AI tool for people with ADHD – and it’s free
You may also have noticed that I didn’t talk about AI-generated video. I haven’t really had much of an occasion to do much with AI video, other than my early Sora 2 evaluation for ZDNET. I’m sure I’ll do more. When I do, I’ll write about it here. Also in the video category, I’m certainly hoping that Apple adds more and better AI capabilities to Final Cut Pro, which is my preferred video editor.
Oh yeah. Apple. Nope. Sorry. Nothing to see here. Apple just hasn’t been on my radar much for AI. I tried its Coding Intelligence feature in Xcode, and it just crashed all the time. Apple’s complete lack of AI representation in articles like this is bad news for the company. The Cupertino giant really needs to up its AI game, and fast.
What about you?
What about you? Do you actively choose different AI models depending on the task, or do you mostly stick with one favorite tool? Have you found certain models to be clearly better for images, coding, research, or writing? Are there any tools mentioned here that surprised you, or ones you strongly disagree with? Let us know in the comments below.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.