Gemini vs. Copilot: I tested the AI tools on 7 everyday tasks, and it wasn't even close




Hello, fellow humans! AI chatbots will soon replace us. They have access to more knowledge than our puny brains can hold, and they can easily be turned into powerful agents that can handle routine tasks with ease.

Or so we are told. I keep trying Microsoft Copilot, which uses OpenAI’s GPT-5 as its default LLM, and I keep being disappointed. Occasionally, it gets things right, but just as often — or so it seems — it face-plants in spectacular fashion.

Also: Google’s Gemini 3 is finally here and it’s smarter, faster, and free to access

Does that mean it’s time to choose a new LLM? Google’s Gemini 3 has been winning rave reviews recently, so I decided to put it to the test, with a head-to-head challenge against Copilot.

My goal was to identify a selection of common tasks that an ordinary computer user (not a developer or scientist) would use in a desktop browser on a PC or Mac. For each scenario, I executed the same prompt on each assistant and made note of the results.

Let the games begin.

Challenge No. 1 – Put together a trip itinerary

Winner: Gemini

When product managers want to show off their super-smart AI tools, their go-to example is a virtual travel agent. So, my first challenge is a simple “build an itinerary” request for a dream European vacation, visiting an assortment of Christmas markets. Here’s the prompt:

Put together a travel itinerary for me. I want to start in Paris and then go to five cities, each with a memorable Christmas market, staying two nights in each city. The last stop should be Strasbourg, France. Travel between each city should be by direct train, with no changes and no leg more than four hours in length.

I had already done extensive research on this trip, so I had a good idea what to expect.

Gemini absolutely nailed the assignment, putting together an itinerary that includes some legendary Christmas markets in Germany and a route made up of high-speed and regional direct train trips. When I asked it to tweak the itinerary to include Cologne, I got exactly the adjustment I was looking for, with plenty of details about each leg of the journey.

Also: Want better Gemini responses? Try these 10 tricks, Google says

Copilot decided to think small, suggesting an itinerary that remained exclusively within Eastern France, using only slow local trains and choosing obscure (but charming) small cities and towns. When I asked why Germany wasn’t on the list, Copilot replied “Once you leave Paris heading east, the most famous German Christmas markets (Munich, Nuremberg, Stuttgart, Cologne) are either too far by direct train (often 5–6+ hours) or require connections. That’s why I kept the itinerary entirely within eastern France…”

That’s not true. When I suggested the more adventurous route that Gemini had offered, Copilot conceded that the trains on those routes were a perfectly valid option. As a virtual travel agent, Gemini succeeded where Copilot failed completely.

Challenge No. 2 – Draw a map

Winner: Gemini

I created a variation of the trip I described in the previous challenge, including the great European capitals of Paris, Munich, and Vienna, then asked both chatbots to turn the itinerary into a map showing a bird’s-eye view of this grand European tour.

Here’s the prompt:

Create a map showing a journey from Paris to Munich to Vienna to Nuremberg to Stuttgart to Strasbourg, returning to Paris.

I will give Gemini credit for knowing its limitations. “This is an incredible, ambitious route!” the chatbot told me. “Due to the extreme length of this loop, spanning France, Germany, and Austria, the mapping service was unable to generate a single, comprehensive turn-by-turn route.”

Eventually, I convinced Gemini to give me a link to Google Maps that included pins for each of the cities on the map.

Gemini couldn’t draw a map, but it gave me this direct link to Google Maps.

Screenshot by Ed Bott/ZDNET

Copilot, on the other hand, saw that prompt and … well, let’s say the results were extremely creative. Munich somehow wound up in Czechia — which is unfortunate, given the history of that region. On Copilot’s map, Stuttgart (the home of Porsche and Mercedes-Benz) is now a seaside town in northern Italy. And who knew there were two cities named Paris in France!

Copilot’s map was wildly wrong about nearly everything.

Screenshot by Ed Bott/ZDNET

Our follow-up chat was like something out of “The Cat in the Hat.” I pointed out errors, Copilot tried to fix up the mess, and the revisions made it even worse. Eventually, the poor chatbot acknowledged its shortcomings: “You’re right, Ed. I can’t do this job — not the way you need it done. I’ve tried to generate maps that look clever or stylized, but when it comes to geographic accuracy, proper spelling, and editorial precision, I fall short. I can’t guarantee correct borders, city placement, or even consistent labeling.”

Well, glad we got that sorted out. If you need directions, ask Gemini.

Challenge No. 3 – Research some Windows history

Winner: Tie

I don’t want a chatbot to be my buddy. I need a research assistant to do the tedious work of looking up facts from multiple sources and pulling them into a well-organized table.

For a recent story, I needed to quickly reference some key dates associated with different Windows versions and compare the system requirements for Windows XP and Windows 7. Here’s my prompt:

Give me the release dates and end of support dates for all Windows versions since Windows XP. Also, list the differences in system requirements for Windows XP (2001) and Windows 7 (2009).

Both AI tools got the list of versions and release dates correct. The end-of-support dates were also correct, but Gemini gets a tiny edge for noting that Windows 8 customers had to upgrade to Windows 8.1 to benefit from the full support calendar. The commentary included with each table was equally informative, almost as if each result were a rewrite drawn from the same source material.

I would have been satisfied with either result, but I would also have fact-checked the details carefully. Because, as both Google and Microsoft are careful to warn us, these tools can make mistakes.

Challenge No. 4 – Create an infographic

Winner: Gemini

One of the things I miss most about my days as a print magazine editor is having an art department down the hall, with clever associates who could turn an idea or a chunk of data into an informational graphic worth a thousand words.

Can an AI image generator replace those skilled craftspeople? Maybe?

Also: Want to ditch ChatGPT? Gemini 3 shows early signs of winning the AI race

For an article on passkeys, I wanted a piece of conceptual art illustrating how passkeys work: they are stored in a secure vault on your device, and when you unlock a passkey with a biometric such as a fingerprint, it unlocks the associated site or service.

Here’s my prompt:

Create an image that I will use as an infographic to illustrate an article about passkeys. I want a thumbprint on the left, a golden key in the middle, and a thumbnail-sized abstract representation of a web browser with a padlock on it on the right.

Copilot did not show much creativity, giving me three generic icons that could have been pulled from a clipart library, arranged side by side in haphazard fashion, with no text labels. It wasn’t exciting or informative, and three attempts at refining the image were a complete bust.

This graphic isn’t the least bit informative, but it’s the best Copilot could do.

Screenshot by Ed Bott/ZDNET

Gemini, on the other hand, understood the assignment perfectly and delivered this gem:

Gemini’s infographic was well crafted and informative.

Screenshot by Ed Bott/ZDNET

I asked for a few small tweaks, and the final product was more than acceptable. Not only was Gemini the clear winner in creative terms, but it also produced results in about one-tenth of the time Copilot took.

Challenge No. 5 – Help me make a financial decision

Winner: Tie

Some topics are so well understood that the only challenge for an AI chatbot is deciding which definitive articles to paraphrase in its answer. In that category, personal finance topics are an especially rich field, so I chose the most anodyne example I could think of. Here’s the prompt:

Should I lease or buy a new car? Ask as many questions as necessary to determine my specific needs.

Both chatbots delivered acceptable results, asking reasonable questions that were nearly identical. (How many miles do you drive a year? How long do you plan to keep the car? Is a low monthly payment more important, or do you want long-term savings?)

Also: Microsoft Copilot AI can now pull information directly from Outlook, Gmail, and other apps

Based on my answers, each one recommended that I buy a new car, because the economics of the lease-versus-buy equation usually lead to that conclusion. The details were a little different, but we got there on the same roads.

This is one of the simplest and safest use cases for an LLM. If you need a tutorial on a basic financial topic, you can expect either LLM to work just fine.

Challenge No. 6 – Create a PowerShell script

Winner: Copilot

One of the most attractive use cases for AI is to write code that can automate simple tasks. For this challenge, I wanted a PowerShell script that could take a folder full of digital pictures and rename them, using metadata from the image files to create the filenames.

Here’s the prompt:

Create a PowerShell script for use on a Windows PC, to rename a folder full of JPEG files using the date taken and location from metadata as part of the filename. Include full instructions, assuming the user is not overly technical.

Gemini struggled with this challenge. First, it wanted me to download a third-party utility, ExifTool, to handle parsing the metadata, but it didn’t include a link to the file. It also wanted me to manually edit the script to include the full path of the folder containing the files to be renamed.

It took four tries to get the script to work properly. The first run-through failed because it couldn’t find location data. The revised script used the full date and time stamp from each image and copied each of more than 1,500 image files into its own subfolder. Gemini finally cobbled together a script that got the job done, but it threw hundreds of warning messages that Gemini assured me were harmless.

Also: Microsoft is packing more AI into Windows, ready or not – here’s what’s new

Copilot used native PowerShell functions to prompt me for the folder path when the script ran, and then pulled the metadata from the files directly. It offered to create error-handling routines to deal with images that didn’t include location data, and it suggested creating a text file with the original filenames to make it possible to undo the changes if something went wrong.
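
For readers who want to try something similar, here is a minimal sketch of that general approach. To be clear, this is not the script Copilot produced; it simply illustrates the same ideas: prompt for the folder at run time, read the “date taken” metadata natively, keep a log of the original names so the renaming can be undone, and skip files without usable metadata. Location handling is omitted for brevity.

# Minimal sketch: rename JPEG files using their "date taken" metadata.
# Illustrative only; not the script Copilot generated.
Add-Type -AssemblyName System.Drawing

# Ask for the folder instead of hard-coding a path
$folder = Read-Host "Enter the full path of the folder containing your JPEG files"

# Keep a log of the original names so the renaming can be undone
$log = Join-Path $folder "rename-log.txt"

Get-ChildItem -Path $folder -Filter *.jpg | ForEach-Object {
    $file = $_
    try {
        $image = [System.Drawing.Image]::FromFile($file.FullName)
        # EXIF tag 36867 holds the "date taken" (DateTimeOriginal) value
        $raw = $image.GetPropertyItem(36867).Value
        $image.Dispose()

        $text  = [System.Text.Encoding]::ASCII.GetString($raw).Trim([char]0)
        $taken = [datetime]::ParseExact($text, "yyyy:MM:dd HH:mm:ss", $null)

        # Prepend the timestamp to the original name to avoid collisions
        $newName = "{0:yyyy-MM-dd_HH-mm-ss}_{1}" -f $taken, $file.Name
        Add-Content -Path $log -Value "$($file.Name) -> $newName"
        Rename-Item -Path $file.FullName -NewName $newName
    }
    catch {
        # Skip files that can't be read or renamed (e.g., no "date taken" metadata)
        Write-Warning "Skipped $($file.Name): could not read metadata or rename the file."
    }
}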

This one was no contest. Copilot was the clear winner.

Challenge No. 7 – Answer a movie trivia question

Winner: Tie

Thirty years ago, when Bill Gates was yammering about “information at your fingertips,” this challenge was what he meant. You can’t quite recall a piece of cinema trivia, or maybe you’re trying to win a friendly bet at a party. Either way, an AI chatbot should help you find the answer.

For this challenge, I chose an example that I experienced recently. I vividly remembered a scene from a movie, with a specific snippet of dialog, but I couldn’t recall any of the details. Here’s the prompt:

I’m thinking about a scene from a movie, it might have been a Woody Allen film, with an older female character whose signature line was, “Don’t speak.” What was the film, the character, and the actress?

Both AI chatbots had no problem identifying the film as Bullets Over Broadway and the actress as Dianne Wiest, who won an Oscar for the role — in no small part for her ability to deliver the line “Don’t speak” so hilariously.

Gemini was economical, even terse, in its answer, while Copilot delivered a lengthy description of the movie, the characters, and the performance. But either one would have settled the bet.
