As artificial intelligence systems evolve, more and more people are adopting them across various fields. Schoolchildren are no exception: why learn to solve problems when AI can solve everything in a couple of seconds? Their parents, in turn, may struggle to recall material they last saw 20-25 years ago. Today we will check how the most popular tools handle algebra and physics problems: ChatGPT 5, Google Gemini 2.5, and Grok 3. All versions tested are free.
While writing an article about FPS, I had to mention the coefficient of variation (CV), a statistical measure of how widely the points on a graph deviate from the mean. At the time, I asked ChatGPT 4 to calculate it so that I wouldn't have to do the math myself. I was very surprised that even calculating the mean (sum / count) turned out to be a big problem.
Counting in Excel and ChatGPT 4.
To my great surprise, it took four attempts before it correctly calculated the sum of the sample, and even then only after I pointed out the true total myself. Pasting the same sample into Google Gemini produced an incorrect calculation as well; on top of that, the AI inflated the sample size (36 values instead of 35). I asked it to recalculate, but the result was the same.
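The calculation itself takes only a few lines of code to verify. Here is a minimal Python sketch; the sample values below are hypothetical stand-ins, since the original 35 FPS readings are not reproduced here:

```python
import statistics

# Hypothetical FPS sample; the article's original 35 values are not shown.
fps = [58, 60, 61, 59, 57, 60, 62, 55, 59, 60]

mean = sum(fps) / len(fps)               # mean = sum / count
cv = statistics.stdev(fps) / mean * 100  # coefficient of variation, in percent

print(f"n = {len(fps)}, sum = {sum(fps)}, mean = {mean:.2f}, CV = {cv:.2f}%")
```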
Three months have passed since then, so let's try to repeat the experiment, this time adding Grok to the lineup. The results follow.
Physics has been and remains a rather difficult school subject. Not only do you need to know the formulas, ideally you should also be able to derive them from one another. Let's take three tasks, each from a different section: mechanical motion, mechanics, and optics.
Task #1. A bus is traveling along a highway at 72 km/h. It is overtaken by a car traveling at 90 km/h. A trailer is traveling at 48 km/h toward the bus. How fast is the bus approaching the trailer and the car?
Answer: for the "car-bus" pair (same direction), the closing speed is 90 - 72 = 18 km/h; for the "bus-trailer" pair (opposite directions), it is 72 + 48 = 120 km/h.
Copy and paste the task statement into ChatGPT. Judging by the description and results, the AI solved this task successfully. It also converted the speeds from km/h to m/s.
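That conversion, and the relative speeds themselves, are easy to double-check. A quick Python sketch (variable names are mine):

```python
KMH_TO_MS = 1000 / 3600  # 1 km/h = 1/3.6 m/s

bus, car, oncoming = 72, 90, 48  # speeds in km/h

# Same direction: speeds subtract; opposite directions: speeds add.
closing = {"bus-car": car - bus, "bus-trailer": bus + oncoming}
for pair, v in closing.items():
    print(f"{pair}: {v} km/h = {v * KMH_TO_MS:.2f} m/s")
```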
Gemini is next in line. The result is correct, but some of the labels (v_v) and the / character look rather strange in the mobile app. No such problems were found in the browser version on PC.
Grok rounds out the first problem. The answers are the same, but, as with Gemini, there are unreadable characters in the mobile version.
Task #2. A body of mass 2 kg was thrown vertically upward with an initial velocity of 20 m/s. At what height will its potential energy equal its kinetic energy? Neglect air resistance.
Answer: h = 10.2 m.
We look at the result in ChatGPT and see a clear explanation and the correct answer.
Gemini inserted the / character again, for some reason. More interestingly, the results didn't match: ChatGPT used g = 9.8 m/s², while Gemini used g = 10 m/s².
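The discrepancy comes down entirely to the chosen value of g. At the point where the potential and kinetic energies are equal, each is half of the initial kinetic energy, so mgh = mv₀²/4 and h = v₀²/(4g). A quick Python check for both values of g:

```python
v0 = 20.0  # initial speed, m/s

# Where PE = KE, each is half of the total energy:
# m*g*h = m*v0**2 / 4  =>  h = v0**2 / (4*g)
for g in (9.8, 10.0):  # the two values the chatbots used
    print(f"g = {g} m/s^2 -> h = {v0**2 / (4 * g):.2f} m")
```

This yields 10.20 m and 10.00 m, which accounts for the two different answers.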
This time, Grok proved cleverer than the other candidates, providing both answers and two solution methods at the end. On our first attempt, we got an error containing some hieroglyphic character and the date of the query; we had to invoke the "Think Harder" function.
Task #3. During a laboratory experiment, a student obtained a sharp image of a lit candle. What are the focal length and optical power of the lens if the distance from the candle to the lens is 24 cm and the distance from the lens to the screen is 12 cm?
Answer: focal length F = 0.08 m (8 cm); optical power D = 12.5 dpt.
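Both values follow directly from the thin-lens equation, 1/F = 1/d + 1/f, where d is the object distance and f the image distance. A short verification in Python:

```python
d_obj, d_img = 0.24, 0.12  # candle-to-lens and lens-to-screen distances, m

# Thin-lens equation: 1/F = 1/d_obj + 1/d_img
F = 1 / (1 / d_obj + 1 / d_img)  # focal length, m
D = 1 / F                        # optical power, dioptres

print(f"F = {F:.2f} m ({F * 100:.0f} cm), D = {D:.1f} dpt")
```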
This time, all the AIs followed the same path to the solution and gave the correct answers.
The results show that the selected AIs perform well on school physics problems.
Algebra spans many topics. We'll start with a simple problem taught in elementary school and finish with a complex trigonometry problem.
Task #4. They paid 42 UAH for 7 identical dolls and 72 UAH for 8 balls. Which is more expensive, the doll or the ball, and by how much?
The task is for the 3rd grade. ChatGPT quickly solved this simple problem and boldly asked for a new one. The other two were just as quick, and all three gave explanations a child could follow.
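For reference, the arithmetic behind the answer fits in a couple of lines of Python:

```python
doll = 42 / 7  # price of one doll, UAH
ball = 72 / 8  # price of one ball, UAH

print(f"the ball is more expensive by {ball - doll:.0f} UAH")
```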
Task #5. A set of furniture is needed for the new gymnasium classroom: 15 desks and 30 chairs. The manager asks us to determine the price of a desk and of a chair, given that the principal has said the total cost is UAH 14,100 and a desk is UAH 280 more expensive than a chair.
This time, too, there were no surprises: all three found the solution quickly and without any problems, setting up an equation with x as the unknown. The problem is for the 7th grade.
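The same equation is easy to set up and solve programmatically. A minimal sketch, letting x be the chair price, so 15(x + 280) + 30x = 14 100:

```python
desks, chairs = 15, 30
total, premium = 14_100, 280  # UAH; a desk costs `premium` more than a chair

# 15*(x + 280) + 30*x = 14100  =>  x = (total - desks*premium) / (desks + chairs)
chair = (total - desks * premium) / (desks + chairs)
desk = chair + premium

print(f"chair = {chair:.0f} UAH, desk = {desk:.0f} UAH")
```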
Task #6*. Recognize the text and the formula in the image, and complete the task.
Let's model an interesting situation. When taking a test or competing in an Olympiad, a student needs to quickly photograph the task, crop out everything unnecessary, and send the image to the corresponding AI app on their smartphone. Let's test this workflow. For it, we took a trigonometry problem that is not easy for everyone.
This time we'll start with Grok in the "Think Harder" mode. It took 1 minute and 28 seconds to solve the problem. To put it mildly, much of the output is unclear or simply cut off; once again, rendering differs between devices.
Gemini repeats Grok's situation: it computes a great deal, but it's unclear why it takes one path rather than another. If you look closely, you will find the phrase "Consider a different approach": the AI evidently hit a dead end with its original plan and switched to an alternative one.
ChatGPT decided to give the answer right away, in the first step, which we ignore, since we are interested in the solution itself. Step 2 also doesn't help much in understanding the reasoning. The good news is that the AI offers an algebraic derivation of the identity; do use it.
Here we finally see what was missing in Gemini's output: what interacts with what, and how it changes. Only the first point, "Formula for tan(7x)", is not explained clearly enough. ChatGPT apparently sensed this and offered to show how the formula was derived. When we asked for more details, the AI returned an error; a few hours later we continued and got the explanation we needed.
No problems were found with the regular tasks. All three solved the asterisked problem, though not all of them described the solution clearly.
Testing the AIs' capabilities took two days, mainly because of the limits on free usage. For ChatGPT 5, the limit is 10 requests per interval: the first time we had to wait four hours, the second time two hours.
Gemini 2.5 Pro also applies a limit of 10 requests, but unlike ChatGPT, the quota resets once a day. The exact time is unknown: the AI claims it is 5:00 AM UTC, but the limit message says otherwise.
Grok has similar restrictions to its competitors, offering 15 regular messages and 2 "thinking" messages every two hours. However, the quotas appear to vary from account to account, so some developers have come up with a plugin for checking your limits.
Testing revealed a clear winner: ChatGPT 5. Whereas the previous version 4 made frankly bizarre calculation mistakes, the new version handles even complex tasks cleanly and with clear explanations. The limits are noticeable, but the browser and smartphone apps show exactly when full access will be available again.
Grok 3 unexpectedly takes second place. It makes mistakes and offers "crooked" descriptions of solutions; overall, its behavior resembles ChatGPT 4. Third place goes to Gemini 2.5, which needs both accuracy improvements and fixes for the garbled text rendering on mobile devices. In the arithmetic-mean example, it would not even accept the correct result from the user.
After the experiments in this article, we can conclude that AIs can genuinely assist students and their parents in solving homework or preparing for the NMT. Yes, imperfectly and not always clearly, and sometimes with mistakes the user has to spot on their own. Still, humanity has created a tool that can significantly speed up solving math problems and help with learning.
Two questions remain to be answered: Has the digital God already been created? Is humanity doomed?