4.1. Remedial Group
Teacher A chose two questions: Q1, plot the graph of a given function, and Q2, solve a given equation.
He used the mobile app versions of three AI tools: POE, Copilot, and Wolfram Alpha. He gave the same prompt to each tool and did not interact with them further. For Q1, Teacher A concluded that POE could not plot the graph; it only produced HTML code that it could not render as a graph or picture. Copilot and Wolfram Alpha both provided the correct graph, and the graph from Copilot was downloadable. For Q2, POE and Copilot offered correct answers as plain text, which was not easy to read. Wolfram Alpha presented answers as approximate or exact values and could also give alternative solutions. The teacher described the benefits of AI tools as speed and breadth of information, but noted that several AI tools did not provide graphs and only generated ideas with too much information.
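The exact-versus-approximate distinction Teacher A observed in Wolfram Alpha can be mirrored with a computer algebra system. A minimal sketch, assuming sympy and a hypothetical stand-in equation (the question's actual equation is not reproduced in the text):

```python
import sympy as sp

x = sp.symbols('x')

# Hypothetical stand-in equation: solve x^2 = 2 exactly and approximately.
solutions = sp.solve(sp.Eq(x**2, 2), x)
print(solutions)                          # exact values: [-sqrt(2), sqrt(2)]
print([s.evalf(4) for s in solutions])    # approximate values: [-1.414, 1.414]
```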
Teacher B wanted to use worksheet exercises designed by a textbook publisher, but there were not enough exercises and no solutions were provided. The chosen topic was factorization, and Teacher B used two AI tools, Magic School and Wolfram Alpha, to generate more exercises. He chose a YouTube video clip teaching factorization and used the YouTube Video Questions tool in Magic School, filling in the template with the grade level (eighth grade), the number and type of questions (10, multiple choice), and the URL of the YouTube video. Ten multiple-choice questions were then generated together with an answer key. Teacher B also used Wolfram Alpha to generate an exercise on quadratic factorization for students, using Wolfram Problem Generator. He set a "Factor" prompt at the 'Beginner' level, and eight multiple-choice questions were generated with a set of answer keys. Teacher B also tried out the step-by-step function in Wolfram Alpha with a "Factor" prompt, and a step-by-step solution was generated. AI tools saved him time in preparing the online remedial materials, and students could obtain hints from AI anytime and anywhere. However, the wording of the questions generated by AI differed from the wording the teachers would have chosen.
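The kind of multiple-choice item Wolfram Problem Generator returned can be sketched programmatically. A minimal illustration, assuming sympy, that builds a quadratic from a known factorization so the answer key is exact, then adds sign-flipped distractors:

```python
import random
import sympy as sp

x = sp.symbols('x')

def mc_question():
    """One 'Factor ...' item: a correct choice plus three distractors."""
    a, b = random.sample(range(1, 10), 2)
    q = sp.expand((x + a) * (x - b))               # question polynomial
    correct = sp.factor(q)                         # exact answer key
    distractors = [(x - a) * (x + b),              # sign-flipped variants
                   (x + a) * (x + b),
                   (x - a) * (x - b)]
    options = [correct] + distractors
    random.shuffle(options)
    return q, options, correct

q, options, correct = mc_question()
print(f"Factor {q}")
for label, option in zip("ABCD", options):
    print(f"  {label}) {option}")
print(f"Answer: {correct}")
```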
Teacher C had the same objective as Teacher B: preparing a set of exercise questions on factoring quadratic polynomials. He used AI to prepare exercises on the factor method, giving POE the prompts “Can you generate 10 questions for factorizing quadratic polynomial?” and “Once again, share the solution with me?”. POE provided 10 questions with correct solutions. He used the same prompts with Copilot, which could also provide 10 questions with correct solutions; both POE and Copilot responded in plain text. Teacher C also wanted to know whether AI could plot the graph of y = 2x + 1. He asked POE and Copilot, “Can you use Desmos or GeoGebra to plot the graph of y = 2x + 1?”. Copilot provided the steps for using Desmos and GeoGebra to plot the graph. Desmos gave a scatter plot, while Copilot searched for images and returned sketches of a graph that did not exactly show y = 2x + 1. AI tools generated questions and answers effectively and efficiently, particularly for mechanical questions and topics. However, precise wording was needed to obtain good AI responses, and AI could not always show the mathematics in the proper format; for example, POE did not present the answers in mathematical notation.
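For comparison, a correct plot of y = 2x + 1 takes only a few lines in any plotting library. A minimal sketch, assuming numpy and matplotlib, that saves the line as a downloadable image rather than returning HTML or an approximate sketch:

```python
import numpy as np
import matplotlib.pyplot as plt

# Plot y = 2x + 1 over a small window and save it as an image file.
x = np.linspace(-5, 5, 100)
plt.plot(x, 2 * x + 1, label="y = 2x + 1")
plt.axhline(0, color="gray", linewidth=0.5)   # x-axis
plt.axvline(0, color="gray", linewidth=0.5)   # y-axis
plt.legend()
plt.savefig("line.png")   # a downloadable picture, not raw HTML
```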
4.2. Enrichment Group
Teacher D chose a question from the Hong Kong Mathematical Olympiad competition, as shown in Figure 1.
Teacher D used POE, Thetawise, and Wolfram Alpha to test whether these AI tools could give correct solutions. He used the same prompt for the three tools: “Let f(x) be a polynomial of degree 2, where f(1) = , f(2) = , f(3) = . Find the value of f(6)”.
POE correctly set up the system of three equations but solved it incorrectly, and Teacher D did not notice that POE gave the incorrect answer. Thetawise correctly set up the system of three equations and solved it with numerical methods, which provided only an estimate. Teacher D therefore added the words “exact value”, making the second prompt “Let f(x) be a polynomial of degree 2, where f(1) = , f(2) = , f(3) = . Find the value of f(6) with exact value”. Thetawise then gave what Teacher D wanted. Teacher D observed that Wolfram Alpha could not understand the question; he did not revise the prompt and considered that Wolfram Alpha had failed the task. AI tools were time-saving and generally gave clear and detailed explanations of the corresponding mathematics problems, but incorrect steps were difficult to identify, and users needed to modify their input to obtain usable answers.
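The exact-value behavior Teacher D eventually obtained from Thetawise can be reproduced with a computer algebra system: set f(x) = ax² + bx + c, solve the three linear equations exactly, and evaluate f(6). A minimal sketch, assuming sympy, with hypothetical values standing in for the omitted f(1), f(2), f(3):

```python
import sympy as sp

x, a, b, c = sp.symbols('x a b c')
f = a * x**2 + b * x + c

# Hypothetical values for f(1), f(2), f(3); the competition's actual
# values are not reproduced in the text.
p, q, r = sp.Rational(1, 2), sp.Rational(1, 3), sp.Rational(1, 4)

coeffs = sp.solve([sp.Eq(f.subs(x, 1), p),
                   sp.Eq(f.subs(x, 2), q),
                   sp.Eq(f.subs(x, 3), r)], [a, b, c])
print(f.subs(coeffs).subs(x, 6))   # exact value of f(6), no rounding
```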
Teacher E captured the image of a DSE question (HKDSE 2024 Paper 1 Q17); the DSE is a high-stakes public examination for senior secondary students in Hong Kong. The question is shown in Figure 2. He imported the image of the question into three different AI tools: Copilot, POE, and Thetawise. He wanted to know whether the AI tools could give the correct solution within three prompts.
Copilot was able to read the question from the captured image, but it got the slope of Γ wrong. Teacher E set the second prompt as “In (a)(i), the solution given by you, the slope of QR is incorrect, check again please”. However, the slope of Γ generated by Copilot was still incorrect. Teacher E then used POE with the same first prompt. POE was also able to read the question from the captured image and gave the correct answer in (a)(i), but it misstated the slope of QR, representing its sign wrongly. POE was almost correct in (b)(i): sqrt(49 + 16) should be sqrt(65), not 7, and it did not explain why QR is the diameter of C. Teacher E set the second prompt as “In (a)(i), the solution given by you, the slope of QR is incorrect, check again please”. POE corrected the slope of QR but failed to note that Γ is perpendicular to QR, so the slope of Γ was still incorrect. Teacher E set the third prompt as “The new solution is not correct. Γ is a line perpendicular to line segment QR, if the slope of QR is 4/7, the slope of Γ should be −7/4”. The AI made a mistake in this third and last step as well: it gave y + 5 = −(7/4)x + (21/4) where the expected final answer was y = −(7/4)x + 1/4. Teacher E concluded that POE could not successfully complete the task.
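From the values quoted above, one can check both the perpendicularity condition and the simplification Teacher E expected (the point-slope line POE produced does reduce to the expected form):

$$m_{QR}\cdot m_{\Gamma}=\frac{4}{7}\cdot\left(-\frac{7}{4}\right)=-1,$$

$$y+5=-\frac{7}{4}x+\frac{21}{4}\;\Longrightarrow\; y=-\frac{7}{4}x+\frac{21}{4}-\frac{20}{4}=-\frac{7}{4}x+\frac{1}{4}.$$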
Teacher E used Thetawise with the first prompt, as shown in Figure 2. Thetawise was also able to read the question from the captured image. It translated the mathematical symbols into LaTeX form and gave correct answers to (a)(i) and (a)(ii). For part (b), the AI found the equation of C by setting up a system of simultaneous equations.
Thetawise finished the calculation of (b)(i) correctly. In (b)(ii), it used a more complex approach than expected and stopped before attempting to find the radius r of the circle. Teacher E then continued with the second prompt, “Do you notice that the circumference UVW passes through the center of C in part (b)(i)? If yes, you should be able to find the diameter of the circumcircle UVW, hence the area of it”. Thetawise responded and found the center of the circle from an equation that applied the distance formula. Yet it seemed relatively weak at comprehending the deductive geometry of the situation: the distance between point U and the center of circle C should be the diameter of the required circle, because the angle it subtends is 90°.
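The step the AI missed is the converse of the angle-in-semicircle theorem: a chord that subtends a right angle at a point on a circle must be a diameter of that circle. Writing $d$ for the distance from $U$ to the center of $C$, the required area follows immediately:

$$r=\frac{d}{2},\qquad \text{Area}=\pi r^{2}=\frac{\pi d^{2}}{4}.$$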
Teacher E set the third prompt as “In your answer, the distance between point U and the center of C should be the diameter of the required circle because it’s subtended angle is 90°”. Thetawise then completed the question correctly. After using Thetawise to obtain the correct solution, Teacher E was interested in exploring whether it could generate similar questions. From the above results, only Thetawise could generate the correct solution within a few prompts, and it was able to display easily readable solutions using LaTeX. Thus, Teacher E preferred Thetawise over POE and Copilot for generating a question similar to the given one.
In Teacher E's view, AI is a handy tool for preparing questions and solutions given clear prompts, and it boosts self-directed learning because students can learn at home using AI tools. AI provides instant feedback, and users can easily learn the background knowledge of a subject. Teacher E also observed that AI differs from Google in that AI organizes learning material for users, and teachers can even tailor-make a lesson with clear prompts such as “tell me about quadratic equation in 50 words”. However, the solution given by AI may not be correct, so the effectiveness of AI depends on the depth of the user's knowledge of the subject matter.
Teacher F selected a junior secondary competition question that was not a standard textbook-type or examination-type question. He was interested in exploring whether AI could provide the correct solution. He used Thetawise, MathGPT, and Julius AI to solve the problem, feeding the image of Figure 3 to all three tools.
Both Thetawise and MathGPT showed step-by-step solutions with correct answers. Julius AI worked through several steps but stopped before giving the full solution and asked Teacher F what he wanted next. Teacher F set the second prompt as “Solve the whole question and give the full solution”. Julius AI gave a wrong answer, so the third prompt was “It seems that the answer is 1/3. Did you make any mistake?”. Julius AI finally gave the correct answer.
Besides using AI to develop enrichment materials, Teacher F shared his dialog with AI about how he prepared a set of remedial exercises using Websim AI. He used eight prompts to interact with the AI:
1. “Create 5 factorization questions related to only taking out common factor, perfect square, and difference of two squares”;
2. “Put the answer at the end of all questions. Don’t use Hide/Show Button”;
3. “Don’t show ‘Factor the following expression’ in each question, but show ‘Factorize the following expressions’ at the beginning of all 5 questions”;
4. “Don’t show ‘(common factor)’ in each question. 1 row 1 question”;
5. “1 row 1 answer, no explanation needed”;
6. “Create 15 more questions, giving a total of 20 questions”;
7. “Answer of Q6 and Q12 can be further factorized using the difference of two squares”;
8. “Q16, Q18, and Q20 involve other skills. Delete them and create 3 others”.
Teacher F finally prepared a set of 20 exercise questions on the factorization of quadratic polynomials, with answers.
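The worksheet layout Teacher F converged on through eight prompts can also be produced programmatically. A minimal sketch, assuming sympy, that covers the three named skills (common factor, perfect square, difference of two squares), with one question per row and the answers collected at the end:

```python
import random
import sympy as sp

x = sp.symbols('x')

def common_factor():
    k, a, b = random.randint(2, 9), random.randint(1, 9), random.randint(1, 9)
    return sp.expand(k * x * (a * x + b))      # e.g. 12x^2 + 8x

def perfect_square():
    a = random.randint(1, 9)
    return sp.expand((x + a) ** 2)             # x^2 + 2ax + a^2

def difference_of_squares():
    a = random.randint(1, 9)
    return sp.expand((x + a) * (x - a))        # x^2 - a^2

makers = [common_factor, perfect_square, difference_of_squares]
questions = [random.choice(makers)() for _ in range(20)]

print("Factorize the following expressions:")
for i, q in enumerate(questions, 1):           # one row, one question
    print(f"Q{i}. {q}")
print("\nAnswers:")
for i, q in enumerate(questions, 1):           # answers at the end, one per row
    print(f"Q{i}. {sp.factor(q)}")
```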
Using Websim AI, teachers can generate different types of questions to address users’ needs. It shortens preparation time by generating many exercises with slight or large variations according to the user’s commands; the time needed to set questions, prepare solutions, and typeset is all reduced. Another benefit Teacher F shared is that Wolfram Problem Generator can generate questions at different levels within the scope offered on its website, and the questions differ each time they are generated, providing extra resources for teachers.
One limitation concerns math questions with 2D or 3D diagrams, especially when key information is provided in the diagrams rather than in the question text; mistakes made by AI are often found in diagrams. Another limitation concerns cross-topic questions requiring higher-order thinking. It is easy for AI to solve or create a difficult math question on a single specific topic, but it becomes problematic when several topics are involved. For example, the AI solver could not successfully work out a long HKDSE Mathematics Paper 1 question requiring knowledge of arithmetic and geometric sequences, transformations of functions, quadratic equations, centers of triangles, and coordinate geometry. When AI is asked to create a math question spanning different topics, the topics are dealt with in separate parts, and coherence among these parts is absent.