Neuro-Math

An AI-powered math solver that reads photos of handwritten problems and returns detailed step-by-step solutions using the Google Gemini Vision API.

About This Project

Neuro-Math is a personal project that uses Google Gemini's Vision API to solve handwritten math problems from photos. Users upload or capture a photo of a handwritten equation or problem, and the app sends the image to the Gemini Vision model, which reads the handwriting and generates a structured, step-by-step solution. The frontend is built in React, and the backend is a Node.js API that handles image processing and communication with the Gemini API.
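As a rough illustration of the backend's role, the sketch below builds the JSON body for a Gemini generateContent request that carries an image. The model name, the prompt wording, and the helper name buildSolveRequest are assumptions made for this example and are not taken from the project.

```javascript
// Hypothetical model name; the project only says "Gemini Vision".
const MODEL = "gemini-1.5-flash";
const ENDPOINT =
  `https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent`;

// Build the request body: one text part with the instruction and one
// inline_data part carrying the base64-encoded image bytes.
// The prompt text is illustrative, not the project's actual prompt.
function buildSolveRequest(imageBuffer, mimeType = "image/jpeg") {
  return {
    contents: [
      {
        parts: [
          { text: "Read the handwritten math problem in this image and solve it step by step." },
          { inline_data: { mime_type: mimeType, data: imageBuffer.toString("base64") } },
        ],
      },
    ],
  };
}

// A placeholder buffer stands in for real image bytes here.
const body = buildSolveRequest(Buffer.from("fake-image-bytes"));
console.log(body.contents[0].parts[1].inline_data.mime_type); // image/jpeg

// Sending it would look roughly like this (API key handling omitted):
// fetch(ENDPOINT, {
//   method: "POST",
//   headers: { "Content-Type": "application/json", "x-goog-api-key": API_KEY },
//   body: JSON.stringify(body),
// });
```

Keeping the request construction in a small pure function makes it easy to unit test without calling the API.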

Key Features

  • Handwritten math problem recognition via Google Gemini Vision API
  • Step-by-step solution generation from raw image input
  • Photo upload and camera capture support on the React frontend
  • Node.js backend handling image relay and Gemini API integration
  • Supports algebra, arithmetic, and basic calculus problems

Challenges & Solutions

1. Prompt engineering Gemini to return consistently structured step-by-step outputs

Crafted a strict system prompt instructing Gemini to respond only with a numbered JSON array of steps, then validated and parsed the response on the Node.js backend before sending it to the frontend, falling back to a plain-text display if the structure was missing.
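The validate-then-fall-back logic described above could look something like the following minimal sketch. It assumes the expected reply is a JSON array of step strings. That shape, the function name parseSteps, and the returned object layout are hypothetical, since the project's exact schema is not stated.

```javascript
// Parse the model's text reply into an array of step strings.
// Returns { structured: true, steps } when the reply is a non-empty JSON
// array of non-empty strings; otherwise { structured: false, text } so the
// frontend can show the raw reply as plain text.
function parseSteps(rawText) {
  // Models sometimes wrap JSON in a Markdown code fence; strip it if present.
  const cleaned = rawText
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, "")
    .replace(/\s*`{3}$/, "");
  try {
    const parsed = JSON.parse(cleaned);
    const valid =
      Array.isArray(parsed) &&
      parsed.length > 0 &&
      parsed.every((s) => typeof s === "string" && s.trim() !== "");
    if (valid) return { structured: true, steps: parsed };
  } catch (err) {
    // Not valid JSON; fall through to the plain-text fallback.
  }
  return { structured: false, text: rawText };
}

console.log(parseSteps('["Subtract 3 from both sides", "Divide by 2"]').steps.length); // 2
console.log(parseSteps("x = 4").structured); // false
```

Returning a tagged object rather than throwing lets the API respond successfully in both cases and leaves the display decision to the frontend.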

2. Handling varied handwriting quality and image resolution gracefully

Added server-side image pre-processing (resize to a fixed max resolution, convert to JPEG) using the Sharp library before forwarding to Gemini, reducing API errors on low-quality camera captures.
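A pre-processing step of this kind might be sketched as follows. The 1600-pixel limit, the JPEG quality of 85, the EXIF auto-rotation, and the function names are all assumptions, because the project does not state its actual settings. The sharp module is passed in as a parameter so the sketch has no hard dependency on the library being installed.

```javascript
// Hypothetical limit; the project's actual maximum resolution is not stated.
const MAX_DIMENSION = 1600;

// Approximate the output size of a "fit inside, never enlarge" resize:
// scale the longest side down to maxDim and keep the aspect ratio.
function fitWithin(width, height, maxDim = MAX_DIMENSION) {
  const scale = Math.min(1, maxDim / Math.max(width, height));
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}

// Resize and re-encode an uploaded image before forwarding it to Gemini.
// `sharp` is the function exported by the sharp package, for example
// preprocessImage(require("sharp"), uploadBuffer).
async function preprocessImage(sharp, inputBuffer, maxDim = MAX_DIMENSION) {
  return sharp(inputBuffer)
    .rotate() // apply EXIF orientation so phone photos are upright
    .resize({ width: maxDim, height: maxDim, fit: "inside", withoutEnlargement: true })
    .jpeg({ quality: 85 }) // assumed quality setting
    .toBuffer();
}

console.log(fitWithin(4000, 3000)); // { width: 1600, height: 1200 }
console.log(fitWithin(800, 600));   // { width: 800, height: 600 }
```

Capping the longest side rather than forcing fixed dimensions avoids distorting the handwriting, and skipping enlargement keeps small images from being upscaled.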

Technologies Used

React, Node.js, Google Gemini Vision API, MongoDB, JavaScript, REST APIs

Project Details

Category: AI Application
Year: 2025

Tags

AI/ML, Gemini API, Vision, Math, Education, Full Stack