Neuro-Math

An AI-powered math solver that reads photos of handwritten problems and returns detailed step-by-step solutions using the Google Gemini Vision API.

About This Project

Neuro-Math is a personal project that uses the Google Gemini Vision API to interpret photos of handwritten math problems. Users upload or capture a photo of a handwritten equation or problem, and the app sends it to the Gemini Vision model, which reads the handwriting and generates a structured, step-by-step breakdown of the solution. The frontend is built in React, and the backend is a Node.js API that handles image processing and communication with the Gemini API.
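
The upload step described above could look roughly like the sketch below. It is only an illustration: the `/api/solve` route, the `image` form field, and the function names are hypothetical, since the project description does not name its routes or fields.

```javascript
// Hypothetical endpoint path; the project description does not name its routes.
const SOLVE_ENDPOINT = '/api/solve';

// Wrap the chosen photo in multipart form data under an assumed field name.
function buildSolveRequest(photo, fieldName = 'image') {
  const form = new FormData();
  form.append(fieldName, photo, 'problem.jpg');
  return { method: 'POST', body: form };
}

// Send the photo to the backend and return the parsed JSON body.
// fetchImpl is injectable so the function can be exercised without a network.
async function requestSolution(photo, fetchImpl = fetch) {
  const response = await fetchImpl(SOLVE_ENDPOINT, buildSolveRequest(photo));
  if (!response.ok) {
    throw new Error(`Solve request failed with status ${response.status}`);
  }
  return response.json();
}
```

In a React component, `requestSolution` would typically be called from the change handler of a file input, passing the selected `File` object.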

Key Features

Handwritten math problem recognition via Google Gemini Vision API
Step-by-step solution generation from raw image input
Photo upload and camera capture support on the React frontend
Node.js backend handling image relay and Gemini API integration
Supports algebra, arithmetic, and basic calculus problems

Challenges & Solutions

Prompt engineering Gemini to return consistently structured step-by-step outputs

Crafted a strict system prompt instructing Gemini to respond only with a numbered JSON array of steps, then validated and parsed the response on the Node.js backend before sending it to the frontend. If the expected structure was missing, the app fell back to a plain-text display.
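
A minimal sketch of that validate-then-fall-back step follows. It assumes the model was asked for a JSON array of plain strings, one per step; the exact schema is not given here, and `parseSolution` and the `kind` field are illustrative names rather than the project's own.

```javascript
// Parse a model reply that is expected to contain a JSON array of step strings.
// Returns { kind: 'steps', steps } when the structure is valid,
// otherwise { kind: 'text', text } so the frontend can show the raw reply.
function parseSolution(rawReply) {
  const text = String(rawReply).trim();

  // Tolerate extra text around the array (for example a code fence)
  // by parsing only the span from the first '[' to the last ']'.
  const start = text.indexOf('[');
  const end = text.lastIndexOf(']');

  if (start !== -1 && end > start) {
    try {
      const parsed = JSON.parse(text.slice(start, end + 1));
      const isStepList =
        Array.isArray(parsed) &&
        parsed.length > 0 &&
        parsed.every((step) => typeof step === 'string' && step.trim() !== '');
      if (isStepList) {
        return { kind: 'steps', steps: parsed.map((step) => step.trim()) };
      }
    } catch (err) {
      // Not valid JSON: fall through to the plain-text fallback.
    }
  }

  return { kind: 'text', text };
}
```

The frontend can then branch on `kind`, rendering a numbered list for `steps` and a plain paragraph for `text`.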

Handling varied handwriting quality and image resolution gracefully

Added server-side image pre-processing with the Sharp library (resizing to a fixed maximum resolution and converting to JPEG) before forwarding the image to Gemini, which reduced API errors on low-quality camera captures.
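
The pre-processing step might be sketched as below. The 1600-pixel limit, the JPEG quality, and both function names are assumptions made for illustration. The `inlineData` shape assumes the backend uses the `@google/generative-ai` Node SDK, which accepts inline image parts in this form.

```javascript
// Assumed limit for the longest side; the project's actual value is not stated.
const MAX_SIDE = 1600;

// Normalize an uploaded photo: fix orientation, downscale, and re-encode as JPEG.
async function preprocessImage(inputBuffer, maxSide = MAX_SIDE) {
  // Loaded lazily so the pure helper below works without the native dependency.
  const sharp = require('sharp');
  return sharp(inputBuffer)
    .rotate() // apply EXIF orientation, which phone cameras commonly set
    .resize({
      width: maxSide,
      height: maxSide,
      fit: 'inside', // keep aspect ratio within the bounding box
      withoutEnlargement: true, // never upscale small images
    })
    .jpeg({ quality: 85 }) // assumed quality setting
    .toBuffer();
}

// Wrap the JPEG bytes as an inline image part for the Gemini request.
function toInlineImagePart(jpegBuffer) {
  return {
    inlineData: {
      data: jpegBuffer.toString('base64'),
      mimeType: 'image/jpeg',
    },
  };
}
```

Capping the resolution before the API call also keeps the request payload small, which helps with large phone camera captures.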

Technologies Used

React
Node.js
Google Gemini Vision API
MongoDB
JavaScript
REST APIs

Project Details

Category: AI Application