StemFun: Generative STEM Video Powered by Manim
A Generative AI platform to create accurate, interactive, and pedagogical STEM animations from text, concepts, or problems.
The Problem: Beyond Rote Memorization
In modern STEM education, students often get stuck in a loop of “rote memorization.” They can find videos that show how to solve a problem, but rarely ones that explain why a specific formula or concept is used.
Current AI video generation tools (like Fliki or Revid) are designed for marketing and basic content. They fail at STEM education because they:
- Lack Precision: They cannot accurately visualize complex mathematical, physical, or chemical concepts
- Suffer from “Element Overlap”: Diffusion models often create messy, overlapping, or nonsensical graphics
- Lack Pedagogical Depth: They cannot explain the “first-principles” reasoning behind a solution
This leaves a major gap: a need for an AI tool that can act as a true “tutor,” building deep, intuitive understanding through accurate and dynamic visuals.
The Solution: “AI Video Tutor” with Manim
StemFun (also known as the “AI Video Tutor for STEM”) is a platform that generates high-precision, educational videos.
Instead of relying on diffusion models, our core is built on Manim, the mathematical animation library famously used by 3Blue1Brown. This allows us to render crisp, accurate, and non-overlapping vector graphics, code, and formulas.
We combine Manim’s precision with a sophisticated Large Language Model (LLM) orchestration system to not only show the answer but explain the reasoning from the ground up.
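To make the Manim side concrete, below is a minimal scene of the kind StemFun generates (the class name, formula, and plot are illustrative): LaTeX renders as exact vector output and the plot sits on precise axes, so nothing overlaps.

```python
from manim import Axes, Create, FadeOut, MathTex, Scene, Write

class GradientDescentIntro(Scene):
    def construct(self):
        # LaTeX renders as exact vector output -- no overlap, no blur
        update_rule = MathTex(r"\theta_{t+1} = \theta_t - \eta \, \nabla J(\theta_t)")
        self.play(Write(update_rule))
        self.wait()
        self.play(FadeOut(update_rule))

        # The loss curve J(theta) = theta^2, drawn on precise axes
        axes = Axes(x_range=[-3, 3], y_range=[0, 9])
        curve = axes.plot(lambda x: x**2)
        self.play(Create(axes), Create(curve))
        self.wait()
```

Rendering it with `manim -pql scenes.py GradientDescentIntro` produces a quick low-quality preview; the same script renders at production quality unchanged.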
Core Features
Conceptual Topic Explainer
Takes a STEM concept (e.g., “Gradient Descent,” “Eigenvalues”) and generates a full video explanation with visuals and a voiceover.
Step-by-Step Exercise Solver
The user provides a STEM problem (e.g., a physics problem, a probability puzzle). The AI generates a video that walks through the solution step-by-step, visualizing each stage.
Dynamic Coding Tutorials
Generates videos that explain an algorithm, show the code, and simulate its execution, visualizing how data structures change in real-time.
AI-Powered Online Editor
A web-based interface to edit the AI-generated videos, allowing for fine-tuning and customization.
Innovation & Competitive Advantage
| Feature | Standard AI Video Tools (Fliki, Revid) | StemFun (AI Video Tutor) |
|---|---|---|
| Visualization | Diffusion-based; often inaccurate, prone to “element overlap” | Manim-based. Mathematically precise, clean, and accurate |
| Explanation | “Text-to-Video”: only reads what’s provided | “Concept-to-Explanation”: LLM + RAG explains the “why” and the “how” |
| Accuracy | Prone to LLM “hallucination” | RAG & Tool-Calling. Verifies facts against a knowledge base and uses tools (e.g., code interpreters, calculators) for verifiable numerical results |
| Content | Basic marketing or “infotainment” | Deeply Pedagogical. Designed specifically for complex STEM topics |
System Architecture
The system is a robust, asynchronous pipeline designed to handle complex, multi-step generation tasks.
Step-by-Step Generation Workflow
Overall architecture

Request & Queueing (Frontend & Backend)
- A user submits a request (e.g., “Explain Binary Search”) via the Next.js frontend
- The FastAPI backend receives the request, validates it, and saves the job details to a PostgreSQL database
- The job is then pushed onto a Redis queue, managed by Celery, for asynchronous processing (a minimal sketch of this step follows)
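A minimal sketch of this step, with hypothetical names throughout (the `/jobs` endpoint, the `worker.generate_video` task, and a `save_job_to_postgres` stub standing in for the real ORM layer):

```python
# app.py -- request & queueing sketch; endpoint and task names are assumptions
import uuid

from celery import Celery
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
celery_app = Celery("stemfun", broker="redis://localhost:6379/0")

class VideoRequest(BaseModel):
    prompt: str                    # e.g. "Explain Binary Search"
    mode: str = "topic_explainer"  # or "exercise_solver", "coding_tutorial"

def save_job_to_postgres(req: VideoRequest) -> str:
    """Stub for the PostgreSQL insert; the real version would use an ORM."""
    return str(uuid.uuid4())

@app.post("/jobs")
async def create_job(req: VideoRequest):
    job_id = save_job_to_postgres(req)  # pydantic has validated the payload
    # Push onto the Redis-backed Celery queue for asynchronous processing
    celery_app.send_task("worker.generate_video", args=[job_id])
    return {"job_id": job_id, "status": "queued"}
```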
Planning & Scripting (Orchestrator)
- A Celery worker picks up the job from the queue
- This triggers the core LangChain / LangGraph orchestrator (the “Main Orchestrator”)
- The orchestrator uses an LLM (e.g., GPT-4o, Claude 4, Gemini 2.5) to act as a “Scene Planner”
The Scene Planner:
- Retrieves Knowledge (RAG): Queries a ChromaDB or Pinecone vector DB (using the Gemma embedding model) to fetch accurate scientific facts and relevant Manim code examples
- Calls Tools: Uses tools like a web-search agent (for real-time info) or an “MCP Server” (computation engine) to get verifiable data
- Generates a Plan: Creates a high-level conceptual script and a structured scene-by-scene storyboard (a minimal sketch follows)
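A sketch of the planner under stated assumptions: a ChromaDB collection called `manim_snippets` (in production, backed by the Gemma embedding model), a hypothetical pydantic `Storyboard` schema, and any LangChain chat model that supports structured output.

```python
# planner.py -- Scene Planner sketch; collection and schema names are assumptions
import chromadb
from pydantic import BaseModel

class ScenePlan(BaseModel):
    title: str
    narration: str     # voiceover script for this scene
    visual_brief: str  # what the Manim Code Generator must draw

class Storyboard(BaseModel):
    scenes: list[ScenePlan]

client = chromadb.PersistentClient(path="./rag_store")
snippets = client.get_or_create_collection("manim_snippets")

def plan_scenes(topic: str, llm) -> Storyboard:
    # RAG step: pull facts and known-good Manim examples for grounding
    hits = snippets.query(query_texts=[topic], n_results=5)
    context = "\n\n".join(hits["documents"][0])
    # The LLM turns topic + retrieved context into a structured storyboard
    return llm.with_structured_output(Storyboard).invoke(
        f"Plan a scene-by-scene STEM explainer video about: {topic}\n"
        f"Ground every scene in this reference material:\n{context}"
    )
```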

Scene Generation & Manim Code (Core AI)
- The “Scene Planner” passes the storyboard to the “Manim Code Generator”
- This is a specialized agent (likely a LangGraph subgraph) that iterates through each scene and writes the Manim (Python) code required to animate it
- It uses the RAG-retrieved code snippets as references, keeping the generated Manim code effective and far less error-prone (see the sketch below)
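The generator’s inner loop might look like the sketch below; the model choice and prompt wording are assumptions, and the real LangGraph agent would add state, tool calls, and retries.

```python
# codegen.py -- per-scene Manim code generation sketch (names are assumptions)
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")

def generate_scene_code(visual_brief: str, reference_snippets: list[str]) -> str:
    """Ask the LLM for one complete Manim scene, grounded in RAG examples."""
    prompt = (
        "Write a complete, runnable Manim Community scene in Python.\n"
        f"Visual brief: {visual_brief}\n"
        "Base your code on these known-good examples:\n"
        + "\n---\n".join(reference_snippets)
    )
    return llm.invoke(prompt).content
```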

Error Handling & Refinement (Self-Correction Loop)
- The generated Manim code is sent to a “Code Validation” service
- If Successful: The code is approved and sent to the render farm
- If Failed (Error): The code and the error message (e.g., a ManimException traceback) are sent to a “Code Fixing” agent, which patches the script and resubmits it for validation (sketched below)
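A sketch of that loop, assuming a bounded retry count and using Manim’s `--dry_run` flag as a cheap validation pass (the production service may run a fuller sandbox):

```python
# validate.py -- self-correction loop sketch; helper names are assumptions
import pathlib
import subprocess
import tempfile

MAX_ATTEMPTS = 3

def validate_and_fix(code: str, fixer_llm) -> str:
    for _ in range(MAX_ATTEMPTS):
        with tempfile.TemporaryDirectory() as tmp:
            script = pathlib.Path(tmp) / "scene.py"
            script.write_text(code)
            # --dry_run surfaces Manim errors without writing any video files
            result = subprocess.run(
                ["manim", "render", "--dry_run", str(script)],
                capture_output=True, text=True,
            )
        if result.returncode == 0:
            return code  # approved: hand off to the render farm
        # Feed code + traceback back to the "Code Fixing" agent
        code = fixer_llm.invoke(
            f"Fix this Manim script.\nError:\n{result.stderr}\n\nCode:\n{code}"
        ).content
    raise RuntimeError("Scene failed validation after retries")
```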
Rendering & Assembly (GPU Cluster)
- The validated Manim Python scripts are sent to a GPU render cluster (e.g., AWS EC2, GCP Compute Engine) running Manim’s OpenGL renderer
- Manim renders the scenes into video (or image) files
- Kokoro TTS (or another service) generates the voiceover
- FFmpeg and SoX are used to combine the video scenes, add the audio, and encode the final .mp4 file (see the sketch below)
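A sketch of the assembly step using FFmpeg’s concat demuxer, then muxing in the voiceover; file names are illustrative, and SoX post-processing (e.g., loudness normalization) is elided.

```python
# assemble.py -- final assembly sketch with FFmpeg (paths are illustrative)
import subprocess

def assemble(scene_files: list[str], voiceover: str, out: str = "final.mp4"):
    # 1. Concatenate rendered scenes losslessly via FFmpeg's concat demuxer
    with open("scenes.txt", "w") as f:
        f.writelines(f"file '{p}'\n" for p in scene_files)
    subprocess.run(
        ["ffmpeg", "-f", "concat", "-safe", "0", "-i", "scenes.txt",
         "-c", "copy", "video.mp4"],
        check=True,
    )
    # 2. Mux in the Kokoro TTS voiceover and encode the final .mp4
    subprocess.run(
        ["ffmpeg", "-i", "video.mp4", "-i", voiceover,
         "-c:v", "copy", "-c:a", "aac", "-shortest", out],
        check=True,
    )
```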
Technology Stack
Frontend
- Framework: Next.js
- Styling: TailwindCSS
Backend
- Framework: Python, FastAPI
- Database: PostgreSQL (for job/user data), Redis (for caching & queue)
- Task Queue: Celery
AI & Orchestration
- Framework: LangChain / LangGraph
- LLM Providers: OpenAI (GPT-4o), Anthropic (Claude 4), Google (Gemini 2.5)
- Vector DB / RAG: ChromaDB, Pinecone
- Embedding Model: Gemma 300M (for code retrieval)
- Tools: Docling, Web Search, MCP Server (Mathematical Computation Platform)
Video & Rendering
- Core Library: Manim
- Rendering Engine: OpenGL
- Audio/Video: FFmpeg, SoX
- Voice: Kokoro TTS
DevOps & Infrastructure
- Containerization: Docker
- Cloud: AWS (S3, EC2, RDS) or GCP
- Version Control: Git
Target Audience & Impact
Students (High School & University)
To gain a deep, intuitive understanding of complex STEM topics.
Teachers & Educators
To create engaging, high-quality, and customized visual aids for their courses without needing any animation experience.
Researchers
To visualize and explain their findings and complex models.
Educational Platforms (EdTech)
To integrate as an API to auto-generate video explanations for their existing content libraries.