A newer version of the Gradio SDK is available:
5.16.0
GPT-4o Media Stream Capture and Analysis
Project Overview
This project provides a web application that captures media streams from various sources such as a webcam, desktop, or specific applications. It captures frames at intervals and uses AI to analyze and summarize the frames, providing insights using GPT-4.
Key Features
- Media Stream Capture: Capture video streams from a webcam, screen, or specific applications.
- Frame Analysis: Use OpenAI's GPT-4 to analyze captured frames for text, objects, context, and other details.
- Customizable Prompts: Customize the prompt used for frame analysis.
- API Integration: Integrate with OpenAI's API for frame analysis.
Project Structure
app.py
: The main server-side application code using Quart.templates/index.html
: The HTML template for the web application.static/script.js
: The client-side JavaScript for handling media streams and interaction with the backend.
API Endpoints
GET /
: Serves the main web application.POST /process_frame
: Processes a captured frame and returns the analysis result.
POST /process_frame
- Request Body:
{ "image": "data:image/jpeg;base64,<base64-encoded-image>", "prompt": "Analyze this frame", "api_key": "<OpenAI API Key>" }
- Response:
{ "response": "<Analysis result in markdown format>" }
Potential Uses
- Remote Monitoring: Capture and analyze video streams for remote monitoring applications.
- Educational Purposes: Use AI to analyze and summarize educational video content.
- Content Creation: Automate the analysis and summarization of video content for creators.
Customization
- Prompts: Customize the analysis prompt via the settings panel in the web application.
- Refresh Rate: Adjust the frame capture interval through the settings panel.
- API Key: Configure the OpenAI API key via the settings panel.
Deployment
Clone the Repository:
git clone https://github.com/ruvnet/ai-video.git cd ai-video
Install Dependencies:
pip install -r requirements.txt
Set Environment Variables:
export OPENAI_API_KEY=<your_openai_api_key>
Run the Application:
python app.py
Access the Application: Open your web browser and navigate to
http://localhost:5000
.
requirements.txt
quart
opencv-python-headless
httpx
numpy
API Endpoints
GET /
: Serves the main web application.POST /process_frame
: Processes a captured frame and returns the analysis result.
Customization
- Customize prompts and refresh rates via the settings panel in the web application.
- Configure the OpenAI API key via the settings panel.
Contributing
Feel free to fork the repository and submit pull requests. For major changes, please open an issue first to discuss what you would like to change.