mirror of https://github.com/kolbytn/mindcraft.git (synced 2025-09-10 03:53:07 +02:00)
feat: Implement framework for new vision modes and Gemini support
This commit introduces a framework for three new vision modes: 'off', 'on', and 'always_active'.

Key changes include:

1. **Settings (`settings.js`)**:
    * Added a `vision_mode` setting.
2. **Agent State (`src/agent/agent.js`)**:
    * Added `latestScreenshotPath` to store the most recent screenshot.
    * Updated `VisionInterpreter` initialization to use `vision_mode`.
3. **Screenshot Handling**:
    * `VisionInterpreter` now updates `agent.latestScreenshotPath` after look commands.
    * `Agent.handleMessage` captures a screenshot for incoming user messages in `always_active` mode.
4. **VisionInterpreter (`src/agent/vision/vision_interpreter.js`)**:
    * Refactored to support distinct behaviors for `off` (disabled), `on` (capture and summarize), and `always_active` (capture only, with no summarization for look commands).
5. **Vision Commands (`src/agent/commands/actions.js`)**:
    * `!lookAtPlayer` and `!lookAtPosition` now respect `vision_mode: 'off'` and camera availability.
6. **History Storage (`src/agent/history.js`)**:
    * `History.add` now supports an `imagePath` for each turn.
    * `agent.js` passes `latestScreenshotPath` for relevant turns in `always_active` mode and manages its lifecycle.
7. **Prompter Logic (`src/models/prompter.js`)**:
    * `Prompter.promptConvo` now reads image files specified in history for `always_active` mode and passes `imageData` to the chat model.
8. **Model API Wrappers (Example: `src/models/gemini.js`)**:
    * `gemini.js` updated to accept `imageData` in `sendRequest`.
    * Added `supportsRawImageInput` flag to `gemini.js`.

The system is now structured to support these vision modes. The `always_active` mode, where raw images are sent with prompts, is fully implemented for the Gemini API only. Further work will extend raw image support in `always_active` mode to the other capable multimodal API providers, per the feedback received.
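The mode semantics described above can be illustrated with a minimal sketch. It is not the code from this commit: `planLookCommand`, `SketchHistory`, and `collectImageData` are hypothetical names, the image reader is injected rather than read from disk, and picking only the most recent image is an assumption, since the commit message does not say how many images `Prompter.promptConvo` forwards.

```javascript
// Hypothetical sketch of the three vision modes; names are illustrative only.
const VISION_MODES = ['off', 'on', 'always_active'];

// Decide what a look command (e.g. !lookAtPlayer) should do under each mode.
function planLookCommand(visionMode, cameraAvailable) {
  if (!VISION_MODES.includes(visionMode)) {
    throw new Error(`Unknown vision_mode: ${visionMode}`);
  }
  // 'off' disables vision; a missing camera also blocks capture (assumed behavior).
  if (visionMode === 'off' || !cameraAvailable) {
    return { capture: false, summarize: false };
  }
  // 'on' captures and summarizes; 'always_active' captures only.
  return { capture: true, summarize: visionMode === 'on' };
}

// Stand-in for a history whose turns may carry an optional image path.
class SketchHistory {
  constructor() {
    this.turns = [];
  }

  add(name, content, imagePath = null) {
    this.turns.push({ role: name, content, imagePath });
  }
}

// Return image data for the most recent turn that has an image, but only in
// 'always_active' mode. readImage is injected so the sketch needs no file I/O.
function collectImageData(turns, visionMode, readImage) {
  if (visionMode !== 'always_active') return null;
  for (let i = turns.length - 1; i >= 0; i--) {
    if (turns[i].imagePath) return readImage(turns[i].imagePath);
  }
  return null;
}
```

In this sketch a model wrapper would receive the result of `collectImageData` as its `imageData` argument only when the wrapper advertises raw image support, which in this commit applies to `gemini.js` alone.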
parent ffe3b0e528
commit e9160d928e