Commit graph

57 commits

Author SHA1 Message Date
google-labs-jules[bot]
be38f56f12 I've implemented enhanced vision modes with bug fixes and extended API support.
This update finalizes the implementation of three distinct vision modes:
- "off": This disables all my vision capabilities.
- "prompted": (Formerly "on") This allows me to use vision via explicit commands from you (e.g., !lookAtPlayer), and I will then summarize the image.
- "always": (Formerly "always_active") I automatically take a screenshot every time you send a prompt and send it, together with the prompt, to a multimodal LLM. A look command in this mode only updates my view and, if relevant, takes a screenshot for the *next* interaction; it does not summarize the image immediately.
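
The three modes above can be pictured as a small dispatch on the `vision_mode` setting. This is a minimal sketch under stated assumptions: `planVisionAction`, the `'prompt'`/`'look'` event names, and the returned flags are hypothetical and are not taken from the actual `VisionInterpreter` code.

```javascript
// Hypothetical sketch of the three vision modes; not the project's actual code.
const VISION_MODES = ['off', 'prompted', 'always'];

// Decide what should happen for a given mode and event.
// event is 'prompt' (a prompt was sent) or 'look' (a !lookAt... command ran).
function planVisionAction(visionMode, event) {
    if (!VISION_MODES.includes(visionMode)) {
        throw new Error(`Unknown vision_mode: ${visionMode}`);
    }
    if (visionMode === 'off') {
        // All vision capabilities are disabled.
        return { screenshot: false, summarize: false, attachToPrompt: false };
    }
    if (visionMode === 'prompted') {
        // Vision runs only on explicit look commands, which are then summarized.
        const isLook = event === 'look';
        return { screenshot: isLook, summarize: isLook, attachToPrompt: false };
    }
    // visionMode === 'always'
    if (event === 'look') {
        // A look command only refreshes the view for the next interaction.
        return { screenshot: true, summarize: false, attachToPrompt: false };
    }
    // Every prompt gets a fresh screenshot attached for the multimodal LLM.
    return { screenshot: true, summarize: false, attachToPrompt: true };
}
```

Keeping the decision in one pure function like this would make each mode easy to check in isolation, although the real code may split the logic across `Agent.js` and `VisionInterpreter.js`.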

Here are the key changes and improvements:

1.  **Bug Fix (Image Path ENOENT)**:
    *   I've corrected `Camera.capture()` so it returns filenames with the `.jpg` extension.
    *   I've updated `VisionInterpreter.analyzeImage()` to handle full filenames.
    *   This resolves the `ENOENT` error that was previously happening in `Prompter.js`.

2.  **Vision Mode Renaming**:
    *   I've renamed the modes in `settings.js` and throughout the codebase: "on" is now "prompted", and "always_active" is now "always".

3.  **Core Framework (from previous work, now integrated)**:
    *   I've added `vision_mode` to `settings.js`.
    *   `Agent.js` now manages `latestScreenshotPath` and initializes `VisionInterpreter` with `vision_mode`.
    *   `VisionInterpreter.js` handles different behaviors for each mode.
    *   My vision commands (`!lookAt...`) respect the `off` mode.
    *   `History.js` stores `imagePath` with turns, and `Agent.js` manages this path's lifecycle.
    *   `Prompter.js` reads image files when I'm in "always" mode and passes `imageData` to model wrappers.

4.  **Extended Multimodal API Support**:
    *   `gemini.js`, `gpt.js`, `claude.js`, `local.js` (Ollama), `qwen.js`, and `deepseek.js` have been updated to accept `imageData` in their `sendRequest` method and format it for their respective multimodal APIs. They now include `supportsRawImageInput = true`.
    *   Other model wrappers (`mistral.js`, `glhf.js`, `grok.js`, etc.) now safely handle the `imageData` parameter in `sendRequest` by ignoring it and logging a warning. They set `supportsRawImageInput = false`, so every wrapper accepts the same `sendRequest` arguments.

5.  **Testing**: I have a comprehensive plan to verify all modes and functionalities.
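
The `ENOENT` fix in item 1 comes down to making sure the filename that is captured and the filename that is later read agree on the `.jpg` extension. A minimal sketch of that idea follows; `ensureJpgExtension`, `screenshotPath`, and the example directory are hypothetical and do not reflect the actual `Camera` or `VisionInterpreter` code.

```javascript
// Hypothetical helpers; the real Camera and VisionInterpreter code may differ.

// Accepts either a bare name ("shot_1") or a full filename ("shot_1.jpg")
// and always returns a name ending in ".jpg".
function ensureJpgExtension(filename) {
    return filename.toLowerCase().endsWith('.jpg') ? filename : `${filename}.jpg`;
}

// Joins a screenshot directory and a filename without doubling
// the extension or the trailing path separator.
function screenshotPath(dir, filename) {
    const cleanDir = dir.replace(/\/+$/, '');
    return `${cleanDir}/${ensureJpgExtension(filename)}`;
}
```

With helpers like these, `screenshotPath('screenshots/', 'shot_1')` and `screenshotPath('screenshots', 'shot_1.jpg')` resolve to the same file, so a reader of the image would not look for a path that was never written.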

Together, these changes give me three selectable vision modes and pass raw image input to the multimodal LLM wrappers that support it.
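
The wrapper behavior described in item 4 could look roughly like the sketch below. It is illustrative only: the class names, the `buildPayload` helper, and the payload shape are assumptions and do not match any real provider's API or the project's actual wrappers. Only the `sendRequest` method name, the `imageData` parameter, and the `supportsRawImageInput` flag come from the commit message.

```javascript
// Hypothetical wrappers; the payload shape is illustrative, not a real provider API.
class VisionCapableModel {
    constructor() {
        this.supportsRawImageInput = true;
    }

    // Builds the request body; imageData is assumed to be a Node.js Buffer.
    buildPayload(turns, systemMessage, imageData = null) {
        const payload = { system: systemMessage, messages: [...turns] };
        if (imageData) {
            payload.image = {
                mimeType: 'image/jpeg',
                base64: imageData.toString('base64'),
            };
        }
        return payload;
    }

    // A real wrapper would send the payload to its API here.
    async sendRequest(turns, systemMessage, imageData = null) {
        return this.buildPayload(turns, systemMessage, imageData);
    }
}

class TextOnlyModel {
    constructor() {
        this.supportsRawImageInput = false;
    }

    // Accepts imageData for interface parity, but ignores it with a warning.
    buildPayload(turns, systemMessage, imageData = null) {
        if (imageData) {
            console.warn('imageData was provided, but this model does not accept raw images; ignoring it.');
        }
        return { system: systemMessage, messages: [...turns] };
    }

    async sendRequest(turns, systemMessage, imageData = null) {
        return this.buildPayload(turns, systemMessage, imageData);
    }
}
```

A caller could check `supportsRawImageInput` before reading the screenshot from disk, which would avoid the file read entirely for text-only models.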
2025-06-07 09:07:02 +00:00
google-labs-jules[bot]
ffe3b0e528 Jules was unable to complete the task in time. Please review the work done so far and provide feedback for Jules to continue. 2025-06-07 08:39:05 +00:00
MaxRobinsonTheGreat
0b20d94f7d clean logs 2025-04-22 12:19:32 -05:00
MaxRobinsonTheGreat
c9dd763529 Merge remote-tracking branch 'upstream/main' into merge-main 2025-04-08 16:36:47 -05:00
Isadora White
c01cea4062 update prompted to log memSaving better 2025-04-02 15:31:06 -07:00
MaxRobinsonTheGreat
6dc5c6401a Merge branch 'main' into cleanup 2025-03-20 16:45:20 -05:00
Isadora White
4db0f8d3c5 further debugging of reasoning models 2025-03-20 13:21:09 -05:00
Isadora White
7bf97660eb added exp name to prompter.js and splitting around think tokens 2025-03-19 23:52:08 -05:00
MaxRobinsonTheGreat
57be4ec42e remove log 2025-03-19 13:31:18 -05:00
MaxRobinsonTheGreat
d9a0b0802c Merge branch 'develop' into vision 2025-03-15 17:24:52 -05:00
Isadora White
2dea51f2d1 fix prompter 2025-03-14 18:20:33 -07:00
Isadora White
9c3e726a36 fix prompter logging 2025-03-14 00:35:21 -07:00
MaxRobinsonTheGreat
5695c66fcb better comments 2025-03-13 14:33:56 -05:00
MaxRobinsonTheGreat
ce8dd89231 prompter refactor 2025-03-11 11:03:40 -05:00
Isadora White
5b708551d4 remove blocked actions from command docs 2025-03-09 15:20:36 -07:00
Isadora White
2f80b65d42 fixing small bugs related to single agent support 2025-03-09 13:01:54 -07:00
Sweaterdog
fbdac8d48e
Update prompter.js
Fixed a minor error
2025-03-09 00:29:06 -08:00
Isadora White
9a22d78bad updating prompter to include better logging 2025-03-09 00:07:09 -08:00
Isadora White
b75d941d97 smol changes 2025-03-08 19:24:13 -08:00
Ayushmaniar
6dfd2f69f4
Merge branch 'main' into main 2025-03-07 19:36:05 -08:00
Sweaterdog
0a575b33b2
Merge branch 'main' into main 2025-03-07 18:52:41 -08:00
MaxRobinsonTheGreat
8234af9585 Merge branch 'main' into vision 2025-03-07 16:39:34 -06:00
Charvi Bannur
3525a7130f Added logging code to prompter.js 2025-03-07 14:33:57 -08:00
MilitaryLotus
0093ef1b92
Update prompter.js 2025-03-07 10:56:53 -08:00
dtesters
6b81e8d6cb
Update prompter.js 2025-03-06 12:42:50 -08:00
Sweaterdog
b3ee159b43
Update prompter.js
Fixed minor issues
2025-03-05 14:40:42 -08:00
MaxRobinsonTheGreat
5dca9b778f readd canvas, remove random "git" 2025-03-05 15:35:50 -06:00
MaxRobinsonTheGreat
9abecae9b2 Merge branch 'main' into vision 2025-03-05 15:30:19 -06:00
MaxRobinsonTheGreat
6ec49e7789 reworked image prompting, update package 2025-03-05 15:23:57 -06:00
Sweaterdog
0451b1a852
Update prompter.js
"Fixed" prompter.js
2025-03-04 16:57:29 -08:00
Sweaterdog
9c595ed69a
Update prompter.js
Fixed the hyperbolic and glhf.chat setup, for some reason it was deleted when merging with the main.
2025-03-04 16:51:32 -08:00
Sweaterdog
7c3660e0f2
Merge branch 'main' into main 2025-03-04 16:43:56 -08:00
MaxRobinsonTheGreat
465a1c56fd better coder prompt and logging 2025-03-03 22:38:47 -06:00
Isadora White
44fc1b4618 evaluation script for vllm 2025-03-03 06:10:52 +00:00
Sweaterdog
125aee4ce4
Update prompter.js
Updated prompter.js for updated Ollama and Openrouter model usage
2025-03-01 13:40:24 -08:00
Sweaterdog
42dfe39862
Update prompter.js
Fixed minor error in prompter.js that disabled Hyperbolic support
2025-02-28 12:46:55 -08:00
Sweaterdog
71749ec4d2
Update prompter.js 2025-02-27 21:22:40 -08:00
Sweaterdog
d614d30764
Update prompter.js
Fixed Ollama prompting issues.
2025-02-23 21:17:30 -08:00
MaxRobinsonTheGreat
b23f4776b1 add state to self prompter for pausing 2025-02-20 17:17:21 -06:00
Sweaterdog
4b19a21a1c
Merge branch 'main' into main 2025-02-17 15:39:58 -08:00
MaxRobinsonTheGreat
821dbae5c3 dont default to ollama 2025-02-17 17:34:55 -06:00
MaxRobinsonTheGreat
7a9faca7c3 catch broken embedding models 2025-02-17 16:55:36 -06:00
MaxRobinsonTheGreat
b1a36b15c2 added more logging 2025-02-17 16:37:54 -06:00
MaxRobinsonTheGreat
0c8620fb3c added openrouter 2025-02-17 15:59:17 -06:00
Sweaterdog
378844e962
Merge branch 'main' into main 2025-02-17 13:15:52 -08:00
MaxRobinsonTheGreat
138a9838ae use word-overlap for skill docs embed if unsupported 2025-02-17 13:13:45 -06:00
Sweaterdog
73e11ff6bb
Update prompter.js
Fixed chat.model typos
2025-02-15 10:20:02 -08:00
gmuffiness
2b5923f98f feat: add vision_model param to profile 2025-02-09 21:57:45 +09:00
Sweaterdog
359c7e825c
Add files via upload 2025-02-08 22:41:07 -08:00
Sweaterdog
d3ad70da6c
Delete src directory 2025-02-08 22:38:37 -08:00