Commit graph

128 commits

Author SHA1 Message Date
Sweaterdog
2818330d57 Merge remote-tracking branch 'upstream/main' into stt-new-ollama-model-always-active-vision 2025-06-18 13:49:30 -07:00
MaxRobinsonTheGreat
00127506b1 improve ui and default settings 2025-06-16 16:32:40 -05:00
MaxRobinsonTheGreat
b2de1cda17 clean settings.js 2025-06-13 13:08:44 -05:00
MaxRobinsonTheGreat
317c01e340 always connect agents to localhost 2025-06-13 13:02:48 -05:00
MaxRobinsonTheGreat
0f5dd0cb07 create-agent endpoint from ui 2025-06-11 16:41:54 -05:00
Sweaterdog
d79b3f3534
Update settings.js
Restored the settings back to its true form
2025-06-07 17:33:21 -07:00
Sweaterdog
4d6765cacf
Update settings.js 2025-06-07 17:29:53 -07:00
Sweaterdog
f22b4957e0
Update settings.js
Changed some of the values for a better STT experience
2025-06-07 17:25:33 -07:00
Sweaterdog
87e2e708fd
Update settings.js
Updated settings with the new features
2025-06-07 17:19:23 -07:00
Sweaterdog
296fb1323c
Update settings.js
fixed a comma
2025-06-07 16:17:00 -07:00
Sweaterdog
da0722a8fb
Merge branch 'main' into Speech-to-Text 2025-06-07 14:59:35 -07:00
Sweaterdog
131dd45c9f
Merge branch 'main' into always-active-vision 2025-06-07 14:56:59 -07:00
Sweaterdog
b4f6ad8835
Update settings.js
Removed unnecessary comments made by Jules
2025-06-07 13:52:28 -07:00
google-labs-jules[bot]
62bcb1950c I've integrated universal logging and applied some refactors.
I implemented comprehensive logging across all API providers in src/models/ using logger.js.
This includes:
- Adding log() and logVision() calls to each provider (Claude, DeepSeek, Gemini, GLHF, GPT, Grok, Groq, HuggingFace, Hyperbolic, Local, Mistral, Novita, Qwen, Replicate, VLLM).
- Ensuring logging respects 'log_normal_data', 'log_reasoning_data', and 'log_vision_data' flags in settings.js, which I added.
- I deprecated 'log_all_prompts' in settings.js and updated prompter.js accordingly.

I refactored openrouter.js and prompter.js:
- I removed the experimental reasoning prompt functionality ($REASONING) from openrouter.js.
- I removed a previously implemented (and then reverted) personality injection feature ($PERSONALITY) from prompter.js, openrouter.js, and profile files.

I had to work around some issues:
- I replaced the full file content for glhf.js and hyperbolic.js due to persistent errors with applying changes.

Something I still need to do:
- Based on your latest feedback, model responses containing <thinking>...</thinking> tags need to be transformed to <think>...</think> tags before being passed to logger.js to ensure they are categorized into reasoning_logs.csv. This change is not included in this update.
2025-06-07 10:18:04 +00:00
google-labs-jules[bot]
fa35e03ec5 Refactor logging and remove unused features.
- Unified logging for `prompter.js` to use granular settings from `settings.js` (e.g., `log_normal_data`) instead of `log_all_prompts`, which has been deprecated.
- Removed the experimental reasoning prompt functionality (formerly triggered by `$REASONING`) from `openrouter.js`.
- Reverted the recently added personality injection feature (`$PERSONALITY` and `getRandomPersonality`) from `prompter.js`, `openrouter.js`, and profile files as per your request.
- Verified that `openrouter.js` correctly utilizes `logger.js` for standard and vision logs.
2025-06-07 10:01:18 +00:00
google-labs-jules[bot]
be38f56f12 I've implemented enhanced vision modes with bug fixes and extended API support.
This update finalizes the implementation of three distinct vision modes:
- "off": This disables all my vision capabilities.
- "prompted": (Formerly "on") This allows me to use vision via explicit commands from you (e.g., !lookAtPlayer), and I will then summarize the image.
- "always": (Formerly "always_active") I will automatically take a screenshot every time you send a prompt and send it with your prompt to a multimodal LLM. If you use a look command in this mode, I will only update my view and take a screenshot for the *next* interaction if relevant, without immediate summarization.

Here are the key changes and improvements:

1.  **Bug Fix (Image Path ENOENT)**:
    *   I've corrected `Camera.capture()` so it returns filenames with the `.jpg` extension.
    *   I've updated `VisionInterpreter.analyzeImage()` to handle full filenames.
    *   This resolves the `ENOENT` error that was previously happening in `Prompter.js`.

2.  **Vision Mode Renaming**:
    *   I've renamed the modes in `settings.js` and throughout the codebase: "on" is now "prompted", and "always_active" is now "always".

3.  **Core Framework (from previous work, now integrated)**:
    *   I've added `vision_mode` to `settings.js`.
    *   `Agent.js` now manages `latestScreenshotPath` and initializes `VisionInterpreter` with `vision_mode`.
    *   `VisionInterpreter.js` handles different behaviors for each mode.
    *   My vision commands (`!lookAt...`) respect the `off` mode.
    *   `History.js` stores `imagePath` with turns, and `Agent.js` manages this path's lifecycle.
    *   `Prompter.js` reads image files when I'm in "always" mode and passes `imageData` to model wrappers.

4.  **Extended Multimodal API Support**:
    *   `gemini.js`, `gpt.js`, `claude.js`, `local.js` (Ollama), `qwen.js`, and `deepseek.js` have been updated to accept `imageData` in their `sendRequest` method and format it for their respective multimodal APIs. They now include `supportsRawImageInput = true`.
    *   Other model wrappers (`mistral.js`, `glhf.js`, `grok.js`, etc.) now safely handle the `imageData` parameter in `sendRequest` (by ignoring it and logging a warning) and have `supportsRawImageInput = false` for that method, ensuring consistent behavior.

5.  **Testing**: I have a comprehensive plan to verify all modes and functionalities.

This set of changes provides a robust and flexible vision system for me, catering to different operational needs and supporting various multimodal LLMs.
2025-06-07 09:07:02 +00:00
google-labs-jules[bot]
ffe3b0e528 Jules was unable to complete the task in time. Please review the work done so far and provide feedback for Jules to continue. 2025-06-07 08:39:05 +00:00
Maximus
6f2bf41e6e initial refactor 2025-06-02 13:47:07 -06:00
Sweaterdog
d91a3c79a3
Fixed typo model name :p
Fixed a typo

`// "./profiles/andy-4/json",`
to
`// "./profiles/andy-4.json",`
2025-05-20 19:03:40 -07:00
Sweaterdog
504dd3b7e8
Update settings.js
Updated `settings.js` to include the profile for Andy-4
2025-05-20 18:50:54 -07:00
Sweaterdog
7ac16e6ace
Merge branch 'main' into Speech-to-Text 2025-05-01 13:05:23 -07:00
MaxRobinsonTheGreat
eb6fbb0ba9 clean settings 2025-04-21 12:42:19 -05:00
MaxRobinsonTheGreat
c9dd763529 Merge remote-tracking branch 'upstream/main' into merge-main 2025-04-08 16:36:47 -05:00
Isadora White
57af4f13cc adding back logging 2025-03-22 15:17:08 -05:00
Isadora White
b72500a0ea
Merge branch 'main' into constructionTaskRevision 2025-03-20 20:45:28 -07:00
Mehul Maheshwari
e35e0badf3
Update settings.js 2025-03-20 14:11:59 -07:00
Mehul Maheshwari
12b327f193
Merge branch 'main' into constructionTaskRevision 2025-03-20 12:26:29 -07:00
MaxRobinsonTheGreat
96250b3ce5 Merge branch 'main' into cleanup 2025-03-19 13:29:37 -05:00
Mehul Maheshwari
a92eb58ad0 fixed tool based problems with construction tasks (ladders needs blocks behind them, door validation fixed) 2025-03-18 17:02:13 -07:00
Sweaterdog
9007a49ab3
Merge branch 'develop' into TTS 2025-03-16 22:57:27 -07:00
Isadora White
1ccba3a4b5 new train, test, dev tasks and new analysis files 2025-03-16 17:55:05 -07:00
MaxRobinsonTheGreat
2015667b2e refactor environment variable settings overrides 2025-03-16 19:45:21 -05:00
MaxRobinsonTheGreat
d9a0b0802c Merge branch 'develop' into vision 2025-03-15 17:24:52 -05:00
MaxRobinsonTheGreat
eebd43e8a3 Merge branch 'main' into cleanup 2025-03-15 14:19:01 -05:00
Isadora White
125aa73d6c adding blocked actions 2025-03-14 18:51:41 -07:00
Sweaterdog
2db99b3440
Update settings.js
Moved speak setting to the bottom near STT settings
2025-03-14 14:29:26 -07:00
Sweaterdog
a5275f7093
Update settings.js 2025-03-14 12:27:49 -07:00
Sweaterdog
360b937237
Merge branch 'develop' into TTS 2025-03-13 23:54:49 -07:00
MaxRobinsonTheGreat
cbe5804f73 remove duplicate mistral 2025-03-13 15:38:49 -05:00
MaxRobinsonTheGreat
7f97574c4e fix speak 2025-03-13 14:40:18 -05:00
MaxRobinsonTheGreat
135af2229c Merge branch 'develop' of https://github.com/kolbytn/mindcraft into develop 2025-03-13 14:33:58 -05:00
MaxRobinsonTheGreat
5695c66fcb better comments 2025-03-13 14:33:56 -05:00
uukelele-scratch
f57da837b1
speaking is now false by default 2025-03-12 21:22:14 +00:00
MaxRobinsonTheGreat
753b8aa32b fix andy path 2025-03-12 15:00:51 -05:00
MaxRobinsonTheGreat
ce8dd89231 prompter refactor 2025-03-11 11:03:40 -05:00
MaxRobinsonTheGreat
87d34aa023 add blueprint to blocked actions 2025-03-11 10:45:55 -05:00
Isadora White
7402839cb5 teleport close to building 2025-03-10 23:08:35 -07:00
uukelele-scratch
b2cafb06b9
Merge branch 'kolbytn:main' into main 2025-03-09 15:46:17 +00:00
Isadora White
9a22d78bad updating prompter to include better logging 2025-03-09 00:07:09 -08:00
Ayush Maniar
8644d48fb3 More fixes to agent.js, removed jill.json from settings.js 2025-03-08 16:56:01 -08:00