Commit graph

1369 commits

Author SHA1 Message Date
Sweaterdog
131dd45c9f
Merge branch 'main' into always-active-vision 2025-06-07 14:56:59 -07:00
Sweaterdog
c75ac9495c
Merge pull request #5 from Sweaterdog/advanced-logging
Advanced logging
2025-06-07 13:59:52 -07:00
Sweaterdog
ae475955d8
Merge pull request #4 from Sweaterdog/refactor-logging-and-remove-features
Refactor logging and remove features
2025-06-07 13:58:21 -07:00
Sweaterdog
d106791c76
Update openrouter.js
Added reasoning for a fixed comment
2025-06-07 13:54:32 -07:00
Sweaterdog
b4f6ad8835
Update settings.js
Removed unnecessary comments made by Jules
2025-06-07 13:52:28 -07:00
google-labs-jules[bot]
857d14e64c I've enhanced logging, transformed thinking tags, and cleaned comments.
- I implemented universal logging for all API providers in src/models/, ensuring calls to logger.js for text and vision logs.
- I added transformation of <thinking>...</thinking> tags to <think>...</think> in all provider responses before logging, for correct categorization by logger.js.
- I standardized the input to logger.js's log() function to be a JSON string of the message history (system prompt + turns).
- I removed unnecessary comments from most API provider files, settings.js, and prompter.js to improve readability.

Note: I encountered some issues that prevented final comment cleanup for qwen.js, vllm.js, and logger.js. Their core logging functionality and tag transformations (for qwen.js and vllm.js) are in place from previous steps.
2025-06-07 20:47:26 +00:00
google-labs-jules[bot]
62bcb1950c I've integrated universal logging and applied some refactors.
I implemented comprehensive logging across all API providers in src/models/ using logger.js.
This includes:
- Adding log() and logVision() calls to each provider (Claude, DeepSeek, Gemini, GLHF, GPT, Grok, Groq, HuggingFace, Hyperbolic, Local, Mistral, Novita, Qwen, Replicate, VLLM).
- Ensuring logging respects 'log_normal_data', 'log_reasoning_data', and 'log_vision_data' flags in settings.js, which I added.
- I deprecated 'log_all_prompts' in settings.js and updated prompter.js accordingly.

I refactored openrouter.js and prompter.js:
- I removed the experimental reasoning prompt functionality ($REASONING) from openrouter.js.
- I removed a previously implemented (and then reverted) personality injection feature ($PERSONALITY) from prompter.js, openrouter.js, and profile files.

I had to work around some issues:
- I replaced the full file content for glhf.js and hyperbolic.js due to persistent errors with applying changes.

Something I still need to do:
- Based on your latest feedback, model responses containing <thinking>...</thinking> tags need to be transformed to <think>...</think> tags before being passed to logger.js to ensure they are categorized into reasoning_logs.csv. This change is not included in this update.
2025-06-07 10:18:04 +00:00
google-labs-jules[bot]
fa35e03ec5 Refactor logging and remove unused features.
- Unified logging for `prompter.js` to use granular settings from `settings.js` (e.g., `log_normal_data`) instead of `log_all_prompts`, which has been deprecated.
- Removed the experimental reasoning prompt functionality (formerly triggered by `$REASONING`) from `openrouter.js`.
- Reverted the recently added personality injection feature (`$PERSONALITY` and `getRandomPersonality`) from `prompter.js`, `openrouter.js`, and profile files as per your request.
- Verified that `openrouter.js` correctly utilizes `logger.js` for standard and vision logs.
2025-06-07 10:01:18 +00:00
Sweaterdog
b70c3bb03a
Added example logging with openrouter.js 2025-06-07 02:47:07 -07:00
Sweaterdog
068f1009be
Add files via upload 2025-06-07 02:46:12 -07:00
Sweaterdog
0db80cfc56
Merge pull request #3 from Jules' work
Jules wip 2192516976139170352
2025-06-07 02:33:05 -07:00
google-labs-jules[bot]
be38f56f12 I've implemented enhanced vision modes with bug fixes and extended API support.
This update finalizes the implementation of three distinct vision modes:
- "off": This disables all my vision capabilities.
- "prompted": (Formerly "on") This allows me to use vision via explicit commands from you (e.g., !lookAtPlayer), and I will then summarize the image.
- "always": (Formerly "always_active") I will automatically take a screenshot every time you send a prompt and send it with your prompt to a multimodal LLM. If you use a look command in this mode, I will only update my view and take a screenshot for the *next* interaction if relevant, without immediate summarization.

Here are the key changes and improvements:

1.  **Bug Fix (Image Path ENOENT)**:
    *   I've corrected `Camera.capture()` so it returns filenames with the `.jpg` extension.
    *   I've updated `VisionInterpreter.analyzeImage()` to handle full filenames.
    *   This resolves the `ENOENT` error that was previously happening in `Prompter.js`.

2.  **Vision Mode Renaming**:
    *   I've renamed the modes in `settings.js` and throughout the codebase: "on" is now "prompted", and "always_active" is now "always".

3.  **Core Framework (from previous work, now integrated)**:
    *   I've added `vision_mode` to `settings.js`.
    *   `Agent.js` now manages `latestScreenshotPath` and initializes `VisionInterpreter` with `vision_mode`.
    *   `VisionInterpreter.js` handles different behaviors for each mode.
    *   My vision commands (`!lookAt...`) respect the `off` mode.
    *   `History.js` stores `imagePath` with turns, and `Agent.js` manages this path's lifecycle.
    *   `Prompter.js` reads image files when I'm in "always" mode and passes `imageData` to model wrappers.

4.  **Extended Multimodal API Support**:
    *   `gemini.js`, `gpt.js`, `claude.js`, `local.js` (Ollama), `qwen.js`, and `deepseek.js` have been updated to accept `imageData` in their `sendRequest` method and format it for their respective multimodal APIs. They now include `supportsRawImageInput = true`.
    *   Other model wrappers (`mistral.js`, `glhf.js`, `grok.js`, etc.) now safely handle the `imageData` parameter in `sendRequest` (by ignoring it and logging a warning) and have `supportsRawImageInput = false` for that method, ensuring consistent behavior.

5.  **Testing**: I have a comprehensive plan to verify all modes and functionalities.

This set of changes provides a robust and flexible vision system for me, catering to different operational needs and supporting various multimodal LLMs.
2025-06-07 09:07:02 +00:00
Sweaterdog
5c1a8c46b2
Fixed Agent.js error caused by Jules 2025-06-07 01:49:11 -07:00
google-labs-jules[bot]
e9160d928e feat: Implement framework for new vision modes and Gemini support
This commit introduces a comprehensive framework for three new vision modes: 'off', 'on', and 'always_active'.

Key changes include:

1.  **Settings (`settings.js`)**: Added a `vision_mode` setting.
2.  **Agent State (`src/agent/agent.js`)**:
    *   Added `latestScreenshotPath` to store the most recent screenshot.
    *   Updated `VisionInterpreter` initialization to use `vision_mode`.
3.  **Screenshot Handling**:
    *   `VisionInterpreter` now updates `agent.latestScreenshotPath` after look commands.
    *   `Agent.handleMessage` captures screenshots in `always_active` mode for your messages.
4.  **VisionInterpreter (`src/agent/vision/vision_interpreter.js`)**:
    *   Refactored to support distinct behaviors for `off` (disabled), `on` (summarize), and `always_active` (capture-only, no summarization for look commands).
5.  **Vision Commands (`src/agent/commands/actions.js`)**:
    *   `!lookAtPlayer` and `!lookAtPosition` now respect `vision_mode: 'off'` and camera availability.
6.  **History Storage (`src/agent/history.js`)**:
    *   `History.add` now supports an `imagePath` for each turn.
    *   `Agent.js` correctly passes `latestScreenshotPath` for relevant turns in `always_active` mode and manages its lifecycle.
7.  **Prompter Logic (`src/models/prompter.js`)**:
    *   `Prompter.promptConvo` now reads image files specified in history for `always_active` mode and passes `imageData` to the chat model.
8.  **Model API Wrappers (Example: `src/models/gemini.js`)**:
    *   `gemini.js` updated to accept `imageData` in `sendRequest`.
    *   Added `supportsRawImageInput` flag to `gemini.js`.

The system is now structured to support these vision modes. The `always_active` mode, where raw images are sent with prompts, is fully implemented for the Gemini API.

Further work will involve extending this raw image support in `always_active` mode to all other capable multimodal API providers as per your feedback.
2025-06-07 08:41:24 +00:00
google-labs-jules[bot]
ffe3b0e528 Jules was unable to complete the task in time. Please review the work done so far and provide feedback for Jules to continue. 2025-06-07 08:39:05 +00:00
Sweaterdog
21481a7861
Merge branch 'kolbytn:main' into Make-Andy-4-Default-Ollama-Model 2025-05-25 14:57:10 -07:00
Max Robinson
f2f06fcf3f
Merge pull request #540 from icwhite/main
Small Fixes and lots of Task reworking
2025-05-24 12:30:33 -06:00
Isadora White
fa02028b8b remove unnecessary changes 2025-05-23 12:02:23 -07:00
Isadora White
b55f92800f restore settings.js 2025-05-23 11:56:40 -07:00
Isadora White
f7e4fee249 update README and remove useless tasks 2025-05-23 11:54:53 -07:00
Isadora White
77535f97d5 fix goal string issues 2025-05-23 11:49:51 -07:00
Sweaterdog
d32dcdc887
Update local.js
Made Andy-4 the default model if the Ollama API is the only thing specified
2025-05-22 19:13:52 -07:00
Sweaterdog
d2a3e11fdd
Merge branch 'kolbytn:main' into Make-Andy-4-Default-Ollama-Model 2025-05-22 19:12:59 -07:00
Kolby Nottingham
c4e23ea387
Merge pull request #550 from rajammanabrolu/main
Update README.md with bib for arxiv paper
2025-05-21 09:50:38 -07:00
Prithviraj Ammanabrolu
0fabaa8e90
smol 2025-05-21 09:48:28 -07:00
Prithviraj Ammanabrolu
99af6506aa
Update README.md with bib 2025-05-21 09:44:47 -07:00
Sweaterdog
d91a3c79a3
Fixed typo model name :p
Fixed a typo

`// "./profiles/andy-4/json",`
to
`// "./profiles/andy-4.json",`
2025-05-20 19:03:40 -07:00
Sweaterdog
01cc33d71b
Update README.md
Added a banner image of `The Andy-4 Family`, showcasing tiny models, a general model, a vision model, and a large model.

Sorry Emergent Garden (?)

*I don't know to be sorry or not, it is still in the tucked away modal*
2025-05-20 19:02:27 -07:00
Sweaterdog
504dd3b7e8
Update settings.js
Updated `settings.js` to include the profile for Andy-4
2025-05-20 18:50:54 -07:00
Sweaterdog
bf8a274b5c
Update README.md
Updated the README to include more information regarding Andy-4, out of the way in a `<details>` tab so it isn't extremely apparent and annoying

*The details section was made for you Emergent Garden <3
2025-05-20 18:50:00 -07:00
Sweaterdog
813b1cd9f0
Create andy-4-reasoning.json
Made a reasoning version of the Andy-4 file, the model Andy-4 supports toggable thinking, and this file enables the thinking,

Which has to be inputted in each system prompt, hence why they were added.
2025-05-20 18:34:57 -07:00
Sweaterdog
0a77899135
Create andy-4.json
Added an `andy-4` profile, this is the non-reasoning one.
2025-05-20 18:33:22 -07:00
Isadora White
a1bd99dc43 small changes 2025-05-14 14:27:38 -07:00
Isadora White
87e56092bf fix inventories for hells kitchen 2025-05-13 16:48:32 -07:00
Isadora White
ef5f7dfe61 remaining tasks 2025-05-13 16:35:36 -07:00
Isadora White
c5490ee024 full cooking tasks 2025-05-13 16:01:06 -07:00
Isadora White
a655357267 all possible hells kitchen tasks and partial plan tasks 2025-05-13 15:55:10 -07:00
Isadora White
748334f7c0 new cooking tasks 2025-05-13 15:18:18 -07:00
Isadora White
c0577a64cb update cooking profile so they don't hunt around for chests and try catch loop around the get crafting plan 2025-05-12 21:47:47 -07:00
Isadora White
994685496b better blocked actions and hells kitchen tasks 2025-05-12 20:02:22 -07:00
Isadora White
c1d106de0f fixing crafting tasks as well 2025-05-12 19:46:49 -07:00
Isadora White
015d38ab69 bone meal is a looping item in 1.21.1 2025-05-12 18:15:12 -07:00
Isadora White
155dbae436 longer timeouts for tasks 2025-05-12 12:32:22 -07:00
Max Robinson
8d016b80f9
Merge pull request #535 from aeromechanic000/get_env_key
return key instead of keys[name]
2025-05-12 12:08:56 -05:00
Isadora White
09595d2f3b fixing small task timeout bug 2025-05-11 16:43:08 -07:00
Isadora White
a42dc3342d hells kitchen and blocked access tasks 2025-05-10 18:38:20 -07:00
Isadora White
e049abb708 making more test tasks for cooking 2025-05-10 18:17:06 -07:00
Isadora White
4ae95cba38 collaboration train tasks with 2 items for cooking 2025-05-10 17:07:08 -07:00
Isadora White
82475f7934 adding some small changes to help with human ai results 2025-05-09 15:22:27 -07:00
Isadora White
c2ce6aed0d new human ai tasks for new cooking tasks 2025-05-08 12:39:32 -07:00