Commit graph

1050 commits

Author | SHA1 | Message | Date
Ayush Maniar | afded138d0 | Constructiontasks store average success across all tasks | 2025-04-17 13:32:29 -07:00
Ayush Maniar | e8e8212832 | Thoroughly fixed evaluation_script success calculations and added support for debugging the same | 2025-04-17 13:06:19 -07:00
Ayush Maniar | c583e2d5e1 | Fix for crafting task analysis script for 3+ agent tasks | 2025-04-15 22:36:18 -07:00
Isadora White | a52905092d | flower and church blueprint | 2025-04-10 14:04:55 -07:00
Isadora White | 074cbcf8f8 | three agent pyramid | 2025-04-09 16:34:11 -07:00
Isadora White | e20a932614 | fixed major bug in construction tasks and cool footage | 2025-04-08 20:57:19 -07:00
Isadora White | 9e713058cb | Merge branch 'main' of https://github.com/icwhite/mindcraft | 2025-04-08 16:08:48 -07:00
Isadora White | 95357463cb | change blocked actions | 2025-04-08 16:08:41 -07:00
Ayush Maniar | 6bc8585c17 | Updated timeout for crafting tasks | 2025-04-08 12:26:32 -07:00
Ayush Maniar | d154e46085 | Removed cooking task files that are no longer needed | 2025-04-07 20:39:05 -07:00
Ayush Maniar | dafbdf38d5 | Modified goal for 3+ agent crafting tasks similar to cooking tasks | 2025-04-07 20:36:05 -07:00
Ayush Maniar | c43dc879d2 | Merge branch 'main' of https://github.com/icwhite/mindcraft | 2025-04-07 20:33:54 -07:00
Ayush Maniar | 4cb0c51026 | Conversation examples for 3+ agent crafting tasks | 2025-04-07 20:33:40 -07:00
Isadora White | 64c0f61828 | fixing three agent tasks with long timeout | 2025-04-07 16:25:35 -07:00
Isadora White | f2355f16be | adding long timeout tasks for llama evaluation | 2025-04-07 15:42:26 -07:00
Isadora White | 11db49ce78 | make a blueprint for the pyramid tasks | 2025-04-04 13:49:12 -07:00
Isadora White | 83057cb255 | final adjustment | 2025-04-03 22:57:27 -07:00
Isadora White | 3b80afe813 | update to include the better coordinates | 2025-04-03 22:55:30 -07:00
Isadora White | 3834406f6a | custom task file for small church | 2025-04-03 22:44:19 -07:00
Isadora White | c01cea4062 | update prompted to log memSaving better | 2025-04-02 15:31:06 -07:00
Isadora White | fce23becb2 | Merge branch 'main' of https://github.com/icwhite/mindcraft | 2025-04-02 14:46:28 -07:00
Isadora White | 69172c0e91 | evaluation script that doesn't use tmux | 2025-04-02 14:46:22 -07:00
Ayush Maniar | 3ee72c11f3 | Hells kitchen tasks pushed | 2025-04-02 10:19:23 -07:00
Ayush Maniar | 67b8326da9 | Hells kitchen tasks -> Agent specific validator function created | 2025-04-01 22:33:37 -07:00
Ayush Maniar | 22973a9991 | Removing LICENSE | 2025-03-28 20:05:59 -07:00
Ayush Maniar | cd89bc9354 | Removed README.md file | 2025-03-28 20:04:31 -07:00
Ayush Maniar | beb8605ef3 | Comment removal | 2025-03-28 20:03:28 -07:00
Ayush Maniar | cd548195b5 | Merge branch 'main' of https://github.com/icwhite/mindcraft | 2025-03-28 13:13:19 -07:00
Ayush Maniar | d39b254a06 | Updates on construction and cooking tasks, prettyTable, flexibility to enter multiple folders, .... | 2025-03-28 13:13:16 -07:00
Isadora White | 75037f7596 | longer timeout test tasks | 2025-03-28 14:57:31 -05:00
Ayush Maniar | 63e7861c4f | Situations where no task score is present for logs of all bots | 2025-03-27 00:19:54 -07:00
Ayush Maniar | 4d2e90a455 | Added timeout for history saving | 2025-03-27 00:10:46 -07:00
Ayush Maniar | 7afaba99a4 | Added 2 agent fully unblocked test tasks for comparing with 3+ agents performance | 2025-03-26 23:24:46 -07:00
Ayush Maniar | a5158674dd | Add new test and train tasks for 3+ agent cooking | 2025-03-26 20:32:28 -07:00
Isadora White | fab5721b4e | cooking tasks longer timeout | 2025-03-26 19:59:28 -05:00
Isadora White | f862964edd | evaluation script add no pruning and block conversation | 2025-03-26 01:13:39 -05:00
hlillemark | 38c701a8fb | Merge branch 'main' of https://github.com/icwhite/mindcraft | 2025-03-25 21:19:15 -07:00
hlillemark | d68d35e74c | Add filtered tasks for 3, 4, 5 agents | 2025-03-25 21:18:49 -07:00
Ayush Maniar | 4e0ae4b9b7 | Merge branch 'main' of https://github.com/icwhite/mindcraft | 2025-03-25 20:04:42 -07:00
Ayush Maniar | 607c383dfc | New test and train (easier) tasks for 2+ agent cooking | 2025-03-25 20:04:39 -07:00
Isadora White | c24c8aa11b | Merge branch 'main' of https://github.com/icwhite/mindcraft | 2025-03-25 22:02:40 -05:00
Isadora White | 97e158accd | longer timeout | 2025-03-25 22:02:35 -05:00
Ayush Maniar | dbe96b8083 | Human names for 2+ agents, modifications to goal prompts for 2+ agent tasks, and more milk_buckets in chest | 2025-03-25 11:05:56 -07:00
Ayush Maniar | ab491ede8a | Allow agents to use !endConversation for cooking tasks | 2025-03-25 00:27:41 -07:00
Isadora White | c637e37de5 | fix evaluation script for 4 agents | 2025-03-23 21:02:16 -05:00
Isadora White | a6f9392b81 | Merge branch 'main' of https://github.com/icwhite/mindcraft | 2025-03-23 16:29:21 -05:00
Isadora White | 8130703ba0 | additional logging information | 2025-03-23 16:29:16 -05:00
Ayush Maniar | cc3c6d8677 | Merge branch 'main' of https://github.com/icwhite/mindcraft | 2025-03-23 14:23:14 -07:00
Ayush Maniar | 76de807a46 | Updated analyze scripts to perform model comparison | 2025-03-23 14:23:10 -07:00
Isadora White | 354e1f754a | construction tasks try catch loop | 2025-03-23 15:39:50 -05:00