mindcraft

1362 commits 3 branches 0 tags 82 MiB

Author	SHA1	Message	Date
Johnathan Walker	cc51242527	feat: Enhanced task evaluation system with flexible agent support and rich outcome reporting. Changes: Added new evaluation.py with dynamic agent configuration support; Implemented comprehensive test suite (38 tests, 100% pass rate); Enhanced evaluation_script.py with improved error handling and logging; Updated analysis tools for better outcome reporting and visualization; Added extensive documentation including architecture guide and user manuals; Maintained backward compatibility with existing task formats; Improved performance and reliability for multi-agent evaluations. Key improvements: Flexible agent count configuration (1-N agents); Rich outcome data structures with detailed metrics; Comprehensive error handling and recovery mechanisms; Enhanced logging and debugging capabilities; Complete test coverage for production readiness. Files added/modified: tasks/evaluation.py (new core evaluation engine); tasks/test_*.py (comprehensive test suite); docs/ (complete documentation suite); Updated analysis and visualization tools.	2025-06-15 22:01:19 -04:00
Isadora White	088b71a99a	more friendly messages in the python evaluation script to make it more easy for the users to understand what is happening	2025-06-09 01:35:18 -05:00
Isadora White	a1bd99dc43	small changes	2025-05-14 14:27:38 -07:00
Isadora White	94388efe89	fix merge issues	2025-05-05 13:52:07 -07:00
Isadora White	fa316e350c	fixing human human experiments	2025-05-03 15:00:48 -07:00
Isadora White	aac00bc893	human ai cooking and crafting tasks	2025-04-25 19:16:00 -07:00
Isadora White	181d628033	fixing small issue with defaults	2025-04-25 15:08:34 -07:00
Isadora White	84d8ab0c5e	fixed task paths	2025-04-21 16:20:35 -07:00
MaxRobinsonTheGreat	8060b1e94f	refactor all python to tasks folder (ai)	2025-04-19 14:49:20 -05:00

Renamed from evaluation_script.py

9 commits