Commit graph

4 commits

Author SHA1 Message Date
Johnathan Walker
18eca2f5d9 fix: Resolve API naming inconsistency in analyse_results module
- Re-export enhanced function as 'aggregate_results' for backward compatibility
- Users can now import aggregate_results and get the enhanced functionality
- Updated architecture documentation to reflect the corrected API
- Maintains intuitive API while providing enhanced model extraction features
2025-06-15 23:21:01 -04:00
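
As an illustration of the re-export described in commit 18eca2f5d9, a backward-compatible alias in a Python module might look like the sketch below. The function name aggregate_results_with_model_extraction, its record format, and its body are assumptions made for this example. Only the public name aggregate_results comes from the commit message.

```python
# Hypothetical sketch of the re-export pattern; not the repository's actual code.
from collections import defaultdict


def aggregate_results_with_model_extraction(records):
    """Group records by model name and return the mean score per model.

    The record layout ({"model": ..., "score": ...}) is assumed for illustration.
    """
    scores_by_model = defaultdict(list)
    for record in records:
        # Records without a "model" key fall into an "unknown" bucket.
        scores_by_model[record.get("model", "unknown")].append(record["score"])
    return {
        model: sum(scores) / len(scores)
        for model, scores in scores_by_model.items()
    }


# Re-export the enhanced function under the original public name, so that
# existing code importing aggregate_results picks up the new behaviour.
aggregate_results = aggregate_results_with_model_extraction
```

With an alias like this, callers that already import aggregate_results keep working unchanged, and both names refer to the same function object.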
Johnathan Walker
f7947ec3c2 refactor: Eliminate code duplication and enhance development workflow
- Created tasks/experiment_utils.py for shared utility functions
- Streamlined entry point scripts by moving common code to utils
- Enhanced .gitignore with comprehensive Python development patterns
- Validated and fixed documentation links across all markdown files
- Applied final code quality improvements and optimization
2025-06-15 23:12:34 -04:00
Johnathan Walker
cc51242527 feat: Enhanced task evaluation system with flexible agent support and rich outcome reporting
- Added new evaluation.py with dynamic agent configuration support
- Implemented comprehensive test suite (38 tests, 100% pass rate)
- Enhanced evaluation_script.py with improved error handling and logging
- Updated analysis tools for better outcome reporting and visualization
- Added extensive documentation including architecture guide and user manuals
- Maintained backward compatibility with existing task formats
- Improved performance and reliability for multi-agent evaluations

Key improvements:
- Flexible agent count configuration (1-N agents)
- Rich outcome data structures with detailed metrics
- Comprehensive error handling and recovery mechanisms
- Enhanced logging and debugging capabilities
- Complete test coverage for production readiness

Files added/modified:
- tasks/evaluation.py (new core evaluation engine)
- tasks/test_*.py (comprehensive test suite)
- docs/ (complete documentation suite)
- Updated analysis and visualization tools
2025-06-15 22:01:19 -04:00
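
The "Key improvements" listed in commit cc51242527 (a configurable agent count, rich outcome structures, and per-agent error handling) can be sketched roughly as follows. This is not the contents of tasks/evaluation.py. The names AgentOutcome, TaskOutcome, evaluate_task, and the run_agent callback are all hypothetical and chosen only to show the idea.

```python
# Hypothetical sketch; class and function names are assumptions for illustration.
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class AgentOutcome:
    agent_id: int
    success: bool
    error: Optional[str] = None  # set when the agent raised an exception


@dataclass
class TaskOutcome:
    task_id: str
    agents: List[AgentOutcome] = field(default_factory=list)

    @property
    def success_rate(self) -> float:
        """Fraction of agents that finished successfully."""
        if not self.agents:
            return 0.0
        return sum(a.success for a in self.agents) / len(self.agents)


def evaluate_task(task_id: str, agent_count: int,
                  run_agent: Callable[[int], bool]) -> TaskOutcome:
    """Run agent_count agents (1..N) and collect one outcome per agent."""
    if agent_count < 1:
        raise ValueError("agent_count must be at least 1")
    outcome = TaskOutcome(task_id=task_id)
    for agent_id in range(agent_count):
        try:
            outcome.agents.append(AgentOutcome(agent_id, bool(run_agent(agent_id))))
        except Exception as exc:
            # Record the failure and continue with the remaining agents.
            outcome.agents.append(AgentOutcome(agent_id, False, error=str(exc)))
    return outcome
```

Recording a failure on the outcome object, rather than letting the exception propagate, means one failing agent does not discard the results of the others.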
MaxRobinsonTheGreat
8060b1e94f refactor all python to tasks folder (ai)
2025-04-19 14:49:20 -05:00
Renamed from analyse_results.py