Changelog
All notable changes to PyVisionAI will be documented here. The format is based on Keep a Changelog.
[0.3.1] - 2025-02-23
Added
- Added new -m/--model parameter to describe-image command for model selection
- Added new -s/--source parameter to describe-image command for specifying image path
- Added comprehensive test coverage for CLI parameters
Changed
- Updated CLI parameter handling to support both new and legacy model selection
- Updated CLI parameter handling to support both new and legacy image path specification
- Enhanced help messages with clearer model descriptions
Note
- The -u/--use-case parameter continues to be supported for backward compatibility
- The -i/--image parameter continues to be supported for backward compatibility
- We recommend using -m/--model and -s/--source for better consistency across commands
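
The coexistence of new and legacy flags described in this release can be sketched with `argparse`. This is only an illustration, not the project's actual parser: the helper names `build_parser` and `resolve_args`, the precedence rule (new flag wins when both are given), and the model values used in the usage note are assumptions.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real describe-image options may differ.
    parser = argparse.ArgumentParser(prog="describe-image")
    parser.add_argument("-s", "--source", help="path to the image (preferred)")
    parser.add_argument("-i", "--image", help="path to the image (legacy)")
    parser.add_argument("-m", "--model", help="model to use (preferred)")
    parser.add_argument("-u", "--use-case", dest="use_case",
                        help="model to use (legacy)")
    return parser


def resolve_args(argv: list[str]) -> tuple[str, str | None]:
    """Return (image_path, model), preferring the new flags over legacy ones."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed precedence: a new-style flag overrides its legacy counterpart.
    image_path = args.source or args.image
    if image_path is None:
        parser.error("an image path is required: use -s/--source or -i/--image")
    model = args.model or args.use_case
    return image_path, model
```

With this sketch, `resolve_args(["-s", "photo.jpg", "-m", "claude"])` and `resolve_args(["-i", "photo.jpg", "-u", "claude"])` resolve to the same pair, which is the behavior the backward-compatibility note describes.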
[0.3.0] - 2024-03-25
Added
Claude Vision Integration
- Added ClaudeVisionModel class for Anthropic's Claude Vision API integration
- Implemented robust retry logic and error handling for Claude API calls
- Added support for custom prompts with Claude Vision
- Added describe_image_claude function to main API
Testing Framework
- Added Claude-specific test markers and comprehensive test suite
- Added integration tests with real API calls
- Added rate limit and retry logic tests
Documentation
- Added Claude Vision model documentation
- Updated API documentation with Claude Vision integration details
- Added environment setup instructions for Anthropic API key
Configuration
- Added ANTHROPIC_API_KEY environment variable support
- Added Claude Vision model configuration in factory system
- Added retry strategy configuration for API calls
[0.2.7] - 2024-03-22
Added
- Added retry mechanism for handling transient failures:
  - Implemented RetryManager with configurable strategies
  - Added support for exponential, linear, and constant backoff
  - Added comprehensive logging for retry attempts
  - Added proper error handling and delay management
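
For readers unfamiliar with the three backoff strategies named above, here is a minimal sketch of how they differ. It does not reproduce RetryManager: the function names `backoff_delay` and `retry`, the default values, the delay cap, and the choice to retry only on `ConnectionError` are all assumptions made for illustration.

```python
import time


def backoff_delay(strategy: str, attempt: int,
                  base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based), at most `cap`."""
    if strategy == "exponential":
        delay = base * 2 ** (attempt - 1)   # base, 2*base, 4*base, ...
    elif strategy == "linear":
        delay = base * attempt              # base, 2*base, 3*base, ...
    elif strategy == "constant":
        delay = base                        # always base
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return min(delay, cap)


def retry(func, max_attempts: int = 3, strategy: str = "exponential",
          base: float = 1.0, sleep=time.sleep):
    """Call `func` until it succeeds or `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(strategy, attempt, base))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting, and the cap is one simple way to keep exponential delays from growing without bound.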
Changed
- Improved error handling in model selection:
  - Enhanced connection error handling for API calls
  - Added graceful fallback when default model is unavailable
  - Improved error messages with detailed failure context
- Enhanced test coverage:
  - Added tests for retry mechanism with various strategies
  - Added tests for model fallback scenarios
  - Added mocked API tests for connection failures
Fixed
- Fixed model selection to properly handle connection failures
- Fixed retry delays to prevent excessive wait times
- Fixed logging to capture all retry and fallback attempts
[0.2.6] - 2024-01-25
Added
- Implemented Model Factory pattern for vision models:
  - Added VisionModel base class with abstract methods
  - Added ModelFactory for centralized model management
  - Added concrete implementations for GPT4 and Llama models
  - Added comprehensive logging for model lifecycle
  - Added configuration validation for each model type
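
The factory pattern listed above can be illustrated roughly as follows. Only the class names VisionModel and ModelFactory come from this changelog; the method names (`describe_image`, `register`, `create`) and the `EchoModel` stand-in are hypothetical and may not match the real interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class VisionModel(ABC):
    """Base class; the method name here is an assumption."""

    @abstractmethod
    def describe_image(self, image_path: str) -> str:
        """Return a text description of the image at `image_path`."""


class ModelFactory:
    """Central registry mapping a model name to its implementation class."""

    _registry: Dict[str, Type[VisionModel]] = {}

    @classmethod
    def register(cls, name: str, model_cls: Type[VisionModel]) -> None:
        cls._registry[name] = model_cls

    @classmethod
    def create(cls, name: str, **kwargs) -> VisionModel:
        try:
            model_cls = cls._registry[name]
        except KeyError:
            known = ", ".join(sorted(cls._registry)) or "none"
            raise ValueError(
                f"unknown model {name!r}; registered: {known}"
            ) from None
        return model_cls(**kwargs)


class EchoModel(VisionModel):
    """Stand-in implementation used only to exercise the factory."""

    def describe_image(self, image_path: str) -> str:
        return f"description of {image_path}"


ModelFactory.register("echo", EchoModel)
```

Callers then ask for a model by name, for example `ModelFactory.create("echo")`, without importing the concrete class, which is what makes it practical to add further implementations behind one interface.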
Changed
- Refactored model initialization to use factory pattern
- Improved error handling in model creation and validation
- Standardized model interface across all implementations
- Enhanced logging with model-specific context
Documentation
- Added docstrings for new model classes
- Updated logging documentation
- Added model factory usage examples
[0.2.5] - 2024-01-21
Added
- Implemented comprehensive logging across all extractors:
  - Added structured logging for PDF processing stages
  - Added progress tracking for DOCX file conversions
  - Added detailed logging for PPTX slide extraction
  - Added HTML processing status logging
Changed
- Standardized logging patterns across all extractors
- Replaced print statements with proper logger calls
- Added logging initialization in all core modules
- Standardized log message format and levels
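
As a rough picture of what replacing print statements with logger calls looks like, consider the sketch below. The logger name, the function `extract_pages`, and the message wording are hypothetical and not taken from the codebase.

```python
import logging

# Module-level logger; the name is an assumption for this example.
logger = logging.getLogger("pyvisionai.example")


def extract_pages(page_count: int) -> int:
    """Pretend to process `page_count` pages, logging progress as it goes."""
    # Before: print(f"Starting extraction of {page_count} pages")
    logger.info("Starting extraction of %d pages", page_count)
    for page in range(1, page_count + 1):
        # Per-page detail goes to DEBUG so it can be filtered out by level.
        logger.debug("Processed page %d/%d", page, page_count)
    logger.info("Finished extraction")
    return page_count
```

Unlike print, these calls carry a level and a logger name, so an application can route or silence them through its own `logging` configuration.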
Improved
- Enhanced benchmark testing reliability
- Added performance metrics logging
- Improved test independence from environment
[0.2.4] - 2024-03-21
Changed
- Implemented parallel processing for DOCX extraction
- Added concurrent processing of paragraphs and images
- Improved performance through ThreadPoolExecutor
- ~72% reduction in processing time (189s → 53s)
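
The general technique behind this change can be sketched with the standard library's `ThreadPoolExecutor`. The helper `process_items` and its parameters are hypothetical; the actual DOCX extractor is not shown in this changelog.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_items(items: Iterable[T], worker: Callable[[T], R],
                  max_workers: int = 4) -> List[R]:
    """Apply `worker` to every item using a thread pool.

    Executor.map yields results in input order, so the output lines up
    with `items` even when tasks finish out of order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, items))
```

Threads help most when each item spends its time waiting on I/O, such as an API call to describe an image; CPU-bound work in pure Python would see far less benefit.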
[0.2.3] - 2024-03-20
Changed
- Implemented parallel processing for PDF extraction
- Improved performance by ~68% (4min → 1.3min on 27-page PDF)
[0.2.2] - 2024-03-20
Added
- Added support for custom prompts in image description
- Added support for custom prompts in file extraction
[0.2.1] - 2024-03-19
Added
- Support for HTML file extraction using Playwright
- Capability to handle interactive HTML pages
- HTML to image conversion for consistent results
[0.2.0] - 2024-01-07
Fixed
- Fixed PDF image extraction black image issue (#11)
- Added proper color space handling
- Improved error handling and logging
Changed
- Improved image extraction reliability
- Implemented parallel processing
- Enhanced error reporting
- Updated documentation
[0.1.1] - 2024-01-07
Added
- Initial release with PDF, DOCX, and PPTX support
- Text and image extraction capabilities
- Image description using Vision LLMs
- Command-line interface