Changelog

All notable changes to PyVisionAI will be documented here. The format is based on Keep a Changelog.

[0.3.1] - 2025-02-23

Added

Added new -m/--model parameter to describe-image command for model selection
Added new -s/--source parameter to describe-image command for specifying image path
Added comprehensive test coverage for CLI parameters

Changed

Updated CLI parameter handling to support both new and legacy model selection
Updated CLI parameter handling to support both new and legacy image path specification
Enhanced help messages with clearer model descriptions

Note

The -u/--use-case parameter continues to be supported for backward compatibility
The -i/--image parameter continues to be supported for backward compatibility
We recommend using -m/--model and -s/--source for better consistency across commands
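
One way to accept both the new and the legacy flags is to register each pair with a shared destination. The sketch below uses only `argparse` and is not PyVisionAI's actual CLI code; the `build_parser` helper, the help strings, and the example values are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: new flags plus legacy aliases writing to the same dest."""
    parser = argparse.ArgumentParser(prog="describe-image")
    # New, recommended flags.
    parser.add_argument("-s", "--source", dest="source", help="Path to the image file")
    parser.add_argument("-m", "--model", dest="model", help="Model to use")
    # Legacy flags kept for backward compatibility; they fill the same attributes.
    parser.add_argument("-i", "--image", dest="source", help="Legacy alias for --source")
    parser.add_argument("-u", "--use-case", dest="model", help="Legacy alias for --model")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.source, args.model)
```

With this layout the rest of the program only reads `args.source` and `args.model`, regardless of which spelling the user typed. If both spellings are given, the one that appears last on the command line wins.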

[0.3.0] - 2024-03-25

Added

Claude Vision Integration

Added ClaudeVisionModel class for Anthropic's Claude Vision API integration
Implemented robust retry logic and error handling for Claude API calls
Added support for custom prompts with Claude Vision
Added describe_image_claude function to main API

Testing Framework

Added Claude-specific test markers and comprehensive test suite
Added integration tests with real API calls
Added rate limit and retry logic tests

Documentation

Added Claude Vision model documentation
Updated API documentation with Claude Vision integration details
Added environment setup instructions for Anthropic API key

Configuration

Added ANTHROPIC_API_KEY environment variable support
Added Claude Vision model configuration in factory system
Added retry strategy configuration for API calls
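
As a rough illustration of the environment variable support, a caller could check for ANTHROPIC_API_KEY before creating a model. The helper name `get_anthropic_api_key` and its error message are assumptions, not part of PyVisionAI's API.

```python
import os
from typing import Mapping


def get_anthropic_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the key from ANTHROPIC_API_KEY, or fail early with a clear message."""
    key = env.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return key
```

Failing early like this keeps a missing key from surfacing later as an opaque authentication error in the middle of a retry loop.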

[0.2.7] - 2024-03-22

Added

Added retry mechanism for handling transient failures:
  Implemented RetryManager with configurable strategies
  Added support for exponential, linear, and constant backoff
  Added comprehensive logging for retry attempts
  Added proper error handling and delay management
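
The three backoff strategies named above differ only in how the delay grows with the attempt number. The following is a minimal sketch of that idea, not the library's RetryManager: the `Backoff` enum, `delay_for`, `retry`, and the 30-second cap are all assumptions chosen for illustration.

```python
import logging
import time
from enum import Enum
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("retry_sketch")


class Backoff(Enum):
    CONSTANT = "constant"
    LINEAR = "linear"
    EXPONENTIAL = "exponential"


def delay_for(strategy: Backoff, attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait after failed attempt number `attempt` (1-based), capped at `cap`."""
    if strategy is Backoff.CONSTANT:
        delay = base
    elif strategy is Backoff.LINEAR:
        delay = base * attempt
    else:
        delay = base * 2 ** (attempt - 1)
    return min(delay, cap)


def retry(
    func: Callable[[], T],
    strategy: Backoff = Backoff.EXPONENTIAL,
    max_attempts: int = 3,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on ConnectionError until `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ConnectionError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = delay_for(strategy, attempt, base)
            logger.warning("Attempt %d failed (%s); retrying in %.1fs", attempt, exc, delay)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter lets tests record the delays without actually waiting, and the cap bounds the exponential strategy so a long run of failures cannot produce very long waits.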

Changed

Improved error handling in model selection:
  Enhanced connection error handling for API calls
  Added graceful fallback when default model is unavailable
  Improved error messages with detailed failure context
Enhanced test coverage:
  Added tests for retry mechanism with various strategies
  Added tests for model fallback scenarios
  Added mocked API tests for connection failures

Fixed

Fixed model selection to properly handle connection failures
Fixed retry delays to prevent excessive wait times
Fixed logging to capture all retry and fallback attempts

[0.2.6] - 2024-01-25

Added

Implemented Model Factory pattern for vision models:
Added VisionModel base class with abstract methods
Added ModelFactory for centralized model management
Added concrete implementations for GPT4 and Llama models
Added comprehensive logging for model lifecycle
Added configuration validation for each model type
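
The factory pattern described here can be sketched as an abstract base class plus a registry that maps names to implementations. The class and method names below (`VisionModelSketch`, `EchoModel`, `ModelRegistry`, `describe_image`) are hypothetical stand-ins and do not reproduce the signatures of PyVisionAI's VisionModel or ModelFactory.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class VisionModelSketch(ABC):
    """Common interface every model implementation must satisfy."""

    @abstractmethod
    def describe_image(self, image_path: str) -> str:
        """Return a text description of the image at `image_path`."""


class EchoModel(VisionModelSketch):
    """Toy implementation that makes no API call."""

    def describe_image(self, image_path: str) -> str:
        return f"[echo] {image_path}"


class ModelRegistry:
    """Central place to register and create models by name."""

    _models: Dict[str, Type[VisionModelSketch]] = {}

    @classmethod
    def register(cls, name: str, model_cls: Type[VisionModelSketch]) -> None:
        cls._models[name] = model_cls

    @classmethod
    def create(cls, name: str, **kwargs) -> VisionModelSketch:
        if name not in cls._models:
            known = ", ".join(sorted(cls._models)) or "none"
            raise ValueError(f"Unknown model '{name}' (registered: {known})")
        return cls._models[name](**kwargs)


ModelRegistry.register("echo", EchoModel)
model = ModelRegistry.create("echo")
print(model.describe_image("photo.jpg"))
```

Callers depend only on the shared interface, so adding a new backend means registering one more class and leaves the call sites unchanged.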

Changed

Refactored model initialization to use factory pattern
Improved error handling in model creation and validation
Standardized model interface across all implementations
Enhanced logging with model-specific context

Documentation

Added docstrings for new model classes
Updated logging documentation
Added model factory usage examples

[0.2.5] - 2024-01-21

Added

Implemented comprehensive logging across all extractors:
  Added structured logging for PDF processing stages
  Added progress tracking for DOCX file conversions
  Added detailed logging for PPTX slide extraction
  Added HTML processing status logging

Changed

Standardized logging patterns across all extractors
Replaced print statements with proper logger calls
Added logging initialization in all core modules
Standardized log message format and levels
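
Replacing print statements with logger calls typically follows the pattern below. This is a generic sketch using Python's standard `logging` module; the logger name, the format string, and the `process_pages` function are assumptions and are not taken from PyVisionAI's code.

```python
import logging

# One named logger per module; __name__ is the usual choice inside a package.
logger = logging.getLogger("extractor_sketch")


def setup_logging(level: int = logging.INFO) -> None:
    """Configure one shared format so every module logs consistently."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    )


def process_pages(page_count: int) -> int:
    """Stand-in for an extraction loop, with logger calls instead of print()."""
    logger.info("Starting extraction of %d pages", page_count)
    for page in range(1, page_count + 1):
        logger.debug("Processing page %d/%d", page, page_count)
    logger.info("Finished extraction")
    return page_count


if __name__ == "__main__":
    setup_logging(logging.DEBUG)
    process_pages(3)
```

Progress messages go to the debug level and stage boundaries to the info level, so verbosity can be changed in one place without editing the extraction code.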

Improved

Enhanced benchmark testing reliability
Added performance metrics logging
Improved test independence from environment

[0.2.4] - 2024-03-21

Changed

Implemented parallel processing for DOCX extraction
Added concurrent processing of paragraphs and images
Improved performance through ThreadPoolExecutor
Reduced processing time by ~72% (189s → 53s)
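
The general shape of this kind of change is to hand independent items to a thread pool instead of looping over them one by one. The sketch below is illustrative only: `describe_item` stands in for whatever per-paragraph or per-image work the extractor does, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def describe_item(item: str) -> str:
    """Placeholder for I/O-bound work such as a vision API call."""
    return f"processed {item}"


def process_all(items: List[str], max_workers: int = 4) -> List[str]:
    """Process items concurrently; Executor.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(describe_item, items))


if __name__ == "__main__":
    print(process_all(["paragraph 1", "image 1", "paragraph 2"]))
```

Threads suit this workload when most of the time is spent waiting on network or disk rather than on CPU. The reported figure follows from (189 − 53) / 189 ≈ 0.72.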

[0.2.3] - 2024-03-20

Changed

Implemented parallel processing for PDF extraction
Reduced processing time by ~68% (4 min → 1.3 min on a 27-page PDF)

[0.2.2] - 2024-03-20

Added

Added support for custom prompts in image description
Added support for custom prompts in file extraction

[0.2.1] - 2024-03-19

Added

Support for HTML file extraction using Playwright
Capability to handle interactive HTML pages
HTML to image conversion for consistent results

[0.2.0] - 2024-01-07

Fixed

Fixed black images produced by PDF image extraction (#11)
Added proper color space handling
Improved error handling and logging

Changed

Improved image extraction reliability
Implemented parallel processing
Enhanced error reporting
Updated documentation

[0.1.1] - 2024-01-07

Added

Initial release with PDF, DOCX, and PPTX support
Text and image extraction capabilities
Image description using Vision LLMs
Command-line interface