Testing Policy: Separation of Unit Tests and Content-Based Tests
1. Policy Overview
This policy establishes a strict separation between unit tests and content-based tests to ensure fast, reliable testing while maintaining comprehensive content validation.
1.1 Core Principles
- Unit tests must always run and must never include tests that validate actual content generation or content-dependent behavior
- Content-based validation must live in a separate content-test suite
- Content tests should run only when new content has been generated since the last successful content-test run
- The unit-test suite contains no content-based tests: separation is complete and does not rely on ignoring or filtering tests at run time
- Clear project structure with explicit separation of concerns
2. Definitions
2.1 Unit Tests
Unit tests validate code logic, functions, classes, and algorithms without dependencies on:
- Actual content generation
- Real data or content files
- External systems or services
- File system operations on content
- Content-dependent business logic
Unit tests focus on:
- Code correctness
- Algorithm implementation
- Data transformation logic
- Error handling
- API contract validation
2.2 Content-Based Tests
Content-based tests validate generated content and content-dependent behavior:
- Content structure and formatting
- Content completeness and validity
- Content metadata and properties
- Content relationships and references
- Content quality metrics
2.3 New Content
New content is defined as:
- Newly generated content files since the last successful content-test run
- Modified content files that have changed since the last successful content-test run
- Content identified by content digests, timestamps, or content IDs that differ from previous runs
3. Project Structure
3.1 Directory Organization
project/
├── src/ # Source code
├── tests/ # Test suite root
│ ├── unit/ # Unit tests (always run)
│ │ ├── test_module1.py
│ │ └── test_module2.py
│ └── content/ # Content tests (run only on new content)
│ ├── test_content_validation.py
│ └── test_content_structure.py
├── docs/ # Documentation
└── ci/ # CI configuration
3.2 Test Runner Configuration
Each test suite must have its own dedicated test runner configuration.
4. Content Change Detection
4.1 Content Digests
Use cryptographic hashes to detect content changes:
# Generate content digest for all content files
find content/ -type f -exec sha256sum {} \; | sort -k2 | sha256sum > .content-digest
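The same idea can be sketched in Python for projects that prefer not to depend on shell utilities. This is a minimal sketch, not the policy's required implementation: the function name `compute_content_digest` is hypothetical, and the resulting value is not byte-compatible with the shell pipeline above (paths and sort order are handled differently), so a project should pick one method and use it consistently.

```python
import hashlib
from pathlib import Path


def compute_content_digest(content_root: str) -> str:
    """Return one SHA-256 hex digest covering every file under content_root.

    Hypothetical helper: hashes each file, then hashes the sorted list of
    "<file hash>  <relative path>" lines, mirroring the shell pipeline's idea.
    """
    root = Path(content_root)
    combined = hashlib.sha256()
    # Sort so the digest does not depend on directory traversal order.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    for path in files:
        file_hash = hashlib.sha256(path.read_bytes()).hexdigest()
        rel = path.relative_to(root).as_posix()
        combined.update(f"{file_hash}  {rel}\n".encode("utf-8"))
    return combined.hexdigest()
```

Because relative paths are part of the hashed input, renaming a file changes the digest even when its bytes stay the same.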
4.2 Content Tracking
Maintain state to track content changes:
- .content-digest - Current content hash
- .last-tested-digest - Last tested content hash
- Content timestamps in metadata
4.3 Change Detection Algorithm
- Generate current content digest
- Compare with last tested digest
- Run content tests only if digests differ
- Update last tested digest on successful run
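The four steps above can be sketched as a small Python function. It assumes the current digest has already been computed and that the last tested digest is stored in a plain-text state file such as `.last-tested-digest`. The names `run_content_tests_if_changed` and `run_pytest_content_suite` are illustrative, not part of any tool.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable


def run_content_tests_if_changed(
    current_digest: str,
    state_file: Path,
    run_tests: Callable[[], bool],
) -> str:
    """Run content tests only when the digest changed; report what happened."""
    # Step 2: compare with the last tested digest (empty if never tested).
    last = state_file.read_text().strip() if state_file.exists() else ""
    if current_digest == last:
        return "skipped"
    # Step 3: digests differ, so run the content suite.
    if not run_tests():
        return "failed"  # state file is left untouched on failure
    # Step 4: record the digest only after a successful run.
    state_file.write_text(current_digest + "\n")
    return "passed"


def run_pytest_content_suite() -> bool:
    """Hypothetical runner: invoke pytest on tests/content/ in a subprocess."""
    result = subprocess.run([sys.executable, "-m", "pytest", "tests/content/", "-v"])
    return result.returncode == 0
```

Passing the runner as a callable keeps the decision logic free of content dependencies, so it can itself be covered by a unit test.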
5. Test Runner Configuration
5.1 Unit Test Runner
Unit tests must run on every build with no content dependencies:
# Pytest example
pytest tests/unit/ -v
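# Jest example (assumes a "test:unit" script in package.json)
npm run test:unit
# Maven example (pattern quoted so the shell does not expand it)
mvn test -Dtest='**/unit/**'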
5.2 Content Test Runner
Content tests run only when new content is detected:
# Pytest example
pytest tests/content/ -v
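# Jest example (assumes a "test:content" script in package.json)
npm run test:content
# Maven example (pattern quoted so the shell does not expand it)
mvn test -Dtest='**/content/**'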
5.3 Test Discovery and Marking
Use explicit test marking so that content tests are deselected if a run targets the whole tests/ directory:
Python (pytest):
import pytest

# Content tests marked explicitly
@pytest.mark.content
def test_content_structure():
    pass

# Run only unit tests (exclude content)
pytest tests/ -v -m "not content"
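Custom markers are normally registered so that pytest does not warn about them (or reject them under `--strict-markers`). The sketch below shows one way to do that in a `conftest.py`, and additionally applies the marker by file location so that a forgotten decorator cannot let a content test slip into a unit run. The file location, the helper name `is_content_test`, and the marker description are assumptions; `item.path` requires pytest 7 or later.

```python
# conftest.py at the repository root (location is an assumption)
from pathlib import Path

CONTENT_DIR_NAME = "content"  # assumed: content tests live under tests/content/


def pytest_configure(config):
    # Register the marker so `--strict-markers` does not reject it.
    config.addinivalue_line(
        "markers", "content: validates generated content; runs only on new content"
    )


def is_content_test(test_path: Path) -> bool:
    """True when the test file sits under a tests/content/ directory."""
    parts = test_path.parts
    return any(
        a == "tests" and b == CONTENT_DIR_NAME for a, b in zip(parts, parts[1:])
    )


def pytest_collection_modifyitems(config, items):
    # Mark by location, in addition to any explicit @pytest.mark.content.
    for item in items:
        if is_content_test(Path(str(item.path))):  # item.path needs pytest >= 7
            item.add_marker("content")
```

With this in place, the `-m "not content"` selection above no longer depends on every content test carrying the decorator.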
JavaScript (Jest):
// Content tests in separate files or marked
describe('[CONTENT] Content validation', () => {
  test('validates content structure', () => {
    // content test
  });
});
// Run only unit tests (exclude content)
npm test -- --testPathIgnorePatterns=content
6. CI/CD Integration
6.1 Unit Test Pipeline
6.2 Content Test Pipeline
# Run only when content changes are detected
- name: Check for content changes
  run: |
    if [ "$(cat .content-digest)" != "$(cat .last-tested-digest 2>/dev/null || echo '')" ]; then
      echo "CONTENT_CHANGED=true" >> "$GITHUB_ENV"
    fi
- name: Run content tests
  if: env.CONTENT_CHANGED == 'true'
  run: pytest tests/content/ -v
- name: Update content digest
  if: success()
  run: cp .content-digest .last-tested-digest
7. Implementation Process
7.1 Migration Steps
- Identify existing content-based tests in the unit test suite
- Move content tests to the tests/content/ directory
- Add content markers to content tests
- Update CI/CD to exclude content tests from unit runs
- Implement content change detection mechanism
- Validate separation with audit process
7.2 Migration Example
Before (mixed tests):
# tests/test_articles.py
def test_article_processing_logic():  # UNIT TEST
    # Code logic validation
    pass

def test_article_has_title():  # CONTENT TEST - WRONG LOCATION
    # Content validation
    pass

After (separated tests):
# tests/unit/test_articles.py
def test_article_processing_logic():  # UNIT TEST
    # Code logic validation
    pass

# tests/content/test_article_validation.py
import pytest

@pytest.mark.content
def test_article_has_title():  # CONTENT TEST - CORRECT LOCATION
    # Content validation
    pass
8. Validation Checklist
8.1 Unit Test Validation
- Unit tests contain no content-based checks
- Unit tests run independently of content files
- Unit tests complete in < 5 minutes
- Unit tests have no external dependencies on content
- Unit tests pass without content directories present
8.2 Content Test Validation
- Content tests are skipped unless there is new content
- Content tests validate actual content properties
- Content tests map to specific content items
- Content tests use content change detection
- Content tests report results per content item
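One way to make each content test map to a specific content item, and to get one reported result per item, is to parametrize tests over the content files. The sketch below uses pytest's `pytest_generate_tests` hook. The content location, the `**/*.md` pattern, and the fixture name `content_path` are assumptions chosen for illustration.

```python
# tests/content/conftest.py (hypothetical location)
from pathlib import Path
from typing import List

CONTENT_ROOT = Path("content")  # assumed location of generated content
CONTENT_GLOB = "**/*.md"        # assumed content file pattern


def list_content_items(root: Path = CONTENT_ROOT) -> List[Path]:
    """Return content files in a stable order."""
    return sorted(p for p in root.glob(CONTENT_GLOB) if p.is_file())


def pytest_generate_tests(metafunc):
    # Any test that asks for `content_path` runs once per content file,
    # so the report lists one result per content item.
    if "content_path" in metafunc.fixturenames:
        items = list_content_items()
        metafunc.parametrize(
            "content_path", items, ids=[p.as_posix() for p in items]
        )
```

A test written as `def test_article_has_title(content_path): ...` would then be reported once per file, with the file path as its test ID. When no files match, pytest's default handling of an empty parameter set is to skip the test.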
8.3 CI/CD Validation
- CI fails if policy is violated (content tests in unit runs)
- Unit tests run on every push/PR
- Content tests run only on content changes
- Content change detection is reliable
- Test results are reported separately
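The first item, failing CI when a content test appears in a unit run, could be enforced at collection time. The following is a sketch under assumptions: it lives in a root-level `conftest.py`, content tests carry a `content` marker, and unit tests live under `tests/unit/`. The helper name `find_leaked_content_tests` is hypothetical. Raising `pytest.UsageError` during collection should end the run with a non-zero exit code.

```python
# conftest.py at the repository root (location is an assumption)
from pathlib import Path
from typing import Iterable, List


def find_leaked_content_tests(items: Iterable) -> List[str]:
    """Return node IDs of tests under tests/unit/ that carry the content marker."""
    leaked = []
    for item in items:
        parts = Path(str(item.path)).parts  # item.path needs pytest >= 7
        in_unit_suite = any(
            a == "tests" and b == "unit" for a, b in zip(parts, parts[1:])
        )
        if in_unit_suite and item.get_closest_marker("content") is not None:
            leaked.append(item.nodeid)
    return leaked


def pytest_collection_modifyitems(config, items):
    import pytest  # local import keeps the helper above importable without pytest

    leaked = find_leaked_content_tests(items)
    if leaked:
        raise pytest.UsageError(
            "content tests found in the unit suite:\n  " + "\n  ".join(leaked)
        )
```

This only catches tests that are marked. An unmarked content check in `tests/unit/` still needs the audit process described in section 9.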
9. Audit and Adjustment Process
9.1 Regular Audits
- Monthly review of test categorization
- Verify no content tests have leaked into unit suite
- Check content change detection accuracy
- Review test performance metrics
9.2 Policy Adjustment
- Document policy violations and root causes
- Update policy for evolving content patterns
- Refine content change detection mechanisms
- Adjust test organization as project grows
10. Error Handling and Edge Cases
10.1 Content Test Failures
- Content test failures should not block the unit test pipeline
- Content test failures should trigger content regeneration
- Content test failures should provide detailed content-specific reports
10.2 False Positives/Negatives
- Implement content digest verification
- Use multiple change detection methods
- Provide manual override for content test runs
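These measures can be combined in a single decision function. The sketch below is one possible arrangement: a manual override is checked first, the digest comparison is the primary signal, and file modification times act as a second, more conservative signal. The environment variable `FORCE_CONTENT_TESTS` and the function name are hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple


def should_run_content_tests(
    current_digest: str,
    last_tested_digest: Optional[str],
    newest_content_mtime: float,
    last_run_time: float,
    env: Mapping[str, str] = os.environ,
) -> Tuple[bool, str]:
    """Decide whether to run content tests, and give the reason."""
    # Manual override: FORCE_CONTENT_TESTS is a hypothetical variable name.
    if env.get("FORCE_CONTENT_TESTS") == "1":
        return True, "manual override"
    # Primary signal: the digest differs from the last tested one.
    if current_digest != last_tested_digest:
        return True, "digest changed"
    # Secondary signal: a file was touched after the last successful run.
    if newest_content_mtime > last_run_time:
        return True, "content modified after last run"
    return False, "no change detected"
```

Returning the reason along with the decision makes it easier to diagnose unexpected runs or skips in CI logs.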
10.3 Missing Content
- Handle gracefully when expected content is missing
- Provide clear error messages for content dependencies
- Skip content tests when no content is present
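A small pair of helpers can separate the two cases: content that is entirely absent (skip the suite) and content that is partly missing (report exactly which items are missing). This is a sketch under assumptions, and both function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def skip_reason(content_root: Path) -> Optional[str]:
    """Return why content tests should be skipped, or None if content exists."""
    if not content_root.is_dir():
        return f"content directory '{content_root}' does not exist"
    if not any(p.is_file() for p in content_root.rglob("*")):
        return f"content directory '{content_root}' contains no files"
    return None


def missing_content(content_root: Path, expected: Iterable[str]) -> List[str]:
    """Return the expected relative paths that are not present as files."""
    return [rel for rel in expected if not (content_root / rel).is_file()]
```

In a content test module, a non-`None` result from `skip_reason` could be passed to `pytest.skip(reason, allow_module_level=True)`, while a non-empty result from `missing_content` could be reported as a test failure that names each missing path.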