Testing Policy: Separation of Unit Tests and Content-Based Tests

1. Policy Overview

This policy establishes a strict separation between unit tests and content-based tests to ensure fast, reliable testing while maintaining comprehensive content validation.

1.1 Core Principles

Unit tests must always run and must never include tests that validate actual content generation or content-dependent behavior
Content-based validation must live in a separate content-test suite
Content tests should run only when new content has been generated since the last successful content-test run
The unit-test suite contains no content-based tests: separation is complete, not achieved by ignoring or filtering tests at run time
Clear project structure with explicit separation of concerns

2. Definitions

2.1 Unit Tests

Unit tests validate code logic, functions, classes, and algorithms without dependencies on:
- Actual content generation
- Real data or content files
- External systems or services
- File system operations on content
- Content-dependent business logic

Unit tests focus on:
- Code correctness
- Algorithm implementation
- Data transformation logic
- Error handling
- API contract validation

2.2 Content-Based Tests

Content-based tests validate generated content and content-dependent behavior:
- Content structure and formatting
- Content completeness and validity
- Content metadata and properties
- Content relationships and references
- Content quality metrics

2.3 New Content

New content is defined as:
- Newly generated content files since the last successful content-test run
- Modified content files that have changed since the last successful content-test run
- Content identified by content digests, timestamps, or content IDs that differ from previous runs

3. Project Structure

3.1 Directory Organization

project/
├── src/                    # Source code
├── tests/                  # Test suite root
│   ├── unit/              # Unit tests (always run)
│   │   ├── test_module1.py
│   │   └── test_module2.py
│   └── content/           # Content tests (run only on new content)
│       ├── test_content_validation.py
│       └── test_content_structure.py
├── docs/                   # Documentation
└── ci/                     # CI configuration

3.2 Test Runner Configuration

Each test suite must have its own dedicated test runner configuration.

4. Content Change Detection

4.1 Content Digests

Use cryptographic hashes to detect content changes:

# Generate one digest covering all content files.
# LC_ALL=C keeps the sort order identical across machines and locales.
find content/ -type f -exec sha256sum {} + | LC_ALL=C sort -k2 | sha256sum > .content-digest
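Where a shell pipeline is not available (for example on Windows runners), the same idea can be sketched in Python. This is a minimal illustration, not a drop-in replacement: `compute_content_digest` is a hypothetical helper name, and its output is not byte-compatible with the shell command above because it hashes relative paths and sorts them differently.

```python
import hashlib
from pathlib import Path


def compute_content_digest(content_dir: str) -> str:
    """Return one SHA-256 hex digest summarizing every file under content_dir."""
    root = Path(content_dir)
    combined = hashlib.sha256()
    # Sort by relative POSIX path so the result does not depend on traversal order.
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        file_hash = hashlib.sha256(path.read_bytes()).hexdigest()
        rel = path.relative_to(root).as_posix()
        # Feed both the path and the file hash so renames change the digest too.
        combined.update(f"{file_hash}  {rel}\n".encode("utf-8"))
    return combined.hexdigest()
```

Including the relative path in the hashed stream means that moving a file without changing its bytes still counts as a content change, which matches the "modified content" definition in section 2.3.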

4.2 Content Tracking

Maintain state to track content changes:
- .content-digest: current content hash
- .last-tested-digest: last tested content hash
- Content timestamps in metadata

4.3 Change Detection Algorithm

Generate current content digest
Compare with last tested digest
Run content tests only if digests differ
Update last tested digest on successful run
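The four steps above can be sketched as a single decision function. The names `run_content_tests_if_changed`, `state_file`, and `run_tests` are hypothetical; the sketch assumes the current digest has already been computed and that `run_tests` returns `True` when the content suite passes.

```python
from pathlib import Path
from typing import Callable


def run_content_tests_if_changed(
    current_digest: str,
    state_file: Path,
    run_tests: Callable[[], bool],
) -> str:
    """Run content tests only when current_digest differs from the stored one.

    Returns "skipped", "passed", or "failed".
    """
    last_tested = state_file.read_text().strip() if state_file.exists() else ""
    if current_digest == last_tested:
        return "skipped"  # nothing new since the last successful run
    if run_tests():
        # Record the digest only after success, so a failed run is retried.
        state_file.write_text(current_digest + "\n")
        return "passed"
    return "failed"
```

In practice `run_tests` might wrap a call such as `subprocess.run(["pytest", "tests/content/", "-v"]).returncode == 0`. Because the state file is written only on success, a failing content run leaves the old digest in place and the tests run again on the next build.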

5. Test Runner Configuration

5.1 Unit Test Runner

Unit tests must run on every build with no content dependencies:

# Pytest example
pytest tests/unit/ -v

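# Jest example (assumes a "test:unit" script in package.json)
npm run test:unit

# Maven example (pattern quoted so the shell does not expand it)
mvn test -Dtest='**/unit/**'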

5.2 Content Test Runner

Content tests run only when new content is detected:

# Pytest example
pytest tests/content/ -v

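# Jest example (assumes a "test:content" script in package.json)
npm run test:content

# Maven example (pattern quoted so the shell does not expand it)
mvn test -Dtest='**/content/**'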

5.3 Test Discovery and Marking

Mark content tests explicitly so that they can be deselected from unit runs even if one is misplaced:

Python (pytest):

# Content tests are marked explicitly
import pytest

@pytest.mark.content
def test_content_structure():
    pass

# Run only unit tests (exclude content)
pytest tests/ -v -m "not content"
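Hand-applied markers are easy to forget, so one option is to apply the marker by location instead. The following `conftest.py` is a sketch under stated assumptions: it sits at the repository root, content tests live under `tests/content/`, and pytest 7 or newer is in use (for `item.path`). `CONTENT_DIR_NAME` and `is_content_test` are hypothetical names.

```python
# conftest.py (placed at the repository root, an assumption of this sketch)
from pathlib import Path

CONTENT_DIR_NAME = "content"  # hypothetical: mirrors tests/content/


def is_content_test(test_path: Path) -> bool:
    """True when the test file lives under a tests/content/ directory."""
    parts = test_path.parts
    return any(
        a == "tests" and b == CONTENT_DIR_NAME for a, b in zip(parts, parts[1:])
    )


def pytest_configure(config):
    # Register the marker so `--strict-markers` accepts it.
    config.addinivalue_line("markers", "content: validates generated content")


def pytest_collection_modifyitems(config, items):
    # Mark by location, so a content test cannot miss its marker.
    for item in items:
        if is_content_test(Path(item.path)):
            item.add_marker("content")
```

With this in place, a run such as `pytest tests/ -m "not content"` should deselect everything under `tests/content/` even when the decorator was left off a test.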

JavaScript (Jest):

// Content tests in separate files or marked
describe('[CONTENT] Content validation', () => {
  test('validates content structure', () => {
    // content test
  });
});

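# Run only unit tests (the pattern is a regex matched against the full path,
# so a narrow pattern avoids excluding unrelated paths that contain "content")
npm test -- --testPathIgnorePatterns='/tests/content/'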

6. CI/CD Integration

6.1 Unit Test Pipeline

# Always run on every push/PR
- name: Run unit tests
  run: pytest tests/unit/ -v

6.2 Content Test Pipeline

# Run only when content changes are detected
- name: Check for content changes
  run: |
    if [ "$(cat .content-digest)" != "$(cat .last-tested-digest 2>/dev/null || echo '')" ]; then
      echo "CONTENT_CHANGED=true" >> $GITHUB_ENV
    fi

- name: Run content tests
  if: env.CONTENT_CHANGED == 'true'
  run: pytest tests/content/ -v

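# Record the digest only after content tests actually ran and passed
- name: Update content digest
  if: success() && env.CONTENT_CHANGED == 'true'
  run: cp .content-digest .last-tested-digest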

7. Implementation Process

7.1 Migration Steps

Identify existing content-based tests in the unit test suite
Move content tests to the tests/content/ directory
Add content markers to content tests
Update CI/CD to exclude content tests from unit runs
Implement the content change detection mechanism
Validate the separation with the audit process

7.2 Migration Example

Before (mixed tests):

# tests/test_articles.py
def test_article_processing_logic():  # UNIT TEST
    # Code logic validation
    pass

def test_article_has_title():  # CONTENT TEST - WRONG LOCATION
    # Content validation
    pass

After (separated tests):

# tests/unit/test_articles.py
def test_article_processing_logic():  # UNIT TEST
    # Code logic validation
    pass

# tests/content/test_article_validation.py
import pytest

@pytest.mark.content
def test_article_has_title():  # CONTENT TEST - CORRECT LOCATION
    # Content validation
    pass

8. Validation Checklist

8.1 Unit Test Validation

Unit tests contain no content-based checks
Unit tests run independently of content files
Unit tests complete in < 5 minutes
Unit tests have no external dependencies on content
Unit tests pass without content directories present

8.2 Content Test Validation

Content tests are skipped unless there is new content
Content tests validate actual content properties
Content tests map to specific content items
Content tests use content change detection
Content tests report results per content item
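One way to make content tests map to individual items and report per item is pytest's `pytest_generate_tests` hook, which can create one test case per content file. The sketch below assumes Markdown content under a `content/` directory; `CONTENT_ROOT`, `discover_content_items`, and `find_problems` are hypothetical names, and the two checks are placeholders for real validation rules. The `@pytest.mark.content` decorator is omitted to keep the sketch import-free; a real suite would add it.

```python
# tests/content/test_content_items.py (hypothetical file name)
from pathlib import Path

CONTENT_ROOT = Path("content")  # assumption: generated content lives here


def discover_content_items(root: Path) -> list[Path]:
    """Return every Markdown file under root in a stable order."""
    return sorted(root.rglob("*.md")) if root.is_dir() else []


def find_problems(text: str) -> list[str]:
    """Illustrative checks only: non-empty body and a top-level title."""
    problems = []
    if not text.strip():
        problems.append("content is empty")
    elif not text.lstrip().startswith("# "):
        problems.append("missing top-level title")
    return problems


def pytest_generate_tests(metafunc):
    # One test case per content item, identified by its relative path.
    if "content_path" in metafunc.fixturenames:
        items = discover_content_items(CONTENT_ROOT)
        ids = [p.relative_to(CONTENT_ROOT).as_posix() for p in items]
        metafunc.parametrize("content_path", items, ids=ids)


def test_content_item_is_valid(content_path):
    problems = find_problems(content_path.read_text(encoding="utf-8"))
    assert not problems, f"{content_path}: {problems}"
```

Each content file then appears as its own entry in the test report, named after its relative path. When no content files are present the parameter list is empty, and pytest by default reports the test as skipped rather than failed.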

8.3 CI/CD Validation

CI fails if policy is violated (content tests in unit runs)
Unit tests run on every push/PR
Content tests run only on content changes
Content change detection is reliable
Test results are reported separately

9. Audit and Adjustment Process

9.1 Regular Audits

Monthly review of test categorization
Verify no content tests have leaked into unit suite
Check content change detection accuracy
Review test performance metrics

9.2 Policy Adjustment

Document policy violations and root causes
Update policy for evolving content patterns
Refine content change detection mechanisms
Adjust test organization as project grows

10. Error Handling and Edge Cases

10.1 Content Test Failures

Content test failures should not block the unit test pipeline
Content test failures should trigger content regeneration
Content test failures should provide detailed content-specific reports

10.2 False Positives/Negatives

Implement content digest verification
Use multiple change detection methods
Provide manual override for content test runs

10.3 Missing Content

Handle missing expected content gracefully
Provide clear error messages for content dependencies
Skip content tests when no content is present