Testing Policy: Separation of Unit Tests and Content-Based Tests

1. Policy Overview

This policy establishes a strict separation between unit tests and content-based tests to ensure fast, reliable testing while maintaining comprehensive content validation.

1.1 Core Principles

  1. Unit tests must always run and must never include tests that validate actual content generation or content-dependent behavior
  2. Content-based validation must live in a separate content-test suite
  3. Content tests should run only when new content has been generated since the last successful content-test run
  4. The unit-test suite must contain no content-based tests: the requirement is complete separation, not ignoring or filtering tests at run time
  5. Clear project structure with explicit separation of concerns

2. Definitions

2.1 Unit Tests

Unit tests validate code logic, functions, classes, and algorithms without dependencies on:

  • Actual content generation
  • Real data or content files
  • External systems or services
  • File system operations on content
  • Content-dependent business logic

Unit tests focus on:

  • Code correctness
  • Algorithm implementation
  • Data transformation logic
  • Error handling
  • API contract validation

2.2 Content-Based Tests

Content-based tests validate generated content and content-dependent behavior:

  • Content structure and formatting
  • Content completeness and validity
  • Content metadata and properties
  • Content relationships and references
  • Content quality metrics

2.3 New Content

New content is defined as:

  • Newly generated content files since the last successful content-test run
  • Modified content files that have changed since the last successful content-test run
  • Content identified by content digests, timestamps, or content IDs that differ from previous runs
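One way to apply this definition is to keep a per-file digest manifest and compare it with the manifest saved after the last successful content-test run. The sketch below is illustrative only: `file_digests` and `new_content` are hypothetical names, and it assumes content is stored as ordinary files under a single directory.

```python
import hashlib
from pathlib import Path


def file_digests(content_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to content_dir) to its SHA-256 hex digest."""
    return {
        path.relative_to(content_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(content_dir.rglob("*"))
        if path.is_file()
    }


def new_content(current: dict[str, str], last_tested: dict[str, str]) -> list[str]:
    """Return paths that are new or modified relative to the last tested manifest."""
    return sorted(
        path
        for path, digest in current.items()
        if last_tested.get(path) != digest  # absent before, or digest differs
    )
```

A file that did not exist in the previous manifest and a file whose digest changed are both reported, which matches the first two bullets above. Deleted files are not reported here, since there is nothing left to test.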

3. Project Structure

3.1 Directory Organization

project/
├── src/                    # Source code
├── tests/                  # Test suite root
│   ├── unit/              # Unit tests (always run)
│   │   ├── test_module1.py
│   │   └── test_module2.py
│   └── content/           # Content tests (run only on new content)
│       ├── test_content_validation.py
│       └── test_content_structure.py
├── docs/                   # Documentation
└── ci/                     # CI configuration

3.2 Test Runner Configuration

Each test suite must have its own dedicated test runner configuration.

4. Content Change Detection

4.1 Content Digests

Use cryptographic hashes to detect content changes:

# Generate content digest for all content files
# (LC_ALL=C keeps the sort order identical across machines and locales)
find content/ -type f -exec sha256sum {} \; | LC_ALL=C sort -k2 | sha256sum > .content-digest

4.2 Content Tracking

Maintain state to track content changes:

  • .content-digest: current content hash
  • .last-tested-digest: last tested content hash
  • Content timestamps in metadata

4.3 Change Detection Algorithm

  1. Generate current content digest
  2. Compare with last tested digest
  3. Run content tests only if digests differ
  4. Update last tested digest on successful run
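The four steps above could be implemented roughly as follows. This is a minimal sketch, not the project's actual tooling: `content_digest` and `run_if_changed` are hypothetical names, `run_tests` stands in for whatever launches the content suite, and the digest computed here is not byte-compatible with the shell pipeline in 4.1.

```python
import hashlib
from pathlib import Path
from typing import Callable


def content_digest(content_dir: Path) -> str:
    """Hash every file's relative path and bytes into one SHA-256 digest."""
    digest = hashlib.sha256()
    for path in sorted(p for p in content_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(content_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separator between path and file hash
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


def run_if_changed(content_dir: Path, state_file: Path, run_tests: Callable[[], bool]) -> str:
    """Apply steps 1-4 and return 'skipped', 'passed', or 'failed'.

    state_file must live outside content_dir, or it would change the digest.
    """
    current = content_digest(content_dir)  # step 1
    last = state_file.read_text().strip() if state_file.exists() else ""  # step 2
    if current == last:
        return "skipped"  # step 3: digests match, nothing new to test
    if not run_tests():
        return "failed"  # digest is not updated, so the next run retries
    state_file.write_text(current + "\n")  # step 4
    return "passed"
```

Because the stored digest is written only after a passing run, a failed content-test run leaves the state untouched and the same content is tested again next time.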

5. Test Runner Configuration

5.1 Unit Test Runner

Unit tests must run on every build with no content dependencies:

# Pytest example
pytest tests/unit/ -v

# Jest example (assumes a "test:unit" script is defined in package.json)
npm run test:unit

# Maven example (pattern quoted so the shell does not expand it)
mvn test -Dtest='**/unit/**'

5.2 Content Test Runner

Content tests run only when new content is detected:

# Pytest example
pytest tests/content/ -v

# Jest example (assumes a "test:content" script is defined in package.json)
npm run test:content

# Maven example (pattern quoted so the shell does not expand it)
mvn test -Dtest='**/content/**'

5.3 Test Discovery and Marking

Mark content tests explicitly so that they can be excluded from any run that targets the whole tests/ tree:

Python (pytest):

import pytest

# Content tests marked explicitly
@pytest.mark.content
def test_content_structure():
    pass

# Run only unit tests (exclude content)
pytest tests/ -v -m "not content"
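Markers applied by hand are easy to forget, so a conftest.py can also apply the marker by location. The following is a sketch under stated assumptions: pytest 7 or newer (for `item.path` and `config.rootpath`), a tests/ directory directly under the pytest root directory, and a hypothetical helper named `is_content_test`.

```python
# tests/conftest.py (hypothetical location)
from pathlib import Path

# Assumption: content tests live under tests/content/.
CONTENT_DIR_NAME = "content"


def is_content_test(test_path: Path, tests_root: Path) -> bool:
    """Return True when test_path lies inside <tests_root>/content/."""
    return (tests_root / CONTENT_DIR_NAME) in test_path.parents


def pytest_configure(config):
    # Register the marker so pytest does not report it as unknown.
    config.addinivalue_line("markers", "content: validates generated content")


def pytest_collection_modifyitems(config, items):
    # Mark by location, so a missing decorator cannot hide a content test
    # from the -m "not content" filter.
    tests_root = Path(config.rootpath) / "tests"
    for item in items:
        if is_content_test(Path(item.path), tests_root):
            item.add_marker("content")
```

With this in place, `-m "not content"` deselects everything under tests/content/ whether or not the individual test carries the decorator.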

JavaScript (Jest):

// Content tests in separate files or marked
describe('[CONTENT] Content validation', () => {
  test('validates content structure', () => {
    // content test
  });
});

// Run only unit tests (exclude content)
npm test -- --testPathIgnorePatterns=content

6. CI/CD Integration

6.1 Unit Test Pipeline

# Always run on every push/PR
- name: Run unit tests
  run: pytest tests/unit/ -v

6.2 Content Test Pipeline

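# Run only when content changes are detected.
# Assumes .content-digest was generated in an earlier step (see 4.1) and that
# .last-tested-digest is restored from a cache or artifact, because files
# written on an ephemeral CI runner do not persist between runs.
- name: Check for content changes
  run: |
    if [ "$(cat .content-digest)" != "$(cat .last-tested-digest 2>/dev/null || echo '')" ]; then
      echo "CONTENT_CHANGED=true" >> "$GITHUB_ENV"
    fi

- name: Run content tests
  if: env.CONTENT_CHANGED == 'true'
  run: pytest tests/content/ -v

# Record the digest only after the content tests actually ran and passed
- name: Update content digest
  if: success() && env.CONTENT_CHANGED == 'true'
  run: cp .content-digest .last-tested-digest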

7. Implementation Process

7.1 Migration Steps

  1. Identify existing content-based tests in the unit-test suite
  2. Move content tests to the tests/content/ directory
  3. Add content markers to content tests
  4. Update CI/CD to exclude content tests from unit runs
  5. Implement the content change detection mechanism
  6. Validate the separation with the audit process

7.2 Migration Example

Before (mixed tests):

# tests/test_articles.py
def test_article_processing_logic():  # UNIT TEST
    # Code logic validation
    pass

def test_article_has_title():  # CONTENT TEST - WRONG LOCATION
    # Content validation
    pass

After (separated tests):

# tests/unit/test_articles.py
def test_article_processing_logic():  # UNIT TEST
    # Code logic validation
    pass

# tests/content/test_article_validation.py
import pytest

@pytest.mark.content
def test_article_has_title():  # CONTENT TEST - CORRECT LOCATION
    # Content validation
    pass

8. Validation Checklist

8.1 Unit Test Validation

  • Unit tests contain no content-based checks
  • Unit tests run independently of content files
  • Unit tests complete in < 5 minutes
  • Unit tests have no external dependencies on content
  • Unit tests pass without content directories present

8.2 Content Test Validation

  • Content tests are skipped unless there is new content
  • Content tests validate actual content properties
  • Content tests map to specific content items
  • Content tests use content change detection
  • Content tests report results per content item
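Mapping tests to content items and reporting per item can be approached by parametrizing over the content files. The sketch below is one possible shape, not a prescribed implementation: it assumes content is stored as Markdown files under content/, and `content_files`, `has_title`, and the title rule itself are hypothetical. The `content` marker from 5.3 is omitted for brevity.

```python
# tests/content/test_content_structure.py (illustrative)
from pathlib import Path

CONTENT_DIR = Path("content")  # assumption: generated content lives here


def content_files(content_dir: Path) -> list[Path]:
    """List content items in a stable order so test IDs do not shuffle."""
    return sorted(p for p in content_dir.rglob("*.md") if p.is_file())


def has_title(text: str) -> bool:
    """Hypothetical rule: the first non-empty line is a level-1 Markdown heading."""
    for line in text.splitlines():
        if line.strip():
            return line.startswith("# ")
    return False


def pytest_generate_tests(metafunc):
    # Any test that requests `content_file` runs once per content item,
    # and each item shows up under its own test ID in the report.
    if "content_file" in metafunc.fixturenames:
        files = content_files(CONTENT_DIR)
        metafunc.parametrize("content_file", files, ids=[f.as_posix() for f in files])


def test_content_has_title(content_file):
    text = content_file.read_text(encoding="utf-8")
    assert has_title(text), f"{content_file} has no title"
```

Each content file produces its own test result, so a failure names the exact item that needs attention.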

8.3 CI/CD Validation

  • CI fails if policy is violated (content tests in unit runs)
  • Unit tests run on every push/PR
  • Content tests run only on content changes
  • Content change detection is reliable
  • Test results are reported separately

9. Audit and Adjustment Process

9.1 Regular Audits

  1. Monthly review of test categorization
  2. Verify no content tests have leaked into unit suite
  3. Check content change detection accuracy
  4. Review test performance metrics

9.2 Policy Adjustment

  1. Document policy violations and root causes
  2. Update policy for evolving content patterns
  3. Refine content change detection mechanisms
  4. Adjust test organization as project grows

10. Error Handling and Edge Cases

10.1 Content Test Failures

  • Content test failures should not block the unit-test pipeline
  • Content test failures should trigger content regeneration
  • Content test failures should provide detailed content-specific reports

10.2 False Positives/Negatives

  • Implement content digest verification
  • Use multiple change detection methods
  • Provide manual override for content test runs
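A manual override can be combined with the digest comparison in a single decision function. The sketch below is an assumption-laden illustration: `should_run_content_tests` and the environment variable `FORCE_CONTENT_TESTS` are hypothetical names, not part of any existing tool.

```python
import os
from typing import Mapping, Optional


def should_run_content_tests(
    current_digest: str,
    last_tested_digest: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> tuple[bool, str]:
    """Return (run, reason) so the CI log shows why the decision was made."""
    env = os.environ if env is None else env
    # FORCE_CONTENT_TESTS is a hypothetical variable name for the manual override.
    if env.get("FORCE_CONTENT_TESTS") == "1":
        return True, "manual override requested"
    if not last_tested_digest:
        # No stored state: run rather than risk a false negative.
        return True, "no record of a previous successful run"
    if current_digest != last_tested_digest:
        return True, "content digest changed"
    return False, "content unchanged since last successful run"
```

Returning the reason alongside the decision makes false positives and false negatives easier to diagnose from the pipeline output.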

10.3 Missing Content

  • Handle missing expected content gracefully
  • Provide clear error messages for content dependencies
  • Skip content tests when no content is present
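These three cases can be separated by a small pre-check that runs before the content suite. The following is a minimal sketch, assuming content lives in one directory and that the caller supplies the list of expected file names; `check_content` is a hypothetical helper.

```python
from pathlib import Path


def check_content(content_dir: Path, expected: list[str]) -> tuple[str, str]:
    """Return (action, message), where action is 'skip', 'fail', or 'run'."""
    has_files = content_dir.is_dir() and any(p.is_file() for p in content_dir.rglob("*"))
    if not has_files:
        # No content at all: nothing to validate, so skip rather than fail.
        return "skip", f"no content found under {content_dir}; content tests skipped"
    missing = [name for name in expected if not (content_dir / name).is_file()]
    if missing:
        # Some content exists but expected items are absent: report them by name.
        return "fail", f"expected content missing under {content_dir}: {', '.join(missing)}"
    return "run", "all expected content present"
```

In a pytest suite, a 'skip' result could be passed to `pytest.skip(message, allow_module_level=True)` at the top of a content test module, while a 'fail' result would be raised as an ordinary assertion error carrying the message.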