SearchMuse Contributing Guide¶

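Welcome! This guide explains how to contribute code, documentation, and improvements to SearchMuse. Contributions are expected to follow the standards described below: test-driven development, ruff formatting and linting, strict mypy type checking, and at least 80% test coverage.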

Code of Conduct¶

This project adheres to a Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to the maintainers.

How to Contribute¶

Reporting Bugs¶

- Check existing issues to avoid duplicates
- Provide details:
  - SearchMuse version
  - Python version
  - Operating system
  - Steps to reproduce
  - Expected vs. actual behavior
- Include logs: error messages and stack traces
- Include example code: a minimal reproduction case
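
The version details requested above can be gathered with a short script. This is a minimal sketch using only the standard library; the distribution name `searchmuse` and the helper name `collect_environment` are assumptions, not part of the project.

```python
"""Print environment details to paste into a bug report."""
import platform
from importlib import metadata


def collect_environment(package: str = "searchmuse") -> dict[str, str]:
    """Return the version details the bug report checklist asks for."""
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The distribution name is an assumption; say so if it is absent.
        package_version = "not installed"
    return {
        "searchmuse_version": package_version,
        "python_version": platform.python_version(),
        "operating_system": f"{platform.system()} {platform.release()}",
    }


for name, value in collect_environment().items():
    print(f"{name}: {value}")
```

Paste the printed lines into the issue together with the steps to reproduce.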

Suggesting Enhancements¶

- Describe the feature: what problem does it solve?
- Provide examples: how would users interact with it?
- Discuss implementation: what is the technical approach?
- Check impact: does it affect other components?

Contributing Code¶

Our contribution workflow:

Fork the repository on GitHub
Create a feature branch from main
Write tests first (TDD approach)
Implement the feature
Pass all checks (tests, linting, typing)
Create a pull request with a clear description
Address review feedback
Merge to main

Development Workflow¶

Step 1: Fork and Clone¶

# Fork on GitHub (click "Fork" button)

# Clone your fork
git clone https://github.com/yourusername/searchmuse.git
cd searchmuse

# Add upstream remote
git remote add upstream https://github.com/originalorg/searchmuse.git

# Create feature branch
git checkout -b feature/my-feature

Step 2: Set Up Development Environment¶

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Verify setup
pytest --version
mypy --version
ruff --version

See Development Setup for detailed instructions.

Step 3: Write Tests First (TDD)¶

Always write tests before implementation:

# tests/domain/test_my_feature.py
import pytest
from searchmuse.domain import MyNewClass
# Assumed import path; use wherever the project defines ValidationError.
from searchmuse.domain.errors import ValidationError

class TestMyFeature:
    def test_basic_functionality(self) -> None:
        """MyNewClass should format its parameter."""
        obj = MyNewClass(param="value")
        result = obj.some_method()
        assert result == "Result: value"

    def test_error_handling(self) -> None:
        """MyNewClass should raise an error on empty input."""
        with pytest.raises(ValidationError):
            MyNewClass(param="").some_method()

Run tests to verify they fail:

pytest tests/domain/test_my_feature.py -v
# Should fail (an import error counts, since MyNewClass does not exist yet) - this is expected (RED phase)

Step 4: Implement the Feature¶

# src/searchmuse/domain/my_feature.py
from dataclasses import dataclass

# Assumed import path; use wherever the project defines ValidationError.
from searchmuse.domain.errors import ValidationError

@dataclass(frozen=True)
class MyNewClass:
    param: str

    def some_method(self) -> str:
        """Implement feature logic."""
        if not self.param:
            raise ValidationError("param", "Must be non-empty")
        return f"Result: {self.param}"

Run tests to verify they pass:

pytest tests/domain/test_my_feature.py -v
# Should show PASSED (GREEN phase)

Step 5: Refactor¶

Improve code while keeping tests passing:

# Optimize, extract methods, improve naming (imports unchanged from Step 4)
@dataclass(frozen=True)
class MyNewClass:
    param: str

    def __post_init__(self) -> None:
        # Validate once at construction so an invalid object never exists
        self._validate()

    def _validate(self) -> None:
        if not self.param:
            raise ValidationError("param", "Must be non-empty")

    def some_method(self) -> str:
        """Process the parameter."""
        return f"Result: {self.param}"

Run tests again:

pytest tests/domain/test_my_feature.py -v
# Should still PASS (REFACTOR phase complete)
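
The steps above raise `ValidationError` with a field name and a message, but its definition is not shown. The sketch below is one way it could look, paired with the refactored class so the whole cycle runs in a single file. The base class (`ValueError`) and the attribute names `field` and `message` are assumptions, not SearchMuse's confirmed API.

```python
from dataclasses import dataclass


class ValidationError(ValueError):
    """Hypothetical error raised when a domain object gets an invalid value."""

    def __init__(self, field: str, message: str) -> None:
        super().__init__(f"{field}: {message}")
        self.field = field
        self.message = message


@dataclass(frozen=True)
class MyNewClass:
    param: str

    def __post_init__(self) -> None:
        # Reject invalid input at construction time.
        if not self.param:
            raise ValidationError("param", "Must be non-empty")

    def some_method(self) -> str:
        return f"Result: {self.param}"
```

Storing the field name separately from the message lets callers report which input failed without parsing the error text.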

Code Standards¶

Style and Formatting¶

SearchMuse uses ruff for code formatting and linting.

# Format code automatically
ruff format src/ tests/

# Check for linting issues
ruff check src/ tests/

# Fix some issues automatically
ruff check --fix src/ tests/

Configuration in pyproject.toml:

[tool.ruff]
line-length = 100
target-version = "py311"

[tool.ruff.lint]
select = ["E", "F", "W", "I", "N", "UP"]

Type Checking¶

All code must pass mypy strict mode:

mypy src/ tests/ --strict

# Check specific file
mypy src/searchmuse/domain/my_feature.py --strict

Configuration in pyproject.toml:

[tool.mypy]
python_version = "3.11"
strict = true
warn_unused_ignores = true

Immutability¶

Critical: All domain objects must be immutable frozen dataclasses:

# Correct - frozen dataclass
@dataclass(frozen=True)
class MyClass:
    field: str

# Wrong - mutable class
class MyClass:
    def __init__(self, field: str):
        self.field = field

No mutation of existing objects:

# Wrong
def update_source(source: Source, title: str) -> None:
    source.title = title  # Mutation!

# Correct - return new object
def update_source(source: Source, title: str) -> Source:
    return Source(
        url=source.url,
        title=title,
        summary=source.summary,
        # ... other fields
    )
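
Copying every field by hand becomes error-prone as a class grows. The standard library's `dataclasses.replace` builds the new object for you. A minimal sketch, assuming a `Source` with only the three fields named above (the real class may have more):

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Source:
    # Field names follow the example above; the real Source may differ.
    url: str
    title: str
    summary: str


def update_source(source: Source, title: str) -> Source:
    """Return a copy of source with a new title; the original is untouched."""
    return replace(source, title=title)


original = Source(url="https://example.com", title="Old", summary="...")
updated = update_source(original, "New")

try:
    original.title = "Mutated"  # type: ignore[misc]
except FrozenInstanceError:
    print("Source is frozen; mutation rejected")
```

Because `replace` copies all fields that are not overridden, adding a field to `Source` later does not require touching `update_source`.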

Function Size¶

Keep functions small and focused:

# Wrong - too long
def process_sources(sources, query, llm, scraper, extractor):
    # 100+ lines of logic
    ...

# Correct - extracted methods
def process_sources(sources: list[Source]) -> list[Source]:
    return [_process_source(s) for s in sources]

def _process_source(source: Source) -> Source:
    # 20-30 lines focused on single responsibility
    ...

Target: Functions under 50 lines

File Size¶

Organize code into small, focused files:

# Wrong - 1000-line file with everything
# searchmuse/utils.py (all utilities)

# Correct - organized by feature
# searchmuse/domain/search_query.py
# searchmuse/domain/search_state.py
# searchmuse/adapters/ollama_llm.py
# searchmuse/adapters/httpx_scraper.py

Target: files under 400 lines; hard limit 800 lines

Error Handling¶

Handle errors explicitly:

# Wrong - silent failure
try:
    result = await scraper.scrape(url)
except Exception:
    pass  # Error silently ignored

# Correct - explicit error handling
try:
    result = await scraper.scrape(url)
except TimeoutError as e:
    logger.error("Scrape timeout for %s: %s", url, e)
    raise SearchError(f"Could not fetch {url}") from e
except NetworkError as e:
    logger.error("Network error for %s: %s", url, e)
    raise SearchError("Network unavailable") from e
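
The pattern above can be exercised end to end with a stand-in scraper. This is a sketch under stated assumptions: `SearchError`, `flaky_scrape`, `fetch`, and `demo` are hypothetical names defined here for illustration, not SearchMuse's actual classes or functions.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class SearchError(Exception):
    """Hypothetical domain-level error surfaced to callers."""


async def flaky_scrape(url: str) -> str:
    """Stand-in scraper that always times out (illustration only)."""
    raise TimeoutError(f"no response from {url}")


async def fetch(url: str) -> str:
    try:
        return await flaky_scrape(url)
    except TimeoutError as e:
        logger.error("Scrape timeout for %s: %s", url, e)
        # "from e" keeps the original exception as __cause__ for debugging.
        raise SearchError(f"Could not fetch {url}") from e


def demo() -> SearchError:
    """Run fetch once and return the error it raises."""
    try:
        asyncio.run(fetch("https://example.com"))
    except SearchError as err:
        return err
    raise AssertionError("fetch should have raised SearchError")
```

Callers only need to handle `SearchError`, while the underlying timeout remains available through `__cause__` in logs and tracebacks.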

Testing Requirements¶

Test Coverage¶

Target: 80%+ overall coverage

# Generate coverage report
pytest --cov=searchmuse --cov-report=html tests/

# Show the overall coverage total (TOTAL line) in the terminal
pytest --cov=searchmuse --cov-report=term-missing tests/ | grep -A 5 "TOTAL"

Test Types¶

Every feature needs unit tests; integration and E2E tests depend on scope:

Unit tests - Individual functions/methods (always required)
Integration tests - Multi-component workflows (optional for small features)
E2E tests - Complete user flows (expected for critical paths, optional otherwise)

Test Conventions¶

# Test file location: tests/<layer>/test_<name>.py
# tests/domain/test_search_query.py
# tests/adapters/test_ollama_llm.py
import pytest
# SearchQuery and ValidationError are imported from the package under test

# Test class: Test<ComponentName>
class TestSearchQuery:
    # Test method: test_<subject>_<behavior>_<expectation>
    # Use descriptive docstrings
    def test_search_query_with_empty_text_raises_error(self) -> None:
        """SearchQuery should reject empty text."""
        with pytest.raises(ValidationError):
            SearchQuery(text="")

See Testing Strategy for comprehensive testing guide.

Commit Message Format¶

Follow conventional commits:

<type>: <description>

<optional body>

<optional footer>

Types:

- feat: New feature
- fix: Bug fix
- refactor: Code refactoring without behavior change
- docs: Documentation changes
- test: Test additions or modifications
- chore: Build, dependencies, or tooling
- perf: Performance improvements
- ci: CI/CD configuration

Examples:

feat: add relevance scoring for sources

Implements LLM-based relevance assessment with configurable
threshold. Scores are cached in repository to avoid redundant
API calls.

Closes #42

fix: handle timeout in scraper gracefully

Previously, network timeouts would crash the entire research session.
Now timeouts are caught and logged, allowing the orchestrator to
continue with other sources.

Fixes #98

refactor: extract HTTP client to separate module

Improves code organization and testability by separating HTTP
concerns from scraping logic.
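
The `<type>: <description>` header and the type list above can be checked mechanically, for example from a commit-msg hook. A minimal sketch; the names `COMMIT_TYPES`, `HEADER_PATTERN`, and `is_valid_header` are illustrative, and the project may already enforce this through its pre-commit configuration.

```python
import re

# Types accepted by this guide.
COMMIT_TYPES = ("feat", "fix", "refactor", "docs", "test", "chore", "perf", "ci")

# "<type>: <description>" where the description starts with a non-space character.
HEADER_PATTERN = re.compile(rf"^({'|'.join(COMMIT_TYPES)}): \S.*$")


def is_valid_header(message: str) -> bool:
    """Check the first line of a commit message against the header format."""
    first_line = message.splitlines()[0] if message else ""
    return HEADER_PATTERN.match(first_line) is not None
```

Only the first line is validated; the optional body and footer are left free-form.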

Pull Request Process¶

Before Creating PR¶

Update from upstream:

git fetch upstream
git rebase upstream/main

Run all checks:

# Format and lint
ruff format src/ tests/
ruff check --fix src/ tests/

# Type check
mypy src/ tests/ --strict

# Tests with coverage report (one run covers both)
pytest --cov=searchmuse --cov-report=term-missing tests/

Verify no issues:

- All tests pass
- Coverage >= 80%
- No mypy errors
- No ruff warnings
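
The checks above can also be chained in a small helper that stops at the first failure. This is a sketch, not a script shipped with SearchMuse: `CHECKS`, `_run`, and `run_checks` are hypothetical names, and the commands are the ones listed in this guide.

```python
import subprocess
from collections.abc import Callable, Sequence

# Commands taken from this guide, in the order they should run.
CHECKS: list[list[str]] = [
    ["ruff", "format", "src/", "tests/"],
    ["ruff", "check", "src/", "tests/"],
    ["mypy", "src/", "tests/", "--strict"],
    ["pytest", "--cov=searchmuse", "--cov-report=term-missing", "tests/"],
]


def _run(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def run_checks(runner: Callable[[Sequence[str]], int] = _run) -> int:
    """Run each check in order; return the first non-zero exit code, else 0."""
    for command in CHECKS:
        print("$", " ".join(command))
        code = runner(command)
        if code != 0:
            return code
    return 0
```

From a script, `sys.exit(run_checks())` propagates the failing tool's exit code. The `runner` parameter exists so the helper can be tested without invoking the real tools.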

Creating PR¶

Push to your fork:
```
git push -u origin feature/my-feature
```
Create PR on GitHub:

- Clear title describing the feature
- Reference any related issues (#42, #98)
- Describe what changed and why
- Link to relevant documentation

PR Template:

## Description
Brief description of changes.

## Related Issues
Closes #42

## Changes
- Item 1
- Item 2

## Testing
- [ ] Added unit tests
- [ ] Added integration tests
- [ ] Manually tested locally

## Checklist
- [ ] Code follows style guide
- [ ] No new warnings
- [ ] Tests pass
- [ ] Coverage >= 80%
- [ ] Documentation updated

During Review¶

Address feedback constructively
Make focused commits for each fix
Re-request review after addressing comments
Communicate clearly about changes

After Merge¶

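Update local main:

git checkout main
git pull upstream main

Delete feature branch (after switching away from it):

# -d refuses to delete unmerged work; after a squash or rebase merge,
# use -D once you have confirmed the PR landed
git branch -d feature/my-feature
git push origin --delete feature/my-feature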

Extending with Adapters¶

Adding a New Adapter¶

To add support for a new LLM, scraper, or storage backend:

Define the port (if new):

# src/searchmuse/ports/my_port.py
from typing import Protocol

class MyPort(Protocol):
    async def my_method(self, param: str) -> str:
        """Docstring."""
        ...

Implement the adapter:

# src/searchmuse/adapters/my_adapter.py
class MyAdapter:
    async def my_method(self, param: str) -> str:
        """Implementation."""
        return "result"

Add tests:

# tests/adapters/test_my_adapter.py
import pytest
from searchmuse.adapters.my_adapter import MyAdapter

class TestMyAdapter:
    @pytest.mark.asyncio  # requires the pytest-asyncio plugin
    async def test_my_method(self) -> None:
        adapter = MyAdapter()
        result = await adapter.my_method("input")
        assert result == "result"

Update configuration:

# config/default.yaml
my_service:
  adapter: my_adapter
  param: value

Document in:

- Components Guide
- README in the adapter file
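
The port and adapter steps above rely on structural typing: an adapter satisfies a `Protocol` by having matching methods, with no inheritance required. A self-contained sketch, where `EchoAdapter` and `use_port` are hypothetical names introduced for illustration:

```python
import asyncio
from typing import Protocol


class MyPort(Protocol):
    async def my_method(self, param: str) -> str:
        """Contract every adapter must fulfil."""
        ...


class EchoAdapter:
    """Hypothetical adapter; it satisfies MyPort without inheriting from it."""

    async def my_method(self, param: str) -> str:
        return f"echo: {param}"


async def use_port(port: MyPort, param: str) -> str:
    # Application code depends only on the port, never on a concrete adapter.
    return await port.my_method(param)


result = asyncio.run(use_port(EchoAdapter(), "input"))
print(result)
```

Under strict mypy, passing an object whose `my_method` signature does not match `MyPort` is reported as a type error at the call site, so a mismatched adapter is caught before runtime.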

Documentation¶

Update documentation when:

Adding new public API
Changing configuration options
Adding features
Fixing bugs with workarounds

Documentation files:

- API changes: API Reference
- Configuration: Configuration Reference
- Architecture: Architecture Guide
- Use cases: Use Cases

Questions?¶

Open an issue for questions
Check existing issues and discussions
Read relevant documentation
Ask in pull request comments

Last updated: 2026-02-28

Maintainers: [List of maintainers]