Skip to main content

Bookkeeping Exchange - Data Models & Database Schema

Bookkeeping Exchange - Data Models & Database Schema

3. Database Schema & Data Models

Document Version: 1.0 Last Updated: 2025-12-26


3.1 Database Technology

Primary Database: PostgreSQL 15+ ORM: Django ORM Migrations: Django migrations


3.2 Entity Relationship Overview

User (Django Auth)
├── BusinessOwner (1:1)
│   └── Lead (1:many)
│       ├── LeadPurchase (1:many)
│       └── Bookmark (1:many)
├── Bookkeeper (1:1)
│   ├── LeadPurchase (1:many)
│   ├── Bookmark (1:many)
│   └── Subscription (1:1)
└── Administrator (1:1)

Configuration (singleton-like settings)
Transaction (payment records)
AuditLog (future - track changes)

3.3 Core Models

3.3.1 User Model (Django Built-in, Extended)

Uses Django's AbstractUser or extends via OneToOne relationship.

Fields:

# Django default fields
id (UUID or Integer, PK)
username (unique, email-based)
email (unique)
password (hashed)
first_name
last_name
is_active
is_staff
is_superuser
date_joined
last_login

# Custom fields
user_type (CharField: 'business_owner', 'bookkeeper', 'admin')
email_verified (Boolean, default=False)
email_verification_token (UUID, nullable)
email_verification_sent_at (DateTime, nullable)

Indexes:

  • email (unique)
  • username (unique)
  • user_type
  • email_verified

3.3.2 BusinessOwner Model

Purpose: Extended profile for business owners who submit leads

Fields:

id (PK, UUID)
user (OneToOne → User, on_delete=CASCADE)
business_name (CharField, max_length=255)
phone (CharField, max_length=20)
created_at (DateTimeField, auto_now_add=True)
updated_at (DateTimeField, auto_now=True)
terms_accepted (Boolean, default=False)
terms_accepted_at (DateTimeField, nullable)
consent_to_share_info (Boolean, default=False)
consent_given_at (DateTimeField, nullable)

Relationships:

  • Has many: Lead

Indexes:

  • user_id (unique)
  • created_at

3.3.3 Bookkeeper Model

Purpose: Extended profile for bookkeepers who purchase leads

Fields:

id (PK, UUID)
user (OneToOne → User, on_delete=CASCADE)
business_name (CharField, max_length=255, nullable)
phone (CharField, max_length=20)
website (URLField, nullable)
years_of_experience (IntegerField, nullable)

# Service area
service_area_type (CharField: 'nationwide', 'regional', 'local')
service_city (CharField, max_length=100, nullable)
service_state (CharField, max_length=2, nullable)  # US state code
service_radius (IntegerField, nullable)  # miles, if local/regional

# Certifications (stored as JSON or separate M2M table)
certifications (JSONField)  # ['CPA', 'EA', 'QB ProAdvisor', etc.]
specializations (JSONField)  # ['Construction', 'Healthcare', etc.]
optional_services (JSONField)  # ['Payroll', 'CFO Services', 'Tax Prep']

# Status
approval_status (CharField: 'pending', 'approved', 'rejected', 'suspended')
approved_at (DateTimeField, nullable)
approved_by (ForeignKey → User, nullable, on_delete=SET_NULL)
rejection_reason (TextField, nullable)

# Terms
terms_accepted (Boolean, default=False)
terms_accepted_at (DateTimeField, nullable)

# Timestamps
created_at (DateTimeField, auto_now_add=True)
updated_at (DateTimeField, auto_now=True)

Relationships:

  • Has one: Subscription
  • Has many: LeadPurchase
  • Has many: Bookmark

Indexes:

  • user_id (unique)
  • approval_status
  • service_state
  • created_at

3.3.4 Lead Model

Purpose: Business owner submissions (leads for bookkeepers to purchase)

Fields:

id (PK, UUID)
business_owner (ForeignKey → BusinessOwner, on_delete=CASCADE)

# Business Information
business_name (CharField, max_length=255)
industry (CharField, max_length=100)  # Or FK to Industry table
business_type (CharField)  # LLC, S-Corp, Sole Proprietor, etc.

# Contact Information (full - revealed after purchase)
contact_name (CharField, max_length=255)
email (EmailField)
phone (CharField, max_length=20)
address_line1 (CharField, max_length=255)
address_line2 (CharField, max_length=255, nullable)
city (CharField, max_length=100)
state (CharField, max_length=2)
zip_code (CharField, max_length=10)

# Business Details (preview information)
annual_revenue_range (CharField)  # '$0-50k', '$50k-100k', etc.
employee_count_range (CharField, nullable)  # '1-5', '6-10', etc.
services_needed (JSONField)  # ['Bookkeeping', 'Payroll', 'Tax Prep']
current_software (CharField, max_length=100, nullable)  # QuickBooks, Xero, etc.
urgency (CharField)  # 'Immediate', 'Within 1 month', 'Within 3 months'
remote_ok (Boolean, default=True)  # Accept remote bookkeepers?

# Detailed Description (full info - revealed after purchase)
description (TextField, nullable)  # Free-form business needs
budget_expectation (CharField, nullable)  # Monthly budget range

# Lead Quality Metrics
completion_score (IntegerField, default=0)  # 0-100, % of fields completed
quality_tier (CharField)  # 'basic', 'standard', 'premium'

# Pricing
base_price (DecimalField, max_digits=10, decimal_places=2)
final_price (DecimalField, max_digits=10, decimal_places=2)
price_override (Boolean, default=False)  # Admin manual override?

# Status
status (CharField)  # 'pending', 'purchased', 'inactive', 'reactivated'
is_active (Boolean, default=True)
purchased_at (DateTimeField, nullable)
purchased_by (ForeignKey → Bookkeeper, nullable, on_delete=SET_NULL)

# Metadata
created_at (DateTimeField, auto_now_add=True)
updated_at (DateTimeField, auto_now=True)
form_submission_ip (GenericIPAddressField, nullable)

Relationships:

  • Belongs to: BusinessOwner
  • Has many: LeadPurchase (history of purchases/reactivations)
  • Has many: Bookmark

Indexes:

  • business_owner_id
  • status
  • is_active
  • state
  • annual_revenue_range
  • created_at (DESC)
  • purchased_by_id

Revenue Range Choices:

REVENUE_RANGES = [
    ('0-50k', '$0 - $50,000'),
    ('50k-100k', '$50,001 - $100,000'),
    ('100k-250k', '$100,001 - $250,000'),
    ('250k-500k', '$250,001 - $500,000'),
    ('500k-1m', '$500,001 - $1,000,000'),
    ('1m+', '$1,000,000+'),
]

Status Flow:

pending → purchased → [inactive] → [reactivated] → purchased

3.3.5 LeadPurchase Model

Purpose: Track all lead purchase transactions (including reactivations)

Fields:

id (PK, UUID)
lead (ForeignKey → Lead, on_delete=CASCADE)
bookkeeper (ForeignKey → Bookkeeper, on_delete=CASCADE)
purchase_price (DecimalField, max_digits=10, decimal_places=2)
purchase_type (CharField)  # 'exclusive', 'shared' (future)
stripe_payment_intent_id (CharField, max_length=255, unique)
stripe_charge_id (CharField, max_length=255, nullable)
payment_status (CharField)  # 'pending', 'succeeded', 'failed', 'refunded'
refund_amount (DecimalField, max_digits=10, decimal_places=2, default=0)
refund_reason (TextField, nullable)
refunded_at (DateTimeField, nullable)
refunded_by (ForeignKey → User, nullable, on_delete=SET_NULL)  # Admin
is_active_purchase (Boolean, default=True)  # Current owner of lead?
created_at (DateTimeField, auto_now_add=True)
updated_at (DateTimeField, auto_now=True)

Relationships:

  • Belongs to: Lead
  • Belongs to: Bookkeeper

Indexes:

  • lead_id
  • bookkeeper_id
  • stripe_payment_intent_id (unique)
  • payment_status
  • is_active_purchase
  • created_at

Business Logic:

  • When a lead is reactivated and purchased again, previous purchase has is_active_purchase=False
  • Latest purchase with is_active_purchase=True is the current owner

3.3.6 Bookmark Model

Purpose: Allow bookkeepers to save leads for later

Fields:

id (PK, UUID)
bookkeeper (ForeignKey → Bookkeeper, on_delete=CASCADE)
lead (ForeignKey → Lead, on_delete=CASCADE)
created_at (DateTimeField, auto_now_add=True)

Relationships:

  • Belongs to: Bookkeeper
  • Belongs to: Lead

Indexes:

  • Composite unique: (bookkeeper_id, lead_id)
  • created_at

Constraints:

  • Unique together: (bookkeeper, lead)

3.3.7 Subscription Model

Purpose: Track bookkeeper subscription status and Stripe details

Fields:

id (PK, UUID)
bookkeeper (OneToOne → Bookkeeper, on_delete=CASCADE)

# Subscription Details
status (CharField)  # 'trial', 'active', 'past_due', 'canceled', 'expired'
subscription_type (CharField)  # 'trial', 'monthly' (future: 'annual')
monthly_price (DecimalField, max_digits=10, decimal_places=2)

# Trial Information
trial_start_date (DateField, nullable)
trial_end_date (DateField, nullable)
trial_duration_days (IntegerField, default=30)

# Paid Subscription
subscription_start_date (DateField, nullable)
current_period_start (DateField, nullable)
current_period_end (DateField, nullable)
cancel_at_period_end (Boolean, default=False)

# Stripe Integration
stripe_customer_id (CharField, max_length=255, unique, nullable)
stripe_subscription_id (CharField, max_length=255, unique, nullable)
stripe_payment_method_id (CharField, max_length=255, nullable)

# Payment History
last_payment_date (DateField, nullable)
last_payment_amount (DecimalField, max_digits=10, decimal_places=2, nullable)
next_billing_date (DateField, nullable)
failed_payment_count (IntegerField, default=0)

# Metadata
created_at (DateTimeField, auto_now_add=True)
updated_at (DateTimeField, auto_now=True)
canceled_at (DateTimeField, nullable)
cancellation_reason (TextField, nullable)

Relationships:

  • Belongs to: Bookkeeper (OneToOne)

Indexes:

  • bookkeeper_id (unique)
  • stripe_customer_id (unique)
  • stripe_subscription_id (unique)
  • status
  • current_period_end

Computed Properties:

@property
def is_active(self):
    return self.status in ['trial', 'active']

@property
def days_until_expiration(self):
    if self.status == 'trial' and self.trial_end_date:
        return (self.trial_end_date - date.today()).days
    elif self.current_period_end:
        return (self.current_period_end - date.today()).days
    return None

@property
def requires_payment_method(self):
    return self.status == 'trial' and not self.stripe_payment_method_id

3.3.8 Transaction Model

Purpose: Audit log of all financial transactions

Fields:

id (PK, UUID)
transaction_type (CharField)  # 'subscription_payment', 'lead_purchase', 'refund'
user (ForeignKey → User, on_delete=SET_NULL, nullable)  # Who initiated
bookkeeper (ForeignKey → Bookkeeper, on_delete=SET_NULL, nullable)
lead (ForeignKey → Lead, on_delete=SET_NULL, nullable)  # If lead purchase
amount (DecimalField, max_digits=10, decimal_places=2)
currency (CharField, max_length=3, default='USD')
stripe_payment_intent_id (CharField, max_length=255, nullable)
stripe_charge_id (CharField, max_length=255, nullable)
stripe_refund_id (CharField, max_length=255, nullable)
status (CharField)  # 'pending', 'succeeded', 'failed', 'refunded'
description (TextField, nullable)
metadata (JSONField, default=dict)  # Additional details
created_at (DateTimeField, auto_now_add=True)

Indexes:

  • transaction_type
  • user_id
  • bookkeeper_id
  • lead_id
  • status
  • created_at (DESC)

3.3.9 Configuration Model

Purpose: Admin-configurable platform settings (singleton pattern)

Fields:

id (PK, always 1)
# Pricing
monthly_subscription_price (DecimalField, max_digits=10, decimal_places=2, default=199.00)
trial_duration_days (IntegerField, default=30)
lead_pricing_rules (JSONField)  # Revenue-based pricing rules

# Limits & Policies
subscription_grace_period_days (IntegerField, default=7)
bookmark_retention_days (IntegerField, default=90)

# Email Configuration
email_sender_name (CharField, max_length=100, default='Bookkeeping Exchange')
support_email (EmailField)
admin_notification_email (EmailField)

# Legal
terms_of_service_url (CharField, max_length=255, default='/terms')
privacy_policy_url (CharField, max_length=255, default='/privacy')

# Feature Flags (future)
enable_shared_leads (Boolean, default=False)
enable_messaging (Boolean, default=False)
enable_reviews (Boolean, default=False)

updated_at (DateTimeField, auto_now=True)
updated_by (ForeignKey → User, on_delete=SET_NULL, nullable)

Example Pricing Rules JSON:

{
  "revenue_pricing": {
    "0-50k": {"base": 50, "quality_multiplier": true},
    "50k-100k": {"base": 100, "quality_multiplier": true},
    "100k-250k": {"base": 150, "quality_multiplier": true},
    "250k-500k": {"base": 250, "quality_multiplier": true},
    "500k-1m": {"base": 400, "quality_multiplier": true},
    "1m+": {"base": 600, "quality_multiplier": true}
  },
  "quality_multipliers": {
    "incomplete": 0.5,
    "standard": 1.0,
    "urgent": 1.2
  }
}

Constraints:

  • Only one Configuration record allowed (enforced in model/admin)

3.4 Supporting Models (Future/Optional)

3.4.1 Industry Model (Optional)

Normalize industry choices instead of free-form CharField

3.4.2 AuditLog Model (Future)

Track all changes to critical models

3.4.3 Message Model (Future)

In-platform messaging between bookkeepers and business owners

3.4.4 Review Model (Future)

Rating/review system


3.5 Data Validation Rules

Lead Validation

  • Email: Valid email format
  • Phone: US phone format (various formats accepted, normalized)
  • Zip Code: Valid US zip (5 or 9 digits)
  • State: Valid US state code
  • Completion Score: Auto-calculated based on required vs optional fields filled

Bookkeeper Validation

  • Service Area: If 'local' or 'regional', city/state/radius required
  • Certifications: Must be from predefined list
  • Specializations: Free-form or predefined list

Subscription Validation

  • Trial End Date: Must be >= trial_start_date + trial_duration_days
  • Billing Dates: current_period_end >= current_period_start

3.6 Calculated Fields & Business Logic

Lead Pricing Calculation

def calculate_lead_price(lead):
    """
    Calculate lead price based on revenue range and quality
    """
    config = Configuration.objects.get(id=1)
    pricing_rules = config.lead_pricing_rules

    # Get base price from revenue range
    revenue_range = lead.annual_revenue_range
    base_price = pricing_rules['revenue_pricing'][revenue_range]['base']

    # Apply quality multiplier
    if pricing_rules['revenue_pricing'][revenue_range]['quality_multiplier']:
        if lead.completion_score < 50:
            multiplier = pricing_rules['quality_multipliers']['incomplete']
        elif lead.urgency == 'Immediate':
            multiplier = pricing_rules['quality_multipliers']['urgent']
        else:
            multiplier = pricing_rules['quality_multipliers']['standard']

        final_price = base_price * multiplier
    else:
        final_price = base_price

    # Admin can override
    if lead.price_override:
        return lead.final_price

    return round(final_price, 2)

Lead Completion Score

def calculate_completion_score(lead):
    """
    Calculate what % of lead fields are completed
    """
    required_fields = [
        'business_name', 'industry', 'contact_name',
        'email', 'phone', 'city', 'state',
        'annual_revenue_range', 'services_needed'
    ]
    optional_fields = [
        'business_type', 'address_line1', 'address_line2',
        'zip_code', 'employee_count_range', 'current_software',
        'description', 'budget_expectation'
    ]

    required_filled = sum(1 for f in required_fields if getattr(lead, f))
    optional_filled = sum(1 for f in optional_fields if getattr(lead, f))

    total_fields = len(required_fields) + len(optional_fields)
    filled_fields = required_filled + optional_filled

    return int((filled_fields / total_fields) * 100)

3.7 Database Indexes Strategy

High-Priority Indexes:

  • Lead.created_at (DESC) - Main feed query
  • Lead.status + is_active (composite) - Filter active leads
  • Lead.state - Geographic filtering
  • Bookkeeper.approval_status - Admin queue
  • Subscription.status - Active subscriber query
  • Subscription.current_period_end - Expiration warnings

Future Optimization:

  • Full-text search on Lead.description, Lead.industry
  • Partial indexes on filtered queries (e.g., WHERE is_active=true)

3.8 Data Retention & Archival

Active Data

  • Leads: Keep indefinitely (or until business owner requests deletion)
  • Purchases: Keep indefinitely for business records
  • Transactions: Keep 7 years (financial records)

Inactive Data

  • Expired bookkeepers: Keep 1 year, then archive
  • Rejected bookkeeper applications: Keep 90 days
  • Bookmarks from expired subscriptions: Keep 90 days

Soft Deletes

  • Users: Mark inactive, don't hard delete (FK preservation)
  • Leads: Can be marked inactive, not deleted

3.9 Backup & Recovery

  • Frequency: Daily automated backups
  • Retention: 30 days of daily backups
  • Recovery Point Objective (RPO): 24 hours
  • Recovery Time Objective (RTO): 4 hours

Next Document: PRD_4_FEATURES.md - Detailed feature specifications and user flows