Data Syncing

This guide explains how Discord Forum API syncs content from Discord to your database.

Sync Overview

The bot maintains a real-time copy of your Discord forum content:

Discord Server
├── Forum Channel A
│   ├── Thread 1 ──────► Database
│   │   ├── Message 1
│   │   ├── Message 2
│   │   └── ...
│   └── Thread 2 ──────► Database
└── Forum Channel B
    └── ... ──────────► Database

What Gets Synced

| Data | Synced | Notes |
| --- | --- | --- |
| Server metadata | Yes | Name, icon, description |
| Forum channels | Yes | Name, topic, tags |
| Threads | Yes | Title, status, tags |
| Messages | Yes | Content, attachments, embeds |
| Reactions | Yes | Emoji and count |
| Users | Yes | Username, avatar, badges |
| Non-forum channels | No | Only forum channels |
| DMs | No | Never synced |
| Voice channels | No | Not applicable |

Sync Types

1. Initial Sync

When the bot joins a server or starts up:

  1. Server scan: Identifies all forum channels
  2. Thread discovery: Lists all threads (active and archived)
  3. Message fetch: Downloads all messages in each thread
  4. User sync: Stores user profiles for message authors

This happens automatically. Expect anywhere from about a minute on small servers to an hour or so on very large ones (see the rough estimates under Performance).
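The four steps above can be sketched as a nested loop. This is only an illustration: `ForumSource`, `SyncStore`, `ThreadRecord`, and `MessageRecord` are hypothetical interfaces invented for this example, standing in for whatever wraps discord.js and the database in the real bot.

```typescript
// Hypothetical record shapes; the bot's actual schema may differ.
interface ThreadRecord { id: string; channelId: string; title: string; archived: boolean }
interface MessageRecord { id: string; threadId: string; authorId: string; content: string }

// Hypothetical read side (Discord) and write side (database).
interface ForumSource {
  listForumChannels(serverId: string): Promise<string[]>;
  listThreads(channelId: string): Promise<ThreadRecord[]>; // active and archived
  fetchMessages(threadId: string): Promise<MessageRecord[]>;
}

interface SyncStore {
  saveThread(thread: ThreadRecord): void;
  saveMessage(message: MessageRecord): void;
  saveUser(userId: string): void;
}

async function initialSync(
  serverId: string,
  source: ForumSource,
  store: SyncStore,
): Promise<{ threads: number; messages: number }> {
  let threads = 0;
  let messages = 0;
  // 1. Server scan: find every forum channel.
  for (const channelId of await source.listForumChannels(serverId)) {
    // 2. Thread discovery: list threads in the channel.
    for (const thread of await source.listThreads(channelId)) {
      store.saveThread(thread);
      threads++;
      // 3. Message fetch: download the thread's messages.
      for (const message of await source.fetchMessages(thread.id)) {
        store.saveMessage(message);
        // 4. User sync: record the author of each message.
        store.saveUser(message.authorId);
        messages++;
      }
    }
  }
  return { threads, messages };
}
```

The returned counts correspond to the kind of totals the bot reports in its log output.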

2. Real-time Sync

After initial sync, the bot listens for events:

| Event | Action |
| --- | --- |
| Thread created | Add thread to database |
| Thread updated | Update title, tags, status |
| Thread deleted | Mark as deleted |
| Message sent | Add message to database |
| Message edited | Update content |
| Message deleted | Mark as deleted |
| Reaction added | Update reaction count |
| Reaction removed | Update reaction count |
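As a rough illustration of the message and reaction rows in this table, the sketch below applies events to an in-memory map. The event names and the `StoredMessage` shape are hypothetical, chosen to mirror the table rather than the bot's real handlers; thread events would follow the same pattern.

```typescript
// Hypothetical event shapes mirroring the table above.
type SyncEvent =
  | { type: "messageSent"; id: string; content: string }
  | { type: "messageEdited"; id: string; content: string }
  | { type: "messageDeleted"; id: string }
  | { type: "reactionAdded"; messageId: string; emoji: string }
  | { type: "reactionRemoved"; messageId: string; emoji: string };

interface StoredMessage {
  content: string;
  deleted: boolean;
  reactions: Map<string, number>; // emoji -> count
}

function applyEvent(db: Map<string, StoredMessage>, event: SyncEvent): void {
  switch (event.type) {
    case "messageSent":
      db.set(event.id, { content: event.content, deleted: false, reactions: new Map() });
      break;
    case "messageEdited": {
      const row = db.get(event.id);
      if (row) row.content = event.content;
      break;
    }
    case "messageDeleted": {
      // Mark as deleted rather than removing the row.
      const row = db.get(event.id);
      if (row) row.deleted = true;
      break;
    }
    case "reactionAdded":
    case "reactionRemoved": {
      const row = db.get(event.messageId);
      if (!row) break;
      const delta = event.type === "reactionAdded" ? 1 : -1;
      const next = Math.max(0, (row.reactions.get(event.emoji) ?? 0) + delta);
      row.reactions.set(event.emoji, next);
      break;
    }
  }
}
```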

3. Manual Resync

Trigger a full resync with the slash command:

/sync

Sync Configuration

Environment Variables

# Sync bot messages (default: true)
SYNC_BOT_MESSAGES=true
# Sync archived threads (default: true)
SYNC_ARCHIVED_THREADS=true
# Maximum message age to sync (days, 0 = unlimited)
MAX_MESSAGE_AGE=0
# Sync interval for checking updates (seconds)
SYNC_INTERVAL=60
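A minimal sketch of how these variables could be read into a typed config object. The `SyncConfig` shape and helper names are hypothetical, and the fallback values for `MAX_MESSAGE_AGE` and `SYNC_INTERVAL` are assumed from the sample values above rather than confirmed defaults.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface SyncConfig {
  syncBotMessages: boolean;
  syncArchivedThreads: boolean;
  maxMessageAgeDays: number; // 0 = unlimited
  syncIntervalSeconds: number;
}

type Env = Record<string, string | undefined>;

function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

function parseNonNegativeInt(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  // Reject NaN and negative numbers by falling back to the default.
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
}

function loadSyncConfig(env: Env): SyncConfig {
  return {
    syncBotMessages: parseBool(env["SYNC_BOT_MESSAGES"], true),
    syncArchivedThreads: parseBool(env["SYNC_ARCHIVED_THREADS"], true),
    maxMessageAgeDays: parseNonNegativeInt(env["MAX_MESSAGE_AGE"], 0), // assumed default
    syncIntervalSeconds: parseNonNegativeInt(env["SYNC_INTERVAL"], 60), // assumed default
  };
}
```

In a Node.js process this would be called as `loadSyncConfig(process.env)`.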

Channel Selection

By default, all forum channels are synced. To limit which channels sync:

  1. Use channel permissions: the bot does not sync channels it cannot view
  2. Or configure allowed channels in your bot code
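For the second option, an allowlist check might look like the sketch below. `ALLOWED_CHANNEL_IDS` and `shouldSyncChannel` are hypothetical names, and the channel ID is a placeholder; where such a check would be wired into the bot is not specified here.

```typescript
// Hypothetical allowlist of forum channel IDs (placeholder value).
// An empty set is treated as "sync every forum channel the bot can see".
const ALLOWED_CHANNEL_IDS: ReadonlySet<string> = new Set(["111111111111111111"]);

function shouldSyncChannel(
  channelId: string,
  allowed: ReadonlySet<string> = ALLOWED_CHANNEL_IDS,
): boolean {
  return allowed.size === 0 || allowed.has(channelId);
}
```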

Data Flow

Message Processing

When a new message arrives:

   Discord Event
         │
         ▼
┌─────────────────┐
│  Event Handler  │
│  (discord.js)   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Content Parser  │
│  - Markdown     │
│  - Embeds       │
│  - Attachments  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Database Write  │
│  (Drizzle ORM)  │
└────────┬────────┘
         │
         ▼
   Stored in DB

Markdown Processing

Discord markdown is converted to HTML for easier rendering:

| Discord | HTML |
| --- | --- |
| **bold** | <strong>bold</strong> |
| *italic* | <em>italic</em> |
| `code` | <code>code</code> |
| > quote | <blockquote>quote</blockquote> |
| @user | <span class="mention">@user</span> |
| #channel | <span class="channel">#channel</span> |

Both content (raw) and contentHtml (processed) are stored.
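To make the conversion concrete, here is a deliberately small sketch covering only the bold, italic, inline code, and quote rows of the table. It is not the bot's actual parser: the function names are hypothetical, nested or unusual formatting is not handled, and mentions are left out because raw Discord content carries them as IDs (for example `<@123>`) that need a lookup to resolve.

```typescript
// Escape characters that are significant in HTML so user text cannot inject markup.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function renderInline(text: string): string {
  // Split on inline code spans so that markers inside code are left untouched.
  return text
    .split(/(`[^`\n]+`)/)
    .map((part) => {
      if (part.length > 2 && part.startsWith("`") && part.endsWith("`")) {
        return `<code>${escapeHtml(part.slice(1, -1))}</code>`;
      }
      return escapeHtml(part)
        .replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>") // bold before italic
        .replace(/\*(.+?)\*/g, "<em>$1</em>");
    })
    .join("");
}

function discordMarkdownToHtml(content: string): string {
  return content
    .split("\n")
    .map((line) =>
      line.startsWith("> ")
        ? `<blockquote>${renderInline(line.slice(2))}</blockquote>`
        : renderInline(line),
    )
    .join("\n");
}
```

Escaping happens before the formatting replacements, so any HTML typed by a user is stored as text while the generated tags stay intact.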

Performance

Initial Sync Speed

Factors affecting initial sync speed:

| Factor | Impact |
| --- | --- |
| Thread count | More threads = longer sync |
| Messages per thread | More messages = longer sync |
| Discord rate limits | ~50 requests/second |
| Database speed | SQLite is fast, network DB is slower |

Rough estimates:

  • 100 threads, 1000 messages: ~1 minute
  • 1000 threads, 10000 messages: ~10 minutes
  • 10000 threads, 100000 messages: ~1 hour

Rate Limiting

The bot respects Discord’s rate limits:

  • Automatic retry on 429 responses
  • Exponential backoff
  • Request queuing

You don’t need to configure anything; it’s handled automatically.
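For readers curious what retry with exponential backoff looks like, the sketch below shows the general technique. It is illustrative only and not the bot's internal code: `HttpResponse`, `withBackoff`, and the delay values are assumptions made for this example.

```typescript
// Hypothetical minimal response shape for illustration.
interface HttpResponse {
  status: number;
  retryAfterMs?: number; // server-provided wait hint, if any
}

async function withBackoff(
  request: () => Promise<HttpResponse>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
  maxRetries = 5,
  baseDelayMs = 500,
): Promise<HttpResponse> {
  for (let attempt = 0; ; attempt++) {
    const response = await request();
    // Return on success, on any non-429 status, or once retries are exhausted.
    if (response.status !== 429 || attempt >= maxRetries) return response;
    // Prefer the server's hint; otherwise double the delay on each attempt.
    const delay = response.retryAfterMs ?? baseDelayMs * 2 ** attempt;
    await sleep(delay);
  }
}
```

With the assumed defaults, successive waits are 500 ms, 1000 ms, 2000 ms, and so on, unless the response supplies its own retry delay.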

Data Integrity

Consistency Guarantees

  • Eventual consistency: All data syncs, but may be briefly delayed
  • No data loss: Missed events are caught on next sync
  • Idempotent: Re-syncing the same data is safe
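The idempotency guarantee can be pictured as an upsert keyed by the Discord ID: writing the same record twice leaves exactly one row. The sketch below uses an in-memory map and a hypothetical `MessageRow` shape; it shows the idea, not the bot's database code.

```typescript
// Hypothetical row shape; editedAt is a millisecond timestamp.
interface MessageRow {
  id: string;
  content: string;
  editedAt: number;
}

function upsertMessage(table: Map<string, MessageRow>, row: MessageRow): void {
  const existing = table.get(row.id);
  // Keep the newest version so replaying an older event cannot clobber fresher data.
  if (!existing || row.editedAt >= existing.editedAt) {
    table.set(row.id, row);
  }
}
```

In a SQL database the same effect is typically achieved with an `INSERT ... ON CONFLICT DO UPDATE` statement on the primary key.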

Handling Deletions

When content is deleted in Discord:

| Approach | Behavior |
| --- | --- |
| Soft delete (default) | Mark as deleted, keep in database |
| Hard delete | Remove from database entirely |

Configure with:

DELETION_MODE=soft # or "hard"

Soft deletes preserve history and allow for audit trails.
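The difference between the two modes can be sketched as follows. The `deletedAt` field and the function names are hypothetical; the only details taken from this guide are the `soft` and `hard` values and that `soft` is the default.

```typescript
type DeletionMode = "soft" | "hard";

// Hypothetical row shape with a nullable deletion timestamp.
interface Row {
  id: string;
  content: string;
  deletedAt: string | null;
}

// Anything other than "hard" falls back to the default, "soft".
function parseDeletionMode(value: string | undefined): DeletionMode {
  return value === "hard" ? "hard" : "soft";
}

function deleteRow(
  table: Map<string, Row>,
  id: string,
  mode: DeletionMode,
  now: Date = new Date(),
): void {
  const row = table.get(id);
  if (!row) return; // already gone, so deleting again is a no-op
  if (mode === "hard") {
    table.delete(id);
  } else {
    row.deletedAt = now.toISOString(); // keep the row for history and audits
  }
}
```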

Monitoring

Log Output

The bot logs sync activity:

[INFO] Starting initial sync for server: My Server (123456789)
[INFO] Found 5 forum channels
[INFO] Syncing channel: help-forum (456 threads)
[INFO] Synced 456 threads, 12340 messages
[INFO] Initial sync complete for My Server

Database Stats

Query sync status via API:

curl http://localhost:3000/api/servers/123456789/stats

Response includes sync information:

{
  "stats": {
    "threads": { "total": 456 },
    "messages": { "total": 12340 }
  },
  "lastSyncAt": "2024-01-15T12:00:00.000Z"
}
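For monitoring scripts, the response can be typed and checked for staleness. The interface below covers only the fields shown in the sample response (the real payload may include more), and `secondsSinceLastSync` is a hypothetical helper.

```typescript
// Covers only the fields visible in the sample response above.
interface ServerStatsResponse {
  stats: {
    threads: { total: number };
    messages: { total: number };
  };
  lastSyncAt: string; // ISO 8601 timestamp
}

// Whole seconds elapsed since the last sync, relative to `now`.
function secondsSinceLastSync(response: ServerStatsResponse, now: Date = new Date()): number {
  return Math.floor((now.getTime() - Date.parse(response.lastSyncAt)) / 1000);
}
```

A monitoring job could alert when this value grows well beyond the configured `SYNC_INTERVAL`.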

Troubleshooting

Missing Threads

  • Verify bot can see the channel (check permissions)
  • Check if thread is archived (enable SYNC_ARCHIVED_THREADS)
  • Trigger manual resync: /sync

Missing Messages

  • Verify Message Content Intent is enabled
  • Check message age against MAX_MESSAGE_AGE
  • The bot only receives messages in real time while it is in the server; run /sync to backfill anything sent before it joined

Sync Seems Slow

  • Check Discord status for issues
  • Review rate limit logs
  • Consider server load
  • Initial sync of large servers takes time

Duplicate Content

  • This shouldn’t happen, because sync operations are idempotent
  • If it does, check for multiple bot instances
  • Clear and resync if needed

Bot Keeps Reconnecting

  • Check network stability
  • Verify token is valid
  • Review Discord gateway status
  • Increase reconnect timeout if needed

Advanced: Custom Sync Logic

For advanced use cases, you can modify sync behavior in packages/bot/src/sync/:

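// Example: skip threads that carry a certain tag.
// Assumes the thread is a discord.js ThreadChannel; adjust the type if the
// sync code uses its own thread type.
import type { ThreadChannel } from 'discord.js';

function shouldSyncThread(thread: ThreadChannel): boolean {
  // appliedTags holds tag IDs, so replace the placeholder with a real tag ID.
  if (thread.appliedTags.includes('private-tag-id')) {
    return false;
  }
  return true;
}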