This documents the Chapter Screenshot Service — an async pipeline that extracts screenshots from YouTube videos at each chapter timestamp and uploads them to MinIO (S3-compatible storage).
Design / Architecture
Why a separate topic? Screenshot extraction takes 15-20s per video (yt-dlp + ffmpeg + MinIO upload). Adding it to the existing adventuretube-data consumer would block the SSE response to iOS and risk hitting the 30s SseEmitter timeout. A separate topic keeps the save flow fast (~300ms) and processes screenshots independently.
| Topic | Action | Consumer | Duration | Blocks SSE? |
|---|---|---|---|---|
| adventuretube-data | CREATE / DELETE | StoryConsumer.java | ~200ms | No (SSE responds immediately) |
| adventuretube-screenshots | GENERATE_SCREENSHOTS | ScreenshotConsumer.java | ~15-20s | No (separate consumer) |
Sequence Diagram: Phase 1 — Save (existing flow)
Phase 1 (Save) is simplified here. For the full sequence diagram with token refresh, SSE streaming, and Zipkin traces, see Episode 1: Publish Story.
Sequence Diagram: Phase 2 & 3 — Screenshot Generation + iOS Polling
iOS Call Stack: Screenshot Flow
Phase 1 (save) reuses the existing publish flow. Phase 3 (polling) is new — not yet implemented.
```
uploadStory()  // AddStoryViewVM+Publishing.swift
│
│  Phase 1: Save (existing publish flow)
├─ publishStory(jsonData:)  // POST /auth/geo/save
│  └─ .sink receiveValue: { response → trackingId }
│     └─ startSSETracking(trackingId:, onCompleted:)
│        └─ handleJobStatus → .COMPLETED
│           └─ onCompleted(jobStatus)
│              ├─ isPublished = true
│              ├─ publishingStatus = .completed → UI
│              │
│              │  Phase 3: Screenshot Polling (TODO)
│              └─ startScreenshotPolling(youtubeContentID:)
│                 ├─ Timer.publish(every: 5.0)
│                 │  └─ pollScreenshotStatus(youtubeContentID:)
│                 │     └─ GET /auth/geo/data/screenshot-status/{id}
│                 │
│                 └─ .sink receiveValue:
│                    ├─ .COMPLETED → applyScreenshotUrls()
│                    ├─ .FAILED → screenshotStatus = .failed
│                    └─ .PROCESSING → keep polling
```
Java Call Stack: Screenshot Generation Flow
Triggered automatically after story save — no separate iOS request.
```
ScreenshotConsumer.consume(message)
│
└─ switch(GENERATE_SCREENSHOTS) → handleGenerateScreenshots()

ScreenshotService.processScreenshotJob()
│
├─ screenshotJobStatusRepository.findByYoutubeContentID()
│  ├─ .map(existing → update to PENDING, save)
│  └─ .orElseGet(() → create new, save)
│
├─ tempDir = Files.createTempDirectory("screenshot")
│
├─ for each chapter:
│  ├─ yt-dlp download 1s clip at chapter.youtubeTime
│  ├─ ffmpeg extract frame → JPG
│  ├─ s3Client.putObject(bucket, key, file)  // upload to MinIO
│  ├─ chapter.setScreenshotUrl(s3Key)
│  └─ screenshotJobStatusRepository.save(progress)
│
├─ adventureTubeDataRepository.save(data)  // update MongoDB
│
└─ jobStatus.setStatus(COMPLETED)
   └─ screenshotJobStatusRepository.save(jobStatus)
```
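The per-chapter yt-dlp and ffmpeg steps would be launched via ProcessBuilder or similar. A sketch of plausible argument lists follows; the actual flags the service uses aren't shown in this doc, so the format selection, section syntax, and quality settings here are assumptions:

```java
import java.util.List;

// Sketch of the per-chapter CLI invocations (flag choices are assumptions).
class ScreenshotCommands {

    // Download a ~1-second clip starting at the chapter timestamp.
    static List<String> ytDlpArgs(String videoId, int startSec, String outFile) {
        String section = String.format("*%d-%d", startSec, startSec + 1);
        return List.of("yt-dlp",
                "--download-sections", section,   // clip only, not the full video
                "-f", "mp4",
                "-o", outFile,
                "https://www.youtube.com/watch?v=" + videoId);
    }

    // Extract a single frame from the clip as a JPG.
    static List<String> ffmpegArgs(String clipFile, String jpgFile) {
        return List.of("ffmpeg", "-y",
                "-i", clipFile,
                "-frames:v", "1",                 // one frame only
                "-q:v", "2",                      // high JPEG quality
                jpgFile);
    }
}
```

Each list would be passed to `new ProcessBuilder(args).start()` with a timeout, and any non-zero exit code recorded in errorMessage on the job status.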
Flow Details
Phase 1: Save (unchanged)
The existing save flow is not modified. iOS gets SSE COMPLETED in ~300ms as before. The only change in StoryConsumer.handleSave() is one new line after markCompleted():
```java
// After save + markCompleted (SSE already sent to iOS)
producer.sendScreenshotRequest(saved.getYoutubeContentID(), saved);
```
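The shape of the published message isn't shown in this doc; a minimal sketch, assuming a simple action-plus-ID payload keyed by youtubeContentID:

```java
import java.util.Map;

// Hypothetical payload for the screenshot topic; the real message class
// behind sendScreenshotRequest() is not shown in this doc.
class ScreenshotMessages {

    static Map<String, Object> screenshotRequest(String youtubeContentID) {
        return Map.of(
                "action", "GENERATE_SCREENSHOTS",   // KafkaAction.GENERATE_SCREENSHOTS
                "youtubeContentID", youtubeContentID);
    }

    // Inside the producer this would roughly be:
    // kafkaTemplate.send("adventuretube-screenshots", youtubeContentID, screenshotRequest(id));
}
```

Keying by youtubeContentID keeps all messages for one video on the same partition, so retries for the same video are processed in order.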
Phase 2: Screenshot Generation (new)
ScreenshotConsumer listens on adventuretube-screenshots topic:
- Creates ScreenshotJobStatus(PROCESSING) in MongoDB
- For each chapter: yt-dlp downloads a 1s clip → ffmpeg extracts a frame → uploads it to MinIO via the S3 API
- Updates each Chapter document in MongoDB with screenshotUrl
- Marks ScreenshotJobStatus as COMPLETED
Phase 3: iOS Polls (new endpoint)
iOS polls GET /geo/data/screenshot-status/{youtubeContentID} every 5 seconds:
| Response | Meaning | iOS Action |
|---|---|---|
| { "status": "PROCESSING" } | Screenshots still generating | Keep polling |
| { "status": "COMPLETED" } | All screenshots ready | Fetch updated data, stop polling |
| { "status": "FAILED" } | Screenshot generation failed | Show placeholder, stop polling |
| 404 | No screenshot job exists | Show placeholder |
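The table above can be sketched as plain response-mapping logic; the real controller isn't shown in this doc, so the names here are hypothetical:

```java
import java.util.Optional;

// Sketch of the endpoint's response mapping (names hypothetical).
class ScreenshotStatusResponses {

    // 404 when no ScreenshotJobStatus document exists, 200 otherwise
    static int httpStatus(Optional<String> jobStatus) {
        return jobStatus.isEmpty() ? 404 : 200;
    }

    // What iOS does with each status value
    static String iosAction(String status) {
        switch (status) {
            case "PROCESSING": return "keep polling";
            case "COMPLETED":  return "fetch updated data, stop polling";
            case "FAILED":     return "show placeholder, stop polling";
            default:           return "show placeholder";
        }
    }
}
```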
Data Models
KafkaAction (updated)
```java
public enum KafkaAction {
    CREATE,
    UPDATE,
    DELETE,
    GENERATE_SCREENSHOTS  // NEW
}
```
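Given this enum, the topic routing from the Design / Architecture table can be sketched as (a sketch only; the producer's actual routing code isn't shown here):

```java
// Only GENERATE_SCREENSHOTS goes to the new topic; everything else stays
// on the existing data topic.
class TopicRouting {
    enum KafkaAction { CREATE, UPDATE, DELETE, GENERATE_SCREENSHOTS }

    static String topicFor(KafkaAction action) {
        return action == KafkaAction.GENERATE_SCREENSHOTS
                ? "adventuretube-screenshots"
                : "adventuretube-data";
    }
}
```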
ScreenshotJobStatus (new MongoDB document)
```java
@Document(collection = "screenshotJobStatus")
public class ScreenshotJobStatus {
    @Id private String id;
    @Indexed(unique = true) private String youtubeContentID;
    private ScreenshotStatus status;  // PENDING, PROCESSING, COMPLETED, FAILED
    private int totalChapters;
    private int completedChapters;
    private String errorMessage;
    private LocalDateTime createdAt;
    private LocalDateTime updatedAt;
    @Indexed(expireAfter = "7d") private LocalDateTime expireAt;
}
```
Chapter (updated)
```java
public class Chapter {
    // existing fields...
    private String screenshotUrl;  // NEW — MinIO S3 key
}
```
New Components
| Component | Type | Purpose |
|---|---|---|
| ScreenshotConsumer | Kafka Consumer | Listens on adventuretube-screenshots, triggers extraction |
| ScreenshotService | Service | Core logic: yt-dlp + ffmpeg + S3 upload + MongoDB update |
| ScreenshotJobStatus | Entity | Tracks screenshot generation progress per video |
| MinioConfig | Config | S3Client bean configured for MinIO |
| GET /geo/data/screenshot-status/{id} | REST Endpoint | iOS polls this for screenshot readiness |
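A hypothetical sketch of MinioConfig using AWS SDK v2; the endpoint, region, and credential wiring are all assumptions, since the real bean isn't shown in this doc. MinIO deployments generally need path-style access:

```java
import java.net.URI;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

// Hypothetical sketch — real bean names and properties aren't shown here.
@Configuration
public class MinioConfig {

    @Bean
    public S3Client s3Client() {
        return S3Client.builder()
                .endpointOverride(URI.create("https://s3.travel-tube.com"))  // MinIO endpoint (assumed)
                .region(Region.US_EAST_1)  // MinIO ignores region, but the SDK requires one
                .credentialsProvider(StaticCredentialsProvider.create(
                        AwsBasicCredentials.create("ACCESS_KEY", "SECRET_KEY")))
                .forcePathStyle(true)      // MinIO typically needs path-style URLs
                .build();
    }
}
```

In practice the credentials would come from configuration properties rather than literals.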
MinIO Storage (S3 on Pi1)
| Property | Value |
|---|---|
| Bucket | chapter-screenshots |
| S3 Key Format | {youtubeContentID}/ch{index}_{youtubeTime}s.jpg |
| Public URL | https://s3.travel-tube.com/chapter-screenshots/{key} |
Example files in MinIO:
```
chapter-screenshots/
  WsghFCuoZ6Q/
    ch1_176s.jpg   (49KB)
    ch2_431s.jpg   (67KB)
    ch3_756s.jpg   (50KB)
    ch4_989s.jpg   (20KB)
    ch5_1152s.jpg  (26KB)
```
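The key and public-URL formats above can be sketched as plain helpers (a sketch; the chapter index is assumed 1-based, matching the example files):

```java
// Key and public-URL construction matching the S3 Key Format table above.
class ScreenshotKeys {

    static String s3Key(String youtubeContentID, int chapterIndex, int youtubeTimeSec) {
        return String.format("%s/ch%d_%ds.jpg", youtubeContentID, chapterIndex, youtubeTimeSec);
    }

    static String publicUrl(String s3Key) {
        return "https://s3.travel-tube.com/chapter-screenshots/" + s3Key;
    }
}
```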
Performance Impact
Zero impact on save flow. The existing save → SSE response remains ~300ms. Screenshot generation runs asynchronously on a separate Kafka consumer thread.
| Operation | Duration | Blocks iOS? |
|---|---|---|
| Save to MongoDB + SSE | ~300ms | No (returns immediately) |
| Screenshot generation (5 chapters) | ~15-20s | No (async via separate Kafka topic) |
| iOS polling (per request) | ~50ms | No (lightweight REST) |
Dependencies
| Dependency | Purpose |
|---|---|
| software.amazon.awssdk:s3 | S3 client for MinIO uploads |
| yt-dlp (CLI) | YouTube video clip download |
| ffmpeg (CLI) | Frame extraction from video clip |
Note: yt-dlp and ffmpeg must be installed on the host machine / Docker image running geospatial-service.
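A hypothetical Dockerfile fragment for installing the two CLI tools; the base image and installation method are assumptions, not the project's actual image:

```dockerfile
# Assumed base image — the real geospatial-service image isn't shown here.
FROM eclipse-temurin:17-jre

# ffmpeg from apt; yt-dlp as its standalone release binary (updated often)
RUN apt-get update \
    && apt-get install -y --no-install-recommends ffmpeg curl \
    && curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp \
         -o /usr/local/bin/yt-dlp \
    && chmod +x /usr/local/bin/yt-dlp \
    && rm -rf /var/lib/apt/lists/*
```

Pinning a specific yt-dlp release instead of `latest` would make builds reproducible, at the cost of missing fixes for YouTube format changes.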
