@startuml
!theme mono
title STT Service - Start Recording and Real-Time Recognition

participant "Frontend" as Frontend
participant "API Gateway" as Gateway
participant "RecordingController" as Controller
participant "RecordingService" as Service
participant "AudioStreamManager" as StreamManager
participant "RecordingRepository" as Repository
participant "AzureSpeechClient" as AzureClient
database "STT DB" as DB
database "Azure Blob Storage" as BlobStorage
queue "Azure Event Hubs" as EventHub

== Receive Meeting-Start Event and Prepare Recording ==

EventHub -> Controller: MeetingStarted event\n(meetingId, sessionId)
activate Controller
Controller -> Service: prepareRecording(meetingId, sessionId)
activate Service

Service -> Service: Validate recording session
note right
- Guard against duplicate recordings
- Validate meetingId
end note

Service -> Repository: createRecording(meetingId, sessionId)
activate Repository
Repository -> DB: Create recording session\n(recordingId, meetingId, sessionId, status, createdAt)
activate DB
DB --> Repository: return recordingId
deactivate DB
Repository --> Service: return RecordingEntity
deactivate Repository

== Initialize Azure Speech Service ==

Service -> AzureClient: initializeRecognizer(recordingId, sessionId)
activate AzureClient
AzureClient -> AzureClient: Configure speech recognizer
note right
Azure Speech settings:
- Language: ko-KR
- Format: PCM 16 kHz
- Sample rate: 16 kHz
- Real-time streaming mode
- Continuous recognition
end note

AzureClient -> BlobStorage: Create recording file path\n(path: recordings/{meetingId}/{sessionId}.wav)
activate BlobStorage
BlobStorage --> AzureClient: return storage path URL
deactivate BlobStorage
AzureClient --> Service: return RecognizerConfig
deactivate AzureClient

== Update Recording Status ==

Service -> Repository: updateRecordingStatus(recordingId, "RECORDING")
activate Repository
Repository -> DB: Update recording status\n(status='RECORDING', startedAt, storagePath)
activate DB
DB --> Repository: update complete
deactivate DB
Repository --> Service: update complete
deactivate Repository

Service --> Controller: RecordingResponse(recordingId, status, storagePath)
deactivate Service
Controller --> EventHub: Publish RecordingStarted event\n(recordingId, meetingId, status)
Controller --> Gateway: 200 OK\n{sessionId, streamUrl}
deactivate Controller

== Audio Streaming and Speaker Identification ==

Frontend -> Gateway: WebSocket /ws/stt/{sessionId}\n[audio stream]
activate Gateway
Gateway -> Controller: Receive audio data
activate Controller
Controller -> Service: processAudioStream(sessionId, audioData)
activate Service
Service -> StreamManager: streamAudio(audioData)
activate StreamManager
StreamManager -> AzureClient: recognizeAsync(audioData)
activate AzureClient
AzureClient --> StreamManager: partial result\n(text, timestamp)
deactivate AzureClient
StreamManager --> Service: recognized text
deactivate StreamManager

== Persist Segment ==

Service -> Repository: saveSttSegment(segment)
activate Repository
Repository -> DB: Save STT segment\n(sessionId, text, timestamp, confidence)
activate DB
DB --> Repository: segment saved
deactivate DB
Repository --> Service: saved
deactivate Repository

Service --> Controller: streaming response\n{text, timestamp, confidence}
deactivate Service
Controller --> Gateway: WebSocket message
deactivate Controller
Gateway --> Frontend: Send real-time captions\n{text, timestamp}
deactivate Gateway

note over Frontend, EventHub
Processing times:
- Create recording in DB: ~100 ms
- Azure recognizer init: ~500 ms
- Blob path creation: ~200 ms
- Real-time recognition latency: < 1 s
- Total initialization time: ~0.8 s
Accuracy:
- Speech recognition accuracy: 60-95%
end note

@enduml
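
The "prepare recording" phase of the diagram (duplicate-recording guard, row creation, status transition to RECORDING) can be sketched as a small service/repository pair. This is a hypothetical, in-memory illustration: the class and method names mirror the diagram's participants and calls, but the `dict`-backed repository, `DuplicateRecordingError`, and `find_by_session` are assumptions standing in for the real STT DB and whatever error handling the actual service uses.

```python
# In-memory sketch of prepareRecording(meetingId, sessionId) from the diagram.
# The dict-backed repository stands in for the STT DB.
import itertools


class DuplicateRecordingError(Exception):
    """Raised by the duplicate-recording guard in the validation step."""


class RecordingRepository:
    def __init__(self):
        self._rows = {}                 # recordingId -> row
        self._ids = itertools.count(1)  # stand-in for a DB-generated key

    def create_recording(self, meeting_id, session_id):
        recording_id = next(self._ids)
        self._rows[recording_id] = {
            "meetingId": meeting_id,
            "sessionId": session_id,
            "status": "CREATED",
        }
        return recording_id

    def find_by_session(self, session_id):
        return [rid for rid, row in self._rows.items()
                if row["sessionId"] == session_id]

    def update_status(self, recording_id, status):
        self._rows[recording_id]["status"] = status


class RecordingService:
    def __init__(self, repository):
        self.repository = repository

    def prepare_recording(self, meeting_id, session_id):
        # Validation step: guard against duplicate recordings per session.
        if self.repository.find_by_session(session_id):
            raise DuplicateRecordingError(session_id)
        recording_id = self.repository.create_recording(meeting_id, session_id)
        # Recognizer setup and Blob path creation would happen here (omitted);
        # on success the status moves to RECORDING, as in the diagram.
        self.repository.update_status(recording_id, "RECORDING")
        return {"recordingId": recording_id, "status": "RECORDING"}
```

The guard runs before any row is created, so a second `MeetingStarted` event for the same session fails fast without leaving a half-initialized recording behind.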
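
The segment-persistence step (`saveSttSegment(segment)` writing sessionId, text, timestamp, and confidence) can likewise be sketched with a simple store. This is an assumed shape, not the service's actual schema: `SttSegment`, `SegmentStore`, and `transcript` are illustrative names; only the stored fields come from the diagram.

```python
# Sketch of STT segment persistence: each partial recognition result
# (text, timestamp, confidence) is saved as one segment per session.
from dataclasses import dataclass


@dataclass
class SttSegment:
    session_id: str
    text: str
    timestamp_ms: int
    confidence: float


class SegmentStore:
    def __init__(self):
        self._segments = []  # stand-in for the STT segment table

    def save_stt_segment(self, segment):
        self._segments.append(segment)

    def transcript(self, session_id):
        # Order by timestamp to rebuild the live-caption stream for a session.
        rows = sorted(
            (s for s in self._segments if s.session_id == session_id),
            key=lambda s: s.timestamp_ms,
        )
        return " ".join(s.text for s in rows)
```

Keeping the per-result timestamp with each segment is what lets the frontend render captions in order even if WebSocket messages arrive slightly out of sequence.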