hgzero/design/backend/sequence/inner/stt-음성녹음인식.puml
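ondal 715add4dbc Improve consistency and standardize external/internal sequence designs
Major changes:

[Critical]
- Unified the API endpoint: POST /api/minutes/{minutesId}/finalize
- Standardized the event name: MinutesFinalized

[Warning]
- Documented the API Gateway routing rules (7 external sequence files)
- Unified the dashboard API path: GET /api/dashboard
- Documented the AI suggestion merge process in detail
- Detailed the 5-step validation logic for finalizing meeting minutes

[Minor]
- Specified the Redis cache TTL (7 files, standardized TTL policy)
- Added dashboard pagination parameters
- Standardized the error response format (14 error responses)

31 files modified in total, 34 improvements applied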

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 09:48:06 +09:00


@startuml
!theme mono
title Voice Recording and Speaker Identification Internal Sequence (UFR-STT-010)
participant "API Gateway<<E>>" as Gateway
participant "SttController" as Controller
participant "SttService" as Service
participant "AudioStreamManager" as StreamManager
participant "SpeakerIdentifier" as Speaker
participant "Azure Speech<<E>>" as Speech
participant "SttRepository" as Repository
database "PostgreSQL<<E>>" as DB
queue "Event Hub<<E>>" as EventHub
Gateway -> Controller: POST /api/v1/stt/start-recording\n{meetingId, userId}
activate Controller
Controller -> Service: startRecording(meetingId, userId)
activate Service
Service -> Repository: findMeetingById(meetingId)
activate Repository
Repository -> DB: Look up meeting info\n(by meeting ID)
DB --> Repository: meeting data
Repository --> Service: Meeting entity
deactivate Repository
Service -> StreamManager: initializeStream(meetingId)
activate StreamManager
StreamManager -> Speech: createRecognizer()\n(Azure Speech API)
note right
Azure Speech settings:
- Language: ko-KR
- Format: PCM 16kHz
- Continuous recognition
end note
Speech --> StreamManager: recognizer instance
StreamManager --> Service: stream session
deactivate StreamManager
Service -> Speaker: identifySpeaker(audioFrame)
activate Speaker
Speaker -> Speech: analyzeSpeakerProfile()\n(Speaker Recognition API)
note right
Speaker identification:
- Generate voice signature
- Match against existing profiles
- Auto-register new speakers
end note
Speech --> Speaker: speakerId
Speaker --> Service: speaker info
deactivate Speaker
Service -> Repository: saveSttSession(session)
activate Repository
Repository -> DB: Save STT session\n(meeting ID, status, start time)
DB --> Repository: session saved
Repository --> Service: SttSession entity
deactivate Repository
Service -> EventHub: publish(SttStartedEvent)
note right
Event:
- meetingId
- sessionId
- startedAt
end note
Service --> Controller: RecordingStartResponse\n{sessionId, status}
deactivate Service
Controller --> Gateway: 200 OK\n{sessionId, streamUrl}
deactivate Controller
== Audio streaming processing ==
Gateway -> Controller: WebSocket /ws/stt/{sessionId}\n[audio stream]
activate Controller
Controller -> Service: processAudioStream(sessionId, audioData)
activate Service
Service -> StreamManager: streamAudio(audioData)
activate StreamManager
StreamManager -> Speech: recognizeAsync(audioData)
Speech --> StreamManager: partial result
note right
Real-time recognition:
- Partial text
- Confidence score
- Timestamp
end note
StreamManager --> Service: recognized text
deactivate StreamManager
Service -> Speaker: updateSpeakerMapping(text, timestamp)
activate Speaker
Speaker --> Service: speaker segment
deactivate Speaker
Service -> Repository: saveSttSegment(segment)
activate Repository
Repository -> DB: Save STT segment\n(session ID, text, speaker ID, timestamp)
DB --> Repository: segment saved
Repository --> Service: saved
deactivate Repository
Service --> Controller: streaming response
deactivate Service
Controller --> Gateway: WebSocket message\n{text, speaker, timestamp}
deactivate Controller
@enduml
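
The start-recording half of the sequence above can be sketched in Python to make the call order concrete. This is only an illustration under stated assumptions: the in-memory repository, the `publish` callable standing in for Event Hub, the `RECORDING` status value, and the `LookupError` raised for a missing meeting are all hypothetical and not taken from the project. The Azure Speech stream setup and speaker identification steps are left out because they depend on external services.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


@dataclass
class SttSession:
    session_id: str
    meeting_id: str
    status: str
    started_at: datetime


@dataclass
class InMemoryRepository:
    """Hypothetical stand-in for SttRepository backed by PostgreSQL."""
    meetings: dict
    sessions: dict = field(default_factory=dict)

    def find_meeting_by_id(self, meeting_id):
        return self.meetings.get(meeting_id)

    def save_stt_session(self, session):
        self.sessions[session.session_id] = session
        return session


class SttService:
    def __init__(self, repository, publish):
        self.repository = repository
        self.publish = publish  # assumed: any callable that accepts an event dict

    def start_recording(self, meeting_id, user_id):
        # Step 1: look up the meeting (findMeetingById in the diagram).
        # The diagram shows no error path; raising here is an assumption.
        if self.repository.find_meeting_by_id(meeting_id) is None:
            raise LookupError(f"meeting {meeting_id} not found")

        # Steps 2 and 3 (initializeStream, identifySpeaker) call Azure Speech
        # and are omitted from this sketch. user_id is accepted to mirror the
        # diagram's signature but is not used here.

        # Step 4: persist the session (saveSttSession).
        session = SttSession(
            session_id=str(uuid4()),
            meeting_id=meeting_id,
            status="RECORDING",  # assumed status value
            started_at=datetime.now(timezone.utc),
        )
        self.repository.save_stt_session(session)

        # Step 5: publish SttStartedEvent with the fields listed in the note.
        self.publish({
            "type": "SttStartedEvent",
            "meetingId": session.meeting_id,
            "sessionId": session.session_id,
            "startedAt": session.started_at.isoformat(),
        })

        # Step 6: RecordingStartResponse {sessionId, status}.
        return {"sessionId": session.session_id, "status": session.status}
```

Passing a plain list's `append` method as `publish` is enough to inspect which events the service would emit, without a real broker.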