hgzero/design/backend/sequence/inner/stt-음성녹음인식.puml
yabo0812 d55fcfc1bd 내부 시퀀스 설계 완료 (25개 시나리오)
전체 5개 마이크로서비스의 내부 처리 흐름을 상세히 설계

[추가된 파일]
- Meeting Service: 6개 시나리오 (검증완료, 실시간수정동기화, 최종회의록확정, 충돌해결, 템플릿선택, 회의록목록조회)
- STT Service: 2개 시나리오 (음성녹음인식, 텍스트변환)
- User Service: 2개 시나리오 (사용자인증, 대시보드조회)
- Notification Service: 1개 시나리오 (알림발송)

[주요 설계 내용]
- Clean Architecture 적용 (Controller → Service → Domain → Repository)
- Cache-Aside 패턴 (Redis 기반 성능 최적화)
- Event-Driven Architecture (Azure Event Hub)
- Real-time Collaboration (WebSocket + OT 알고리즘)
- RAG 기능 (맥락 기반 AI)

[검증 결과]
- PlantUML 문법 검증: 모든 파일 통과 
- 유저스토리 매칭: 100% 일치 
- 아키텍처 패턴 준수: 완료 

[병렬 처리]
- 서브 에이전트 3개로 병렬 작업 수행
- Meeting Service, AI Service, STT/User/Notification 동시 설계

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-22 18:21:15 +09:00

118 lines
3.1 KiB
Plaintext

@startuml
!theme mono
title 음성녹음 및 화자 식별 내부 시퀀스 (UFR-STT-010)
participant "API Gateway<<E>>" as Gateway
participant "SttController" as Controller
participant "SttService" as Service
participant "AudioStreamManager" as StreamManager
participant "SpeakerIdentifier" as Speaker
participant "Azure Speech<<E>>" as Speech
participant "SttRepository" as Repository
database "PostgreSQL<<E>>" as DB
queue "Event Hub<<E>>" as EventHub
Gateway -> Controller: POST /api/v1/stt/start-recording\n{meetingId, userId}
activate Controller
Controller -> Service: startRecording(meetingId, userId)
activate Service
Service -> Repository: findMeetingById(meetingId)
activate Repository
Repository -> DB: SELECT * FROM meetings\nWHERE meeting_id = ?
DB --> Repository: meeting data
Repository --> Service: Meeting entity
deactivate Repository
Service -> StreamManager: initializeStream(meetingId)
activate StreamManager
StreamManager -> Speech: createRecognizer()\n(Azure Speech API)
note right
Azure Speech 설정:
- Language: ko-KR
- Format: PCM 16kHz
- Continuous recognition
end note
Speech --> StreamManager: recognizer instance
StreamManager --> Service: stream session
deactivate StreamManager
Service -> Speaker: identifySpeaker(audioFrame)
activate Speaker
Speaker -> Speech: analyzeSpeakerProfile()\n(Speaker Recognition API)
note right
화자 식별:
- Voice signature 생성
- 기존 프로필과 매칭
- 신규 화자 자동 등록
end note
Speech --> Speaker: speakerId
Speaker --> Service: speaker info
deactivate Speaker
Service -> Repository: saveSttSession(session)
activate Repository
Repository -> DB: INSERT INTO stt_sessions\n(meeting_id, status, started_at)
DB --> Repository: session saved
Repository --> Service: SttSession entity
deactivate Repository
Service -> EventHub: publish(SttStartedEvent)
note right
Event:
- meetingId
- sessionId
- startedAt
end note
Service --> Controller: RecordingStartResponse\n{sessionId, status}
deactivate Service
Controller --> Gateway: 200 OK\n{sessionId, streamUrl}
deactivate Controller
== 음성 스트리밍 처리 ==
Gateway -> Controller: WebSocket /ws/stt/{sessionId}\n[audio stream]
activate Controller
Controller -> Service: processAudioStream(sessionId, audioData)
activate Service
Service -> StreamManager: streamAudio(audioData)
activate StreamManager
StreamManager -> Speech: recognizeAsync(audioData)
Speech --> StreamManager: partial result
note right
실시간 인식:
- Partial text
- Confidence score
- Timestamp
end note
StreamManager --> Service: recognized text
deactivate StreamManager
Service -> Speaker: updateSpeakerMapping(text, timestamp)
activate Speaker
Speaker --> Service: speaker segment
deactivate Speaker
Service -> Repository: saveSttSegment(segment)
activate Repository
Repository -> DB: INSERT INTO stt_segments\n(session_id, text, speaker_id, timestamp)
DB --> Repository: segment saved
Repository --> Service: saved
deactivate Repository
Service --> Controller: streaming response
deactivate Service
Controller --> Gateway: WebSocket message\n{text, speaker, timestamp}
deactivate Controller
@enduml