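Cloud Architecture Pattern Application Plan

1. Document Overview

1.1 Purpose

This document defines the cloud design patterns to be applied to the microservice architecture of the service for improving the writing and sharing of meeting minutes.

1.2 Scope

This document covers how the following six core cloud design patterns are applied: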

API Gateway
Queue-Based Load Leveling
Cache-Aside
Publisher-Subscriber
Asynchronous Request-Reply
Health Endpoint Monitoring

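1.3 Reference Documents

2. Overview of Applied Patterns

2.1 Pattern Classification

Category	Pattern	Priority	Primary Purpose
Design & Implementation	API Gateway	High	Single entry point, routing, authentication
Messaging	Queue-Based Load Leveling	High	Load leveling, asynchronous processing
Data Management	Cache-Aside	High	Performance optimization, reduced DB load
Messaging	Publisher-Subscriber	Medium	Event-driven communication
Messaging	Asynchronous Request-Reply	Medium	Asynchronous handling of long-running tasks
Management & Monitoring	Health Endpoint Monitoring	High	Service health monitoring

2.2 Pattern Application Matrix by Service

Service	API Gateway	Queue-Based	Cache-Aside	Pub-Sub	Async Request	Health Monitor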
User Service	✅	-	✅	✅	-	✅
Meeting Service	✅	✅	✅	✅	-	✅
Transcript Service	✅	✅	✅	✅	✅	✅
AI Service	✅	✅	✅	✅	✅	✅
Notification Service	✅	✅	-	✅	-	✅
Todo Service	✅	-	✅	✅	-	✅

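3. Detailed Application Plan by Pattern

3.1 API Gateway

3.1.1 Pattern Overview

Problem: When clients communicate directly with multiple microservices, complexity increases and security becomes difficult to manage.

Solution: Provide a single entry point that handles all client requests, centralizing routing, authentication, logging, and other cross-cutting concerns.

3.1.2 Application Plan

Implementation Technologies

Kong Gateway or Spring Cloud Gateway
JWT-based authentication/authorization
Rate limiting and throttling

Key Features

API Gateway responsibilities:
  Routing:
    - /api/users/* → User Service
    - /api/meetings/* → Meeting Service
    - /api/transcripts/* → Transcript Service
    - /api/ai/* → AI Service
    - /api/notifications/* → Notification Service
    - /api/todos/* → Todo Service

  Authentication/authorization:
    - JWT token validation
    - User permission checks
    - Meeting participant verification

  Additional features:
    - Request/response logging
    - Rate limiting (per-user request limits)
    - Request/response transformation
    - CORS handling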
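
The routing rules listed above amount to a prefix lookup. The following is a minimal sketch in plain Java, not the Kong or Spring Cloud Gateway implementation; the class name `GatewayRouter` and the service identifiers are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical prefix router; the route table mirrors the routing list above.
class GatewayRouter {
    private final Map<String, String> routes = new LinkedHashMap<>();

    GatewayRouter() {
        routes.put("/api/users", "user-service");
        routes.put("/api/meetings", "meeting-service");
        routes.put("/api/transcripts", "transcript-service");
        routes.put("/api/ai", "ai-service");
        routes.put("/api/notifications", "notification-service");
        routes.put("/api/todos", "todo-service");
    }

    // Returns the target service for a request path, or empty if no route matches.
    Optional<String> resolve(String path) {
        for (Map.Entry<String, String> route : routes.entrySet()) {
            String prefix = route.getKey();
            // Match the prefix itself or anything below it, but not "/api/usersX".
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                return Optional.of(route.getValue());
            }
        }
        return Optional.empty();
    }
}
```

For example, `new GatewayRouter().resolve("/api/users/login")` yields `user-service`, while an unknown path yields an empty result that the gateway would answer with 404.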

Application Scenarios

User login flow:
1. Frontend → API Gateway: POST /api/users/login
2. API Gateway → User Service: route the request
3. User Service → API Gateway: return the JWT
4. API Gateway → Frontend: forward the token and log the request

Meeting creation flow:
1. Frontend → API Gateway: POST /api/meetings (with JWT)
2. API Gateway: validate the JWT
3. API Gateway → Meeting Service: route the request
4. Meeting Service → API Gateway: return the meeting details
5. API Gateway → Frontend: forward the response

3.1.3 Implementation Examples

Kong Gateway Configuration (YAML)

services:
  - name: user-service
    url: http://user-service:8080
    routes:
      - name: user-routes
        paths:
          - /api/users
        methods:
          - GET
          - POST
          - PUT
          - DELETE
    plugins:
      - name: jwt
      - name: rate-limiting
        config:
          minute: 100
          hour: 1000

  - name: meeting-service
    url: http://meeting-service:8080
    routes:
      - name: meeting-routes
        paths:
          - /api/meetings
        methods:
          - GET
          - POST
          - PUT
          - DELETE
    plugins:
      - name: jwt
      - name: rate-limiting
        config:
          minute: 200
          hour: 2000

Spring Cloud Gateway Configuration (application.yml)

spring:
  cloud:
    gateway:
      routes:
        - id: user-service
          uri: lb://USER-SERVICE
          predicates:
            - Path=/api/users/**
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 10
                redis-rate-limiter.burstCapacity: 20

        - id: meeting-service
          uri: lb://MEETING-SERVICE
          predicates:
            - Path=/api/meetings/**
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 20
                redis-rate-limiter.burstCapacity: 40
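
The `replenishRate` and `burstCapacity` settings above are token-bucket parameters: the bucket refills at `replenishRate` tokens per second and holds at most `burstCapacity` tokens. The sketch below is an in-memory illustration only; Spring Cloud Gateway's `RedisRateLimiter` keeps this state in Redis, and the class name `TokenBucket` and its explicit-clock signature are assumptions.

```java
// Hypothetical in-memory token bucket illustrating replenishRate / burstCapacity.
class TokenBucket {
    private final double replenishRate;   // tokens added per second
    private final double burstCapacity;   // maximum tokens the bucket can hold
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(double replenishRate, double burstCapacity, long nowNanos) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;      // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    // Tries to take one token at the given time; false means the request
    // should be rejected (a gateway would typically answer 429 Too Many Requests).
    synchronized boolean tryAcquire(long nowNanos) {
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(burstCapacity, tokens + elapsedSeconds * replenishRate);
        lastRefillNanos = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

With `replenishRate` 10 and `burstCapacity` 20, a client can send 20 requests at once, after which it is limited to 10 requests per second on average until the bucket refills.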

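3.1.4 Caveats

Deploy the API Gateway in a high-availability (HA) configuration so that it does not become a single point of failure (SPOF)
Keep the gateway horizontally scalable so that its own throughput does not become a bottleneck
Do not put business logic in the gateway; limit it to routing and common cross-cutting functions

3.2 Queue-Based Load Leveling

3.2.1 Pattern Overview

Problem: When traffic spikes, services become overloaded, causing slow responses or failures.

Solution: Use a message queue to buffer requests so that the service consumes them at a rate it can sustain.

3.2.2 Application Plan

Implementation Technologies

RabbitMQ or Apache Kafka
Spring AMQP / Spring Kafka

Target Services

Target services:
  Meeting Service:
    - Bulk meeting creation requests
    - Post-processing after a meeting ends

  Transcript Service:
    - Meeting minutes generation requests (STT processing)
    - Per-section meeting minutes verification requests

  AI Service:
    - AI schedule generation requests
    - AI summary generation requests
    - High-volume AI processing requests

  Notification Service:
    - Notification delivery requests (email, SMS)
    - Bulk notification delivery

Application Scenarios

Meeting minutes generation flow:
1. Frontend → Meeting Service: request to end the meeting
2. Meeting Service → Queue: publish a minutes generation message
3. Transcript Service: consume messages from the queue (rate-controlled)
4. Transcript Service: run STT and generate the meeting minutes
5. Transcript Service → Meeting Service: completion notice

AI schedule generation flow:
1. Frontend → Meeting Service: request AI schedule generation
2. Meeting Service → Queue: publish an AI processing message
3. AI Service: consume messages from the queue (rate-controlled)
4. AI Service: run the AI analysis and generate the schedule
5. AI Service → Notification Service: send the completion notification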
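
The buffering behaviour in these flows can be sketched without a broker. The following in-process stand-in uses a bounded `BlockingQueue`; the class name `LoadLevelingBuffer` and its methods are assumptions for illustration, and a real deployment would use the RabbitMQ or Kafka queue instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical in-process stand-in for the broker queue.
class LoadLevelingBuffer {
    private final BlockingQueue<String> queue;

    LoadLevelingBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Producer side: never blocks; returns false when the buffer is full
    // so the caller can reject the request or retry later.
    boolean submit(String request) {
        return queue.offer(request);
    }

    // Consumer side: takes at most batchSize requests per call,
    // which caps how fast work reaches the downstream service.
    List<String> drain(int batchSize) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch, batchSize);
        return batch;
    }

    int backlog() {
        return queue.size();
    }
}
```

This sketch rejects new requests when full. RabbitMQ behaves differently by default: when `x-max-length` is exceeded it drops messages from the head of the queue, and rejecting new publishes requires the `x-overflow: reject-publish` queue argument.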

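3.2.3 Implementation Examples

RabbitMQ Queue Definitions

import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.QueueBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class QueueConfig {

    // Meeting minutes generation queue
    @Bean
    public Queue transcriptQueue() {
        return QueueBuilder.durable("transcript.creation.queue")
            .withArgument("x-max-length", 1000)  // maximum number of messages
            .withArgument("x-message-ttl", 3600000)  // 1-hour TTL
            .withArgument("x-max-priority", 10)  // needed for the per-message priority set by the producer to take effect
            .build();
    }

    // AI processing queue
    @Bean
    public Queue aiProcessingQueue() {
        return QueueBuilder.durable("ai.processing.queue")
            .withArgument("x-max-length", 500)
            .withArgument("x-message-ttl", 7200000)  // 2-hour TTL
            .build();
    }

    // Notification delivery queue
    @Bean
    public Queue notificationQueue() {
        return QueueBuilder.durable("notification.queue")
            .withArgument("x-max-length", 2000)
            .withArgument("x-message-ttl", 1800000)  // 30-minute TTL
            .build();
    }
}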

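Producer Example (Meeting Service)

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Service;

@Slf4j  // provides the `log` field used below
@Service
@RequiredArgsConstructor
public class TranscriptQueueProducer {

    private final RabbitTemplate rabbitTemplate;

    public void sendTranscriptCreationRequest(TranscriptCreationMessage message) {
        // Sent to the default exchange, so the routing key is the queue name
        rabbitTemplate.convertAndSend(
            "transcript.creation.queue",
            message,
            msg -> {
                // Priority is honored only if the queue declares x-max-priority
                msg.getMessageProperties().setPriority(message.getPriority());
                msg.getMessageProperties().setExpiration("3600000"); // 1 hour, in milliseconds
                return msg;
            }
        );
        log.info("Transcript creation request sent: meetingId={}", message.getMeetingId());
    }
}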

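Consumer Example (Transcript Service)

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.amqp.AmqpRejectAndDontRequeueException;
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.stereotype.Service;

@Slf4j  // provides the `log` field used below
@Service
@RequiredArgsConstructor
public class TranscriptQueueConsumer {

    private final TranscriptService transcriptService;

    @RabbitListener(
        queues = "transcript.creation.queue",
        concurrency = "3-10"  // concurrent consumers (minimum 3, maximum 10)
    )
    public void handleTranscriptCreation(TranscriptCreationMessage message) {
        try {
            log.info("Processing transcript creation: meetingId={}", message.getMeetingId());
            transcriptService.createTranscript(message);
            log.info("Transcript creation completed: meetingId={}", message.getMeetingId());
        } catch (Exception e) {
            log.error("Transcript creation failed: meetingId={}", message.getMeetingId(), e);
            // Rejected without requeue: dead-lettered if the queue has a DLX, otherwise dropped
            throw new AmqpRejectAndDontRequeueException("Failed to process transcript", e);
        }
    }
}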

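3.2.4 Caveats

Define a strategy for when the message queue fills up (reject, wait, or priority-based handling)
Configure a Dead Letter Queue so that failed messages are handled separately
Tune the consumer concurrency setting to make efficient use of resources

3.3 Cache-Aside

3.3.1 Pattern Overview

Problem: Repeated database reads degrade performance and increase DB load.

Solution: The application checks the cache first; on a cache miss it reads from the DB and stores the result in the cache.

3.3.2 Application Plan

Implementation Technologies

Redis (distributed cache)
Spring Cache Abstraction
Caffeine (local cache, optional)

Cached Data

Cache targets:
  User Service:
    - User profile information (TTL: 30 minutes)
    - User permission information (TTL: 1 hour)

  Meeting Service:
    - Basic meeting information (TTL: 10 minutes)
    - Meeting participant list (TTL: 10 minutes)
    - Meeting template list (TTL: 1 hour)

  Transcript Service:
    - Meeting minutes lookups (TTL: 30 minutes)
    - Meeting minutes section information (TTL: 30 minutes)

  Todo Service:
    - Per-user Todo list (TTL: 5 minutes)
    - Todo progress statistics (TTL: 5 minutes)

Cache Strategy

Cache policy:
  Read-heavy data:
    - Pattern: Cache-Aside
    - Strategy: Lazy Loading
    - Examples: user profile and meeting information lookups

  Write-heavy data:
    - Pattern: Write-Through + Cache-Aside
    - Strategy: invalidate the cache on update
    - Examples: Todo status changes, meeting information edits

  Cache invalidation:
    - Invalidate immediately when data changes
    - Automatic expiry via TTL
    - Provide an explicit deletion API

Application Scenarios

User profile lookup:
1. User Service: look up the user profile in Redis
2. Cache hit → return the cached data
3. Cache miss → query the DB → store in Redis (TTL: 30 minutes) → return the data

Meeting information update:
1. Meeting Service: update the meeting information (DB)
2. Meeting Service: delete the meeting's cache entry from Redis
3. The next lookup repopulates the cache with fresh data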
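
The two scenarios above (read with lazy loading, write with invalidation) can be sketched in plain Java. This is an in-memory illustration only: a map stands in for Redis, the loader function stands in for the DB query, and the class name `CacheAside` and the injectable clock are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical in-memory cache-aside helper; `entries` plays the role of Redis.
class CacheAside<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;  // injected so expiry can be tested deterministically

    CacheAside(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Read path: return the cached value on a hit, otherwise load and store it.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && cached.expiresAtMillis > now) {
            return cached.value;                  // cache hit
        }
        V loaded = loader.apply(key);             // cache miss: read from the database
        entries.put(key, new Entry<>(loaded, now + ttlMillis));
        return loaded;
    }

    // Write path: after updating the database, drop the stale entry.
    void evict(K key) {
        entries.remove(key);
    }
}
```

A caller would use `cache.get(userId, id -> loadProfileFromDb(id))` for reads and `cache.evict(userId)` right after a successful update, where `loadProfileFromDb` is a placeholder for the repository call.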

3.3.3 Implementation Examples

Redis Configuration (application.yml)

spring:
  data:
    redis:
      host: localhost
      port: 6379
      password: ${REDIS_PASSWORD}
      lettuce:
        pool:
          max-active: 10
          max-idle: 5
          min-idle: 2

  cache:
    type: redis
    redis:
      time-to-live: 1800000  # default TTL: 30 minutes (in milliseconds)
      cache-null-values: false
      use-key-prefix: true

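Cache Configuration

import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.cache.RedisCacheConfiguration;
import org.springframework.data.redis.cache.RedisCacheManager;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.RedisSerializationContext;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public RedisCacheConfiguration cacheConfiguration() {
        return RedisCacheConfiguration.defaultCacheConfig()
            .entryTtl(Duration.ofMinutes(30))
            .serializeKeysWith(
                RedisSerializationContext.SerializationPair.fromSerializer(
                    new StringRedisSerializer()
                )
            )
            .serializeValuesWith(
                RedisSerializationContext.SerializationPair.fromSerializer(
                    new GenericJackson2JsonRedisSerializer()
                )
            );
    }

    @Bean
    public RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
        Map<String, RedisCacheConfiguration> cacheConfigurations = new HashMap<>();

        // Derive each per-cache configuration from cacheConfiguration() so it keeps
        // the JSON serializers; defaultCacheConfig() would fall back to JDK serialization.

        // User profile cache (TTL: 30 minutes)
        cacheConfigurations.put("userProfile",
            cacheConfiguration().entryTtl(Duration.ofMinutes(30)));

        // Meeting information cache (TTL: 10 minutes)
        cacheConfigurations.put("meetingInfo",
            cacheConfiguration().entryTtl(Duration.ofMinutes(10)));

        // Todo list cache (TTL: 5 minutes)
        cacheConfigurations.put("todoList",
            cacheConfiguration().entryTtl(Duration.ofMinutes(5)));

        return RedisCacheManager.builder(connectionFactory)
            .cacheDefaults(cacheConfiguration())
            .withInitialCacheConfigurations(cacheConfigurations)
            .build();
    }
}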

Applying the Cache in the Service Layer

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Slf4j  // provides the `log` field used below
@Service
@RequiredArgsConstructor
public class UserService {

    private final UserRepository userRepository;

    // Cache-Aside read: the method body runs only on a cache miss
    @Cacheable(value = "userProfile", key = "#userId")
    public UserProfileDto getUserProfile(Long userId) {
        log.info("Cache miss - Loading user profile from DB: userId={}", userId);
        User user = userRepository.findById(userId)
            .orElseThrow(() -> new UserNotFoundException(userId));
        return UserProfileDto.from(user);
    }

    // Cache invalidation on update
    @CacheEvict(value = "userProfile", key = "#userId")
    public UserProfileDto updateUserProfile(Long userId, UpdateUserProfileRequest request) {
        log.info("Updating user profile and evicting cache: userId={}", userId);
        User user = userRepository.findById(userId)
            .orElseThrow(() -> new UserNotFoundException(userId));
        user.updateProfile(request);
        userRepository.save(user);
        return UserProfileDto.from(user);
    }

    // Cache invalidation on delete
    @CacheEvict(value = "userProfile", key = "#userId")
    public void deleteUser(Long userId) {
        log.info("Deleting user and evicting cache: userId={}", userId);
        userRepository.deleteById(userId);
    }
}

Caching Meeting Information

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Slf4j  // provides the `log` field used below
@Service
@RequiredArgsConstructor
public class MeetingService {

    private final MeetingRepository meetingRepository;

    // Runs only on a cache miss; the result is stored under the meeting ID
    @Cacheable(value = "meetingInfo", key = "#meetingId")
    public MeetingDto getMeeting(Long meetingId) {
        log.info("Cache miss - Loading meeting from DB: meetingId={}", meetingId);
        Meeting meeting = meetingRepository.findById(meetingId)
            .orElseThrow(() -> new MeetingNotFoundException(meetingId));
        return MeetingDto.from(meeting);
    }

    // Evicts the cached entry so the next read reloads it from the DB
    @CacheEvict(value = "meetingInfo", key = "#meetingId")
    public MeetingDto updateMeeting(Long meetingId, UpdateMeetingRequest request) {
        log.info("Updating meeting and evicting cache: meetingId={}", meetingId);
        Meeting meeting = meetingRepository.findById(meetingId)
            .orElseThrow(() -> new MeetingNotFoundException(meetingId));
        meeting.update(request);
        meetingRepository.save(meeting);
        return MeetingDto.from(meeting);
    }
}

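3.3.4 Caveats

Define a strategy for keeping the cache and the DB consistent
Limit cache size and manage memory (configure an eviction policy)
Handle hot keys (excessive access to a particular key)
Prevent cache stampedes (DB load caused by many simultaneous cache misses)

3.4 Publisher-Subscriber (Pub-Sub)

3.4.1 Pattern Overview

Problem: Tight coupling between services reduces scalability and makes event-driven communication difficult.

Solution: Publish events through a message broker and let interested services subscribe to and process them.

3.4.2 Application Plan

Implementation Technologies

RabbitMQ (exchange/topic based) or Apache Kafka (topic based)
Spring Cloud Stream

Events

Domain events:
  User Events:
    - UserCreated: a user was created
    - UserUpdated: user information was changed
    - UserDeleted: a user was deleted

  Meeting Events:
    - MeetingCreated: a meeting was created
    - MeetingStarted: a meeting started
    - MeetingEnded: a meeting ended
    - MeetingCancelled: a meeting was cancelled

  Transcript Events:
    - TranscriptCreated: meeting minutes generation completed
    - TranscriptVerified: meeting minutes verification completed
    - TranscriptShared: meeting minutes were shared

  Todo Events:
    - TodoCreated: a Todo was created
    - TodoStatusChanged: a Todo's status changed
    - TodoCompleted: a Todo was completed

Subscriber Matrix

Event subscriptions:
  MeetingCreated:
    - Notification Service: notify participants
    - AI Service: prepare meeting schedule analysis

  MeetingEnded:
    - Transcript Service: start generating the meeting minutes
    - Notification Service: send the end-of-meeting notification
    - Todo Service: extract Todos from the meeting

  TranscriptCreated:
    - Notification Service: send the minutes-ready notification
    - Meeting Service: update the meeting status
    - AI Service: start AI analysis

  TodoCreated:
    - Notification Service: notify the assignee

  TodoCompleted:
    - Notification Service: send the completion notification
    - Meeting Service: update the meeting progress

Application Scenarios

Post-meeting flow:
1. Meeting Service: publishes the MeetingEnded event
2. Transcript Service: receives the event → starts generating the meeting minutes
3. Notification Service: receives the event → sends the end-of-meeting notification to participants
4. Todo Service: receives the event → extracts Todos from the meeting content
5. AI Service: receives the event → prepares AI analysis

Minutes generation completed flow:
1. Transcript Service: publishes the TranscriptCreated event
2. Notification Service: receives the event → sends the minutes-ready notification
3. Meeting Service: receives the event → updates the meeting status to "minutes generated"
4. AI Service: receives the event → starts AI analysis of the meeting minutes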

3.4.3 Implementation Examples

RabbitMQ Exchange/Queue Configuration

import org.springframework.amqp.core.Binding;
import org.springframework.amqp.core.BindingBuilder;
import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.TopicExchange;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class RabbitMQConfig {

    // Topic exchange definitions
    @Bean
    public TopicExchange meetingExchange() {
        return new TopicExchange("meeting.events");
    }

    @Bean
    public TopicExchange transcriptExchange() {
        return new TopicExchange("transcript.events");
    }

    @Bean
    public TopicExchange todoExchange() {
        return new TopicExchange("todo.events");
    }

    // Queues for the Notification Service
    @Bean
    public Queue notificationMeetingQueue() {
        return new Queue("notification.meeting.queue", true);
    }

    @Bean
    public Queue notificationTranscriptQueue() {
        return new Queue("notification.transcript.queue", true);
    }

    // Queue for the Transcript Service
    @Bean
    public Queue transcriptMeetingQueue() {
        return new Queue("transcript.meeting.queue", true);
    }

    // Queue for the Todo Service
    @Bean
    public Queue todoMeetingQueue() {
        return new Queue("todo.meeting.queue", true);
    }

    // Queue for the AI Service
    @Bean
    public Queue aiTranscriptQueue() {
        return new Queue("ai.transcript.queue", true);
    }

    // Bindings
    @Bean
    public Binding bindingNotificationMeeting() {
        return BindingBuilder
            .bind(notificationMeetingQueue())
            .to(meetingExchange())
            .with("meeting.*");  // every meeting event: meeting.created, meeting.ended, and so on
    }

    // Without a binding the queue declared above would never receive a message;
    // the routing key follows the subscriber matrix (TranscriptCreated → Notification Service).
    @Bean
    public Binding bindingNotificationTranscript() {
        return BindingBuilder
            .bind(notificationTranscriptQueue())
            .to(transcriptExchange())
            .with("transcript.created");
    }

    @Bean
    public Binding bindingTranscriptMeeting() {
        return BindingBuilder
            .bind(transcriptMeetingQueue())
            .to(meetingExchange())
            .with("meeting.ended");  // subscribes to the meeting-ended event only
    }

    @Bean
    public Binding bindingTodoMeeting() {
        return BindingBuilder
            .bind(todoMeetingQueue())
            .to(meetingExchange())
            .with("meeting.ended");  // subscribes to the meeting-ended event only
    }

    @Bean
    public Binding bindingAiTranscript() {
        return BindingBuilder
            .bind(aiTranscriptQueue())
            .to(transcriptExchange())
            .with("transcript.created");  // subscribes to the minutes-created event only
    }
}
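
The bindings above depend on how a topic exchange compares a binding pattern with a routing key: keys are split into dot-separated words, `*` matches exactly one word, and `#` matches zero or more words. The helper below is a sketch of that rule for reasoning about which queue receives which event; `TopicMatcher` is a hypothetical class, not part of Spring AMQP or RabbitMQ.

```java
// Hypothetical helper that mimics topic-exchange pattern matching.
class TopicMatcher {

    static boolean matches(String pattern, String routingKey) {
        return match(pattern.split("\\."), 0, routingKey.split("\\."), 0);
    }

    private static boolean match(String[] p, int i, String[] k, int j) {
        if (i == p.length) {
            return j == k.length;            // pattern exhausted: key must be too
        }
        if (p[i].equals("#")) {
            // '#' may absorb zero or more words of the routing key.
            for (int next = j; next <= k.length; next++) {
                if (match(p, i + 1, k, next)) {
                    return true;
                }
            }
            return false;
        }
        if (j == k.length) {
            return false;                    // key exhausted but pattern is not
        }
        if (p[i].equals("*") || p[i].equals(k[j])) {
            return match(p, i + 1, k, j + 1); // '*' or a literal word consumes one word
        }
        return false;
    }
}
```

Under this rule `meeting.*` delivers `meeting.created` and `meeting.ended` to the notification queue, but it would not match a three-word key such as `meeting.ended.v2`; a `meeting.#` pattern would.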

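Publisher Example (Meeting Service)

import java.time.LocalDateTime;
import java.util.Date;
import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Service;

// Assumes the RabbitTemplate uses a Jackson2JsonMessageConverter; the default
// SimpleMessageConverter only handles String, byte[], and Serializable payloads.
@Slf4j  // provides the `log` field used below
@Service
@RequiredArgsConstructor
public class MeetingEventPublisher {

    private final RabbitTemplate rabbitTemplate;

    public void publishMeetingEnded(MeetingEndedEvent event) {
        rabbitTemplate.convertAndSend(
            "meeting.events",
            "meeting.ended",
            event,
            message -> {
                message.getMessageProperties().setContentType("application/json");
                message.getMessageProperties().setTimestamp(new Date());
                return message;
            }
        );
        log.info("Published MeetingEnded event: meetingId={}", event.getMeetingId());
    }

    public void publishMeetingCreated(MeetingCreatedEvent event) {
        rabbitTemplate.convertAndSend(
            "meeting.events",
            "meeting.created",
            event
        );
        log.info("Published MeetingCreated event: meetingId={}", event.getMeetingId());
    }
}

// Event model
@Getter
@AllArgsConstructor
@NoArgsConstructor
public class MeetingEndedEvent {
    private Long meetingId;
    private String meetingTitle;
    private LocalDateTime startTime;
    private LocalDateTime endTime;
    private List<Long> participantIds;
    private String audioFileUrl;
    private LocalDateTime occurredAt;
}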

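Subscriber Example (Transcript Service)

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.amqp.AmqpRejectAndDontRequeueException;
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.stereotype.Service;

@Slf4j  // provides the `log` field used below
@Service
@RequiredArgsConstructor
public class TranscriptEventSubscriber {

    private final TranscriptService transcriptService;

    @RabbitListener(queues = "transcript.meeting.queue")
    public void handleMeetingEnded(MeetingEndedEvent event) {
        try {
            log.info("Received MeetingEnded event: meetingId={}", event.getMeetingId());

            // Start generating the meeting minutes
            transcriptService.createTranscript(
                event.getMeetingId(),
                event.getAudioFileUrl(),
                event.getParticipantIds()
            );

            log.info("Transcript creation started: meetingId={}", event.getMeetingId());
        } catch (Exception e) {
            log.error("Failed to handle MeetingEnded event: meetingId={}",
                event.getMeetingId(), e);
            // Rejected without requeue: dead-lettered if the queue has a DLX, otherwise dropped
            throw new AmqpRejectAndDontRequeueException("Failed to process event", e);
        }
    }
}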

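Subscriber Example (Notification Service)

import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.amqp.AmqpRejectAndDontRequeueException;
import org.springframework.amqp.core.Message;
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.stereotype.Service;

@Slf4j  // provides the `log` field used below
@Service
@RequiredArgsConstructor
public class NotificationEventSubscriber {

    private final NotificationService notificationService;
    // Assumes the Spring Boot auto-configured ObjectMapper (with java.time support)
    private final ObjectMapper objectMapper;

    @RabbitListener(queues = "notification.meeting.queue")
    public void handleMeetingEvents(Message message) {
        String routingKey = message.getMessageProperties().getReceivedRoutingKey();

        try {
            // Message.getBody() returns the raw byte[] payload, so it cannot be cast
            // to an event type; deserialize the JSON explicitly instead.
            switch (routingKey) {
                case "meeting.created":
                    handleMeetingCreated(
                        objectMapper.readValue(message.getBody(), MeetingCreatedEvent.class));
                    break;

                case "meeting.ended":
                    handleMeetingEnded(
                        objectMapper.readValue(message.getBody(), MeetingEndedEvent.class));
                    break;

                default:
                    // The "meeting.*" binding also delivers events this service does not handle
                    log.warn("Unknown routing key: {}", routingKey);
            }
        } catch (IOException e) {
            log.error("Failed to deserialize event: routingKey={}", routingKey, e);
            throw new AmqpRejectAndDontRequeueException("Failed to deserialize event", e);
        }
    }

    private void handleMeetingCreated(MeetingCreatedEvent event) {
        log.info("Sending meeting creation notification: meetingId={}",
            event.getMeetingId());
        notificationService.notifyMeetingCreated(
            event.getParticipantIds(),
            event.getMeetingTitle(),
            event.getScheduledTime()
        );
    }

    private void handleMeetingEnded(MeetingEndedEvent event) {
        log.info("Sending meeting end notification: meetingId={}",
            event.getMeetingId());
        notificationService.notifyMeetingEnded(
            event.getParticipantIds(),
            event.getMeetingTitle()
        );
    }
}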

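3.4.4 Caveats

Use Kafka partition keys where event ordering must be guaranteed
Apply the Idempotent Consumer pattern to prevent duplicate processing
Version event schemas and maintain backward compatibility
Use persistent (durable) settings to prevent event loss on failure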
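
The Idempotent Consumer caveat can be sketched as a wrapper that remembers which event IDs it has already handled. This is an in-memory illustration under assumptions: the events would need a unique `eventId` (the event model shown earlier has no such field), and a shared store such as Redis or a database unique constraint would replace the in-memory set when several consumer instances run. The class name `IdempotentConsumer` is hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical wrapper that drops duplicate deliveries of the same event.
class IdempotentConsumer<E> {
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final Consumer<E> handler;

    IdempotentConsumer(Consumer<E> handler) {
        this.handler = handler;
    }

    // Returns true if the event was handled, false if it was a duplicate delivery.
    boolean onEvent(String eventId, E event) {
        // add() is atomic: only the first delivery of an ID gets through.
        if (!processedIds.add(eventId)) {
            return false;
        }
        try {
            handler.accept(event);
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(eventId);   // forget the ID so a redelivery can retry
            throw e;
        }
    }
}
```

A listener would call `onEvent(eventId, event)` instead of invoking the business logic directly, so a broker redelivery of the same event becomes a no-op.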

3.5 Asynchronous Request-Reply

3.5.1 Pattern Overview

Problem: Handling long-running work synchronously makes clients wait a long time and risks dropped connections.

Solution: Process the request asynchronously; the client polls for status or receives the result through a callback.

3.5.2 Application Plan

Implementation Technologies

RabbitMQ (Reply-To Queue)
Redis (state storage)
WebSocket (real-time notification, optional)

Target Tasks

Long-running tasks:
  AI Service:
    - AI schedule generation (estimated time: 30 seconds to 2 minutes)
    - AI meeting minutes summary generation (estimated time: 1 to 5 minutes)
    - Bulk meeting analysis (estimated time: 5 to 30 minutes)

  Transcript Service:
    - Audio file STT conversion (estimated time: 1 to 10 minutes)
    - Verification of large meeting minutes (estimated time: 30 seconds to 3 minutes)

Processing Flow

Asynchronous request-reply flow:
  1. Request phase:
     - Client → API Gateway → AI Service: AI schedule generation request
     - AI Service: generate a task ID and store the state in Redis (status: PENDING)
     - AI Service → Client: return the task ID immediately (202 Accepted)

  2. Processing phase:
     - AI Service: publish the task message to the queue
     - AI Worker: consume the message from the queue and start processing
     - AI Worker: update the Redis state (status: PROCESSING)
     - AI Worker: finish the AI processing
     - AI Worker: update the Redis state (status: COMPLETED, result: {...})

  3. Result retrieval:
     Option 1 - Polling:
       - Client → API Gateway → AI Service: GET /api/ai/tasks/{taskId}
       - AI Service: read the state from Redis and return it

     Option 2 - WebSocket (optional):
       - AI Worker: push the result to the client over WebSocket on completion

Application Scenarios

AI schedule generation request:
1. Frontend → AI Service: POST /api/ai/schedules
   Body: { meetingId: 123, transcriptId: 456 }

2. AI Service:
   - Generate the taskId: "ai-schedule-uuid-12345"
   - Store in Redis: { taskId, status: "PENDING", createdAt: "2025-01-20T10:00:00Z" }
   - Publish to the queue: { taskId, meetingId, transcriptId }

3. AI Service → Frontend:
   Response: 202 Accepted
   Body: {
     taskId: "ai-schedule-uuid-12345",
     status: "PENDING",
     statusUrl: "/api/ai/tasks/ai-schedule-uuid-12345"
   }

4. AI Worker:
   - Receive the message from the queue
   - Update Redis: { taskId, status: "PROCESSING", startedAt: "2025-01-20T10:00:05Z" }
   - Generate the AI schedule (takes about 1 minute)
   - Update Redis: {
       taskId,
       status: "COMPLETED",
       completedAt: "2025-01-20T10:01:05Z",
       result: { scheduleId: 789, schedules: [...] }
     }

5. Frontend (polling):
   - Call GET /api/ai/tasks/ai-schedule-uuid-12345 every 5 seconds
   - Read the result once status is "COMPLETED"
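
The task lifecycle in this scenario (PENDING → PROCESSING → COMPLETED) can be sketched as a small state store. The following is an in-memory illustration in which a map stands in for Redis; the class name `AsyncTaskStore` and its methods are assumptions, and TTL handling is omitted.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory task store; Redis plays this role in the scenario above.
class AsyncTaskStore {
    enum Status { PENDING, PROCESSING, COMPLETED, FAILED }

    static final class Task {
        final String taskId;
        final Status status;
        final Object result;

        Task(String taskId, Status status, Object result) {
            this.taskId = taskId;
            this.status = status;
            this.result = result;
        }
    }

    private final Map<String, Task> tasks = new ConcurrentHashMap<>();

    // Request phase: register the task and hand the ID back to the client.
    String submit() {
        String taskId = "ai-schedule-" + UUID.randomUUID();
        tasks.put(taskId, new Task(taskId, Status.PENDING, null));
        return taskId;
    }

    // Processing phase: the worker moves the task forward.
    void markProcessing(String taskId) {
        transition(taskId, Status.PENDING, Status.PROCESSING, null);
    }

    void markCompleted(String taskId, Object result) {
        transition(taskId, Status.PROCESSING, Status.COMPLETED, result);
    }

    // Result retrieval: what the polling endpoint reads; null means unknown task.
    Task get(String taskId) {
        return tasks.get(taskId);
    }

    private void transition(String taskId, Status expected, Status next, Object result) {
        // compute() makes the check-and-update atomic per task ID.
        tasks.compute(taskId, (id, current) -> {
            if (current == null || current.status != expected) {
                throw new IllegalStateException("Task " + id + " is not " + expected);
            }
            return new Task(id, next, result);
        });
    }
}
```

Rejecting out-of-order transitions is a design choice of this sketch; it guards against a duplicate message moving a finished task back to PROCESSING.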

3.5.3 Implementation Examples

Task Status Model

@Getter
@AllArgsConstructor
@NoArgsConstructor
public class AsyncTaskStatus {
    private String taskId;
    private TaskStatus status;  // PENDING, PROCESSING, COMPLETED, FAILED
    private LocalDateTime createdAt;
    private LocalDateTime startedAt;
    private LocalDateTime completedAt;
    private Object result;  // result, set on completion
    private String errorMessage;  // error message, set on failure

    public enum TaskStatus {
        PENDING,
        PROCESSING,
        COMPLETED,
        FAILED
    }
}

AI Service - Accepting the Request and Returning the Task ID

@RestController
@RequestMapping("/api/ai")
@RequiredArgsConstructor
public class AiScheduleController {

    private final AiScheduleService aiScheduleService;

    @PostMapping("/schedules")
    public ResponseEntity<AsyncTaskResponse> createSchedule(
            @RequestBody @Valid CreateScheduleRequest request) {

        String taskId = aiScheduleService.requestScheduleCreation(request);

        AsyncTaskResponse response = AsyncTaskResponse.builder()
            .taskId(taskId)
            .status(TaskStatus.PENDING)
            .statusUrl("/api/ai/tasks/" + taskId)
            .build();

        return ResponseEntity.accepted()
            .location(URI.create("/api/ai/tasks/" + taskId))
            .body(response);
    }

    @GetMapping("/tasks/{taskId}")
    public ResponseEntity<AsyncTaskStatus> getTaskStatus(@PathVariable String taskId) {
        AsyncTaskStatus status = aiScheduleService.getTaskStatus(taskId);

        if (status == null) {
            return ResponseEntity.notFound().build();
        }

        if (status.getStatus() == TaskStatus.COMPLETED) {
            return ResponseEntity.ok(status);
        } else if (status.getStatus() == TaskStatus.FAILED) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body(status);
        } else {
            // PENDING or PROCESSING
            return ResponseEntity.status(HttpStatus.ACCEPTED).body(status);
        }
    }
}

AI Service - Handling the Task Request

@Slf4j  // Lombok; provides the `log` field used below
@Service
@RequiredArgsConstructor
public class AiScheduleService {

    private final RedisTemplate<String, AsyncTaskStatus> redisTemplate;
    private final RabbitTemplate rabbitTemplate;

    public String requestScheduleCreation(CreateScheduleRequest request) {
        // Generate the task ID
        String taskId = "ai-schedule-" + UUID.randomUUID().toString();

        // Store the initial state in Redis
        AsyncTaskStatus taskStatus = new AsyncTaskStatus(
            taskId,
            TaskStatus.PENDING,
            LocalDateTime.now(),
            null,
            null,
            null,
            null
        );

        redisTemplate.opsForValue().set(
            "task:" + taskId,
            taskStatus,
            Duration.ofHours(24)  // TTL: 24 hours
        );

        // Publish the task message to the queue
        AiScheduleMessage message = new AiScheduleMessage(
            taskId,
            request.getMeetingId(),
            request.getTranscriptId()
        );

        rabbitTemplate.convertAndSend("ai.processing.queue", message);

        log.info("AI schedule creation requested: taskId={}, meetingId={}",
            taskId, request.getMeetingId());

        return taskId;
    }

    public AsyncTaskStatus getTaskStatus(String taskId) {
        return redisTemplate.opsForValue().get("task:" + taskId);
    }
}

AI Worker - Asynchronous Task Processing

@Slf4j  // Lombok; provides the `log` field used below
@Service
@RequiredArgsConstructor
public class AiScheduleWorker {

    private final RedisTemplate<String, AsyncTaskStatus> redisTemplate;
    private final AiScheduleGenerator aiScheduleGenerator;

    @RabbitListener(queues = "ai.processing.queue", concurrency = "2-5")
    public void processScheduleCreation(AiScheduleMessage message) {
        String taskId = message.getTaskId();

        try {
            // Status update: PROCESSING
            updateTaskStatus(taskId, TaskStatus.PROCESSING, null, null);

            log.info("AI schedule processing started: taskId={}, meetingId={}",
                taskId, message.getMeetingId());

            // Generate the AI schedule (long-running)
            AiScheduleResult result = aiScheduleGenerator.generateSchedule(
                message.getMeetingId(),
                message.getTranscriptId()
            );

            log.info("AI schedule processing completed: taskId={}", taskId);

            // Status update: COMPLETED
            updateTaskStatus(taskId, TaskStatus.COMPLETED, result, null);

        } catch (Exception e) {
            log.error("AI schedule processing failed: taskId={}", taskId, e);

            // Status update: FAILED
            updateTaskStatus(taskId, TaskStatus.FAILED, null, e.getMessage());

            throw new AmqpRejectAndDontRequeueException("Failed to process AI schedule", e);
        }
    }

    // Rewrites the stored status, keeping createdAt and refreshing the 24-hour TTL.
    // Does nothing if the entry has already expired.
    private void updateTaskStatus(String taskId, TaskStatus status,
                                   Object result, String errorMessage) {
        AsyncTaskStatus currentStatus = redisTemplate.opsForValue().get("task:" + taskId);

        if (currentStatus != null) {
            AsyncTaskStatus updatedStatus = new AsyncTaskStatus(
                taskId,
                status,
                currentStatus.getCreatedAt(),
                status == TaskStatus.PROCESSING ? LocalDateTime.now() : currentStatus.getStartedAt(),
                status == TaskStatus.COMPLETED || status == TaskStatus.FAILED ?
                    LocalDateTime.now() : null,
                result,
                errorMessage
            );

            redisTemplate.opsForValue().set(
                "task:" + taskId,
                updatedStatus,
                Duration.ofHours(24)
            );
        }
    }
}

Frontend - Polling Implementation

// Request AI schedule generation
async function requestAiSchedule(meetingId, transcriptId) {
  const response = await fetch('/api/ai/schedules', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ meetingId, transcriptId })
  });

  if (response.status === 202) {
    const { taskId, statusUrl } = await response.json();

    // Start polling
    pollTaskStatus(statusUrl);
  }
}

// Poll the task status
async function pollTaskStatus(statusUrl) {
  const maxAttempts = 60;  // up to 5 minutes (5 s × 60)
  let attempts = 0;

  const intervalId = setInterval(async () => {
    attempts++;

    try {
      const response = await fetch(statusUrl);
      const taskStatus = await response.json();

      if (taskStatus.status === 'COMPLETED') {
        clearInterval(intervalId);
        console.log('AI schedule completed:', taskStatus.result);
        displayScheduleResult(taskStatus.result);
        return;
      }
      if (taskStatus.status === 'FAILED') {
        clearInterval(intervalId);
        console.error('AI schedule failed:', taskStatus.errorMessage);
        displayError(taskStatus.errorMessage);
        return;
      }

      console.log('AI schedule processing... status:', taskStatus.status);
      updateProgress(taskStatus.status);
    } catch (error) {
      console.error('Polling error:', error);
    }

    // Checked outside the try block so that repeated network errors also end the polling
    if (attempts >= maxAttempts) {
      clearInterval(intervalId);
      console.warn('Polling timeout');
      displayTimeout();
    }
  }, 5000);  // poll every 5 seconds
}

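3.5.4 Caveats

Use UUIDs for task IDs to avoid collisions
Set a Redis TTL so that the status of finished tasks is removed automatically
Choose the polling interval and maximum number of attempts with network load in mind
For long-running tasks, consider WebSocket or Server-Sent Events (SSE)

3.6 Health Endpoint Monitoring

3.6.1 Pattern Overview

Problem: Without a way to check a service's state from outside, it is hard to respond quickly when a failure occurs.

Solution: Each service exposes health check endpoints so that its state can be monitored and failures can trigger automatic recovery.

3.6.2 Application Plan

Implementation Technologies

Spring Boot Actuator
Kubernetes Liveness/Readiness Probes
Prometheus + Grafana (metrics collection and dashboards)

Health Check Levels

Health check layers:
  1. Liveness Probe:
     - Purpose: check that the service is alive
     - On failure: restart the container
     - Endpoint: /actuator/health/liveness
     - Checks: basic application operation

  2. Readiness Probe:
     - Purpose: check that the service is ready to receive traffic
     - On failure: stop routing traffic to it (no restart)
     - Endpoint: /actuator/health/readiness
     - Checks:
       - Database connection
       - Redis connection
       - RabbitMQ connection
       - External API connectivity

  3. Custom Health Indicator:
     - Checks the state of service-specific business logic
     - Example: AI Service → AI model loading state
     - Example: Transcript Service → STT engine connection state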
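
The readiness rule above (ready only when every dependency check passes) can be sketched as a small aggregator. This is a plain-Java illustration rather than the Spring Boot Actuator API; the class name `ReadinessCheck` and the dependency names are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical readiness aggregator over named dependency checks.
class ReadinessCheck {
    private final Map<String, BooleanSupplier> dependencies = new LinkedHashMap<>();

    ReadinessCheck register(String name, BooleanSupplier check) {
        dependencies.put(name, check);
        return this;
    }

    // Runs every check and reports "UP" or "DOWN" per dependency;
    // a check that throws counts as DOWN.
    Map<String, String> details() {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, BooleanSupplier> dep : dependencies.entrySet()) {
            boolean up;
            try {
                up = dep.getValue().getAsBoolean();
            } catch (RuntimeException e) {
                up = false;
            }
            result.put(dep.getKey(), up ? "UP" : "DOWN");
        }
        return result;
    }

    // The service is ready only if no dependency is DOWN.
    boolean isReady() {
        return !details().containsValue("DOWN");
    }
}
```

Liveness is deliberately left out of the aggregate: a failing dependency should take the instance out of rotation, not restart a container that is otherwise healthy.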

Health Check Configuration by Service

Common to all services:
  Liveness:
    - /actuator/health/liveness
    - initialDelaySeconds: 30
    - periodSeconds: 10
    - timeoutSeconds: 5
    - failureThreshold: 3

  Readiness:
    - /actuator/health/readiness
    - initialDelaySeconds: 10
    - periodSeconds: 5
    - timeoutSeconds: 3
    - failureThreshold: 3

User Service:
  Dependencies:
    - PostgreSQL (user_db)
    - Redis (cache)

Meeting Service:
  Dependencies:
    - PostgreSQL (meeting_db)
    - Redis (cache)
    - RabbitMQ

Transcript Service:
  Dependencies:
    - PostgreSQL (transcript_db)
    - Redis (cache)
    - RabbitMQ
    - STT Engine (external API)

AI Service:
  Dependencies:
    - PostgreSQL (ai_db)
    - Redis (cache)
    - RabbitMQ
    - AI Model Server (external API)

Notification Service:
  Dependencies:
    - PostgreSQL (notification_db)
    - RabbitMQ
    - Email Service (SMTP)
    - SMS Service (external API)

Todo Service:
  Dependencies:
    - PostgreSQL (todo_db)
    - Redis (cache)
    - RabbitMQ

3.6.3 Implementation Examples

Spring Boot Actuator Configuration (application.yml)

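management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
      base-path: /actuator

  endpoint:
    health:
      enabled: true
      show-details: always  # development: always, production: when-authorized
      probes:
        enabled: true  # expose /health/liveness and /health/readiness
      group:
        readiness:
          # Without this, the readiness group contains only readinessState
          include: readinessState,db,redis,rabbit

  health:
    livenessState:
      enabled: true
    readinessState:
      enabled: true
    db:
      enabled: true
    redis:
      enabled: true
    rabbit:
      enabled: true

  metrics:
    export:
      prometheus:
        # Spring Boot 2.x key; 3.x uses management.prometheus.metrics.export.enabled
        enabled: true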

Custom Health Indicator (Meeting Service)

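import org.springframework.amqp.rabbit.connection.Connection;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class MeetingServiceHealthIndicator implements HealthIndicator {

    private final MeetingRepository meetingRepository;
    private final RedisTemplate<String, Object> redisTemplate;
    private final RabbitTemplate rabbitTemplate;

    // Constructor injection: the final fields must be initialized
    public MeetingServiceHealthIndicator(MeetingRepository meetingRepository,
                                         RedisTemplate<String, Object> redisTemplate,
                                         RabbitTemplate rabbitTemplate) {
        this.meetingRepository = meetingRepository;
        this.redisTemplate = redisTemplate;
        this.rabbitTemplate = rabbitTemplate;
    }

    @Override
    public Health health() {
        try {
            // Check the database connection
            meetingRepository.count();

            // Check the Redis connection (a missing key is fine; only the round trip matters)
            redisTemplate.opsForValue().get("health-check");

            // Check the RabbitMQ connection and act on the result of isOpen()
            try (Connection connection = rabbitTemplate.getConnectionFactory().createConnection()) {
                if (!connection.isOpen()) {
                    return Health.down().withDetail("rabbitmq", "Connection closed").build();
                }
            }

            return Health.up()
                .withDetail("database", "Connected")
                .withDetail("redis", "Connected")
                .withDetail("rabbitmq", "Connected")
                .build();

        } catch (Exception e) {
            // Any failed dependency call marks this indicator as DOWN
            return Health.down()
                .withDetail("error", e.getMessage())
                .build();
        }
    }
}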

Custom Health Indicator (AI Service)

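import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class AiServiceHealthIndicator implements HealthIndicator {

    private final AiModelClient aiModelClient;
    private final AiRepository aiRepository;
    private final RedisTemplate<String, Object> redisTemplate;

    // Constructor injection: the final fields must be initialized
    public AiServiceHealthIndicator(AiModelClient aiModelClient,
                                    AiRepository aiRepository,
                                    RedisTemplate<String, Object> redisTemplate) {
        this.aiModelClient = aiModelClient;
        this.aiRepository = aiRepository;
        this.redisTemplate = redisTemplate;
    }

    @Override
    public Health health() {
        Health.Builder builder = new Health.Builder();

        try {
            // Check the database connection
            aiRepository.count();
            builder.withDetail("database", "Connected");

            // Check the Redis connection
            redisTemplate.opsForValue().get("health-check");
            builder.withDetail("redis", "Connected");

            // Check the AI model server connection
            boolean aiModelReady = aiModelClient.checkHealth();
            if (aiModelReady) {
                builder.withDetail("ai-model", "Ready");
            } else {
                builder.withDetail("ai-model", "Not Ready");
                return builder.down().build();
            }

            return builder.up().build();

        } catch (Exception e) {
            // Details recorded before the failure are kept in the response
            return builder.down()
                .withDetail("error", e.getMessage())
                .build();
        }
    }
}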

Kubernetes Deployment with Probes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: meeting-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: meeting-service
  template:
    metadata:
      labels:
        app: meeting-service
    spec:
      containers:
      - name: meeting-service
        image: meeting-service:1.0.0
        ports:
        - containerPort: 8080

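        # Liveness Probe: checks whether the container is alive
        livenessProbe:
          httpGet:
            path: /actuator/health/liveness
            port: 8080
          initialDelaySeconds: 30  # initial wait before the first probe
          periodSeconds: 10        # probe interval
          timeoutSeconds: 5        # response timeout
          failureThreshold: 3      # consecutive failures tolerated before restart

        # Readiness Probe: checks whether the container is ready to receive traffic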
        readinessProbe:
          httpGet:
            path: /actuator/health/readiness
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3

        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"

        env:
        - name: SPRING_PROFILES_ACTIVE
          value: "prod"
        - name: SPRING_DATASOURCE_URL
          valueFrom:
            secretKeyRef:
              name: meeting-db-secret
              key: url

Health Check Response Examples

// /actuator/health
{
  "status": "UP",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "PostgreSQL",
        "validationQuery": "isValid()"
      }
    },
    "redis": {
      "status": "UP",
      "details": {
        "version": "7.0.5"
      }
    },
    "rabbit": {
      "status": "UP",
      "details": {
        "version": "3.11.0"
      }
    },
    "meetingService": {
      "status": "UP",
      "details": {
        "database": "Connected",
        "redis": "Connected",
        "rabbitmq": "Connected"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 107374182400,
        "free": 53687091200,
        "threshold": 10485760,
        "exists": true
      }
    }
  }
}

// /actuator/health/liveness
{
  "status": "UP"
}

// /actuator/health/readiness
{
  "status": "UP",
  "components": {
    "db": { "status": "UP" },
    "redis": { "status": "UP" },
    "rabbit": { "status": "UP" }
  }
}

Prometheus Metrics Collection

# prometheus.yml
scrape_configs:
  - job_name: 'meeting-service'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['meeting-service:8080']

  - job_name: 'transcript-service'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['transcript-service:8080']

  - job_name: 'ai-service'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['ai-service:8080']

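3.6.4 Precautions

Health check endpoints must be lightweight (do not include complex business logic)
Distinguish liveness from readiness clearly (if liveness checks external dependencies, a dependency outage can cause endless restarts)
Set initialDelaySeconds longer than the application's startup time
Setting failureThreshold too low causes restarts even on transient network failures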
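
The interaction between initialDelaySeconds, periodSeconds, timeoutSeconds and failureThreshold can be made concrete with some rough arithmetic. The sketch below is an approximation of probe timing, not the kubelet's exact scheduling, and the class and method names are hypothetical helpers introduced only for illustration.

```java
// Rough probe-timing arithmetic; an approximation, not the kubelet's exact scheduling.
final class ProbeTiming {
    private ProbeTiming() {}

    // Approximate seconds after container start at which the last tolerated
    // probe failure happens if the application never becomes healthy.
    // The application has to be up before this point to avoid a restart.
    static int startupBudgetSeconds(int initialDelaySeconds, int periodSeconds, int failureThreshold) {
        return initialDelaySeconds + (failureThreshold - 1) * periodSeconds;
    }

    // Approximate worst-case seconds between a running service becoming
    // unhealthy and the probe reaching failureThreshold consecutive failures,
    // assuming every failed probe runs into its timeout.
    static int worstCaseDetectionSeconds(int periodSeconds, int timeoutSeconds, int failureThreshold) {
        return failureThreshold * periodSeconds + timeoutSeconds;
    }
}
```

With the common liveness settings from this section (30 / 10 / 5 / 3), the application has roughly 50 seconds to start, and a hung instance is restarted after about 35 seconds at worst. With the readiness settings (period 5, timeout 3, threshold 3), traffic stops being routed after about 18 seconds at worst.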

4. Integrated Scenarios Across Patterns

4.1 End-to-End Flow After a Meeting Ends

User → Frontend → API Gateway → Meeting Service
                                      ↓
                          1. End the meeting
                          2. Publish MeetingEnded event (Pub-Sub)
                          3. Invalidate Redis cache (Cache-Aside)
                                      ↓
                      ┌───────────────┼───────────────┐
                      ↓               ↓               ↓
             Transcript Service  Notification    Todo Service
                      ↓            Service            ↓
              Publish to queue   Send alerts  Publish to queue
                (Queue-Based)                   (Queue-Based)
                      ↓                               ↓
                 STT Worker                      Todo Worker
              Async processing                 Todo extraction
            (Async Request-Reply)
                      ↓
           TranscriptCreated event
                  (Pub-Sub)
                      ↓
          ┌───────────┼───────────┐
          ↓           ↓           ↓
       Meeting  Notification AI Service
       Service     Service        ↓
          ↓           ↓   Publish to queue
        Cache    Completion (Queue-Based)
    invalidation   notice         ↓
                              AI Worker
                        (Async Request-Reply)

Patterns Applied at Each Step

API Gateway: routes and authenticates the user request
Cache-Aside: Meeting Service invalidates the cached meeting data
Pub-Sub: publishes the MeetingEnded event
Queue-Based Load Leveling: Transcript Service and Todo Service receive work through queues
Asynchronous Request-Reply: STT processing and AI analysis run asynchronously, with status polling
Health Endpoint Monitoring: monitors the state of all services

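4.2 Meeting Minutes Retrieval Flow

User → Frontend → API Gateway → Transcript Service
                                         ↓
                              1. Check the Redis cache (Cache-Aside)
                              2. Cache hit → return the cached value
                              3. Cache miss → query the DB → store in cache
                                         ↓
                                  Return response

Patterns Applied

API Gateway: routes and authenticates the request
Cache-Aside: caches meeting minutes data to improve performance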
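
The read path above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the service's actual code: an in-memory map stands in for Redis, a plain function stands in for the database query, and CacheAsideReader with its methods is a hypothetical name introduced here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Cache-aside read path with an in-memory map standing in for Redis.
final class CacheAsideReader<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Function<K, V> loader;                       // stand-in for the DB query
    private final AtomicInteger loads = new AtomicInteger();   // counts DB lookups

    CacheAsideReader(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;              // cache hit: return the cached value
        }
        V loaded = loader.apply(key);   // cache miss: query the backing store
        loads.incrementAndGet();
        if (loaded != null) {
            cache.put(key, loaded);     // store the result for later reads
        }
        return loaded;
    }

    // Called by the write path so the next read reloads fresh data.
    void invalidate(K key) {
        cache.remove(key);
    }

    int loadCount() {
        return loads.get();
    }
}
```

Two consecutive reads of the same key trigger one load; after invalidate, the next read loads again. A real Redis-backed version would also set a TTL on each entry, which this sketch leaves out.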

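5. Pattern Adoption Roadmap

5.1 Phase 1: Basic Infrastructure (Week 1-2)

Goal: Build the core infrastructure and apply the basic patterns

Work Items

API Gateway setup
- Choose and install Kong or Spring Cloud Gateway
- Define routing rules
- Configure JWT authentication
Health Endpoint Monitoring implementation
- Enable Spring Boot Actuator
- Configure Liveness/Readiness Probes
- Configure Kubernetes Deployments
Message broker installation
- Install and configure RabbitMQ
- Define exchanges and queues
Redis installation
- Set up a Redis cluster
- Configure caching

Completion Criteria

All services are reachable through the API Gateway
Health check endpoints work correctly
RabbitMQ and Redis connections are verified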

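5.2 Phase 2: Performance Optimization (Week 3-4)

Goal: Apply the caching and load-leveling patterns

Work Items

Cache-Aside pattern implementation
- User Service: cache user profiles
- Meeting Service: cache meeting data
- Todo Service: cache todo lists
Queue-Based Load Leveling implementation
- Meeting Service → Transcript Service (meeting minutes generation)
- Meeting Service → AI Service (AI processing)
- Notification Service (sending notifications)

Completion Criteria

Cache hit rate of at least 70%
Asynchronous processing through queues works correctly
Load tests confirm a performance improvement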

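5.3 Phase 3: Event-Driven Architecture (Week 5-6)

Goal: Loosen coupling between services and improve scalability

Work Items

Pub-Sub pattern implementation
- MeetingEnded, TranscriptCreated, and TodoCreated events
- Event subscribers (Notification, AI, Todo Service)
Asynchronous Request-Reply pattern implementation
- AI Service: asynchronous AI schedule generation
- Transcript Service: asynchronous STT processing
- Redis-based status management

Completion Criteria

Event publishing and subscription work correctly
Asynchronous job status can be queried
Direct dependencies between services are confirmed removed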

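5.4 Phase 4: Monitoring and Stabilization (Week 7-8)

Goal: Secure operational stability

Work Items

Prometheus + Grafana setup
- Configure metrics collection
- Build dashboards
Alert configuration
- Health check failure alerts
- Queue backlog alerts
- Performance degradation alerts
Load testing and tuning
- Adjust queue consumer concurrency
- Optimize cache TTLs
- Adjust API Gateway rate limiting

Completion Criteria

The state of every service is visible on the Grafana dashboards
The alerting system works correctly
Performance targets are met (response time, throughput)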

6. Monitoring and Operations

6.1 Key Monitoring Metrics

API Gateway

Request rate (RPS)
Response time (P50, P95, P99)
Error rate (4xx, 5xx)
Rate limit violations

Queue-Based Load Leveling

Queue length
Message processing rate
Dead Letter Queue message count
Consumer Lag

Cache-Aside

Cache Hit Rate
Cache Miss Rate
Cache memory usage
Cache eviction count

Pub-Sub

Events published
Event subscription latency
Event processing failures

Asynchronous Request-Reply

Async job wait time
Job completion rate
Job failure rate

Health Endpoint Monitoring

Service availability (uptime)
Health check response time
Liveness/Readiness failure count

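6.2 Alert Rules

Metric	Threshold	Action
API Gateway error rate	> 5%	Alert immediately, analyze logs
Queue length	> 1000	Consider adding consumers
Cache Hit Rate	< 50%	Re-evaluate the caching strategy
Health check failures	3 consecutive	Restart the service, analyze the root cause
Async job failure rate	> 10%	Inspect worker state, check the queue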
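
As a rough illustration of how the thresholds in this table could be evaluated, the sketch below checks one snapshot of metrics against each rule. In practice these rules would more likely live in Prometheus or Grafana alerting. The class, method, and alert names are hypothetical, and rates are assumed to be fractions between 0 and 1.

```java
import java.util.ArrayList;
import java.util.List;

// Evaluates one metrics snapshot against the alert thresholds.
final class AlertRules {
    private AlertRules() {}

    static List<String> evaluate(double gatewayErrorRate,
                                 int queueLength,
                                 double cacheHitRate,
                                 int consecutiveHealthFailures,
                                 double asyncFailureRate) {
        List<String> alerts = new ArrayList<>();
        if (gatewayErrorRate > 0.05) {
            alerts.add("API_GATEWAY_ERROR_RATE");    // alert immediately, analyze logs
        }
        if (queueLength > 1000) {
            alerts.add("QUEUE_LENGTH");              // consider adding consumers
        }
        if (cacheHitRate < 0.50) {
            alerts.add("CACHE_HIT_RATE");            // re-evaluate the caching strategy
        }
        if (consecutiveHealthFailures >= 3) {
            alerts.add("HEALTH_CHECK_FAILURES");     // restart, analyze the root cause
        }
        if (asyncFailureRate > 0.10) {
            alerts.add("ASYNC_JOB_FAILURE_RATE");    // inspect workers, check the queue
        }
        return alerts;
    }
}
```

The comparisons are strict where the table says "greater than" or "less than", so a value sitting exactly on a threshold (for example a queue length of 1000) does not fire.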

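7. Incident Response

7.1 API Gateway Failure

Symptom: No service is reachable

Response

Restart the API Gateway pods
Verify the HA configuration (at least 2 instances)
Check the load balancing configuration

7.2 Queue Backlog

Symptom: Message processing is delayed and queue length grows

Response

Increase the number of consumers (adjust concurrency)
Scale worker pods horizontally
Rebalance message priorities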
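
When deciding how far to raise consumer concurrency, a back-of-the-envelope estimate can help. The sketch below assumes a steady arrival rate and a fixed average processing time per message, which real workloads only approximate. QueueSizing and its methods are hypothetical names, and the numbers in the note that follows are illustrative, not measurements.

```java
// Back-of-the-envelope consumer sizing under a steady-state assumption.
final class QueueSizing {
    private QueueSizing() {}

    // Minimum number of consumers needed just to keep up with arrivals.
    static int minConsumers(double arrivalsPerSecond, double secondsPerMessage) {
        return (int) Math.ceil(arrivalsPerSecond * secondsPerMessage);
    }

    // Seconds needed to drain an existing backlog, or infinity when the
    // consumers cannot process faster than messages arrive.
    static double drainSeconds(int backlog, int consumers,
                               double arrivalsPerSecond, double secondsPerMessage) {
        double netRate = consumers / secondsPerMessage - arrivalsPerSecond;
        return netRate <= 0 ? Double.POSITIVE_INFINITY : backlog / netRate;
    }
}
```

For example, at 2 messages per second and 4 seconds per message, 8 consumers only keep pace and never clear a backlog, while 12 consumers clear 1,000 queued messages in about 1,000 seconds.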

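7.3 Cache Failure

Symptom: Response times and DB load increase

Response

Check the Redis connection
Verify that the cache fallback works (direct DB query)
Restart Redis or fail over

7.4 Event Loss

Symptom: A specific event is not delivered to subscribers

Response

Check the RabbitMQ connection state
Check the queue binding configuration
Check the Dead Letter Queue
Republish manually if needed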

8. Success Metrics

8.1 Performance Targets

Metric	Target
API response time (P95)	< 500ms
Meeting minutes generation time	< 2 min
AI schedule generation time	< 1 min
Cache Hit Rate	> 70%
System availability	> 99.5%
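
To make the availability target tangible, it can be converted into a downtime budget. The sketch below is plain arithmetic; the class and method names are hypothetical, and the 30-day window is an assumption chosen for illustration.

```java
// Converts an availability target into an allowed-downtime budget.
final class AvailabilityBudget {
    private AvailabilityBudget() {}

    // Minutes of downtime permitted over the given number of days.
    static double allowedDowntimeMinutes(double availabilityPercent, int days) {
        double unavailableFraction = (100.0 - availabilityPercent) / 100.0;
        return unavailableFraction * days * 24 * 60;
    }
}
```

At 99.5%, the budget is about 216 minutes (3.6 hours) per 30 days, or about 43.8 hours per 365-day year.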

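8.2 Scalability Targets

Item	Target
Concurrent users	10,000
Concurrent meetings	1,000
Meetings processed per day	100,000

9. References

10. Document History

Version	Date	Author	Changes
1.0	2025-01-20	길동	Initial draft (application plan for the 6 core patterns)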
