Implementing Prompt Chaining with the GPT-4 API

The Problem

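I had built a feature that analyzed a customer inquiry and handled category classification, sentiment analysis, and reply generation in a single pass, but the output quality was inconsistent. Because one long prompt asked for everything at once, I had no control over the intermediate steps, and it was hard to tell which step had gone wrong.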

Applying Prompt Chaining

I split the work into three steps.

// callGPT4 is assumed to be a helper defined elsewhere that sends one
// system message and one user message and resolves to the reply text.
declare function callGPT4(prompt: { system: string; user: string }): Promise<string>;

interface ChainResult {
  category: string;
  sentiment: string;
  response: string;
}

async function processInquiry(text: string): Promise<ChainResult> {
  // Step 1: classify the inquiry into a category
  const category = await callGPT4({
    system: "You are a customer inquiry classifier.",
    user: `Classify this inquiry into one category: ${text}`
  });

  // Step 2: analyze the sentiment
  const sentiment = await callGPT4({
    system: "You are a sentiment analyzer.",
    user: `Analyze sentiment (positive/neutral/negative): ${text}`
  });

  // Step 3: generate the reply, using the results of steps 1 and 2
  const response = await callGPT4({
    system: "You are a customer service representative.",
    user: `Category: ${category}, Sentiment: ${sentiment}\nGenerate response for: ${text}`
  });

  return { category, sentiment, response };
}

Improvements

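I could now tune the prompt for each step separately. The classification prompt is kept short and unambiguous, while the reply-generation prompt is built to use the context produced by the earlier steps.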
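One way to keep the classification step short and unambiguous is to list the allowed labels in the prompt and normalize whatever the model returns. The following is a minimal sketch, not the original implementation: the label set and the helper names `buildClassifierPrompt` and `normalizeCategory` are assumptions made for illustration.

```typescript
// Hypothetical label set; replace with the categories your product actually uses.
const CATEGORIES = ["billing", "shipping", "technical", "other"] as const;
type Category = (typeof CATEGORIES)[number];

// Builds a short prompt that restricts the model to the fixed label set.
function buildClassifierPrompt(text: string): { system: string; user: string } {
  return {
    system:
      "You are a customer inquiry classifier. Reply with exactly one label and nothing else.",
    user: `Labels: ${CATEGORIES.join(", ")}\nInquiry: ${text}`,
  };
}

// Maps raw model output to a known label, falling back to "other"
// when the reply is not exactly one of the allowed labels.
function normalizeCategory(raw: string): Category {
  const cleaned = raw.trim().toLowerCase().replace(/[^a-z]/g, "");
  const match = CATEGORIES.find((c) => c === cleaned);
  return match ?? "other";
}
```

Normalizing the label before passing it to the reply step keeps a stray period or a chatty answer from leaking into the next prompt.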

By logging the intermediate results, I could see right away which step was causing a problem. It also became possible to rerun a single step or cache its result.
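Logging and caching per step can be sketched as a small wrapper around any step function. This is only an illustration under assumptions: `Step` and `withLogAndCache` are hypothetical names, and the cache is a plain in-memory `Map` keyed by the input text, which may not match how the original caching was done.

```typescript
// One chain step: takes a text input and resolves to the model's text output.
type Step = (input: string) => Promise<string>;

// Wraps a step so that each result is logged and repeated inputs
// are served from an in-memory cache instead of calling the step again.
function withLogAndCache(
  name: string,
  step: Step,
  log: (line: string) => void = console.log
): Step {
  const cache = new Map<string, string>();
  return async (input) => {
    const cached = cache.get(input);
    if (cached !== undefined) {
      log(`[${name}] cache hit`);
      return cached;
    }
    const started = Date.now();
    const output = await step(input);
    log(`[${name}] ${Date.now() - started}ms -> ${output}`);
    cache.set(input, output);
    return output;
  };
}

// Usage idea (callGPT4 is the article's helper, assumed to exist):
// const classify = withLogAndCache("category", (text) =>
//   callGPT4({ system: "You are a customer inquiry classifier.", user: text }));
```

Because each step is wrapped on its own, a single step can be rerun by calling its wrapped function directly, without repeating the rest of the chain.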

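The downside is that the number of API calls triples, from one per inquiry to three. To reduce latency, I ran the steps that can be parallelized with Promise.all: category classification and sentiment analysis both depend only on the inquiry text, so they do not have to wait for each other. Quality mattered more than cost in this case, so the trade-off was well worth it.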
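A parallel version of the chain might look like the sketch below. It assumes the model-calling helper is passed in as a parameter (`callModel`, standing in for `callGPT4`) so the example is self-contained; the function name `processInquiryParallel` is likewise hypothetical.

```typescript
// Shape of the model-calling helper; callModel stands in for the article's callGPT4.
type ModelCall = (prompt: { system: string; user: string }) => Promise<string>;

interface ChainResult {
  category: string;
  sentiment: string;
  response: string;
}

async function processInquiryParallel(
  text: string,
  callModel: ModelCall
): Promise<ChainResult> {
  // Steps 1 and 2 depend only on the original text, so they run concurrently.
  const [category, sentiment] = await Promise.all([
    callModel({
      system: "You are a customer inquiry classifier.",
      user: `Classify this inquiry into one category: ${text}`,
    }),
    callModel({
      system: "You are a sentiment analyzer.",
      user: `Analyze sentiment (positive/neutral/negative): ${text}`,
    }),
  ]);

  // Step 3 needs both results, so it still runs after them.
  const response = await callModel({
    system: "You are a customer service representative.",
    user: `Category: ${category}, Sentiment: ${sentiment}\nGenerate response for: ${text}`,
  });

  return { category, sentiment, response };
}
```

With this layout the wall-clock time is roughly the slower of steps 1 and 2 plus step 3, rather than the sum of all three. Note that Promise.all rejects as soon as either call fails, so error handling for a partial failure would need to be added separately.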