Sonamu is built on the Vercel AI SDK and provides various AI capabilities. You can easily implement text generation, streaming, tool calling, speech recognition, and more.

Vercel AI SDK

The Vercel AI SDK is the AI framework that Sonamu builds on. Key features:
  • Text Generation
  • Streaming Responses
  • Tool Calling
  • Structured Output
  • Transcription

Text Generation

generateText()

Generates plain text.
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

const result = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Tell me how to create an Express server with TypeScript',
});

console.log(result.text);
// => "To create an Express server..."

Message-Based Conversation

const result = await generateText({
  model: openai('gpt-4o'),
  messages: [
    { role: 'system', content: 'You are a friendly programming assistant.' },
    { role: 'user', content: 'What is TypeScript?' },
    { role: 'assistant', content: 'TypeScript is JavaScript with types added...' },
    { role: 'user', content: 'What are its advantages?' },
  ],
});

Streaming Responses

streamText()

Streams text in real time.
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

const result = streamText({
  model: openai('gpt-4o'),
  prompt: 'Tell me a long story',
});

// Process stream
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

Integration with SSE

import { BaseModel, stream, api, Context } from "sonamu";
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { z } from "zod";

class ChatModelClass extends BaseModel {
  @stream({
    type: 'sse',
    events: z.object({
      chunk: z.object({
        text: z.string(),
      }),
      complete: z.object({
        totalTokens: z.number(),
      }),
    })
  })
  @api({ compress: false })
  async *streamChat(message: string, ctx: Context) {
    const sse = ctx.createSSE(
      z.object({
        chunk: z.object({
          text: z.string(),
        }),
        complete: z.object({
          totalTokens: z.number(),
        }),
      })
    );

    try {
      const result = streamText({
        model: openai('gpt-4o'),
        messages: [
          { role: 'user', content: message },
        ],
      });

      // Real-time transmission
      for await (const chunk of result.textStream) {
        sse.publish('chunk', { text: chunk });
      }

      // Completion statistics
      const usage = await result.usage;
      sse.publish('complete', {
        totalTokens: usage.totalTokens,
      });
    } finally {
      await sse.end();
    }
  }
}

Tool Calling

Single Tool

import { openai } from '@ai-sdk/openai';
import { generateText, tool } from 'ai';
import { z } from 'zod';

const result = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Tell me the current weather in Seoul',
  tools: {
    getWeather: tool({
      description: 'Fetches the current weather for a specific city',
      parameters: z.object({
        city: z.string().describe('City name'),
      }),
      execute: async ({ city }) => {
        // Call weather API
        const weather = await fetchWeather(city);
        return {
          temperature: weather.temp,
          condition: weather.condition,
        };
      },
    }),
  },
  maxSteps: 5,  // Maximum number of model steps (rounds of tool calls)
});

console.log(result.text);
// => "The current weather in Seoul is clear with a temperature of 15 degrees."

Multiple Tools

const result = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Check the weather in Seoul and tell me if I need an umbrella',
  tools: {
    getWeather: tool({
      description: 'Get weather information',
      parameters: z.object({
        city: z.string(),
      }),
      execute: async ({ city }) => {
        return await fetchWeather(city);
      },
    }),
    checkUmbrella: tool({
      description: 'Determine if an umbrella is needed based on weather information',
      parameters: z.object({
        condition: z.string().describe('Weather condition'),
      }),
      execute: async ({ condition }) => {
        return {
          needUmbrella: ['rain', 'snow'].includes(condition),
        };
      },
    }),
  },
  maxSteps: 10,
});
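Beyond the final text, the `generateText` result also exposes the intermediate tool activity (`result.steps`, `result.toolCalls`, `result.toolResults`). A small helper that tallies invocations per tool is handy for logging multi-tool runs. This is a sketch; the `Step` shape below is a simplified assumption (the real SDK objects carry more fields):

```typescript
// Simplified shape of the per-step tool-call entries on a generateText
// result; the actual AI SDK step objects include additional fields.
interface ToolCallEntry {
  toolName: string;
}

interface Step {
  toolCalls: ToolCallEntry[];
}

// Tally how many times each tool was invoked across all steps,
// useful for logging or cost attribution after a multi-tool run.
function countToolCalls(steps: Step[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const step of steps) {
    for (const call of step.toolCalls) {
      counts[call.toolName] = (counts[call.toolName] ?? 0) + 1;
    }
  }
  return counts;
}

// Usage with a generateText result:
// const counts = countToolCalls(result.steps);
```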

Structured Output

generateObject()

Generates structured data in JSON format.
import { openai } from '@ai-sdk/openai';
import { generateObject } from 'ai';
import { z } from 'zod';

const result = await generateObject({
  model: openai('gpt-4o'),
  schema: z.object({
    name: z.string(),
    age: z.number(),
    hobbies: z.array(z.string()),
    address: z.object({
      city: z.string(),
      country: z.string(),
    }),
  }),
  prompt: 'Generate information about John Doe',
});

console.log(result.object);
// {
//   name: "John Doe",
//   age: 30,
//   hobbies: ["reading", "traveling"],
//   address: {
//     city: "New York",
//     country: "USA"
//   }
// }
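generateObject parses and validates the JSON for you. If you instead ask a model for JSON through generateText, the output often arrives wrapped in a markdown code fence. A minimal sketch of a dependency-free unwrapping helper (the function name is illustrative, not part of the SDK):

```typescript
// Models asked to return raw JSON via generateText often wrap it in a
// markdown code fence. This strips an optional fence before parsing.
// generateObject makes this step unnecessary.
function parseJsonResponse(text: string): unknown {
  const trimmed = text.trim();
  const fenced = trimmed.match(/^```(?:json)?\s*([\s\S]*?)\s*```$/);
  const body = fenced ? fenced[1] : trimmed;
  return JSON.parse(body);
}
```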

Practical Example

import { BaseModel, api } from "sonamu";
import { openai } from '@ai-sdk/openai';
import { generateObject } from 'ai';
import { z } from 'zod';

class ProductModelClass extends BaseModel {
  @api({ httpMethod: 'POST' })
  async generateProductDescription(productName: string) {
    const result = await generateObject({
      model: openai('gpt-4o'),
      schema: z.object({
        title: z.string(),
        description: z.string(),
        features: z.array(z.string()),
        price: z.number(),
        tags: z.array(z.string()),
      }),
      prompt: `Generate a product description for ${productName}`,
    });

    // Save to DB
    const product = await this.saveOne({
      name: result.object.title,
      description: result.object.description,
      features: result.object.features,
      price: result.object.price,
      tags: result.object.tags,
    });

    return product;
  }
}

Rtzr Provider (Speech Recognition)

Sonamu provides built-in support for Rtzr (a Korean speech recognition service).

Configuration

RTZR_CLIENT_ID=your_client_id
RTZR_CLIENT_SECRET=your_client_secret

Basic Usage

import { rtzr } from 'sonamu/ai/providers/rtzr';

const model = rtzr.transcription('whisper');

const result = await model.doGenerate({
  audio: audioBuffer,  // Uint8Array or Base64
  mediaType: 'audio/wav',
});

console.log(result.text);
// => "Hello, the weather is nice today"

console.log(result.segments);
// [
//   { text: "Hello", startSecond: 0, endSecond: 1 },
//   { text: "the weather is nice today", startSecond: 1, endSecond: 3 }
// ]
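The segment list maps naturally onto timestamped display formats. A sketch that renders segments (using the `startSecond`/`endSecond` shape shown above) as simple `[mm:ss-mm:ss]` lines; the helper itself is illustrative, not part of Sonamu:

```typescript
interface Segment {
  text: string;
  startSecond: number;
  endSecond: number;
}

// Format transcription segments as "[mm:ss-mm:ss] text" lines,
// a simple way to show timestamped output in a UI or log.
function formatSegments(segments: Segment[]): string {
  const mmss = (s: number) => {
    const m = Math.floor(s / 60);
    const sec = Math.floor(s % 60);
    return `${String(m).padStart(2, '0')}:${String(sec).padStart(2, '0')}`;
  };
  return segments
    .map((seg) => `[${mmss(seg.startSecond)}-${mmss(seg.endSecond)}] ${seg.text}`)
    .join('\n');
}
```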

File Upload + Speech Recognition

import { BaseModel, Sonamu, upload, api } from "sonamu";
import { rtzr } from 'sonamu/ai/providers/rtzr';

class TranscriptionModelClass extends BaseModel {
  @upload({ mode: 'single' })
  @api({ httpMethod: 'POST' })
  async transcribeAudio() {
    const { files } = Sonamu.getContext();
    const file = files?.[0]; // Use first file

    if (!file) {
      throw new Error('No audio file provided');
    }

    // Speech recognition
    const model = rtzr.transcription('whisper');
    const buffer = await file.toBuffer();

    const result = await model.doGenerate({
      audio: buffer,
      mediaType: file.mimetype,
    });

    // Save to DB
    await this.saveOne({
      audio_url: file.url,
      transcription: result.text,
      segments: result.segments,
      language: result.language,
      duration: result.durationInSeconds,
    });

    return {
      text: result.text,
      segments: result.segments,
    };
  }
}

Rtzr Options

const result = await model.doGenerate({
  audio: audioBuffer,
  mediaType: 'audio/wav',
  providerOptions: {
    rtzr: {
      domain: 'GENERAL',  // 'CALL' | 'GENERAL'
      language: 'ko',
      diarization: true,  // Speaker separation
      wordTimestamp: true,  // Word-level timestamps
      profanityFilter: false,  // Profanity filter
    }
  }
});

Multimodal (Image Processing)

GPT-4o can accept images as input.
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

const result = await generateText({
  model: openai('gpt-4o'),
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        {
          type: 'image',
          image: imageBuffer,  // Uint8Array or URL
        },
      ],
    },
  ],
});

console.log(result.text);
// => "The image contains a cat..."

Image Upload + Analysis

import { BaseModel, Sonamu, upload, api } from "sonamu";
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

class ImageAnalysisModelClass extends BaseModel {
  @upload({ mode: 'single' })
  @api({ httpMethod: 'POST' })
  async analyzeImage() {
    const { files } = Sonamu.getContext();
    const file = files?.[0]; // Use first file

    if (!file || !file.mimetype.startsWith('image/')) {
      throw new Error('An image file is required');
    }

    const buffer = await file.toBuffer();

    const result = await generateText({
      model: openai('gpt-4o'),
      messages: [
        {
          role: 'user',
          content: [
            { type: 'text', text: 'Analyze this image in detail' },
            { type: 'image', image: buffer },
          ],
        },
      ],
    });

    return {
      analysis: result.text,
      imageUrl: file.url,
    };
  }
}

Error Handling

import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

try {
  const result = await generateText({
    model: openai('gpt-4o'),
    prompt: '...',
  });

  return result.text;
} catch (error) {
  if (error.name === 'AI_APICallError') {
    // API call error
    console.error('API Error:', error.message);
    console.error('Status:', error.statusCode);
  } else if (error.name === 'AI_InvalidPromptError') {
    // Prompt error
    console.error('Invalid Prompt:', error.message);
  } else {
    // Other errors
    console.error('Unknown Error:', error);
  }

  throw error;
}
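The AI SDK already retries failed calls internally (generateText accepts a `maxRetries` option). For cases that need a custom policy, you can wrap the call yourself. A generic sketch; the backoff schedule here is an assumption, not from the source:

```typescript
// Retry an async operation with exponential backoff. The AI SDK's
// built-in `maxRetries` usually suffices; this wrapper is only for
// custom policies (e.g. retrying only on rate-limit errors).
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (i < attempts - 1) {
        // Backoff: baseDelayMs, 2x, 4x, ...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}

// Usage:
// const result = await withRetry(() =>
//   generateText({ model: openai('gpt-4o'), prompt: '...' })
// );
```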

Cost Tracking

const result = await generateText({
  model: openai('gpt-4o'),
  prompt: '...',
});

// Token usage
console.log('Prompt Tokens:', result.usage.promptTokens);
console.log('Completion Tokens:', result.usage.completionTokens);
console.log('Total Tokens:', result.usage.totalTokens);

// Cost calculation (illustrative: real pricing varies by model, and prompt
// and completion tokens are billed at different rates)
const costPerToken = 0.00003;  // Example flat rate, not actual pricing
const cost = result.usage.totalTokens * costPerToken;
console.log('Cost:', cost);
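Since prompt and completion tokens are typically billed at different rates, a per-type calculation gives a better estimate than a flat per-token rate. A sketch; the rates passed in are placeholders, so check your provider's current price list:

```typescript
interface Usage {
  promptTokens: number;
  completionTokens: number;
}

// Estimate cost in USD from token usage. Rates are expressed per
// 1M tokens; callers must supply real rates from the provider's
// price list (the values in the usage example are placeholders).
function estimateCostUSD(
  usage: Usage,
  promptRatePer1M: number,
  completionRatePer1M: number,
): number {
  return (
    (usage.promptTokens / 1_000_000) * promptRatePer1M +
    (usage.completionTokens / 1_000_000) * completionRatePer1M
  );
}

// Usage with a generateText result (placeholder rates):
// const cost = estimateCostUSD(result.usage, 2.5, 10);
```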

Practical Integration Example

AI Chat API

import { BaseModel, api, Context } from "sonamu";
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';
import { z } from 'zod';

class ChatModelClass extends BaseModel {
  @api({ httpMethod: 'POST' })
  async chat(
    message: string,
    conversationId: number | null,
    ctx: Context
  ) {
    // Retrieve conversation history
    const history = conversationId
      ? await ConversationModel.findById(conversationId)
      : null;

    const messages = history?.messages || [];
    messages.push({
      role: 'user',
      content: message,
    });

    // Generate AI response
    const result = await generateText({
      model: openai('gpt-4o'),
      messages: [
        {
          role: 'system',
          content: 'You are a friendly customer support chatbot.',
        },
        ...messages,
      ],
      temperature: 0.7,
      maxTokens: 500,
    });

    // Save response
    messages.push({
      role: 'assistant',
      content: result.text,
    });

    const conversation = await ConversationModel.saveOne({
      id: conversationId,
      user_id: ctx.user.id,
      messages,
      token_usage: result.usage.totalTokens,
    });

    return {
      conversationId: conversation.id,
      message: result.text,
      usage: result.usage,
    };
  }
}

Precautions

Important considerations when using the AI SDK:
  1. API Key Security: Use environment variables
    // ❌ Hardcoded key (per-key configuration uses createOpenAI from '@ai-sdk/openai')
    const badOpenai = createOpenAI({ apiKey: 'sk-...' });
    
    // ✅ Environment variable
    const model = openai('gpt-4o');  // Automatically uses OPENAI_API_KEY
    
  2. Error Handling: Always use try-catch
    try {
      const result = await generateText({ ... });
    } catch (error) {
      console.error(error);
    }
    
  3. Token Limits: Set maxTokens
    generateText({
      model: openai('gpt-4o'),
      prompt: '...',
      maxTokens: 1000,  // Cost control
    });
    
  4. Streaming Cleanup: Terminate stream even on errors
    try {
      for await (const chunk of result.textStream) {
        // ...
      }
    } finally {
      // Cleanup work
    }
    
  5. Rtzr File Size: Large files require chunking
    if (file.size > 10 * 1024 * 1024) {
      throw new Error('File size must be 10MB or less');
    }
    
  6. Image Size: Check GPT-4o image limits
    // Image size limit (20MB)
    if (imageBuffer.length > 20 * 1024 * 1024) {
      throw new Error('Image is too large');
    }
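The size checks in items 5 and 6 can be factored into one small guard. A sketch; the helper is illustrative, and the 10MB/20MB limits used above should be checked against the providers' actual limits:

```typescript
// Throw if a payload exceeds a size limit. Limits passed in should
// come from the provider's documented constraints.
function assertMaxSize(bytes: number, limitMB: number, label = 'File'): void {
  const limitBytes = limitMB * 1024 * 1024;
  if (bytes > limitBytes) {
    throw new Error(`${label} size must be ${limitMB}MB or less`);
  }
}

// Usage:
// assertMaxSize(file.size, 10, 'Audio file');
// assertMaxSize(imageBuffer.length, 20, 'Image');
```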
    

Vercel AI SDK Documentation

For more features, refer to the official Vercel AI SDK documentation.
