AI Categorization

Automated transaction categorization with AI

Open Ledger uses AI to automatically categorize your financial transactions according to double-entry accounting principles. The system analyzes transaction data, determines the economic nature of each transaction, and suggests the most appropriate categories from your chart of accounts.

How It Works

Our AI categorization system:

  1. Analyzes transaction details (amount, description, counterparty, etc.)
  2. Uses vector search to find similar historical transactions for context
  3. Enriches the data with AI-generated context about the transaction
  4. Determines the transaction type (expense, revenue, transfer, etc.)
  5. Identifies the most appropriate category codes based on your chart of accounts
  6. Provides a confidence score for each suggestion
  7. Streams results in real-time using Server-Sent Events

API Endpoint

The AI categorization endpoint is available at:

POST /v1/ai/{entityId}/suggest-categories-vector
Content-Type: application/json
Authorization: Bearer your_token_here

Note: This endpoint uses Server-Sent Events (SSE) for real-time streaming of categorization results.

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| entityId | string (path) | Yes | The ID of the entity that the transactions belong to |
| transactions | array (body) | Yes | Array of transaction objects |

Transaction Object Structure

{
  "id": "txn_123",
  "description": "AWS Monthly Subscription",
  "amount": 125.99,
  "direction": "DEBIT",
  "counterparty_name": "Amazon Web Services",
  "date": "2023-04-15",
  "bank_transaction_id": "tx_12345abcdef"
}
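A typed representation can catch malformed transactions before they are sent. The sketch below mirrors the fields shown above. The `Transaction` class and `to_payload` helper are hypothetical names, and the validation rules are assumptions: only `DEBIT` appears in this document, so accepting `CREDIT` is a guess, as is requiring a `YYYY-MM-DD` date.

```python
import datetime
from dataclasses import asdict, dataclass


@dataclass
class Transaction:
    id: str
    description: str
    amount: float
    direction: str  # "DEBIT" appears in the docs; "CREDIT" is assumed
    counterparty_name: str
    date: str  # Calendar date in YYYY-MM-DD form, e.g. "2023-04-15"
    bank_transaction_id: str

    def __post_init__(self):
        if self.direction not in ("DEBIT", "CREDIT"):
            raise ValueError(f"unexpected direction: {self.direction!r}")
        # Raises ValueError if the date is not a valid YYYY-MM-DD string
        datetime.date.fromisoformat(self.date)


def to_payload(transactions):
    """Build the JSON-serializable request body from Transaction objects."""
    return {"transactions": [asdict(t) for t in transactions]}
```

Passing the result of `to_payload` as the JSON request body keeps field names identical to the structure above.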

Example Request

// Note: This endpoint streams its response as Server-Sent Events (SSE).
// The browser EventSource API can only issue GET requests, so send the
// POST with fetch and read the event stream from response.body.
const response = await fetch('/v1/ai/entity_12345/suggest-categories-vector', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your_token_here'
  },
  body: JSON.stringify({
    transactions: [
      {
        id: "txn_123",
        description: "AWS Monthly Subscription",
        amount: 125.99,
        direction: "DEBIT",
        counterparty_name: "Amazon Web Services",
        date: "2023-04-15",
        bank_transaction_id: "tx_12345abcdef"
      },
      {
        id: "txn_124",
        description: "Office supplies",
        amount: 45.0,
        direction: "DEBIT",
        counterparty_name: "Staples",
        date: "2023-04-16",
        bank_transaction_id: "tx_67890ghijkl"
      }
    ]
  })
});

Response Format (Server-Sent Events)

The API returns categorization suggestions as Server-Sent Events. Each event contains a JSON object:

For Already Categorized Transactions:

// Immediate response for transactions already categorized
data: {
  "transaction_id": "txn_123",
  "suggested_type": "EXPENSE",
  "suggested_category_code": 5001,
  "account_code": 5001,
  "category_id": 12,
  "confidence": 1.0
}

For New Transactions (Streaming):

// Progress updates during categorization
data: {
  "transaction_id": "txn_124",
  "status": "in_progress",
  "suggested_type": "EXPENSE",
  "suggested_category_code": 5002,
  "account_code": 5002,
  "category_id": 13,
  "confidence": 0.88
}

// Final result
data: {
  "transaction_id": "txn_124",
  "suggested_type": "EXPENSE",
  "suggested_category_code": 5002,
  "account_code": 5002,
  "category_id": 13,
  "confidence": 0.88,
  "status": "complete",
  "enrichment": {
    "counterparty_info": "Staples is an office supply retail chain",
    "transaction_context": "Purchase of office supplies",
    "likely_business_category": "Office Expenses"
  }
}

// End of stream
data: [DONE]
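The event sequence above can be consumed with a small line parser. This is a minimal sketch that assumes each event's JSON arrives on a single `data:` line (the examples above are pretty-printed for readability). The function names `iter_sse_events` and `final_suggestions` are hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield one dict per `data:` line, stopping at the [DONE] sentinel."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # Skip blank separator lines and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def final_suggestions(events):
    """Keep the last suggestion per transaction, ignoring progress updates."""
    latest = {}
    for event in events:
        if event.get("status") in ("in_progress", "retrying"):
            continue
        if "transaction_id" in event and "confidence" in event:
            latest[event["transaction_id"]] = event
    return latest
```

Keying the results by `transaction_id` means an already-categorized transaction (no `status` field) and a freshly categorized one (`status` of `complete`) are handled the same way.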

Error Responses:

data: {
  "error": "Internal server error",
  "message": "Detailed error message"
}

Retry Logic

The API includes automatic retry logic for failed categorizations:

data: {
  "transaction_id": "txn_125",
  "status": "retrying",
  "retry_count": 1,
  "message": "Retrying categorization (attempt 2/3)"
}
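A client may want to know which transactions were retried but never resolved. The sketch below tracks `retrying` events from a list of parsed event dicts. It assumes that `retry_count` of 1 corresponds to attempt 2, as in the sample event, and that three attempts is the maximum; that limit is inferred from the "(attempt 2/3)" message, not from a documented constant. The function names are hypothetical.

```python
MAX_ATTEMPTS = 3  # Assumed from the "(attempt 2/3)" message text


def unresolved_transactions(events):
    """Return {transaction_id: attempt_number} for transactions that were
    retried and never followed by a final result."""
    retrying = {}
    for event in events:
        txn_id = event.get("transaction_id")
        if txn_id is None:
            continue
        status = event.get("status")
        if status == "retrying":
            # retry_count 1 corresponds to attempt 2 in the sample event
            retrying[txn_id] = event.get("retry_count", 0) + 1
        elif status != "in_progress":
            # A final event supersedes earlier retry notices
            retrying.pop(txn_id, None)
    return retrying


def exhausted_retries(events):
    """Return IDs whose last retry notice was the final assumed attempt."""
    return [
        txn_id
        for txn_id, attempt in unresolved_transactions(events).items()
        if attempt >= MAX_ATTEMPTS
    ]
```

Transactions returned by `exhausted_retries` are candidates for manual categorization or a later resubmission.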

Transaction Types

The AI categorization system identifies several transaction types:

| Type | Description | Typical Category Pattern |
| --- | --- | --- |
| EXPENSE | Business costs | Software Expenses, Office Supplies, etc. |
| REVENUE | Income from sales or services | Sales Revenue, Service Income, etc. |
| ASSET_TRANSFER | Moving money between accounts | Internal account transfers |
| LIABILITY_INCREASE | Taking on debt | Loan receipts, credit increases |
| LIABILITY_REDUCTION | Paying off debt | Loan payments, credit card payments |
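On the client side, the types in the table can be modeled as an enumeration so that typos and unexpected values surface early. This is a sketch: `TransactionType` and `parse_type` are hypothetical names, and because the table may not be exhaustive, unknown values return `None` instead of raising.

```python
from enum import Enum


class TransactionType(str, Enum):
    EXPENSE = "EXPENSE"
    REVENUE = "REVENUE"
    ASSET_TRANSFER = "ASSET_TRANSFER"
    LIABILITY_INCREASE = "LIABILITY_INCREASE"
    LIABILITY_REDUCTION = "LIABILITY_REDUCTION"


def parse_type(suggestion):
    """Return the TransactionType for a suggestion dict, or None if the
    suggested_type field is missing or not one of the listed values."""
    try:
        return TransactionType(suggestion.get("suggested_type"))
    except ValueError:
        return None
```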

Integration Examples

JavaScript (with Server-Sent Events)

async function categorizeTransactions(entityId, transactions, token) {
  const response = await fetch(`/v1/ai/${entityId}/suggest-categories-vector`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${token}`
    },
    body: JSON.stringify({ transactions })
  });

  if (!response.ok || !response.body) {
    throw new Error(`Categorization request failed with status ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const results = [];
  let buffer = ''; // Holds a partial line when a chunk ends mid-event

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // stream: true keeps multi-byte characters intact across chunk boundaries
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // Keep the incomplete last line for the next chunk

    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;

      const data = line.slice(6).trim();
      if (data === '[DONE]') {
        return results;
      }

      try {
        const result = JSON.parse(data);
        // Keep final results only; skip in_progress and retrying updates
        const isFinal = result.status === 'complete' ||
          (result.status === undefined && result.confidence !== undefined);
        if (isFinal) {
          results.push(result);
        }
      } catch (e) {
        console.error('Error parsing SSE data:', e);
      }
    }
  }

  return results;
}

// Example usage
const transactions = [
  {
    id: "txn_123",
    description: "AWS Monthly Subscription",
    amount: 125.99,
    direction: "DEBIT",
    counterparty_name: "Amazon Web Services",
    date: "2023-04-15",
    bank_transaction_id: "tx_12345abcdef"
  }
];

const suggestions = await categorizeTransactions("entity_123456", transactions, "your_token");
console.log(suggestions);

Python (with SSE Support)

import json

import requests


def categorize_transactions(entity_id, transactions, token):
    url = f'https://api.openledger.com/v1/ai/{entity_id}/suggest-categories-vector'

    headers = {
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {token}'
    }

    data = {'transactions': transactions}

    # stream=True reads events as they arrive instead of buffering the whole body
    response = requests.post(url, headers=headers, json=data, stream=True)
    response.raise_for_status()  # Fail fast on HTTP-level errors

    results = []

    for line in response.iter_lines():
        if line:
            decoded_line = line.decode('utf-8')
            if decoded_line.startswith('data: '):
                data_str = decoded_line[6:]  # Remove 'data: ' prefix

                if data_str == '[DONE]':
                    break

                try:
                    result = json.loads(data_str)
                except json.JSONDecodeError:
                    continue

                # Keep final results only; skip in_progress and retrying updates
                status = result.get('status')
                if status == 'complete' or (status is None and 'confidence' in result):
                    results.append(result)

    return results


# Example usage
transactions = [
    {
        'id': 'txn_123',
        'description': 'AWS Monthly Subscription',
        'amount': 125.99,
        'direction': 'DEBIT',
        'counterparty_name': 'Amazon Web Services',
        'date': '2023-04-15',
        'bank_transaction_id': 'tx_12345abcdef'
    }
]

suggestions = categorize_transactions('entity_123456', transactions, 'your_token')
print(suggestions)

Best Practices

  1. Provide Complete Transaction Data: Include all available fields (description, counterparty, amount, date) for better categorization accuracy.

  2. Handle Streaming Responses: Implement proper SSE handling to receive real-time updates during categorization.

  3. Monitor Confidence Scores: Review suggestions with confidence scores below 0.8 before applying them.

  4. Batch Process: Send multiple transactions in a single request for efficiency.

  5. Handle Retries: The system automatically retries failed categorizations, but implement error handling for permanent failures.

  6. Set Up Categories: Ensure your entity has a well-structured chart of accounts with isCategory: true accounts.

  7. Use Vector Context: The system leverages historical transaction patterns for better accuracy over time.
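Practices 3 and 4 can be sketched as two small helpers. The 0.8 threshold comes from the guidance above. The batch size of 50 is an arbitrary placeholder, not a documented limit, and the function names are hypothetical.

```python
REVIEW_THRESHOLD = 0.8  # From the "Monitor Confidence Scores" guidance


def split_by_confidence(suggestions, threshold=REVIEW_THRESHOLD):
    """Separate suggestions into (auto_apply, needs_review) lists."""
    auto_apply, needs_review = [], []
    for suggestion in suggestions:
        # A missing confidence is treated as 0.0 so it always gets reviewed
        if suggestion.get("confidence", 0.0) >= threshold:
            auto_apply.append(suggestion)
        else:
            needs_review.append(suggestion)
    return auto_apply, needs_review


def batches(transactions, size=50):
    """Yield successive slices of at most `size` transactions.

    The default size is a placeholder; tune it to your own workload.
    """
    for start in range(0, len(transactions), size):
        yield transactions[start:start + size]
```

Each batch can then be sent as the `transactions` array of a single request, and only the `needs_review` list has to be shown to a person.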

Vector Enhancement Features

Our AI categorization includes advanced vector-based features:

  • Semantic Similarity: Finds similar transactions based on meaning, not just keywords
  • Historical Context: Uses past categorization patterns to improve suggestions
  • Adaptive Learning: Gets better over time as you categorize more transactions
  • Contextual Enrichment: Provides detailed analysis of counterparties and transaction purposes
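Semantic similarity search is commonly implemented by comparing embedding vectors with cosine similarity. The toy example below illustrates that general idea only. It is not Open Ledger's implementation: this page does not describe the embedding model or index, and the three-dimensional vectors here are hand-made stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def most_similar(query_vector, history, k=1):
    """Return the k (label, vector) pairs from history closest to the query."""
    ranked = sorted(
        history,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return ranked[:k]


# Hand-made vectors standing in for embeddings of past transactions
history = [
    ("Software Expenses", [0.9, 0.1, 0.0]),
    ("Office Supplies", [0.1, 0.9, 0.1]),
    ("Loan payments", [0.0, 0.1, 0.9]),
]
query = [0.8, 0.2, 0.0]  # Stand-in for a new cloud-hosting invoice
print(most_similar(query, history)[0][0])
```

Because the query vector points in nearly the same direction as the first entry, that entry ranks highest even though the two transactions would share no keywords.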

Error Handling

The streaming API can return various error conditions:

// Network or authentication errors
data: {
  "error": "Instance not found for this entity"
}

// Processing errors with retry
data: {
  "transaction_id": "txn_123",
  "status": "retrying",
  "retry_count": 2,
  "message": "Retrying categorization (attempt 3/3)"
}

In production, check every event for an error field, and treat any transaction that is still retrying when the stream ends as failed.
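One way to surface error events is to convert them into exceptions as each event is parsed. This is a minimal sketch that assumes error events are identified by an `error` key, as in the examples above. `CategorizationError` and `raise_on_error` are hypothetical names, not part of any Open Ledger SDK.

```python
class CategorizationError(Exception):
    """Raised when the stream reports an error event."""


def raise_on_error(event):
    """Return a parsed event unchanged, or raise if it is an error event."""
    if "error" in event:
        detail = event.get("message")
        # Include the optional detailed message when the server provides one
        text = event["error"] if detail is None else f"{event['error']}: {detail}"
        raise CategorizationError(text)
    return event
```

Calling `raise_on_error` on each parsed event before further processing lets the caller decide whether to abort the whole batch or log the failure and continue.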