We built an AI Agent on Tiger Cloud


This content originally appeared on DEV Community and was authored by Lin Liang

This is a submission for the Agentic Postgres Challenge with Tiger Data

What I Built

I recently had a chat with a friend who works in an e-commerce company, and he was complaining that his company spent a fortune on an AI Customer Service application, only to see an increase in user complaints.

I asked him what was going on, and he showed me a real conversation:

User: "I want to return this laptop. Also, I wanted to ask, how does the warranty work if I exchange it for a new one?"

AI Customer Service: "Okay, I can process the return for you. Please provide your order number."

User: "??? What about my warranty question?"

AI Customer Service: "Sorry, how can I help you?"

This is the core problem with router-based conversational AI, which routes each message to an agent that owns a specific topic or task.

Such systems inherently fail at natural, free-form conversation -- not due to implementation details you can fix, but due to fundamental architectural constraints.

In this post, we'll build a Conversational AI on Parlant to handle free-form dialogues.

Understanding Parlant's Approach

Parlant is an AI alignment engine designed specifically for natural, customer-facing conversations.

To handle free-form dialogue, Parlant introduces a Dynamic Guideline Matching mechanism: on each conversation turn, the system evaluates all guidelines and loads only the ones relevant to that turn.

# Parlant approach - no routing, just conditional guidelines
await agent.create_guideline(
    condition="User asks for a refund and order amount > $500",
    action="First use the order status tool, confirm refund eligibility, then walk through the detailed refund process",
    tools=[check_order_status, calculate_refund_amount]
)

# ... create more guidelines the same way

Notice what's different here: there's no routing decision.

How the Guideline System Works

Parlant's guideline matcher evaluates all guidelines and determines which ones are contextually relevant. This is where Parlant's architecture does most of its work.

Timescale Postgres - Production-Grade Vector Database

When we delve deeper into Parlant's Dynamic Guideline Matching Mechanism, we uncover a core technical challenge:

How do we efficiently retrieve the 3-5 most relevant guidelines from potentially hundreds or even thousands of available entries, all while maintaining millisecond-level latency?

This is precisely the scenario where a Vector Database comes into play.

The Guideline Matching process in Parlant is essentially a Semantic Retrieval problem. The system converts the condition field of every guideline into a Vector Embedding. When a user input arrives, the process unfolds as follows:

  • Step 1: Encode the user query and the conversation history into a Query Vector.
  • Step 2: Execute the Similarity Search on the Vector Database.
  • Step 3: Return the Top-K (e.g., 3-5) most relevant guidelines.
  • Step 4: Inject these retrieved guidelines into the LLM's context window.

(Figure: Parlant's guideline matching process)
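The four steps above can be sketched in miniature without any external services. Everything here is illustrative: the toy `embed()` stands in for a real embedding model (e.g. jina-embeddings-v3), and the `GUIDELINES` dict stands in for the vector-indexed guideline store -- Parlant's real matcher is considerably more sophisticated.

```python
import math

# Hypothetical guideline store: condition -> action
GUIDELINES = {
    "User asks about refunds": "Explain the refund process step by step",
    "User asks about warranty": "Describe warranty coverage and duration",
    "User reports a damaged item": "Apologize and offer a replacement",
}

def embed(text: str) -> list[float]:
    # Toy stand-in for a real embedding model: a bag-of-letters vector,
    # just enough to make similarity scoring meaningful in this sketch.
    vec = [0.0] * 26
    for ch in text.lower():
        idx = ord(ch) - ord("a")
        if 0 <= idx < 26:
            vec[idx] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def match_guidelines(query: str, top_k: int = 2) -> list[str]:
    # Step 1: encode the query; Steps 2-3: score conditions and take Top-K
    qv = embed(query)
    scored = sorted(
        GUIDELINES.items(),
        key=lambda kv: cosine(qv, embed(kv[0])),
        reverse=True,
    )
    # Step 4 (not shown): inject the matched actions into the LLM's context
    return [action for _, action in scored[:top_k]]
```

A production system replaces the linear scan with an ANN index query against the vector database, which is exactly where Timescale Postgres comes in below.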

This requires the Vector Database to possess three core capabilities:

  • High-performance Approximate Nearest Neighbor (ANN) Search.
  • Flexible metadata filtering.
  • Real-time vector updates.

Timescale Postgres provides production-grade support across all three of these dimensions.

Demo

This demo showcases the core capability: intelligent guidelines let the LLM answer general questions directly, while specialized or complex questions automatically trigger a knowledge base retrieval, producing a structured, traceable answer.

If you'd like to run the demo locally, clone the repo and follow the instructions in README.md to set up your local development environment.

How I Used Agentic Postgres

To start, create a Tiger Cloud account. For more information, refer to this guide.

Database setup

Next, create a new database service. For details on creating a Tiger Cloud database service, refer to this guide.

Let's create a table named documents to store all documents with both text content and vector embeddings, and the necessary indexes for fast searching across different methods.

-- Enable extensions
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table for our documents
CREATE TABLE documents (
    id TEXT PRIMARY KEY,
    title TEXT,
    content TEXT,
    embedding VECTOR(1024) -- jina-embeddings-v3 dimensions
);

-- Create indexes for performance
CREATE INDEX ON documents USING GIN (to_tsvector('english', content));
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops);

You can execute the table-creation SQL directly in the Tiger Cloud console.

(Screenshot: the Tiger Cloud SQL Editor)

Implementing Hybrid Search with Jina and pgvector

In this section, we will build a hybrid search engine leveraging Jina and pgvector on Tiger Cloud. Before that, let's build an overall understanding of the architecture and concepts required.

(Figure: the hybrid search process)

The architecture comprises five steps:

  1. Documents: The process starts with a collection of documents.
  2. Embedding generation:
     a. Semantic search using Jina: documents are embedded with Jina to create dense vectors that capture the semantic meaning of the text.
     b. Keyword search using PostgreSQL: documents are processed with PostgreSQL's full-text search to create sparse representations focused on the text's specific keywords.
  3. Storage: Both dense and sparse embeddings are stored on Timescale’s PostgreSQL.
  4. Retrieval and reranking: The system retrieves results from the Timescale database in response to a query. The results are then reranked using Jina to prioritize the most relevant documents.
  5. Results generation: The reranked results are compiled into a final list, generating a ranking of the most relevant documents for the query.

Here's the implementation detail in Python:

from typing import Any
from env import SEMANTIC_SCORE_THRESHOLD


class MultiSearchTool:

    def __init__(self, connection, jina_client):
        self.conn = connection
        self.jina = jina_client


    def extract_search(self, query: str, limit: int = 5):
        """Extract keyword matching for specific terms, names, dates"""
        with self.conn.cursor() as cur:
            # Search for exact phrase and important keywords
            cur.execute("""
                SELECT id, title, content, 1.0 AS score, 'exact' AS search_type
                    FROM documents
                    WHERE content LIKE %s
                    ORDER BY char_length(content)  -- Prefer shorter, more focused articles
                    LIMIT %s
            """, (f"%{query}%", limit))
            return cur.fetchall()


    def fulltext_search(self, query: str, limit: int = 5):
        """PostgreSQL full-text search with ranking"""
        with self.conn.cursor() as cur:
            cur.execute("""
                SELECT id, title, content,
                       ts_rank_cd(
                            to_tsvector('english', content),
                            query
                        ) AS score,
                       'fulltext' AS search_type
                FROM documents,
                     plainto_tsquery('english', %s) AS query
                WHERE to_tsvector('english', content) @@ query
                ORDER BY score DESC
                LIMIT %s
            """, (query, limit))
            return cur.fetchall()


    def semantic_search(self, query: str, limit: int=5):
        """Vector-based semantic search using Jina embeddings"""
        try:
            # Generate query embedding
            query_embeddings = self.jina.embed([query], task="retrieval.query")

            with self.conn.cursor() as cur:
                cur.execute("""
                    SELECT * FROM (
                        SELECT id, title, content,
                            1 - (embedding <=> %s::vector) AS score,
                            'semantic' AS search_type
                        FROM documents
                    ) a
                    WHERE score >= %s
                    ORDER BY score DESC
                    LIMIT %s
                """, (
                    str(query_embeddings[0]),  # pgvector parses the '[x, y, ...]' text form
                    SEMANTIC_SCORE_THRESHOLD,
                    limit,
                ))

                return cur.fetchall()

        except Exception as e:
            print(f"Semantic search failed: {e}")
            return []


    def combine_and_deduplicate(self, *result_sets):
        """Combine results from multiple search methods, removing duplicaties"""
        seen_ids = set()
        combined = []

        # Process results in order of priority
        for results in result_sets:
            for result in results:
                doc_id = result[0]
                if doc_id not in seen_ids:
                    seen_ids.add(doc_id)
                    combined.append({
                        "id": doc_id,
                        "title": result[1],
                        "content": result[2],
                        "score": result[3],
                        "search_type": result[4]
                    })

        return combined


    def rerank_results(
            self, query: str, results: list[Any], top_k: int = 5
        ):
        """Use Jina's rerank API for final relevance scoring"""
        if not results:
            return []

        # Prepare documents for reranking (truncate long articles)
        # Take first 2000 chars to stay within rerank limits
        documents = [result["content"][:2000] for result in results]

        try:
            rerank_response = self.jina.rerank(
                query, documents=documents, top_k=top_k
            )

            # Map reranked results back to original data
            reranked = []
            for rerank_result in rerank_response:
                original_idx = rerank_result["index"]

                result = results[original_idx].copy()
                result["rerank_score"] = rerank_result["relevance_score"]

                reranked.append(result)

            return reranked

        except Exception as e:
            print(f"Reranking error: {e}")
            return results[:top_k]


    def hybrid_search(self, query: str, limit: int = 5):
        """Main hybrid search function combining all methods"""
        # Cast wide net with all search methods
        extract_results = self.extract_search(query, limit=limit)
        fulltext_results = self.fulltext_search(query, limit=limit)
        semantic_results = self.semantic_search(query, limit=limit)

        # Combine and deduplicate (extract matches prioritized first)
        combined = self.combine_and_deduplicate(
            extract_results,
            fulltext_results,
            semantic_results
        )

        # Rerank for final relevance
        final_results = self.rerank_results(query, combined, limit)

        return final_results
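The priority merge is easy to sanity-check without a live database. The sketch below replicates combine_and_deduplicate's rule -- the first result set to mention a document id wins -- on two hand-made result sets; the ids, titles, and scores are purely illustrative.

```python
def merge_by_priority(*result_sets):
    # Same rule as combine_and_deduplicate above: iterate result sets in
    # priority order and keep only the first row seen for each document id.
    seen, combined = set(), []
    for results in result_sets:
        for doc_id, title, content, score, search_type in results:
            if doc_id not in seen:
                seen.add(doc_id)
                combined.append({
                    "id": doc_id,
                    "title": title,
                    "content": content,
                    "score": score,
                    "search_type": search_type,
                })
    return combined

# Hand-made result sets in (id, title, content, score, search_type) shape
exact = [("doc1", "Returns policy", "full text", 1.0, "exact")]
semantic = [
    ("doc1", "Returns policy", "full text", 0.91, "semantic"),  # duplicate, dropped
    ("doc2", "Warranty FAQ", "full text", 0.88, "semantic"),
]
merged = merge_by_priority(exact, semantic)
# doc1 keeps its 'exact' entry; doc2 survives from the semantic set
```

Because exact matches are passed first in hybrid_search, a document found by both methods always carries its exact-match score into reranking.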

Overall Experience

Tiger Cloud is a cloud PostgreSQL platform tailored for AI applications. This article demonstrated how to leverage its built-in vector search capability to build a conversational AI agent.

However, performance becomes a critical factor when deploying to production. pgvectorscale, another extension available on Tiger Cloud, can significantly improve query performance. Due to constraints of time and scope, a follow-up post will cover production-level usage of pgvectorscale and other performance optimization techniques.
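As a brief preview, enabling pgvectorscale and adding its StreamingDiskANN index is a small change. This is a sketch only -- the index name below is illustrative, and you should check the pgvectorscale documentation for current syntax and tuning options:

```sql
-- Enable pgvectorscale (it builds on pgvector)
CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;

-- A StreamingDiskANN index in place of the ivfflat index created earlier
CREATE INDEX documents_embedding_diskann_idx
    ON documents
    USING diskann (embedding vector_cosine_ops);
```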



