SUTRA-Online: Quantitative Evaluation for Real-Time, Factual LLM Queries

SUTRA models are connected, up-to-date, and hallucination-free models that provide factual responses in a conversational tone. They are online LLMs that retrieve and process real-time knowledge from the internet and use it to provide the most up-to-date information when forming responses. SUTRA-Online models can accurately respond to time-sensitive queries, extending their knowledge beyond a static training corpus. They can therefore answer questions such as "Who won the game last night?" or "What's the most popular movie right now?"

We evaluated the SUTRA models using the FreshPrompt framework [Vu et al., 2023], developed by Google for assessing online LLMs [Press et al., 2022], and found that SUTRA-Online models surpass the competing search engine-augmented models from Google, as well as OpenAI's GPT-3.5 and Perplexity AI. The benchmark contains an extensive set of questions covering a range of online scenarios: never-changing, in which the answer almost never changes; slow-changing, in which the answer typically changes over the course of several years; and fast-changing, in which the answer typically changes within a year or less. SUTRA performed well across the majority of these scenarios, as shown in Table 9.

Table 8: SUTRA quantitative results across a subset of supported languages for fine-grained tasks on the MMLU benchmark.

Table 9: Performance comparison of language models on fresh (real-time) queries with valid premises, according to the freshness LLM benchmark from Vu et al. [2023].
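To make the per-scenario reporting concrete, here is a minimal sketch of how accuracy might be tallied separately for the never-changing, slow-changing, and fast-changing categories. The function name `accuracy_by_category` and the sample data are hypothetical illustrations, not the paper's evaluation code or its results.

```python
from collections import defaultdict

# The three question categories named in the text.
CATEGORIES = ("never-changing", "slow-changing", "fast-changing")


def accuracy_by_category(graded):
    """Return {category: fraction correct} for categories with at least one item.

    `graded` is an iterable of (category, is_correct) pairs. This helper is a
    hypothetical illustration, not part of any benchmark's official tooling.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, is_correct in graded:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        totals[category] += 1
        correct[category] += int(is_correct)
    return {c: correct[c] / totals[c] for c in CATEGORIES if totals[c]}


# Placeholder grades for illustration only; these are NOT results from the paper.
sample = [
    ("never-changing", True),
    ("never-changing", True),
    ("slow-changing", True),
    ("slow-changing", False),
    ("fast-changing", False),
    ("fast-changing", True),
    ("fast-changing", False),
    ("fast-changing", False),
]

scores = accuracy_by_category(sample)
for category, score in scores.items():
    print(f"{category}: {score:.2f}")
```

Reporting each category on its own, rather than a single pooled accuracy, shows whether a model's errors are concentrated in the fast-changing questions, where stale training data matters most.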

:::info Authors:

(1) Abhijit Bendale, Two Platforms (abhijit@two.ai);

(2) Michael Sapienza, Two Platforms (michael@two.ai);

(3) Steven Ripplinger, Two Platforms (steven@two.ai);

(4) Simon Gibbs, Two Platforms (simon@two.ai);

(5) Jaewon Lee, Two Platforms (jaewon@two.ai);

(6) Pranav Mistry, Two Platforms (pranav@two.ai).

:::

:::info This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.

:::

This content originally appeared on HackerNoon and was authored by Speech Synthesis Technology
