
Natural Language Database Interaction via LLMs: Text-to-Query Agent

1. What is Text-to-Query?

  • Text-to-Query is a technology that lets users interact with databases in plain language. Instead of writing SQL or structured queries, users simply type questions like “Show me last month’s sales by region”, and the system automatically converts them into executable database queries.
  • Once executed, the results can be displayed, summarized, or exported for further analysis. This enables users to spot trends, identify patterns, and generate insights without needing coding expertise.
  • By bridging the gap between natural language and structured data, Text-to-Query makes data access faster and more inclusive. Non-technical teams can create reports and gain insights independently, while technical teams save time for advanced analysis instead of routine query writing. The result is a stronger data-driven culture across organizations, where decision-making becomes faster, more accurate, and widely accessible.

2. Progress in Text-to-Query Approaches

 Over the years, Text-to-Query has evolved significantly. Early systems relied on rigid rule-based methods, later replaced by deep learning and Transformer-based approaches that improved query understanding. With the rise of large language models, Text-to-Query capabilities have advanced rapidly.

  • 2000–2010: Rule-Based and Template Methods

       Natural language interfaces to databases relied on rule-based and template-based systems. These approaches had limited flexibility and were restricted to predefined queries.

  • 2010–2015: Early Deep Learning Approaches

       Early deep learning attempts using LSTM and RNN started to appear in Text-to-SQL research.

  • 2016–2019: Deep Learning-Based Text-to-SQL

       Early Text-to-SQL models such as Seq2SQL and SQLNet relied on LSTM and RNN architectures. These models encoded natural language questions and generated SQL queries, often using attention mechanisms to align question tokens with database schema elements.[1][2]

  • 2020–2022: Pretrained Models with Fine-Tuning

       Pretrained Transformer models such as BERT, RoBERTa, and T5 were fine-tuned for Text-to-SQL tasks. Benchmark datasets like Spider and CoSQL became popular for addressing multi-table and multi-turn query challenges.[3][4]

  • 2023–2025: LLM-Based Approaches

       Large language models (e.g., GPT, Gemini) achieved state-of-the-art results in natural language to query translation. Techniques like schema linking, prompt engineering, and constrained decoding improved accuracy and reliability in generated queries.[5]

3. LLM-Based Approaches

 3.1 Use Case of LLM-Based Text-to-Query

  • LLM-based Text-to-Query approaches are particularly well-suited for companies that provide a product or service used by other businesses, where extensive user or operational data is stored in the provider’s database. Through this technology, clients of the product can access their own data directly via natural language queries, often through an intuitive chatbot interface. This eliminates the need for technical knowledge or writing complex SQL queries, allowing users to quickly retrieve information, perform statistical analyses, and monitor real-time updates.

 

  • This approach is especially valuable in multi-client environments where each client has unique datasets and reporting needs. Traditional rule-based or customized solutions may struggle to scale across diverse clients and data structures, but LLM-based methods can generalize effectively, adapting to different schemas and requirements. Companies can provide a consistent and flexible experience, enabling each client to explore and analyze their own data seamlessly.

 

  • For example, a SaaS platform providing inventory management could allow each client company to query their stock levels, sales trends, or supply chain status directly through a chatbot. Meanwhile, the platform itself can serve multiple clients with different database structures, without building separate query systems for each. This flexibility makes LLM-based Text-to-Query ideal for products where both the provider and its clients benefit from fast, reliable, and user-friendly access to structured data.[6]

3.2 Building a Baseline Text-to-Query Agent

  • The agent works by retrieving from the user’s database the data needed to answer the user’s question. To do this, the question must first be converted into a database-compatible query with the help of an LLM, ensuring the query correctly reflects the intent of the question. The query is then executed and the resulting data is retrieved.

Figure 1: Example of Text-to-Query Agent Query and Executed Output
  • Depending on the size of the output and the user’s request, the results are then presented back to the user through the LLM. For example, if the user requests an Excel report, the data can be provided as an Excel file. Alternatively, if the user asks a question that can be answered directly, the LLM can communicate with the user through the chatbot interface and provide the answer immediately. In some cases, the agent can also determine the most suitable way to present the results, choosing between different formats or interaction methods to optimize the user experience.
Figure 2: Generated Answer of Text-to-Query Using the Previous Query Result
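The flow described above can be sketched in a few lines. This is a minimal, illustrative example: the `llm_generate_sql` function is a hypothetical stand-in for a real LLM call, and the `sales` table is invented for demonstration.

```python
import sqlite3

def llm_generate_sql(question: str, schema: str) -> str:
    """Placeholder for an LLM call: a real agent would send the
    question plus the schema to a model and receive SQL back."""
    # A hard-coded query stands in for the model's output here.
    return "SELECT region, SUM(amount) FROM sales GROUP BY region"

def run_agent(question: str, conn: sqlite3.Connection) -> list:
    schema = "sales(region TEXT, amount REAL)"
    sql = llm_generate_sql(question, schema)   # step 1: question -> query
    return conn.execute(sql).fetchall()        # step 2: execute, return rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EU", 100.0), ("EU", 50.0), ("US", 75.0)])

rows = run_agent("Show me sales by region", conn)
```

In a production agent, the raw rows would then be passed back through the LLM to produce a report, a chart, or a direct chatbot answer, as described above.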

 3.2.1 What an LLM Needs to Generate Queries

 

Figure 3: Flow of Text-to-Query Agent

Database Schema:

The LLM needs a comprehensive understanding of the user’s database. This includes detailed knowledge of the database schema, the tables within the database, and the columns in those tables along with their descriptions. In addition, the LLM must be aware of the relationships between tables, including foreign keys and primary keys. It is also important for the LLM to recognize the type of database being used, such as a relational SQL database, MongoDB, or another engine.

Sending all database information to an LLM as a single prompt allows the model to access detailed knowledge about the database. However, in databases with a large number of tables, for example 100 to 200, or in highly complex schemas, it may not be feasible for the LLM to fully process every table. Including all this information can also significantly impact both the model’s performance and the speed of its responses.

To address this, the database schema can be stored as vectors in a vector database. When a user submits a query, only the columns and schema elements most likely relevant to that query are retrieved using a ranking mechanism and sent to the LLM. This ensures that the model is not overwhelmed with unnecessary tables or columns and can focus on the information that matters. As a result, the LLM produces more accurate queries while maintaining better performance and efficiency.

In this context, the key factor is having a database schema that accurately represents the database. It is crucial that the LLM receives only the relevant tables and columns, ensuring that no necessary information is missed, while at the same time minimizing the inclusion of unnecessary tables. To achieve this, RAG (Retrieval-Augmented Generation) or similar retrieval-based approaches can be employed.
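The retrieval step can be illustrated with a toy example. Here a bag-of-words cosine similarity stands in for a real embedding model and vector database, and the schema snippets are invented for demonstration; only the top-k matches would be sent to the LLM.

```python
import math
from collections import Counter

# Toy schema store: in production these descriptions would be embedded
# with a real embedding model and kept in a vector database.
SCHEMA = {
    "orders(id, customer_id, total, created_at)": "customer orders with total amount and date",
    "customers(id, name, region)": "customer master data with region",
    "inventory(sku, stock, warehouse)": "stock levels per warehouse",
}

def embed(text: str) -> Counter:
    """Bag-of-words stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_schema(question: str, k: int = 2) -> list:
    """Rank schema snippets against the question; send only the top-k to the LLM."""
    q = embed(question)
    ranked = sorted(SCHEMA, key=lambda t: cosine(q, embed(SCHEMA[t])), reverse=True)
    return ranked[:k]

tables = retrieve_schema("total order amount per customer region")
```

Swapping the bag-of-words `embed` for a real embedding model and the dictionary for a vector store gives the RAG-style retrieval described above without changing the ranking logic.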

To select the most suitable method, a well-designed test dataset should be prepared. For example, one could create a database consisting of 50 different tables and extract its schema. Then, a set of questions that users might ask is generated, and for each question, the ground truth list of tables and columns required to answer it is established.

When a user submits a query, the method retrieves the schemas of the corresponding tables. The retrieved tables can then be compared with the ground truth using evaluation metrics such as recall, which measures how many of the necessary tables and columns were correctly retrieved, and subset or containment check, which verifies whether all required ground truth tables are included within the predicted tables. By using these metrics, different retrieval approaches can be systematically tested and evaluated to determine which method performs best.
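The two metrics mentioned above are simple to implement. The following sketch treats predicted and ground-truth tables as sets; the example table names are illustrative.

```python
def recall(predicted: set, ground_truth: set) -> float:
    """Fraction of required tables/columns that were actually retrieved."""
    if not ground_truth:
        return 1.0
    return len(predicted & ground_truth) / len(ground_truth)

def containment(predicted: set, ground_truth: set) -> bool:
    """Subset check: are all ground-truth tables among the predictions?"""
    return ground_truth <= predicted

gt = {"orders", "customers"}
pred = {"orders", "customers", "inventory"}  # over-retrieved but complete
```

Note that `pred` here scores perfect recall and passes the containment check despite including an extra table, which is why precision-style metrics are also worth tracking when minimizing unnecessary schema context.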

User Intent:

To generate correct queries, it is not enough for the LLM to understand the database; it must also accurately capture the user’s intent. In an agent-based setup, the system should determine whether the user wants the result as a report, a visualization (e.g., a chart), or a direct textual answer, or whether the request should be routed to other specialized agents. Intent can be determined using techniques such as classification, few-shot prompting, or retrieval-augmented generation (RAG), which help the system accurately interpret and act on the user’s request.

If the user’s query is incomplete or ambiguous, the intent recognition module can trigger a clarification step, asking follow-up questions before routing the request. This is not only important for selecting the right agent but also for resolving ambiguities in the user’s request. For example, if the user asks “What are the most popular products?”, the system should clarify whether they mean by total revenue, highest growth, most sales in the last week, customer ratings, number of transactions, regional demand, etc. Handling these cases ensures that the chosen agent receives a precise task and can return accurate and meaningful results.
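A minimal router with a clarification fallback might look like the sketch below. The keyword rules stand in for an LLM-based intent classifier, and the labels and trigger words are illustrative assumptions, not a production heuristic.

```python
# Keyword rules stand in for an LLM-based intent classifier; the routing
# labels ("report", "chart", "answer") and trigger words are illustrative.
INTENT_KEYWORDS = {
    "report": ["excel", "report", "export"],
    "chart": ["chart", "plot", "graph", "visualize"],
}

# Terms that need a metric to be defined before a query can be built.
AMBIGUOUS_TERMS = ["popular", "best", "top"]

def route(question: str) -> dict:
    q = question.lower()
    # Crude ambiguity check: "popular products" without a "by <metric>" clause.
    if any(term in q for term in AMBIGUOUS_TERMS) and "by" not in q:
        return {"action": "clarify",
                "follow_up": "By which metric: revenue, units sold, or ratings?"}
    for intent, words in INTENT_KEYWORDS.items():
        if any(w in q for w in words):
            return {"action": intent}
    return {"action": "answer"}  # default: direct textual answer
```

In practice the keyword lists would be replaced by a few-shot prompt or a fine-tuned classifier, but the control flow (clarify first, then route) stays the same.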

Examples and Query Patterns:

Since the LLM will ultimately generate queries, providing it with example queries that are aligned with the database structure and table content can significantly improve the quality of the generated results. These examples serve as concrete references, helping the model understand typical query patterns, field types, and expected outputs.

Additionally, query generation can be approached in a step-by-step manner, where the LLM incrementally constructs the query by first identifying the relevant collections or tables, then selecting the necessary fields, applying filters, and finally adding aggregations or sorting as needed. This stepwise process allows for better control, reduces errors, and ensures that the generated queries are both syntactically correct and semantically meaningful.
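The stepwise process above can be made concrete by separating each decision into its own argument, so that each one could come from a distinct LLM call. The function and argument names here are illustrative.

```python
def build_query(tables, columns, filters=None, group_by=None, order_by=None):
    """Assemble SQL from the intermediate decisions of each step.
    In a real agent, each argument would come from a separate LLM call:
    table selection, then column selection, then filters, then aggregation."""
    sql = f"SELECT {', '.join(columns)} FROM {', '.join(tables)}"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    if order_by:
        sql += " ORDER BY " + ", ".join(order_by)
    return sql

query = build_query(
    tables=["sales"],
    columns=["region", "SUM(amount) AS total"],
    filters=["created_at >= '2024-01-01'"],
    group_by=["region"],
    order_by=["total DESC"],
)
```

Because each clause is produced and validated separately, an error at one step (a missing table, an invalid column) can be caught and corrected before the full query is assembled.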

3.2.2 Query Generation and Agent Performance

To thoroughly evaluate the performance of a Text-to-Query agent, it is essential to create comprehensive and diverse test datasets. These datasets should cover different database structures, from simple single-table schemas to highly complex multi-table schemas. In addition, separate tests should be conducted for different database types, such as SQL and MongoDB, since the agent may not deliver the same performance across all of them.

For testing purposes, sample databases can either be generated or sourced from real-world datasets. Alongside these, a set of user questions expressed in natural language should be prepared. Each question should have a corresponding ground truth query and expected query output. These paired datasets form the basis for evaluating the agent’s ability to accurately translate natural language input into executable queries.

During testing, the agent should be provided with these questions to generate queries. Each generated query is then executed against the relevant database, and the resulting outputs are compared to the expected outputs. After that, the agent can be evaluated in terms of both accuracy and performance.

  1. Model Accuracy

  • Ability to generate syntactically correct queries.
  • Correct selection of relevant tables and columns.
  • Returning results that match the expected outputs.

  2. System Performance

  • Average response time for query generation and execution.
  • Resource usage (e.g., memory and compute requirements).
  • Average execution cost across different query complexities.
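A minimal evaluation harness along these lines is sketched below, assuming an in-memory SQLite database and comparing result sets order-insensitively; the sample table and queries are invented for demonstration.

```python
import sqlite3
import time

def evaluate(conn, cases):
    """Compare each generated query's rows against the ground-truth query's
    rows (order-insensitive) and record per-query execution time."""
    results = []
    for generated_sql, ground_truth_sql in cases:
        start = time.perf_counter()
        try:
            got = sorted(conn.execute(generated_sql).fetchall())
            ok = got == sorted(conn.execute(ground_truth_sql).fetchall())
        except sqlite3.Error:
            ok = False  # a syntactically invalid query counts as a failure
        results.append({"correct": ok,
                        "seconds": time.perf_counter() - start})
    accuracy = sum(r["correct"] for r in results) / len(results)
    return accuracy, results

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EU", 100.0), ("US", 75.0)])

cases = [
    ("SELECT region FROM sales", "SELECT region FROM sales"),   # match
    ("SELECT amount FROM sales WHERE region = 'EU'",
     "SELECT amount FROM sales"),                               # mismatch
]
accuracy, _ = evaluate(conn, cases)
```

Comparing executed results rather than query strings is deliberate: two differently written queries that return the same rows should both count as correct.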

By observing both accuracy and performance metrics across various scenarios, developers can gain a comprehensive understanding of the agent’s operational efficiency, reliability, and scalability. This ensures the Text-to-Query agent is robust, capable of handling diverse user intents and database configurations, and delivers consistent results in real-world applications.

Table 1: Advantages and Limitations of Text-to-Query Agent

4. Conclusion

Text-to-Query technology, especially when powered by large language models, significantly simplifies database interactions for users of all technical levels. By converting natural language questions into accurate queries, it enables faster data retrieval, insightful analysis, and seamless reporting. With careful consideration of database schemas, user intent, and agent design, this approach can be adapted across industries and database types, offering an efficient and user-friendly solution for modern data-driven decision-making. Moreover, when integrated with chatbot interfaces, it allows for more interactive and natural user experiences, automates multiple tasks, and further simplifies complex data operations, making it easier for users to explore and analyze their data.

At MDP Group, Text-to-Query technologies are applied to deliver advanced solutions that simplify data access and analysis. By addressing clients’ unique needs and challenges, the most suitable methods are implemented to ensure efficient, reliable, and user-friendly products. In this way, organizations are enabled to gain actionable insights and strengthen their decision-making processes.

To explore how Text-to-Query technologies can transform your business intelligence, get in touch with the MDP AI Team.

5. References

[1] Zhong, V., Xiong, C., & Socher, R. (2017). Seq2SQL: Generating structured queries from natural language using reinforcement learning.

[2] Yu, T., Yasunaga, M., Yang, K., Zhang, R., Wang, D., Li, Z., & Radev, D. (2018). SyntaxSQLNet: Syntax tree networks for complex and cross-domain text-to-SQL task.

[3] Deng, X., Awadallah, A. H., Meek, C., Polozov, O., Sun, H., & Richardson, M. (2020). Structure-grounded pretraining for text-to-SQL.

[4] Qi, J., Tang, J., He, Z., Wan, X., Cheng, Y., Zhou, C., Wang, X., Zhang, Q., & Lin, Z. (2022). RASAT: Integrating relational structures into pretrained seq2seq model for text-to-SQL.

[5] Cao, Z., Zheng, Y., Fan, Z., Zhang, X., Chen, W., & Bai, X. (2024). RSL-SQL: Robust schema linking in text-to-SQL generation.

[6] “Democratize data for timely decisions with text-to-SQL at Parcel Perform”, AWS Blogs, 09 JUL 2025.

 
