
Introducing SchemaWiseAI: The AI-Powered Solution for Seamless Database Query Mapping

In today’s data-driven world, businesses and organizations generate vast amounts of data every day. Cybersecurity analysts, data engineers, and database administrators are increasingly turning to Large Language Models (LLMs) to help generate complex database queries. However, these LLM-generated queries often don’t align with an organization’s specific database schema, creating a major headache for data professionals.

This is where SchemaWiseAI comes in. It is a middleware tool, currently at the proof-of-concept stage, designed to bridge the gap between generic AI outputs and the specific needs of your data infrastructure. With SchemaWiseAI, you no longer need to manually adjust LLM-generated queries. The tool automatically transforms queries to match your data schema, saving time, reducing errors, and making data management easier.

What is SchemaWiseAI?

SchemaWiseAI is a middleware solution that adapts LLM-generated queries to match the unique database schemas of your organization. By ingesting your custom data structures, SchemaWiseAI rewrites each query to use your own field names and structures, removing the need for manual adjustments. This keeps your data queries accurate, efficient, and easy to use, so you can focus on what matters most: getting insights from your data.

Why SchemaWiseAI?

LLMs can produce useful queries, but they often come with generic field names and structures that don’t fit your system. This mismatch requires tedious manual work to adapt each query to your specific data schema, causing unnecessary delays and increasing the chances of errors.

SchemaWiseAI solves this problem by automatically mapping field names and data structures to your custom schema. It makes sure that the queries generated by LLMs are accurate, efficient, and ready for execution in your environment, without the need for manual intervention.

Key Features of SchemaWiseAI

  1. Field Name Mapping: Automatically converts generic field names from LLM-generated queries into your custom names.
  2. Query Transformation: Transforms AI-generated queries to fit your exact data schema.
  3. Template-Based Query Generation: Quickly generates queries using predefined templates that match your system.
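As a rough illustration of the third feature, the sketch below fills a predefined query template with values taken from a user request. It is not the SchemaWiseAI implementation: the `PROXY_TEMPLATE` name, the `build_query` helper, and the `$method` and `$status` placeholders are all assumptions made for this example.

```python
from string import Template

# Hypothetical template for proxy logs. The $method and $status placeholders
# are assumptions for this sketch, not part of SchemaWiseAI itself.
PROXY_TEMPLATE = Template(
    'sourcetype="proxy" | where mtd="$method" AND status=$status '
    "| stats count as request_count by url, srcip | sort -request_count"
)


def build_query(method: str, status: int) -> str:
    """Fill the template with the values extracted from the user's request."""
    return PROXY_TEMPLATE.substitute(method=method, status=status)


print(build_query("GET", 404))
```

Keeping the template in the organization's internal field names means the field-mapping step can be applied afterwards as a separate pass.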

Example

The current proof-of-concept (POC) version of SchemaWiseAI includes a network proxy mapping feature. Below is a snippet of this mapping, which shows how internal field names used within the organization (on the left) are automatically mapped to new field names (the "map_to" values). For example, proxy log data with field names like "srcip", "dstip", and "status" is mapped to standardized names such as "src", "dst", and "http_status".

"proxy_logs": {
    "fields": {
        "srcip": {"map_to": "src", "type": "string"},
        "dstip": {"map_to": "dst", "type": "string"},
        "bytes": {"map_to": "bytes_total", "type": "string"},
        "status": {"map_to": "http_status", "type": "string"},
        "dhost": {"map_to": "dest_host", "type": "string"},
        "proto": {"map_to": "protocol", "type": "string"},
        "mtd": {"map_to": "method", "type": "string"},
        "url": {"map_to": "uri", "type": "string"}
    }
}
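To make the structure of this mapping concrete, here is a minimal sketch of how such a snippet could be loaded and flattened into a simple lookup table. The snippet is wrapped in outer braces so that it parses as a standalone JSON document. The `load_field_map` helper is hypothetical and is not taken from the SchemaWiseAI code.

```python
import json

# The proxy_logs mapping from the post, wrapped in outer braces so that it
# parses on its own as a JSON document.
SCHEMA_JSON = """
{
  "proxy_logs": {
    "fields": {
      "srcip": {"map_to": "src", "type": "string"},
      "dstip": {"map_to": "dst", "type": "string"},
      "bytes": {"map_to": "bytes_total", "type": "string"},
      "status": {"map_to": "http_status", "type": "string"},
      "dhost": {"map_to": "dest_host", "type": "string"},
      "proto": {"map_to": "protocol", "type": "string"},
      "mtd": {"map_to": "method", "type": "string"},
      "url": {"map_to": "uri", "type": "string"}
    }
  }
}
"""


def load_field_map(schema_text: str, source: str) -> dict[str, str]:
    """Hypothetical helper: flatten one source's fields into {name: map_to}."""
    fields = json.loads(schema_text)[source]["fields"]
    field_map = {name: spec["map_to"] for name, spec in fields.items()}
    # Two fields sharing one target would make the mapped query ambiguous.
    targets = list(field_map.values())
    if len(targets) != len(set(targets)):
        raise ValueError(f"duplicate map_to targets in {source!r}")
    return field_map


field_map = load_field_map(SCHEMA_JSON, "proxy_logs")
print(field_map["mtd"])
```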

Output

The final outcome of this schema transformation appears as follows:

User Prompt Request: List all HTTP GET requests with status 404 from the last hour

Using template query: sourcetype="proxy" | where mtd="GET" AND status=404 | stats count as request_count by url, srcip | sort -request_count

Final Query: sourcetype="proxy" | where method="GET" AND http_status=404 | stats count as request_count by uri, src | sort -request_count
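The step from the template query to the final query can be approximated with a whole-word rename that leaves quoted values untouched. The sketch below is one possible approach, not necessarily how SchemaWiseAI does it. The `rewrite_query` function is a hypothetical name, and the regex-based strategy is an assumption.

```python
import re

# Field mapping from the proxy_logs example (internal name -> mapped name).
FIELD_MAP = {
    "srcip": "src",
    "dstip": "dst",
    "bytes": "bytes_total",
    "status": "http_status",
    "dhost": "dest_host",
    "proto": "protocol",
    "mtd": "method",
    "url": "uri",
}


def rewrite_query(query: str, field_map: dict[str, str]) -> str:
    """Rename whole-word field names in one pass, skipping quoted strings."""
    names = sorted(field_map, key=len, reverse=True)
    # Quoted strings are matched first so that values such as "GET" or
    # "proxy" are consumed as-is and never treated as field names.
    pattern = re.compile(
        r'"[^"]*"|\b(?:' + "|".join(re.escape(n) for n in names) + r")\b"
    )

    def replace(match: re.Match) -> str:
        token = match.group(0)
        if token.startswith('"'):
            return token  # quoted value: leave unchanged
        return field_map[token]

    return pattern.sub(replace, query)


template_query = (
    'sourcetype="proxy" | where mtd="GET" AND status=404 '
    "| stats count as request_count by url, srcip | sort -request_count"
)
print(rewrite_query(template_query, FIELD_MAP))
```

Doing all renames in a single regex pass avoids chained replacements, where the output of one rename would accidentally be picked up by another.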

For more transformation examples, check out the project on GitHub.

Why Choose Ollama for SchemaWiseAI?

At the core of the current SchemaWiseAI is Ollama (https://ollama.com/), a powerful, local AI platform that runs models directly on your machine, ensuring security, privacy, and speed. Here’s why Ollama is the ideal platform for SchemaWiseAI:

  1. Privacy and Security: Run AI models locally, ensuring that your sensitive data remains secure.
  2. Customizable AI: Tailor the LLM to your specific database needs with ease.
  3. Real-Time Performance: No cloud latency, providing fast, on-demand query generation.
  4. Cost-Effective: Avoid high cloud processing costs by running everything on your own infrastructure.

To get started with Ollama, review my last post where I shared steps on how to install and configure Ollama on Kali.

Who Can Benefit from SchemaWiseAI?

SchemaWiseAI is designed for professionals who work with data and rely on accurate, fast, and customized queries. Key users include:

  1. Cybersecurity Analysts: Quickly generate and refine queries for security logs and threat detection.
  2. Data Engineers: Automate the process of adapting AI queries to fit specific database structures.
  3. Database Administrators: Ensure that all queries are properly aligned with custom schemas, reducing errors and failures.
  4. Business Intelligence Analysts: Easily generate optimized queries for reporting, dashboards, and insights.

Current Limitations and Planned Improvements

  1. Support for More LLMs: Expanding beyond Ollama to include platforms like OpenAI and other popular models.
  2. Integration with More Data Schemas: Supporting a wider range of schemas, such as Palo Alto logs, DNS logs, and Windows logs.
  3. Improved UX/UI: Enhancements to the user interface for a more intuitive experience.
  4. Expanded Query Optimization: More features to optimize queries for different platforms and use cases.
  5. Scalability: Addressing scalability limitations through a machine learning approach, a pattern-based learning approach, or a hybrid of the two.

Getting Started with SchemaWiseAI

Ready to give SchemaWiseAI a try? Follow the getting-started steps listed on GitHub: https://github.com/azeemnow/Artificial-intelligence/tree/main/SchemaWiseAI

Conclusion: Transform Your Data Queries with SchemaWiseAI

SchemaWiseAI is the perfect solution for organizations looking to streamline their query generation process, improve query accuracy, and save time. Whether you’re a cybersecurity analyst, data engineer, or business intelligence analyst, SchemaWiseAI is designed to make working with data more efficient.

By automating the transformation of LLM-generated queries into organization-specific formats, SchemaWiseAI saves you the time and effort needed for manual adjustments. And with future features like broader LLM support, expanded schema integration, and improved user experience, SchemaWiseAI is positioned to become a game-changer in the world of data querying.

Disclosure:

Please note that some of the SchemaWiseAI code and content in this post were generated with the help of AI/Large Language Models (LLMs). The generated code and content have been carefully reviewed and adapted to ensure accuracy and relevance.

 
