Following up on my previous post about foundational AI concepts, I’m back with Part 2 of my AI learning journey!
While Part 1 covered how AI works, this post tackles how we can use AI responsibly. A crucial side of AI goes beyond the technical aspects: governance, ethics, risk management, and ensuring AI benefits everyone.
My AI Governance Notes
I’m sharing my notes below to help others navigate AI governance. These break down complex frameworks into digestible insights.
Note: These are personal notes, not comprehensive guides. Use them as a starting point for understanding responsible AI practices.
Key takeaway: AI governance isn’t about slowing innovation. It’s about ensuring innovation benefits everyone.
Frameworks, Standards, & More
EU AI Act (risk-based approach)
Risk Categories
Prohibited: Social scoring, subliminal manipulation, real-time biometric ID
High-Risk: Biometric systems, employment, education, law enforcement, healthcare
Documentation Elements (e.g., Model Cards)
Data Used: Training data sources and characteristics
Metrics: Performance and fairness measures
Ethical Concerns: Identified risks and mitigations
Deployment Context: Where and how it’s used
Risk Management Key Components
Model Inventory: Catalog of all AI systems
Tiering: Risk-based classification system
Controls: Safeguards and mitigations
Incident Response Plan: Procedures for problems
Human Oversight Levels
Human-in-the-loop: Human makes final decisions. Used for high-stakes decisions.
Human-on-the-loop: Human monitors and can intervene. Used for medium-risk applications.
Human-out-of-loop: Automated with oversight. Used for low-risk, high-volume applications.
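To make the inventory, tiering, and oversight ideas a bit more concrete, here is a tiny illustrative sketch. The field names and entries are my own toy examples, not taken from the EU AI Act or any specific framework: a model inventory as a list of records, with a helper that flags the high-risk systems for review.

```python
# Illustrative only: a toy AI model inventory with risk tiers and oversight levels.
# Field names and entries are hypothetical, not taken from any official framework.
from dataclasses import dataclass

@dataclass
class AISystem:
    name: str
    use_case: str
    risk_tier: str   # e.g., "high", "medium", "low"
    oversight: str   # "human-in-the-loop", "human-on-the-loop", or "human-out-of-loop"

inventory = [
    AISystem("resume-screener", "employment", "high", "human-in-the-loop"),
    AISystem("support-chatbot", "customer service", "medium", "human-on-the-loop"),
    AISystem("spam-filter", "email triage", "low", "human-out-of-loop"),
]

def needs_review(systems: list[AISystem]) -> list[AISystem]:
    # High-risk systems get routed to controls review and incident-response planning
    return [s for s in systems if s.risk_tier == "high"]

for system in needs_review(inventory):
    print(f"Review required: {system.name} ({system.use_case})")
```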
Lately, I've been running security assessments on various LLM applications using NVIDIA's GARAK tool. If you haven't come across it yet, GARAK is a powerful open-source scanner that checks LLMs for all kinds of vulnerabilities: everything from prompt injection to jailbreaks and data leakage.
The tool itself is fantastic, but there was one thing driving me crazy: the reports.
The Problem with JSONL Reports
GARAK outputs all its test results as JSONL files (JSON Lines), which are basically long text files with one JSON object per line. Great for machines, terrible for humans trying to make sense of test results.
I’d end up with these massive files full of valuable security data, but:
Couldn’t easily filter by vulnerability type
Had no way to sort or prioritize issues
Couldn’t quickly see patterns or success rates
Struggled to share the results with non-technical team members
Anyone who’s tried opening a raw JSONL file and making sense of it knows the pain I’m talking about.
The Solution: JSONL to Excel Converter
After wrestling with this problem, I finally decided to build a solution. I created a simple Python script that takes GARAK’s JSONL reports and transforms them into nicely organized Excel workbooks.
The tool
Takes any JSONL file (not just GARAK reports) and converts it to Excel
Creates multiple sheets for different views of the data
Adds proper formatting, column sizing, and filters
Generates summary sheets showing test distributions and success rates
Makes it easy to identify and prioritize security issues
Here’s what the output looks like for a typical GARAK report:
Summary sheet: Shows key fields like vulnerability type, status, and probe class
All Data sheet: Contains every single field from the original report
Status Analysis: Breaks down success/failure rates across all tests
Probe Success Rates: Shows which vulnerability types were most successful
Why This Matters
If you’re doing any kind of LLM security testing, quickly making sense of your test results is key. This simple conversion tool has saved me hours and helped me focus on real vulnerabilities instead of wrangling with report formatting.
The best part is that the code is super simple: just a few lines of Python using pandas and xlsxwriter. I've put it up on GitHub for anyone to use.
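I won't paste the full script here, but a minimal sketch of the same idea looks something like this. Column names such as "status" are assumptions about the report's schema rather than GARAK's exact field names:

```python
# Minimal sketch of a JSONL -> Excel converter using pandas and xlsxwriter.
# Column names like "status" are illustrative assumptions, not GARAK's exact schema.
import json
import pandas as pd

def jsonl_to_excel(jsonl_path: str, xlsx_path: str) -> None:
    # One JSON object per line -> flat DataFrame
    with open(jsonl_path, "r", encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]
    df = pd.json_normalize(records)

    with pd.ExcelWriter(xlsx_path, engine="xlsxwriter") as writer:
        # "All Data" sheet: every field from the original report
        df.to_excel(writer, sheet_name="All Data", index=False)

        # "Status Analysis" sheet: success/failure counts, if a status column exists
        if "status" in df.columns:
            counts = df["status"].value_counts().rename_axis("status").reset_index(name="count")
            counts.to_excel(writer, sheet_name="Status Analysis", index=False)

        # Add filters and rough column sizing on the main sheet
        sheet = writer.sheets["All Data"]
        sheet.autofilter(0, 0, len(df), max(len(df.columns) - 1, 0))
        for i, col in enumerate(df.columns):
            longest = int(df[col].astype(str).str.len().max()) if len(df) else 0
            sheet.set_column(i, i, min(max(longest, len(str(col))) + 2, 60))

if __name__ == "__main__":
    jsonl_to_excel("garak_report.jsonl", "garak_report.xlsx")
```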
Wrapping Up
Sometimes the simplest tools make the biggest difference. I built this converter to scratch my own itch, and it’s been surprisingly effective at saving time and effort.
If you’re doing LLM security testing with GARAK, I hope it helps make your workflow smoother too.
Also, check out my second tool: GARAK Live Log Monitor with Highlights. It’s a bash script that lets you watch GARAK logs in real-time, automatically highlights key events, and saves a colorized log for later review or sharing.
In the fast-growing world of artificial intelligence (AI), Ollama is becoming a popular tool for people who want to run powerful AI language models on their own computers. Instead of relying on cloud servers, Ollama lets you run AI models locally, meaning you have more privacy and control over your data. This guide will show you how to install and set up Ollama on Kali Linux so you can experiment with AI models right from your device.
What Is Ollama?
Ollama is a software framework that makes it easy to download, run, and manage large language models (LLMs) like LLaMA and other similar models on your computer. It's designed for privacy and efficiency, so your data doesn't leave your device. Ollama is growing in popularity with developers and researchers who need to test AI models in a secure, private environment without sending data over the internet.
Why Use Ollama?
Ollama is gaining popularity for several reasons:
Privacy: Running models locally means your data stays on your device, which is crucial for people handling sensitive information.
Performance: Ollama is optimized to run on CPUs, so you don't need a high-end graphics card (GPU) to use it.
Ease of Use: With simple commands, you can easily download and manage different AI models, making it accessible for beginners and advanced users alike.
Why Install Ollama on Kali Linux?
Kali Linux is a popular choice for cybersecurity professionals, ethical hackers, and digital forensics experts. It's packed with tools for security testing, network analysis, and digital investigations. Adding Ollama to Kali Linux can be a big advantage for these users, letting them run advanced AI language models right on their own computer. This setup can help with tasks like analyzing threats, automating reports, and processing natural language data, such as logs and alerts.
By using Ollama on Kali Linux, professionals can:
Make Documentation Faster: AI models can help write reports, summaries, and other documents, saving time and improving consistency.
Automate Security Analysis: Combining Ollama with Kali's security tools allows users to build scripts that look for trends, scan reports, and even identify potential threats.
Before You Install
To get started with Ollama on Kali Linux, make sure you have:
Kali Linux version 2021.4 or later.
Enough RAM (at least 16GB is recommended for better performance).
sudo access on your system
Note: Ollama was initially built for macOS, so the setup on Linux may have some limitations. Be sure to check Ollama's GitHub page for the latest updates.
Steps to Install Ollama on Kali Linux
Step 1: Update Your System
First, update your system to make sure all packages are up to date. Open a terminal and type:
sudo apt update && sudo apt upgrade -y
Step 2: Install Ollama
The official Ollama installation for Debian-based systems (which includes Kali) is straightforward: run a curl command that downloads and executes the installation script:
curl -fsSL https://ollama.com/install.sh | sh
Step 3: Verify the Installation
ollama --version
You can also just enter ollama in the terminal; if it's installed correctly, you should see its help output listing the available commands.
Installing and Running LLMs
The process for installing and running LLMs on Kali Linux is the same as on other Linux distributions:
To Install an LLM:
ollama pull <LLM_NAME>
In my case, I installed the llama3.2:1b model. You can browse the full library of available models on Ollama's website.
Start Prompt
After you’ve completed the previous steps, you can start Ollama with the specific model that you installed and send your prompts:
ollama run <LLM_NAME>
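Beyond the interactive prompt, Ollama also serves a local HTTP API (on port 11434 by default), which makes it easy to call from your own scripts. Here's a minimal Python sketch; it assumes the Ollama service is running and that the llama3.2:1b model mentioned above has already been pulled:

```python
# Minimal sketch: querying a locally running Ollama model over its local HTTP API.
# Assumes Ollama is serving on the default port 11434 and llama3.2:1b is pulled.
import json
import urllib.request

def ask_ollama(prompt: str, model: str = "llama3.2:1b") -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask_ollama("Explain what Kali Linux is in one sentence."))
```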
Conclusion
Ollama provides a great way to run large language models on your own machine, keeping data secure and private. With this guide, you can install and configure Ollama on Kali Linux and explore AI without relying on cloud-based services. Whether you're a developer, AI enthusiast, or just curious about AI models, Ollama lets you experiment with language models directly from your device.
Stay tuned to the Ollama GitHub page for the latest features and updates. Happy experimenting with Ollama on Kali Linux!
Disclosure: Some of the content in this blog post may have been generated or inspired by large language models (LLMs). Effort has been made to ensure accuracy and clarity.
Artificial Intelligence (AI) is transforming the world around us, influencing industries from healthcare to finance. Recently, I had the opportunity to dive into an AI class, which provided a foundational overview of the core concepts driving this innovative field. Here, I'm excited to share my class notes.
The Birth of AI and Early Challenges
The term “Artificial Intelligence” (AI) first appeared in 1955, coined by American computer scientist John McCarthy. Just a year later, McCarthy played a pivotal role in organizing the Dartmouth Summer Research Project on Artificial Intelligence. This landmark conference brought together researchers from various disciplines and laid the groundwork for the development of related fields like data science, machine learning, and deep learning.
However, these early efforts in AI faced significant hurdles. The computers of the 1950s lacked the capacity to store complex instructions, hindering their ability to perform intricate tasks. Additionally, the exorbitant cost (leasing a computer back then could run a staggering $200,000 per month!) severely limited access to this technology. Fortunately, advancements in computer technology over the following decades led to significant improvements in processing power, efficiency, and affordability, paving the way for wider adoption of AI.
As AI systems become more complex and play larger roles in our lives, understanding how they make decisions is just as important as what those decisions are. To help make sense of this, three key concepts often come up: Explainable AI, Transparency, and Interpretability. The table below breaks down these terms in simple language to clarify what they mean and why they matter.
| Term | What It Means in Simple Terms | What It Focuses On | When You See It | Example |
| --- | --- | --- | --- | --- |
| Explainable AI (XAI) | AI that can tell you why/how it made a decision in a way you can understand | Giving clear reasons or justifications for specific AI outputs | Usually used when AI is complex and needs extra help explaining its decisions | A tool that explains why a loan was denied by highlighting key factors |
| Transparency | Being open about how the AI system works overall: its data, methods, and design. Transparency can answer the question of "what happened" in the system | Sharing details about the AI's structure and training process, but not explaining individual decisions | When you want to understand the general workings of the AI, not specific outcomes | Publishing the training data sources and model type publicly |
| Interpretability | How easy it is for a person to see and follow how the AI made a decision | The simplicity and clarity of the model's decision-making process itself | Often refers to models that are simple enough to understand directly | A decision tree that shows step-by-step how it classified an input |
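As a quick, hands-on illustration of the interpretability example in the table, here's a short scikit-learn sketch (using the built-in iris dataset) that trains a shallow decision tree and prints its decision rules so a person can follow them step by step:

```python
# Sketch of an interpretable model: a shallow decision tree whose learned
# rules can be printed and read directly (scikit-learn's iris dataset).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Prints human-readable rules like "petal width (cm) <= 0.80  ->  class: 0"
print(export_text(tree, feature_names=iris.feature_names))
```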
Artificial Intelligence is a branch of computer science that deals with the creation of intelligent agents, which are systems that can reason, learn, and act autonomously.
Types of AI: Weak & Strong
When people talk about AI, they usually mean one of two kinds.
Weak AI, sometimes called Narrow AI, is designed to do just one thing well. Think of a GPS app like Google Maps that finds the best route for you, or voice assistants like Siri and Alexa that understand simple commands. These systems are really good at their specific tasks but can't do anything outside of them.
Strong AI, also known as Artificial General Intelligence or AGI, is different. This type would be able to learn and think across many different areas, kind of like a human. It would understand new situations and make decisions on its own, not just follow pre-set instructions. We don't have strong AI yet, but it's what many researchers are aiming for: something like the sci-fi idea of a truly intelligent robot or assistant that can help with anything you ask.
AI Breakdown
Artificial Intelligence
Machine Learning
Deep Learning: Deep Learning is a type of machine learning that uses artificial neural networks, inspired by the structure of the human brain. These networks can learn complex patterns from large amounts of data and achieve high accuracy.
Deep Neural Networks (DNNs)
Inspired by the human brain, they learn from data (like a baby learning a language) to recognize patterns and make predictions.
Need lots of data. The more data they see, the better they perform.
Highly accurate. Great for tasks like image recognition and speech recognition.
Neural Network Layers/Architectures
The different ways that DNNs can be constructed
Finding the right Layer/Architecture combination is a creative and challenging process.
Generative Adversarial Networks (GANs)
GANs are a type of deep learning system using two neural networks: a generator and a discriminator.
Imagine two art students competing. The generator keeps creating new art pieces, while the discriminator tries to identify if a piece is real or a forgery.
Through this adversarial training, both networks improve. The generator creates more realistic forgeries, and the discriminator gets better at spotting them.
Have the potential to be used in defensive & offensive cybersecurity.
Diffusion models are a recent advancement in generative AI specifically focused on creating high-quality, realistic images. They work by learning to remove noise from random noise, essentially reversing a noise addition process.
Analogy for understanding DNNs, GANs, & Diffusion models:
Think of DNNs as the general tools in a workshop. They provide the foundational capabilities for various tasks.
GANs are like specialized sculpting tools. They excel at creating new and interesting shapes (images) but might require more effort to refine the final product.
Diffusion models are like high-precision restoration tools. They meticulously remove noise and imperfections to create a clear and detailed image, but the process might take longer.
*RL, NLP, CV don’t always require Deep Learning to function.
But when Deep Learning is used, the power of Deep Neural Networks is applied, which improves accuracy but requires more data and computing power.
When Deep Learning is applied, the word "Deep" is added to the front of the name (e.g., Deep Computer Vision).
Adding DNNs isn’t a silver bullet to solving all use cases.
Machine Learning
Machine Learning is a sub-field of AI that focuses on teaching computers to make predictions based on data.
There are three key aspects to designing a Machine Learning solution:
Objectives: What do you want the program to achieve? (e.g., spam detection, weather forecasting)
Data: The information the program will learn from. This data can be labeled (supervised learning) or unlabeled (unsupervised learning).
Algorithms: The method the program uses to learn from the data.
Data Types:
Structured Data:
This type of data is organized and follows a predefined format, like a spreadsheet with clear headings and rows/columns. It’s easily searchable and analyzed by computers.
Unstructured Data:
This data doesn’t have a fixed format and can be messy or complex. Examples include emails, social media posts, images, and videos.
While it requires additional processing, unstructured data can be incredibly valuable.
Humans primarily communicate using unstructured data, like natural language.
Unstructured data is vast and growing rapidly, exceeding the amount of structured data in the world.
AI Features
Before teaching a machine learning model, it's important to pay attention to the data it learns from: the features. How you choose, prepare, and check these features can make a big difference in how well the model works and how fair its decisions are. Here's a simple breakdown of some key steps involving features and how they can help reduce bias.
| Term | Simple Explanation | What It Focuses On | When in ML Pipeline | Relation to Bias Mitigation | Example / Note |
| --- | --- | --- | --- | --- | --- |
| Feature Validation | Ensuring features are accurate and consistent | Data quality checks | Early and ongoing data processing | Important for data quality; indirect impact on bias | Checking for missing or incorrect values |
| Feature Selection | Choosing which data inputs to use in the model | Picking relevant, useful, and fair features | Before or during model training | Helps reduce bias by excluding problematic features | Removing sensitive features like race or gender |
| Feature Transformation | Changing features into suitable formats or scales | Data preparation like normalization or encoding | Before or during training | No direct bias mitigation; just data formatting | Scaling age values to a 0–1 range |
| Feature Engineering | Creating or modifying features to improve the model | Combining selection, transformation, creation | During feature preparation | Can reduce or introduce bias depending on design | Creating an "income-to-debt" ratio feature |
| Feature Importance | Measuring which features impact model predictions most | Understanding feature influence after training | After training, for interpretation | Does not fix bias; just shows what matters most | Income strongly influences loan approval |
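To tie a few of these steps together, here's a small scikit-learn sketch on made-up data: a sensitive feature is dropped (selection), an income-to-debt ratio is created (engineering), and the remaining columns are scaled to a 0–1 range (transformation):

```python
# Toy data and a few feature steps with pandas and scikit-learn.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({
    "age": [22, 35, 58, 41],
    "income": [28000, 54000, 91000, 62000],
    "debt": [5000, 12000, 3000, 20000],
    "gender": ["F", "M", "F", "M"],   # sensitive feature
})

# Feature selection: exclude the sensitive feature from the model inputs
features = df.drop(columns=["gender"])

# Feature engineering: create an income-to-debt ratio feature
features["income_to_debt"] = features["income"] / features["debt"]

# Feature transformation: scale every column to the 0-1 range
scaled = pd.DataFrame(MinMaxScaler().fit_transform(features), columns=features.columns)
print(scaled.round(2))
```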
Why Is Data the Most Important Factor for Fairness in AI?
The biggest factor in making AI fair is the data it's trained on. If the training data doesn't include enough variety (especially from different groups of people or situations), the AI will likely pick up and even amplify those biases. No matter how advanced the model, how carefully the data is labeled, or how accurate the final predictions are, none of that can fix problems caused by limited or unrepresentative training data.
Think of it like this: AI learns patterns from the data it sees. If that data doesn't show the full diversity of the real world, the AI will have blind spots and make biased decisions. Other factors can help improve a model, but they can't make up for training data that doesn't reflect the real-world variety it needs to understand.
Machine Learning Types
Supervised Learning is like being taught in school. The data you train the model on has labels or pre-defined categories. The model learns the relationship between these labels and the data to make future predictions.
Example: Classifying emails as spam or important.
Unsupervised Learning is more like exploring on your own. The data you provide has no labels. The model finds hidden patterns or groups within the data.
Example: Grouping customers based on their shopping habits.
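Here's a tiny scikit-learn sketch (toy data of my own) contrasting the two: a supervised spam classifier learns from labeled emails, while an unsupervised clustering step groups customers without any labels at all:

```python
# Supervised vs. unsupervised learning on toy data with scikit-learn.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.cluster import KMeans

# Supervised: spam vs. important, learned from labeled examples
emails = ["win a free prize now", "meeting at 3pm tomorrow", "free money click here", "lunch with the team"]
labels = ["spam", "important", "spam", "important"]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)
classifier = MultinomialNB().fit(X, labels)
print(classifier.predict(vectorizer.transform(["claim your free prize"])))  # -> ['spam']

# Unsupervised: grouping customers by shopping habits, no labels given
spending = np.array([[500, 5], [520, 6], [40, 1], [35, 2]])  # [monthly spend, visits]
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(spending))
```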
Deterministic AI vs. Non-Deterministic AI
| Aspect | Deterministic AI | Non-Deterministic AI |
| --- | --- | --- |
| Output | Always the same for the same input | Can vary for the same input |
| Decision-making | Follows fixed, predefined rules or algorithms | Involves randomness, probabilities, or learning |
| Examples | Rule-based systems (e.g., chess engines, SPSS for statistical analysis) | Probabilistic and learning-based systems (e.g., language models like ChatGPT) |
| Hallucinations | Unlikely, as outputs are strictly defined by rules and logic | More likely due to probabilistic nature, especially in language models (e.g., ChatGPT) |
| Strengths | Reliable, consistent, easier to validate | Flexible, adaptable, handles complex and dynamic environments |
| Weaknesses | Limited by rigid decision-making and lack of flexibility | Can be inconsistent, prone to hallucinations, harder to explain |
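A tiny toy example (my own, not from the notes) of the difference: the deterministic function below always returns the same route for the same traffic level, while the non-deterministic one samples from a probability, so repeated calls can disagree, much like a language model sampling its next token:

```python
# Deterministic rule vs. non-deterministic, probability-based choice.
import random

def deterministic_route(traffic_level: int) -> str:
    # Fixed rule: the same input always produces the same output
    return "highway" if traffic_level < 5 else "back roads"

def non_deterministic_route(traffic_level: int) -> str:
    # Probabilistic: heavier traffic makes "back roads" more likely,
    # but the answer can change from run to run
    p_back_roads = min(traffic_level / 10, 1.0)
    return "back roads" if random.random() < p_back_roads else "highway"

print([deterministic_route(7) for _ in range(3)])      # always the same
print([non_deterministic_route(7) for _ in range(3)])  # may vary between runs
```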
Common Machine Learning Techniques
Regression: Used for predicting continuous values, like house prices or temperature.
Real Life Example: Uber uses regression to create a dynamic pricing model. This model considers factors like time of day, demand, and location to predict the optimal price for a ride. This balances customer retention (not setting prices too high) with price maximization (earning as much revenue as possible).
Classification: Sorting things into categories, like spam detection or identifying fraudulent credit card activity.
Real Life Example: American Express uses classification algorithms to identify potentially fraudulent activities on their credit cards. The algorithm is trained on historical data of fraudulent and legitimate transactions. It analyzes factors like purchase location, amount, and spending habits to flag suspicious activity in real time.
Clustering: Finding natural groups within unlabeled data, like grouping customers with similar shopping habits.
Real Life Example: Spotify uses clustering for collaborative and content-based filtering to personalize user experience. Collaborative filtering groups users with similar listening habits and recommends music enjoyed by similar users. Content-based filtering clusters songs based on audio features and recommends songs similar to what a user already enjoys.
Association Rule Learning: Discovering hidden relationships between things in unlabeled data, like recommending movies based on what other viewers with similar tastes watched.
Real Life Example: Bali Tourism Board uses association rule learning to determine which attraction combinations tourists visit most often and when. By analyzing tourist data, they can uncover patterns like "tourists visiting temples often visit beaches afterward." This allows for better optimization of infrastructure, staffing, and accommodation availability at different attractions throughout the day.
Reinforcement Learning (RL)
Where a computer is given a problem. It is rewarded (+1) for finding a solution or punished (-1) for not finding a solution.
Unlike supervised learning, the computer (agent) is NOT given instructions on how to complete the task. Instead, it learns through trial and error to learn which actions are good and which are bad.
Similar to how humans learn.
How babies learn how to walk. When a baby falls over, it feels pain and learns not to repeat the same action again.
Because of this, reinforcement learning is often considered the closest thing we have to true artificial intelligence.
Since they learn from interaction rather than from labeled historical data, reinforcement learning algorithms can be better tools for finding solutions with less bias and discrimination.
It is adaptable and doesn't require full retraining.
It can learn live, online (e.g., Spotify or e-commerce recommendations).
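As a hands-on taste of trial-and-error learning, here's a very small sketch (a simplified "bandit" setup with 0/1 rewards rather than the +1/-1 scheme above): the agent isn't told which action is best, it just tries actions, observes rewards, and updates its estimates:

```python
# Epsilon-greedy trial-and-error learning on a toy three-action problem.
# The hidden reward probabilities are made up for illustration.
import random

reward_probs = [0.2, 0.5, 0.8]   # hidden payoff chance of each action
estimates = [0.0, 0.0, 0.0]      # the agent's learned value estimates
counts = [0, 0, 0]
epsilon = 0.1                    # how often the agent explores at random

for _ in range(5000):
    if random.random() < epsilon:
        action = random.randrange(3)                # explore
    else:
        action = estimates.index(max(estimates))    # exploit the best-known action
    reward = 1 if random.random() < reward_probs[action] else 0
    counts[action] += 1
    # incremental average: nudge the estimate toward the observed reward
    estimates[action] += (reward - estimates[action]) / counts[action]

print("Learned value estimates:", [round(v, 2) for v in estimates])
```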
Difference between Reinforcement Learning and Unsupervised Learning:
Reinforcement learning is about learning through rewards and punishments to make decisions, while unsupervised learning is about finding hidden patterns or structures in data without any rewards or feedback.
Analogy:
Imagine you’re observing a dog. In reinforcement learning, you are training the dog by giving it a treat when it sits and saying “no” when it misbehaves. The dog learns which actions lead to rewards.
In unsupervised learning, you are not interacting with the dog at all. Instead, you’re just watching a group of dogs and trying to figure out patterns on your own, like which ones behave similarly or belong to the same breed.
Natural Language Processing (NLP)
Field of AI concerned with the interactions between computers and human natural languages.
Focuses on programming computers to process and analyze large amounts of natural language data.
The most complex part of NLP is extracting accurate context and meaning from natural language.
Involves two main tasks (some applications may require both, while others may only need one):
Natural Language Understanding (NLU):
Maps the given input from natural language into a formal representation and analyzes it.
Example: If the input is audio, then speech recognition is applied first. This converts the audio to text, and then the hard part of interpreting the meaning of the text is performed.
Natural Language Generation (NLG):
Process of producing meaningful phrases and sentences in the form of natural language from some internal representation.
NLG is generally considered much easier than NLU.
It can also convert generated text into speech (e.g., Siri and Alexa).
Generative AI: a broad term for AI that generates data, such as text, images, audio, video, or code.
Natural Language Processing (NLP) Use Cases
ChatBots
Analyze survey results
Document review for compliance, legality, typo, etc.
Large Language Models (LLMs)
LLMs focus on algorithms capable of understanding, generating, and interacting with human languages.
Their key advantage is that they excel at understanding context over long stretches of unstructured text data.
How do LLMs work?
LLMs are built on layers of neural networks, specifically designed to mimic how humans process and generate language.
They can’t think like humans, but they leverage human language patterns to simulate human-like text generation.
One way they do this is by predicting the next most likely word in a sequence.
How are LLMs trained?
Pre-Training:
LLMs are exposed to massive amounts of text data from various sources like wikis, blogs, social media, news articles, and books.
During this process, they learn and practice predicting the next word in sentences.
Fine-Tuning:
The model is then trained on datasets specific to a particular task, allowing it to apply its capabilities to solve specific business challenges.
The versatility of LLMs lies in their ability to be customized for a wide range of tasks, from general to highly specific.
Examples of Large Language Models
| Model | Advantages | Category |
| --- | --- | --- |
| GPT-1 | First publicly available GPT (Generative Pre-Trained) model | Text |
| GPT-2 | Significantly increased performance over GPT-1 | Text |
| GPT-3 | State-of-the-art performance on many NLP tasks | Text |
| Jurassic-1 Jumbo | Large and powerful LLM, excels in code generation | Text |
| GPT-J | Focused on translation tasks, excels in multilingual translation | Text |
| DALL-E 2 | Generates high-quality and creative images | Image |
| Midjourney | Generates high-quality and creative images; uses Discord as its interface for generating AI art | Image |
| Stable Diffusion | Generates high-quality and creative images | Image |
| BERT | Excellent for text understanding and question answering | Text |
| Bard | Large language model from Google AI, similar to LaMDA | Text |
| LaMDA | Focuses on dialogue applications, can be informative and comprehensive | Text |
Prompt Engineering: crafting ideal inputs in order to get the most out of large models.
Token: A unit easily understood by a language model. One word can be made up of multiple tokens. Example:
| Words | Tokens |
| --- | --- |
| Everyone | [Every, one] |
| I'd love | [I, 'd, love] |
Tokenization: The mechanism by which a model splits its inputs into tokens. The tokenization method used can greatly affect a model's output. Large language models take tokens as input and produce tokens as output; the model uses its training on vast text sources to predict which tokens most likely follow the input.
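To show what splitting text into tokens can look like, here's a toy greedy longest-match tokenizer over a tiny made-up vocabulary. Real LLM tokenizers (BPE, WordPiece, and friends) are far more sophisticated; this only mirrors the "Everyone -> [Every, one]" idea from the table above:

```python
# Toy subword tokenizer: greedy longest-match against a tiny made-up vocabulary.
VOCAB = ["every", "one", "i", "'d", "love", " "]

def tokenize(text: str) -> list[str]:
    tokens, i = [], 0
    text = text.lower()
    while i < len(text):
        # take the longest vocabulary entry that matches at position i
        match = max((v for v in VOCAB if text.startswith(v, i)), key=len, default=None)
        if match is None:
            match = text[i]   # unknown character becomes its own token
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("Everyone"))   # ['every', 'one']
print(tokenize("I'd love"))   # ['i', "'d", ' ', 'love']
```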
Retrieval-Augmented Generation (RAG)
RAG improves the accuracy and reliability of traditional large language models (LLMs) by incorporating information from external sources.
It is ideal for situations where the underlying data changes frequently and the system needs to generate tailored outputs based on the latest information.
Traditional LLMs generate text based solely on the input they receive and their learned parameters. They do not directly retrieve external information during the generation process.
External Knowledge Base: RAG integrates the LLM with an external source of reliable information, like a specialized database or knowledge base. This allows the LLM to access and reference factual information when responding to prompts or questions.
In a way, RAG attempts to combine LLM capabilities with a traditional search engine (i.e., if ChatGPT and Google had a child).
Improve Response Generation: When you ask a question, RAG first uses the LLM to understand your intent. Then, it retrieves relevant information from the external knowledge base and feeds it back into the LLM. Finally, the LLM uses this combined knowledge to generate a response that is both comprehensive and accurate.
Multimodal Generative AI: Involves multiple data types (text, images, audio).
Expert System: Rule-based systems with static logic; less flexible for dynamic, frequently changing information.
RAGs help LLMs to:
Be more factually accurate
Stay up-to-date with current information
Provide users with a better sense of where their answers are coming from
Challenges:
LLMs: LLMs generate text based on the patterns and associations learned from vast amounts of text data. Therefore, the main challenge with LLMs lies in their potential to generate incorrect or misleading information, especially in scenarios where the training data is biased or incomplete.
RAGs: While RAG addresses some of these issues by leveraging external knowledge, it introduces challenges related to the retrieval process itself (e.g., ethical issues or violations of terms of service), such as ensuring the retrieved information is accurate, up-to-date, and relevant to the context of the generated text.
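To make the retrieve-then-generate flow concrete, here's a minimal sketch. The tiny "knowledge base", the keyword-overlap scoring, and the prompt assembly are all simplified stand-ins; a real system would use embeddings, a vector database, and an actual LLM call:

```python
# Minimal RAG-style sketch: retrieve relevant facts, then build an augmented prompt.
KNOWLEDGE_BASE = [
    "Ollama runs large language models locally on your own machine.",
    "GARAK is an open-source scanner that probes LLMs for vulnerabilities.",
    "The EU AI Act classifies AI systems by risk level.",
]

def retrieve(question: str, top_k: int = 1) -> list[str]:
    # Toy retrieval: rank documents by how many words they share with the question
    q_words = set(question.lower().split())
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda doc: len(q_words & set(doc.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def build_prompt(question: str) -> str:
    # The retrieved facts are fed to the LLM alongside the user's question
    context = "\n".join(retrieve(question))
    return f"Use the context below to answer.\n\nContext:\n{context}\n\nQuestion: {question}"

print(build_prompt("What does GARAK scan for?"))
```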
Computer Vision (CV)
There are 2 types of Computer Vision algorithms:
Classical CV: excels at specific tasks like object detection (identifying cats or dogs) with high speed and accuracy.
Deep Learning CV: used when classical methods don’t provide enough power for complex tasks. Deep Learning algorithms can learn intricate patterns from vast amounts of data, enabling them to tackle more challenging computer vision problems. Examples:
Facial Recognition: Deep learning algorithms can analyze facial features with high accuracy, enabling applications like unlocking smartphones with your face, identifying individuals in security footage, or even personalizing advertising based on demographics.
Self-Driving Cars: Deep learning is crucial for self-driving cars to navigate complex environments. These algorithms can process visual data from cameras in real-time, allowing the car to identify objects like pedestrians, vehicles, and traffic signals, and make decisions accordingly.
Medical Image Analysis: Deep learning can analyze medical images (X-rays, MRIs) with impressive accuracy, assisting doctors in tasks like cancer detection, anomaly identification, and treatment planning.
Object Detection and Tracking: While classical CV can handle basic object detection, deep learning excels at identifying and tracking a wider range of objects in real-time. This is valuable for applications like surveillance systems, sports analytics, and autonomous robots.
Image Segmentation: Deep learning can segment images into specific regions, allowing for a more granular understanding of the content. This is useful in applications like autonomous farming (identifying crops vs weeds), self-checkout systems (differentiating between items), and augmented reality (overlaying virtual objects on real-world scenes).
Handwriting Recognition (HWR)
Imagine you have a letter written by your friend, but instead of typing, they used pen and paper. Handwriting Recognition (HWR) is like a magic decoder for that letter. So, HWR takes your friend’s written letter (handwriting) and turns it into something a computer can understand (recognition). It basically translates the scribbles into text you can read on a screen!
AI architecture is basically how an AI system is built: how it learns, thinks, and makes decisions. Think of it like the blueprint or design plan of a machine. Different architectures use different methods to understand and work with data.
Core Types of AI Architectures
| Category | Examples | Description |
| --- | --- | --- |
| Symbolic AI | Expert Systems, Logic Rules | Uses predefined logic and rules, no learning. Good for explainability. |
| Statistical / Classical ML | Decision Trees, Random Forests, SVM | Learns from data using traditional algorithms. Structured, tabular data. |
| Neural Networks (Deep Learning) | CNNs, RNNs, Transformers, LLMs | Learns patterns from unstructured data (images, text). Highly scalable. |
| Generative Models | GANs, VAEs, Diffusion Models, LLMs | Learns to generate new data similar to what it was trained on. |
| Hybrid / Neuro-symbolic | Symbolic + Neural (e.g., Logic + LLM) | Combines rule-based reasoning with learning-based perception. |
| Retrieval-Augmented Generation (RAG) | LLM + external knowledge base | Uses search to retrieve facts, then generates answers. Reduces hallucinations. |
| General-Purpose AI (GPAI) | Not yet fully realized (AGI) | Hypothetical AI that can perform any cognitive task. LLMs are precursors. |
What About Auxiliary Tools?
These are tools and software that help you build, run, and manage AI systems. They're not AI themselves but are essential to making AI work in the real world. Some examples:
| Tool Type | Examples | Purpose |
| --- | --- | --- |
| Frameworks | TensorFlow, PyTorch, Scikit-learn | Build and train models |
| Data Processing | Pandas, NumPy, Apache Spark | Prepare data for AI |
| Model Ops / Deployment | MLflow, Kubernetes, SageMaker | Manage lifecycle of models |
| Monitoring | WhyLabs, Evidently AI, Arize | Track model performance |
| Explainability | SHAP, LIME | Make AI decisions interpretable |
| Data Labeling | Labelbox, Prodigy | Annotate training data |
| Embedding Stores / Vector DBs | Pinecone, FAISS, Weaviate | Power retrieval in RAG systems |
| Prompt Engineering / LLM Toolkits | LangChain, LlamaIndex | Build apps on top of LLMs |
This is just the tip of the iceberg when it comes to AI. There’s a whole world out there, and I’m just getting started exploring it myself. I hope to share future notes on any additional learning. Feel free to reach out if you have any questions or comments.
AI Architecture = the structural design of how an AI system thinks and learns.
Auxiliary Tools = supporting tech used to build, deploy, interpret, and scale AI systems.
AI Stack Layers (Simplified)
Platforms and Applications (Top Layer) This is where people interact with AI. It includes things like:
Cloud services (AWS SageMaker, Google AI Platform)
Apps powered by AI (chatbots, recommendation systems)
APIs that let developers plug AI into their own software
Think of this as the user-friendly interface and infrastructure that make AI accessible and useful. It sits above the actual AI models and tools, wrapping everything so users can easily use the AI.
Model Types (Middle Layer) These are the actual AI architectures: the brains behind the scenes. Examples:
Neural networks (including transformers)
Decision trees and random forests
Expert systems (rule-based AI)
Generative models
This layer does the learning and decision-making. It takes data and figures out patterns or generates new content.
Auxiliary Tools (Supporting Layer) These help build, train, deploy, monitor, and explain the AI models. Examples:
Frameworks like PyTorch or TensorFlow
Data processing libraries
Monitoring and explainability tools
Vector databases for retrieval
What Does It Mean for AI to Be Robust?
A robust AI system holds up well when things get messy in the real world. It keeps doing its job correctly even when faced with noisy data, weird inputs, or deliberate attempts to throw it off course.
Example: Think of a robust self-driving car that stays safe on the road despite sudden downpours, partially covered street signs, or unexpected construction work. It handles these curveballs without missing a beat.
This isn’t the same as:
Reliable: A reliable system is consistent but only under normal conditions. It’s like that friend who’s always on time when everything goes according to plan, but falls apart when complications arise. Example: A reliable car performs perfectly on a clear day with perfect road markings, but might get confused during a heavy fog or when facing unusual traffic patterns.
Resilient: A resilient system bounces back after problems but doesn’t necessarily stay steady during the rough patch. Example: A resilient car might temporarily shut down certain functions when it detects a problem, then restore normal operation once conditions improve. It recovers well, but doesn’t necessarily power through difficulties.
Disclaimer: Various AI tools were used to assist with research and content writing for this blog. While every effort has been made to verify the accuracy of the information presented, some details may evolve as AI technology advances. Readers are encouraged to consult additional sources for the most up-to-date information.