AI Observe 360

An enterprise AI model guardrail portal that monitors LLM & ML ethical engagement and identifies and resolves ML data risks

About


Project Type / Enterprise SaaS, AI guardrails

Role / Lead UX Designer, Researcher

Stakeholders / PM, AI Engineer, Data Scientist

Timeline / Jan 2025 - June 2025

As Kimberly-Clark continues to integrate AI solutions across every step of its business operations, understanding the engagement, value, and risks of these tools is critical to effectively leveraging AI for decision-making while minimizing legal and compliance risks related to data misuse.

Powered by real engineering insights, the system enables users to monitor analytics on LLM keyword trends, ML data alerts, and Agentic AI engagement—from high-level overviews to deep-dive analyses.

Challenges


  • 0→1 initiative

  • Navigate ambiguity around AI model structures

  • Design a storytelling analytics journey that transforms data into actions

AI 360 began as a 0-to-1 initiative with no existing infrastructure to build upon. This meant we started without clear visibility into how AI models were architected, what metrics mattered most to users, or how those metrics should connect to form an intuitive and insightful narrative.

Our challenge was to design a system that not only made AI performance data discoverable and comprehensible, but also turned those metrics into meaningful stories that could drive informed business decisions—beyond just numbers on a dashboard.

But which AI/ML model metrics are users actually looking for?

Leadership

AI 360 Overview

Although all users start at the overview, project managers and engineers need to dive into detailed use-case views. Because these pages mirror the complexity of real AI/ML setups, our design needed to bring clarity without sacrificing essential information.

AI Engineers

Metrics

Not everything needs to start with research; for this project, an action-first approach allowed us to move faster and iterate earlier.

Instead of running broad surveys to capture every possible metric, we partnered directly with stakeholders and validated candidate metrics with them, which revealed a clear pattern: leadership prioritized organization-wide engagement, while AI engineers focused on model performance. These behavioral insights shaped the initial “Overview → Deep Dive” structure of AI 360.
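
As a rough illustration of that structure, the sketch below models the two-level IA as a navigation config: an overview route for leadership and per-use-case deep dives for engineers. This is a hypothetical TypeScript sketch; the route paths, labels, and type names are our assumptions, not the product's actual code.

```ts
// Hypothetical sketch of the "Overview → Deep Dive" IA (all names assumed).
type Audience = "leadership" | "ai-engineer";

interface RouteNode {
  path: string;
  label: string;
  primaryAudience: Audience;
  children?: RouteNode[];
}

const ai360Nav: RouteNode = {
  path: "/overview",
  label: "AI 360 Overview", // org-wide engagement, aimed at leadership
  primaryAudience: "leadership",
  children: [
    {
      path: "/use-cases/:id", // model-performance deep dives for engineers
      label: "Use Case Deep Dive",
      primaryAudience: "ai-engineer",
    },
  ],
};
```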

How might we bring clarity to AI model structures that are inherently ambiguous and difficult for users to comprehend?

Deep Dive

Design Perspective

Solution:

  • Aligned terminology with clear diagrams.

  • Adapted the journey to engineering’s mental model.

  • Improved the journey with UX principles grounded in the shared alignment.

Engineer Perspective

User Flow

Misalignment:

Shared terms didn’t equal shared understanding—design and engineering defined “products,” “features,” and “use cases” differently, leading to a broken product hierarchy and a confusing exploration flow.

Design sees a hierarchy of Product → Features → Models, while developers treat anything using a model—product or feature alike—as a “use case,” because model quality controls all feature behavior.

This learning made it crucial for design to suggest that the deep-dive journey start at the product level, letting users move directly to the data models powering it and bypassing feature-level complexity.
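
To make that misalignment concrete, the hypothetical types below contrast the two mental models; every name here is illustrative, not taken from the actual system.

```ts
// Design's initial hierarchy: Product → Features → Models.
interface Model {
  id: string;
  quality: number; // model quality drives all feature behavior
}

interface Feature {
  name: string;
  models: Model[];
}

interface Product {
  name: string;
  features: Feature[];
}

// Engineering's view: anything backed by a model (product or feature
// alike) is a flat "use case", so the deep dive can link a product
// directly to the models powering it, skipping the feature layer.
interface UseCase {
  name: string;
  models: Model[];
}
```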

With the high-level architecture aligned, we began crafting the journey toward an action-driven data-monitoring experience.

Data Storytelling

The AI/ML monitoring scope was originally defined across three focus areas. However, mid-project, business stakeholders challenged the direction, leading us to shift priorities. Instead of spinning off an entirely new monitoring section, I reframed the proposal to emphasize the long-term value of all three areas while recommending a short-term focus on the most feasible ones—those with manageable technical complexity, fewer operational constraints, and, most importantly, reliable data that didn’t rely on heuristic assumptions.

Stakeholder Concerns:

High Cognitive Load: Especially for those unfamiliar with AI/ML performance monitoring.

Weak Storytelling Hierarchy: The cause-and-effect flow is implicit, making the data exploration journey—value decay → accuracy → drift → data quality—hard to comprehend.

Lower Engagement With Actions: Real data quality issues are buried at the end of the flow, so users lose sight of the root cause of value decay.

UX Concerns:

Requires User Interaction to See Full Picture: Some users may not click through, missing important metrics unless prompted.

Inefficient for Expert Users: Users who have already learned the process may want to skip the guided flow.

Workaround:

Added a Root Cause Summary:
Provides instant insights linked to the relevant graphs for clarity and trust.

Toggle to Switch Views: Guided Mode ↔ Expert Mode

Instructional Design Enhancements:
Introduced numbered steps across the diagnostic journey to illustrate cause-and-effect order.
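
As a closing sketch, the hypothetical TypeScript below shows how the three fixes could fit together: numbered diagnostic steps that make the cause-and-effect order explicit, a root-cause summary surfaced up front, and a toggle between Guided and Expert modes. The names and shapes are our assumptions, not the shipped implementation.

```ts
// The diagnostic journey, in explicit cause-and-effect order.
const diagnosticSteps = [
  "Value decay",
  "Accuracy",
  "Drift",
  "Data quality",
] as const;

type ViewMode = "guided" | "expert";

interface DiagnosticView {
  mode: ViewMode;
  rootCauseSummary: string; // instant insight, linked to its graphs
  visibleSteps: readonly string[];
}

// Guided mode reveals steps one at a time; expert mode shows the
// full picture at once so experienced users can skip the walkthrough.
function buildView(
  mode: ViewMode,
  currentStep: number,
  summary: string
): DiagnosticView {
  return {
    mode,
    rootCauseSummary: summary,
    visibleSteps:
      mode === "guided"
        ? diagnosticSteps.slice(0, currentStep + 1)
        : diagnosticSteps,
  };
}
```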