Our Methodology
How VibeDex benchmarks and scores AI image generators — transparently and independently.
Last updated: February 2026
Benchmark Scope
200+
Test Prompts
Photorealism, illustration, typography, product shots, concept art, and edge cases
20
Models Benchmarked
All major providers — same prompts, same conditions, no cherry-picking
3,500+
Evaluations
Every model-prompt pair scored across multiple quality dimensions
How We Evaluate
Every generated image is evaluated by Gemini 3 Pro, our primary vision-language model (VLM) judge. We tested multiple VLMs — including Gemini 2.5 Pro, Claude Opus, and Claude Sonnet — before selecting Gemini 3 Pro for its consistency, scoring calibration, and ability to assess fine-grained visual quality across diverse styles.
Our prompt suite is designed to isolate specific quality dimensions. Some prompts target photorealistic accuracy, others stress-test text rendering, physical plausibility, or complex multi-subject compositions. Every model runs the same prompts under the same conditions — we generate a single image per model-prompt pair with no cherry-picking or re-rolling.
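The one-image-per-pair evaluation loop described above can be sketched as follows. The `generate` and `judge_image` helpers are hypothetical stand-ins for the provider APIs and the VLM judge, and the model names and prompt are invented examples:

```python
# Sketch of the evaluation loop: every model runs every prompt exactly
# once, with no re-rolling. generate() and judge_image() are hypothetical
# placeholders, not real VibeDex or provider APIs.

def run_benchmark(models, prompts, generate, judge_image):
    """Score each model-prompt pair from a single generation."""
    results = {}
    for model in models:
        for prompt in prompts:
            image = generate(model, prompt)            # one generation per pair
            results[(model, prompt)] = judge_image(image, prompt)
    return results

# Example with stub functions in place of real APIs:
scores = run_benchmark(
    models=["model-a", "model-b"],
    prompts=["a red cube on a glass table"],
    generate=lambda m, p: f"{m}:{p}",                  # placeholder image handle
    judge_image=lambda img, p: {"visual_fidelity": 7.5},
)
```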
What We Measure
Unlike single-score rankings, VibeDex evaluates models across four quality dimensions, each with granular sub-metrics. This means we can match you with the right model for your specific task.
Visual Fidelity
Overall image quality and visual appeal:
- Aesthetics — artistic quality, color harmony, visual impact
- Image Quality — sharpness, low noise, artifact-free rendering
- Composition — framing, balance, visual hierarchy
Physics & Logic
Realistic lighting, materials, gravity, and physical plausibility:
- Static Physics — gravity, support, spatial relationships
- Material Physics — textures, reflections, transparency
- Biomechanics — natural poses, joint articulation, movement
Subject & Object Integrity
Accurate anatomy, object coherence, and scene consistency:
- Human Subjects — anatomy, faces, hands, proportions
- Object Integrity — structural coherence, correct details
- Scene Logic — spatial relationships, context consistency
Instruction Adherence
How faithfully the output matches the prompt:
- Semantic Accuracy — correct subjects, actions, attributes
- Spatial Framing — camera angle, layout, positioning
- Text Rendering — accuracy and legibility of in-image text
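The taxonomy above can be written out as a simple mapping. The dimension and sub-metric names come from this page; the dictionary structure itself is only illustrative:

```python
# The four quality dimensions and their sub-metrics, as listed above.
# The dict layout is an illustration, not VibeDex's internal schema.
DIMENSIONS = {
    "Visual Fidelity": ["Aesthetics", "Image Quality", "Composition"],
    "Physics & Logic": ["Static Physics", "Material Physics", "Biomechanics"],
    "Subject & Object Integrity": ["Human Subjects", "Object Integrity", "Scene Logic"],
    "Instruction Adherence": ["Semantic Accuracy", "Spatial Framing", "Text Rendering"],
}
```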
Scoring Approach
Not all dimensions matter equally for every prompt. A product photography prompt demands high visual fidelity and physics accuracy, while a fantasy illustration prioritizes composition and subject integrity.
Our scoring engine analyzes each prompt to determine which quality dimensions are most important. The primary dimension is scored in depth across its sub-metrics, while the remaining dimensions receive holistic scores. The final score is a weighted combination across all four dimensions, tuned to what your specific prompt demands.
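The weighting scheme described above can be sketched as follows: the primary dimension's sub-metric scores are averaged, the other three dimensions each contribute one holistic score, and a prompt-dependent weight vector combines all four. Every weight and score below is a made-up illustration, not VibeDex's actual values:

```python
# Illustrative weighted-score combination. Weights and scores are
# invented examples; the real engine derives weights from the prompt.

def final_score(sub_scores, holistic, weights):
    """Combine a deep-scored primary dimension with holistic scores.

    sub_scores: sub-metric scores for the primary dimension
    holistic:   {dimension: score} for the remaining three dimensions
    weights:    {dimension: weight} over all four, summing to 1.0
    """
    # The primary dimension is the one scored via sub-metrics,
    # i.e. the one without a holistic score.
    primary_dim = next(d for d in weights if d not in holistic)
    primary = sum(sub_scores.values()) / len(sub_scores)
    total = weights[primary_dim] * primary
    total += sum(weights[d] * s for d, s in holistic.items())
    return round(total, 2)

# Product-photography prompt: Visual Fidelity weighted most heavily.
score = final_score(
    sub_scores={"Aesthetics": 8.0, "Image Quality": 9.0, "Composition": 7.0},
    holistic={"Physics & Logic": 8.5, "Subject & Object Integrity": 9.0,
              "Instruction Adherence": 7.5},
    weights={"Visual Fidelity": 0.4, "Physics & Logic": 0.3,
             "Subject & Object Integrity": 0.2, "Instruction Adherence": 0.1},
)
```

Here the primary dimension averages to 8.0, so the combined score is 0.4·8.0 + 0.3·8.5 + 0.2·9.0 + 0.1·7.5 = 8.3.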
Limitations
No benchmark is perfect. We believe in being transparent about ours:
- Automated scoring only — our evaluations are AI-judged. Human review validates trends but does not produce individual scores.
- English-focused prompt set — all evaluation prompts are currently in English. Multi-language support is planned.
- Single generation per pair — we generate one image per model-prompt combination. No cherry-picking, but also no variance sampling.
- Models update frequently — providers ship updates regularly. Our scores reflect performance at the evaluation date and are re-run periodically.
- Artistic subjectivity — style preference is inherently personal. Our scores measure technical quality, not taste.
Models Benchmarked
We currently benchmark 20 image generation models across all major providers. Models are re-evaluated as new versions are released.
| Model | Tier | Cost/Image |
|---|---|---|
| Flux Schnell | Budget | $0.0010 |
| Flux Dev | Budget | $0.0030 |
| Qwen Image 2512 | Budget | $0.0030 |
| Seedream 3.0 | Standard | $0.0180 |
| Grok Imagine Image | Standard | $0.0200 |
| Reve Image | Standard | $0.0240 |
| Seedream 4.0 | Standard | $0.0300 |
| Ideogram 2a | Standard | $0.0320 |
| FLUX.2 Pro | Standard | $0.0350 |
| Nano Banana | Standard | $0.0390 |
| FLUX 1.1 Pro | Standard | $0.0400 |
| Ideogram 3.0 | Standard | $0.0400 |
| Seedream 4.5 | Standard | $0.0400 |
| Kling Image O1 | Standard | $0.0400 |
| Nano Banana 2 | Premium | $0.0670 |
| FLUX.2 Max | Premium | $0.0700 |
| Hunyuan Image 3.0 | Premium | $0.0800 |
| Runway Gen-4 Image | Premium | $0.0800 |
| GPT Image 1.5 | Premium | $0.1330 |
| Nano Banana Pro | Premium | $0.1380 |
Find the best model for your prompt
VibeDex analyzes your prompt and recommends the best AI image model based on what your specific image demands.