Open Standard
The VDX Framework
An open quality standard for evaluating AI creative tools — image models, video models, and creative platforms. Free to use. Take what you need. The only ask: call it VDX, so everyone speaks the same language.
Last updated: April 2026
Why an Open Standard?
Most AI tool comparisons use a single overall score or subjective preference votes. Neither tells you why one tool is better for your specific use case. The VDX Framework breaks quality into structured dimensions so assessments are comparable, explainable, and actionable — no matter who runs them.
These dimensions were shaped by creators and creative platforms who told us what actually matters when choosing AI tools. We publish them openly because comparable assessments help everyone — creators making decisions, platforms improving their products, and reviewers producing evaluations others can build on.
Use the full framework or just the dimensions that matter most to you. The point is a shared vocabulary: when someone says “VDX Visual Fidelity: 4.2”, everyone knows exactly what that means.
How It Works
The framework uses a two-level hierarchy. L1 dimensions are the primary quality categories — these are what most people will use. Each L1 has L2 sub-dimensions for deeper analysis when it matters. For most comparisons, scoring at the L1 level is enough.
The quick version:
- Pick the framework that matches what you're evaluating (image, video, or platform)
- Score each L1 dimension on a 1–5 scale
- Weight dimensions by what matters for your use case
- Compare across tools using the same dimensions
Only go to L2 when you need to understand why something scored the way it did, or when two tools are close on an L1 and you need a tiebreaker.
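As a rough illustration of the steps above, the sketch below scores two tools on the four image L1 dimensions, applies use-case weights, and compares the results. The tool scores and weights are made up, and the weighted mean is an assumed way to combine scores; the framework does not prescribe a formula.

```python
# Hypothetical example: scores and weights are illustrative, not real benchmark data.
L1_IMAGE_DIMENSIONS = [
    "Visual Fidelity",
    "Physics & Logic",
    "Subject & Object Integrity",
    "Instruction Adherence",
]


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted mean of 1-5 L1 scores (an assumed aggregation, not defined by VDX)."""
    for name in L1_IMAGE_DIMENSIONS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be scored on the 1-5 scale")
    total_weight = sum(weights[name] for name in L1_IMAGE_DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(scores[name] * weights[name] for name in L1_IMAGE_DIMENSIONS)
    return weighted_sum / total_weight


# Made-up weights for a use case that cares most about following the prompt.
weights = {
    "Visual Fidelity": 1.0,
    "Physics & Logic": 1.0,
    "Subject & Object Integrity": 1.0,
    "Instruction Adherence": 2.0,
}

tool_a = {
    "Visual Fidelity": 5,
    "Physics & Logic": 5,
    "Subject & Object Integrity": 5,
    "Instruction Adherence": 3,
}
tool_b = {
    "Visual Fidelity": 4,
    "Physics & Logic": 4,
    "Subject & Object Integrity": 4,
    "Instruction Adherence": 5,
}

score_a = weighted_score(tool_a, weights)
score_b = weighted_score(tool_b, weights)
print(f"Tool A: {score_a:.2f}, Tool B: {score_b:.2f}")
```

With equal weights, the hypothetical Tool A would lead (4.5 against 4.25). Doubling the weight on Instruction Adherence moves Tool B ahead (4.4 against 4.2), which is why the weights should be stated alongside any published comparison.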
AI Image Models
4 dimensions, 12 sub-dimensions. Covers everything from photorealism to typography to hand anatomy.
Visual Fidelity
Overall image quality, aesthetics, and visual polish.
- Aesthetics — artistic quality, color harmony, visual impact
- Image Quality — sharpness, noise, artifact-free rendering
- Composition — framing, balance, visual hierarchy
Physics & Logic
Realistic lighting, materials, gravity, and physical plausibility.
- Static Physics — gravity, support, spatial relationships
- Material Physics — textures, reflections, transparency
- Biomechanics — natural poses, joint articulation, hand anatomy
Subject & Object Integrity
Accurate anatomy, object coherence, and scene consistency.
- Human Subjects — anatomy, faces, hands, proportions
- Object Integrity — structural coherence, correct details
- Scene Logic — spatial relationships, context consistency
Instruction Adherence
How faithfully the output matches the prompt.
- Semantic Accuracy — correct subjects, actions, attributes
- Spatial Framing — camera angle, layout, positioning
- Text Rendering — accuracy and legibility of in-image text
AI Video Models
5 dimensions, 13 sub-dimensions. Extends the image framework with motion, temporal consistency, and audio.
Visual Fidelity
Frame-level image quality, lighting, and clarity.
- Resolution & Clarity — sharpness, detail preservation
- Lighting & Color — exposure, color accuracy, consistency
- Artifact-Free — no flicker, banding, or compression issues
- Composition — framing, visual balance across frames
Motion & Physics
Realistic movement, physics, and temporal stability.
- Physics Accuracy — gravity, collisions, fluid dynamics
- Motion Fluidity — smooth movement, natural speed
- Temporal Stability — no morphing, warping, or jitter between frames
Subject & Scene Consistency
Identity and environment maintained across the clip.
- Scene Coherence — consistent environment, lighting, perspective
- Identity Preservation — subjects stay recognizable throughout
Instruction Adherence
How faithfully the video matches the prompt.
- Semantic Accuracy — correct subjects, actions, scene described
- Camera Control — requested angles, movements, transitions
Audio & Sync
Audio quality and synchronization with visuals.
- Audio Quality — clarity, naturalness, appropriate sound design
- Lip Sync — mouth movement matches speech timing and phonemes
Creative AI Platforms
5 dimensions, 20 sub-dimensions. Evaluates the full platform experience beyond just output quality — from onboarding to export to trust.
Generation Core
The core creation experience — from prompt to output.
- Onboarding — clicks to first output, guided tutorials
- Prompt Assistance — enhancement, suggestions, style presets
- Model Selection — number/quality of models, routing transparency
- Speed — generation time, queue handling, batch support
- Output Quality — overall quality across available models
Post-Generation
What you can do with outputs after they're created.
- Iteration Tools — inpainting, outpainting, variations, upscaling
- Editing Suite — built-in editing (crop, filter, enhance, layers)
- Cross-Modal — image, video, audio, text, 3D in one platform
Output & Access
Getting your work out of the platform and managing it.
- Export Options — formats, resolution, watermark-free exports
- Output Management — gallery, history, search, collections
- Mobile Experience — app quality, responsive design, touch UX
- Templates & Presets — pre-made templates, reusable style presets
Integration & Value
Developer tools, customization, team workflows, and pricing transparency.
- API Access — availability, documentation, developer tools
- Customization — LoRA training, fine-tuning, custom models
- Collaboration — team features, sharing, co-editing, workspaces
- Pricing Transparency — clear pricing, no hidden fees, fair credits
Trust & Experience
Rights, safety, and platform reliability.
- Usage Rights — commercial rights, ownership, licensing clarity
- Content Safety — moderation, NSFW filters, safety guardrails
- Platform Trust — reputation, billing practices, refund policy
- UX Polish — UI quality, responsiveness, error handling
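For anyone building tooling around the framework, the three hierarchies listed above can be held as plain data. The sketch below is one possible representation, not an official schema. The dimension names are copied from this page; the variable and function names (`VDX`, `counts`, `missing_dimensions`) are hypothetical.

```python
# framework -> L1 dimension -> list of L2 sub-dimensions, as listed on this page.
VDX = {
    "image": {
        "Visual Fidelity": ["Aesthetics", "Image Quality", "Composition"],
        "Physics & Logic": ["Static Physics", "Material Physics", "Biomechanics"],
        "Subject & Object Integrity": ["Human Subjects", "Object Integrity", "Scene Logic"],
        "Instruction Adherence": ["Semantic Accuracy", "Spatial Framing", "Text Rendering"],
    },
    "video": {
        "Visual Fidelity": ["Resolution & Clarity", "Lighting & Color", "Artifact-Free", "Composition"],
        "Motion & Physics": ["Physics Accuracy", "Motion Fluidity", "Temporal Stability"],
        "Subject & Scene Consistency": ["Scene Coherence", "Identity Preservation"],
        "Instruction Adherence": ["Semantic Accuracy", "Camera Control"],
        "Audio & Sync": ["Audio Quality", "Lip Sync"],
    },
    "platform": {
        "Generation Core": ["Onboarding", "Prompt Assistance", "Model Selection", "Speed", "Output Quality"],
        "Post-Generation": ["Iteration Tools", "Editing Suite", "Cross-Modal"],
        "Output & Access": ["Export Options", "Output Management", "Mobile Experience", "Templates & Presets"],
        "Integration & Value": ["API Access", "Customization", "Collaboration", "Pricing Transparency"],
        "Trust & Experience": ["Usage Rights", "Content Safety", "Platform Trust", "UX Polish"],
    },
}


def counts(framework: str) -> tuple:
    """Return (number of L1 dimensions, number of L2 sub-dimensions)."""
    dims = VDX[framework]
    return len(dims), sum(len(subs) for subs in dims.values())


def missing_dimensions(framework: str, scores: dict) -> list:
    """List the L1 dimensions a score sheet has not covered yet, in framework order."""
    return [name for name in VDX[framework] if name not in scores]


for name in VDX:
    l1, l2 = counts(name)
    print(f"{name}: {l1} dimensions, {l2} sub-dimensions")
```

A check like `missing_dimensions` is useful before comparing tools, since two score sheets are only comparable when they cover the same dimensions.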
Who Shaped This
The VDX dimensions come from direct input from creators who use AI tools daily and from creative platforms building these products. They told us what actually drives their tool and model choices, and what existing benchmarks miss. The dimensions above reflect what they care about most.
The framework continues to evolve. If you think a dimension is missing, poorly defined, or weighted wrong, we want to hear from you.
How Vibedex Applies the VDX Framework
For our own benchmarks, we go further. Every model runs the same 200+ test prompts under identical conditions. Each output is scored by an AI judge (Gemini 3 Pro) across the relevant dimensions. We use intent-aware weighting: a product photography prompt weights Visual Fidelity higher, while a fantasy illustration prompt weights Subject & Object Integrity higher.
- 200+ test prompts: photorealism, illustration, typography, product shots, concept art, and edge cases
- 30 models and platforms: image models, video models, and creative platforms, all run on the same prompts under the same conditions
- 3,500+ evaluations: every model-prompt pair scored across multiple quality dimensions
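To make the idea of intent-aware weighting concrete, here is a minimal sketch. It is not Vibedex's implementation: the intent labels, the 2x multipliers, and the function names are invented for illustration, and it assumes the "Subject Integrity" emphasis for fantasy illustration refers to the Subject & Object Integrity dimension.

```python
# Hypothetical weight profiles. Only the direction of each boost comes from the text;
# the multiplier values are assumptions.
BASE_WEIGHTS = {
    "Visual Fidelity": 1.0,
    "Physics & Logic": 1.0,
    "Subject & Object Integrity": 1.0,
    "Instruction Adherence": 1.0,
}

INTENT_BOOSTS = {
    "product_photography": {"Visual Fidelity": 2.0},
    "fantasy_illustration": {"Subject & Object Integrity": 2.0},
}


def weights_for(intent: str) -> dict:
    """Apply the boosts for an intent, then normalize so the weights sum to 1."""
    weights = dict(BASE_WEIGHTS)
    for dimension, multiplier in INTENT_BOOSTS.get(intent, {}).items():
        weights[dimension] *= multiplier
    total = sum(weights.values())
    return {dimension: w / total for dimension, w in weights.items()}


def intent_score(scores: dict, intent: str) -> float:
    """Weighted 1-5 score for one output under the given intent."""
    weights = weights_for(intent)
    return sum(scores[dimension] * weights[dimension] for dimension in weights)


# Made-up L1 scores for a single output.
scores = {
    "Visual Fidelity": 5,
    "Physics & Logic": 3,
    "Subject & Object Integrity": 3,
    "Instruction Adherence": 4,
}

for intent in ("product_photography", "fantasy_illustration"):
    print(intent, round(intent_score(scores, intent), 2))
```

The same output scores 4.0 when judged as product photography and 3.6 as fantasy illustration, because its weaker Subject & Object Integrity counts for more under the second profile. An unrecognized intent falls back to equal weights.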
Limitations
No benchmark is perfect. We believe in being transparent about ours:
- Automated scoring — our evaluations are AI-judged. Human review validates trends but does not produce individual scores. Multi-judge validation is in progress.
- English-focused prompt set — all evaluation prompts are currently in English. Multi-language support is planned.
- Single generation per pair — we generate one output per model-prompt combination. No cherry-picking, but also no variance sampling.
- Models update frequently — providers ship updates regularly. Our scores reflect performance on the evaluation date, and evaluations are re-run periodically.
- Artistic subjectivity — style preference is inherently personal. Our scores measure technical quality, not taste.
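For readers running their own evaluations who want to address the single-generation limitation, variance sampling could look roughly like the sketch below: generate several outputs per model-prompt pair, score each, and report the spread alongside the mean. This is an assumed approach with made-up scores, not something the Vibedex benchmark currently does.

```python
from statistics import mean, stdev


def summarize(samples: list) -> tuple:
    """Mean and sample standard deviation of repeated scores for one model-prompt pair."""
    if len(samples) < 2:
        raise ValueError("need at least two generations to estimate variance")
    return mean(samples), stdev(samples)


# Made-up Visual Fidelity scores for four generations of the same prompt.
average, spread = summarize([4.0, 3.0, 5.0, 4.0])
print(f"mean={average:.2f}, stdev={spread:.2f}")
```

A large spread relative to the gap between two tools suggests the difference between them may not be meaningful on that prompt.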
Use the VDX Framework
The VDX Framework is open to everyone: creators, reviewers, platforms, researchers, consultants. Use all of it or just the dimensions that matter for your evaluation. Score your own tools, publish your own comparisons, build products that optimize for these dimensions.
The only thing we ask: refer to it as the VDX Framework so that others recognize the same standard. When everyone uses the same dimensions, assessments become comparable across the industry, and that helps everyone make better decisions.
Attribution:
Evaluated using the VDX Framework (vibedex.com/methodology)
See VDX in action
Vibedex applies the VDX Framework to recommend the best AI model for your specific prompt — weighted by what your image or video demands.