How Detection Works
April 11, 2026

How AI Image Detection Works

AI image generators leave detectable fingerprints at multiple levels: statistical patterns in pixel distributions, frequency domain anomalies, edge artifacts, and CNN feature signatures. Here is what detectors look for and where the limits are.

Why AI images are detectable at all

AI image generators work by learning statistical distributions of pixel values across millions of training images. When they generate a new image, they sample from these learned distributions. The sampling process is sophisticated but it is not physically grounded: the generator has no understanding of optics, physics, or biology. It reproduces patterns that statistically resemble photographs but lacks the physical constraints that shape real photographs.

These missing physical constraints leave traces. Real photographs have noise patterns from camera sensors, optical aberrations from lenses, compression artifacts from specific codecs, and biological constraints on how faces, hands, and eyes are structured. AI-generated images either omit these patterns or reproduce them imperfectly in ways that are statistically distinguishable.

The main detection signals

Frequency domain analysis (reliability: high)

Real camera images have characteristic patterns in the frequency domain (Fourier space) that reflect their optical and sensor properties. AI generators produce frequency signatures that differ from real cameras, typically showing periodic artifacts from upsampling or missing the broadband noise floor that real sensors introduce. These discrepancies are most pronounced in the mid- and high-frequency bands.
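As a rough sketch of the idea (not any particular detector's implementation), the following compares the high-frequency energy fraction of a noisy "sensor-like" image against an over-smoothed stand-in for a generated one. Real detectors learn far subtler spectral statistics; everything here is a toy:

```python
import numpy as np

def high_freq_energy(img, cutoff=0.25):
    """Fraction of spectral power outside a low-frequency radius."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    mask = radius > cutoff * min(h, w)
    return power[mask].sum() / power.sum()

rng = np.random.default_rng(0)
scene = rng.random((128, 128))
# a real sensor adds a broadband noise floor on top of the scene
sensor_like = scene + 0.05 * rng.standard_normal(scene.shape)
# crude stand-in for an over-smooth generated image: repeated neighbour averaging
generated_like = scene.copy()
for _ in range(3):
    generated_like = (generated_like
                      + np.roll(generated_like, 1, axis=0)
                      + np.roll(generated_like, 1, axis=1)) / 3

print(high_freq_energy(sensor_like) > high_freq_energy(generated_like))  # True
```

The sensor-like image keeps (and gains) energy in the high bands; the smoothed stand-in loses it, which is exactly the asymmetry a spectral detector exploits.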

CNN feature extraction (reliability: high)

Convolutional neural networks trained on both real and AI-generated images learn to recognize feature patterns that distinguish the two. These features are not human-interpretable but correspond to subtle texture regularities that real images have and AI images produce differently. This is one of the most powerful signals for recent generators.
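The training step can be caricatured with a two-feature logistic-regression stand-in fit on synthetic "real" vs "generated" patches. A real detector learns thousands of convolutional features end to end, but the principle of training on both classes is the same; the hand-picked features and patch generators below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def features(img):
    # Two hand-picked stand-ins for what a CNN would learn on its own:
    # high-frequency energy fraction and mean local contrast.
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    hf = power[np.hypot(yy - h / 2, xx - w / 2) > 8].sum() / power.sum()
    contrast = np.abs(np.diff(img, axis=0)).mean()
    return np.array([hf, contrast])

def sample(kind):
    img = rng.random((32, 32))
    if kind == "real":                     # noisy "sensor" patch
        return img + 0.05 * rng.standard_normal(img.shape)
    for _ in range(3):                     # over-smooth "generated" patch
        img = (img + np.roll(img, 1, 0) + np.roll(img, 1, 1)) / 3
    return img

X = np.array([features(sample(k)) for k in ["real", "generated"] * 200])
y = np.array([1.0, 0.0] * 200)

# logistic regression by plain gradient descent
w, b = np.zeros(2), 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.5 * X.T @ (p - y) / len(y)
    b -= 0.5 * (p - y).mean()

accuracy = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The two classes here are linearly separable by design; the point is the workflow (extract features, fit a discriminative model on labeled real/synthetic data), not the specific features.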

Edge and boundary artifacts (reliability: medium-high)

Real photographs have continuous, physics-consistent gradients at object boundaries. AI generators sometimes produce edge artifacts: slightly unnatural transitions, halos at hair-background boundaries, inconsistent depth-of-field application, and uneven smoothness at the border between detailed and plain areas.
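One concrete halo check, shown here as a toy rather than a production method, looks for overshoot next to a hard edge in a one-dimensional cross-section of the image:

```python
import numpy as np

def overshoot_score(row):
    """Ringing/halo heuristic for a 1-D cross-section of an edge: how far
    the signal exceeds the flat levels on either side of the step."""
    left = row[:len(row) // 4].mean()
    right = row[-(len(row) // 4):].mean()
    lo, hi = min(left, right), max(left, right)
    return max(row.max() - hi, lo - row.min(), 0.0)

x = np.linspace(-1, 1, 64)
clean = (x > 0).astype(float)                            # ideal hard edge
halo = clean + 0.15 * np.exp(-((x - 0.08) / 0.05) ** 2)  # bright halo beside the edge

print(overshoot_score(clean), overshoot_score(halo))
```

A clean step scores zero; the haloed edge scores roughly the height of the overshoot. Real pipelines run this kind of check along many sampled boundary normals, not a single synthetic row.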

Texture statistics (reliability: medium)

Natural textures (skin, fabric, foliage, water) follow statistical distributions that have been characterized by human visual system research. AI generators often over-regularize texture, producing skin that is too smooth, fabric that is too uniform, or foliage that repeats at implausible scales.
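A minimal sketch of an over-regularization check: score texture richness as the mean standard deviation over small patches, which collapses when skin or fabric is rendered too uniformly. The patch size and the synthetic inputs are illustrative:

```python
import numpy as np

def texture_richness(img, patch=8):
    """Mean per-patch standard deviation over non-overlapping patches:
    a crude score that drops when texture is over-regularised."""
    h, w = img.shape
    h, w = h - h % patch, w - w % patch
    blocks = img[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return blocks.std(axis=(1, 3)).mean()

rng = np.random.default_rng(2)
natural_like = rng.random((64, 64))                      # stand-in for real grain
too_smooth = 0.5 + 0.01 * rng.standard_normal((64, 64))  # "airbrushed" texture

print(texture_richness(natural_like) > texture_richness(too_smooth))
```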

Semantic consistency checks (reliability: medium)

Some detection approaches look for physical impossibilities: reflections in eyes or glasses that do not match the lighting, hands with the wrong number of fingers or joints at impossible angles, text in images that is garbled or semi-coherent, and backgrounds with perspective inconsistencies.

Metadata analysis (reliability: low-medium)

Real photographs typically contain EXIF metadata including camera model, GPS, timestamp, and lens information. AI-generated images often lack this metadata entirely, or contain metadata inconsistencies (for example, a claimed smartphone image without GPS data).
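A sketch of what such metadata heuristics look like. The tag names follow EXIF conventions, but the checks and the known-generator list are illustrative, and a real pipeline would extract the tags with a library such as Pillow rather than receive a plain dict:

```python
def exif_flags(exif: dict) -> list[str]:
    """Heuristic consistency checks on an already-extracted EXIF dict."""
    flags = []
    if not exif:
        flags.append("no EXIF at all (common for generated images)")
        return flags
    if "Make" not in exif or "Model" not in exif:
        flags.append("missing camera make/model")
    if exif.get("Software", "").lower() in {"midjourney", "dall-e"}:
        flags.append("generator named in Software tag")
    # a claimed smartphone photo normally carries GPS tags
    if "iphone" in exif.get("Model", "").lower() and "GPSInfo" not in exif:
        flags.append("smartphone model claimed but no GPS block")
    return flags

print(exif_flags({}))
print(exif_flags({"Make": "Apple", "Model": "iPhone 15"}))
```

Note the asymmetry: suspicious metadata is weak positive evidence, but clean metadata proves little, since EXIF is trivially forged or stripped.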

Detection accuracy by image type

Detection rates vary significantly by the type of AI generator used and the content of the image:

  • Diffusion model photorealistic portraits (Midjourney v6, DALL-E 3): 75-90%. Strong frequency and CNN signatures; skin texture statistics are distinguishable.
  • Diffusion model scenic/landscape images: 65-80%. Lower accuracy; natural scenes have less constrained expectations than human faces.
  • GAN-generated faces (StyleGAN, older generators): 85-95%. GAN upsampling artifacts are well-characterized; older generators have strong fingerprints.
  • AI-generated illustrations and art: 50-70%. Art is not constrained by photorealistic physics; texture and frequency signals are weaker.
  • Screenshots of AI-generated text: 60-75%. The image detector is less useful here; running text detection on the text content itself is more accurate.
  • AI images that have been compressed, cropped, or edited: 40-65%. Post-processing can degrade or destroy some frequency fingerprints, so accuracy drops.

Where image detection reaches its limits

Image detection is meaningfully harder than text detection in several respects:

Post-processing erases fingerprints

Compression (especially JPEG at low quality settings), color grading, resizing, and social media platform re-encoding can degrade or destroy frequency domain fingerprints. An AI image passed through Instagram or Twitter may score significantly lower than the original. This is a fundamental limitation: any transformation that modifies pixel values modifies the statistical fingerprint.
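The degradation can be illustrated by measuring high-frequency energy before and after a crude re-encode. Here a block-average downsample plus nearest-neighbour upsample stands in for platform re-encoding; real JPEG quantization behaves differently in detail but has the same high-frequency-destroying effect:

```python
import numpy as np

def hf_fraction(img, cutoff=0.25):
    """Share of spectral power above a cutoff radius."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = np.hypot(yy - h / 2, xx - w / 2) > cutoff * min(h, w)
    return power[mask].sum() / power.sum()

rng = np.random.default_rng(3)
original = rng.random((128, 128))  # stand-in image with a broadband "fingerprint"

# crude platform re-encode: average 2x2 blocks, then blow back up
small = original.reshape(64, 2, 64, 2).mean(axis=(1, 3))
reencoded = np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)

print(hf_fraction(original), hf_fraction(reencoded))
```

Whatever statistical fingerprint lived in those high bands is attenuated in the re-encoded copy, which is why scores on social-media rips should be read with extra caution.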

Newer generators improve every few months

The detection arms race is ongoing. Newer model versions specifically address known detection weaknesses. Midjourney v6 is harder to detect than v4. DALL-E 3 produces different artifacts than DALL-E 2. Detection models need to be regularly updated with synthetic images from new generator versions to maintain accuracy.

Non-photorealistic content lacks ground truth

For art, illustration, and stylized images, there is no clear physical-world ground truth for what the image should look like. Detection accuracy on these content types is lower because the signals derived from photorealistic physics constraints do not apply.

Hybrid content (AI-edited real photos)

Inpainting, outpainting, and generative fill applied to real photographs produce images that are partly real and partly AI-generated. Detectors trained on a binary distinction between fully synthetic and fully real images handle these hybrid cases poorly. This is an emerging challenge as editing tools become more widespread.

How to use image detection effectively

  • Use the highest-quality version of the image available. If you have the original before social media re-encoding, use that. Lossless PNG preserves more frequency information than compressed JPEG.
  • Treat image detection as one signal among several. A high score is meaningful evidence; a low score on a compressed or heavily edited image does not prove the image is genuine.
  • If an image contains text you want to verify, extract the text and run it through text detection as well. Airno's text detector is more accurate than the image detector for text content.
  • For high-stakes verification (alleged documentary photos, claimed news images), use multiple independent tools and consult human experts. Single-tool AI detection is not sufficient for public-interest decisions.
  • The semantic consistency checks (impossible hands, incorrect reflections, inconsistent lighting) are worth examining manually even if the automated score is low.
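Put together, the guidance above amounts to an asymmetric decision rule: a high score carries weight, a low score only carries weight when the image has not been re-encoded or edited. The thresholds and signal names below are purely illustrative:

```python
def combined_verdict(scores: dict, heavily_processed: bool) -> str:
    """Toy evidence-combination rule: average the available per-signal
    scores (0 = genuine-looking, 1 = AI-looking), but refuse to call a
    low score 'genuine' when the image was re-encoded or edited."""
    mean = sum(scores.values()) / len(scores)
    if mean >= 0.8:
        return "likely AI-generated"
    if mean <= 0.3 and not heavily_processed:
        return "likely genuine"
    return "inconclusive: corroborate with other tools and manual checks"

print(combined_verdict({"frequency": 0.9, "cnn": 0.85, "texture": 0.7}, False))
print(combined_verdict({"frequency": 0.2, "cnn": 0.25}, True))
```

The second call returns "inconclusive" despite low scores, because heavy processing may simply have erased the fingerprints the detectors look for.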

For more context on AI detection in high-stakes journalistic contexts, see AI Detection for Journalists. For how text detection works (a different set of signals), see How AI Detection Works and What Is Perplexity in AI Detection?

Check an image now

Upload or drag in any image. Frequency analysis, CNN features, artifact detection, and metadata checks. Free, no account needed.

Try Airno free