The Untold Story of How JPEG Became the Internet’s Universal Language
by Scott
Every time you scroll through a news website, browse a shopping platform, or open an attachment in your email, you are almost certainly looking at a JPEG. You probably do not think about it. Why would you? The format is invisible in the way that truly successful technologies become invisible. It is simply there, like electricity or running water, so deeply woven into the fabric of digital life that questioning its existence feels almost absurd. But JPEG did not have to win. It was not inevitable. It was not even the best option available at the time it rose to dominance. The story of how a committee-designed compression standard from the late 1980s became the universal visual language of the internet is a story about timing, politics, compromise, and the strange way that good-enough technology tends to beat great technology when the stakes are high enough.
To understand where JPEG came from, you have to go back to a world before the web existed in any recognizable form. In the mid-1980s, the digitization of images was becoming a serious problem for a range of industries. Medical imaging, satellite photography, desktop publishing, and early multimedia computing all faced the same fundamental challenge: images, when stored as raw pixel data, were enormous. A single uncompressed photograph at even modest resolution could consume more storage than most personal computers of the era could handle. Transmitting such an image over a network connection was a practical impossibility for most organizations. Something had to give.
The Joint Photographic Experts Group was formed in 1986 as a collaboration between two standards bodies, the International Organization for Standardization and the CCITT, the international telecommunication standards committee that later became the ITU-T. The name itself was a compromise, reflecting the awkward institutional parentage of the effort. The group’s mandate was to develop a standard method for compressing continuous-tone still images, meaning photographs and photorealistic graphics as opposed to line drawings or text. The word “joint” in the name was not about any philosophical unity of purpose. It was bureaucratic shorthand indicating that two separate standards organizations had agreed to work together rather than produce competing formats.
The people who gathered in those early meetings were not visionaries dreaming of a future internet. They were engineers, academics, and industry representatives trying to solve a practical storage and transmission problem for their respective fields. The group evaluated dozens of proposed compression techniques between 1986 and 1988, running formal tests on image quality and compression efficiency. The winning approach was based on a mathematical technique called the Discrete Cosine Transform, a method that had been developed in the early 1970s by Nasir Ahmed, an electrical engineering professor at Kansas State University who had been inspired by an earlier technique called the Karhunen-Loève transform. Ahmed had developed the DCT without any particular application in mind. He was an engineer solving a mathematical problem. The idea that his work would eventually become the foundation of the most widely used image format in human history would almost certainly have seemed far-fetched to him at the time.
The way JPEG compression actually works is elegant, and understanding it helps explain both its strengths and its notorious weaknesses. When you compress an image using JPEG, the image is first divided into small blocks, typically eight pixels by eight pixels. Each block is then transformed using the Discrete Cosine Transform, which converts the spatial information in the block into frequency information. Think of it this way: instead of describing a block of pixels by saying what color each individual pixel is, the DCT describes the block by saying how much of certain patterns of variation it contains. High-frequency components represent rapid changes across the block, like sharp edges. Low-frequency components represent gradual changes, like a smooth gradient of sky color.
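For readers who want to see the idea in code, here is a minimal sketch of that block transform in Python with NumPy. It is an illustration rather than the real codec: an actual JPEG encoder also converts colors into a luminance and chrominance representation and usually subsamples the color channels before any of this happens, and the dct_2d function here is just a textbook version of the type-II DCT.

```python
import numpy as np

def dct_2d(block):
    """Apply an orthonormal 2D type-II DCT to an 8x8 block."""
    n = block.shape[0]
    k = np.arange(n)
    # basis[f, x] = cos(pi * (2x + 1) * f / (2n)): row f is the f-th frequency pattern.
    basis = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    dct_matrix = scale[:, None] * basis
    # A 2D DCT is the 1D transform applied along rows and then along columns.
    return dct_matrix @ block @ dct_matrix.T

# An 8x8 block of grayscale pixel values (0-255), shifted to center on zero
# as the standard specifies before transforming.
rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float) - 128.0
coefficients = dct_2d(block)
# coefficients[0, 0] is the DC term, proportional to the block's average
# brightness; entries farther from the top-left corner describe finer,
# higher-frequency variation across the block.
print(np.round(coefficients, 1))
```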
Here is where the compression magic happens. Human vision is significantly less sensitive to high-frequency variations than to low-frequency ones. We are very good at noticing gradual changes in brightness and color, and considerably less good at noticing fine texture detail, especially in areas of high contrast. JPEG exploits this by keeping the low-frequency components with high precision and aggressively rounding off or eliminating the high-frequency components. This process is lossy, meaning information is permanently discarded. When you decompress the image, you do not get back exactly what you put in. What you get back is an approximation that, within a fairly wide range of compression levels, looks close enough to the original that most human observers cannot tell the difference under normal viewing conditions.
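The rounding step can be sketched just as briefly. The table below is the example luminance quantization table from Annex K of the JPEG standard; real encoders scale it up or down to implement the familiar quality setting. The coefficient block is a hypothetical stand-in for the output of the transform sketched above.

```python
import numpy as np

# Example luminance quantization table from Annex K of the JPEG standard.
# Small divisors in the top-left preserve low frequencies precisely; large
# divisors toward the bottom-right crush high frequencies toward zero.
LUMINANCE_QUANT_TABLE = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(coefficients, table):
    """Divide each DCT coefficient by its table entry and round to an integer.
    The rounding is the lossy step: the discarded precision cannot be recovered."""
    return np.round(coefficients / table).astype(int)

def dequantize(quantized, table):
    """Multiply back to get an approximation of the original coefficients."""
    return quantized * table

# A hypothetical coefficient block standing in for the DCT output above.
rng = np.random.default_rng(1)
coefficients = rng.normal(scale=60.0, size=(8, 8))
coefficients[0, 0] = 480.0  # a large DC term, typical of real blocks

quantized = quantize(coefficients, LUMINANCE_QUANT_TABLE)
# Most of the high-frequency entries are now zero, which is exactly what the
# later entropy-coding stage compresses so efficiently.
print(quantized)
```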
The standard was finalized in 1992, and it was a genuine technical achievement. For photographic images, JPEG could achieve compression ratios of ten to one or even higher while maintaining what most observers would consider acceptable image quality. The same image that would have taken a megabyte to store uncompressed could be stored in a hundred kilobytes. For the hardware and network infrastructure of the early 1990s, this was not a marginal improvement. It was a transformation.
But technical achievement alone does not create a universal standard. The story of JPEG’s ascent is also a story about what was happening in the broader technology landscape at exactly the right moment. In 1993, NCSA Mosaic was released, the first web browser to display images inline with text rather than opening them in a separate window. Before Mosaic, the web was primarily a text medium. After Mosaic, images became an expected part of the web experience almost immediately. The browser needed to support some image format for these inline images, and the choices available were limited.
The Graphics Interchange Format, or GIF, had been introduced by CompuServe in 1987 and was already established as a practical format for web use. GIF had real advantages: it supported animation, it handled transparency, and it was lossless, meaning images were stored exactly as they appeared. But GIF used a compression algorithm called LZW, which had been patented by Unisys, and the format itself was limited to 256 colors. For photographic images, 256 colors was woefully inadequate. GIF worked beautifully for simple graphics, logos, and the kind of small decorative images that were common in early web design, but photographs rendered in GIF looked terrible. Anyone who wanted to put a real photograph on a web page needed something else.
JPEG was the something else. It had no practical color limitation for photographs, it could achieve much higher compression ratios than GIF for photographic content, and crucially, it was not encumbered by patent restrictions on its core compression method in a way that would prevent browser makers from implementing it freely. When the major browsers of the mid-1990s added JPEG support alongside GIF, they were essentially choosing the two formats that would define web visual design for the next decade and beyond.
The timing could not have been more significant. The mid-1990s were a period of explosive growth in internet usage. Between 1993 and 1995, the web grew from a few hundred servers to tens of thousands, and then to hundreds of thousands. Every website that launched during this period made the same basic technical choices, partly because there were so few choices available, and partly because they were following the examples set by the sites that already existed. JPEG inherited a kind of network effect that had nothing to do with technical superiority and everything to do with being available at the right moment.

There were competitors. The PNG format, short for Portable Network Graphics, was developed in the mid-1990s specifically as a response to the GIF patent situation, but also as a genuinely superior format for many use cases. PNG is lossless, supports full color, handles transparency far more elegantly than GIF, and in many cases achieves better compression than GIF for non-photographic images. PNG eventually became widely used and is now essentially universal, but it never displaced JPEG for photographs because it does not offer the same level of compression for photographic content. A lossless PNG of a high-resolution photograph can be several times larger than a JPEG of equivalent quality. On slow connections, that difference matters enormously.
JPEG 2000 was introduced around the year 2000 as a next-generation replacement for the original JPEG standard. It used a different mathematical approach called wavelet compression rather than the Discrete Cosine Transform, and it was genuinely superior in almost every measurable way. JPEG 2000 offered better image quality at equivalent file sizes, supported both lossy and lossless compression in the same format, handled transparency, and did not produce the characteristic blocky artifacts that JPEG produces at high compression levels. The format was adopted for some specialized applications, notably in digital cinema and medical imaging, where its quality advantages justified the learning curve. But it never made a dent in the web. JPEG was already everywhere. Every browser supported it, every camera produced it, every image editing application understood it. The infrastructure cost of switching was simply too high to justify even a substantially better technology.
This pattern repeated itself multiple times in the years that followed. Microsoft developed a format called HD Photo, later renamed JPEG XR, around 2006 and 2007. It offered better compression than standard JPEG with fewer artifacts. It was implemented in Internet Explorer and supported in some other software, but it gained almost no traction on the web at large. Google developed WebP around 2010, a format based on the VP8 video codec that offered significantly better compression than JPEG for many types of images. WebP took years to achieve meaningful browser support and has never fully displaced JPEG despite genuine technical advantages. More recently, formats like AVIF, based on the AV1 video codec, and JPEG XL, a genuinely next-generation format with remarkable compression efficiency, have entered the picture. Both offer real improvements over the original JPEG. Both face the same structural inertia that has protected JPEG for three decades.
The reason JPEG persists is not technical. Most working engineers would readily admit that better formats exist for most use cases. The reason JPEG persists is that it is already everywhere in a way that creates a cost for switching that almost no improvement in technology can justify. Every camera manufactured in the past twenty years defaults to JPEG. Every image editing application on earth opens JPEG files. Every browser on every platform renders JPEG without any additional configuration. There are estimated to be trillions of JPEG files stored across the world’s devices and servers. The format has achieved what economists sometimes call a lock-in, a state where the switching costs are so high that even a substantially superior alternative cannot gain traction through technical merit alone.
This is a pattern that appears repeatedly in the history of technology. The QWERTY keyboard layout was designed in the 1870s partly to prevent mechanical typewriter keys from jamming, a problem that has been irrelevant for most of a century. Alternative keyboard layouts, some of which have been demonstrated to allow faster typing, have never displaced it. American household electricity is delivered as 60-hertz alternating current at roughly 120 volts, choices made in the late nineteenth century that differ from the standards adopted by most of the rest of the world. The advantages of harmonizing these systems have never been compelling enough to justify the disruption of actually doing so.
JPEG belongs in this category of technologies that are not the best available option but have become too deeply embedded to displace. What makes JPEG unusual is the specific circumstances of how it became embedded. Unlike the QWERTY keyboard or the American electrical standard, JPEG did not win primarily because it was first. GIF was first, as far as the web was concerned. JPEG won because it was first at solving a specific problem, which was displaying full-color photographic images efficiently over slow network connections, at the exact moment when that problem became the defining challenge of a new and rapidly expanding medium.
There is something worth noticing in the way the format encodes images that connects to its cultural legacy. JPEG is fundamentally optimized for human perception. It does not try to store images accurately. It tries to store them in a way that looks accurate to a human observer under normal conditions. The entire architecture of the format is built around what human eyes notice and what they do not. The high-frequency visual information that gets discarded in JPEG compression is information that, in most viewing contexts, most people would never perceive as missing. The format is not a precise record of light. It is an efficient representation of a human visual experience.
This has led to some interesting cultural phenomena. The JPEG artifact, the characteristic blocky distortion that appears when an image is compressed too aggressively, has become a kind of visual shorthand. In meme culture and internet art, the heavily compressed JPEG artifact has become an aesthetic in its own right, associated with the early internet, with low-resolution screenshots passed around social media, with the particular visual texture of images that have been saved and resaved many times, each save introducing another round of lossy compression and degrading the image slightly further. There is even a technical term for the quality loss that accumulates through repeated compression: generation loss. An image that has been through many JPEG compression cycles is many generations removed from its original, a way of speaking borrowed from the era of analog magnetic tape dubbing, when each copy of a copy was a new and slightly worse generation.
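The effect is easy to reproduce at home. The following sketch assumes the Pillow imaging library and uses a placeholder filename; it re-saves an image through twenty generations, shrinking it slightly each time the way images shrink as they are screenshotted and re-shared, so that each generation is re-quantized on a new pixel grid.

```python
from PIL import Image

image = Image.open("photo.jpg").convert("RGB")  # "photo.jpg" is a placeholder for any JPEG on disk
for generation in range(1, 21):
    # Shrink slightly each round so the 8x8 block grid no longer lines up
    # with the previous generation's blocks.
    image = image.resize((image.width - 2, image.height - 2))
    image.save(f"generation_{generation:02d}.jpg", format="JPEG", quality=70)
    image = Image.open(f"generation_{generation:02d}.jpg").convert("RGB")
# Comparing generation_01.jpg with generation_20.jpg shows the smeared,
# blocky texture the paragraph above describes.
```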
The format has also shaped photography in ways that most photographers have never consciously considered. For roughly fifteen years, from the mid-1990s through the late 2000s, JPEG was effectively the only output format available to consumer digital camera users. Professional cameras eventually offered RAW formats that captured unprocessed sensor data, but even most professionals delivered final images as JPEGs. The entire visual culture of digital photography developed within the constraints of what JPEG could and could not do well. The format is excellent at smooth tonal gradations and moderate detail. It is poor at very fine texture, sharp geometric patterns, and images with large areas of solid color next to sharp edges. Subtly and over time, these characteristics influenced what photographers chose to shoot, what subjects were considered photogenic in the digital era, and what kinds of images looked good when shared online.
The committee that created JPEG in the late 1980s was solving a practical engineering problem. They were not thinking about internet culture or the visual language of the twenty-first century. They were thinking about storage costs and transmission times for technical and commercial applications. The people who were thinking about the web in 1993 and 1994 were not thinking about what image format would define visual communication for the next thirty years. They were thinking about how to make web pages load faster on dial-up connections. The decisions that matter most often look, in the moment they are made, like minor technical choices. The fact that Mosaic supported JPEG seemed like a practical detail. The fact that cameras defaulted to JPEG seemed like a sensible decision about file sizes. The accumulated weight of millions of such practical decisions created a structure so robust that it has resisted concerted technical efforts to displace it for decades.
This is, in the end, what makes the JPEG story genuinely instructive. It is not a story about technical excellence, though the format is genuinely clever. It is not a story about visionary design, though the mathematical foundations are elegant. It is a story about how technology becomes infrastructure, and about the strange durability of standards that achieve critical mass at the right moment. The internet speaks JPEG not because JPEG is the best language available for describing images, but because it was the right language at the right time, and because once enough of the world started speaking it, the conversation became too large and too established to restart in any other tongue.
The next time you look at a photograph on a website, consider for a moment the chain of mathematical transformations that produced what you are seeing. Somewhere in that image, the ghost of a DCT algorithm developed by an engineering professor in Kansas in the early 1970s is shaping what you perceive. Somewhere in the slight softness of fine details, or the faint blockiness in a compressed background, is the signature of a committee that met in hotel conference rooms in the late 1980s and tried to figure out how to make images smaller. They had no idea what they were building. Almost nobody who builds the foundations of important things ever does.