Artificial intelligence is the newest and hungriest market for high-performance computing, and system architects are working around the clock to squeeze every drop of performance out of every watt. Swedish startup ZeroPoint, armed with €5 million in new funding, wants to help them with a novel nanosecond-scale memory compression technique, and yes, it’s exactly as complicated as it sounds.
The concept is this: losslessly compress data just before it enters RAM and decompress it afterwards, effectively expanding the memory channel by 50% or more by simply adding one small element to the chip.
Compression is, after all, a fundamental technology in computing; as ZeroPoint CEO Klas Moreau (left in the photo above, with co-founders Per Stenström and Angelos Arelakis) noted: “Today we wouldn’t store data on a hard drive without compressing it. Research suggests that 70% of the data in memory is unnecessary. So why don’t we compress in memory?”
The answer is that we do not have the time. Compressing a large file for storage (or encoding it, as we say when it’s video or audio) is a task that can take seconds, minutes or hours, depending on your needs. But data passes through memory in a tiny fraction of a second, moved in and out as fast as the processor can manage. A single microsecond’s delay to remove the “unnecessary” bits from a parcel of data entering the memory system would be catastrophic for performance.
Memory does not necessarily advance at the same rate as the processor, although the two (along with many other chip components) are inextricably linked. If the processor is too slow, data backs up in memory, and if the memory is too slow, the processor wastes cycles waiting for the next pile of bits. Everything works in concert, as you’d expect.
While super-fast memory compression has been demonstrated, it creates a second problem: you essentially have to decompress the data just as fast as you compressed it, restoring it to its original state, otherwise the system will have no idea how to handle it. So unless you convert the entire architecture to this new compressed-memory mode, it’s pointless.
ZeroPoint claims to have solved both of these problems with hyperfast, low-level memory compression that requires no real changes to the rest of the computer system. You add their technology to your chip and it’s as if you had doubled your memory.
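To make the idea of “transparent” compression concrete, here is a minimal software sketch. It is not ZeroPoint’s design: the class name `TransparentCompressedMemory` is hypothetical, and Python’s general-purpose `zlib` stands in for the hardware compressor, which makes it many orders of magnitude slower than anything described here. The only point is that callers write and read ordinary bytes and never see the compressed form.

```python
import zlib


class TransparentCompressedMemory:
    """Toy memory model: data is compressed on write, decompressed on read.

    Assumption: zlib is a stand-in for a hardware compressor. The caller
    only ever handles uncompressed bytes, which is the "transparency"
    property the article describes.
    """

    def __init__(self):
        self._store = {}  # address -> compressed bytes

    def write(self, address: int, data: bytes) -> None:
        # Compress just before the data "enters RAM".
        self._store[address] = zlib.compress(data)

    def read(self, address: int) -> bytes:
        # Decompress on the way out, restoring the original bytes exactly.
        return zlib.decompress(self._store[address])

    def stored_bytes(self) -> int:
        # Bytes physically held, as opposed to bytes logically written.
        return sum(len(blob) for blob in self._store.values())


if __name__ == "__main__":
    mem = TransparentCompressedMemory()
    page = bytes(4096)  # a highly redundant 4 KiB page of zeros
    mem.write(0x1000, page)
    assert mem.read(0x1000) == page
    print(f"logical: {len(page)} bytes, stored: {mem.stored_bytes()} bytes")
```

The rest of the program is unchanged whether or not the store compresses, which is the property that lets such a scheme slot into an existing system.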
While the minutiae will likely only be understood by those in the field, the basics are easy enough for the uninitiated to grasp, as Moreau proved when he explained it to me.
“We take a very small amount of data – a cache line, sometimes 512 bits – and identify the patterns within it,” he said. “It’s in the nature of data that it is filled with not-very-efficient information, information that is sparsely located. It depends on the data: the more random, the less compressible. But when we look at most data loads, we see that we are in the range of two to four times [more data throughput than before].”
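As a rough illustration of what finding patterns in a single cache line can mean, the sketch below uses a deliberately simple scheme that drops all-zero 32-bit words and records their positions in a 16-bit mask. This is an assumed textbook-style technique chosen for clarity, not ZeroPoint’s algorithm, and the function names are hypothetical.

```python
import struct

LINE_BYTES = 64                     # one 512-bit cache line
WORD_BYTES = 4                      # treat the line as 32-bit words
WORDS = LINE_BYTES // WORD_BYTES    # 16 words per line


def compress_line(line: bytes) -> bytes:
    """Keep only non-zero words, preceded by a 16-bit presence mask."""
    if len(line) != LINE_BYTES:
        raise ValueError("expected a 64-byte cache line")
    mask = 0
    kept = bytearray()
    for i in range(WORDS):
        word = line[i * WORD_BYTES:(i + 1) * WORD_BYTES]
        if any(word):               # word contains at least one non-zero byte
            mask |= 1 << i
            kept += word
    return struct.pack("<H", mask) + bytes(kept)


def decompress_line(blob: bytes) -> bytes:
    """Rebuild the original 64 bytes from the mask and the kept words."""
    (mask,) = struct.unpack_from("<H", blob, 0)
    out = bytearray()
    pos = 2                         # skip the 2-byte mask
    for i in range(WORDS):
        if mask & (1 << i):
            out += blob[pos:pos + WORD_BYTES]
            pos += WORD_BYTES
        else:
            out += bytes(WORD_BYTES)
    return bytes(out)


if __name__ == "__main__":
    # Sparse line: 4 non-zero words out of 16.
    sparse = struct.pack("<16I", 7, 0, 0, 0, 9, 0, 0, 0, 1, 0, 0, 0, 3, 0, 0, 0)
    # Dense line: every word is non-zero, so nothing can be dropped.
    dense = bytes(range(1, 65))
    for name, line in (("sparse", sparse), ("dense", dense)):
        packed = compress_line(line)
        assert decompress_line(packed) == line
        print(f"{name}: {LINE_BYTES} -> {len(packed)} bytes")
```

The sparse line shrinks to 18 bytes while the dense one grows to 66, which mirrors Moreau’s point that the more random the data, the less compressible it is.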
It’s no secret that memory can be compressed. Moreau said that everyone in large-scale computing has known about the possibility (he showed me a 2012 paper demonstrating it) but has more or less written it off as academic, impossible to implement at scale. But ZeroPoint, he said, has solved the problems of compaction (reorganizing the compressed data to be more efficient still) and transparency, so the technology not only works, but works quite seamlessly in existing systems. And it all happens in a handful of nanoseconds.
“Most compression technologies, both software and hardware, are on the order of thousands of nanoseconds. CXL [Compute Express Link, a high-speed interconnect standard] can take that down to hundreds,” Moreau said. “We can take it down to three or four.”
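A quick back-of-the-envelope calculation shows why those orders of magnitude matter. The sketch below assumes a baseline memory access of 100 nanoseconds, a round illustrative figure that does not come from the article, and takes 1,000, 100 and 4 nanoseconds as representative values for “thousands,” “hundreds” and “three or four.”

```python
# Assumption: a round, illustrative baseline memory access time.
# This number is NOT from the article.
BASE_ACCESS_NS = 100.0

# Representative values for the ranges Moreau quotes.
ADDED_LATENCY_NS = {
    "typical compression (thousands of ns)": 1000.0,
    "CXL-based (hundreds of ns)": 100.0,
    "ZeroPoint's claim (three or four ns)": 4.0,
}


def slowdown(added_ns: float, base_ns: float = BASE_ACCESS_NS) -> float:
    """Factor by which one access gets slower when added_ns is tacked on."""
    return (base_ns + added_ns) / base_ns


if __name__ == "__main__":
    for name, added in ADDED_LATENCY_NS.items():
        print(f"{name}: {slowdown(added):.2f}x per access")
```

Under that assumed baseline, a thousand extra nanoseconds makes every access roughly eleven times slower and a hundred doubles it, while four adds only a few percent.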
Here’s CTO Angelos Arelakis explaining it his way:
ZeroPoint’s debut is certainly timely, as companies around the world seek faster and cheaper compute with which to train the next generation of artificial intelligence models. Most hyperscalers (if one must call them that) are keen on any technology that can give them more power per watt or let them trim their energy bills a little.
The primary caveat is simply that, as already mentioned, the technology has to be built into the chip and integrated from the ground up: you can’t just plug a ZeroPoint dongle into the rack. To that end, the company is working with chipmakers and system integrators to license the technique and hardware design for use in standard high-performance computing chips.
These include Nvidia and Intel, of course, but increasingly also companies like Meta, Google and Apple, which have designed custom hardware to run their AI and other costly tasks internally. ZeroPoint is positioning its technology as a cost saving, not a premium: by effectively doubling memory, the technology could conceivably pay for itself before long.
The just-closed €5 million Series A round was led by Matterwave Ventures, with Industrifonden acting as the local Nordic lead, and existing investors Climentum Capital and Chalmers Ventures also participating.
Moreau said the money should allow the company to expand into U.S. markets as well as double down on the Swedish ones it is already pursuing.