78 points | 37 comments | Author: spidy__

Top Comments

whacked_new Jun 26
Somewhat related is stavros's method to compress 500KB to something like 50 bytes https://www.stavros.io/posts/compressing-images-with-stable-...

main drawback is that it's not lossless ;-)

But this is great. I hope this actually becomes a format that wraps the weights and transformer module (maybe it could be NAS-optimized too?). Maybe it would even work for video?

It's like calling gzip, but instead of a compression level you choose a Kolmogorov complexity level.

userbinator Jun 26
Fabrice Bellard may have been the first to do this, 7 years ago: https://news.ycombinator.com/item?id=27244004

SubiculumCode Jun 26
What do those compress to with conventional approaches? For comparison.

I am curious. A classic machine learning ensemble approach is to overfit a collection of small models, then bag them (e.g. by voting), which allows the ensemble to generalize.

I'm sure someone has tried overfitting a bunch of transformers for compression like this, then bagging them to see how well it does?

jmspring Jun 26
The model is the important part. A Huffman code, adaptive Huffman, or other sorts of encoders would do much better on a dataset when driven by the model. You need the model to decode as well. And on a dataset of sufficient size, the cost of embedding the model can be offset by the benefit of its memorization of the file.

A non-general compression algorithm (model - I don't mean a distinct LLM, but "modeling data") targeted at a specific dataset will always do better than a general algorithm.

The reason I said the "encoder" doesn't matter: arithmetic coding, for the data it is presented, will beat Huffman/adaptive Huffman every day, but the model is where the real "compression" comes into play.

I've implemented enough "coders" over the years, including arithmetic for both commercial and research purposes (was a student of Glen Langdon).
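To make the coder-versus-model distinction concrete, here is a small sketch under stated assumptions: it compares the total bits of a static Huffman code against the ideal code length that an arithmetic coder approaches, first under the same order-0 byte model and then under an order-1 model that conditions on the previous byte. The function names are hypothetical, the models are two-pass static counts, and the cost of transmitting the model itself is ignored.

```python
import heapq
import math
from collections import Counter


def huffman_bits(data: bytes) -> int:
    """Total bits to code `data` with a static Huffman code built from its own byte counts."""
    counts = Counter(data)
    if len(counts) <= 1:
        return len(data)  # a lone symbol still needs 1 bit per occurrence
    # Heap items: (weight, unique tiebreak, {symbol: depth in the code tree}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    depths = heap[0][2]
    return sum(counts[s] * depths[s] for s in counts)


def ideal_bits_order0(data: bytes) -> float:
    """Bits an ideal arithmetic coder approaches under the same order-0 byte model."""
    counts = Counter(data)
    n = len(data)
    return -sum(c * math.log2(c / n) for c in counts.values())


def ideal_bits_order1(data: bytes) -> float:
    """Same coder, better model: condition on the previous byte (first byte costs 8 bits)."""
    pair_counts = Counter(zip(data, data[1:]))
    context_counts = Counter(data[:-1])
    bits = 8.0 if data else 0.0
    for (prev, _cur), c in pair_counts.items():
        bits -= c * math.log2(c / context_counts[prev])
    return bits


skewed = b"a" * 900 + b"b" * 100
print(huffman_bits(skewed), ideal_bits_order0(skewed))

alternating = b"ab" * 500
print(huffman_bits(alternating), ideal_bits_order0(alternating), ideal_bits_order1(alternating))
```

On the skewed input, Huffman is stuck at 1 bit per symbol (1000 bits) while the ideal length under the same model is under 500 bits, which is the coder's contribution. On the alternating input, both order-0 figures are 1000 bits, but the order-1 model brings the ideal length down to 8 bits, which is the model's contribution.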

wildstrawberry Jun 26
Three questions:

1. How much was AI used to generate documentation for this project?

2. The 100MB CSV data sources are not provided in the repo so it doesn't seem possible to reproduce your results. The enwik9 dataset says it is a "slice" of the larger data set, and there are many NYC taxi trip record datasets that exist. Can you provide the datasets used to generate your results?

3. I am surprised to see performance comparisons only between your transformer and WinZIP. What were your results when comparing your transformer to more modern approaches like LZMA2 (level 9), BZIP2 and ZPAQ (max effort)?
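For the third question, a quick way to get conventional baselines is the Python standard library, sketched below. Assumptions: the helper name `baseline_sizes` and the sample bytes are made up for illustration and are not the project's data; `lzma` in its default `.xz` container uses LZMA2; ZPAQ is not in the standard library, so it is not covered here.

```python
import bz2
import lzma
import zlib


def baseline_sizes(data: bytes) -> dict[str, int]:
    """Compressed size in bytes for a few stdlib compressors at their highest settings."""
    return {
        "raw": len(data),
        "zlib-9": len(zlib.compress(data, 9)),
        "bz2-9": len(bz2.compress(data, compresslevel=9)),
        # preset 9 with the "extreme" flag; the default .xz format uses LZMA2
        "lzma-9e": len(lzma.compress(data, preset=9 | lzma.PRESET_EXTREME)),
    }


# Made-up CSV-like sample; replace with the bytes of the real file under test.
sample = b"vendor_id,pickup,dropoff,fare\n" + b"1,2024-01-01,2024-01-01,12.50\n" * 2000
for name, size in baseline_sizes(sample).items():
    print(f"{name:8s} {size:8d} bytes")
```

Running the same helper on the exact input files would give numbers that can be set directly against the transformer's output size, provided the size of the model weights is counted too.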

purple-leafy Jun 26
Dumb question: can you train a model to predict the next byte of ANOTHER MODEL?

So apply this same logic to compressing a bigger model within a smaller model.

I know this is absolutely regarded, but humour me please

jxmorris12 Jun 26
Lo and behold, a nice arithmetic coding implementation that wasn't written by an LLM! A sight for sore eyes – a treat, even. Looks like it was written by someone else though.

Check it out: https://github.com/samyak112/pym-particles/blob/main/arithme...

7373737373 Jun 23
What does it compress the full 1GB file to? http://prize.hutter1.net/

Source
news.ycombinator.com
Posted
June 23, 2026 at 01:11 PM

