• @skip0110@lemm.ee
    English
    224 days ago

    This is not new knowledge and predates the current LLM fad.

    See the Hutter Prize, where “machine learning” based compressors have led the rankings for some time: http://prize.hutter1.net/

    It’s important to note that when a model is used in a compressor, it still produces a code (aka encoding) that exactly reproduces the input. But the same model is unlikely to compress a different kind of input nearly as well.

      • @skip0110@lemm.ee
        English
        24 days ago

        I could have said it better.

        By “compressor” I mean one half of a compression/decompression algorithm. A better way to word it: when you apply machine learning to a compression problem, you can keep it lossless. Your decompressed output will be identical to the input, every time.

        “NNCP” is a good search term if you want to learn more about how specifically this works.
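
To make the lossless point concrete, here is a rough sketch (not NNCP itself, and not anyone's actual implementation). It assumes a tiny adaptive byte-frequency model standing in for the neural predictor, and uses exact-fraction arithmetic coding, which is only practical for short inputs. The names `compress`, `decompress`, and `_new_model` are made up for illustration. The key idea is that the encoder and decoder run the same model in lockstep, so the decoder can invert every step exactly.

```python
from fractions import Fraction
import math


def _new_model():
    # Hypothetical stand-in for a learned predictor: one count per byte value.
    return [1] * 256


def compress(data: bytes):
    """Arithmetic-code `data`; returns (code, bit_length, original_length)."""
    counts = _new_model()
    low, width = Fraction(0), Fraction(1)
    for b in data:
        total = sum(counts)
        # Narrow the interval to the slice the model assigns to this byte.
        low += width * Fraction(sum(counts[:b]), total)
        width *= Fraction(counts[b], total)
        counts[b] += 1  # the model adapts after seeing the byte
    # Pick the shortest binary fraction guaranteed to fall inside the interval.
    k = 1
    while Fraction(1, 2 ** k) > width / 2:
        k += 1
    code = math.ceil(low * 2 ** k)
    return code, k, len(data)


def decompress(code: int, k: int, n: int) -> bytes:
    """Invert `compress` by replaying the same model updates."""
    counts = _new_model()
    x = Fraction(code, 2 ** k)
    low, width = Fraction(0), Fraction(1)
    out = bytearray()
    for _ in range(n):
        total = sum(counts)
        target = (x - low) / width * total  # position within the current interval
        cum = 0
        for s in range(256):
            if cum + counts[s] > target:
                break
            cum += counts[s]
        low += width * Fraction(cum, total)
        width *= Fraction(counts[s], total)
        counts[s] += 1  # same update as the encoder, so both stay in sync
        out.append(s)
    return bytes(out)


if __name__ == "__main__":
    message = b"ab" * 100
    code, k, n = compress(message)
    assert decompress(code, k, n) == message
    print(f"{8 * n} raw bits -> {k} coded bits")
```

The better the model predicts the next byte, the wider each interval slice and the fewer bits the final code needs. A model that predicts one kind of input well will assign low probability to the bytes of a very different input, which is why the same model rarely compresses both impressively, even though the round trip stays exact either way.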