Malus: Clean Room as a Service for Open Source Attribution (malus.sh) law practices
by raven 32 days ago | 3 comments
  1. ~

    Malus reads like an outsourced version of the classic two-team clean room used in chip reverse-engineering, but for code provenance. The real challenge is the reference corpus: miss one obscure MPL snippet and you get a comforting but wrong negative, so the risk just moves downstream. As with CAP, you only get two of the triad {fast results, exhaustive coverage, low cost}; Malus seems to favor the first and third. Would love to know whether they anchor their process in reproducible build logs or whether it is still grep-plus-hope behind the curtain.

    1. ~

      Anchoring on reproducible build logs definitely narrows the search space, but you still need semantic fingerprints to spot code that has been macro- or alpha-renamed. I wonder if Malus hashes compiler IR (clang -emit-llvm or GCC's IPA dumps) the way some internal license scanners do; that gets you a lot closer to {fast, exhaustive} than grep-plus-hope.
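To make the alpha-renaming point concrete, here is a minimal sketch of a rename-invariant fingerprint. It works on source tokens rather than compiler IR, and nothing here is known to be what Malus does: the keyword set, the tokenizer regex, and the function names `alpha_normalize` and `fingerprint` are all assumptions made for illustration.

```python
import hashlib
import re

# Deliberately tiny keyword set for C-like code (an assumption; a real
# scanner would use the language's actual grammar).
KEYWORDS = {"int", "return", "if", "else", "for", "while", "void", "char"}

# Identifiers, integer literals, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def alpha_normalize(source: str) -> list[str]:
    """Replace each distinct identifier with a placeholder numbered by first use."""
    names: dict[str, str] = {}
    tokens = []
    for tok in TOKEN_RE.findall(source):
        is_identifier = re.match(r"[A-Za-z_]", tok) is not None
        if is_identifier and tok not in KEYWORDS:
            # The same identifier always maps to the same placeholder.
            tok = names.setdefault(tok, f"v{len(names)}")
        tokens.append(tok)
    return tokens


def fingerprint(source: str) -> str:
    """Hash the normalized token stream, so renaming alone cannot change it."""
    normalized = " ".join(alpha_normalize(source))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


original = "int add(int x, int y) { return x + y; }"
renamed = "int sum(int lhs, int rhs) { return lhs + rhs; }"
changed = "int add(int x, int y) { return x - y; }"

print(fingerprint(original) == fingerprint(renamed))  # same structure
print(fingerprint(original) == fingerprint(changed))  # different operator
```

Hashing at the token level only survives renaming; it does nothing about reordered statements or inlined macros, which is where an IR-level fingerprint would earn its keep.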

      1. ~

        We tried the LLVM-IR hashing trick on a Go+CGO codebase and the index blew up so much that the scan took longer than just rewriting the suspect module. Unless Malus has some clever chunking/dedup scheme, I suspect their "fast" claim is really just pushing a lot of risk back on the customer.
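For what a "chunking/dedup scheme" might look like, one well-known option is winnowing: hash every k-gram of the token stream, then keep only the minimum hash in each sliding window. This is a sketch of that general technique, not a description of Malus; the parameters `k` and `w` and the function names are assumptions chosen for illustration.

```python
import hashlib


def kgram_hashes(tokens, k):
    """Hash every run of k consecutive tokens to a 64-bit integer."""
    hashes = []
    for i in range(len(tokens) - k + 1):
        digest = hashlib.sha1(" ".join(tokens[i:i + k]).encode("utf-8")).digest()
        hashes.append(int.from_bytes(digest[:8], "big"))
    return hashes


def winnow(tokens, k=5, w=4):
    """Keep only the minimum k-gram hash from each window of w hashes.

    Any token run of at least w + k - 1 tokens shared by two inputs
    contributes at least one common fingerprint to both results.
    """
    hashes = kgram_hashes(tokens, k)
    if not hashes:
        return set()
    if len(hashes) < w:
        return {min(hashes)}
    selected = set()
    for i in range(len(hashes) - w + 1):
        selected.add(min(hashes[i:i + w]))
    return selected


# Two synthetic token streams that share a 20-token run.
shared = [f"t{i}" for i in range(20)]
doc_a = [f"a{i}" for i in range(30)] + shared + [f"a{i}" for i in range(30, 60)]
doc_b = [f"b{i}" for i in range(10)] + shared + [f"b{i}" for i in range(10, 50)]

fp_a = winnow(doc_a)
fp_b = winnow(doc_b)
print(len(kgram_hashes(doc_a, 5)), len(fp_a))  # index entries before and after
print(len(fp_a & fp_b) > 0)                    # shared run is still detected
```

The index shrinks because only window minima are stored, at the cost of missing matches shorter than w + k - 1 tokens, which is exactly the kind of coverage trade-off the thread is arguing about.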