The outfit has been super careful not to step on any legal toes. The Granite Code Base models are trained on 3 to 4 terabytes of code data plus natural-language datasets about code. They are all good to go under the Apache 2.0 license, which means folks can use them for research and even make money. That is the bit that's got the makers of other big-shot LLMs holding their cards close to their chest—they don't want to share the wealth.
IBM top boffin Ruchir Puri said the company was shaking up the generative AI scene for software by chucking out top-notch, wallet-friendly code LLMs, giving the open community a free pass to get creative without any faff.
The company said it did not want to do consumer stuff like writing poems to dogs; it was more interested in turning out AI models for business – especially coding.
The decoder-only models, which know their way around 116 programming languages, range in size from a trim 3 billion to a hefty 34 billion parameters. They're a dab hand at dev tasks, from sprucing up complex apps to working within tight memory limits.
IBM's been using these LLMs in-house with its Watsonx Code Assistant gear, helping with everything from speedy IT automation to bringing old-school COBOL code into the 21st century. Watsonx might be a bit pricey, but now any Tom, Dick, or Harriet can have a go with the Granite LLMs via IBM and Red Hat's InstructLab.