The world’s most sensitive computer code is vulnerable to attack. A new encryption method can help

Nowadays data breaches aren’t rare shocks – they’re a weekly drumbeat. From leaked customer records to stolen source code, our digital lives keep spilling into the open.

Git services are especially vulnerable to cybersecurity threats. These are online hosting platforms that are widely used in the IT industry to collaboratively develop software, and are home to most of the world’s computer code.

Just last week, hackers reportedly stole about 570 gigabytes of data from a git service called GitLab. The stolen data was associated with major companies such as IBM and Siemens, as well as United States government organisations.

In December 2022, hackers stole source code belonging to the IT company Okta that was stored in repositories on GitHub.

Cyberattackers can also quietly insert malicious code into existing projects without a developer’s knowledge. These so-called “software supply-chain” attacks have turned development tools and update channels on git services into high-value targets.

As we explain in a new conference paper, our team has developed a new way to make git services more secure, with very little impact on performance.

The gold standard

We already know how to keep conversations private: secure messenger services such as Signal and WhatsApp use end-to-end encryption, which locks messages on your device and only unlocks them on the recipient’s device. This protects the data even if the service platform is hacked, which is why it’s considered the gold standard for protecting data.

But git services, which are widely used by major tech companies and startups, currently don’t use end-to-end encryption. The same is true for most of the other tools we use to work together, such as shared documents.

Because git services allow a huge number of collaborators to work on the same project at the same time, the code they host is constantly being written and updated. This makes standard encryption impractical: changing even a single word would mean encrypting and transmitting all of the data again, which would take up too much bandwidth and make the services very inefficient.
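
To make the bandwidth problem concrete, here is a minimal Python sketch of whole-file encryption. It is an illustration under stated assumptions, not the scheme any git service or our paper uses: the cipher is a toy keystream built from SHA-256, and the names keystream, encrypt_file, decrypt_file and differing_bytes are hypothetical. It changes one character in a source file, encrypts both versions under fresh random nonces, and counts how many bytes differ before and after encryption.

```python
import hashlib
import os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. For illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def encrypt_file(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt a whole file under a fresh random nonce (nonce is prepended)."""
    nonce = os.urandom(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt_file(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce, regenerate the keystream and undo the XOR."""
    nonce, body = blob[:16], blob[16:]
    stream = keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))

def differing_bytes(a: bytes, b: bytes) -> int:
    """Count positions where two equal-length byte strings differ."""
    return sum(x != y for x, y in zip(a, b))

key = os.urandom(32)
old_source = b"total = price + tax  # compute the bill\n" * 50
new_source = old_source.replace(b"+", b"-", 1)  # a one-character edit

old_blob = encrypt_file(key, old_source)
new_blob = encrypt_file(key, new_source)

plain_changed = differing_bytes(old_source, new_source)
cipher_changed = differing_bytes(old_blob, new_blob)
print(f"plaintext bytes changed:  {plain_changed} of {len(old_source)}")
print(f"ciphertext bytes changed: {cipher_changed} of {len(old_blob)}")
```

Only one byte of the source changes, but almost every byte of the encrypted file does, so the server sees what looks like a completely new file and the whole thing has to be uploaded and stored again.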

But our new encryption method overcomes this challenge.

Striking an important balance

The method we have developed uses what’s known as “character-level encryption”. This means only the edits made to code stored on the git service are treated as new data to be encrypted – rather than all of the code.

Think of it as encrypting the tracked changes in a Word document, instead of encrypting a whole new version every time.

This method strikes an important balance. It keeps the updated code private and secure while reducing the amount of communication between user and git services, as well as the amount of storage required.
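
The general idea of encrypting only the edits can be sketched in a few lines of Python. This is a simplified illustration and not the construction from our paper: the cipher is a toy keystream built from SHA-256 (a real system would use a vetted authenticated cipher), the edit format is an assumption made for this example, and the names xor_cipher, make_edit_script, apply_edit_script, encrypt_edit and decrypt_edit are hypothetical.

```python
import difflib
import hashlib
import json
import os

def xor_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode); encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

def make_edit_script(old: str, new: str) -> list:
    """List the changed spans as [start, end, replacement] against `old`."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [[i1, i2, new[j1:j2]]
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]

def apply_edit_script(old: str, script: list) -> str:
    """Rebuild the new text, applying spans right to left so offsets stay valid."""
    text = old
    for start, end, replacement in reversed(script):
        text = text[:start] + replacement + text[end:]
    return text

def encrypt_edit(key: bytes, old: str, new: str) -> bytes:
    """Encrypt only the edit script, not the whole new file."""
    nonce = os.urandom(16)
    payload = json.dumps(make_edit_script(old, new)).encode("utf-8")
    return nonce + xor_cipher(key, nonce, payload)

def decrypt_edit(key: bytes, old: str, blob: bytes) -> str:
    """Decrypt the edit script and apply it to the previous version."""
    payload = xor_cipher(key, blob[:16], blob[16:])
    return apply_edit_script(old, json.loads(payload.decode("utf-8")))

key = os.urandom(32)
old_source = "total = price + tax  # compute the bill\n" * 50
new_source = old_source.replace("+", "-", 1)  # a one-character edit

edit_blob = encrypt_edit(key, old_source, new_source)
restored = decrypt_edit(key, old_source, edit_blob)
print(f"whole file: {len(new_source)} bytes; encrypted edit: {len(edit_blob)} bytes")
print("round trip ok:", restored == new_source)
```

A collaborator who holds the key and the previous version can rebuild the new version from the small encrypted edit alone. This toy version leaves out things a real design has to handle, such as authenticating each edit and limiting what the size of an edit reveals.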

Importantly, this new method is also compatible with existing git services, making it easy for people to adopt. It also doesn’t interfere with other functions of git servers, such as hosting, saving bandwidth and indexing, so people can keep using these servers as they normally would – just with the added benefit of extra security.

A broader end-to-end encrypted internet

This new tool is currently free and open-source for all users. It can be installed easily, like a patch, and runs in the background while users access git services just as before.

But this is just the starting point for a broader shift towards online collaboration that is secured by end-to-end encryption.

Extending the same guarantees to shared documents, spreadsheets and design files is possible, but will require sustained research and investment.

One complication in ensuring security is managing the encryption keys or credentials that users need to decrypt their data. Fortunately, our previous research shows how to create a secure cloud storage system that allows users to safely store their credentials.

Just as importantly, we must balance security with compliance and accountability. Universities, hospitals and government agencies are required to retain and, in some cases, provide lawful access to certain data. Meeting these obligations, without weakening end-to-end encryption, pushes us to research new techniques.

The goal is not secrecy at all costs, but verifiable controls that respect both privacy and the rule of law.

We don’t need a brand new internet to get there. We need pragmatic upgrades that fit the tools people already use – paired with clear, provable guarantees.

Messaging proved that end-to-end encryption can scale to billions. Code and cloud files are next, and with continued research and targeted investment, the rest of our everyday collaboration can follow.

So before too long, you will hopefully be able to work on a shared document with colleagues with the peace of mind that it, too, has gold standard security.

This article is republished from The Conversation, a nonprofit, independent news organization bringing you facts and trustworthy analysis to help you make sense of our complex world. It was written by: Qiang Tang, University of Sydney; Moti Yung, Columbia University; and Yanan Li, University of Sydney.