How to secure git

How can I make git more secure?

Git is super powerful.  We use git to interact with our most important intellectual property:  our source code.  For a SaaS provider this source code really is the whole business.  If someone steals it, your IP is gone and so, probably, is your business.  So this raises the question:  How do you secure your use of git?

Git was originally written by Linus Torvalds for the development of the Linux kernel, and most engineers who use it only touch about 1% of it:  clones, pulls, pushes, checkouts, and commits.

I use git for many reasons now:  First, for the ability to automatically create a local (and centralized!) backup of what I’m working on (git commit, git push).  Second, for the ability to share my work with the team and let them iterate on the code themselves independently (git clone, git pull).  And finally, for the ability to fall back to an earlier version of a thing I was working on (git commit, git checkout).

I use git every day to work with my team of software engineers to build SecureStack products.  I also use git when I interact with open-source projects like MVSP.  Part of the power of git is you can use it for basically anything:  source code, text files, binary files, zip files, DLLs and a whole lot more.  So a lot of people end up using it as a backup tool, which isn’t really what it was intended to do.  This can cause problems later on and I’ve seen this firsthand with customers of ours.

Unfortunately, there are definitely some gotchas that I learned along the way, that I am going to share with you now.  Enjoy!

1. Create a .gitignore file

As you are working on code you will want to make sure that your changes are being tracked.  The first step in doing this is to use a git add . command.  The period here means "the current directory", so you add everything in the working directory to that git repository.  And this is where many engineers create problems for themselves, without even knowing it!  Perhaps you had a .env file in that local directory that included all the usernames, passwords and connection strings for your app.  When you do your git add . you are adding that .env file to the repository, along with any other config files that include sensitive data.

And it’s not just about making sure that sensitive data doesn’t get committed to your repository, it’s also about saving space and lots of time.  As an example, when working with javascript code it’s really common to commit the whole ./node_modules directory into the repository.  That directory is NOT necessary and doesn’t need to be in the repo:  when you run npm install or yarn install it gets recreated from your package.json, so you don’t have to add it.  The package-lock.json is a different story:  npm’s own documentation says it is intended to be committed, because it pins the exact dependency versions and keeps installs reproducible.

Create a local .gitignore file and add the appropriate exclusions there. You can get a list of customized exclusions from https://github.com/github/gitignore
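
To give a feel for what those exclusions catch, here is a minimal Python sketch.  The pattern list and the would_be_ignored helper are hypothetical names made up for this example, and fnmatch only approximates real .gitignore matching (it knows nothing about negation with "!" or anchored patterns, for instance).

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical exclusions based on the examples in this post.  A real
# .gitignore supports more syntax than this sketch handles.
IGNORE_PATTERNS = [".env", "*.pem", "node_modules"]


def would_be_ignored(path: str, patterns=IGNORE_PATTERNS) -> bool:
    """Return True if any component of `path` matches one of the patterns."""
    parts = PurePosixPath(path).parts
    return any(fnmatch(part, pattern) for part in parts for pattern in patterns)
```

For the authoritative answer on a real repo, git itself can tell you which rule matches a path with git check-ignore -v <path>.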

2. Don’t store secrets in git repos

All developers have at some time thought it was perfectly acceptable to store credentials in a .env or config file.  It’s just something you do because no one tells you not to, and no one shows you a better way to do it.  And, to be fair, it’s definitely a lot more complicated to use a secret store or key store to hold your sensitive credentials and application config details than it is to simply put them in a local file and source that file.  But then at some point all developers will accidentally add that file to a git repo and push to a centralized location, and then the credentials go out with it and are distributed to everyone that pulls or looks at that code.  Then when the team finally notices and understands the problem, they have to go *fix* the problem!  That means changing the credentials (hopefully!) but beyond that it means that the team as a group has to agree on a solution and then follow through and use something like a secret store.
Truth is that many teams don’t realize how easy this can be.
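
As a small first step, the application can read its credentials from the process environment (populated by your secret store or deployment platform) instead of from a file that lives next to the code.  This is only a sketch:  the get_required_secret helper and the DB_PASSWORD name are assumptions for illustration, not part of any particular secret store’s API.

```python
import os


def get_required_secret(name: str, env=os.environ) -> str:
    """Fetch a secret from the environment, failing loudly if it is missing."""
    value = env.get(name)
    if not value:
        # Fail at startup rather than limping along with an empty credential.
        raise RuntimeError(f"Missing required secret: {name}")
    return value
```

In the app you would then call something like get_required_secret("DB_PASSWORD") once at startup, and nothing sensitive ever needs to sit in the working directory.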

3. Use git hooks to automate tests

Most developers don’t realize that every git repo comes with a set of sample git “hooks” that allow them to automate parts of their continuous integration workflows.  If you look in the .git/hooks/ directory of your git repo you’ll see a dozen or so files by default, each ending in .sample (the exact number depends on your git version).  Of those, the most useful for this purpose is the pre-commit hook, which runs before each commit is created and aborts the commit if it exits with a non-zero status.  These git hooks are just scripts so you can drop bash commands right in.  Create a new file in that directory called “pre-commit”, make sure it has execute permissions and drop your security tests directly in!
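
Hooks don’t have to be bash.  Here is a rough sketch of a pre-commit hook written in Python that looks for secret-looking lines in whatever is staged.  The regex patterns and the function names are my own assumptions for the example, and they will produce both false positives and misses, so treat this as a starting point rather than a real secret scanner.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit hook that blocks commits containing obvious secrets."""
import re
import subprocess
import sys

# Hypothetical patterns; tune these for your own codebase.
SECRET_PATTERNS = [
    re.compile(r"(?i)(password|passwd|secret|api_key)\s*[=:]\s*\S+"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def find_secrets(text: str) -> list[str]:
    """Return the lines of `text` that match any secret-looking pattern."""
    return [
        line
        for line in text.splitlines()
        if any(pattern.search(line) for pattern in SECRET_PATTERNS)
    ]


def staged_additions() -> str:
    """Return only the lines added by the staged changes."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    return "\n".join(
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    )


def main() -> int:
    """Return 1 (which aborts the commit) if anything suspicious is staged."""
    hits = find_secrets(staged_additions())
    if hits:
        print("Possible secrets in staged changes:", *hits, sep="\n  ")
        return 1
    return 0

# In the real hook file, finish with:  sys.exit(main())
```

Save it as .git/hooks/pre-commit with the final sys.exit(main()) line uncommented and give it execute permissions.  Keep in mind that hooks live in your local .git directory and are not pushed with the repo, so each engineer has to install them.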

You can read more in our blog post about git hooks here.

 

4. Use SSH keys to interact with repos

Most companies use their company email as the login to their source code management platform like GitHub or Bitbucket.  Then, to make it worse, most orgs don’t enforce MFA or the use of SSH keys, so all the bad guys have to do is phish your engineers for their passwords and your crown jewels are in the hands of criminals.  So, the easy fix is to enforce the use of MFA and/or SSH keys.  Why?  Because this verifies that the user is who they say they are with at least one additional factor of authentication.  I would suggest you use SSH AND MFA with something like Krypt.  Some organizations only ask for the additional factor occasionally, like every 30 days or similar, but to me, it makes sense to have the power of multiple factors with every push.

 

5. Upgrade your git!

Many developers using a mac don’t realize that the version of git they are using is probably old.  Most git installs on Apple devices come from the Xcode package and typically are two or more years old.  For example, until I installed git from brew I was using the most recent version of git available via Xcode, which was 2.24.3, which came out in late 2019!

You can find the most recent version of git at https://git-scm.com/download/
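
If you want to check this across a team’s laptops, a few lines of Python will do.  This is a sketch:  the function names are made up for the example, and whatever minimum version you compare against is your own policy decision, not an official recommendation.

```python
import re
import subprocess


def parse_git_version(output: str) -> tuple[int, ...]:
    """Pull the (major, minor, patch) numbers out of `git --version` output."""
    match = re.search(r"git version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"Unrecognized version string: {output!r}")
    return tuple(int(part) for part in match.groups())


def installed_git_version() -> tuple[int, ...]:
    """Ask the git binary on the PATH for its version."""
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=True
    )
    return parse_git_version(result.stdout)
```

Because the result is a tuple of integers, a comparison like installed_git_version() < (2, 30, 0) behaves the way you would expect, where (2, 30, 0) is just a placeholder threshold.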


6. Don’t make stuff public that shouldn’t be

Most of the git repos we find sensitive data in are set as publicly accessible, and they shouldn’t have been.  This one is easy to do especially with GitHub where the default behavior for new repos is different depending on whether you are in an “Organization” or not.  The default is to create new repos as private if you are in an org.   But if you are creating a repo in your own account, the default is still public.  Doh!

If you *are* going to create a public repo, make sure that no sensitive data will go into that repo.  Often, what a repository starts out as isn’t what it ends up as.


7. Don’t upload your .git directory with your web content

Many developers deploy website changes or content via git.  We did this for years when we built our website with static HTML and we could simply commit changes to a git repository.  Unfortunately, many people that do this don’t realize they are exposing their .git directory and all its contents with each deployment.   If your webserver allows directory listing then anyone that goes to https://example.com/.git/ will be able to see all of the git contents including the config file.  The URL for your git repo will be in that file and anyone that is handy with git will be able to download your git repo and dredge it for username/password credentials, database names or any other sensitive data you’ve saved in the past.
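
You can test your own site for this in a few lines of Python.  This is a sketch under some assumptions:  the helper names are hypothetical, the "[core]" marker check is a heuristic based on what a typical .git/config contains, and you should only point it at sites you own or are authorized to test.

```python
import urllib.request
from urllib.parse import urljoin


def git_config_url(site_url: str) -> str:
    """Build the URL where an exposed repo would serve its config file."""
    base = site_url if site_url.endswith("/") else site_url + "/"
    return urljoin(base, ".git/config")


def looks_like_git_config(body: str) -> bool:
    """Heuristic: git config files contain section headers like [core]."""
    return "[core]" in body or '[remote "' in body


def is_git_exposed(site_url: str, timeout: float = 5.0) -> bool:
    """Return True if the site appears to serve its .git/config file."""
    try:
        with urllib.request.urlopen(git_config_url(site_url), timeout=timeout) as resp:
            body = resp.read(4096).decode("utf-8", errors="replace")
    except OSError:  # covers HTTP errors (404, 403), DNS failures and timeouts
        return False
    return looks_like_git_config(body)
```

If is_git_exposed returns True for your site, block access to /.git/ at the webserver or, better, stop deploying the .git directory at all.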


Many developers I’ve talked to about this problem push back saying that  you can’t get much with a git repo and to them I say, look what I found today:

DB_NAME=production
DB_USER=dbadmin
DB_PASSWORD=REDACTED
DB_HOST=ls-REDACTED.rds.amazonaws.com

 

That is a customer’s production database creds sitting in a .env file in their exposed .git directory.  I wonder what they could do with that?!


SecureStack provides security coverage across the whole of your SDLC

Our platform helps you protect your most valuable asset:  Your source code.

SecureStack is easy to use as it’s a SaaS-based platform so you can be up and running in less than 3 minutes with complete coverage.

 

If you like what you see, book a demo!

 

Paul McCarty

Founder of SecureStack

DevSecOps evangelist, entrepreneur, father of 3 and snowboarder

Forbes Top 20 Cyber Startups to Watch in 2021!

 

The Log4J Vulnerability & Log4Shell Incident Explained


What is the Log4J vulnerability? 

Log4j 2 is an open source Java logging library developed by the Apache Software Foundation. It is a key building block that provides logging functionality in a large number of applications globally, helping developers troubleshoot their systems.

Many forms of enterprise and open-source software, including cloud platforms, popular apps such as Minecraft, websites and email services, use Log4j; it is a dependency for many services. 

The Log4Shell incident explained

On 9th December 2021, the project disclosed the vulnerability publicly on GitHub. An exploit had been discovered in the popular Java logging library log4j (version 2) that results in unauthenticated Remote Code Execution (RCE) simply by getting the application to log a certain string.

The vulnerability was rated a critical 10/10 under the Common Vulnerability Scoring System (CVSS). Given how common this library is, how much damage the exploit can do (full server control), and how easy it is to exploit, the impact of this vulnerability is severe. The critical vulnerability (CVE-2021-44228) exists in certain versions of the Log4j library. It’s termed “Log4Shell” for short. 

A malicious cyber actor could exploit this vulnerability to execute arbitrary code and compromise systems and networks. Thousands of devices around the world connected to the internet could be at risk. 

What vulnerabilities of Log4J should I check for? 

SecureStack offers users an easy way to identify whether they are at risk. Simply scan your application and within 4 minutes SecureStack will identify all affected versions of log4j in the application stack (as well as additional vulnerabilities, misconfigurations and issues you may have). Once the application issue is identified, a simple step-by-step solution will be provided to mitigate the risk. 

Alternatively, system administrators should check potentially vulnerable servers for outbound traffic to hosts outside the local network, which may indicate communication with command and control nodes, and for unusual traffic to internal hosts, which may indicate attempts at lateral movement. 

Requirements for an exploitation 

  • A server with one of the vulnerable log4j versions listed below:
    • 2.0-beta9 to 2.12.1
    • 2.13.0 to 2.14.1
    • 2.15.0-rc1
    • 2.16.0 to 2.17.0
  • An endpoint with any protocol (HTTP, TCP, etc), that allows an attacker to send the exploit string.
  • A log statement that logs out the string from that request.
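
The version check from the list above can be sketched in Python.  Note the assumptions:  the ranges are copied from this post exactly as written (check the Apache advisories for the authoritative ranges), the function name is made up for the example, and pre-release suffixes other than the two listed explicitly are rejected rather than guessed at.

```python
# Inclusive (low, high) ranges copied from the list above.
VULNERABLE_RANGES = [
    ((2, 0, 0), (2, 12, 1)),
    ((2, 13, 0), (2, 14, 1)),
    ((2, 16, 0), (2, 17, 0)),
]
# Pre-release versions named explicitly in the list.
VULNERABLE_EXACT = {"2.0-beta9", "2.15.0-rc1"}


def is_listed_as_vulnerable(version: str) -> bool:
    """Return True if `version` falls inside one of the listed ranges."""
    if version in VULNERABLE_EXACT:
        return True
    try:
        parsed = tuple(int(part) for part in version.split("."))
    except ValueError:
        # Unknown suffix such as "-rc2": force a manual review instead of
        # silently reporting "safe".
        raise ValueError(f"Cannot classify version: {version!r}") from None
    parsed = parsed + (0,) * (3 - len(parsed))  # pad "2.0" to (2, 0, 0)
    return any(low <= parsed <= high for low, high in VULNERABLE_RANGES)
```

Remember that a vulnerable version is only the first of the three requirements:  the attacker still needs an endpoint to send the string to and a log statement that logs it.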

Find out if your application has a vulnerable log4j version by scanning your application here.

What’s a CVE and how does that relate to log4j? 

CVE is short for “Common Vulnerabilities and Exposures”, which is basically a way to describe a specific vulnerability in a standard way.  Each vulnerability gets its own CVE ID, which starts with the year the ID was assigned or the vulnerability was made public, followed by a sequence number of four or more digits.

The log4j class of vulnerabilities is large and multifaceted, so there are actually 5 CVEs related to the log4j family of vulnerabilities:

  • CVE-2021-44228
  • CVE-2021-4104
  • CVE-2021-44832
  • CVE-2021-45046
  • CVE-2021-45105
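
CVE IDs follow the pattern CVE-YEAR-NUMBER, where the number has four or more digits (CVE-2021-4104 in the list above has four, the others have five).  A minimal Python sketch of parsing them follows; the parse_cve_id name is just for this example.

```python
import re

# CVE-<4 digit year>-<sequence number of 4 or more digits>
CVE_ID = re.compile(r"^CVE-(\d{4})-(\d{4,})$")

# The five IDs listed above.
LOG4J_CVES = [
    "CVE-2021-44228",
    "CVE-2021-4104",
    "CVE-2021-44832",
    "CVE-2021-45046",
    "CVE-2021-45105",
]


def parse_cve_id(cve_id: str) -> tuple[int, str]:
    """Split a CVE ID into (year, sequence number) or raise ValueError."""
    match = CVE_ID.match(cve_id)
    if match is None:
        raise ValueError(f"Not a valid CVE ID: {cve_id!r}")
    # Keep the sequence number as a string so leading zeros survive.
    return int(match.group(1)), match.group(2)
```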

Find log4j vulnerabilities with SecureStack

The SecureStack platform is the only solution in the world that can help you find log4j vulnerabilities in your source code, cloud provider, and by scanning your running web application.  It’s super easy to find and then mitigate log4shell with one solution.

 


What is a SBOM?


One of my friends messaged me on LinkedIn today and asked “What is this SBOM you keep talking about?”  I realized that he’s right and I should probably explain what an SBOM is.  First, the term refers to a “Software Bill of Materials”.  An SBOM is a complete inventory of all the software and dependencies of an application and is typically delivered in the form of a JSON or XML document.  Why is an SBOM critical?  Well, one reason is that they provide visibility into the software supply chain behind a particular application.  It also helps teams test an application for compliance and identify security risks that an app might have.

Wikipedia defines an SBOM this way:
“A software bill of materials (SBOM) is a list of components in a piece of software. Software vendors often create products by assembling open source and commercial software components. The SBOM describes the components in a product.  It is analogous to a list of ingredients on food packaging: where you might consult a label to avoid foods that may cause allergies, SBOMs can help organizations or persons avoid consumption of software that could harm them.”

Why is everybody talking about SBOM now?

The need for something like an SBOM specification has existed for many years but has become more visible in the last two years after several high-profile security incidents. In particular, the SolarWinds incident, disclosed in December 2020, as well as ongoing issues with NPM have made it obvious that there needed to be more governance around the software supply chain than there was.

Applications have become increasingly complex, which in turn makes securing and managing those applications more difficult. Cloud-native services and open source make up a growing part of the average modern web application. Software engineering teams don’t necessarily understand how adopting these new technologies or services changes the threat model for their applications. The SBOM is meant, in part, to at least make these components visible to the naked eye so teams can understand the scope and breadth of their applications.

The Biden administration issued an executive order in 2021 entitled “Executive Order on Improving the Nation’s Cybersecurity”. This executive order requires companies wishing to sell software to the US government to meet SBOM requirements.

What’s in an SBOM?

At its simplest, an SBOM is a list of all the different component parts of an application. This typically includes several types of information about each component:

  • Component name
  • Supplier or vendor of component
  • Software license type
  • Component version number
  • Known vulnerabilities for component
  • Transitive dependencies to other software

Defining these core data points about each application composition object means that engineering teams are in a better place to be able to mitigate or remediate issues in these components. Tracking this information over time means that a properly built SBOM will catch and make visible any changes that can adversely affect the application.
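
To make the data points listed above concrete, here is a minimal Python sketch of a component record serialized to JSON.  The Component class, its field names and the top-level "components" key are assumptions for illustration only; this is an ad hoc shape and not one of the standard SBOM formats.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Component:
    """One entry in a simplified, non-standard SBOM."""
    name: str
    supplier: str
    license: str
    version: str
    known_vulnerabilities: list[str] = field(default_factory=list)
    dependencies: list[str] = field(default_factory=list)  # transitive deps


def to_json(components: list[Component]) -> str:
    """Serialize the inventory as a JSON document."""
    return json.dumps({"components": [asdict(c) for c in components]}, indent=2)


example = Component(
    name="log4j-core",
    supplier="Apache Software Foundation",
    license="Apache-2.0",
    version="2.14.1",
    known_vulnerabilities=["CVE-2021-44228"],
)
```

Calling to_json([example]) gives you a document you can diff over time, which is the "tracking changes" part of the story.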

 

How do you create a SBOM?

Because there are several different formats for SBOMs, organizations will need to decide which one they will support internally. Or, alternatively, they will need to have the right data in place to create any format they are asked for.

Unfortunately, there is no consensus on what a SBOM should look like or include. There are LOTS of suggestions, and vendors describing their way as the only way to create and handle SBOMs. The reality is that there are at least 3 different formats for SBOMs:


DevSecOps predictions for 2022


2021 was a CRAZY year!

We spent most of 2021 at home.  We had to build new ways of working and migrate things to the cloud WAYYY too quickly.  We saw new types of threats to our applications including “dependency confusion attacks” and “software supply chain attacks”.  We began to question all our dependencies and vendors.  Who are you, and what are you made of?!  We saw outages increase for Azure, AWS, GitHub and Atlassian.   And, we finished it off with arguably the GOAT vulnerability:  Log4shell.  What a crazy f**king year, right?!

It’s now time to turn our attention to the new year and I thought I would crack a tasty beverage and wax lyrical on what I see coming for the DevSecOps and AppSec communities in 2022.  As always, I appreciate your feedback so feel free to hit me up in the comments below or ping me on the socials.

1. “Software Bill of Materials” (SBOM) is everywhere! 

Every American we talk to, whether they are in venture capital, startups or enterprise, is asking us if we can provide SBOM capabilities. Australians are beginning to become aware of the idea as well via our Five Eyes osmosis.  The impetus behind all of this is a new set of requirements set down by the Biden administration in 2021 that enforce SBOM standards on orgs trying to sell into US government.

My first prediction for 2022 is that SBOM will evolve to become “Application Composition Graph” as the idea of SBOM expands to include the non-source code dependencies in applications. Think AWS Cognito, Cloudflare Workers, API Gateways, Lambda, Azure Functions, Firebase, etc. If you take any of those services out of an application that uses them, that app stops working.  If the app can’t live without something, then that’s a primitive dependency. Therefore, these dependencies need to be expressed in the same way that a node or python library is.

Like many standards, the people defining the standards often don’t understand how the underlying controls or requirements are enforced. As an example of this, the current SBOM standards and formats like CycloneDX only really work with software package files. They will have to evolve to provide SBOM features for cloud infra, SaaS tooling, CI/CD processes and more, as these components are just as required as the underlying source code. This new “enhanced” SBOM complexity absolutely screams out for a graph to express those complicated back and forth relationships between dependencies. However, you will always need to provide it in something easy for machines to understand, so whatever “graph” like representation we use, this new SBOM will have to be representable in JSON or YAML.

At SecureStack we have been using an internal description of an application called a WHAM since April 2020. WHAM stands for “workload hierarchy abstraction model” and is really about understanding how data enters an application and what it touches. For example, the CDN, loadbalancer, webserver, appserver, app and database are all listed in order and their individual dependencies and relationships are defined. This is similar to SBOM and makes it easy for us to map to the new SBOM requirements. Check out my team’s thoughts on SBOM here: https://securestack.com/sbom

2. CI/CD Visibility

The theme from 2021 was: how do we modernize the software delivery process because we all might be working from home for a while? The ability to focus on modernization was made possible by the fact that everybody was at home and the company VPN and infrastructure was not always capable of satisfying the needs of software engineering teams. So this created a forcing function to move to cloud-based SCM like GitHub and Bitbucket as well as cloud-based continuous deployment platforms like CircleCI. Stuck at home, engineering teams had time to finally build out automation that we had been talking about for years, enabled by the ease of the new cloud SCM vendors. At SecureStack, we saw this a lot as legacy enterprise customers suddenly started using Bitbucket Cloud and GitHub because there were too many obstacles to them using their existing legacy SCM solutions.

But now, in 2022, these orgs have moved successfully to the cloud and are building their first production-quality continuous delivery models, and they need to focus on gaining more visibility into all this new “stuff” so management will continue to give them the green light. And the budgets. Salivating startups seeing new market opportunities will build visibility platforms to compete with native functions from GitHub, Bitbucket, and all the other platforms. Up until now different parts of the #devtools space have been pretty siloed. Different focuses like “code quality”, “code security”, “application performance”, “dev collaboration”, “api functionality” and “developer insights” will come together in explosions of tooling that join existing continuous deployment solutions to create richer ways to define maturity and progress for engineering groups and their managers. All of this brings us to….

3. CI/CD is the new Supply Chain attack target

In 2021 we saw attacks targeting npm libraries like ua-parser-js, dependency confusion attacks targeting Microsoft, and malicious code finding its way into the Linux kernel. This got me thinking about my own continuous deployment pipelines and how we use third-party software in those pipelines. And that got me thinking specifically about GitHub Actions and Bitbucket Pipelines. These two technologies run pieces of automation in your continuous integration and deployment processes and many orgs rely on third-party providers to deliver these stackable pieces of automation.

And here’s the thing: These Actions and Pipelines have complete unmitigated access to your source code during the CD processes. During the testing and build phases, your continuous deployment vendors spin up temporary containers to test your code and do “stuff”. Some of that stuff involves security testing or functional testing. There are thousands of these automations out there and many orgs use them without really understanding what they are doing. Imagine if someone got access to one of these popular Actions and inserted a single line of code that sent your source code to an S3 bucket. Whoops, your intellectual property just got sent to China! Or maybe it added a line to your source code to pull in a new dependency that was malicious? There are a million things that a bad guy could do with this super powerful moment in time.

Most of the GitHub Actions, for example, are maintained by small teams of volunteer contributors. How many of them use MFA and/or SSH keys to interact with GitHub? How many of them are using EDR on their laptops to make sure that someone isn’t manipulating files in their local repositories?! It’s really hard to define security standards for a group of people who are volunteering their time.

So just like with the NPM dependency shit show, GitHub Actions, Bitbucket Pipelines and similar are places where we need to start thinking about how we secure these new tools.

 

4. DevSecOps hype-cycle will hit escape velocity

Every vendor will claim to be a “DevSecOps” solution. In 2022 we will see XDR, EDR, CSPM, and other solution providers start marketing in this space claiming to be the “missing piece of the DevSecOps puzzle”. It will get totally nonsensical over the next year. Tools that have no business in the continuous deployment process will build GitHub Actions and start advertising how they are an integral part of your continuous deployments. Non-technical managers who are easily swayed by big names will not be able to discern what is legit and what isn’t, which is the whole point of this marketing frenzy. Technical people will be forced to add Actions and Pipelines that just make their deployments even slower, which will push them to start building rogue CI/CD solutions.

 

5. Software Composition Analysis hits a snag

Traditional software composition analysis (SCA) solutions will stagnate as development teams realize how little value they provide in isolation. This is one reason that Snyk is acquiring startups left, right and center. For example, NPM has had built-in SCA functionality via npm audit since 2018, and npm audit messages have been shown by default during every npm install since npm 6. Unfortunately, pretty much the whole javascript community ignores these messages every day, both on their laptops and in their continuous deployment processes.

 

6. Semgrep will be acquired

Or more correctly, the company behind the amazing Semgrep tool, R2C will be acquired. Semgrep is just too good to not get gobbled up for hundreds of millions of dollars. It probably won’t be Microsoft because they (well, GitHub actually) already bought Semmle in 2019. And it probably won’t be AWS because they keep thinking they can compete with their Code* products and so far they haven’t shown any interest in acquiring in this space. Snyk bought Deepcode in 2020 but I think the bigger blocker here is that Snyk doesn’t target a community, while that’s exactly what Semgrep is all about. So who does that leave? Google? Cloudflare? My pick might seem like an odd one, but I’m gonna say IBM. Or, actually, the new MSP part of IBM that has been spun off on the NYSE: Kyndryl. Kyndryl has cash on hand (via IBM debt) and they desperately want to get into the DevSecOps play as a way to build out both their services business AND their product line.

 

7. Palo Alto will buy a #devtools startup

Why not? PAN is trying to build an end-to-end platform, so why not throw some code security tooling under the umbrella? There are a bunch of players out there to buy.  Aqua already bought up Argon.io so you can mark them off the list but you still have Oxeye, Cycode, Cobalt,

 

8. Java will become persona non grata

In light of the recent log4shell shit show and its never-ending patching requirements, many in the industry have started saying openly that we need to get rid of Java, permanently. Like, the Godfather and the cannoli permanently. This technology is old, is SLOOOWWW, and has too much attack surface. There is so much XSS and CSRF. Advertising a JSESSIONID cookie is now a liability, and blocking the attack traffic at your edge has become one of your team’s full-time jobs as they manage traditional firewall and WAF rules to address the never-ending onslaught. Not to mention that we’ve basically forgotten as a culture how to install a JDK or JRE on our laptops, which kind of says it all, right? In a post-Java world, just imagine the cost savings in no more WebSphere, JBOSS, WebLogic, or Liferay licensing! Java was great back in the day, but it’s not worth the hassle now.

 

9. AWS will give up on its internal Code* services

AWS has a series of software development solutions all starting with Code: CodeStar, CodePipeline, CodeDeploy, CodeBuild, CodeCommit and CodeArtifact. No one uses these tools unless they are clueless or complete AWS fanboys. Don’t get me wrong, we love AWS, but AWS can’t be everything to everyone. I know that’s their play, but I can’t help but think of the amount of energy and time they are wasting internally trying to get all these Code* solutions some traction. The reality is the market is set and other vendors outside of AWS do this whole software development lifecycle stuff much better. Every software engineer (aside from fanboys) understands this, but when is AWS gonna finally admit they can’t do everything?

10. CEOs will start going to jail for hiding data breaches

Australia

 

11. Everything will become “critical infrastructure”

Australia

 

 
