Jul 3, 2019

Introducing Veracode’s New Analytics Capabilities

By Colleen Tartow

“If we have data, let's look at data. If all we have are opinions, let's go with mine.” -- Jim Barksdale

The ability to report on your application security program depends on access to your AppSec data. For questions ranging from “how can I help my board understand our current risk posture?” to “which teams are developing secure code, and which need additional AppSec training?”, data is the key. Nobody should have to guess when answering questions as important as “are we compliant?”

We have recently transformed how Veracode helps you answer questions about your AppSec program. By building an entirely new back-end infrastructure and using a cutting-edge analytics tool, we’ve been able to give our customers new insights into their data and provide a hub for AppSec analytics, right within the Veracode platform.

The goals of our new analytics

As part of our analytics re-design, we needed to support two use cases: (1) our customers’ analytics needs in the platform using the embedded BI tool, and (2) our internal sales and services staff’s analytics needs using a standalone BI tool. To provide this functionality, we needed a tool that could support our security model and multi-tenancy while providing excellent performance, scalability, flexibility, and support in a Veracode-hosted environment. 

Behind the scenes of the analytics overhaul

Our new infrastructure design started with moving data into the AWS cloud and replicating scan, findings, and organizational data into an AWS Redshift database. This database drives our existing reporting capabilities, both in the platform and in any custom reports we create, and the performance has been outstanding. From there, we use SQL transformations to take our operational schema and map the data into a “star schema.” A star schema is a simple layout in which central fact tables link directly to descriptive dimension tables. It is designed to avoid long chains of joins, which improves the performance of analytics queries. We replicate this star schema into our smaller back-end Redshift databases, which feed data into two BI instances.
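To make the star schema idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Redshift. The table names, column names, and sample rows are all hypothetical and are not Veracode's actual schema. The point is only that an analytics question can be answered with one direct join from the fact table to each dimension table.

```python
import sqlite3


def build_star_schema(conn):
    """Create a toy star schema: one fact table and two dimension tables.

    All table and column names are hypothetical, chosen for illustration.
    """
    conn.executescript("""
        CREATE TABLE dim_application (
            app_key INTEGER PRIMARY KEY, app_name TEXT, team TEXT);
        CREATE TABLE dim_scan (
            scan_key INTEGER PRIMARY KEY, scan_type TEXT, scan_date TEXT);
        CREATE TABLE fact_finding (
            finding_key INTEGER PRIMARY KEY,
            app_key INTEGER REFERENCES dim_application(app_key),
            scan_key INTEGER REFERENCES dim_scan(scan_key),
            severity INTEGER);
    """)
    conn.executemany(
        "INSERT INTO dim_application VALUES (?, ?, ?)",
        [(1, "billing", "payments"), (2, "portal", "web")])
    conn.executemany(
        "INSERT INTO dim_scan VALUES (?, ?, ?)",
        [(10, "static", "2019-06-01"), (11, "dynamic", "2019-06-02")])
    conn.executemany(
        "INSERT INTO fact_finding VALUES (?, ?, ?, ?)",
        [(100, 1, 10, 5), (101, 1, 10, 3), (102, 2, 11, 4), (103, 2, 10, 2)])


def findings_by_team(conn):
    """Count findings and report the highest severity per team and scan type.

    Each dimension is exactly one join away from the fact table.
    """
    return conn.execute("""
        SELECT a.team, s.scan_type, COUNT(*) AS findings,
               MAX(f.severity) AS worst_severity
        FROM fact_finding f
        JOIN dim_application a ON a.app_key = f.app_key
        JOIN dim_scan s ON s.scan_key = f.scan_key
        GROUP BY a.team, s.scan_type
        ORDER BY a.team, s.scan_type
    """).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    for row in findings_by_team(connection):
        print(row)
```

In an operational schema the same question might require walking through several intermediate tables. Here the fact table carries the keys needed to reach each dimension directly.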

Our standalone BI instance, used internally at Veracode to understand our customers’ AppSec portfolios, was developed first. Internal Analytics contains scans and findings data, as well as operational and low-level data that wouldn’t be of interest to a customer but is useful for our internal analytics. Once data was available in Internal Analytics, we set up meetings with SMEs across Veracode to build standardized dashboards that would provide value across Veracode. We defined the set of questions that Veracoders wanted to answer for their customers, e.g., “are we compliant?”, “how long do our scans take?”, “how frequently do we scan?”, and built dashboards that answered those questions. This resulted in nine shared dashboards and four “explores,” which provided our internal users with the ability to answer the bulk of their questions about Veracode’s customer base. The shared dashboards provide a good starting point for our users, and then if they wish to explore data from scratch they can start at one of the pre-defined “explore” levels: Applications, Scans, Findings, or Users.

Once Internal Analytics was generally available, we turned our sights on our external BI instance, the Veracode Analytics feature, which contains a subset of the data available in Internal Analytics. We realized early on that our internal and external customers have many of the same questions, and we were able to reuse eight of the nine dashboards in our new analytics offering, which you can see under Veracode Dashboards in the Analytics section of the Veracode platform. The four predefined explores are also available in the platform for new analyses.

We have designed these solutions so that the same built-in dashboards, base data, and data models are used in both internal and external analytics, which reduces both the development time and testing required for these shared dashboards. By creating automated test suites and a well-defined automated CI/CD pipeline for our dashboards and data models, we can ensure that new development in our analytics environment will not break the existing analytics. 
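As a rough illustration of the kind of automated check such a pipeline might run, the sketch below verifies that every dashboard tile refers to a field the shared data model still exposes. The explore names come from this post. The field names, dashboard names, and the broken_references helper are hypothetical and do not describe Veracode's actual test suite.

```python
# Hypothetical shared data model: explore name -> fields it exposes.
DATA_MODEL = {
    "Applications": {"app_name", "team", "policy_compliance"},
    "Scans": {"scan_type", "scan_date", "duration_minutes"},
    "Findings": {"severity", "cwe_id", "status"},
    "Users": {"user_name", "role"},
}

# Hypothetical dashboards: name -> list of (explore, field) pairs used by tiles.
DASHBOARDS = {
    "Policy Compliance": [
        ("Applications", "policy_compliance"),
        ("Applications", "team"),
    ],
    "Scan Activity": [
        ("Scans", "scan_date"),
        ("Scans", "duration_minutes"),
    ],
}


def broken_references(dashboards, model):
    """Return (dashboard, explore, field) triples the model cannot satisfy."""
    broken = []
    for name, tiles in dashboards.items():
        for explore, field in tiles:
            # An unknown explore is treated as exposing no fields.
            if field not in model.get(explore, set()):
                broken.append((name, explore, field))
    return broken


if __name__ == "__main__":
    problems = broken_references(DASHBOARDS, DATA_MODEL)
    if problems:
        raise SystemExit(f"Dashboards reference missing fields: {problems}")
    print("All dashboard references resolve against the data model.")
```

Run on every change to the data model, a check like this would fail the build before a renamed or removed field could break a shared dashboard in either the internal or the external instance.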

Security policy

Security is job #1 (and #2, and #3, and so on) at Veracode, so we hold ourselves to exceptionally high standards when it comes to security. This means we encrypt data both in transit and at rest, and focus on data security at all levels. Additionally, we “practice what we preach” here at Veracode, meaning all of our data engineering and analytics code is scanned using Veracode, and our strict policies mean we continually monitor our own policy compliance. 

Take advantage of our new capabilities

Now that the Veracode Analytics solution is live and available to all customers, I encourage you to use it to understand your AppSec risk posture. Start with our built-in dashboards, and then play around. On any tile in a dashboard, you can right-click in the top left corner and choose “explore from here” to change the visualization. Data is power, and we are happy to be putting the power of AppSec data in our customers’ hands!

Colleen Tartow, Ph.D., is a leader in the data engineering and analytics space with more than 20 years of experience in data management, business intelligence, and data science. Previously, she worked in big data and analytics across industries as a consultant, leading diverse teams to deliver complex and robust end-to-end data solutions. At Veracode, Colleen has steered the evolution of Veracode Analytics and the back-end data architecture to deliver a modern, cloud-based, high-performance platform for customer analysis of risk at all levels.