Diving into Palo Alto Networks logs using Insight Investigator, Part I

Posted by Kyle Allen Banta Yoshida on May 24, 2019

When Palo Alto Networks endpoint logs and firewall logs are indexed in Splunk, security analysts gain access to valuable data that can be used to ensure the health and security of their organization’s network. Installing the Splunk for Palo Alto Networks application lets you leverage the data visibility provided by Palo Alto Networks’ firewalls and endpoint protection.

Combine this data with Insight Engines’ flagship products, and your analysts and security teams can gain deeper insight into your Palo Alto Networks log data. Insight Investigator lets your analysts craft complex queries and visualizations using plain English. Insight Analyzer shows which source types and data models you have and which ones you lack, which informs what you can do to keep your security posture up to date. Once you have the Palo Alto Networks App for Splunk installed, use Insight Engines to leverage the Palo Alto Networks data models to start unlocking the power of Splunk.

Within Insight Analyzer’s Data Model Coverage view:

For the purposes of this article, in this specific implementation of Splunk and the Insight Engines applications, the Palo Alto Networks Firewall Logs data model is available to Insight Investigator, and these source types feed the data model:

  • pan:config
  • pan:correlation
  • pan:system
  • pan:threat
  • pan:traffic

This allows us to ask questions of our data, like these:

  • “Show me PAN traffic by source, destination, and user”
  • “What are top categories across all PAN traffic?”
  • “What are the top PAN threats by URL and host?”

We can also leverage the Palo Alto Networks Endpoint Logs data model, which is fed by the pan:endpoint source type.

This enables us to make natural language queries of our data, such as:

  • “What are the most recent PAN endpoint events?”
  • “Show me PAN operation events today”
  • “Show me most recent PAN attack files today”

Looking back on the questions that we can ask with respect to the firewall logs, let’s take the first example: “Show me PAN traffic by source, destination, and user”. Let’s ask this question.

Now we are presented with quite a lot of information!

One might be tempted to assume that we could easily do this without using natural language. However, a peek at the dense SPL generated to create these visualizations shows that we have just saved a lot of valuable time. Each of these chunks of SPL represents a single data visualization created by Insight Investigator. Instead of figuring out how to write these queries, we are now free to dive deep into the results, looking for interesting patterns. We can focus our efforts on security and our network, rather than on the semantics of the Splunk Processing Language.

| search (index=_audit OR index=main OR index=windows OR index=wineventlog) (sourcetype="pan:traffic" OR sourcetype="pan_traffic")
| search [ datamodel pan_firewall traffic
| spath path=objectSearch
| table objectSearch
| rename objectSearch as search
| eval search=replace(search, "^\s*\| search ", "")
| eval search=replace(search, "\| fields [^|]+$", "")
| eval search=replace(search, "\(\`cim_[A-Za-z_]+_indexes\`\)", "") ]
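
The bracketed subsearch is shared by every query in this post: it reads the data model's objectSearch string and trims it with three replace() calls so that it can be embedded in the outer search. As a rough illustration of what those three regular expressions do, here is a minimal Python sketch. The sample objectSearch value and the macro name inside it are invented for illustration and are not taken from a real data model, and Python's regex engine stands in for Splunk's.

```python
import re


def clean_object_search(object_search: str) -> str:
    """Mirror the three replace() calls applied to objectSearch."""
    # 1. Drop a leading "| search " so the text can be embedded in another search.
    s = re.sub(r"^\s*\| search ", "", object_search)
    # 2. Drop a trailing "| fields ..." clause.
    s = re.sub(r"\| fields [^|]+$", "", s)
    # 3. Drop a parenthesized CIM index macro, e.g. (`cim_Something_indexes`).
    s = re.sub(r"\(`cim_[A-Za-z_]+_indexes`\)", "", s)
    return s


# Hypothetical objectSearch value, made up for this example only.
sample = ' | search (`cim_Network_Traffic_indexes`) sourcetype="pan:traffic" | fields src, dest'
cleaned = clean_object_search(sample)
print(repr(cleaned))
```

What remains after the three substitutions is just the filtering expression, which is what the outer search command needs.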


and


| search (index=_audit OR index=main OR index=windows OR index=wineventlog) (sourcetype="pan:traffic" OR sourcetype="pan_traffic")
| search [ datamodel pan_firewall traffic
| spath path=objectSearch
| table objectSearch
| rename objectSearch as search
| eval search=replace(search, "^\s*\| search ", "")
| eval search=replace(search, "\| fields [^|]+$", "")
| eval search=replace(search, "\(\`cim_[A-Za-z_]+_indexes\`\)", "") ]
| stats count
| eval power=floor(log('count', 1000)), count_formatted=if(power>=2, round('count' / pow(1000, 'power'), 2).case(power==2, " million", power==3, " billion", power==4, " trillion", power==5, " quadrillion", power==6, " quintillion", power==7, " sextillion", power==8, " septillion", power==9, " octillion", power==10, " nonillion", power==11, " decillion", 1==1, ""), 'count')
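
The final eval in this query turns a raw event count into a human-readable figure: it takes the base-1000 logarithm of the count to find its magnitude, then scales the count and appends a word such as "million". The arithmetic can be sketched in Python as follows. This is only an approximation of the SPL: the handling of non-positive counts is our own assumption, and number-to-string formatting may differ from what Splunk produces.

```python
import math

# Suffixes keyed by power of 1000, copied from the case() expression.
SUFFIXES = {
    2: " million", 3: " billion", 4: " trillion", 5: " quadrillion",
    6: " quintillion", 7: " sextillion", 8: " septillion",
    9: " octillion", 10: " nonillion", 11: " decillion",
}


def format_count(count: int) -> str:
    # Assumption: non-positive counts are returned unchanged, because the
    # logarithm is undefined for them.
    if count <= 0:
        return str(count)
    # How many whole powers of 1000 fit into the count.
    power = math.floor(math.log(count, 1000))
    if power >= 2:
        scaled = round(count / 1000 ** power, 2)
        # Powers beyond the table get an empty suffix, like the 1==1 fallback.
        return f"{scaled}{SUFFIXES.get(power, '')}"
    # Counts below one million are shown as-is.
    return str(count)


print(format_count(1234))
print(format_count(2_500_000))
print(format_count(3_200_000_000))
```

Note that floating-point logarithms can be slightly off at exact powers of 1000, so a production version would want to treat those boundaries with care.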


and


| search (index=_audit OR index=main OR index=windows OR index=wineventlog) (sourcetype="pan:traffic" OR sourcetype="pan_traffic")
| search [ datamodel pan_firewall traffic
| spath path=objectSearch
| table objectSearch
| rename objectSearch as search
| eval search=replace(search, "^\s*\| search ", "")
| eval search=replace(search, "\| fields [^|]+$", "")
| eval search=replace(search, "\(\`cim_[A-Za-z_]+_indexes\`\)", "") ]
| stats sum(log.bytes_out) as sum_bytes_out by src


and


| search (index=_audit OR index=main OR index=windows OR index=wineventlog) (sourcetype="pan:traffic" OR sourcetype="pan_traffic")
| search [ datamodel pan_firewall traffic
| spath path=objectSearch
| table objectSearch
| rename objectSearch as search
| eval search=replace(search, "^\s*\| search ", "")
| eval search=replace(search, "\| fields [^|]+$", "")
| eval search=replace(search, "\(\`cim_[A-Za-z_]+_indexes\`\)", "") ]
| stats sum(log.bytes_out) as sum_bytes_out by src
| sort -sum_bytes_out
| head 20


and


| search (index=_audit OR index=main OR index=windows OR index=wineventlog) (sourcetype="pan:traffic" OR sourcetype="pan_traffic")
| search [ datamodel pan_firewall traffic
| spath path=objectSearch
| table objectSearch
| rename objectSearch as search
| eval search=replace(search, "^\s*\| search ", "")
| eval search=replace(search, "\| fields [^|]+$", "")
| eval search=replace(search, "\(\`cim_[A-Za-z_]+_indexes\`\)", "") ]
| stats sum(log.bytes_out) as sum_bytes_out by dest
| sort -sum_bytes_out
| head 20


and


| search (index=_audit OR index=main OR index=windows OR index=wineventlog) (sourcetype="pan:traffic" OR sourcetype="pan_traffic")
| search [ datamodel pan_firewall traffic
| spath path=objectSearch
| table objectSearch
| rename objectSearch as search
| eval search=replace(search, "^\s*\| search ", "")
| eval search=replace(search, "\| fields [^|]+$", "")
| eval search=replace(search, "\(\`cim_[A-Za-z_]+_indexes\`\)", "") ]
| stats sum(log.bytes_out) as sum_bytes_out by log.user
| sort -sum_bytes_out
| head 20


and


| search (index=_audit OR index=main OR index=windows OR index=wineventlog) (sourcetype="pan:traffic" OR sourcetype="pan_traffic")
| search [ datamodel pan_firewall traffic
| spath path=objectSearch
| table objectSearch
| rename objectSearch as search
| eval search=replace(search, "^\s*\| search ", "")
| eval search=replace(search, "\| fields [^|]+$", "")
| eval search=replace(search, "\(\`cim_[A-Za-z_]+_indexes\`\)", "") ]
| stats sum(log.bytes_out) as sum_bytes_out by sourcetype
| sort -sum_bytes_out
| head 20


and


| search (index=_audit OR index=main OR index=windows OR index=wineventlog) (sourcetype="pan:traffic" OR sourcetype="pan_traffic")
| search [ datamodel pan_firewall traffic
| spath path=objectSearch
| table objectSearch
| rename objectSearch as search
| eval search=replace(search, "^\s*\| search ", "")
| eval search=replace(search, "\| fields [^|]+$", "")
| eval search=replace(search, "\(\`cim_[A-Za-z_]+_indexes\`\)", "") ]
| stats count by index, sourcetype
| sort -count
| head 20
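
Most of the queries above end with the same three-step pattern: stats sums a value per group, sort orders the groups from largest to smallest, and head keeps the top 20. For readers less familiar with SPL, the pattern can be sketched in Python like this. The event dictionaries, field names, and the top_by_bytes_out helper are all invented for illustration, and treating a missing bytes_out value as zero is a simplification of our own.

```python
from collections import defaultdict


def top_by_bytes_out(events, key, limit=20):
    """Sum bytes_out per value of `key`, sort descending, keep `limit` rows."""
    totals = defaultdict(int)
    for event in events:
        # Events that lack the group-by field are left out of the result.
        if key not in event:
            continue
        # Simplification: a missing bytes_out value counts as zero here.
        totals[event[key]] += event.get("bytes_out", 0)
    # Equivalent of "sort -sum_bytes_out" followed by "head <limit>".
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Made-up sample events, for illustration only.
events = [
    {"src": "10.0.0.1", "dest": "8.8.8.8", "bytes_out": 500},
    {"src": "10.0.0.2", "dest": "8.8.8.8", "bytes_out": 1200},
    {"src": "10.0.0.1", "dest": "1.1.1.1", "bytes_out": 900},
]

print(top_by_bytes_out(events, "src"))
print(top_by_bytes_out(events, "dest", limit=1))
```

Swapping the key argument between a source, destination, or user field corresponds to changing the by clause in the queries above.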

In the next installment of this series, we will closely explore each of these queries that use the Palo Alto Networks Firewall Logs data model. We will also show how to perform drill-down queries on this data as a matter of course in our data investigations. In the installment after that, we will examine in greater depth the queries that use the Palo Alto Networks Endpoint Logs data model.
